Thursday, January 14, 2010

Develop Watchdog Timer

Summary...
The project is critical, and I have a core P2P service (a Windows service written in VC++) running in the background. The requirement is that a separate watchdog application (a guard with a timer) looks after the service so that it is always running in normal mode. The watchdog restarts the service if a critical fault occurs, if the service hangs, or if it stops doing the work it is supposed to do in normal operation.

The watchdog is itself another service, with a short-interval timer, that checks the following:


  1. Is the P2P Service running? If not, start it.
  2. Is the P2P Service hanging? If yes, kill it and restart it. (The service is treated as hanging if it has used no CPU since the previous check for the last X checks, and/or its CPU usage has stayed above 80% for X consecutive checks.)
  3. Has the P2P Service already been running for X days? If yes, kill it and restart it.
  4. Has the local IP address been changed by the user? If yes, kill it and restart it, so that old IP addresses are removed from the tracker, fresh information is exchanged, and other peers are able to make connections.
  5. Has a new version of the UI application and/or the P2P Service been downloaded by the session? If yes, kill it, install the new version, and then restart it.
  6. Is the P2P Service neglecting its normal operations/threads? If yes, kill it and restart it. (This is decided when the P2P Service has not sent its benchmark signals to the Watchdog application within a certain timeout, a certain number of times.)
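
Checks 2 and 6 above can be sketched as two small helper classes. This is only a minimal illustration in portable C++, not the actual watchdog code: the names `HangPolicy`, `HangDetector` and `HeartbeatMonitor` are hypothetical, the value of X is a placeholder, and reading the real CPU usage of the service (for example through the Windows process APIs) is left out.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical thresholds. The post only fixes the 80% figure;
// "X" is left open, so 3 is just a placeholder default.
struct HangPolicy {
    std::size_t consecutiveSamples = 3;   // the "X" from check 2
    double busyThresholdPercent = 80.0;   // "above 80%" from check 2
};

// Check 2: counts consecutive idle (0%) or busy (> threshold) CPU samples.
class HangDetector {
public:
    explicit HangDetector(HangPolicy policy = HangPolicy()) : policy_(policy) {}

    // Feed the CPU usage measured since the previous check.
    // Returns true once the service should be treated as hanging.
    bool addSample(double cpuPercent) {
        if (cpuPercent <= 0.0) {
            ++idleRun_;
            busyRun_ = 0;
        } else if (cpuPercent > policy_.busyThresholdPercent) {
            ++busyRun_;
            idleRun_ = 0;
        } else {
            idleRun_ = 0;
            busyRun_ = 0;
        }
        return idleRun_ >= policy_.consecutiveSamples ||
               busyRun_ >= policy_.consecutiveSamples;
    }

    // Call after the service has been restarted.
    void reset() { idleRun_ = 0; busyRun_ = 0; }

private:
    HangPolicy policy_;
    std::size_t idleRun_ = 0;
    std::size_t busyRun_ = 0;
};

// Check 6: counts timeout windows that pass without a benchmark signal.
class HeartbeatMonitor {
public:
    using Clock = std::chrono::steady_clock;

    HeartbeatMonitor(Clock::duration timeout, std::size_t maxMisses,
                     Clock::time_point start)
        : timeout_(timeout), maxMisses_(maxMisses), windowStart_(start) {}

    // Called whenever the P2P Service sends its benchmark signal.
    void onSignal(Clock::time_point now) {
        windowStart_ = now;
        misses_ = 0;
    }

    // Called on every watchdog timer tick.
    // Returns true once maxMisses windows have elapsed without a signal.
    bool poll(Clock::time_point now) {
        if (now - windowStart_ >= timeout_) {
            ++misses_;
            windowStart_ = now;  // start counting the next window
        }
        return misses_ >= maxMisses_;
    }

private:
    Clock::duration timeout_;
    std::size_t maxMisses_;
    Clock::time_point windowStart_;
    std::size_t misses_ = 0;
};
```

On each timer tick the watchdog would pass the latest CPU reading to `addSample` and the current time to `poll`; if either returns true, it kills and restarts the service and then calls `reset` and `onSignal` to start counting afresh. Passing the time in as a parameter, instead of reading the clock inside the class, keeps the logic easy to test.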

The Watchdog Service should maintain a log file of its actions (checking, restarting, killing), plus any extra information available at the time of the action.
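
One possible shape for a log entry is sketched below. The line layout, the `Action` enum and the `formatLogLine` function are assumptions made for illustration; the requirement above only says that the action and any extra information should be recorded. Timestamps are written in UTC here to keep the example deterministic.

```cpp
#include <ctime>
#include <string>

// The three kinds of action named in the requirement (hypothetical enum).
enum class Action { Check, Restart, Kill };

inline const char* toString(Action action) {
    switch (action) {
        case Action::Check:   return "CHECK";
        case Action::Restart: return "RESTART";
        case Action::Kill:    return "KILL";
    }
    return "UNKNOWN";
}

// Builds one log line: "<UTC timestamp> [<ACTION>] <optional detail>".
std::string formatLogLine(std::time_t when, Action action,
                          const std::string& detail) {
    char stamp[32] = "";
    if (const std::tm* utc = std::gmtime(&when)) {
        std::strftime(stamp, sizeof stamp, "%Y-%m-%d %H:%M:%S", utc);
    }
    std::string line = stamp;
    line += " [";
    line += toString(action);
    line += "]";
    if (!detail.empty()) {
        line += " ";
        line += detail;
    }
    return line;
}
```

The watchdog would append the returned string to its log file each time it checks, kills or restarts the service, for example `formatLogLine(std::time(nullptr), Action::Restart, "service was not running")`.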

It should also copy the log file of the Core P2P Service to another specified location (same name, with the date and time appended), so that the information in it is archived.
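
The archived file name could be built as in the sketch below. The function name `archiveFileName`, the `_YYYYMMDD_HHMMSS` layout and placing the timestamp before the file extension are all assumptions; the requirement only says "same name" with the date and time added. The timestamp is in UTC.

```cpp
#include <ctime>
#include <string>

// Builds the archive name for a log file: the original name with a
// "_YYYYMMDD_HHMMSS" stamp inserted before the extension, if there is one.
// Expects a bare file name, not a full path.
std::string archiveFileName(const std::string& fileName, std::time_t when) {
    char stamp[32] = "";
    if (const std::tm* utc = std::gmtime(&when)) {
        std::strftime(stamp, sizeof stamp, "_%Y%m%d_%H%M%S", utc);
    }
    const std::string::size_type dot = fileName.rfind('.');
    if (dot == std::string::npos || dot == 0) {
        // No extension (or a name like ".log"): append the stamp at the end.
        return fileName + stamp;
    }
    return fileName.substr(0, dot) + stamp + fileName.substr(dot);
}
```

The actual copy to the archive location is not shown; on Windows it could be done with the Win32 `CopyFile` function, using the name returned here as the destination file name.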
