Wget options for regular site updates

Warning: this is an utterly uninteresting post. However, I need to have this information available somewhere handy. Sorry for the noise!

The RINO Noord-Holland web site gets its course information updated from the RINO’s office system four times a day. The update is triggered by a cron entry on our office server, Lancelot.

Here is the cron entry with the Wget options that work nicely:

22 8,13,18,23 * * * raphael wget -q --spider http://www.rino.nl/UPDATE_FILE

Cron options:

  • 22: run at minute 22 of the hour
  • 8,13,18,23: run at 8 AM and 1, 6 and 11 PM (still at minute 22, of course)
  • *: run every day of the month
  • *: run every month
  • *: run every day of the week
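  • raphael: run as user raphael (this field only exists in the system-wide crontab format, such as /etc/crontab, not in per-user crontabs)
  • wget…: what to run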
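
To double-check my reading of the fields, here is a minimal Python sketch that splits the entry and lists the times it fires each day. The function names are my own invention, and it only understands `*` and comma-separated lists, which is all this entry uses; real cron also accepts ranges and steps.

```python
def split_entry(line):
    """Split a system-crontab line into its 5 schedule fields, user and command."""
    parts = line.split(None, 6)
    return parts[:5], parts[5], parts[6]


def expand_field(field, low, high):
    """Expand one cron field ('*' or a comma-separated list) into sorted ints."""
    if field == "*":
        return list(range(low, high + 1))
    return sorted(int(part) for part in field.split(","))


def daily_run_times(minute_field, hour_field):
    """Return the 'HH:MM' times at which the entry fires on any matching day."""
    minutes = expand_field(minute_field, 0, 59)
    hours = expand_field(hour_field, 0, 23)
    return [f"{h:02d}:{m:02d}" for h in hours for m in minutes]


entry = "22 8,13,18,23 * * * raphael wget -q --spider http://www.rino.nl/UPDATE_FILE"
schedule, user, command = split_entry(entry)
print(user)                                        # raphael
print(daily_run_times(schedule[0], schedule[1]))   # ['08:22', '13:22', '18:22', '23:22']
```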

Wget options:

  • -q: quiet, which avoids cron sending my Unix user an email
  • --spider: do not download the page, only check it’s there
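
For the curious, here is a rough Python equivalent of "only check it’s there". It assumes the check boils down to an HTTP HEAD request, which is my understanding of what Wget does in spider mode rather than something I have verified for every version. The helper names are made up for this sketch.

```python
import urllib.error
import urllib.request


def build_head_request(url):
    """Build a request that asks for the headers only, not the page body."""
    return urllib.request.Request(url, method="HEAD")


def page_exists(url, timeout=10):
    """Return True if the server answers the HEAD request with a 2xx status."""
    try:
        with urllib.request.urlopen(build_head_request(url), timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False
```

Calling `page_exists("http://www.rino.nl/UPDATE_FILE")` would play the same role as the Wget command in the cron entry: the request itself is what matters, not the downloaded content.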

For verification purposes, the file at UPDATE_FILE is responsible for writing logs. It makes more sense to log there because that’s where the actual work happens: knowing that the cron task ran is of little use if, for some reason (site down?), the update itself didn’t take place.
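
The idea of logging where the work happens can be sketched like this in Python. This is not the actual code behind UPDATE_FILE; the function name, the log path argument and the log line format are all assumptions made for illustration.

```python
from datetime import datetime


def run_logged_update(update, log_path):
    """Run `update` and append one line recording when it ran and how it ended."""
    started = datetime.now().isoformat(timespec="seconds")
    try:
        update()
        outcome = "OK"
    except Exception as error:
        # Record the failure instead of letting it vanish silently.
        outcome = f"FAILED: {error}"
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(f"{started} {outcome}\n")
    return outcome == "OK"
```

With something like this, a missing log line means the request never arrived (cron or the site was down), while a FAILED line means it arrived but the update broke.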

