Ticket #1845 (closed defect: fixed)
Opened 2013-04-27T13:34:10-05:00
Last modified 2013-04-28T18:13:57-05:00
Fix ImageJ 1.x mirror
| Reported by: | dscho | Owned by: | dscho | 
|---|---|---|---|
| Priority: | major | Milestone: |  | 
| Component: | Server Admin | Version: | |
| Severity: | serious | Keywords: | |
| Cc: | curtis, justin.senseney@… | Blocked By: | |
| Blocking: | #1705 | | |
Description (last modified by dscho)
Since we cannot use rsync, we set up a mirror script. To be nice, we tried to use HEAD requests whenever possible (but quite a few directories do not have index.html files, making HEAD requests impossible). The Jenkins job ran twice a day:
Unfortunately, this was still too much, and we were asked to download a single large .tgz file containing the complete set of files once every night instead.
So the mirror has to change yet again (this is the fifth iteration now).
The advantage now, of course, is that we get all the files that are there, not just the ones we can reach directly or indirectly via http://imagej.nih.gov/ij/index.html.
To make things a bit nicer for ourselves, let's put things into a Git repository.
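A minimal sketch of what such a nightly step could look like, not the actual mirror script: it assumes the .tgz has already been downloaded, replaces the working tree with its contents, and commits the result. The function names and the layout (one Git working tree per mirror) are hypothetical, and `git` is assumed to be on the PATH.

```python
import shutil
import subprocess
import tarfile
from pathlib import Path


def replace_tree_with_snapshot(tgz_path, repo_dir):
    """Replace everything in repo_dir except .git with the archive's contents."""
    repo = Path(repo_dir)
    repo.mkdir(parents=True, exist_ok=True)
    # Remove the old files first so that files deleted upstream vanish here too.
    for entry in repo.iterdir():
        if entry.name == ".git":
            continue
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
    root = repo.resolve()
    with tarfile.open(tgz_path, "r:gz") as archive:
        for member in archive.getmembers():
            # Refuse entries whose path would land outside the working tree.
            target = (root / member.name).resolve()
            if target != root and root not in target.parents:
                raise ValueError(f"unsafe path in archive: {member.name}")
        archive.extractall(root)
    return root


def commit_snapshot(repo_dir, message):
    """Stage all changes and commit them; return False if nothing changed."""
    subprocess.run(["git", "add", "--all"], cwd=repo_dir, check=True)
    # `git diff --cached --quiet` exits with 1 when something is staged.
    staged = subprocess.run(["git", "diff", "--cached", "--quiet"], cwd=repo_dir)
    if staged.returncode == 0:
        return False
    subprocess.run(["git", "commit", "-m", message], cwd=repo_dir, check=True)
    return True
```

With hypothetical paths, a nightly run would then be `replace_tree_with_snapshot("ij.tgz", "ij-mirror")` followed by `commit_snapshot("ij-mirror", "Nightly snapshot")`. Wiping the tree before extracting means every commit records exactly what was in that night's archive, including deletions.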
So this is what I have done so far:
To determine the best time for this to run (I was told to use "the off hours", with a hint that I should heed both US and EU), I checked the timestamp of the ij.tgz file. From an awfully small n, I deduce that the job is run by cron at half past midnight and that it runs for a little less than four minutes. My best bet was to leave our job at the time it used to run: five past one in the morning (local time, which is still one and a half hours after the cron job starts). I also removed the noon mirroring, which means that whenever there are changes on the website, the mirror is now out of date for most of the day.
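For reference, one way to check the archive's timestamp without downloading it is a HEAD request. This is only a sketch: it assumes the server sends a `Last-Modified` header for the file, the helper names are hypothetical, and the URL of ij.tgz has to be supplied by the caller.

```python
import urllib.request
from email.utils import parsedate_to_datetime


def parse_last_modified(value):
    """Turn an HTTP date such as 'Sat, 27 Apr 2013 04:34:00 GMT' into a datetime."""
    return parsedate_to_datetime(value)


def fetch_last_modified(url, timeout=30):
    """Send a HEAD request and return the Last-Modified time, or None if absent."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        value = response.headers.get("Last-Modified")
    return parse_last_modified(value) if value else None
```

The result is timezone-aware, so calling `.astimezone()` on it gives the time in the local zone of the machine running the check. Collecting it on a few consecutive days would give a slightly larger n for guessing when the archive is rebuilt.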
All of this can be found here: