checkpoint/restart
Neal Becker
ndbecker2 at gmail.com
Sun Mar 4 21:02:33 UTC 2007
One of the features I'd really like to see more widely used in the Linux
world is checkpoint/restart. I want it for long-running simulations,
although there are probably other uses.
Basic requirement:
1) Checkpoint a running process and be able to restart it on the same machine.
Restore open files, including shared memory mappings.
More advanced:
2) Migrate a process to another machine (probably with the same libraries).
3) Handle multiple threads/processes?
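To make requirement 1 concrete, here is a minimal sketch of what checkpoint/restart looks like when a simulation does it by hand, in Python. This is only an illustration of the idea, not how a system-level checkpointer works: the point of a tool that checkpoints the whole process is that the program does not have to save its own state. The names save_checkpoint, load_checkpoint and run_simulation, the toy state, and the stop_after parameter (which just simulates an interruption) are all made up for this example, and it does not attempt to restore open files or shared memory.

```python
import os
import pickle
import tempfile


def save_checkpoint(path, state):
    """Write state to path atomically, so a crash never leaves a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    # os.replace swaps the new file in over any older checkpoint.
    os.replace(tmp, path)


def load_checkpoint(path, default):
    """Return the saved state, or default if no checkpoint exists yet."""
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    except FileNotFoundError:
        return default


def run_simulation(path, total_steps, checkpoint_every=100, stop_after=None):
    """Toy simulation that resumes from the last checkpoint if one exists."""
    state = load_checkpoint(path, {"step": 0, "value": 0.0})
    done = 0
    while state["step"] < total_steps:
        if stop_after is not None and done >= stop_after:
            return state  # simulated interruption; later steps are not saved
        state["value"] += state["step"] * 0.5  # stand-in for real work
        state["step"] += 1
        done += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)  # final state
    return state
```

Calling run_simulation again with the same path after an interruption picks up from the last saved step, so only the work done since that checkpoint is repeated.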
I am pleased to find that blcr
http://ftg.lbl.gov/CheckpointRestart/CheckpointRestart.shtml
seems to do just what I need. It has an srpm and (with a minor change)
builds and runs fine on fc6 x86_64.
I have been following this area for the last several years. There have been
several starts at other checkpoint projects, but all except blcr seem to
have died.
I think this is really exciting technology and I'm betting others will want
to see this added to Fedora.