checkpoint/restart

Neal Becker ndbecker2 at gmail.com
Sun Mar 4 21:02:33 UTC 2007


One of the features I'd really like to see more widely used in the Linux
world is checkpoint/restart.  I want it for long-running simulations,
though there are probably other uses.

Basic requirement:
1) Checkpoint a running process and be able to restart it on the same
machine, restoring open files, including shared-memory mappings.

More advanced:
2) Migrate a process to another machine (probably with same libraries).
3) Handle multiple threads/processes?
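
To make requirement 1 concrete for the long-running-simulation case, here is a
minimal sketch of application-level checkpointing in Python.  It is only an
illustration under stated assumptions: the function names, the state layout and
the toy update rule are all hypothetical, and it saves only the state the
program chooses to save.  A system-level checkpointer would instead capture the
whole process (memory, open files, mappings) without the program's cooperation.

```python
import os
import pickle
import tempfile


def save_checkpoint(state, path):
    """Write state atomically: dump to a temp file, then rename over path."""
    dir_name = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=dir_name)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    # os.replace is atomic on the same filesystem, so a crash never
    # leaves a half-written checkpoint at `path`.
    os.replace(tmp, path)


def load_checkpoint(path):
    """Return the saved state, or None if no checkpoint exists yet."""
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    except FileNotFoundError:
        return None


def run(total_steps, path, every=100, stop_after=None):
    """Toy simulation (hypothetical): resume from `path` if a checkpoint
    exists, and write a new checkpoint every `every` steps."""
    state = load_checkpoint(path) or {"step": 0, "value": 0.0}
    while state["step"] < total_steps:
        state["value"] += 0.5 * state["step"]  # stand-in for real work
        state["step"] += 1
        if state["step"] % every == 0:
            save_checkpoint(state, path)
        if stop_after is not None and state["step"] >= stop_after:
            return state  # simulate an interruption
    return state
```

Calling run(1000, "sim.ckpt") a second time after an interruption picks up from
the last saved step rather than from zero.  The drawback, and the reason a
transparent tool is attractive, is that every program has to be written this
way by hand.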

I am pleased to find that BLCR (Berkeley Lab Checkpoint/Restart)
http://ftg.lbl.gov/CheckpointRestart/CheckpointRestart.shtml

seems to do just what I need.  It has an SRPM and (with a minor change)
builds and runs fine on FC6 x86_64.

I have been following this area for the last several years.  There have been
several starts at other checkpoint projects, but all except BLCR seem to
have died.

I think this is really exciting technology and I'm betting others will want
to see this added to Fedora. 