
Re: InstantMirror needs a rethink

Ian Burrell wrote:
On Jan 23, 2008 4:02 PM, Warren Togami <wtogami redhat com> wrote:
- Synchronization/locking of multiple connections downloading the same
file is awkward and broken.

I think a locking scheme on the files could solve this problem.  The
normal file would always be the complete downloaded file.  The first
downloading process would create the temp file and lock it.  When it
finishes, it moves it to the real file and unlocks it.  Any other
downloading processes see the locked temp file and wait for it to be
unlocked.  An unlocked temp file indicates a download failure.
Waiters would have to start over if there was a failure.  Things would
be more complicated if we want the waiters to stream the partial file
as it is downloaded.
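
The scheme above could be sketched roughly as follows. This is only an
illustration, not InstantMirror code: it assumes a Unix host where
fcntl.flock() is available, and the names fetch_to_cache, the ".tmp"
suffix and the download callable are all made up for the example.

```python
import fcntl
import os


def fetch_to_cache(final_path, download):
    """Return final_path once a complete copy of the file exists there.

    `download` is a hypothetical callable that writes the upstream body
    into the open file object it is given and raises on failure.
    """
    tmp_path = final_path + ".tmp"  # assumed naming convention
    while not os.path.exists(final_path):
        fd = os.open(tmp_path, os.O_RDWR | os.O_CREAT, 0o644)
        with os.fdopen(fd, "r+b") as tmp:
            # Blocks while another process holds the lock, i.e. while a
            # download of this file is already in progress.
            fcntl.flock(tmp, fcntl.LOCK_EX)
            if os.path.exists(final_path):
                # The previous lock holder finished while we waited.
                # (A late arrival may leave an empty tmp file behind;
                # a cleanup pass can remove it.)
                break
            # We hold the lock and there is no complete file, so either
            # we are first or the previous attempt failed: start over.
            tmp.seek(0)
            tmp.truncate()
            download(tmp)
            tmp.flush()
            # Publish the complete file; rename is atomic on one filesystem.
            os.rename(tmp_path, final_path)
        # Leaving the with-block closes the file, which drops the lock.
    return final_path
```

Waiters here simply block until the lock is released and then either
find the finished file or retry the download themselves. Streaming the
partial file to waiters is not attempted.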

- There is no good way to clean up aborted tmp files.
- There is no good way to know what are old files that need pruning.
- There is no good way of keeping track of the "Big Picture" of its own
cache: a "least recently used" view of which files were unpopular locally
and should be pruned.

These could be solved with a cache cleaning script.  The script would
remove aborted (i.e. unlocked) tmp files.  The least-recently-used files
can be determined from the atime of the files.
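
A rough sketch of such a cleaning script is below. It is an
illustration under assumptions: tmp files are taken to end in ".tmp"
and to be held under flock() while a download is running, and the
function names and the max_bytes budget are invented for the example.
Note also that atime is only useful if the cache filesystem is not
mounted with noatime.

```python
import fcntl
import os


def remove_aborted_tmp(cache_dir, suffix=".tmp"):
    """Delete tmp files that nobody holds a lock on; return their paths."""
    removed = []
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            if not name.endswith(suffix):
                continue
            path = os.path.join(root, name)
            with open(path, "r+b") as f:
                try:
                    # Non-blocking: fails at once if a downloader holds it.
                    fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
                except OSError:
                    continue  # locked, so a download is still running
                os.remove(path)
                removed.append(path)
    return removed


def prune_lru(cache_dir, max_bytes, suffix=".tmp"):
    """Delete least-recently-accessed files until the cache fits max_bytes."""
    entries = []
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            if name.endswith(suffix):
                continue  # in-progress or aborted downloads are not pruned here
            path = os.path.join(root, name)
            st = os.stat(path)
            entries.append((st.st_atime, st.st_size, path))
    total = sum(size for _atime, size, _path in entries)
    removed = []
    for _atime, size, path in sorted(entries):  # oldest atime first
        if total <= max_bytes:
            break
        os.remove(path)
        total -= size
        removed.append(path)
    return removed
```

Run from cron, something like remove_aborted_tmp(cache_dir) followed by
prune_lru(cache_dir, max_bytes) would cover both the aborted tmp files
and the pruning of unpopular files.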

An alternative is to store some metadata about the cached files in a
separate database.  Berkeley DB or SQLite would work as would
per-directory or per-file data files.  This would make the most sense
if the ETag and Last-Modified time need to be stored for the caching
to work correctly.  It could also store the last-accessed-time.
Locking on the entries would be required and that would provide the
locking for simultaneous downloads.
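
For the SQLite variant, a minimal sketch of the metadata store might
look like this. The table name, column names and helper functions are
all hypothetical, and the sketch deliberately leaves out the per-entry
locking mentioned above; it only shows where the validators and the
last-accessed time would live.

```python
import sqlite3
import time


def open_metadata(db_path):
    """Open (and create if needed) the hypothetical cache metadata table."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS cache_entry (
               path TEXT PRIMARY KEY,
               etag TEXT,
               last_modified TEXT,
               last_accessed REAL NOT NULL
           )"""
    )
    conn.commit()
    return conn


def record_download(conn, path, etag, last_modified, now=None):
    """Store the upstream validators after a completed download."""
    now = time.time() if now is None else now
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO cache_entry VALUES (?, ?, ?, ?)",
            (path, etag, last_modified, now),
        )


def touch(conn, path, now=None):
    """Record that a cached file was just served."""
    now = time.time() if now is None else now
    with conn:
        conn.execute(
            "UPDATE cache_entry SET last_accessed = ? WHERE path = ?",
            (now, path),
        )


def validators(conn, path):
    """Return (etag, last_modified) for a revalidation request, or None."""
    return conn.execute(
        "SELECT etag, last_modified FROM cache_entry WHERE path = ?",
        (path,),
    ).fetchone()


def least_recently_used(conn, limit):
    """Return up to `limit` paths, oldest access first, as pruning candidates."""
    rows = conn.execute(
        "SELECT path FROM cache_entry ORDER BY last_accessed ASC LIMIT ?",
        (limit,),
    )
    return [row[0] for row in rows]
```

Keeping last_accessed in the database means pruning does not depend on
filesystem atime, at the cost of one extra write per request.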

 - Ian

These might be good ideas, and many of them were theorized in the InstantMirror wiki. If you want to continue InstantMirror development, please submit patches to the hosted project. If they look sane, maybe you can take over development.

