After the copy, there is a directory size mismatch.

Cameron Simpson cs at zip.com.au
Mon Jun 5 05:36:56 UTC 2006


On 05Jun2006 09:14, nilesh vaghela <nileshj.vaghela at gmail.com> wrote:
| It is also possible, though unlikely, that there were "sparse" files in
| your original directory; they may not be sparse in the new copy.
| ----------------------
| In that case can we compare the word count or character count?

As for telling whether a file is sparse by comparing word or character
counts: no, the counts will be the same for the original and the copy
(after all, a copy of a directory where the files do not have the same
content as the original is not a very useful copy, is it?). Sparse files
are very rare; they are files which have a large logical size, but have
only ever been written to at specific places in the file.  You can imagine
a data hash table where only particular slots have data in them yet.

On UNIX, you create a sparse file by opening the file, seek()ing a long
way into it and then writing something. The OS allocates some storage to
hold the data you have written, but no storage for the area of the file
before it. When you read those areas they appear to be blocks of zeroes.
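
As a rough illustration of that seek-then-write pattern, here is a minimal
Python sketch. The helper names (make_sparse_file, read_prefix) and the
1 MiB offset are made up for the example, and whether the skipped region
really goes unallocated depends on the filesystem, so treat it as a
demonstration rather than a guarantee.

```python
import os
import tempfile


def make_sparse_file(path, offset, payload=b"end"):
    """Create a file, seek `offset` bytes in, and write a small payload.

    Nothing is written for the skipped region; on filesystems that
    support sparse files, no storage is allocated for it either.
    Returns the logical size of the resulting file.
    """
    with open(path, "wb") as f:
        f.seek(offset)
        f.write(payload)
    return os.path.getsize(path)


def read_prefix(path, n):
    """Read the first `n` bytes of the file."""
    with open(path, "rb") as f:
        return f.read(n)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sparse.dat")
        size = make_sparse_file(path, 1024 * 1024)
        print("logical size:", size)
        # The region that was never written reads back as zero bytes.
        print("first 16 bytes:", read_prefix(path, 16))
```

The file reports its full logical size even though only the last few
bytes were ever written.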

You can't really tell from the outside whether a file is sparse or
genuinely has lots of blocks of storage filled with zero bytes - the
behaviour is the same. This is why a copy will normally not be sparse;
the copy program just reads and writes data, unaware that a lot of the
empty data from the sparse file is being made up by the OS on demand,
not coming from a disc block.

The only real clue comes from things like the "du" command, which reads
the "blocks" field of the inode; that reports the actual storage consumed
by the file rather than just its size.
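
The same "blocks" field can be read from a script. The Python sketch
below compares a file's logical size with its allocated storage via
os.stat(). It assumes a UNIX system where st_blocks is available and
counted in 512-byte units (as Python documents it); the function names
are hypothetical. A file with less storage than its size is only
probably sparse, since filesystem compression or delayed allocation can
produce the same numbers.

```python
import os
import tempfile

BLOCK_UNIT = 512  # st_blocks is counted in 512-byte units (assumption: UNIX)


def storage_report(path):
    """Return (logical_size, allocated_bytes) for a file."""
    st = os.stat(path)
    return st.st_size, st.st_blocks * BLOCK_UNIT


def looks_sparse(path):
    """Heuristic: True if less storage is allocated than the logical size."""
    size, allocated = storage_report(path)
    return allocated < size


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "holey.dat")
        with open(path, "wb") as f:
            f.seek(10 * 1024 * 1024)  # skip 10 MiB without writing
            f.write(b"x")
        size, allocated = storage_report(path)
        print("logical size:", size)
        print("allocated   :", allocated)
        print("looks sparse:", looks_sparse(path))
```

Running this over both the original and the copied directory would show
whether any files occupy less storage on one side than the other.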

However, it is unlikely that you have any sparse files.

Most likely the discrepancies you see are due to the internal structure
of the directories themselves.

Cheers,
-- 
Cameron Simpson <cs at zip.com.au> DoD#743
http://www.cskk.ezoshosting.com/cs/

The top three answers:  Yes I *am* going to a fire!
                        Oh! We're using *kilometers* per hour now.
                        I have to go that fast to get back to my own time.
- Peter Harper <bo165 at FreeNet.Carleton.CA>



