From y-takahashi at gmo-hs.com  Fri Mar  4 06:52:01 2011
From: y-takahashi at gmo-hs.com (GMO-HS Yoichi Takahashi)
Date: Fri, 04 Mar 2011 15:52:01 +0900
Subject: minus disk usage
Message-ID: <20110304155201.895B.A9C031E0@gmo-hs.com>

Hi, this is Yoichi Takahashi.

I have a problem with an ext3 filesystem.
The output changes every time the df command is executed (at short intervals).
Sometimes Used is displayed as a negative value, and sometimes it looks normal.
See /dev/sda2 below:

Filesystem          Size  Used Avail Use% Mounted on
/dev/sda2              97G -345M   93G   0% /
/dev/sda1              99M   15M   80M  16% /boot
tmpfs                 2.0G     0  2.0G   0% /dev/shm
/dev/sda3             803G  2.5G  759G   1% /home

Filesystem          Size  Used Avail Use% Mounted on
/dev/sda2              97G  1.2G   91G   2% /
/dev/sda1              99M   15M   80M  16% /boot
tmpfs                 2.0G     0  2.0G   0% /dev/shm
/dev/sda3             803G  2.5G  759G   1% /home

Filesystem          Size  Used Avail Use% Mounted on
/dev/sda2              97G  448M   92G   1% /
/dev/sda1              99M   15M   80M  16% /boot
tmpfs                 2.0G     0  2.0G   0% /dev/shm
/dev/sda3             803G  2.5G  759G   1% /home

Filesystem          Size  Used Avail Use% Mounted on
/dev/sda2              97G -109M   92G   0% /
/dev/sda1              99M   15M   80M  16% /boot
tmpfs                 2.0G     0  2.0G   0% /dev/shm
/dev/sda3             803G  2.5G  759G   1% /home

The server is always under high load.
The load average is always 3 or 4 on this server.

Does anyone know why this happens?
Any ideas would be appreciated.


Youichi Takahashi

Cerulean Tower
26-1 Sakuragaoka-cho, Shibuya-ku, Tokyo (150-8512) Japan

TEL                 +81-3-6415-7075
FAX                 +81-3-6415-6108
E-MAIL              y-takahashi at gmo-hs.com
URL                 http://www.gmo-hs.com
STOCK CODE          3788


From scerveau at awox.com  Fri Mar  4 10:52:35 2011
From: scerveau at awox.com (Stephane Cerveau)
Date: Fri, 4 Mar 2011 11:52:35 +0100
Subject: ext3_free_blocks_sb when removing a more than 1GB file
In-Reply-To: <B45A1386-C838-4A69-AB63-39612158C465@dilger.ca>
References: <B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B16F@TENERIFE.awox.com>
	<B45A1386-C838-4A69-AB63-39612158C465@dilger.ca>
Message-ID: <B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B437@TENERIFE.awox.com>

Dear all,


I'm formatting a particular MSC (USB mass-storage) key as ext3. The key is 4GB.

When I copy a file larger than 1GB (1024MB) to it, sync the copy, and then try to remove the file, I get many errors like "Ext3-fs error ( device sda1): ext3_free_blocks_sb: bit already cleared for block xxxx".

Do you know what could cause this kind of error?

Why does the problem appear only on this kind of key?

What can I do to identify the problem?

It seems that there is no error in the FS (e2fsck completed successfully) before removing the key.


Best regards.



From scerveau at awox.com  Fri Mar  4 15:45:39 2011
From: scerveau at awox.com (Stephane Cerveau)
Date: Fri, 4 Mar 2011 16:45:39 +0100
Subject: ext3_free_blocks_sb when removing a more than 1GB file
In-Reply-To: <B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B437@TENERIFE.awox.com>
References: <B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B16F@TENERIFE.awox.com>
	<B45A1386-C838-4A69-AB63-39612158C465@dilger.ca>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B437@TENERIFE.awox.com>
Message-ID: <B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B47D@TENERIFE.awox.com>

Dear all,


It seems that if I change the block size to 2048 with mkfs.ext3 -b 2048 /dev/sda1, the problem does not appear.


Is there a way to know in advance what the best block size is for an external device?

BR


Stephane


__________ Information from ESET NOD32 Antivirus, version of virus signature database 5924 (20110303) __________

The message was checked by ESET NOD32 Antivirus.

http://www.eset.com



From sandeen at redhat.com  Fri Mar  4 16:26:12 2011
From: sandeen at redhat.com (Eric Sandeen)
Date: Fri, 04 Mar 2011 10:26:12 -0600
Subject: ext3_free_blocks_sb when removing a more than 1GB file
In-Reply-To: <B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B47D@TENERIFE.awox.com>
References: <B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B16F@TENERIFE.awox.com>	<B45A1386-C838-4A69-AB63-39612158C465@dilger.ca>	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B437@TENERIFE.awox.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B47D@TENERIFE.awox.com>
Message-ID: <4D7112A4.6050209@redhat.com>

On 3/4/11 9:45 AM, Stephane Cerveau wrote:
> Dear all,
> 
> It seems that if I change the block size to 2048 with mkfs.ext3 -b
> 2048 /dev/sda1, the problem does not appear.
> 
> Is there a way to know in advance what the best block size is for an
> external device?
> 
> BR

Sounds like a storage problem; not a filesystem problem, something
to do with the flash behaving badly.

-Eric


From scerveau at awox.com  Fri Mar  4 17:33:23 2011
From: scerveau at awox.com (Stephane Cerveau)
Date: Fri, 4 Mar 2011 18:33:23 +0100
Subject: ext3_free_blocks_sb when removing a more than 1GB file
In-Reply-To: <4D7112A4.6050209@redhat.com>
References: <B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B16F@TENERIFE.awox.com>
	<B45A1386-C838-4A69-AB63-39612158C465@dilger.ca>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B437@TENERIFE.awox.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B47D@TENERIFE.awox.com>
	<4D7112A4.6050209@redhat.com>
Message-ID: <B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B49A@TENERIFE.awox.com>

Hello,


I checked the storage, and it seems to be OK (bad-block check), but I still have the problem.
I ran the same test on vfat and I don't have the problem.
I'm using a 2.6.23 kernel.
Since when has the ext3 fs been considered stable?

BR



From sandeen at redhat.com  Fri Mar  4 17:44:38 2011
From: sandeen at redhat.com (Eric Sandeen)
Date: Fri, 04 Mar 2011 11:44:38 -0600
Subject: ext3_free_blocks_sb when removing a more than 1GB file
In-Reply-To: <B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B49A@TENERIFE.awox.com>
References: <B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B16F@TENERIFE.awox.com>	<B45A1386-C838-4A69-AB63-39612158C465@dilger.ca>	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B437@TENERIFE.awox.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B47D@TENERIFE.awox.com>
	<4D7112A4.6050209@redhat.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B49A@TENERIFE.awox.com>
Message-ID: <4D712506.4070405@redhat.com>

On 3/4/11 11:33 AM, Stephane Cerveau wrote:
> Hello,
> 
> 
> I checked the storage, and it seems to be OK (bad-block check), but I
> still have the problem. I ran the same test on vfat and I don't have the
> problem. I'm using a 2.6.23 kernel. Since when has the ext3 fs been
> considered stable?

I don't mean that it's an IO error or a bad block, but perhaps a behavioral
problem with the flash; maybe not syncing properly before it's powered
off, etc.  If it only happens with some USB drives, I do not think
it is an ext3 issue.

Your original problem report may not have been totally clear so maybe
I misunderstand.

Can you show exactly what you did, and exactly what error messages
you received, keeping in mind that any IO type errors that don't
say "ext3" are also relevant?

-Eric



From scerveau at awox.com  Fri Mar  4 17:54:23 2011
From: scerveau at awox.com (Stephane Cerveau)
Date: Fri, 4 Mar 2011 18:54:23 +0100
Subject: ext3_free_blocks_sb when removing a more than 1GB file
In-Reply-To: <4D712506.4070405@redhat.com>
References: <B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B16F@TENERIFE.awox.com>
	<B45A1386-C838-4A69-AB63-39612158C465@dilger.ca>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B437@TENERIFE.awox.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B47D@TENERIFE.awox.com>
	<4D7112A4.6050209@redhat.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B49A@TENERIFE.awox.com>
	<4D712506.4070405@redhat.com>
Message-ID: <B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B49D@TENERIFE.awox.com>

Hi,

Thanks for your answer.
Here are my steps:

- mkfs.ext3 /dev/sda1
- mount /dev/sda1 /mnt/usb
- dd if=/dev/zero of=/mnt/usb/test_file bs=1M count=1025   (the size is important)
- sync
- rm /mnt/usb/test_file

Then many errors appear: "Ext3-fs error ( device sda1): ext3_free_blocks_sb: bit already cleared for block xxxx"

I tried to umount/mount the storage, but the error still occurs.
I tried to check the device before removing the file, and the error still occurs.
With another usb key, however, it works...
I'm using kernel 2.6.23.

The problem does NOT appear if I run mkfs.ext2 /dev/sda1 beforehand.

What do you advise?

BR

Stephane.




From sandeen at redhat.com  Fri Mar  4 17:55:52 2011
From: sandeen at redhat.com (Eric Sandeen)
Date: Fri, 04 Mar 2011 11:55:52 -0600
Subject: minus disk usage
In-Reply-To: <20110304155201.895B.A9C031E0@gmo-hs.com>
References: <20110304155201.895B.A9C031E0@gmo-hs.com>
Message-ID: <4D7127A8.6070302@redhat.com>

On 3/4/11 12:52 AM, GMO-HS Yoichi Takahashi wrote:
> Hi, this is Yoichi Takahashi.
> 
> I have a problem with an ext3 filesystem.
> The output changes every time the df command is executed (at short intervals).
> Sometimes Used is displayed as a negative value, and sometimes it looks normal.
> See /dev/sda2 below:
> 
> Filesystem          Size  Used Avail Use% Mounted on
> /dev/sda2              97G -345M   93G   0% /
> /dev/sda1              99M   15M   80M  16% /boot
> tmpfs                 2.0G     0  2.0G   0% /dev/shm
> /dev/sda3             803G  2.5G  759G   1% /home
> 
> Filesystem          Size  Used Avail Use% Mounted on
> /dev/sda2              97G  1.2G   91G   2% /
> /dev/sda1              99M   15M   80M  16% /boot
> tmpfs                 2.0G     0  2.0G   0% /dev/shm
> /dev/sda3             803G  2.5G  759G   1% /home
> 
> Filesystem          Size  Used Avail Use% Mounted on
> /dev/sda2              97G  448M   92G   1% /
> /dev/sda1              99M   15M   80M  16% /boot
> tmpfs                 2.0G     0  2.0G   0% /dev/shm
> /dev/sda3             803G  2.5G  759G   1% /home
> 
> Filesystem          Size  Used Avail Use% Mounted on
> /dev/sda2              97G -109M   92G   0% /
> /dev/sda1              99M   15M   80M  16% /boot
> tmpfs                 2.0G     0  2.0G   0% /dev/shm
> /dev/sda3             803G  2.5G  759G   1% /home
> 
> The server is always under high load.
> The load average is always 3 or 4 on this server.
> 
> Does anyone know why this happens?
> Any ideas would be appreciated.
> 

It may be changing because files are being added & removed
at the time?  As for the negative...

What kernel and what coreutils are you using?

stat -f / would let us know what the kernel is returning;
I am guessing that this is a bug in coreutils related to
the handling of reserved-for-root space...

If you can try the test again, but do:

# df -B 4096 /; stat -f /

(assuming 4k fs blocksize)

a few times, and see what those return.
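
For readers comparing the two outputs: df does not get "Used" from the kernel. It derives it from the statfs counters, roughly total blocks minus free blocks, so a free count that is momentarily larger than the total would be printed as a negative number. The sketch below is only an illustration of that arithmetic in Python. The function names are made up for this note, and the exact calculation in any particular coreutils version is an assumption.

```python
import os

def usage_in_blocks(f_blocks, f_bfree, f_bavail):
    """Derive df-style columns (in filesystem blocks) from raw statfs counters."""
    used = f_blocks - f_bfree       # negative whenever free is reported > total
    avail = f_bavail                # free blocks usable by non-root users
    reserved = f_bfree - f_bavail   # blocks held back for root
    return {"used": used, "avail": avail, "reserved": reserved}

def report(path="/"):
    """Read the counters for a mounted filesystem (the same data stat -f shows)."""
    st = os.statvfs(path)
    cols = usage_in_blocks(st.f_blocks, st.f_bfree, st.f_bavail)
    cols["block_size"] = st.f_frsize
    return cols

if __name__ == "__main__":
    print(report("/"))
```

Running report() a few times in a row on the affected server would show whether the raw counters themselves jump around, or whether only the df display does.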

-Eric


From sandeen at redhat.com  Fri Mar  4 17:59:08 2011
From: sandeen at redhat.com (Eric Sandeen)
Date: Fri, 04 Mar 2011 11:59:08 -0600
Subject: ext3_free_blocks_sb when removing a more than 1GB file
In-Reply-To: <B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B49D@TENERIFE.awox.com>
References: <B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B16F@TENERIFE.awox.com>	<B45A1386-C838-4A69-AB63-39612158C465@dilger.ca>	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B437@TENERIFE.awox.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B47D@TENERIFE.awox.com>
	<4D7112A4.6050209@redhat.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B49A@TENERIFE.awox.com>
	<4D712506.4070405@redhat.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B49D@TENERIFE.awox.com>
Message-ID: <4D71286C.2080408@redhat.com>

On 3/4/11 11:54 AM, Stephane Cerveau wrote:
> Hi,
> 
> Thanks for your answer.
> Here are my steps:
> 
> - mkfs.ext3 /dev/sda1
> - mount /dev/sda1 /mnt/usb
> - dd if=/dev/zero of=/mnt/usb/test_file bs=1M count=1025   (the size is important)
> - sync
> - rm /mnt/usb/test_file

Ok, I had the impression that you were removing the usb key at
some point in the test, but I guess not.

> Then many errors appear: "Ext3-fs error ( device sda1): ext3_free_blocks_sb: bit already cleared for block xxxx"
> 
> I tried to umount/mount the storage, but the error still occurs.
> I tried to check the device before removing the file, and the error still occurs.

You mean that umount/mount/rm gives the same error?  As does umount/fsck/mount/rm?

> With another usb key, however, it works...
> I'm using kernel 2.6.23.
> 
> The problem does NOT appear if I run mkfs.ext2 /dev/sda1 beforehand.
> 
> What do you advise?

Try a much newer kernel, first of all, to see if it's a known, fixed bug.

But since it works on another usb key, I still tend to blame the hardware.
"bit already cleared" makes it sound like it is reading zeros when it
should not be.

-Eric
 
> BR
> 
> Stephane.


From scerveau at awox.com  Fri Mar  4 18:07:34 2011
From: scerveau at awox.com (Stephane Cerveau)
Date: Fri, 4 Mar 2011 19:07:34 +0100
Subject: ext3_free_blocks_sb when removing a more than 1GB file
In-Reply-To: <4D71286C.2080408@redhat.com>
References: <B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B16F@TENERIFE.awox.com>
	<B45A1386-C838-4A69-AB63-39612158C465@dilger.ca>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B437@TENERIFE.awox.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B47D@TENERIFE.awox.com>
	<4D7112A4.6050209@redhat.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B49A@TENERIFE.awox.com>
	<4D712506.4070405@redhat.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B49D@TENERIFE.awox.com>
	<4D71286C.2080408@redhat.com>
Message-ID: <B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B4A0@TENERIFE.awox.com>

:) I'm not removing the key during the test.

Yes, "umount/mount/rm", "umount/fsck/mount/rm" and plain "rm" all give the same error...

I have several keys of the same brand and model, and they all have the same issue.

When I said a different key, I meant a different brand.

In the end, it seems that ext2 is working fine!

So maybe it is a problem in ext3 in the 2.6.23 kernel?!?
I tried 2.6.32_27 and could not reproduce the issue.

Do you know since when ext3 is supposed to be stable?

BR

Stephane




From sandeen at redhat.com  Fri Mar  4 18:13:45 2011
From: sandeen at redhat.com (Eric Sandeen)
Date: Fri, 04 Mar 2011 12:13:45 -0600
Subject: ext3_free_blocks_sb when removing a more than 1GB file
In-Reply-To: <B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B4A0@TENERIFE.awox.com>
References: <B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B16F@TENERIFE.awox.com>	<B45A1386-C838-4A69-AB63-39612158C465@dilger.ca>	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B437@TENERIFE.awox.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B47D@TENERIFE.awox.com>
	<4D7112A4.6050209@redhat.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B49A@TENERIFE.awox.com>
	<4D712506.4070405@redhat.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B49D@TENERIFE.awox.com>
	<4D71286C.2080408@redhat.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B4A0@TENERIFE.awox.com>
Message-ID: <4D712BD9.7090000@redhat.com>

On 3/4/11 12:07 PM, Stephane Cerveau wrote:
> :) I'm not removing the key during the test.
> 
> Yes, "umount/mount/rm", "umount/fsck/mount/rm" and plain "rm" all give the same error...
> 
> I have several keys of the same brand and model, and they all have the same issue.
> 
> When I said a different key, I meant a different brand.
> 
> In the end, it seems that ext2 is working fine!

It may well have different IO patterns.

> So maybe it is a problem in ext3 in the 2.6.23 kernel?!?
> I tried 2.6.32_27 and could not reproduce the issue.
> 
> Do you know since when ext3 is supposed to be stable?

heh, 2.4.x or so.

You could bisect kernel versions and see if you can arrive at a change
that fixed it.
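
As a rough sketch of what such a bisection looks like, here is a binary search over an ordered list of kernel versions. Everything in it is hypothetical: the version list is whatever you can build, and reproduces() stands for "boot that kernel, run the dd/sync/rm test, and report whether the error appeared". It also assumes that the oldest version shows the bug, the newest does not, and the bug does not come back once fixed.

```python
def first_fixed(versions, reproduces):
    """Binary-search an ordered list for the first version where the bug is gone.

    versions   -- list ordered oldest to newest; versions[0] is known bad,
                  versions[-1] is known good (assumptions of this sketch)
    reproduces -- callable returning True if the bug shows up on that version
    """
    lo, hi = 0, len(versions) - 1    # lo: last known bad, hi: first known good
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if reproduces(versions[mid]):
            lo = mid                 # still broken here, the fix came later
        else:
            hi = mid                 # already fixed here, the fix is at or before mid
    return versions[hi]
```

With ten candidate kernels this needs three or four test boots rather than ten. The same idea applies at commit granularity with git bisect once the range is down to two adjacent releases.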

I'd also find some tools to do more extensive IO testing on your usb
key, I still think it might be a hardware problem since only some
brand/models are affected.

-Eric



From adilger at dilger.ca  Fri Mar  4 22:17:55 2011
From: adilger at dilger.ca (Andreas Dilger)
Date: Fri, 4 Mar 2011 15:17:55 -0700
Subject: ext3_free_blocks_sb when removing a more than 1GB file
In-Reply-To: <B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B4A0@TENERIFE.awox.com>
References: <B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B16F@TENERIFE.awox.com>
	<B45A1386-C838-4A69-AB63-39612158C465@dilger.ca>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B437@TENERIFE.awox.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B47D@TENERIFE.awox.com>
	<4D7112A4.6050209@redhat.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B49A@TENERIFE.awox.com>
	<4D712506.4070405@redhat.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B49D@TENERIFE.awox.com>
	<4D71286C.2080408@redhat.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B4A0@TENERIFE.awox.com>
Message-ID: <38CFAFAE-F09D-4FE7-AFD1-C79D2844144C@dilger.ca>

On 2011-03-04, at 11:07 AM, Stephane Cerveau wrote:
> I have several keys of the same brand and model, and they all have the same issue.
> 
> When I said a different key, I meant a different brand.

I would typically blame the USB key.  Some cheap vendors use unreliable chips, and sometimes even mis-label e.g. 1GB flash as 2GB.

> In the end, it seems that ext2 is working fine!

Except I don't think ext2 is doing this bitmap validation at runtime, like ext3/4 is doing.

I'm not sure whether "badblocks" is verifying that the storage is behaving correctly (i.e. correct block addressing), or only whether it is able to write/read a particular sector on disk.

You could use a more advanced block device verification tool, like llverdev from Lustre, which writes a unique test pattern to every block, and then reads it back afterward.
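
To illustrate the idea of a unique pattern per block (this is not llverdev and does not reflect how it is implemented), here is a minimal Python sketch. Each block is stamped with its own block number, so a device that silently wraps addresses or returns data from the wrong place fails verification even though every individual read and write succeeds. The 4096-byte block size and the function names are assumptions of this sketch, and write_pass() destroys whatever is stored at the given path.

```python
import os
import struct

BLOCK = 4096  # assumed test block size

def pattern(index, seed):
    """Build a block whose content encodes its own index, so misplaced data is detectable."""
    stamp = struct.pack("<QQ", seed, index)   # 16 bytes: run seed + block number
    return stamp * (BLOCK // len(stamp))

def write_pass(path, nblocks, seed):
    """Overwrite the first nblocks blocks of path with their patterns (destructive)."""
    with open(path, "r+b") as dev:
        for i in range(nblocks):
            dev.seek(i * BLOCK)
            dev.write(pattern(i, seed))
        dev.flush()
        os.fsync(dev.fileno())

def verify_pass(path, nblocks, seed):
    """Return the indices of blocks that do not read back their own pattern."""
    bad = []
    with open(path, "rb") as dev:
        for i in range(nblocks):
            dev.seek(i * BLOCK)
            if dev.read(BLOCK) != pattern(i, seed):
                bad.append(i)
    return bad
```

On a real device the verify pass should not be served from the page cache, so one would unplug and replug the key (or otherwise drop caches) between the two passes. A key that is mis-labelled as larger than its real flash would typically show a long run of bad indices starting near the true capacity.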

> So maybe it is a problem in ext3 in the 2.6.23 kernel?!?
> I tried 2.6.32_27 and could not reproduce the issue.
> 
> Do you know since when ext3 is supposed to be stable?

For 10+ years already.

> _______________________________________________
> Ext3-users mailing list
> Ext3-users at redhat.com
> https://www.redhat.com/mailman/listinfo/ext3-users


Cheers, Andreas


From alex at alex.org.uk  Sat Mar  5 09:21:38 2011
From: alex at alex.org.uk (Alex Bligh)
Date: Sat, 05 Mar 2011 09:21:38 +0000
Subject: ext3_free_blocks_sb when removing a more than 1GB file
In-Reply-To: <B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B49D@TENERIFE.awox.com>
References: <B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B16F@TENERIFE.awox.com>
	<B45A1386-C838-4A69-AB63-39612158C465@dilger.ca>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B437@TENERIFE.awox.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B47D@TENERIFE.awox.com>
	<4D7112A4.6050209@redhat.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B49A@TENERIFE.awox.com>
	<4D712506.4070405@redhat.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B49D@TENERIFE.awox.com>
Message-ID: <026CC28AC9A32E0A7FD2D20F@nimrod.local>


--On 4 March 2011 18:54:23 +0100 Stephane Cerveau <scerveau at awox.com> wrote:

> Then many errors appear: "Ext3-fs error ( device sda1):
> ext3_free_blocks_sb: bit already cleared for block xxxx"
>
> I tried to umount/mount the storage, but the error still occurs.
> I tried to check the device before removing the file, and the error still occurs.
> With another usb key, however, it works...
> I'm using kernel 2.6.23.

If it's that old, perhaps it is
 http://lkml.org/lkml/2008/11/14/121
fixed by
 http://www.kernel.org/pub/linux/kernel/v2.6/ChangeLog-2.6.29
in 2.6.29
 commit 7ef0d7377cb287e08f3ae94cebc919448e1f5dff
I think.

I am interested in this particular error. We see it very occasionally
on 2.6.31 in an environment where we can be sure no underlying I/O
error occurred (because it's on a VM whose dom0 uses iSCSI mapped
to the domU's disk) and we would see error logging. It is normally
during intense disk activity (unlike the OP), such as running
"aptitude update", often while unlinking a file. It does not
appear to happen on ext4. Unfortunately the result is that the disk
goes readonly. Our current theory is that the disk got
damaged in some way during a previous unclean shutdown that fsck
did not fix. Is that possible?

-- 
Alex Bligh


From scerveau at awox.com  Sat Mar  5 15:52:33 2011
From: scerveau at awox.com (Stephane Cerveau)
Date: Sat, 5 Mar 2011 16:52:33 +0100
Subject: ext3_free_blocks_sb when removing a more than 1GB file
In-Reply-To: <38CFAFAE-F09D-4FE7-AFD1-C79D2844144C@dilger.ca>
References: <B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B16F@TENERIFE.awox.com>
	<B45A1386-C838-4A69-AB63-39612158C465@dilger.ca>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B437@TENERIFE.awox.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B47D@TENERIFE.awox.com>
	<4D7112A4.6050209@redhat.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B49A@TENERIFE.awox.com>
	<4D712506.4070405@redhat.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B49D@TENERIFE.awox.com>
	<4D71286C.2080408@redhat.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B4A0@TENERIFE.awox.com>,
	<38CFAFAE-F09D-4FE7-AFD1-C79D2844144C@dilger.ca>
Message-ID: <B670FA1DB6CA6E4A945001BA08E200CE4D5B89D786@TENERIFE.awox.com>

Hello,

Thanks for your answer. I will have a look at this tool and test my hardware.

I have many similar keys and the problem appears systematically on these devices.
So yes, I could blame the hardware, but it seems to have been validated by the provider, and when I tried on a desktop Linux (in case it was an embedded-system problem) with 2.6.32 I did not have the issue.

I don't have the problem either when I change the ext3 block size (4096 to 2048), but I think performance may decrease.

BR

Stephane
________________________________________
From: Andreas Dilger [adilger at dilger.ca]
Sent: Friday, 4 March 2011 23:17
To: Stephane Cerveau
Cc: Eric Sandeen; ext3-users at redhat.com; Tristan Pateloup
Subject: Re: ext3_free_blocks_sb when removing a more than 1GB file

On 2011-03-04, at 11:07 AM, Stephane Cerveau wrote:
> I have several keys from the same brand, model and I have the same issue.
>
> When I said, a different key, it was a different brand.

I would typically blame the USB key.  Some cheap vendors use unreliable chips, and sometimes even mis-label e.g. 1GB flash as 2GB.

> At the end, it seems that ext2 is working fine!

Except I don't think ext2 is doing this bitmap validation at runtime, like ext3/4 is doing.

I'm not sure whether "badblocks" is verifying that the storage is behaving correctly (i.e. correct block addressing), or only whether it is able to write/read a particular sector on disk.

You could use a more advanced block device verification tool, like llverdev from Lustre, which writes a unique test pattern to every block, and then reads it back afterward.

> So maybe a problem in ext3 in 2.6.23 kernel ?!?
> I had a try on 2.6.32_27, I did not succeed to reproduce the issue.
>
> Do you know when ext3 is supposed to be stable ?

For 10+ years already.

> -----Original Message-----
> From: Eric Sandeen [mailto:sandeen at redhat.com]
> Sent: Friday, 4 March 2011 18:59
> To: Stephane Cerveau
> Cc: ext3-users at redhat.com; Tristan Pateloup
> Subject: Re: ext3_free_blocks_sb when removing a more than 1GB file
>
> On 3/4/11 11:54 AM, Stephane Cerveau wrote:
>> Hi,
>>
>> Thanks for your answer.
>> Here is my steps:
>>
>> - mkfs.ext3 /dev/sda1
>> - mount /dev/sda1 /mnt/usb
>> - dd if=/dev/zero of=/mnt/usb/test_file bs=1M count=1025   ( the size is important)
>> - sync
>> - rm /mnt/usb/test_file
>
> Ok, I had the impression that you were removing the usb key at
> some point in the test, but I guess not.
>
>> Then many errors appears "Ext3-fs error ( device sda1): ext3_free_blocks_sb: bit already cleared for block xxxx"
>>
>> I tried to umount/mount the storage but its not working also.
>> I tried to check the device before removing the file, not working also.
>
> you mean that umount/mount/rm gives the same error?  As does umount/fsck/mount/rm ?
>
>> Indeed with another usb key it's working...
>> I'm using a kernel 2.6.23
>>
>> The problem does NOT appear with mkfs.ext2 /dev/sda1 before
>>
>> What do you advise to do ?
>
> Try a much newer kernel, first of all, to see if it's a known, fixed bug.
>
> But since it works on another usb key, I still tend to blame the hardware.
> "bit already cleared" makes it sound like it is reading zeros when it
> should not be.
>
> -Eric
>
>> BR
>>
>> Stephane.


Cheers, Andreas


From scerveau at awox.com  Sat Mar  5 15:52:50 2011
From: scerveau at awox.com (Stephane Cerveau)
Date: Sat, 5 Mar 2011 16:52:50 +0100
Subject: ext3_free_blocks_sb when removing a more than 1GB file
In-Reply-To: <026CC28AC9A32E0A7FD2D20F@nimrod.local>
References: <B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B16F@TENERIFE.awox.com>
	<B45A1386-C838-4A69-AB63-39612158C465@dilger.ca>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B437@TENERIFE.awox.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B47D@TENERIFE.awox.com>
	<4D7112A4.6050209@redhat.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B49A@TENERIFE.awox.com>
	<4D712506.4070405@redhat.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B49D@TENERIFE.awox.com>,
	<026CC28AC9A32E0A7FD2D20F@nimrod.local>
Message-ID: <B670FA1DB6CA6E4A945001BA08E200CE4D5B89D787@TENERIFE.awox.com>

Hello Alex,

With a brand new key, I had the issue after formatting it, copying the file and erasing the file without any shutdown or any trouble...

I will have a look at the commit you mentioned in your email and let you know.

Stephane


From samuel at bcgreen.com  Sat Mar  5 18:50:43 2011
From: samuel at bcgreen.com (Stephen Samuel)
Date: Sat, 5 Mar 2011 10:50:43 -0800
Subject: ext3_free_blocks_sb when removing a more than 1GB file
In-Reply-To: <38CFAFAE-F09D-4FE7-AFD1-C79D2844144C@dilger.ca>
References: <B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B16F@TENERIFE.awox.com>
	<B45A1386-C838-4A69-AB63-39612158C465@dilger.ca>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B437@TENERIFE.awox.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B47D@TENERIFE.awox.com>
	<4D7112A4.6050209@redhat.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B49A@TENERIFE.awox.com>
	<4D712506.4070405@redhat.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B49D@TENERIFE.awox.com>
	<4D71286C.2080408@redhat.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B4A0@TENERIFE.awox.com>
	<38CFAFAE-F09D-4FE7-AFD1-C79D2844144C@dilger.ca>
Message-ID: <AANLkTi=2ewnj_-s_QXj7MF9D+7NRQ4MRh86jdzB5vBN7@mail.gmail.com>

On Fri, Mar 4, 2011 at 2:17 PM, Andreas Dilger <adilger at dilger.ca> wrote:

> On 2011-03-04, at 11:07 AM, Stephane Cerveau wrote:
> > I have several keys from the same brand, model and I have the same issue.
> >
> > When I said, a different key, it was a different brand.
>
> I would typically blame the USB key.  Some cheap vendors use unreliable
> chips, and sometimes even mis-label e.g. 1GB flash as 2GB.
>
> > At the end, it seems that ext2 is working fine!
>
> Except I don't think ext2 is doing this bitmap validation at runtime, like
> ext3/4 is doing.
>
> I'm not sure whether "badblocks" is verifying that the storage is behaving
> correctly (i.e. correct block addressing), or only whether it is able to
> write/read a particular sector on disk.
>
> You could use a more advanced block device verification tool, like llverdev
> from Lustre, which writes a unique test pattern to every block, and then
> reads it back afterward.
>

Quick test, in the meantime:

badblocks -n -t0xffff /dev/the_thumb_drive

-n is non-destructive.  -w is destructive of data.

then I'd try '-n -trandom -p5'

If you don't mind losing the data (I don't think you do), then use -w,
rather than -n.
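
The idea behind the llverdev suggestion quoted above (write a unique pattern to every block, then read everything back) can be sketched as follows. This is only an illustration, not llverdev itself: the function names, the 4096-byte block size and the pattern layout are assumptions made for the example, and it runs against a scratch file rather than a real device. The point is that a pattern derived from the block number can catch misaddressed or aliased blocks (for example, flash that wraps around because it is smaller than advertised), which a constant pattern cannot.

```python
import os
import struct
import tempfile

BLOCK_SIZE = 4096  # assumed block size for this sketch


def make_block(index: int) -> bytes:
    """Build a block whose contents encode its own block number."""
    stamp = struct.pack("<Q", index)           # 8-byte little-endian block number
    return stamp * (BLOCK_SIZE // len(stamp))  # repeat to fill the block


def write_patterns(path: str, nblocks: int) -> None:
    """Write a unique, index-derived pattern to every block, then fsync."""
    with open(path, "r+b") as dev:
        for i in range(nblocks):
            dev.seek(i * BLOCK_SIZE)
            dev.write(make_block(i))
        dev.flush()
        os.fsync(dev.fileno())


def verify_patterns(path: str, nblocks: int) -> list:
    """Read every block back; return the block numbers whose contents are wrong."""
    bad = []
    with open(path, "rb") as dev:
        for i in range(nblocks):
            dev.seek(i * BLOCK_SIZE)
            if dev.read(BLOCK_SIZE) != make_block(i):
                bad.append(i)
    return bad


def demo() -> list:
    """Run the check on a scratch file and simulate one misaddressed write."""
    nblocks = 64
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.truncate(nblocks * BLOCK_SIZE)
        path = tmp.name
    try:
        write_patterns(path, nblocks)
        assert verify_patterns(path, nblocks) == []
        # Simulate faulty addressing: block 5's data lands on top of block 37.
        with open(path, "r+b") as dev:
            dev.seek(37 * BLOCK_SIZE)
            dev.write(make_block(5))
        return verify_patterns(path, nblocks)
    finally:
        os.unlink(path)
```

On a real device the read-back would have to bypass or drop the page cache (for example by unplugging and replugging the key), otherwise the reads may be served from memory and prove nothing about the flash.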

-- 
Stephen Samuel http://www.bcgreen.com  Software, like love,
778-861-7641                              grows when you give it away

From y-takahashi at gmo-hs.com  Mon Mar  7 12:00:33 2011
From: y-takahashi at gmo-hs.com (GMO-HS Yoichi Takahashi)
Date: Mon, 07 Mar 2011 21:00:33 +0900
Subject: minus disk usage
In-Reply-To: <4D7127A8.6070302@redhat.com>
References: <20110304155201.895B.A9C031E0@gmo-hs.com>
	<4D7127A8.6070302@redhat.com>
Message-ID: <20110307210032.01AA.A9C031E0@gmo-hs.com>

Hi Eric 

Thank you for your prompt reply.

> It may be changing because files are being added & removed
> at the time?  As for the negative...

I imagine you're right

> What kernel and what coreutils are you using?
 2.6.18-8.el5PAE #1 SMP Thu Mar 15 20:29:51 EDT 2007 i686 i686 i386 GNU/Linux

Name        : coreutils                    Relocations: (not relocatable)
Version     : 5.97                              Vendor: CentOS
Release     : 23.el5_4.1                    Build Date: Tue Oct 27 11:12:41 2009
Install Date: Wed Feb  9 17:41:18 2011      Build Host: builder16.centos.org
Group       : System Environment/Base       Source RPM: coreutils-5.97-23.el5_4.1.src.rpm
Size        : 9053932                          License: GPLv2+
Signature   : DSA/SHA1, Tue Oct 27 23:47:24 2009, Key ID a8a447dce8562897
URL         : http://www.gnu.org/software/coreutils/
Summary     : The GNU core utilities: a set of tools commonly used in shell scripts
Description :
These are the GNU core utilities.  This package is the combination of
the old GNU fileutils, sh-utils, and textutils packages.

Filesystem           4K-blocks      Used Available Use% Mounted on
/dev/sda2             25393143   2685539  21396901  12% /
  File: "/"
    ID: 0        Namelen: 255     Type: ext2/ext3
Block size: 4096       Fundamental block size: 4096
Blocks: Total: 25393143   Free: 22707604   Available: 21396901
Inodes: Total: 26214400   Free: 26037500
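
As a cross-check, the df line above can be reproduced from the statfs counters shown in the stat output. The sketch below assumes df derives Used as Total minus Free and rounds Use% up; the function name `df_used` is made up for the illustration. It also shows how a negative Used figure appears whenever the reported free count momentarily exceeds the total. That is only one possible explanation for the earlier display, not a confirmed diagnosis.

```python
import math


def df_used(total: int, free: int, avail: int):
    """Return (used_blocks, use_percent) the way df is assumed to compute them."""
    used = total - free                      # goes negative if free > total
    denom = used + avail                     # space visible to non-root users
    percent = math.ceil(100 * used / denom) if used > 0 and denom > 0 else 0
    return used, percent


# Counters from the stat output above (4K blocks).
print(df_used(25393143, 22707604, 21396901))          # -> (2685539, 12)

# Hypothetical inconsistent snapshot: free briefly 88320 blocks above total.
# 88320 blocks * 4096 bytes is 345 MiB, i.e. df -h would show -345M.
print(df_used(25393143, 25393143 + 88320, 24000000))  # -> (-88320, 0)
```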

I tried many things, and the server has recovered.
Something about this still bothers me, but
let's bring this matter to a close for now.
You will hear from me again.


From scerveau at awox.com  Mon Mar  7 15:05:12 2011
From: scerveau at awox.com (Stephane Cerveau)
Date: Mon, 7 Mar 2011 16:05:12 +0100
Subject: ext3_free_blocks_sb when removing a more than 1GB file
In-Reply-To: <026CC28AC9A32E0A7FD2D20F@nimrod.local>
References: <B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B16F@TENERIFE.awox.com>
	<B45A1386-C838-4A69-AB63-39612158C465@dilger.ca>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B437@TENERIFE.awox.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B47D@TENERIFE.awox.com>
	<4D7112A4.6050209@redhat.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B49A@TENERIFE.awox.com>
	<4D712506.4070405@redhat.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B49D@TENERIFE.awox.com>
	<026CC28AC9A32E0A7FD2D20F@nimrod.local>
Message-ID: <B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B4F6@TENERIFE.awox.com>

I tried to integrate this patch but it still does not work.

http://gitorious.org/opensuse/kernel-source/commit/9f62d21f70e77298018f63c72e6d10a621ee6dcf

I don't know how to debug it and don't understand why it happens only with large files.
Is there anyone who can help me or advise me on how I could debug it ...Get some log or anything...:)

I have to say that I'm working on an embedded system with a SH4 processor...

Thanks

Stéphane.



From y-takahashi at gmo-hs.com  Thu Mar  3 05:53:16 2011
From: y-takahashi at gmo-hs.com (GMO-HS Yoichi Takahashi)
Date: Thu, 03 Mar 2011 14:53:16 +0900
Subject: minus disk usage
Message-ID: <20110303145315.3883.A9C031E0@gmo-hs.com>

Hi, this is Yoichi Takahashi.

I have a problem with an ext3 filesystem.
The output of the df command changes every time it is run (at short intervals).
Sometimes the Used column is negative, and sometimes it looks normal.
See /dev/sda2 below.

Filesystem          Size  Used Avail Use% Mounted on
/dev/sda2              97G -345M   93G   0% /
/dev/sda1              99M   15M   80M  16% /boot
tmpfs                 2.0G     0  2.0G   0% /dev/shm
/dev/sda3             803G  2.5G  759G   1% /home

Filesystem          Size  Used Avail Use% Mounted on
/dev/sda2              97G  1.2G   91G   2% /
/dev/sda1              99M   15M   80M  16% /boot
tmpfs                 2.0G     0  2.0G   0% /dev/shm
/dev/sda3             803G  2.5G  759G   1% /home

Filesystem          Size  Used Avail Use% Mounted on
/dev/sda2              97G  448M   92G   1% /
/dev/sda1              99M   15M   80M  16% /boot
tmpfs                 2.0G     0  2.0G   0% /dev/shm
/dev/sda3             803G  2.5G  759G   1% /home

Filesystem          Size  Used Avail Use% Mounted on
/dev/sda2              97G -109M   92G   0% /
/dev/sda1              99M   15M   80M  16% /boot
tmpfs                 2.0G     0  2.0G   0% /dev/shm
/dev/sda3             803G  2.5G  759G   1% /home

The server is always under high load.
The load average is always 3 or 4.

Does anyone know why this happened?
Any ideas would be appreciated.


Youichi Takahashi

Cerulean Tower
26-1 Sakuragaoka-cho,Shibuya-ku,Tokyo (150-8512) Japan

TEL                 +81-3-6415-7075
FAX                 +81-3-6415-6108
E-MAIL              y-takahashi at gmo-hs.com
URL                  http://www.gmo-hs.com
STOCK CODE          3788


From scerveau at awox.com  Tue Mar  8 15:24:46 2011
From: scerveau at awox.com (Stephane Cerveau)
Date: Tue, 8 Mar 2011 16:24:46 +0100
Subject: ext3_free_blocks_sb when removing a more than 1GB file
In-Reply-To: <B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B4F6@TENERIFE.awox.com>
References: <B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B16F@TENERIFE.awox.com>
	<B45A1386-C838-4A69-AB63-39612158C465@dilger.ca>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B437@TENERIFE.awox.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B47D@TENERIFE.awox.com>
	<4D7112A4.6050209@redhat.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B49A@TENERIFE.awox.com>
	<4D712506.4070405@redhat.com>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B49D@TENERIFE.awox.com>
	<026CC28AC9A32E0A7FD2D20F@nimrod.local>
	<B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B4F6@TENERIFE.awox.com>
Message-ID: <B670FA1DB6CA6E4A945001BA08E200CE4D5BA4B570@TENERIFE.awox.com>

Dear all,

First of all, it seems that I don't have any trouble with a 2048 block size. I ran a test with random file sizes from 1024 to 2048 MB and did not hit the issue.
Do you know what drawbacks I can expect from using this size (other than speed)?

Concerning the key using 4096, it seems that I also have trouble on a regular desktop with the 2.6.23 and 2.6.28 kernels from an Ubuntu live CD.
But I don't have the issue on an Ubuntu 2.6.32.27 generic kernel.

Best regards.

Stephane
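
On the question of what changes with a 2048-byte block size, some of the on-disk geometry can be worked out directly. The sketch below assumes the usual ext2/ext3 layout in which each block group is described by a single bitmap block (so a group holds 8 x block size blocks) and block pointers are 4 bytes; the function name `ext3_geometry` is made up for the illustration. It says nothing about why the error only shows up with 4096-byte blocks.

```python
def ext3_geometry(block_size: int, file_bytes: int) -> dict:
    """Rough ext2/ext3 layout figures for a given block size (illustrative only)."""
    blocks_per_group = 8 * block_size            # one bitmap block, one bit per block
    group_bytes = blocks_per_group * block_size  # data covered by one block group
    ptrs_per_block = block_size // 4             # 4-byte block pointers
    file_blocks = -(-file_bytes // block_size)   # ceiling division
    return {
        "blocks_per_group": blocks_per_group,
        "group_mib": group_bytes // (1024 * 1024),
        "pointers_per_indirect_block": ptrs_per_block,
        "file_blocks": file_blocks,
        "groups_spanned_at_least": -(-file_blocks // blocks_per_group),
    }


# The 1025 MiB test file from the reproduction steps, at both block sizes.
for bs in (4096, 2048):
    print(bs, ext3_geometry(bs, 1025 * 1024 * 1024))
```

Smaller blocks mean more block groups and more indirect blocks for the same file, so somewhat more metadata to read and write; whether that is noticeable on a USB key would need measuring.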



From dshaw at JABBERWOCKY.COM  Tue Mar 15 22:42:35 2011
From: dshaw at JABBERWOCKY.COM (David Shaw)
Date: Tue, 15 Mar 2011 18:42:35 -0400
Subject: Using stride on non-RAID
Message-ID: <DD985080-ACC8-495C-B360-E8858B59A0D0@JABBERWOCKY.COM>

Hello,

I understand the need for a proper stride setting when formatting a filesystem on a RAID device.  However, is there any problem in using a stride setting when formatting a filesystem on a regular non-RAID, non-SSD, just plain-vanilla-single-disk block device?  I'm sure there isn't any benefit to it, but I'm curious if there is any harm.

The reason I ask is I'm looking at some code here that can be used on either RAID or non-RAID devices.  The stride setting it has is correct for the particular RAID setup it is intended for, but it also uses those settings when formatting a non-RAID device.

David


From sandeen at redhat.com  Tue Mar 15 22:53:55 2011
From: sandeen at redhat.com (Eric Sandeen)
Date: Tue, 15 Mar 2011 17:53:55 -0500
Subject: Using stride on non-RAID
In-Reply-To: <DD985080-ACC8-495C-B360-E8858B59A0D0@JABBERWOCKY.COM>
References: <DD985080-ACC8-495C-B360-E8858B59A0D0@JABBERWOCKY.COM>
Message-ID: <4D7FEE03.3020809@redhat.com>

On 3/15/11 5:42 PM, David Shaw wrote:
> Hello,
> 
> I understand the need for a proper stride setting when formatting a
> filesystem on a RAID device.  However, is there any problem in using
> a stride setting when formatting a filesystem on a regular non-RAID,
> non-SSD, just plain-vanilla-single-disk block device?  I'm sure there
> isn't any benefit to it, but I'm curious if there is any harm.
> 
> The reason I ask is I'm looking at some code here that can be used on
> either RAID or non-RAID devices.  The stride setting it has is
> correct for the particular RAID setup it is intended for, but it also
> uses those settings when formatting a non-RAID device.
> 
> David

just FWIW, recent kernels & e2fsprogs will just automatically pick
stride based on storage geometry - for md/lvm at least, and for
scsi devices that export this geometry as well.

ext4 has a little stripe-awareness in its allocator; otherwise, stride
just staggers bitmap starts so they don't all end up on the same spindle; [1]
Offhand I don't think it'd cause any harm to set stride on non-raid.

-Eric

[1] ext2fs_allocate_group_table() in lib/ext2fs/alloc_tables.c
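
For reference, the arithmetic behind the stride setting can be sketched as follows. Per the mke2fs documentation, stride is the RAID chunk size expressed in filesystem blocks, and stripe width is typically stride times the number of data-bearing disks. The function name and the example array (64 KiB chunks, 4 data disks, 4 KiB blocks) are assumptions for illustration, not taken from David's setup, and the exact spelling of the stripe width option may differ between e2fsprogs versions.

```python
def raid_extended_options(chunk_kib: int, data_disks: int, block_size: int = 4096) -> str:
    """Build an mke2fs -E option string from RAID geometry (illustrative sketch)."""
    chunk_bytes = chunk_kib * 1024
    if chunk_bytes % block_size:
        raise ValueError("chunk size must be a multiple of the filesystem block size")
    stride = chunk_bytes // block_size   # filesystem blocks per RAID chunk
    stripe_width = stride * data_disks   # filesystem blocks per full stripe
    return f"stride={stride},stripe_width={stripe_width}"


# Hypothetical array: 64 KiB chunks across 4 data disks, 4 KiB filesystem blocks.
print(raid_extended_options(64, 4))  # -> stride=16,stripe_width=64
```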


From dshaw at jabberwocky.com  Wed Mar 16 18:02:34 2011
From: dshaw at jabberwocky.com (David Shaw)
Date: Wed, 16 Mar 2011 14:02:34 -0400
Subject: Using stride on non-RAID
In-Reply-To: <4D7FEE03.3020809@redhat.com>
References: <DD985080-ACC8-495C-B360-E8858B59A0D0@JABBERWOCKY.COM>
	<4D7FEE03.3020809@redhat.com>
Message-ID: <84874DA1-AB26-413B-9496-D1CD7986FDAE@jabberwocky.com>

On Mar 15, 2011, at 6:53 PM, Eric Sandeen wrote:

> On 3/15/11 5:42 PM, David Shaw wrote:
>> Hello,
>> 
>> I understand the need for a proper stride setting when formatting a
>> filesystem on a RAID device.  However, is there any problem in using
>> a stride setting when formatting a filesystem on a regular non-RAID,
>> non-SSD, just plain-vanilla-single-disk block device?  I'm sure there
>> isn't any benefit to it, but I'm curious if there is any harm.
>> 
>> The reason I ask is I'm looking at some code here that can be used on
>> either RAID or non-RAID devices.  The stride setting it has is
>> correct for the particular RAID setup it is intended for, but it also
>> uses those settings when formatting a non-RAID device.
>> 
>> David
> 
> just FWIW, recent kernels & e2fsprogs will just automatically pick
> stride based on storage geometry - for md/lvm at least, and for
> scsi devices that export this geometry as well.
> 
> ext4 has a little stripe-awareness in its allocator; otherwise, stride
> just staggers bitmap starts so they don't all end up on the same spindle; [1]
> Offhand I don't think it'd cause any harm to set stride on non-raid.

Thanks very much for your pointers.  It's a nice enhancement that this is done automatically now.

David


From jidong.xiao at gmail.com  Sat Mar 26 23:20:08 2011
From: jidong.xiao at gmail.com (Jidong Xiao)
Date: Sat, 26 Mar 2011 19:20:08 -0400
Subject: Ext3: Why data=journal is better than data=ordered when data needs to
	be read from and written to disk at the same time
Message-ID: <AANLkTi=By4aBsN4bygBncPw4pyuz8uLqRcNKSBTWi=OJ@mail.gmail.com>

Hi,

I have seen this mentioned in many places, but I have never seen anyone
explain it in detail. (Although this link gives the original story:
http://lkml.indiana.edu/hypermail//linux/kernel/0107.1/0364.html)

"Journal mode: This mode is the slowest except when data needs to be
read from and written to disk at the same time where it outperform all
others mode."

Since this is pretty counter-intuitive, I believe many people are not
aware of the root cause, so I won't be the last one to ask this
question. Can anyone kindly explain it to make it
clearer? Thank you!

Regards
Jidong


From tytso at mit.edu  Sat Mar 26 23:53:11 2011
From: tytso at mit.edu (Ted Ts'o)
Date: Sat, 26 Mar 2011 19:53:11 -0400
Subject: Ext3: Why data=journal is better than data=ordered when data
	needs to be read from and written to disk at the same time
In-Reply-To: <AANLkTi=By4aBsN4bygBncPw4pyuz8uLqRcNKSBTWi=OJ@mail.gmail.com>
References: <AANLkTi=By4aBsN4bygBncPw4pyuz8uLqRcNKSBTWi=OJ@mail.gmail.com>
Message-ID: <20110326235311.GB21075@thunk.org>

On Sat, Mar 26, 2011 at 07:20:08PM -0400, Jidong Xiao wrote:
> Hi,
> 
> I see many literatures mentioned this, but I have never seen any one
> explains it in detail.(Although this link exposed the original story:
> http://lkml.indiana.edu/hypermail//linux/kernel/0107.1/0364.html)
> 
> "Journal mode: This mode is the slowest except when data needs to be
> read from and written to disk at the same time where it outperform all
> others mode."

I didn't see any reference to that in that mail thread (which seemed
to be mostly about reiserfs).  It is true that if you have a bursty,
fsync-heavy workload, you can reduce latency by using data=journal
mode, because it avoids seeks --- the data and metadata blocks are
written into the journal, and this allows the fsync() to finish more
quickly.  There are some applications where this might be useful, such
as NFS file serving, where the NFS server is not allowed to send an
acknowledgement back to the client until the data is written to stable
store.

	  	    	 	      	    	 - Ted


From jidong.xiao at gmail.com  Sun Mar 27 00:25:23 2011
From: jidong.xiao at gmail.com (Jidong Xiao)
Date: Sat, 26 Mar 2011 20:25:23 -0400
Subject: Ext3: Why data=journal is better than data=ordered when data
	needs to be read from and written to disk at the same time
In-Reply-To: <20110326235311.GB21075@thunk.org>
References: <AANLkTi=By4aBsN4bygBncPw4pyuz8uLqRcNKSBTWi=OJ@mail.gmail.com>
	<20110326235311.GB21075@thunk.org>
Message-ID: <AANLkTi=NMSQJqzYW6RqRicRQQo8t5ad_KriCoUs4kAzm@mail.gmail.com>

On Sat, Mar 26, 2011 at 7:53 PM, Ted Ts'o <tytso at mit.edu> wrote:
> On Sat, Mar 26, 2011 at 07:20:08PM -0400, Jidong Xiao wrote:
>> Hi,
>>
>> I see many literatures mentioned this, but I have never seen any one
>> explains it in detail.(Although this link exposed the original story:
>> http://lkml.indiana.edu/hypermail//linux/kernel/0107.1/0364.html)
>>
>> "Journal mode: This mode is the slowest except when data needs to be
>> read from and written to disk at the same time where it outperform all
>> others mode."
>
> I didn't see any reference to that in that mail thread (which seemed
> to be mostly about reiserfs).  It is true that if you have a bursty,
> fsync-heavy workload, you can reduce latency by using data=journal
> mode, because it avoids seeks --- the data and metadata blocks are
> written into the journal, and this allows the fsync() to finish more
> quickly.  There are some applications where this might be useful, such
> as NFS file serving, where the NFS server is not allowed to send an
> acknowledgement back to the client until the data is written to stable
> store.
>
>                                                 - Ted
>

Well, the first time Andrew Morton claimed that data=journal is
better than data=ordered under certain conditions was when he announced
the release of ext3-2.4-0.9.4:

http://www.redhat.com/archives/ext3-users/2001-July/msg00169.html

And the link I provided in the original email is actually the source or
background of this story. This release came immediately after that
discussion.

But my question is why data=journal could outperform data=ordered.
In data=journal mode, you have to write the data and metadata
blocks into the journal, but in data=ordered mode, you only have
to write the metadata blocks into the journal. If, in certain
cases, the former mode can avoid seeks, then the same should
apply to the latter mode. So it's really odd that the former mode can
outperform the latter mode.

Regards
Jidong


From tytso at mit.edu  Sun Mar 27 02:44:10 2011
From: tytso at mit.edu (Ted Ts'o)
Date: Sat, 26 Mar 2011 22:44:10 -0400
Subject: Ext3: Why data=journal is better than data=ordered when data
	needs to be read from and written to disk at the same time
In-Reply-To: <AANLkTi=NMSQJqzYW6RqRicRQQo8t5ad_KriCoUs4kAzm@mail.gmail.com>
References: <AANLkTi=By4aBsN4bygBncPw4pyuz8uLqRcNKSBTWi=OJ@mail.gmail.com>
	<20110326235311.GB21075@thunk.org>
	<AANLkTi=NMSQJqzYW6RqRicRQQo8t5ad_KriCoUs4kAzm@mail.gmail.com>
Message-ID: <20110327024410.GC21075@thunk.org>

On Sat, Mar 26, 2011 at 08:25:23PM -0400, Jidong Xiao wrote:
> 
> But my question is, why data=journal could outperform data=ordered,
> for the data=journal mode, you have to write the data and metadata
> blocks into the journal, but for the data=ordered mode, you only have
> to write the metadata blocks into the journal. If, in some certain
> cases, the former mode can avoid seeks, then the same behavior should
> apply to the latter mode. So it's really odd that the former mode can
> outperform the latter mode.

When executing an fsync(), in data=ordered mode you have to write the
data blocks to their final locations on disk and wait for them to be
written before the metadata can be committed to the journal.  This will
generally require extra seeks.  In data=journal mode, the data blocks
can be written directly into the journal without needing to seek.

Of course eventually the data and metadata blocks will need to be
written to their permanent locations before the journal space can be
reused.  But for short bursty write patterns, the fsync() latency will
be much smaller in data=journal mode.
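
One way to observe the effect described here is to time individual fsync() calls under a bursty write pattern, once on a filesystem mounted with data=ordered and once with data=journal. The sketch below is only a measurement harness under stated assumptions: the function name `fsync_latencies`, the write size and the burst count are made up for the illustration, and it makes no claim about what numbers any particular system will show.

```python
import os
import tempfile
import time


def fsync_latencies(directory: str, bursts: int = 20, chunk: int = 64 * 1024) -> list:
    """Append `chunk` bytes then fsync(), `bursts` times; return latencies in seconds."""
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        payload = os.urandom(chunk)
        for _ in range(bursts):
            os.write(fd, payload)
            start = time.perf_counter()
            os.fsync(fd)  # blocks until the kernel reports the data as written
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.unlink(path)
    return latencies


if __name__ == "__main__":
    # Point this at a directory on the filesystem under test (here: current directory).
    samples = fsync_latencies(".")
    print(f"max {max(samples) * 1e3:.2f} ms, mean {sum(samples) / len(samples) * 1e3:.2f} ms")
```

Running it against a directory on the test filesystem under each data= mount option, with the same background load, gives two sets of latencies to compare.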

Regards,

						- Ted


From jidong.xiao at gmail.com  Sun Mar 27 04:52:21 2011
From: jidong.xiao at gmail.com (Jidong Xiao)
Date: Sun, 27 Mar 2011 00:52:21 -0400
Subject: Ext3: Why data=journal is better than data=ordered when data
	needs to be read from and written to disk at the same time
In-Reply-To: <20110327024410.GC21075@thunk.org>
References: <AANLkTi=By4aBsN4bygBncPw4pyuz8uLqRcNKSBTWi=OJ@mail.gmail.com>
	<20110326235311.GB21075@thunk.org>
	<AANLkTi=NMSQJqzYW6RqRicRQQo8t5ad_KriCoUs4kAzm@mail.gmail.com>
	<20110327024410.GC21075@thunk.org>
Message-ID: <AANLkTik-NLpXh8q6t=RRJ8HZ+366zJ-Ot16Lq=J_zTkv@mail.gmail.com>

On Sat, Mar 26, 2011 at 10:44 PM, Ted Ts'o <tytso at mit.edu> wrote:
> On Sat, Mar 26, 2011 at 08:25:23PM -0400, Jidong Xiao wrote:
>>
>> But my question is, why data=journal could outperform data=ordered,
>> for the data=journal mode, you have to write the data and metadata
>> blocks into the journal, but for the data=ordered mode, you only have
>> to write the metadata blocks into the journal. If, in some certain
>> cases, the former mode can avoid seeks, then the same behavior should
>> apply to the latter mode. So it's really odd that the former mode can
>> outperform the latter mode.
>
> When executing an fsync(), in data=ordered mode you have to write the
> metadata blocks into the journal and wait for the data blocks to be
> written to their final locations.  This will generally require extra
> seeks.  In data=journal mode, the data blocks can be written directly
> into the journal without needing to seek.
>
> Of course eventually the data and metadata blocks will need to be
> written to their permanent locations before the journal space can be
> reused.  But for short bursty write patterns, the fsync() latency will
> be much smaller in data=journal mode.
>

Thank you Ted, it is really helpful!

So the difference is:
data=ordered mode: fsync() returns only after the metadata blocks have
been written into the journal and the data blocks have been written to
their final locations on disk.
data=journal mode: fsync() returns once both the metadata and the data
have been written into the journal. The journal is contiguous, so no
seeking is needed and fsync() returns more quickly.

Suppose we read from and write to the disk simultaneously, as in the
following example:

First, write data to the filesystem as quickly as possible:

Rapid writing

while true
do
	dd if=/dev/zero of=largefile bs=16384 count=131072
done

While data is being written to the test filesystem, read 16 MB of data
from the same filesystem on the same disk, timing the result:

Reading a 16 MB file

time cat 16-meg-file > /dev/null

In this case, if we conduct the experiment in data=journal mode and
data=ordered mode respectively, since write latency is much smaller in
data=journal mode, the disk will focus more on the read operation,
hence, the read operation will also finish earlier than it does in the
data=ordered mode. Am I understanding correctly?
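
For reference, the two steps earlier in this message could be combined
into one script roughly as follows. This is a sketch under assumptions:
the function name read_under_write and the DIR, WRITE_MB and READ_MB
variables are invented for illustration, the sizes are much smaller
than in the experiment, and GNU date is assumed for %N.

```shell
#!/bin/sh
# read_under_write DIR WRITE_MB READ_MB  (hypothetical helper)
# Times a sequential read of a READ_MB file in DIR while a background
# loop keeps rewriting a WRITE_MB file in the same directory.
read_under_write() {
    dir=$1; write_mb=$2; read_mb=$3

    # Create the file that will be read back.
    dd if=/dev/zero of="$dir/readfile" bs=1048576 count="$read_mb" 2>/dev/null
    sync
    # NOTE: the file is most likely still in the page cache here. To time
    # a real disk read, drop caches first (as root):
    #   echo 3 > /proc/sys/vm/drop_caches

    # Background writer, same shape as the "while true" loop above.
    (
        while true
        do
            dd if=/dev/zero of="$dir/largefile" bs=1048576 \
               count="$write_mb" 2>/dev/null
        done
    ) &
    writer=$!

    sleep 1                        # give the writer time to start
    start=$(date +%s%N)            # nanoseconds; GNU date extension
    cat "$dir/readfile" > /dev/null
    end=$(date +%s%N)

    # Stop the loop; a dd already in progress finishes its current pass.
    kill "$writer" 2>/dev/null || true
    wait "$writer" 2>/dev/null || true
    rm -f "$dir/readfile" "$dir/largefile"

    echo "read of ${read_mb} MB took $(( (end - start) / 1000000 )) ms"
}

# DIR should be on the filesystem under test; defaults are arbitrary.
read_under_write "${DIR:-.}" "${WRITE_MB:-64}" "${READ_MB:-16}"
```

Running it once per mount mode on the same filesystem gives the two
numbers to compare.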

Regards
Jidong


From pg_ext3 at ext3.for.sabi.co.UK  Mon Mar 28 16:43:20 2011
From: pg_ext3 at ext3.for.sabi.co.UK (Peter Grandi)
Date: Mon, 28 Mar 2011 17:43:20 +0100
Subject: Ext3: Why data=journal is better than data=ordered when data
	needs to be read from and written to disk at the same time
In-Reply-To: <AANLkTik-NLpXh8q6t=RRJ8HZ+366zJ-Ot16Lq=J_zTkv@mail.gmail.com>
References: <AANLkTi=By4aBsN4bygBncPw4pyuz8uLqRcNKSBTWi=OJ@mail.gmail.com>
	<20110326235311.GB21075@thunk.org>
	<AANLkTi=NMSQJqzYW6RqRicRQQo8t5ad_KriCoUs4kAzm@mail.gmail.com>
	<20110327024410.GC21075@thunk.org>
	<AANLkTik-NLpXh8q6t=RRJ8HZ+366zJ-Ot16Lq=J_zTkv@mail.gmail.com>
Message-ID: <19856.47784.421703.81840@tree.ty.sabi.co.UK>

[ ... ]

>> When executing an fsync(), in data=ordered mode you have to
>> write the metadata blocks into the journal and wait for the
>> data blocks to be written to their final locations.  This will
>> generally require extra seeks.  In data=journal mode, the data
>> blocks can be written directly into the journal without needing
>> to seek.

>> Of course eventually the data and metadata blocks will need
>> to be written to their permanent locations before the journal
>> space can be reused.  But for short bursty write patterns,
>> the fsync() latency will be much smaller in data=journal
>> mode.

>  [ ... ]

> In this case, if we conduct the experiment in data=journal
> mode and data=ordered mode respectively,

That experiment is not necessarily demonstrative: the result depends
on RAM caching, the elevator, ...

> since write latency is much smaller in data=journal mode,

Write latency is actually much longer, because it requires *two*
writes instead of one. It is *fsync* latency, as mentioned above,
that is smaller, because it depends only on the first write to
what is in effect a small log-based filesystem. This distinction
matters a great deal, because it is the reason why "short bursty
write patterns" is the qualification above. For long write
patterns things are very different as the journal eventually
fills up. For any given size it will also fill up a lot faster
for 'data=journal'.
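
As a back-of-the-envelope illustration of how much faster the journal
fills: the sketch below assumes a constant incoming write rate, a fixed
metadata share of that traffic, and no checkpointing freeing journal
space in the meantime. The helper journal_fill_ms and all the numbers
(128 MB journal, 50 MB/s, 1% metadata) are hypothetical, not measured.

```shell
#!/bin/sh
# journal_fill_ms JOURNAL_MB RATE_MB_PER_S META_PERCENT MODE
# Hypothetical model: milliseconds until an empty journal is full.
journal_fill_ms() {
    journal_mb=$1; rate=$2; meta_pct=$3; mode=$4
    case "$mode" in
        journal)
            # data and metadata both pass through the journal
            echo $(( journal_mb * 1000 / rate )) ;;
        ordered)
            # only the metadata share passes through the journal
            echo $(( journal_mb * 1000 * 100 / (rate * meta_pct) )) ;;
        *)
            echo "unknown mode: $mode" >&2
            return 1 ;;
    esac
}

echo "data=journal: $(journal_fill_ms 128 50 1 journal) ms"   # 2560 ms
echo "data=ordered: $(journal_fill_ms 128 50 1 ordered) ms"   # 256000 ms
```

With these made-up figures the same journal lasts about 2.6 seconds in
one mode and over four minutes in the other, which is the sense in
which "short bursty" matters.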

Ahhh while writing that I have just realized that large journals
can be a bad idea especially for metadata operations. Will have
to think more about that.

> the disk will focus more on the read operation, hence, the
> read operation will also finish earlier than it does in the
> data=ordered mode. Am I understanding correctly?

That again depends on a lot of things, including caching, the
elevator, flusher behaviour, exactly where the files are...

Also, whether the journal is on the same drive as the filesystem
or another drive can matter enormously; also whether for example
the journal is on SSD or battery backed RAM. There are reasons
why 'ext2' still quite outperforms 'ext3' on simple tests.