From Arun.Somasundaram at honeywell.com  Tue Jun  5 16:31:00 2007
From: Arun.Somasundaram at honeywell.com (Somasundaram, Arun (IE10))
Date: Tue, 5 Jun 2007 22:01:00 +0530
Subject: Help on ext3 file system corruption issue
Message-ID: <1E675F21DFB0C74294A0FCA4987BE45F724C17@IE10EV811.global.ds.honeywell.com>

Hi All,

  I'm a novice developer of Linux applications. Recently I ran into what
I guess is a file system corruption.

I have a Kernel 2.4.7-10 with ext3 file system in compact flash. The
system was up for 3 months and was running with average load conditions.

One fine day, it just started sending kernel messages on the serial
console. The message was like this.

 

EXT3-fs error (device ide0(3,2)): ext3_get_inode_loc: unable to read inode block - inode=20089, block=81926
EXT3-fs error (device ide0(3,2)): ext3_get_inode_loc: unable to read inode block - inode=20090, block=81926
EXT3-fs error (device ide0(3,2)): ext3_get_inode_loc: unable to read inode block - inode=20091, block=81926
EXT3-fs error (device ide0(3,2)): ext3_get_inode_loc: unable to read inode block - inode=20092, block=81926
EXT3-fs error (device ide0(3,2)): ext3_get_inode_loc: unable to read inode block - inode=20093, block=81926
EXT3-fs error (device ide0(3,2)): ext3_get_inode_loc: unable to read inode block - inode=20094, block=81926
EXT3-fs error (device ide0(3,2)): ext3_get_inode_loc: unable to read inode block - inode=20095, block=81926
EXT3-fs error (device ide0(3,2)): ext3_get_inode_loc: unable to read inode block - inode=20096, block=81926
EXT3-fs error (device ide0(3,2)): ext3_get_inode_loc: unable to read inode block - inode=20097, block=81927
EXT3-fs error (device ide0(3,2)): ext3_get_inode_loc: unable to read inode block - inode=20098, block=81927
EXT3-fs error (device ide0(3,2)): ext3_get_inode_loc: unable to read inode block - inode=20099, block=81927
Assertion failure in do_get_write_access() at transaction.c:606: "!(((jh2bh(jh))->b_state & (1UL << BH_Lock)) != 0)"
invalid operand: 0000
CPU:    0
EIP:    0010:[<d080e9e5>]
EFLAGS: 00010286
eax: 00000021   ebx: c171ce94   ecx: 00000001   edx: 0015991c
esi: c171ce00   edi: c788cdc0   ebp: cb5f4690   esp: cf581bf0
ds: 0018   es: 0018   ss: 0018
Process syslogd (pid: 608, stackpage=cf581000)
Stack: d0816990 0000025e 00000000 00000000 c171ce00 cf641820 c171ce94 c171ce00
       c788cdc0 cb5f4690 d080edb5 c788cdc0 cb5f4690 00000000 00000000 000001ae
       cfc86400 cfa670e0 d081a0cb c788cdc0 cfa67140 cf581c60 c1448088 cfa66800
Call Trace: [<d0816990>] [<d080edb5>] [<d081a0cb>] [<c018526c>] [<d081c049>]
   [<d081c33d>] [<d081c1d0>] [<d081c990>] [<d080f3de>] [<d081ec95>] [<d081ed1a>]
   [<d081caac>] [<c0133283>] [<d0813ebb>] [<d080e276>] [<c01339cd>] [<d081ca48>]
   [<d081cfa6>] [<d081ca48>] [<c01265c8>] [<d081aa7b>] [<c0130ec9>] [<d081aa5c>]
   [<d080f777>] [<d081ab0f>] [<c0130fe1>] [<c0106dd3>]

Code: 0f 0b 5b 5e 8b 54 24 24 f6 42 10 04 bb e2 ff ff ff b8 01 00
hda: read_intr: status=0x51 { DriveReady SeekComplete Error }
hda: read_intr: error=0x40 { UncorrectableError }, LBAsect=667854, sector=163854
end_request: I/O error, dev 03:02 (hda), sector 163854
EXT3-fs error (device ide0(3,2)): ext3_get_inode_loc: unable to read inode block - inode=20100, block=81927
hda: read_intr: status=0x51 { DriveReady SeekComplete Error }
hda: read_intr: error=0x40 { UncorrectableError }, LBAsect=667854, sector=163854
end_request: I/O error, dev 03:02 (hda), sector 163854
EXT3-fs error (device ide0(3,2)): ext3_get_inode_loc: unable to read inode block - inode=20101, block=81927
hda: read_intr: status=0x51 { DriveReady SeekComplete Error }
hda: read_intr: error=0x40 { UncorrectableError }, LBAsect=667854, sector=163854
end_request: I/O error, dev 03:02 (hda), sector 163854
EXT3-fs error (device ide0(3,2)): ext3_get_inode_loc: unable to read inode block - inode=20102, block=81927
hda: read_intr: status=0x51 { DriveReady SeekComplete Error }
hda: read_intr: error=0x40 { UncorrectableError }, LBAsect=667854, sector=163854
end_request: I/O error, dev 03:02 (hda), sector 163854
EXT3-fs error (device ide0(3,2)): ext3_get_inode_loc: unable to read inode block - inode=20103, block=81927
hda: read_intr: status=0x51 { DriveReady SeekComplete Error }
hda: read_intr: error=0x40 { UncorrectableError }, LBAsect=667854, sector=163854
end_request: I/O error, dev 03:02 (hda), sector 163854
EXT3-fs error (device ide0(3,2)): ext3_get_inode_loc: unable to read inode block - inode=20104, block=81927

30916
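
The numbers in the kernel messages above can be cross-checked with a
little arithmetic. The sketch below is only an illustration: the
512-byte sector size, the 1 KiB filesystem block size and the 128-byte
inode size are assumptions inferred from the log (they are not stated
anywhere in this message), and the function names are made up for the
example.

```python
# Assumptions (not stated in the log, only inferred from its numbers):
SECTOR_SIZE = 512   # bytes per IDE sector
BLOCK_SIZE = 1024   # bytes per ext3 block; fits sector 163854 == 2 * block 81927
INODE_SIZE = 128    # bytes per on-disk inode; fits 8 inodes reported per block


def block_to_sector(block):
    """Partition-relative sector at which a filesystem block starts."""
    return block * (BLOCK_SIZE // SECTOR_SIZE)


def partition_start(lba_sector, partition_sector):
    """Whole-disk sector at which the partition begins."""
    return lba_sector - partition_sector


def inodes_per_block():
    """How many inodes share one inode-table block."""
    return BLOCK_SIZE // INODE_SIZE


def slot_in_block(inode):
    """Position of an inode inside its inode-table block.

    Assumes the inodes-per-group count is a multiple of inodes_per_block(),
    so the group boundary does not shift the alignment.
    """
    return (inode - 1) % inodes_per_block()


if __name__ == "__main__":
    print("block 81927 starts at sector", block_to_sector(81927))
    print("hda2 starts at disk sector", partition_start(667854, 163854))
    print("inodes per block:", inodes_per_block())
    print("slot of inode 20097:", slot_in_block(20097))
```

Under these assumptions one unreadable sector on the card is enough to
explain a whole run of eight consecutive inode errors, because all
eight inodes live in the same inode-table block.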

 

Some more information on this system, in case it has a bearing on this
issue:

1.	LILO boots the 2.4.7-10 kernel image with the option ide=nodma. Can
this have any impact on these errors?
2.	The system runs a PostgreSQL database that writes at most 1 record
every 5 seconds. That is all the writing it does.
3.	On restart, the fsck in the bootup scripts (rc.sysinit) could
not resolve this.

It said:

Checking filesystems
Could this be a zero-length partition?
fsck.ext3: Attempt to read block from filesystem resulted in short read while trying to open /dev/hda2
/dev/hda3: recovering journal
/dev/hda3: clean, 81/125488 files, 35684/500472 blocks
Checking all file systems.
[/sbin/fsck.ext3 -- /tmp] fsck.ext3 -a /dev/hda2
[/sbin/fsck.ext3 -- /tmp2] fsck.ext3 -a /dev/hda3
[FAILED]

*** An error occurred during the file system check.
*** Dropping you to a shell; the system will reboot
*** when you leave the shell.
Give root password for maintenance
(or type Control-D for normal startup):

 

 

I then gave the root password and ran this command:

e2fsck -a -c -C 0 /dev/hda2

It said:

e2fsck: Attempt to read block from filesystem resulted in short read while trying to open /dev/hda2
Could this be a zero-length partition?

 

 

Please give your advice, as this problem has become a big-bang show
stopper for our product.

Your advice will be very helpful for me to go ahead with this issue.

 

Thanks in advance,

Arun S

 

 

 


From adilger at clusterfs.com  Tue Jun  5 21:35:46 2007
From: adilger at clusterfs.com (Andreas Dilger)
Date: Tue, 5 Jun 2007 15:35:46 -0600
Subject: Help on ext3 file system corruption issue
In-Reply-To: <1E675F21DFB0C74294A0FCA4987BE45F724C17@IE10EV811.global.ds.honeywell.com>
References: <1E675F21DFB0C74294A0FCA4987BE45F724C17@IE10EV811.global.ds.honeywell.com>
Message-ID: <20070605213546.GO5181@schatzie.adilger.int>

On Jun 05, 2007  22:01 +0530, Somasundaram, Arun (IE10) wrote:
> I have a Kernel 2.4.7-10 with ext3 file system in compact flash. The
> system was up for 3 months and was running with average load conditions.

Unless you have a support contract with some vendor, nobody will look at
bugs from such an old kernel.  There are a hundred old bugs that might
have been fixed already.

> One fine day, it just started sending kernel messages on the serial
> console. The message was like this.
> 
>  
> 
> hda: read_intr: status=0x51 { DriveReady SeekComplete Error }
> hda: read_intr: error=0x40 { UncorrectableError }, LBAsect=667854, sector=163854
> end_request: I/O error, dev 03:02 (hda), sector 163854
> EXT3-fs error (device ide0(3,2)): ext3_get_inode_loc: unable to read inode block - inode=20089, block=81926

This is likely a hardware error.  Probably due to the fact that ext3
is not a good filesystem to use on CF because the journal is always
overwriting the same part of the CF device.  Try something like JFFS2
instead.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.



From tod at gust.sr.unh.edu  Tue Jun  5 22:47:59 2007
From: tod at gust.sr.unh.edu (Tod Hagan)
Date: Tue, 05 Jun 2007 18:47:59 -0400
Subject: Calculating stride values?
Message-ID: <1181083679.8077.23.camel@trop.sr.unh.edu>

All,

I have a question about calculating the value for the -E stride option
to mke2fs.

The mke2fs man page says

      stride=stripe-size
             Configure the filesystem for a RAID array with stripe-size filesystem blocks per stripe.

So stride = size of stripe/blocksize.

The size of a stripe is the RAID chunk size * the number of drives in the RAID.

My question: are parity disks included in the number of drives, or are
only data drives counted?

For example, take the example of six drives configured for RAID 5 with a
chunk size of 64 and a 4K blocksize:

1. Parity drive included: 64*6/4 = 96
2. Parity drive excluded: 64*5/4 = 80

Which is correct?

Thanks.

Tod

-- 
Tod Hagan
Information Technologist
AIRMAP/Climate Change Research Center
Institute for the Study of Earth, Oceans, and Space
University of New Hampshire
Durham, NH 03824
Phone: 603-862-3116




From adilger at clusterfs.com  Tue Jun  5 23:23:11 2007
From: adilger at clusterfs.com (Andreas Dilger)
Date: Tue, 5 Jun 2007 17:23:11 -0600
Subject: Calculating stride values?
In-Reply-To: <1181083679.8077.23.camel@trop.sr.unh.edu>
References: <1181083679.8077.23.camel@trop.sr.unh.edu>
Message-ID: <20070605232311.GT5181@schatzie.adilger.int>

On Jun 05, 2007  18:47 -0400, Tod Hagan wrote:
> The mke2fs man page says
> 
>       stride=stripe-size
>              Configure the filesystem for a RAID array with stripe-size filesystem blocks per stripe.
> 
> So stride = size of stripe/blocksize.
> 
> The size of a stripe is the RAID chunk size * the number of drives in the RAID.

Not really.  We submitted a patch to clarify this, so the "stride=" value
is the number of blocks on a SINGLE disk.  This ensures that the bitmaps
are round-robined across all disks.

> For example, take the example of six drives configured for RAID 5 with a
> chunk size of 64 and a 4K blocksize:
> 
> 1. Parity drive included: 64*6/4 = 96
> 2. Parity drive excluded: 64*5/4 = 80
> 
> Which is correct?

-E stride=16, based on 64k / 4k
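
The calculation described in this reply can be written down as a small
sketch. It only restates the rule given above (stride is the number of
filesystem blocks in one chunk on a SINGLE disk, so the number of
drives does not enter at all); the function name and the KiB-based
parameters are choices made for this example.

```python
def stride_blocks(chunk_kib, block_kib):
    """Number of filesystem blocks in one RAID chunk on a single disk.

    The drive count (data or parity) is deliberately not a parameter.
    """
    if chunk_kib <= 0 or block_kib <= 0:
        raise ValueError("sizes must be positive")
    if chunk_kib % block_kib != 0:
        raise ValueError("chunk size should be a multiple of the block size")
    return chunk_kib // block_kib


if __name__ == "__main__":
    # The example from this thread: 64k chunks, 4k blocks, six-drive RAID 5.
    print("-E stride=%d" % stride_blocks(64, 4))
```

For the six-drive RAID 5 in the question this gives the same value
whether or not the parity drive is counted, which is the point of the
clarification.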


Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.



From darkonc at gmail.com  Tue Jun  5 23:37:50 2007
From: darkonc at gmail.com (Stephen Samuel)
Date: Tue, 5 Jun 2007 16:37:50 -0700
Subject: Help on ext3 file system corruption issue
In-Reply-To: <20070605213546.GO5181@schatzie.adilger.int>
References: <1E675F21DFB0C74294A0FCA4987BE45F724C17@IE10EV811.global.ds.honeywell.com>
	<20070605213546.GO5181@schatzie.adilger.int>
Message-ID: <6cd50f9f0706051637o69c72f8qb7e186fa31d2ebd9@mail.gmail.com>

On 6/5/07, Andreas Dilger <adilger at clusterfs.com> wrote:
>
> On Jun 05, 2007  22:01 +0530, Somasundaram, Arun (IE10) wrote:
> > I have a Kernel 2.4.7-10 with ext3 file system in compact flash. The
> > system was up for 3 months and was running with average load conditions.
>
> Unless you have a support contract with some vendor, nobody will look at
> bugs from such an old kernel.  There are a hundred old bugs that might
> have been fixed already.


There are extensions to dd that will, on an error, allow you to skip over the
block(s) in error while zeroing (instead of just ignoring) the blocks on the
output. (I think that Knoppix has such a version of dd, if that'll help
you.) This means that everything that can be read will be where (relative to
the start of your recovery partition or file) ext3 is expecting to find
it. Once you find the recovery dd, use it to copy your filesystem to a
hard drive or whatever. You can then recover your data, and then,
presuming Andreas is right, you'll have to replace your flash and
put a filesystem on it that's more conducive to how flash works.
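
The idea described above (keep going after a read error and write
zeros in place of the unreadable block, so that every readable block
keeps its offset) can be sketched as follows. This is a toy
illustration and not a replacement for a real rescue tool: the
read_block callback, the in-memory fake device and all names are
hypothetical, chosen so the example runs without real hardware.

```python
import io


def rescue_copy(read_block, total_blocks, block_size, out):
    """Copy blocks 0..total_blocks-1 to `out`, zero-filling unreadable ones.

    read_block(n) must return the bytes of block n (at most block_size
    bytes) or raise OSError. Returns the list of block numbers that
    could not be read.
    """
    bad = []
    for n in range(total_blocks):
        try:
            data = read_block(n)
        except OSError:
            data = b""
            bad.append(n)
        # Pad failed or short reads with zeros so offsets stay aligned.
        out.write(data.ljust(block_size, b"\0"))
    return bad


def make_fake_reader(image, block_size, bad_blocks):
    """Build a read_block callback over `image` that fails on bad_blocks."""
    def read_block(n):
        if n in bad_blocks:
            raise OSError(5, "Input/output error")
        return image[n * block_size:(n + 1) * block_size]
    return read_block


if __name__ == "__main__":
    image = bytes(range(16))
    out = io.BytesIO()
    reader = make_fake_reader(image, 4, {2})
    print("unreadable blocks:", rescue_copy(reader, 4, 4, out))
    print("copy length:", len(out.getvalue()))
```

The resulting image has the same length as the source, so fsck and
mount (on the copy, not on the failing card) see the surviving
metadata at the offsets they expect.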


> > hda: read_intr: status=0x51 { DriveReady SeekComplete Error }
> > hda: read_intr: error=0x40 { UncorrectableError }, LBAsect=667854, sector=163854
> > end_request: I/O error, dev 03:02 (hda), sector 163854
> > EXT3-fs error (device ide0(3,2)): ext3_get_inode_loc: unable to read inode block - inode=20089, block=81926
>
> This is likely a hardware error.  Probably due to the fact that ext3
> is not a good filesystem to use on CF because the journal is always
> overwriting the same part of the CF device.  Try something like JFFS2
> instead.
>
> Cheers, Andreas
>
>


-- 
Stephen Samuel http://www.bcgreen.com
778-861-7641

From Arun.Somasundaram at honeywell.com  Wed Jun  6 10:18:23 2007
From: Arun.Somasundaram at honeywell.com (Somasundaram, Arun (IE10))
Date: Wed, 6 Jun 2007 15:48:23 +0530
Subject: Help on ext3 file system corruption issue
In-Reply-To: <20070605213546.GO5181@schatzie.adilger.int>
References: <1E675F21DFB0C74294A0FCA4987BE45F724C17@IE10EV811.global.ds.honeywell.com>
	<20070605213546.GO5181@schatzie.adilger.int>
Message-ID: <1E675F21DFB0C74294A0FCA4987BE45F724FE1@IE10EV811.global.ds.honeywell.com>

Hi Andreas,

  Thanks for your reply.

Are there any patches for these ext3 bugs that apply to this kernel? Please
point me to them if they are available.

Which file system is suitable for compact flash? I read that JFFS2 is
suitable for raw NAND flash and is not suitable for CF cards. Is that
true?

My application writes to the CF card as often as every 3 seconds (worst
case).

 

Thanks,

Arun

 

 


From duaneg at dghda.com  Wed Jun  6 12:22:06 2007
From: duaneg at dghda.com (Duane Griffin)
Date: Wed, 6 Jun 2007 13:22:06 +0100
Subject: Help on ext3 file system corruption issue
In-Reply-To: <6cd50f9f0706051637o69c72f8qb7e186fa31d2ebd9@mail.gmail.com>
References: <1E675F21DFB0C74294A0FCA4987BE45F724C17@IE10EV811.global.ds.honeywell.com>
	<20070605213546.GO5181@schatzie.adilger.int>
	<6cd50f9f0706051637o69c72f8qb7e186fa31d2ebd9@mail.gmail.com>
Message-ID: <e9e943910706060522s5652ff32g48c3bce33c889f08@mail.gmail.com>

On 06/06/07, Stephen Samuel <darkonc at gmail.com> wrote:
> There are extensions to DD that will, on an error allow you to skip over the
> block(s) in error while zeroing (instead of just ignoring) the blocks on the
> output...

I use ddrescue for this. It works very well.

Cheers,
Duane.

-- 
"I never could learn to drink that blood and call it wine" - Bob Dylan



From tod at gust.sr.unh.edu  Wed Jun  6 15:23:48 2007
From: tod at gust.sr.unh.edu (Tod Hagan)
Date: Wed, 06 Jun 2007 11:23:48 -0400
Subject: Calculating stride values?
In-Reply-To: <20070605232311.GT5181@schatzie.adilger.int>
References: <1181083679.8077.23.camel@trop.sr.unh.edu>
	<20070605232311.GT5181@schatzie.adilger.int>
Message-ID: <1181143428.17547.5.camel@trop.sr.unh.edu>

On Tue, 2007-06-05 at 17:23 -0600, Andreas Dilger wrote:
> Not really.  We submitted a patch to clarify this, so the "stride=" value
> is the number of blocks on a SINGLE disk.  This ensures that the bitmaps
> are round-robined across all disks.
> 
> > For example, take the example of six drives configured for RAID 5 with a
> > chunk size of 64 and a 4K blocksize:
>
> -E stride=16, based on 64k / 4k

Thanks for clearing this up. The number of blocks for a single disk
means you don't have to worry about parity drives, so it's much easier
to deal with.

And good to hear about the clarifying patch, as I'm not the only person
confused by this -- currently, the information on the link below is
wrong:

http://wiki.centos.org/HowTos/Disk_Optimization

Tod

-- 
Tod Hagan
Information Technologist
AIRMAP/Climate Change Research Center
Institute for the Study of Earth, Oceans, and Space
University of New Hampshire
Durham, NH 03824
Phone: 603-862-3116




From lanzi at quantentunnel.de  Thu Jun  7 16:56:23 2007
From: lanzi at quantentunnel.de (Jürgen Landsmann)
Date: Thu, 07 Jun 2007 18:56:23 +0200
Subject: Crashed ext3-filesystem
Message-ID: <466838B7.6060800@quantentunnel.de>

Hi!

We have a server still running Debian 3.0 (Woody) that nobody likes to 
touch for maintenance ... ;)

Our home-directories are located on a separate HDD (30GB, 1 large 
primary ext3 partition) and until yesterday it worked correctly. Because 
the partition was nearly full we had to enlarge our home-space by moving 
it to a larger HDD.

We decided to try a copy of whole partition by using gparted from the 
"SystemRescueCd" (http://www.sysresccd.org).

The first try failed, because the FS-type of our home-partition wasn't 
recognized.

A second boot was tried and this time the FS-type was recognized 
correctly so we started copying the partition to another HDD (80GB).

After about 40% to 50% the copy procedure crashed and left our system in
an unusable state. The only way to use the system again was to press the
"Reset" button.

Because we thought that this crash was caused by an error in the APM
functionality, we tried once more, booting the kernel with the "noapm"
parameter. But even this attempt crashed ...

After we rebooted again I mounted the source partition to check its
content. But all I found were three files visible on the partition.

The directory containing the user files was completely gone, and in
"lost+found" there are hundreds of items.

After this horrifying discovery I unmounted the partition and subscribed
to this mailing list ... ;)

Unfortunately all of our webserver files were also located in this
directory ...



Now my question:

Is there any way to restore my directory (completely or at least
partially)?

Thanx in advance for your help!!!

Bye
Juergen



From lists at nerdbynature.de  Sat Jun  9 12:03:36 2007
From: lists at nerdbynature.de (Christian Kujau)
Date: Sat, 9 Jun 2007 14:03:36 +0200 (CEST)
Subject: Crashed ext3-filesystem
In-Reply-To: <466838B7.6060800@quantentunnel.de>
References: <466838B7.6060800@quantentunnel.de>
Message-ID: <alpine.DEB.0.99.0706091349520.8312@foobar-g4>

On Thu, 7 Jun 2007, Jürgen Landsmann wrote:
> Our home-directories are located on a separate HDD (30GB, 1 large primary 
> ext3 partition) and until yesterday it worked correctly.

...30 GB and no backups?

> Because the 
> partition was nearly full we had to enlarge our home-space by moving it to a 
> larger HDD.
> We decided to try a copy of whole partition by using gparted from the 
> "SystemRescueCd" (http://www.sysresccd.org).

Why would you do this? What's wrong with tar/cp?

> Because we thought that the reason of this crash was caused by an error in 
> the APM-funcionality we tried it once more by booting the kernel using the 
> "noapm" parameter. But even this try crashed ...

Any more details regarding the crashes? log messages, sysrq-t available?

> The directory containing the userfiles was completely gone an in "lost&found" 
> there are hundreds of items.

Ouch :(
Not much you can do here. I'd take a first look with "file /lost+found/*"
to see if there's something useful in there. ext2/3-recovery tools are 
out there, but I guess you'll have to try a few and see if they can 
recover anything:

  - e2undel, recover (both available as debian packages in unstable)
  - R-Linux, a free (as in beer) recovery tool for win32 (works pretty good though)
  - ...and then there's always grep(1) & friends :(
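
As a rough illustration of the first step suggested above (looking at
what each recovered item actually is), here is a minimal sketch that
guesses a type from the first bytes of each entry in a directory. It
is far less thorough than file(1); the short signature table and the
function names are choices made for this example, and the directory
passed to scan() is whatever path the recovered filesystem's
lost+found is mounted under.

```python
import os

# A few well-known file signatures ("magic numbers").
MAGIC = [
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"\xff\xd8\xff", "JPEG image"),
    (b"%PDF-", "PDF document"),
    (b"PK\x03\x04", "ZIP archive"),
    (b"\x1f\x8b", "gzip data"),
]


def guess_type(header):
    """Return a type name for the leading bytes of a file, or 'unknown'."""
    for magic, name in MAGIC:
        if header.startswith(magic):
            return name
    return "unknown"


def scan(directory):
    """Map each entry name in `directory` to a guessed type."""
    results = {}
    for entry in sorted(os.listdir(directory)):
        path = os.path.join(directory, entry)
        if os.path.isdir(path):
            results[entry] = "directory"
            continue
        with open(path, "rb") as handle:
            results[entry] = guess_type(handle.read(16))
    return results


if __name__ == "__main__":
    for name, kind in scan(".").items():
        print(name, "->", kind)
```

Run it against a read-only mount or, better, against a copy of the
damaged partition, so that nothing further is written to the original.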

hth,
Christian.
-- 
make bzImage, not war

From lanzi at quantentunnel.de  Fri Jun 15 09:08:55 2007
From: lanzi at quantentunnel.de (Jürgen Landsmann)
Date: Fri, 15 Jun 2007 11:08:55 +0200
Subject: Crashed ext3-filesystem
In-Reply-To: <alpine.DEB.0.99.0706091349520.8312@foobar-g4>
References: <466838B7.6060800@quantentunnel.de>
	<alpine.DEB.0.99.0706091349520.8312@foobar-g4>
Message-ID: <46725727.8010508@quantentunnel.de>

Christian Kujau schrieb:
> On Thu, 7 Jun 2007, Jürgen Landsmann wrote:
>> Our home-directories are located on a separate HDD (30GB, 1 large 
>> primary ext3 partition) and until yesterday it worked correctly.
> 
> ...30 GB and no backups?
plz don't ask, why! ;)

> 
>> Because the partition was nearly full we had to enlarge our home-space 
>> by moving it to a larger HDD.
>> We decided to try a copy of whole partition by using gparted from the 
>> "SystemRescueCd" (http://www.sysresccd.org).
> 
> Why would you do this? What's wrong with tar/cp?
It was just a try, and the crashed partition wasn't even mounted. The
partition must have had an undetected error!

>  - e2undel, recover (both available as debian packages in unstable)
>  - R-Linux, a free (as in beer) recovery tool for win32 (works pretty 
> good though)
>  - ...and then there's always grep(1) & friends :(
Some items listed there with the type "file" are directories and others
listed as "directory" are files.

I already tried to find some files by using grep, find, cat (...) but no
chance!

I will try some of the utilities mentioned above but I don't think,
there is any possibility to get some data back ...

Thanx for your hints!!

Bye
J. Landsmann



From public at miernik.name  Sat Jun 16 00:41:42 2007
From: public at miernik.name (Miernik)
Date: Sat, 16 Jun 2007 02:41:42 +0200
Subject: Help on ext3 file system corruption issue
References: <1E675F21DFB0C74294A0FCA4987BE45F724C17@IE10EV811.global.ds.honeywell.com>
	<20070605213546.GO5181@schatzie.adilger.int>
Message-ID: <20070616004142.6C4A.0.NOFFLE@debian107.local>

Andreas Dilger <adilger at clusterfs.com> wrote:
> This is likely a hardware error.  Probably due to the fact that ext3
> is not a good filesystem to use on CF because the journal is always
> overwriting the same part of the CF device.  Try something like JFFS2
> instead.

Isn't CF always wear-levelled internally, so it shouldn't matter, because the
internal compact flash controller will take care not to write to the
same physical chip all the time?

I wonder, because I recently had two CF cards used as root disk in a
CF-IDE adapter go bad. One had unrecoverable bad sectors: it was only
32 sectors = 16 kB, but I still couldn't use the card at all with ext3,
because these sectors kept coming back over and over again, as if the CF
was remapping them somewhere else and ext3, not knowing about that,
stumbled upon them again, and so on. Very strange. Then a second card got
completely destroyed in just a couple of months of standard desktop usage
as root filesystem. I didn't use swap on either of the cards, and /home
was also somewhere else, so there was no really frequently changing data.

-- 
Miernik
http://miernik.name/



From public at miernik.name  Sat Jun 16 00:56:49 2007
From: public at miernik.name (Miernik)
Date: Sat, 16 Jun 2007 02:56:49 +0200
Subject: 4 GB USB flash disk with FAT ok, with ext3 corrupted files
Message-ID: <20070616005649.6C4A.1.NOFFLE@debian107.local>

I recently bought 2 different USB flash disks. These are some cheap no-name
devices.  Their parameters:

   bytes            C/H/S       ID
4194304512       509/255/63     Vendor: Generic Model: USB Flash Drive  Rev: 1.00  ANSI SCSI revision: 02
4288676352      1023/132/62     Vendor: USB     Model: USB 2.0          Rev: 1.00  ANSI SCSI revision: 02

When I put a FAT32 filesystem on them, everything is OK. When I put an ext3
filesystem on them, everything is OK while I write files to the disk, and I
can fill it with files, but when I remove the disk from the computer (after a
proper umount) and put it in again, most of the files have corrupted directory
entries (they look red in midnight commander, some of them pink). But some
files (about 5 to 10%) are normal, and normally accessible.

I tried them both on two completely different computers with very different
hardware, and different Linux versions, and the effect is the same.

One of the computers is a desktop with an old AMD K7 Clayton motherboard with
only old USB 1.1 (VT82xxxxx UHCI USB 1.1 Controller), running Debian sid with
the 2.6.18-4-k7 kernel from Debian.

The other computer is a much newer AMD Athlon64 HP laptop with USB2.0 port and
SuSE 10.2.

Did anyone observe anything similar with any USB flash drives (FAT OK, ext3
corrupted)?

I've put files on these disks on FAT32 and ran fsck.vfat, and everything looks
fine:

root at tarnica:~# dosfsck -v /dev/sda1
dosfsck 2.11 (12 Mar 2005)
dosfsck 2.11, 12 Mar 2005, FAT32, LFN
Checking we can access the last sector of the filesystem
Boot sector contents:
System ID "mkdosfs"
Media byte 0xf8 (hard disk)
       512 bytes per logical sector
      4096 bytes per cluster
        32 reserved sectors
First FAT starts at byte 16384 (sector 32)
         2 FATs, 32 bit entries
   4177920 bytes per FAT (= 8160 sectors)
Root directory start at cluster 2 (arbitrary size)
Data area starts at byte 8372224 (sector 16352)
   1044477 data clusters (4278177792 bytes)
62 sectors/track, 132 heads
         0 hidden sectors
   8372170 sectors total
Checking for unused clusters.
Checking free cluster summary.
/dev/sda1: 121 files, 116397/1044477 clusters
root at tarnica:~# echo "$?"
0
root at tarnica:~#

With FAT I can read any file I saved to the disk just fine, it
simply works.

Maybe you'd like my 'lsusb -v' (this is on the USB1.1-only Debian machine):


Bus 002 Device 001: ID 0000:0000  
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               1.10
  bDeviceClass            9 Hub
  bDeviceSubClass         0 Unused
  bDeviceProtocol         0 Full speed hub
  bMaxPacketSize0        64
  idVendor           0x0000 
  idProduct          0x0000 
  bcdDevice            2.06
  iManufacturer           3 Linux 2.6.18-4-k7 uhci_hcd
  iProduct                2 UHCI Host Controller
  iSerial                 1 0000:00:07.3
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           25
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0 
    bmAttributes         0xe0
      Self Powered
      Remote Wakeup
    MaxPower                0mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           1
      bInterfaceClass         9 Hub
      bInterfaceSubClass      0 Unused
      bInterfaceProtocol      0 Full speed hub
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0002  1x 2 bytes
        bInterval             255
Hub Descriptor:
  bLength               9
  bDescriptorType      41
  nNbrPorts             2
  wHubCharacteristic 0x000a
    No power switching (usb 1.0)
    Per-port overcurrent protection
  bPwrOn2PwrGood        1 * 2 milli seconds
  bHubContrCurrent      0 milli Ampere
  DeviceRemovable    0x00
  PortPwrCtrlMask    0xff
 Hub Port Status:
   Port 1: 0000.0300 lowspeed power
   Port 2: 0000.0300 lowspeed power
Device Status:     0x0003
  Self Powered
  Remote Wakeup Enabled

Bus 001 Device 002: ID 1043:8012 iCreate Technologies Corp. 
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            0 (Defined at Interface level)
  bDeviceSubClass         0 
  bDeviceProtocol         0 
  bMaxPacketSize0        64
  idVendor           0x1043 iCreate Technologies Corp.
  idProduct          0x8012 
  bcdDevice            1.00
  iManufacturer           1 USB
  iProduct                2 USB 2.0
  iSerial                 0 
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           32
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0 
    bmAttributes         0x80
      (Bus Powered)
    MaxPower              100mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           2
      bInterfaceClass         8 Mass Storage
      bInterfaceSubClass      6 SCSI
      bInterfaceProtocol     80 Bulk (Zip)
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0040  1x 64 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x02  EP 2 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0040  1x 64 bytes
        bInterval               0
Device Qualifier (for other device speed):
  bLength                10
  bDescriptorType         6
  bcdUSB               2.00
  bDeviceClass            0 (Defined at Interface level)
  bDeviceSubClass         0 
  bDeviceProtocol         0 
  bMaxPacketSize0        64
  bNumConfigurations      1
Device Status:     0x0000
  (Bus Powered)

Bus 001 Device 001: ID 0000:0000  
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               1.10
  bDeviceClass            9 Hub
  bDeviceSubClass         0 Unused
  bDeviceProtocol         0 Full speed hub
  bMaxPacketSize0        64
  idVendor           0x0000 
  idProduct          0x0000 
  bcdDevice            2.06
  iManufacturer           3 Linux 2.6.18-4-k7 uhci_hcd
  iProduct                2 UHCI Host Controller
  iSerial                 1 0000:00:07.2
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           25
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0 
    bmAttributes         0xe0
      Self Powered
      Remote Wakeup
    MaxPower                0mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           1
      bInterfaceClass         9 Hub
      bInterfaceSubClass      0 Unused
      bInterfaceProtocol      0 Full speed hub
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0002  1x 2 bytes
        bInterval             255
Hub Descriptor:
  bLength               9
  bDescriptorType      41
  nNbrPorts             2
  wHubCharacteristic 0x000a
    No power switching (usb 1.0)
    Per-port overcurrent protection
  bPwrOn2PwrGood        1 * 2 milli seconds
  bHubContrCurrent      0 milli Ampere
  DeviceRemovable    0x00
  PortPwrCtrlMask    0xff
 Hub Port Status:
   Port 1: 0000.0103 power enable connect
   Port 2: 0000.0100 power
Device Status:     0x0003
  Self Powered
  Remote Wakeup Enabled

Let me give you some more diagnostics.
Here is what I did:

Having read that too large max_sectors sometimes gives problems, I did:

root at tarnica:~# echo "64" > /sys/block/sda/device/max_sectors

Before that I had also tried without reducing max_sectors from the
default; it made no difference.
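
For scale, here is a minimal sketch of what that setting means. It assumes
that the usb-storage max_sectors attribute counts 512-byte sectors per
transfer; the helper name is made up for illustration.

```python
SECTOR_BYTES = 512  # assumption: max_sectors is expressed in 512-byte sectors


def max_transfer_kib(max_sectors):
    """Largest single transfer, in KiB, implied by a max_sectors value."""
    return max_sectors * SECTOR_BYTES / 1024


# 64 is the value written with echo above.
print(max_transfer_kib(64))
```

Under that assumption, a value of 64 limits each transfer to 32 KiB.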

root at tarnica:~# mkfs.ext3 /dev/sda1
mke2fs 1.40-WIP (07-Apr-2007)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
523264 inodes, 1046521 blocks
52326 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=1073741824
32 block groups
32768 blocks per group, 32768 fragments per group
16352 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736

Writing inode tables: done
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 24 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.
root at tarnica:~# mount /dev/sda1 /mnt/sda1

No errors up till here.
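
As a sanity check, the figures printed by mke2fs can be cross-checked against
each other. This is only a sketch using the numbers from the output above;
the integer arithmetic for the reserved blocks is an assumption about how the
5% is rounded, not taken from the mke2fs source.

```python
# Figures copied from the mke2fs output above.
BLOCK_SIZE = 4096
BLOCKS = 1046521
BLOCKS_PER_GROUP = 32768
INODES_PER_GROUP = 16352
RESERVED_PERCENT = 5


def block_groups(blocks, per_group):
    """Number of block groups needed to cover `blocks` (ceiling division)."""
    return -(-blocks // per_group)


groups = block_groups(BLOCKS, BLOCKS_PER_GROUP)
total_inodes = groups * INODES_PER_GROUP
reserved = BLOCKS * RESERVED_PERCENT // 100  # assumed rounding: truncate
size_bytes = BLOCKS * BLOCK_SIZE
last_valid_block = BLOCKS - 1  # block numbers start at 0 on this filesystem

print(groups, total_inodes, reserved, size_bytes, last_valid_block)
```

The results agree with the 32 block groups, 523264 inodes and 52326 reserved
blocks reported by mke2fs, and the highest valid block number is 1046520.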

root at tarnica:~# cp -dr /usr/share/doc/ /mnt/sda1/

Here we get this error in kern.log:

Jun 16 00:47:09 tarnica kernel: scsi0: PCI error Interrupt at seqaddr = 0x8
Jun 16 00:47:09 tarnica kernel: scsi0: Data Parity Error Detected during address or write data phase

root at tarnica:~# sync

Later I did a 'find -ls' in the /mnt/sda1/ directory,
then 'umount /mnt/sda1' and then 'mount /dev/sda1 /mnt/sda1' again.


The above commands caused the following to appear in the output of 'dmesg':

EXT3-fs error (device sda1): ext3_readdir: bad entry in directory #11: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
Aborting journal on device sda1.
EXT3-fs error (device sda1) in ext3_ordered_writepage: IO failure
ext3_abort called.
EXT3-fs error (device sda1): ext3_journal_start_sb: Detected aborted journal
Remounting filesystem read-only
__journal_remove_journal_head: freeing b_frozen_data
__journal_remove_journal_head: freeing b_frozen_data
EXT3-fs error (device sda1): ext3_readdir: bad entry in directory #11: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
EXT3-fs error (device sda1): ext3_readdir: bad entry in directory #11: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
__journal_remove_journal_head: freeing b_frozen_data
__journal_remove_journal_head: freeing b_frozen_data
__journal_remove_journal_head: freeing b_frozen_data
__journal_remove_journal_head: freeing b_frozen_data
__journal_remove_journal_head: freeing b_frozen_data
__journal_remove_journal_head: freeing b_frozen_data
__journal_remove_journal_head: freeing b_frozen_data
__journal_remove_journal_head: freeing b_frozen_data
__journal_remove_journal_head: freeing b_frozen_data
__journal_remove_journal_head: freeing b_frozen_data
__journal_remove_journal_head: freeing b_frozen_data
__journal_remove_journal_head: freeing b_frozen_data
__journal_remove_journal_head: freeing b_frozen_data
__journal_remove_journal_head: freeing b_frozen_data
__journal_remove_journal_head: freeing b_frozen_data
__journal_remove_journal_head: freeing b_frozen_data
__journal_remove_journal_head: freeing b_committed_data
__journal_remove_journal_head: freeing b_frozen_data
__journal_remove_journal_head: freeing b_committed_data
kjournald starting.  Commit interval 5 seconds
EXT3-fs warning (device sda1): ext3_clear_journal_err: Filesystem error recorded from previous mount: IO failure
EXT3-fs warning (device sda1): ext3_clear_journal_err: Marking fs in need of filesystem check.
EXT3-fs warning: mounting fs with errors, running e2fsck is recommended
EXT3 FS on sda1, internal journal
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.

At this point:

root at tarnica:~# ls -al /mnt/sda1
total 24
drwxr-xr-x  4 root root  4096 2007-06-16 00:47 .
drwxr-xr-x 27 root root  4096 2007-06-09 09:25 ..
drwx------  2 root root 16384 2007-06-16 00:37 lost+found
?---------  ? ?    ?        ?                ? /mnt/sda1/doc
root at tarnica:~#

So I did 'umount /mnt/sda1' and 'e2fsck -v -y /dev/sda1', which runs endlessly
with a zillion errors, for example:

Inode 133561 has compression flag set on filesystem without compression support.  Clear? yes

Inode 133561 has illegal block(s).  Clear? yes

Illegal block #0 (189057594) in inode 133561.  CLEARED.
Illegal block #1 (3559149010) in inode 133561.  CLEARED.
Illegal block #2 (4279737499) in inode 133561.  CLEARED.
Illegal block #3 (362979125) in inode 133561.  CLEARED.
Illegal block #4 (3152073428) in inode 133561.  CLEARED.
Illegal block #5 (679595262) in inode 133561.  CLEARED.
Illegal block #6 (1924390837) in inode 133561.  CLEARED.
Illegal block #7 (1058295063) in inode 133561.  CLEARED.
Illegal block #8 (795243680) in inode 133561.  CLEARED.
Illegal block #9 (3130620932) in inode 133561.  CLEARED.
Illegal block #10 (1544529913) in inode 133561.  CLEARED.
Too many illegal blocks in inode 133561.
Clear inode? yes


or:


Inode 134287 has compression flag set on filesystem without compression support.  Clear? yes

Inode 134287, i_size is 7400753060221116605, should be 0.  Fix? yes

Inode 134287, i_blocks is 2017258698, should be 0.  Fix? yes

Inode 134303 has compression flag set on filesystem without compression support.  Clear? yes

Inode 134303, i_size is 7400753060221116605, should be 0.  Fix? yes

Inode 134303, i_blocks is 2017258698, should be 0.  Fix? yes


or:


Inode 132391 has imagic flag set.  Clear? yes

Special (device/socket/fifo) inode 132391 has non-zero size.  Fix? yes

Inode 132392 is in use, but has dtime set.  Fix? yes

Inode 132392 has imagic flag set.  Clear? yes


and various other types of errors, and from time to time it prints:

Restarting e2fsck from the beginning...
/dev/sda1 contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes


and it seems this goes on forever. Nothing shows up in 'dmesg',
kern.log, 'cat /proc/kmsg' or any other log file/output while
the filesystem check is running.
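
To illustrate why e2fsck calls those block numbers illegal: a simplified
range check, assuming a block pointer is legal only if it lies between the
first data block and the block count reported by mke2fs earlier in this
message. The function name is hypothetical and the real e2fsck checks are
more involved.

```python
BLOCKS_COUNT = 1046521   # "1046521 blocks" from the mke2fs output
FIRST_DATA_BLOCK = 0     # "First data block=0" from the mke2fs output

# Block numbers reported for inode 133561 in the e2fsck output above.
reported = [
    189057594, 3559149010, 4279737499, 362979125, 3152073428, 679595262,
    1924390837, 1058295063, 795243680, 3130620932, 1544529913,
]


def is_legal(block, first=FIRST_DATA_BLOCK, count=BLOCKS_COUNT):
    """Simplified check: the block number must fall inside the filesystem."""
    return first <= block < count


illegal = [b for b in reported if not is_legal(b)]
print(len(illegal), "of", len(reported), "reported blocks are out of range")
```

Every one of the reported values is far beyond the end of the filesystem,
which would be consistent with the inode table holding arbitrary data rather
than real inodes.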


I would think it's broken hardware, but this happens on two different
4 GB USB sticks, from two different sources, one used, one new, on
different computers, so it's quite unlikely that both of these sticks
would be bad. Besides that, they work perfectly with FAT32. And it is
also unlikely that such different USB controllers on two different
computers would both be bad at the same time.

Similar symptoms are in this bug report:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=404486

But I have never seen such symptoms using HDDs, CF cards (also 4 GB ones, on the
same machine) and 256 MB USB flash disks.

Any clues?


-- 
Miernik
http://miernik.name/



From jamarconi at sbcglobal.net  Sat Jun 16 13:17:37 2007
From: jamarconi at sbcglobal.net (John Marconi)
Date: Sat, 16 Jun 2007 08:17:37 -0500
Subject: kjournald hang on ext3 to ext3 copy
Message-ID: <4673E2F1.2090704@sbcglobal.net>

All,

I am running into a situation in which one of my ext3 filesystems is 
getting hung during normal usage.  There are three ext3 filesystems on a 
CompactFLASH.  One is mounted as / and one as /tmp.  In my test, I am 
copying a 100 MB file from /root to /tmp repeatedly.  While doing this 
test, I eventually see the copying stop, and any attempts to access /tmp 
fail - if I even do ls /tmp the command will hang.

I suspect kjournald because of the following ps output:
PID      PPID   WCHAN:20      PCPU  %MEM  PSR  COMM
 8847    99 start_this_handle        1.1  0.0  28     pdflush
 8853    99 schedule_timeout       0.2  0.0   7     pdflush
  188     1 kswapd                       0.0  0.0  19   kswapd0
 8051     1 mtd_blktrans_thread   0.0  0.0  22   mtdblockd
 8243     1 kjournald                    0.0  0.0   0   kjournald
 8305     1 schedule_timeout        0.0  0.0   2   udevd
 8378     1 kjournald                    0.0  0.0   0   kjournald
 8379     1 journal_commit_trans 16.6  0.0   0   kjournald
 8437     1 schedule_timeout       0.0  0.0   0   evlogd
 8527     1 syslog                        0.0  0.0   1   klogd
 8534     1 schedule_timeout       0.0  0.0   0   portmap
 8569     1 schedule_timeout       0.0  0.0   0   rngd
 8639     1 schedule_timeout       0.1  0.0  24   sshd
 8741  8639 schedule_timeout    0.0  0.0   0     sshd
 8743  8741 wait                        0.0  0.0   9       bash
 8857  8743 schedule_timeout    4.9  0.0   7         cp
 8664     1 schedule_timeout       0.0  0.0   0   xinetd
 8679     1 schedule_timeout       0.0  0.0   0   evlnotifyd
 8689     1 schedule_timeout       0.0  0.0   0   evlactiond
 8704     1 wait                           0.0  0.0   1   bash
 8882  8704 -                            0.0  0.0   2     ps

If I run ps repeatedly, I always see process 8379 in 
journal_commit_transaction, and it is always using between 12% and 20% 
of processor 0.  This process never completes.  I also see process 
8847 stuck in start_this_handle forever - so I believe they are related. 
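
The "always there when ps is run repeatedly" check can be sketched as
follows. This is only an illustration: it assumes ps output in the same
column layout as the listing above, and parse_ps and stuck_pids are
hypothetical helper names.

```python
def parse_ps(text):
    """Parse rows of 'PID PPID WCHAN PCPU %MEM PSR COMM' into dicts."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 7 or not fields[0].isdigit():
            continue  # skip the header line and anything malformed
        pid, _ppid, wchan, pcpu, _mem, _psr, comm = fields
        rows.append({"pid": int(pid), "wchan": wchan,
                     "pcpu": float(pcpu), "comm": comm})
    return rows


def stuck_pids(samples, suspects):
    """PIDs found in one of the `suspects` wait channels in every sample."""
    per_sample = [
        {r["pid"] for r in parse_ps(s) if r["wchan"] in suspects}
        for s in samples
    ]
    return set.intersection(*per_sample) if per_sample else set()


# Three rows copied from the listing above, reused as two "samples".
PS_SAMPLE = """\
PID      PPID   WCHAN:20      PCPU  %MEM  PSR  COMM
 8847    99 start_this_handle        1.1  0.0  28     pdflush
 8379     1 journal_commit_trans 16.6  0.0   0   kjournald
 8857  8743 schedule_timeout    4.9  0.0   7         cp
"""

stuck = stuck_pids([PS_SAMPLE, PS_SAMPLE],
                   {"journal_commit_trans", "start_this_handle"})
print(sorted(stuck))
```

In real use each sample would be a fresh ps run taken some seconds apart, so
that only processes that never leave those wait channels remain in the result.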

This system is using a 2.6.14 kernel.

Has anyone seen this type of behaviour before?  Note, if I change /tmp 
to ext2 I never see this issue - it is only when /tmp is mounted as ext3.

Thank you,
John



From tytso at mit.edu  Sat Jun 16 15:57:29 2007
From: tytso at mit.edu (Theodore Tso)
Date: Sat, 16 Jun 2007 11:57:29 -0400
Subject: Help on ext3 file system corruption issue
In-Reply-To: <20070616004142.6C4A.0.NOFFLE@debian107.local>
References: <1E675F21DFB0C74294A0FCA4987BE45F724C17@IE10EV811.global.ds.honeywell.com>
	<20070605213546.GO5181@schatzie.adilger.int>
	<20070616004142.6C4A.0.NOFFLE@debian107.local>
Message-ID: <20070616155728.GA5351@thunk.org>

On Sat, Jun 16, 2007 at 02:41:42AM +0200, Miernik wrote:
> Isn't CF always wear-levelled internally, so it shouldn't matter, and the
> internal compact flash controller will take care not to write to the
> same physical chip all the time?

Cards do seem to differ in quality, including the quality of their
wear levelling algorithms (some of which I believe are patented, but
I'm not an expert in this area).

> I wonder, because I recently had two CF cards used as root disk in a
> CF-IDE adapter go bad, one with unrecoverable bad sectors (ext3 couldn't
> be used on it, it was only 32 sectors = 16 kB, but still I couldn't use
> the card at all, because these sectors were coming back over and over
> again, as if the CF was remapping them somewhere else, and ext3 not
> knowing about that jumped upon them again, and so on, very strange). Then
> a second card got completely destroyed in just a couple of months of standard
> desktop usage as root filesystem. I didn't use swap on any of the cards,
> /home was also somewhere else, no really often changing data.

Did you mount the filesystems with the noatime mount option?  If not,
then there was probably a huge amount of changes to the CF caused by
the last access time getting updated.
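
One way to check this, sketched under the assumption that the mount table is
available in the usual /proc/mounts layout (device, mount point, type,
comma-separated options, ...). The sample entries and the helper name are
made up for illustration and are not taken from the system discussed here.

```python
def mount_options(mounts_text, mount_point):
    """Return the option set of the last entry mounted at `mount_point`.

    Returns None if nothing is mounted there. The last matching entry wins
    because later mounts on the same directory hide earlier ones.
    """
    result = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            result = set(fields[3].split(","))
    return result


# Hypothetical mount table in /proc/mounts format.
SAMPLE_MOUNTS = """\
/dev/example1 / ext3 rw,noatime 0 0
/dev/example2 /tmp ext3 rw 0 0
"""

for point in ("/", "/tmp"):
    opts = mount_options(SAMPLE_MOUNTS, point)
    print(point, "noatime" in opts)
```

On a live system the text would come from reading /proc/mounts instead of the
sample string.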

Regards,

							- Ted



From public at miernik.name  Sat Jun 16 16:11:50 2007
From: public at miernik.name (Miernik)
Date: Sat, 16 Jun 2007 18:11:50 +0200
Subject: 4 GB USB flash disk with FAT ok, with ext3 corrupted files
References: <20070616005649.6C4A.1.NOFFLE@debian107.local>
Message-ID: <20070616161150.6FBD.0.NOFFLE@debian107.local>

Posting now to two lists, one about USB and the other about ext3, as I
don't know what the source of the problem is.

Miernik <public at miernik.name> wrote:
> I recently bought 2 different USB flash disks. These are some cheap no-name
> devices.  Their parameters:
> 
>   bytes            C/H/S       ID
> 4288676352      1023/132/62     Vendor: USB     Model: USB 2.0          Rev: 1.00  ANSI SCSI revision: 02

Right now after trying to copy about 0.5 GB of files to a freshly created ext3
filesystem on the device, this is the output of dmesg:

ontroller doesn't have AUX irq; using default 12
serio: i8042 KBD port at 0x60,0x64 irq 1
mice: PS/2 mouse device common for all mice
TCP bic registered
NET: Registered protocol family 1
NET: Registered protocol family 17
Using IPI No-Shortcut mode
Freeing unused kernel memory: 212k freed
input: AT Translated Set 2 keyboard as /class/input/input0
ACPI: CPU0 (power states: C1[C1] C2[C2])
ACPI: Processor [CPU0] (supports 2 throttling states)
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
USB Universal Host Controller Interface driver v3.0
ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 11
PCI: setting IRQ 11 as level-triggered
ACPI: PCI Interrupt 0000:00:07.2[D] -> Link [LNKD] -> GSI 11 (level, low) -> IRQ 11
uhci_hcd 0000:00:07.2: UHCI Host Controller
uhci_hcd 0000:00:07.2: new USB bus registered, assigned bus number 1
uhci_hcd 0000:00:07.2: irq 11, io base 0x0000d400
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 2 ports detected
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
ACPI: PCI Interrupt 0000:00:07.3[D] -> Link [LNKD] -> GSI 11 (level, low) -> IRQ 11
uhci_hcd 0000:00:07.3: UHCI Host Controller
uhci_hcd 0000:00:07.3: new USB bus registered, assigned bus number 2
uhci_hcd 0000:00:07.3: irq 11, io base 0x0000d800
usb usb2: configuration #1 chosen from 1 choice
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 2 ports detected
8139cp: 10/100 PCI Ethernet driver v1.3 (Mar 22, 2004)
SCSI subsystem initialized
VP_IDE: IDE controller at PCI slot 0000:00:07.1
VP_IDE: chipset revision 6
VP_IDE: not 100% native mode: will probe irqs later
VP_IDE: VIA vt82c686b (rev 40) IDE UDMA100 controller on pci0000:00:07.1
    ide0: BM-DMA at 0xd000-0xd007, BIOS settings: hda:DMA, hdb:pio
    ide1: BM-DMA at 0xd008-0xd00f, BIOS settings: hdc:pio, hdd:pio
Probing IDE interface ide0...
Time: acpi_pm clocksource has been installed.
hda: ST33210A, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Probing IDE interface ide1...
8139cp 0000:00:08.0: This (id 10ec:8139 rev 10) is not an 8139C+ compatible chip
8139cp 0000:00:08.0: Try the "8139too" driver instead.
ACPI: PCI Interrupt 0000:00:0b.0[A] -> Link [LNKD] -> GSI 11 (level, low) -> IRQ 11
libata version 2.20 loaded.
8139too Fast Ethernet driver 0.9.28
scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 7.0
        <Adaptec 2940 SCSI adapter>
        aic7870: Single Channel A, SCSI Id=7, 16/253 SCBs

ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 10
PCI: setting IRQ 10 as level-triggered
ACPI: PCI Interrupt 0000:00:08.0[A] -> Link [LNKA] -> GSI 10 (level, low) -> IRQ 10
eth0: RealTek RTL8139 at 0xe800, 00:02:44:29:57:bf, IRQ 10
eth0:  Identified 8139 chip type 'RTL-8139C'
hda: max request size: 128KiB
hda: 6346368 sectors (3249 MB) w/256KiB Cache, CHS=6296/16/63, UDMA(33)
hda: cache flushes not supported
 hda: hda1 hda2 < hda5 >
scsi 0:0:2:0: Processor         HP       C5110A           3638 PQ: 0 ANSI: 2
 target0:0:2: Beginning Domain Validation
 target0:0:2: Ending Domain Validation
kjournald starting.  Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
scsi 0:0:2:0: Attached scsi generic sg0 type 3
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
input: PC Speaker as /class/input/input1
Real Time Clock Driver v1.12ac
Linux agpgart interface v0.102 (c) Dave Jones
agpgart: Detected VIA Twister-K/KT133x/KM133 chipset
agpgart: AGP aperture is 64M @ 0xe0000000
parport_pc: VIA 686A/8231 detected
parport_pc: probing current configuration
parport_pc: Current parallel port base: 0x378
parport0: PC-style at 0x378, irq 7 [PCSPP,EPP]
parport_pc: VIA parallel port: io=0x378, irq=7
ACPI: PCI Interrupt Link [LNKC] enabled at IRQ 12
PCI: setting IRQ 12 as level-triggered
ACPI: PCI Interrupt 0000:00:07.5[C] -> Link [LNKC] -> GSI 12 (level, low) -> IRQ 12
PCI: Setting latency timer of device 0000:00:07.5 to 64
EXT3 FS on hda1, internal journal
Probing IDE interface ide1...
device-mapper: ioctl: 4.11.0-ioctl (2006-10-12) initialised: dm-devel at redhat.com
Sound Blaster 16 soundcard not found or device busy
In case, if you have non-AWE card, try snd-sb16 module
[drm] Initialized drm 1.1.0 20060810
ACPI: PCI Interrupt 0000:01:00.0[A] -> Link [LNKA] -> GSI 10 (level, low) -> IRQ 10
[drm] Initialized radeon 1.25.0 20060524 on minor 0
radeonfb: Found Intel x86 BIOS ROM Image
radeonfb: Retrieved PLL infos from BIOS
radeonfb: Reference=27.00 MHz (RefDiv=12) Memory=240.00 Mhz, System=166.00 MHz
radeonfb: PLL min 20000 max 40000
i2c_adapter i2c-2: unable to read EDID block.
i2c_adapter i2c-2: unable to read EDID block.
i2c_adapter i2c-2: unable to read EDID block.
i2c_adapter i2c-4: unable to read EDID block.
i2c_adapter i2c-4: unable to read EDID block.
i2c_adapter i2c-4: unable to read EDID block.
radeonfb: Monitor 1 type CRT found
radeonfb: EDID probed
radeonfb: Monitor 2 type no found
Console: switching to colour frame buffer device 200x75
radeonfb (0000:01:00.0): ATI Radeon Y` 
Intel ISA PCIC probe: 
  Intel i82365sl B step ISA-to-PCMCIA at port 0x3e0 ofs 0x00, 2 sockets
    host opts [0]: none
    host opts [1]: none
    ISA irqs (scanned) = 9,15 polling interval = 1000 ms
pccard: PCMCIA card inserted into slot 0
TCP hybla registered
cs: IO port probe 0x100-0x3af: clean.
cs: IO port probe 0x3e0-0x4ff: excluding 0x4d0-0x4d7
cs: IO port probe 0x820-0x8ff: clean.
cs: IO port probe 0xc00-0xcf7: clean.
cs: IO port probe 0xa00-0xaff: clean.
cs: memory probe 0x0d0000-0x0dffff: excluding 0xd0000-0xd7fff
cs: memory probe 0x0e0000-0x0effff: clean.
pcmcia: registering new device pcmcia0.0
cs: IO port probe 0x100-0x3af: clean.
cs: IO port probe 0x3e0-0x4ff: excluding 0x4d0-0x4d7
cs: IO port probe 0x820-0x8ff: clean.
cs: IO port probe 0xc00-0xcf7: clean.
cs: IO port probe 0xa00-0xaff: clean.
Probing IDE interface ide2...
hde: SAMSUNG CF/ATA, CFA DISK drive
ide2 at 0x100-0x107,0x10e on irq 9
hde: max request size: 128KiB
hde: 8211168 sectors (4204 MB) w/0KiB Cache, CHS=8146/16/63
 hde: hde1 hde2
ide-cs: hde: Vpp = 0.0
pcmcia: Detected deprecated PCMCIA ioctl usage from process: discover.
pcmcia: This interface will soon be removed from the kernel; please expect breakage unless you upgrade to new tools.
pcmcia: see http://www.kernel.org/pub/linux/utils/kernel/pcmcia/pcmcia.html for details.
eth0: link up, 100Mbps, half-duplex, lpa 0x44E1
NET: Registered protocol family 10
lo: Disabled Privacy Extensions
eth0: no IPv6 routers present
input: Power Button (FF) as /class/input/input2
ACPI: Power Button (FF) [PWRF]
input: Power Button (CM) as /class/input/input3
ACPI: Power Button (CM) [PWRB]
input: Sleep Button (CM) as /class/input/input4
ACPI: Sleep Button (CM) [SLPB]
irda_init()
NET: Registered protocol family 23
powernow-k8: Processor cpuid 680 not supported
agpgart: Found an AGP 2.0 compliant device at 0000:00:00.0.
agpgart: Putting AGP V2 device at 0000:00:00.0 into 1x mode
agpgart: Putting AGP V2 device at 0000:01:00.0 into 1x mode
[drm] Setting GART location based on new memory map
[drm] Loading R200 Microcode
[drm] writeback test succeeded in 1 usecs
kjournald starting.  Commit interval 5 seconds
EXT3 FS on dm-0, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
agpgart: Found an AGP 2.0 compliant device at 0000:00:00.0.
agpgart: Putting AGP V2 device at 0000:00:00.0 into 1x mode
agpgart: Putting AGP V2 device at 0000:01:00.0 into 1x mode
[drm] Loading R200 Microcode
usb 1-1: new full speed USB device using uhci_hcd and address 2
usb 1-1: configuration #1 chosen from 1 choice
Initializing USB Mass Storage driver...
scsi1 : SCSI emulation for USB Mass Storage devices
usbcore: registered new interface driver usb-storage
USB Mass Storage support registered.
usb-storage: device found at 2
usb-storage: waiting for device to settle before scanning
usb-storage: device scan complete
scsi 1:0:0:0: Direct-Access     USB      USB 2.0          1.00 PQ: 0 ANSI: 2
SCSI device sda: 8376321 512-byte hdwr sectors (4289 MB)
sda: Write Protect is off
sda: Mode Sense: 03 00 00 00
sda: assuming drive cache: write through
SCSI device sda: 8376321 512-byte hdwr sectors (4289 MB)
sda: Write Protect is off
sda: Mode Sense: 03 00 00 00
sda: assuming drive cache: write through
 sda: sda1
sd 1:0:0:0: Attached scsi removable disk sda
sd 1:0:0:0: Attached scsi generic sg1 type 0
kjournald starting.  Commit interval 5 seconds
EXT3 FS on sda1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
scsi0: PCI error Interrupt at seqaddr = 0x7
scsi0: Data Parity Error Detected during address or write data phase
usb 1-2: new full speed USB device using uhci_hcd and address 3
usb 1-2: configuration #1 chosen from 1 choice
scsi2 : SCSI emulation for USB Mass Storage devices
usb-storage: device found at 3
usb-storage: waiting for device to settle before scanning
usb-storage: device scan complete
scsi 2:0:0:0: Direct-Access     USB 2.0  Mobile Disk      PMAP PQ: 0 ANSI: 0 CCS
SCSI device sdb: 1003520 512-byte hdwr sectors (514 MB)
sdb: Write Protect is off
sdb: Mode Sense: 23 00 00 00
sdb: assuming drive cache: write through
SCSI device sdb: 1003520 512-byte hdwr sectors (514 MB)
sdb: Write Protect is off
sdb: Mode Sense: 23 00 00 00
sdb: assuming drive cache: write through
 sdb: sdb1
sd 2:0:0:0: Attached scsi removable disk sdb
sd 2:0:0:0: Attached scsi generic sg2 type 0
kjournald starting.  Commit interval 5 seconds
EXT3 FS on sda1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting.  Commit interval 5 seconds
EXT3 FS on sda1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
EXT3-fs error (device sda1): ext3_new_block: block(1046522) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046523) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046524) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046525) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046526) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046531) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046532) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046535) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046537) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046541) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046542) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046544) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046546) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046548) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046549) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046550) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046553) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046554) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046556) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046558) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046561) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046562) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046563) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046565) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046566) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046567) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046568) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046571) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046573) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046575) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046578) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046579) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046581) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046582) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046583) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046585) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046586) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046587) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046589) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046590) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046593) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046594) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046595) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046596) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046597) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046598) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046602) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046603) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046604) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046607) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046610) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046612) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046613) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046614) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046615) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046616) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046620) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046622) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046623) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046624) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046625) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046626) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046627) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046628) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046629) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046633) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046636) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046638) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046639) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046641) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046642) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046643) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046646) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046647) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046648) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046650) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046651) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046652) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046653) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046655) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046657) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046661) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046667) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046669) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046671) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046672) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046673) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046674) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046676) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046680) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046681) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046683) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046685) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046686) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046688) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046689) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046690) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046692) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046694) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046697) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046698) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046701) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046705) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046708) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046713) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046714) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046716) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046718) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046721) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046723) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046726) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046727) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046730) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046731) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046732) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046733) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046735) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046736) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046738) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046740) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046741) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046743) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046749) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046751) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046752) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046757) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046758) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046759) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046760) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046762) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046764) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046766) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046767) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046768) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046770) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046772) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046774) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046776) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046777) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046783) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046784) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046786) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046787) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046788) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046790) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046791) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046792) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046795) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046796) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046797) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046799) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046800) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046804) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046806) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046811) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046812) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046813) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046815) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046816) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046817) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046818) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046821) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046822) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046823) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046825) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046826) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046827) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046828) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046830) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046831) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046832) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046834) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046836) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046838) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046839) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046840) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046841) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046843) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046844) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
EXT3-fs error (device sda1): ext3_new_block: block(1046845) >= blocks count(1046521) - block_group = 31, es == d8f5d400 


And trying to write any more files gives "No space left on device" message,
while only 8% of the device is used:

Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda1              4120356    305988   3605064   8% /mnt/sda1

It's mounted like this:

miernik at tarnica:~$ cat /proc/mounts | grep sda1
/dev/sda1 /mnt/sda1 ext3 rw,nosuid,nodev,noexec,data=ordered 0 0
miernik at tarnica:~$

-- 
Miernik
http://miernik.name/



From public at miernik.name  Sat Jun 16 17:48:25 2007
From: public at miernik.name (Miernik)
Date: Sat, 16 Jun 2007 19:48:25 +0200
Subject: 4 GB USB flash disk with FAT ok, with ext3 corrupted files
References: <20070616005649.6C4A.1.NOFFLE@debian107.local>
	<20070616161150.6FBD.0.NOFFLE@debian107.local>
Message-ID: <20070616174825.6FBD.1.NOFFLE@debian107.local>

Miernik <public at miernik.name> wrote:
> And trying to write any more files gives "No space left on device" message,
> while only 8% of the device is used:
> 
> Filesystem           1K-blocks      Used Available Use% Mounted on
> /dev/sda1              4120356    305988   3605064   8% /mnt/sda1

And the device really has the 4 GB:

debian105:~# dd if=/dev/zero of=/dev/sda
dd: writing to `/dev/sda': No space left on device
8376322+0 records in
8376321+0 records out
4288676352 bytes (4.3 GB) copied, 4145.71 seconds, 1.0 MB/s
debian105:~#

And no error was encountered while doing this dd write.
Neither in dmesg, nor in kern.log. Everything was fine.

-- 
Miernik
http://miernik.name/



From public at miernik.name  Sat Jun 16 19:01:02 2007
From: public at miernik.name (Miernik)
Date: Sat, 16 Jun 2007 21:01:02 +0200
Subject: 4 GB USB flash disk with FAT ok, with ext3 corrupted files
References: <20070616005649.6C4A.1.NOFFLE@debian107.local>
	<20070616161150.6FBD.0.NOFFLE@debian107.local>
Message-ID: <20070616190102.6FBD.2.NOFFLE@debian107.local>

Miernik <public at miernik.name> wrote:
> And trying to write any more files gives "No space left on device" message,
> while only 8% of the device is used:
> 
> Filesystem           1K-blocks      Used Available Use% Mounted on
> /dev/sda1              4120356    305988   3605064   8% /mnt/sda1

And the device really has the 4 GB:

debian105:~# dd if=/dev/zero of=/dev/sda
dd: writing to `/dev/sda': No space left on device
8376322+0 records in
8376321+0 records out
4288676352 bytes (4.3 GB) copied, 4145.71 seconds, 1.0 MB/s
debian105:~#

And no error was encountered while doing this dd write.
Neither in dmesg, nor in kern.log. Everything was fine.

Reading also fine:

debian105:~# dd if=/dev/sda of=/dev/null
8376321+0 records in
8376321+0 records out
4288676352 bytes (4.3 GB) copied, 4128.19 seconds, 1.0 MB/s
debian105:~#

No strange messages in any of the logs while doing that.

Also, I consider it unlikely that these sticks are bad, because the same
thing happens on both of them.

I bought them on these Internet auctions:

http://allegro.pl/item203519391_203519391.html
http://allegro.pl/item201343628_pendrive_4_gb_od_1_zl_.html

As you can see, the first one was a multi-item fixed-price sale of 20
such sticks, and many people bought them, some of whom have already left
positive comments. The seller has 100% positive comments, many of them
from sales of the same type of USB stick. I would contact the seller if
only one of the sticks were bad - well, that could happen - but would he
give me two different bad sticks? Both were bought from the same
seller.

-- 
Miernik
http://miernik.name/



From adilger at clusterfs.com  Mon Jun 18 06:20:27 2007
From: adilger at clusterfs.com (Andreas Dilger)
Date: Mon, 18 Jun 2007 00:20:27 -0600
Subject: kjournald hang on ext3 to ext3 copy
In-Reply-To: <4673E2F1.2090704@sbcglobal.net>
References: <4673E2F1.2090704@sbcglobal.net>
Message-ID: <20070618062027.GB5181@schatzie.adilger.int>

On Jun 16, 2007  08:17 -0500, John Marconi wrote:
> I am running into a situation in which one of my ext3 filesystems is 
> getting hung during normal usage.  There are three ext3 filesystems on a 
> CompactFLASH.  One is mounted as / and one as /tmp.  In my test, I am 
> copying a 100 MB file from /root to /tmp repeatedly.  While doing this 
> test, I eventually see the copying stop, and any attempts to access /tmp 
> fail - if I even do ls /tmp the command will hang.
> 
> I suspect kjournald because of the following ps output:
> PID      PPID   WCHAN:20      PCPU  %MEM  PSR  COMM
> 8847    99 start_this_handle        1.1  0.0  28     pdflush
> 8853    99 schedule_timeout       0.2  0.0   7     pdflush
>  188     1 kswapd                       0.0  0.0  19   kswapd0
> 8051     1 mtd_blktrans_thread   0.0  0.0  22   mtdblockd
> 8243     1 kjournald                    0.0  0.0   0   kjournald
> 8305     1 schedule_timeout        0.0  0.0   2   udevd
> 8378     1 kjournald                    0.0  0.0   0   kjournald
> 8379     1 journal_commit_trans 16.6  0.0   0   kjournald
> 8437     1 schedule_timeout       0.0  0.0   0   evlogd
> 8527     1 syslog                        0.0  0.0   1   klogd
> 8534     1 schedule_timeout       0.0  0.0   0   portmap
> 8569     1 schedule_timeout       0.0  0.0   0   rngd
> 8639     1 schedule_timeout       0.1  0.0  24   sshd
> 8741  8639 schedule_timeout    0.0  0.0   0     sshd
> 8743  8741 wait                        0.0  0.0   9       bash
> 8857  8743 schedule_timeout    4.9  0.0   7         cp
> 8664     1 schedule_timeout       0.0  0.0   0   xinetd
> 8679     1 schedule_timeout       0.0  0.0   0   evlnotifyd
> 8689     1 schedule_timeout       0.0  0.0   0   evlactiond
> 8704     1 wait                           0.0  0.0   1   bash
> 8882  8704 -                            0.0  0.0   2     ps
> 
> If I run ps repeatedly, I always see process 8379 in 
> journal_commit_transaction, and it is always taking between 12% and 20% 
> of processor 0 up.  This process never completes.  I also see process 
> 8847 in start_this_handle forever as well - so I believe they are related. 
> 
> This system is using a 2.6.14 kernel.

Please try to reproduce with a newer kernel, as this kind of problem
might have been fixed already.


A few tips for debugging this kind of issue:
- you need to have detailed stack traces (e.g. sysrq-t) of all the
  interesting processes

- if a process is stuck inside a large function (e.g. 8379 in the example),
  you need to provide the exact line number.  This can be found by compiling
  the kernel with CONFIG_DEBUG_INFO (-g flag to gcc) and then doing
  "gdb vmlinux" and "list *(journal_commit_transaction+{offset})", where the
  byte offset is printed in the sysrq-t output.  Then include the code
  surrounding that line from the source file.

- a process stuck in "start_this_handle()" is often just an innocent
  bystander.  It is waiting for the currently committing transaction to
  complete before it can start a new filesystem-modifying operation (handle).
  That said, the journal handle acts like a lock and has been the cause of
  many deadlock problems (e.g. process 1 holds lock, waits for handle;
  process 2 holds transaction open waiting for lock).  pdflush might be one
  of the "process 1" kind of tasks, and some other process is holding the
  transaction open preventing it from completing.
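
The deadlock pattern described in the last tip can be pictured as a cycle
in a "wait-for" graph. The Python sketch below is only an illustration of
that idea, not anything taken from the kernel: the task names and the graph
are hypothetical, and an edge A -> B simply means "A is waiting for
something B holds".

```python
# Hypothetical wait-for graph: an edge A -> B means "A is waiting for
# something that B currently holds".  All names are illustrative only.
def find_cycle(waits_for):
    """Return one cycle in the wait-for graph as a list of nodes, or None."""
    visiting, done = [], set()

    def visit(node):
        if node in done:
            return None
        if node in visiting:
            # Found a back edge: the cycle is the tail of the current path.
            return visiting[visiting.index(node):] + [node]
        visiting.append(node)
        for nxt in waits_for.get(node, ()):
            cycle = visit(nxt)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for start in list(waits_for):
        cycle = visit(start)
        if cycle:
            return cycle
    return None


# Process 1 holds a lock and waits for a journal handle; the handle cannot
# start until the committing transaction completes; process 2 keeps that
# transaction open while waiting for the lock held by process 1.
deadlocked = {
    "process1": ["transaction"],
    "transaction": ["process2"],
    "process2": ["process1"],
}

# An innocent bystander only waits for the commit; nothing waits on it.
bystander = {"pdflush": ["kjournald"], "kjournald": []}

print(find_cycle(deadlocked))
print(find_cycle(bystander))
```

In practice the edges have to be reconstructed by hand from the stack
traces, which is why the full sysrq-t output matters.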

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.



From public at miernik.name  Mon Jun 18 08:59:44 2007
From: public at miernik.name (Miernik)
Date: Mon, 18 Jun 2007 10:59:44 +0200
Subject: 4 GB USB flash disk with FAT ok,   with ext3 corrupted files
References: <20070616213253.GA6528@tarnica>
	<Pine.LNX.4.44L0.0706161739410.3643-100000@netrider.rowland.org>
Message-ID: <20070618085944.7924.0.NOFFLE@debian107.local>

Alan Stern <stern at rowland.harvard.edu> wrote:
> What you should do is fill up the drive with known data (not just 0's 
> like in your dd test), and then read it back to see if the data has
> changed.

Using dd I found the cause of the problem:

http://reviews.ebay.co.uk/Beware-of-FAKE-1GB-2GB-4GB-8GB-USB-Flash-Drives-on-eBay_W0QQugidZ10000000000953346
http://blog.uhuru.de/?p=1080
http://projectglop.com/2007/05/14/fake-usb-memory-keys/
http://reviews.ebay.com.au/BEWARE-of-FAKE-1GB-2GB-4GB-8GB-USB-Flash-Drives-on-eBay_W0QQugidZ10000000000706427

These USB sticks are fake, and have a 1 GB flash chip, and a fake
controller which makes the computer think it is a 4 GB stick.

Any data written past the first 1066401792 bytes is lost, and reading
any data over that boundary gives a copy of the last 2048 bytes of the
real flash chip, repeated as many times to fill the whole stick.
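
To put that boundary in filesystem terms, here is a small Python sketch of
the arithmetic. The byte counts, block size and blocks-per-group come from
the dd, fdisk and mke2fs output posted in this thread; ignoring the small
offset of the partition from the start of the device is a simplifying
assumption, so the block numbers are approximate.

```python
REAL_BYTES = 1066401792        # usable flash, as measured above
REPORTED_BYTES = 4288676352    # size the controller reports (fdisk, dd)
BLOCK_SIZE = 4096              # ext3 block size reported by mke2fs
BLOCKS_PER_GROUP = 32768       # blocks per group reported by mke2fs

# Assumption: the offset of /dev/sda1 from the start of /dev/sda is ignored.
real_mib = REAL_BYTES / 2**20                     # 1017.0 MiB
real_blocks = REAL_BYTES // BLOCK_SIZE            # 260352 blocks of 4 KB
boundary_group = real_blocks // BLOCKS_PER_GROUP  # block group 7
usable_fraction = REAL_BYTES / REPORTED_BYTES     # roughly a quarter

print(f"real capacity: {real_mib:.0f} MiB = {real_blocks} ext3 blocks")
print(f"boundary falls inside block group {boundary_group}")
print(f"usable fraction of reported size: {usable_fraction:.1%}")
```

Under that assumption, everything from block group 8 onwards would live
entirely in the part of the stick that does not really exist.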

I took one of the sticks apart. The flash chip is
FBNM40A4GK3WG

The controller is
iCreate
I5128-LG
L702
CE7103

http://www.icreate.com.tw/img/PDF/i5128-L_datasheet_preliminary_v010.pdf

Sorry for wasting your time. But the benefit is that now searching for
the error messages that I encountered in Gmane or Google will reveal
this thread with the real cause.

It is strange, though: if 95% of the sticks sold on eBay are such
fakes, why did no one on this USB mailing list know about it, and why
didn't my Googling for the error messages turn up any posts about the
cause? I hope this post will help spread that knowledge.

I am also very surprised that these sellers manage to get positive
comments for these sticks. Don't the people who buy them notice? Don't
they fill them past 1 GB? If not, why buy a 4 GB stick when a 1 GB one
would do? Or perhaps, by the time they finally fill the stick past 1 GB
and actually read that data back, so much time has passed since the
purchase that they think the stick has simply broken? Am I one of the
few who tried to fill it past 1 GB on the first day I got it?

-- 
Miernik
http://miernik.name/



From alex at alex.org.uk  Mon Jun 18 09:33:26 2007
From: alex at alex.org.uk (Alex Bligh)
Date: Mon, 18 Jun 2007 10:33:26 +0100
Subject: 4 GB USB flash disk with FAT ok,   with ext3 corrupted files
In-Reply-To: <20070618085944.7924.0.NOFFLE@debian107.local>
References: <20070616213253.GA6528@tarnica>
	<Pine.LNX.4.44L0.0706161739410.3643-100000@netrider.rowland.org>
	<20070618085944.7924.0.NOFFLE@debian107.local>
Message-ID: <E10FBACFDCEC1985E7142BE8@[192.168.100.25]>



--On 18 June 2007 10:59 +0200 Miernik <public at miernik.name> wrote:

> These USB sticks are fake, and have a 1 GB flash chip, and a fake
> controller which makes the computer think it is a 4 GB stick.
>
> Any data written past the first 1066401792 bytes is lost, and reading
> any data over that boundary gives a copy of the last 2048 bytes of the
> real flash chip, repeated as many times to fill the whole stick.

Hmmmm.... I wonder whether it would be useful for mke2fs etc. to write
to sector n-1 and n-2 (where there are n sectors on the disk) and read
the sectors back to check the last sectors on the disk actually work.
This would detect bad extents very easily and quickly. I am sure there
are innocent causes of this problem on other media (i.e. it would
be useful beyond fake USB drives)
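
A rough Python sketch of that idea follows. It is not how mke2fs works and
is only meant to illustrate the check: the function name, the 512-byte
sector size and the FakeStick class (a simulated device that silently drops
writes past its real size) are all assumptions made for the example.

```python
import io
import os

SECTOR = 512  # assumed sector size


def tail_sectors_ok(dev, n_sectors, count=2, sector=SECTOR):
    """Write random data to the last `count` sectors and read it back.

    `dev` is any seekable binary file object opened for reading and writing.
    Destructive: the previous contents of those sectors are overwritten.
    """
    for i in range(n_sectors - count, n_sectors):
        pattern = os.urandom(sector)
        dev.seek(i * sector)
        dev.write(pattern)
        dev.flush()
        dev.seek(i * sector)
        if dev.read(sector) != pattern:
            return False
    return True


class FakeStick(io.BytesIO):
    """Simulated stick (hypothetical) that drops writes past `real_size`."""

    def __init__(self, reported_size, real_size):
        super().__init__(bytes(reported_size))
        self.real_size = real_size

    def write(self, data):
        pos = self.tell()
        keep = max(0, min(len(data), self.real_size - pos))
        super().write(data[:keep])
        self.seek(pos + len(data))  # pretend the whole write succeeded
        return len(data)


good = io.BytesIO(bytes(8 * SECTOR))
fake = FakeStick(reported_size=8 * SECTOR, real_size=2 * SECTOR)
print(tail_sectors_ok(good, 8), tail_sectors_ok(fake, 8))
```

On a real block device the read-back would also have to bypass the page
cache (for example with direct I/O, or after re-plugging the device),
otherwise the kernel may simply return the data it has just cached.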

Alex



From stern at rowland.harvard.edu  Sat Jun 16 21:20:46 2007
From: stern at rowland.harvard.edu (Alan Stern)
Date: Sat, 16 Jun 2007 17:20:46 -0400 (EDT)
Subject: [Linux-usb-users] 4 GB USB flash disk with FAT ok, with ext3
 corrupted files
In-Reply-To: <20070616174825.6FBD.1.NOFFLE@debian107.local>
Message-ID: <Pine.LNX.4.44L0.0706161717450.3643-100000@netrider.rowland.org>

On Sat, 16 Jun 2007, Miernik wrote:

> Miernik <public at miernik.name> wrote:
> > And trying to write any more files gives "No space left on device" message,
> > while only 8% of the device is used:
> > 
> > Filesystem           1K-blocks      Used Available Use% Mounted on
> > /dev/sda1              4120356    305988   3605064   8% /mnt/sda1
> 
> And the device really has the 4 GB:
> 
> debian105:~# dd if=/dev/zero of=/dev/sda
> dd: writing to `/dev/sda': No space left on device
> 8376322+0 records in
> 8376321+0 records out
> 4288676352 bytes (4.3 GB) copied, 4145.71 seconds, 1.0 MB/s
> debian105:~#
> 
> And no error was encountered while doing this dd write.
> Neither in dmesg, nor in kern.log. Everything was fine.

This is a little misleading.  You are comparing the "df" output for 
/dev/sda1 with a transfer to /dev/sda.  Furthermore the units are 
different; df uses 1-KB blocks and dd uses 512-byte blocks.
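
For reference, the two figures can be brought to the same unit with a few
lines of Python. The numbers are copied from the quoted dd and df output;
the variable names are just for this sketch.

```python
DD_RECORDS_OUT = 8376321   # "records out" from the dd run on /dev/sda
DD_RECORD_BYTES = 512      # dd's default block size
DF_1K_BLOCKS = 4120356     # "1K-blocks" column from df for /dev/sda1

device_bytes = DD_RECORDS_OUT * DD_RECORD_BYTES   # whole device, in bytes
device_kib = device_bytes / 1024                  # same, in 1-KB units
df_bytes = DF_1K_BLOCKS * 1024                    # filesystem size, in bytes

print(f"dd: {device_bytes} bytes = {device_kib} KB on /dev/sda")
print(f"df: {df_bytes} bytes = {DF_1K_BLOCKS} KB on /dev/sda1")
```

Even in the same unit the two numbers are not expected to match, since one
describes the whole device and the other a filesystem inside a partition.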

It would help to see the output from "fdisk -l /dev/sda".

Alan Stern



From public at miernik.name  Sat Jun 16 21:32:53 2007
From: public at miernik.name (Miernik)
Date: Sat, 16 Jun 2007 23:32:53 +0200
Subject: [Linux-usb-users] 4 GB USB flash disk with FAT ok,
	with ext3 corrupted files
In-Reply-To: <Pine.LNX.4.44L0.0706161717450.3643-100000@netrider.rowland.org>
References: <20070616174825.6FBD.1.NOFFLE@debian107.local>
	<Pine.LNX.4.44L0.0706161717450.3643-100000@netrider.rowland.org>
Message-ID: <20070616213253.GA6528@tarnica>

On Sat, Jun 16, 2007 at 05:20:46PM -0400, Alan Stern wrote:
> This is a little misleading.  You are comparing the "df" output for 
> /dev/sda1 with a transfer to /dev/sda.  Furthermore the units are 
> different; df uses 1-KB blocks and dd uses 512-byte blocks.

The point of doing dd was to find out whether there would be any errors,
not to check the size.

> It would help to see the output from "fdisk -l /dev/sda".

debian105:~# fdisk -l /dev/sda

Disk /dev/sda: 4288 MB, 4288676352 bytes
132 heads, 62 sectors/track, 1023 cylinders
Units = cylinders of 8184 * 512 = 4190208 bytes

Disk /dev/sda doesn't contain a valid partition table
debian105:~#

Ah, that was after the dd zeroing. I created the partition table and partition,
and here it is again:

debian105:~# fdisk -l /dev/sda

Disk /dev/sda: 4288 MB, 4288676352 bytes
132 heads, 62 sectors/track, 1023 cylinders
Units = cylinders of 8184 * 512 = 4190208 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1        1023     4186085   83  Linux
debian105:~#

And created the filesystem:

debian105:~# mkfs.ext3 /dev/sda1
mke2fs 1.40-WIP (14-Nov-2006)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
523264 inodes, 1046521 blocks
52326 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=1073741824
32 block groups
32768 blocks per group, 32768 fragments per group
16352 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736

Writing inode tables: done
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 29 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.
debian105:~# fdisk -l /dev/sda

Disk /dev/sda: 4288 MB, 4288676352 bytes
132 heads, 62 sectors/track, 1023 cylinders
Units = cylinders of 8184 * 512 = 4190208 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1        1023     4186085   83  Linux
debian105:~#

Did it help?

Is this a bug in ext3 code?
My USB sticks are broken?
Bug in kernel USB subsystem?
My USB ports suck?

Anything else? Has anyone seen a cheap no-name 4 GB USB stick work with ext3?
The seller of these sticks got plenty of positive comments on the auction site,
and no complaints, so they work for everyone besides me. But probably everyone
else uses Windows. I don't have a Windows system to test this.

-- 
Miernik
http://miernik.name/



From stern at rowland.harvard.edu  Sat Jun 16 21:34:05 2007
From: stern at rowland.harvard.edu (Alan Stern)
Date: Sat, 16 Jun 2007 17:34:05 -0400 (EDT)
Subject: [Linux-usb-users] 4 GB USB flash disk with FAT ok, with ext3
 corrupted files
In-Reply-To: <20070616161150.6FBD.0.NOFFLE@debian107.local>
Message-ID: <Pine.LNX.4.44L0.0706161722140.3643-100000@netrider.rowland.org>

On Sat, 16 Jun 2007, Miernik wrote:

> Posting now to two lists, one about USB and the other about ext3 as I
> don't know what is the source of the problem.
> 
> Miernik <public at miernik.name> wrote:
> > I recently bought 2 different USB flash disks. These are some cheap no-name
> > devices.  Their parameters:
> > 
> >   bytes            C/H/S       ID
> > 4288676352      1023/132/62     Vendor: USB     Model: USB 2.0          Rev: 1.00  ANSI SCSI revision: 02
> 
> Right now after trying to copy about 0.5 GB of files to a freshly created ext3
> filesystem on the device, this is the output of dmesg:
...
> EXT3-fs error (device sda1): ext3_new_block: block(1046522) >= blocks count(1046521) - block_group = 31, es == d8f5d400 
> EXT3-fs error (device sda1): ext3_new_block: block(1046523) >= blocks count(1046521) - block_group = 31, es == d8f5d400 

> And trying to write any more files gives "No space left on device" message,
> while only 8% of the device is used:
> 
> Filesystem           1K-blocks      Used Available Use% Mounted on
> /dev/sda1              4120356    305988   3605064   8% /mnt/sda1

This doesn't seem to be a USB error.  Look at the ext3 error message.  
It's complaining about a block number being out of range, not any sort 
of I/O problem.

Also I have no idea where that value of 1046521 for the total block
count came from.  These are 4-KB size blocks; converting to 1-KB blocks
gives 4186084, which is larger than the total size listed above for
/dev/sda1.  The output from "fdisk -l /dev/sda" would come in useful 
here.

Alan Stern



From stern at rowland.harvard.edu  Sat Jun 16 21:47:02 2007
From: stern at rowland.harvard.edu (Alan Stern)
Date: Sat, 16 Jun 2007 17:47:02 -0400 (EDT)
Subject: [Linux-usb-users] 4 GB USB flash disk with FAT ok, with ext3
 corrupted files
In-Reply-To: <20070616213253.GA6528@tarnica>
Message-ID: <Pine.LNX.4.44L0.0706161739410.3643-100000@netrider.rowland.org>

On Sat, 16 Jun 2007, Miernik wrote:

> Ah, that was after the dd zeroing. I created the partition table and partition,
> and here it is again:
> 
> debian105:~# fdisk -l /dev/sda
> 
> Disk /dev/sda: 4288 MB, 4288676352 bytes
> 132 heads, 62 sectors/track, 1023 cylinders
> Units = cylinders of 8184 * 512 = 4190208 bytes
> 
>    Device Boot      Start         End      Blocks   Id  System
> /dev/sda1               1        1023     4186085   83  Linux
> debian105:~#

Ah, good.  Note that 4186085 / 4 = 1046521, which agrees with the value 
below and explains those ext3 error messages.
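
Spelled out as a short Python sketch (numbers taken from the fdisk and
mke2fs output quoted in this message; the variable names are made up for
the example):

```python
import math

PARTITION_1K_BLOCKS = 4186085   # "Blocks" column of fdisk for /dev/sda1
FS_BLOCK_KIB = 4                # ext3 block size is 4096 bytes
BLOCKS_PER_GROUP = 32768        # from the mke2fs output

fs_blocks = PARTITION_1K_BLOCKS // FS_BLOCK_KIB      # 1046521, rounded down
groups = math.ceil(fs_blocks / BLOCKS_PER_GROUP)     # 32 block groups
last_group_first = (groups - 1) * BLOCKS_PER_GROUP   # group 31 starts here
last_group_span_end = groups * BLOCKS_PER_GROUP      # where a full group ends
padding = last_group_span_end - fs_blocks            # bits past the fs end

print(f"filesystem blocks: {fs_blocks}, block groups: {groups}")
print(f"group 31 covers blocks {last_group_first}..{fs_blocks - 1}")
print(f"{padding} bitmap positions in group 31 lie past the end")
```

The block numbers in the error messages (1046522 and up) all fall in that
last stretch, between the filesystem's block count and the end of what a
full group 31 would cover. One possible reading, and it is only an
assumption here, is that the block bitmap for the last group was read back
with those positions marked free.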

> And created the filesystem:
> 
> debian105:~# mkfs.ext3 /dev/sda1
> mke2fs 1.40-WIP (14-Nov-2006)
> Filesystem label=
> OS type: Linux
> Block size=4096 (log=2)
> Fragment size=4096 (log=2)
> 523264 inodes, 1046521 blocks
> 52326 blocks (5.00%) reserved for the super user
> First data block=0
> Maximum filesystem blocks=1073741824
> 32 block groups
> 32768 blocks per group, 32768 fragments per group
> 16352 inodes per group
> Superblock backups stored on blocks:
>         32768, 98304, 163840, 229376, 294912, 819200, 884736
> 
> Writing inode tables: done
> Creating journal (16384 blocks): done
> Writing superblocks and filesystem accounting information: done
> 
> This filesystem will be automatically checked every 29 mounts or
> 180 days, whichever comes first.  Use tune2fs -c or -i to override.
> debian105:~# fdisk -l /dev/sda

> Did it help?
> 
> Is this a bug in ext3 code?

I don't know; maybe.  Or maybe the code is okay but it's getting bad
data from somewhere.  For example, even though the USB reads succeed, 
they might not return the same data that was originally written to the 
device.

> My USB sticks are broken?

Maybe.

> Bug in kernel USB subsystem?

No.

> My USB ports suck?

No.  A problem in the port would cause a USB error, not bad data.

What you should do is fill up the drive with known data (not just 0's 
like in your dd test), and then read it back to see if the data has
changed.
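
A minimal sketch of such a test, in Python.  Everything here is an
assumption for illustration: the function names, the 4096-byte block
size, and the choice of a SHA-256 based pattern so that every block has
different, reproducible contents.  It is not a replacement for a proper
tool, only a way to show the idea.

```python
import hashlib
import os

BLOCK_SIZE = 4096  # assumption: same as the filesystem block size


def pattern_block(index, size=BLOCK_SIZE):
    """Return deterministic pseudo-random bytes unique to block `index`."""
    out = bytearray()
    counter = 0
    while len(out) < size:
        out += hashlib.sha256(f"{index}:{counter}".encode()).digest()
        counter += 1
    return bytes(out[:size])


def write_pattern(path, nblocks, size=BLOCK_SIZE):
    """Overwrite the first `nblocks` blocks of `path` with the pattern."""
    with open(path, "r+b") as dev:
        for i in range(nblocks):
            dev.write(pattern_block(i, size))
        dev.flush()
        os.fsync(dev.fileno())  # push the data out of the page cache


def verify_pattern(path, nblocks, size=BLOCK_SIZE):
    """Return the indices of blocks that no longer match the pattern."""
    bad = []
    with open(path, "rb") as dev:
        for i in range(nblocks):
            if dev.read(size) != pattern_block(i, size):
                bad.append(i)
    return bad
```

Running write_pattern() against the whole device destroys everything on
it.  Unplug and replug the stick between the write and the verify pass,
otherwise the reads may be served from the page cache instead of the
device.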

Alan Stern



From jamarconi at sbcglobal.net  Tue Jun 19 03:53:02 2007
From: jamarconi at sbcglobal.net (John Marconi)
Date: Mon, 18 Jun 2007 22:53:02 -0500
Subject: kjournald hang on ext3 to ext3 copy
In-Reply-To: <20070618062027.GB5181@schatzie.adilger.int>
References: <4673E2F1.2090704@sbcglobal.net>
	<20070618062027.GB5181@schatzie.adilger.int>
Message-ID: <4677531E.1030108@sbcglobal.net>

Andreas Dilger wrote:
> On Jun 16, 2007  08:17 -0500, John Marconi wrote:
>   
>> I am running into a situation in which one of my ext3 filesystems is 
>> getting hung during normal usage.  There are three ext3 filesystems on a 
>> CompactFLASH.  One is mounted as / and one as /tmp.  In my test, I am 
>> copying a 100 MB file from /root to /tmp repeatedly.  While doing this 
>> test, I eventually see the copying stop, and any attempts to access /tmp 
>> fail - if I even do ls /tmp the command will hang.
>>
>> I suspect kjournald because of the following ps output:
>>   PID  PPID WCHAN:20             PCPU %MEM PSR  COMM
>>  8847    99 start_this_handle     1.1  0.0  28    pdflush
>>  8853    99 schedule_timeout      0.2  0.0   7    pdflush
>>   188     1 kswapd                0.0  0.0  19  kswapd0
>>  8051     1 mtd_blktrans_thread   0.0  0.0  22  mtdblockd
>>  8243     1 kjournald             0.0  0.0   0  kjournald
>>  8305     1 schedule_timeout      0.0  0.0   2  udevd
>>  8378     1 kjournald             0.0  0.0   0  kjournald
>>  8379     1 journal_commit_trans 16.6  0.0   0  kjournald
>>  8437     1 schedule_timeout      0.0  0.0   0  evlogd
>>  8527     1 syslog                0.0  0.0   1  klogd
>>  8534     1 schedule_timeout      0.0  0.0   0  portmap
>>  8569     1 schedule_timeout      0.0  0.0   0  rngd
>>  8639     1 schedule_timeout      0.1  0.0  24  sshd
>>  8741  8639 schedule_timeout      0.0  0.0   0    sshd
>>  8743  8741 wait                  0.0  0.0   9      bash
>>  8857  8743 schedule_timeout      4.9  0.0   7        cp
>>  8664     1 schedule_timeout      0.0  0.0   0  xinetd
>>  8679     1 schedule_timeout      0.0  0.0   0  evlnotifyd
>>  8689     1 schedule_timeout      0.0  0.0   0  evlactiond
>>  8704     1 wait                  0.0  0.0   1  bash
>>  8882  8704 -                     0.0  0.0   2    ps
>>
>> If I run ps repeatedly, I always see process 8379 in 
>> journal_commit_transaction, and it is always taking between 12% and 20% 
>> of processor 0 up.  This process never completes.  I also see process 
>> 8847 in start_this_handle forever as well - so I believe they are related. 
>>
>> This system is using a 2.6.14 kernel.
>>     
>
> Please try to reproduce with a newer kernel, as this kind of problem
> might have been fixed already.
>
>
> Three tips for debugging this kind of issue:
> - you need to have detailed stack traces (e.g. sysrq-t) of all the
>   interesting processes
>
> - if a process is stuck inside a large function (e.g. 8379 in example)
>   you need to provide the exact line number.  this can be found by compiling
>   the kernel with CONFIG_DEBUG_INFO (-g flag to gcc) and then doing
>   "gdb vmlinux" and "p *(journal_commit_transaction+{offset})", where the
>   byte offset is printed in the sysrq-t output, and then include the code
>   surrounding that line from the source file
>
> - a process stuck in "start_this_handle()" is often just an innocent
>   bystander.  It is waiting for the currently committing transaction to
>   complete before it can start a new filesystem-modifying operation (handle).
>   That said, the journal handle acts like a lock and has been the cause of
>   many deadlock problems (e.g. process 1 holds lock, waits for handle;
>   process 2 holds transaction open waiting for lock).  pdflush might be one
>   of the "process 1" kind of tasks, and some other process is holding the
>   transaction open preventing it from completing.
>
> Cheers, Andreas
> --
> Andreas Dilger
> Principal Software Engineer
> Cluster File Systems, Inc.
>
>
>   
Andreas,

Thanks for the information.

I am not able to update the entire kernel to a new version for a variety 
of reasons, however I can update certain parts in my system (such as the 
filesystem).  I did a diff of the 2.6.16 kernel against my kernel, and 
the changes to jbd were minimal.  I plan on looking at the latest 
versions of the kernel to determine if anything has changed since 2.6.16.

I took a look at the place where kjournald was stuck - it is in the 
journal_commit_transaction "while (commit_transaction->t_updates)" loop 
and it is trying to "spin_lock(&journal->j_state_lock)".  When I look at 
pdflush, it is also trying to take the journal->j_state_lock.  Do you 
have any tips on finding out which process might own journal->j_state_lock?

Thanks again,
John



From adilger at clusterfs.com  Tue Jun 19 05:14:02 2007
From: adilger at clusterfs.com (Andreas Dilger)
Date: Mon, 18 Jun 2007 23:14:02 -0600
Subject: kjournald hang on ext3 to ext3 copy
In-Reply-To: <4677531E.1030108@sbcglobal.net>
References: <4673E2F1.2090704@sbcglobal.net>
	<20070618062027.GB5181@schatzie.adilger.int>
	<4677531E.1030108@sbcglobal.net>
Message-ID: <20070619051402.GO5181@schatzie.adilger.int>

On Jun 18, 2007  22:53 -0500, John Marconi wrote:
> Andreas Dilger wrote:
> >Three tips for debugging this kind of issue:
> >- you need to have detailed stack traces (e.g. sysrq-t) of all the
> >  interesting processes
> >
> >- if a process is stuck inside a large function (e.g. 8379 in example)
> >  you need to provide the exact line number.  this can be found by 
> >  compiling
> >  the kernel with CONFIG_DEBUG_INFO (-g flag to gcc) and then doing
> >  "gdb vmlinux" and "p *(journal_commit_transaction+{offset})", where the
> >  byte offset is printed in the sysrq-t output, and then include the code
> >  surrounding that line from the source file
> >
> >- a process stuck in "start_this_handle()" is often just an innocent
> >  bystander.  It is waiting for the currently committing transaction to
> >  complete before it can start a new filesystem-modifying operation 
> >  (handle).
> >  That said, the journal handle acts like a lock and has been the cause of
> >  many deadlock problems (e.g. process 1 holds lock, waits for handle;
> >  process 2 holds transaction open waiting for lock).  pdflush might be one
> >  of the "process 1" kind of tasks, and some other process is holding the
> >  transaction open preventing it from completing.
> 
> I am not able to update the entire kernel to a new version for a variety 
> of reasons, however I can update certain parts in my system (such as the 
> filesystem).  I did a diff of the 2.6.16 kernel against my kernel, and 
> the changes to jbd were minimal.  I plan on looking at the latest 
> versions of the kernel to determine if anything has changed since 2.6.16.

The problem may also be in the ext3 layer and not jbd.

> I took a look at the place where kjournald was stuck - it is in the 
> journal_commit_transaction "while (commit_transaction->t_updates)" loop 
> and it is trying to "spin_lock(&journal->j_state_lock)".  When I look at 
> pdflush, it is also trying to take the journal->j_state_lock.  Do you 
> have any tips on finding out which process might own journal->j_state_lock?

You can enable CONFIG_DEBUG_SPINLOCK in newer kernels and it appears the
spinlock will set the "owner" field to the task struct.  You still need
to get access to this via e.g. "crash" or lkcd or something.

Hmm, it seems this is only set for ppc and s390???  That is how I would
debug this in any case.  The other way (I've done this too many times
in the past) is to look through all of the stack traces and figure out
which ones are in a filesystem context, then check if any of them are
blocked on locks while holding transactions open.  Needs a detailed
understanding of kernel callpaths.
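
Once the stack traces have been reduced by hand to "who holds what and
who waits for what", the search for the deadlock itself is mechanical.
The sketch below is only an illustration of that last step: the data
model (task names, resource names, the "holds"/"waits" keys) is
hypothetical and has to be filled in manually from the traces; nothing
here reads kernel state.

```python
def find_wait_cycle(tasks):
    """Return a list of task names forming a wait-for cycle, or None.

    `tasks` maps a task name to a dict with two keys:
      "holds": set of resources the task currently owns
      "waits": the resource it is blocked on, or None
    """
    # Map each resource to the task that owns it.
    owner = {res: name for name, t in tasks.items() for res in t["holds"]}
    for start in tasks:
        chain = [start]
        while True:
            wanted = tasks[chain[-1]]["waits"]
            holder = owner.get(wanted)
            if holder is None:
                break  # not waiting, or waiting on something nobody owns
            if holder in chain:
                return chain[chain.index(holder):]
            chain.append(holder)
    return None


# The example from the earlier mail: process 1 holds a lock and waits
# for the transaction; process 2 holds the transaction open and waits
# for the lock.  The third task is an innocent bystander.
tasks = {
    "process 1": {"holds": {"lock"}, "waits": "transaction"},
    "process 2": {"holds": {"transaction"}, "waits": "lock"},
    "bystander": {"holds": set(), "waits": "transaction"},
}
print(find_wait_cycle(tasks))  # prints ['process 1', 'process 2']
```

The bystander waits on the same transaction but is not part of the
cycle, which matches the point that a task in start_this_handle() is
often not the culprit.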

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.



From doseyg at r-networks.net  Fri Jun 29 23:21:42 2007
From: doseyg at r-networks.net (Glen Dosey)
Date: Fri, 29 Jun 2007 19:21:42 -0400
Subject: poor read performance
Message-ID: <1183159302.30971.37.camel@localhost.localdomain>

I am seeing what seems to be a notable limit on read performance of an
ext3 filesystem. If anyone could offer some insight it would be helpful.

Background:
12 x 500G SATA disks in a hardware RAID enclosure connected via 2Gb/s FC
to a 4 x 2.6 GHz system with 4GB RAM running RHEL4.5. Initially the
enclosure was configured as RAID5 (10+1 parity), although I've also tried
RAID 50 and currently RAID 0. I've varied chunk sizes from 64-256K. 

Problem:
No matter what I do I cannot get the ext3 read performance above
~90MB/s. Under virtually every configuration listed above the write
performance is greater than the read performance. I've run a large
number of Bonnie++ and IOzone tests, but for the sake of simplicity in
this email I'll just refer to simple dd's with /dev/zero.

Details:
Under the current RAID0 setup I see the following when dd'ing.

DD 4G from /dev/zero to /dev/sdd disk (no filesystem) & sync  28 seconds
DD 4G from /dev/sdd to /dev/null                              32 seconds
DD 4G to ext3 on /dev/sdd & sync                              32 seconds
DD 4G from ext3 file to /dev/null                             48 seconds
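
For reference, the timings above can be converted to throughput with a
few lines of Python.  This is a sketch that assumes "4G" means 4 GiB
and reports decimal megabytes per second; the labels and the function
name are made up for illustration.

```python
GIB = 1024 ** 3
MB = 10 ** 6


def throughput_mb_s(nbytes, seconds):
    """Average transfer rate in decimal MB/s."""
    return nbytes / seconds / MB


# Seconds taken by each 4G dd run listed above.
timings = {
    "raw write (/dev/zero -> sdd)": 28,
    "raw read  (sdd -> /dev/null)": 32,
    "ext3 write": 32,
    "ext3 read": 48,
}

for label, secs in timings.items():
    print(f"{label:<30} {throughput_mb_s(4 * GIB, secs):6.1f} MB/s")
```

Under that assumption the ext3 read works out to just under 90 MB/s,
and the raw read takes two thirds of the time (32 s versus 48 s), i.e.
it is 50% faster.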

I've been watching the port usage on the FC switch and it verifies what
I am seeing: writes max out near 2Gb/s, but reads hit some artificial
limit around 90 MB/s and never exceed it with the filesystem,
regardless of the underlying RAID configuration. Without a filesystem
the reads are at least 50% faster, and it can be seen on the FC switch
graphs as well.

Any help or thoughts would be appreciated.

Thanks,
~Glen




From ling at fnal.gov  Sat Jun 30 05:18:22 2007
From: ling at fnal.gov (Ling C. Ho)
Date: Sat, 30 Jun 2007 00:18:22 -0500
Subject: poor read performance
In-Reply-To: <1183159302.30971.37.camel@localhost.localdomain>
References: <1183159302.30971.37.camel@localhost.localdomain>
Message-ID: <4685E79E.3060709@fnal.gov>

Hi,

Did you see any difference when a different block size is used (for 
example, dd with bs=64k or 128k)? Also try changing the read-ahead 
setting. Run blockdev --getra /dev/sdd to see the current value, and 
blockdev --setra 8192 /dev/sdd to change it. 8192 is a good number that 
has been working well for me on a setup of similar size.
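
To put the number in context: blockdev expresses read-ahead in 512-byte 
sectors, so 8192 corresponds to 4 MiB. The sketch below does that 
conversion and, as an assumption about why a value of this size might 
help here, compares it with the full stripe width of a 12-disk RAID 0 at 
the chunk sizes mentioned in the original post. The function names are 
made up for illustration.

```python
SECTOR_BYTES = 512  # blockdev --getra/--setra count 512-byte sectors
KIB = 1024


def readahead_bytes(ra_sectors):
    """Read-ahead window in bytes for a given blockdev --setra value."""
    return ra_sectors * SECTOR_BYTES


def full_stripe_bytes(ndisks, chunk_kib):
    """Bytes in one full RAID 0 stripe across `ndisks` data disks."""
    return ndisks * chunk_kib * KIB


print(readahead_bytes(8192) // KIB, "KiB read-ahead")
for chunk in (64, 128, 256):
    print(chunk, "KiB chunks:", full_stripe_bytes(12, chunk) // KIB, "KiB per stripe")
```

With 256K chunks a full stripe is 3 MiB, so a 4 MiB read-ahead window 
is large enough to keep all twelve disks busy on a sequential read.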

...
ling

Glen Dosey wrote:
> I am seeing what seems to be a notable limit on read performance of an
> ext3 filesystem. If anyone could offer some insight it would be helpful.
>
> Background:
> 12 x 500G SATA disks in a Hardware RAID enclosure connected via 2Gb/s FC
> to a 4 x 2.6 Ghz system with 4GB ram running RHEL4.5. Initially the
> enclosure was configured RAID5 10+1 parity, although I've also tried
> RAID 50 and currently RAID 0. I've varied chunk sizes from 64-256K. 
>
> Problem:
> No matter what I do I cannot get the ext3 read performance above
> ~90MB/s. Under virtually every configuration listed above the write
> performance is greater than the read performance. I've run a large
> number of Bonnie++ and IOzone tests, but for the sake of simplicity in
> this email I'll just refer to simple dd's with /dev/zero.
>
> Details:
> Under the current RAID0 setup I see the following when dd'ing.
>
> DD 4G from /dev/zero to /dev/sdd disk (no filesystem) & sync
> 28 seconds
> DD 4G from /dev/sdd to /dev/null 32 seconds
> DD 4G to ext3 on /dev/sdd & sync 32 seconds
> DD 4G from ext3 file to /dev/null 48 seconds.
>
> I've been watching the port usage on the FC switch and it verifies what
> I am seeing: writes max out near 2Gb/s, but reads hit some artificial
> limit around 90 MB/s and never exceed it with the filesystem,
> regardless of the underlying RAID configuration. Without a filesystem
> the reads are at least 50% faster, and it can be seen on the FC switch
> graphs as well.
>
> Any help or thoughts would be appreciated.
>
> Thanks,
> ~Glen