CD driver causes errors by reading too far ahead
D. Hugh Redelmeier
hugh at mimosa.com
Fri Mar 11 22:35:32 UTC 2005
summary: reading from a CD block device generates spurious errors near
the end. These errors prevent reasonable tasks from being done.
I think that this behaviour is new with LINUX 2.6. I'm using Fedora
Core 3 with kernel 2.6.10-1.770_FC3 on x86_64.
I burned a CD from a .iso (you may ignore the details):
I used Fedora Core 3's Nautilus desktop to burn (right-click on the .iso,
choose "Write to Disc...").
-rw-rw-r-- 1 hugh hugh 582391808 Feb 4 19:37 w2k3sp1_1433_usa_x64fre_pro.iso
I attempted to check the result. I moved the burnt CD to another drive, then
used the command
cmp w2k3sp1_1433_usa_x64fre_pro.iso /dev/hdd
to test the result.
I got what looks like failure:
cmp: /dev/hdd: Input/output error
[I later tried the same test on Fedora Core 1 with a 2.4 kernel. The
drive tested as correct (using device /dev/scd0).]
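To find out which block the comparison actually dies on, the check can be done one 2048-byte block at a time, stopping at the end of the image. This is a minimal sketch: the function name first_bad_block is made up for illustration, and reading in small pieces does nothing to stop the kernel from reading ahead on its own.

```python
BLOCK = 2048  # CD data block size in bytes


def first_bad_block(image_path, device_path, block=BLOCK):
    """Compare an image against the start of a device, block by block.

    Returns None if every block of the image matches. Otherwise returns
    (block_number, reason), where reason is "io error", "short read"
    or "mismatch".
    """
    with open(image_path, "rb") as img, \
            open(device_path, "rb", buffering=0) as dev:
        n = 0
        while True:
            want = img.read(block)
            if not want:
                return None  # end of the image reached: everything matched
            try:
                got = dev.read(len(want))
            except OSError:
                return (n, "io error")
            if len(got) < len(want):
                return (n, "short read")
            if got != want:
                return (n, "mismatch")
            n += 1
```

Called as first_bad_block("w2k3sp1_1433_usa_x64fre_pro.iso", "/dev/hdd"), it never asks for data past the end of the .iso.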
Let's investigate the error:
dmesg of error:
hdd: media error (bad sector): status=0x51 { DriveReady SeekComplete Error }
hdd: media error (bad sector): error=0x30
ide: failed opcode was 100
ATAPI device hdd:
Error: Medium error -- (Sense key=0x03)
Unrecovered read error -- (asc=0x11, ascq=0x00)
The failed "Read 10" packet command was:
"28 00 00 04 56 b4 00 00 21 00 00 00 00 00 00 00 "
end_request: I/O error, dev hdd, sector 1137360
Buffer I/O error on device hdd, logical block 284340
Buffer I/O error on device hdd, logical block 284341
Buffer I/O error on device hdd, logical block 284342
Buffer I/O error on device hdd, logical block 284343
Buffer I/O error on device hdd, logical block 284344
Buffer I/O error on device hdd, logical block 284345
Buffer I/O error on device hdd, logical block 284346
Buffer I/O error on device hdd, logical block 284347
Buffer I/O error on device hdd, logical block 284348
Buffer I/O error on device hdd, logical block 284349
Buffer I/O error on device hdd, logical block 284350
Buffer I/O error on device hdd, logical block 284351
Buffer I/O error on device hdd, logical block 284352
Buffer I/O error on device hdd, logical block 284353
Buffer I/O error on device hdd, logical block 284354
Buffer I/O error on device hdd, logical block 284355
Buffer I/O error on device hdd, logical block 284356
Buffer I/O error on device hdd, logical block 284357
Buffer I/O error on device hdd, logical block 284358
Buffer I/O error on device hdd, logical block 284359
Buffer I/O error on device hdd, logical block 284360
Buffer I/O error on device hdd, logical block 284361
Buffer I/O error on device hdd, logical block 284362
Buffer I/O error on device hdd, logical block 284363
Buffer I/O error on device hdd, logical block 284364
Buffer I/O error on device hdd, logical block 284365
Buffer I/O error on device hdd, logical block 284366
Buffer I/O error on device hdd, logical block 284367
Buffer I/O error on device hdd, logical block 284368
Buffer I/O error on device hdd, logical block 284369
Buffer I/O error on device hdd, logical block 284370
Buffer I/O error on device hdd, logical block 284371
Buffer I/O error on device hdd, logical block 284372
Let's decode the failing command:
28 00 00 04 56 b4 00 00 21 00 00 00 00 00 00 00
28 opcode: READ(10) [as the message said] [a 10-byte command]
00 Logical Unit = 0, DP0=0, FUA=0, RelAdr=0
00 04 56 b4 Logical Block Address = 284340
00 reserved
00 21 Transfer Length = 33
00 Control
00 00 00 00 00 00 padding? (beyond the 10-byte command)
This is a read request, asking for 33 blocks, starting at block number
284340.
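The decoding above can be checked mechanically. This is a small sketch that follows the field layout listed above; the function name decode_read10 and the dictionary keys are made up for illustration.

```python
def decode_read10(hex_bytes):
    """Decode the 10-byte READ(10) command at the start of a hex dump."""
    cdb = bytes.fromhex(hex_bytes)
    if len(cdb) < 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) command")
    return {
        "opcode": cdb[0],
        "flags": cdb[1],                                     # LUN, DPO, FUA, RelAdr
        "lba": int.from_bytes(cdb[2:6], "big"),              # Logical Block Address
        "transfer_length": int.from_bytes(cdb[7:9], "big"),  # blocks requested
        "control": cdb[9],
        "trailing": cdb[10:],                                # bytes after the command
    }


cmd = decode_read10("28 00 00 04 56 b4 00 00 21 00 00 00 00 00 00 00")
print(cmd["lba"], cmd["transfer_length"])  # prints: 284340 33
```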
The .iso is 582391808 bytes or 284371 blocks of 2k.
blocks in .iso - block for start of command == 284371 - 284340 == 31
So 31 good blocks should be found from 284340 on (284340 through 284370).
But the read is for 33 blocks, so it runs through block 284372.
The request is asking for blocks beyond the end of the .iso. No
wonder the request is failing: you cannot read runout blocks!
==> the system should not be reading blocks it was not asked to.
Or, if it wants to read them, it should not return an error
when the error is for blocks that were not requested.
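The arithmetic behind this claim, spelled out as a short sketch. The numbers come from the .iso size and the failing command above; the variable names are only for illustration.

```python
BLOCK = 2048            # bytes per CD data block
iso_bytes = 582391808   # size of the .iso
start_lba = 284340      # first block of the failing READ(10)
requested = 33          # transfer length of the failing READ(10)

iso_blocks, remainder = divmod(iso_bytes, BLOCK)  # 284371 blocks, remainder 0
last_image_block = iso_blocks - 1                 # 284370
last_requested = start_lba + requested - 1        # 284372
good = iso_blocks - start_lba                     # 31 blocks really exist
overrun = requested - good                        # 2 blocks past the end

print(iso_blocks, remainder, last_image_block, last_requested, good, overrun)
```

So the command asks for blocks 284371 and 284372, neither of which is part of the image.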
Notice that the error is reported as being on sector 1137360 and block
284340. I'm pretty sure that it is actually on block 284372.
==> the error message ought to have the correct block number, if
possible
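The two numbers in the message describe the same place. The sketch below assumes that end_request counts 512-byte sectors and that the "logical block" is a 2048-byte CD block, which is consistent with the figures in the log. The helper name sector_to_block is made up.

```python
SECTOR = 512   # assumed unit of the "sector" in the end_request message
BLOCK = 2048   # CD data block size


def sector_to_block(sector):
    """Convert a 512-byte sector number to a 2048-byte block number."""
    return sector * SECTOR // BLOCK


print(sector_to_block(1137360))  # prints: 284340
```

That is the first block of the failing request, not the block where the drive actually gave up.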
I claim this behaviour is wrong and broken. But how this happens is
complicated. Where should the fix be?
I now try with dd(1), hoping to control the readahead that is getting
us into the runout (cmp probably uses stdio which naturally tries to
read large chunks but dd should use unbuffered I/O).
[hugh@redclaw hugh]$ dd if=/dev/hdd bs=2048 skip=284340 count=1 of=0
dd: reading `/dev/hdd': Input/output error
0+0 records in
0+0 records out
dmesg shows:
hdd: media error (bad sector): status=0x51 { DriveReady SeekComplete Error }
hdd: media error (bad sector): error=0x30
ide: failed opcode was 100
ATAPI device hdd:
Error: Medium error -- (Sense key=0x03)
Unrecovered read error -- (asc=0x11, ascq=0x00)
The failed "Read 10" packet command was:
"28 00 00 04 56 b4 00 00 21 00 00 00 00 00 00 00 "
end_request: I/O error, dev hdd, sector 1137360
Buffer I/O error on device hdd, logical block 284340
Notice that even though I specified a count of 1, the failing SCSI
command shows a count of 33! This, itself, seems like a bug (perhaps
I have UNIX expectations of LINUX).
Here's another dd experiment, meant to avoid the dreaded readahead.
Attempt to read several blocks, in one read, starting at 284339. It
turns out that 284339 is divisible by 11, so we can try to read 11
blocks with the following command:
dd if=/dev/hdd bs=22528 skip=25849 count=1 of=0
The command was successful, but only 2048 bytes were read. So the
block device is acting like one: it will limit a read to one physical
block.
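Since dd's skip is counted in units of bs, a multi-block read can only start at block N if the number of blocks per read divides N. Here is a sketch of how the parameters above can be derived; the function name dd_args and the limit of 16 blocks are arbitrary choices for illustration.

```python
def dd_args(start_block, max_blocks, block=2048):
    """Pick dd parameters for one multi-block read starting at start_block.

    Returns (bs, skip, blocks_per_read) using the largest blocks_per_read
    that is at most max_blocks and divides start_block evenly.
    """
    for n in range(max_blocks, 0, -1):
        if start_block % n == 0:
            return block * n, start_block // n, n


bs, skip, n = dd_args(284339, 16)
print(f"dd if=/dev/hdd bs={bs} skip={skip} count=1 of=0")
```

For block 284339 this gives bs=22528 and skip=25849, the values used in the command above.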
Is there any way I can stop this stupid readahead? I say stupid
because it causes an I/O error by reading something that I never asked
it to read. It compounds the mistake by reporting the error as happening
on a legitimate block.
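One avenue that might be worth trying is to tell the kernel that access will be random before reading. This is an untested sketch: posix_fadvise is only a hint, and it is an assumption, not something verified against this drive or this kernel, that the hint keeps the request from growing past the block asked for. The function name read_block_random_hint is made up.

```python
import os


def read_block_random_hint(path, block_number, block=2048):
    """Read one block after hinting that access will be random.

    The hint is advisory; the kernel is free to ignore it.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        if hasattr(os, "posix_fadvise"):
            # offset 0, length 0 means "the whole file"
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_RANDOM)
        return os.pread(fd, block, block_number * block)
    finally:
        os.close(fd)
```

For example, read_block_random_hint("/dev/hdd", 284340) would ask for just the block that the dd experiment above failed on.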