From adilger at sun.com Wed Jul 2 07:28:53 2008
From: adilger at sun.com (Andreas Dilger)
Date: Wed, 02 Jul 2008 01:28:53 -0600
Subject: debugfs question
In-Reply-To:
References:
Message-ID: <20080702072853.GV6239@webber.adilger.int>

On Jun 28, 2008 23:50 +0300, Yamin, Yossi wrote:
> I am trying to read a file directly from the disk using debugfs utility.
>
> I am running "stat" on the file I want, filter out IND and Bind blocks,
> and then copy the data blocks using dd directly from the Disk.
>
> On small files it work'd (13MB).
>
> On big files (3.5 , 440 GB) the size is the same but md5sum get differ.
>
> What am I doing wrong?

What you are doing wrong is that you aren't using the debugfs "dump"
command, which will do what you are trying to do.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

From pegasus at nerv.eu.org Wed Jul 2 11:18:23 2008
From: pegasus at nerv.eu.org (Jure Pečar)
Date: Wed, 2 Jul 2008 13:18:23 +0200
Subject: max fs size with 1k blocks
Message-ID: <20080702131823.57d8ed66.pegasus@nerv.eu.org>

Hello,

what is max ext3 filesystem size when mke2fs is called with -i 1024 -b 1024?

If rhel5 supports 8T ext3, I assume that is with 4k blocks, right?
So it is right to assume this number to be 2T with 1k blocks?

What about rhel4?

Thanks for answers.

--
Jure Pečar
http://jure.pecar.org

From yamin_yossi at diligent.com Thu Jul 3 14:07:36 2008
From: yamin_yossi at diligent.com (Yamin, Yossi)
Date: Thu, 3 Jul 2008 17:07:36 +0300
Subject: debugfs question
In-Reply-To: <20080702072853.GV6239@webber.adilger.int>
References: <20080702072853.GV6239@webber.adilger.int>
Message-ID:

Thanks Andreas,

My problem is that the filesystem metadata for the file I need is
corrupted. Because I use the same preallocation logic across all my
filesystems, I hope I can get the block mapping from the corresponding
file on a clean filesystem and use it to read the data directly from
the corrupted FS.

Does that make sense?
Thanks,
Yossi

-----Original Message-----
From: Andreas.Dilger at sun.com [mailto:Andreas.Dilger at sun.com] On Behalf Of Andreas Dilger
Sent: Wednesday, July 02, 2008 10:29 AM
To: Yamin, Yossi
Cc: ext3-users at redhat.com; tytso at mit.edu
Subject: Re: debugfs question

On Jun 28, 2008 23:50 +0300, Yamin, Yossi wrote:
> I am trying to read a file directly from the disk using debugfs utility.
>
> I am running "stat" on the file I want, filter out IND and Bind blocks,
> and then copy the data blocks using dd directly from the Disk.
>
> On small files it work'd (13MB).
>
> On big files (3.5 , 440 GB) the size is the same but md5sum get differ.
>
> What am I doing wrong?

What you are doing wrong is that you aren't using the debugfs "dump"
command, which will do what you are trying to do.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

From sandeen at redhat.com Thu Jul 3 17:36:51 2008
From: sandeen at redhat.com (Eric Sandeen)
Date: Thu, 03 Jul 2008 12:36:51 -0500
Subject: max fs size with 1k blocks
In-Reply-To: <20080702131823.57d8ed66.pegasus@nerv.eu.org>
References: <20080702131823.57d8ed66.pegasus@nerv.eu.org>
Message-ID: <486D0E33.7080700@redhat.com>

Jure Pečar wrote:
> Hello,
>
> what is max ext3 filesystem size when mke2fs is called with -i 1024 -b 1024?

32-bit block number container, so 2^32 * 1024 = 4T.

>
> If rhel5 supports 8T ext3,

16T actually, in 2.6.19 and beyond (or distros which backported the fixes)

> I assume that is with 4k blocks, right?

Yep.

> So it is right to assume this number to be 2T with 1k blocks?

s/b 4T, see above.

> What about rhel4?

on pre-2.6.18/19 kernels there were really only 31 safe bits, so cut it
in half.
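
To make the arithmetic above concrete, here is a minimal sketch (not part of the original reply). It assumes only what the reply states: the limit is the number of addressable blocks times the block size, with 32 usable bits on newer kernels and 31 on older ones. The function name `max_fs_bytes` and the constant `TIB` are hypothetical, chosen for illustration, and "T" is read as TiB (2^40 bytes).

```python
# Sketch: maximum ext3 filesystem size from block size and the number of
# usable bits in the block number. Names here are illustrative only.
TIB = 2 ** 40  # one tebibyte in bytes


def max_fs_bytes(block_size, usable_bits=32):
    """Return (2 ** usable_bits) blocks of block_size bytes, in bytes."""
    return (2 ** usable_bits) * block_size


if __name__ == "__main__":
    for block_size in (1024, 4096):
        for bits in (32, 31):
            size_tib = max_fs_bytes(block_size, bits) // TIB
            print(f"{block_size}-byte blocks, {bits} usable bits: {size_tib} TiB")
```

With 32 usable bits this gives 4 TiB for 1k blocks and 16 TiB for 4k blocks; with 31 usable bits each figure is halved, which matches the "cut it in half" remark.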
-Eric

From pgquiles at elpauer.org Fri Jul 4 16:16:33 2008
From: pgquiles at elpauer.org (Pau Garcia i Quiles)
Date: Fri, 04 Jul 2008 18:16:33 +0200
Subject: Nanosecond date resolution
Message-ID: <20080704181633.qxygbndqgcowwg0g@www.elpauer.org>

Hello,

By default ext3 stores file with 1 second date resolution but I have
read it is possible to enable big inode and that will improve
resolution to 1 nanosecond.

How can I enable the 1-nanosecond date resolution? I've found some
mails by Alex Tomas, Andi Kleen and Andreas Gruenbacher to the linux
filesystems mailing list but no clear way to enable this feature. Will
it harm performance? (I do know it consumes more space)

Thank you

--
Pau Garcia i Quiles
http://www.elpauer.org
(Due to my workload, I may need 10 days to answer)

From sandeen at redhat.com Fri Jul 4 17:04:24 2008
From: sandeen at redhat.com (Eric Sandeen)
Date: Fri, 04 Jul 2008 12:04:24 -0500
Subject: Nanosecond date resolution
In-Reply-To: <20080704181633.qxygbndqgcowwg0g@www.elpauer.org>
References: <20080704181633.qxygbndqgcowwg0g@www.elpauer.org>
Message-ID: <486E5818.1020007@redhat.com>

Pau Garcia i Quiles wrote:
> Hello,
>
> By default ext3 stores file with 1 second date resolution but I have
> read it is possible to enable big inode and that will improve
> resolution to 1 nanosecond.
>
> How can I enable the 1-nanosecond date resolution? I've found some
> mails by Alex Tomas, Andi Kleen and Andreas Gruenbacher to the linux
> filesystems mailing list but no clear way to enable this feature. Will
> it harm performance? (I do know it consumes more space)
>
> Thank you
>

ext4 makes use of larger inodes for this purpose, but ext3 does not.

-Eric

From iskandarprins at gmail.com Sat Jul 12 23:01:27 2008
From: iskandarprins at gmail.com (Iskandar Prins)
Date: Sun, 13 Jul 2008 01:01:27 +0200
Subject: Ext3 problem
Message-ID:

Hey All,

I have a serious and weird problem with my harddrive and ext3. About a
week ago I re-arranged my entire directory structure.
I moved, deleted, renamed files and dirs. I did some reboots and
shutdowns in the mean time. And all was fine, up until now.

At the moment I'm looking at the filesystem (onto which I made those
changed last week) , but it seems I'm looking at the filesystem from a
month ago ( june 20th).

Now I have a couple of questions:

- How could this have happened?
- Is this a journaling problem? If so, how can I fix this and get the
filesystem back from july 12th instead of looking at the filesystem how
it was on june 20th.

I did a fsck/e2fsck on the umounted devs and they can't get any cleaner.

Regards,
Iskandar

From sandeen at redhat.com Sat Jul 12 23:30:57 2008
From: sandeen at redhat.com (Eric Sandeen)
Date: Sat, 12 Jul 2008 18:30:57 -0500
Subject: Ext3 problem
In-Reply-To:
References:
Message-ID: <48793EB1.3050208@redhat.com>

Iskandar Prins wrote:
> Hey All,
>
> I have a serious and weird problem with my harddrive and ext3. About a
> week ago I re-arranged my entire directory structure. I moved, deleted,
> renamed files and dirs. I did some reboots and shutdowns in the mean
> time. And all was fine, up until now.
>
> At the moment I'm looking at the filesystem (onto which I made those
> changed last week) , but it seems I'm looking at the filesystem from a
> month ago ( june 20th).
>
> Now I have a couple of questions:
>
> - How could this have happened?

No idea.

> - Is this a journaling problem?

It is, but...

> If so, how can I fix this and get the
> filesystem back from july 12th instead of looking at the filesystem how
> it was on june 20th.

... that's not what journaling does I'm afraid.

> I did a fsck/e2fsck on the umounted devs and they can't get any cleaner.

I don't suppose you changed hard drives and you're looking at the old one?
:)

-Eric

From lserinol at gmail.com Sun Jul 13 09:14:39 2008
From: lserinol at gmail.com (Levent Serinol)
Date: Sun, 13 Jul 2008 12:14:39 +0300
Subject: indexing symbolic links
In-Reply-To: <1cbd6f830806221337y5dbc5173qbf4e7222b3fa9f67@mail.gmail.com>
References: <1cbd6f830806211903i4cc02814gc5517934e3952694@mail.gmail.com> <1cbd6f830806220612k1e3126e5t2c91a1164321c9e5@mail.gmail.com> <61CF57D9DB48898E17DC2EF7@Ximines.local> <1cbd6f830806221337y5dbc5173qbf4e7222b3fa9f67@mail.gmail.com>
Message-ID: <2c1942a70807130214p1aceec6bi55fcaa37a8f5f2e6@mail.gmail.com>

you can use inotify and register it to notify you when a symbolic link
created or unlinked. By this way you can put or remove names from your
database automatically, with a small C,perl,etc. program :)

2008/6/22 Mag Gam :
> Unfortunately, tracking space wasn't me goal. I want to keep track of my
> symbolic links :-)
>
>
>
> On Sun, Jun 22, 2008 at 3:04 PM, Alex Bligh wrote:
>>
>>
>> --On 22 June 2008 09:12:26 -0400 Mag Gam wrote:
>>
>>> At my university, we have physical storage in a filesystem, and we assign
>>> professors and students space by doing a symbolic link. Basically I want
>>> to keep track of physical storage with virtual/logical storage. Thats why
>>> I ask :-)
>>
>> If you want to track space usage, I suggest you track it using quota
>> or similar. "man quota" will give you a start.
>>
>> Alex
>
>
> _______________________________________________
> Ext3-users mailing list
> Ext3-users at redhat.com
> https://www.redhat.com/mailman/listinfo/ext3-users
>

From magawake at gmail.com Sun Jul 13 16:04:17 2008
From: magawake at gmail.com (Mag Gam)
Date: Sun, 13 Jul 2008 12:04:17 -0400
Subject: indexing symbolic links
In-Reply-To: <2c1942a70807130214p1aceec6bi55fcaa37a8f5f2e6@mail.gmail.com>
References: <1cbd6f830806211903i4cc02814gc5517934e3952694@mail.gmail.com> <1cbd6f830806220612k1e3126e5t2c91a1164321c9e5@mail.gmail.com> <61CF57D9DB48898E17DC2EF7@Ximines.local> <1cbd6f830806221337y5dbc5173qbf4e7222b3fa9f67@mail.gmail.com> <2c1942a70807130214p1aceec6bi55fcaa37a8f5f2e6@mail.gmail.com>
Message-ID: <1cbd6f830807130904r3ff5791k7cae51f36049da7c@mail.gmail.com>

OH! Very nice solution. I guess I can listen to that. Thats a cool way
of doing it.

On Sun, Jul 13, 2008 at 5:14 AM, Levent Serinol wrote:
> you can use inotify and register it to notify you when a symbolic link
> created or unlinked. By this way you can put or remove names from your
> database automatically, with a small C,perl,etc. program :)
>
> 2008/6/22 Mag Gam :
>> Unfortunately, tracking space wasn't me goal. I want to keep track of my
>> symbolic links :-)
>>
>>
>>
>> On Sun, Jun 22, 2008 at 3:04 PM, Alex Bligh wrote:
>>>
>>>
>>> --On 22 June 2008 09:12:26 -0400 Mag Gam wrote:
>>>
>>>> At my university, we have physical storage in a filesystem, and we assign
>>>> professors and students space by doing a symbolic link. Basically I want
>>>> to keep track of physical storage with virtual/logical storage. Thats why
>>>> I ask :-)
>>>
>>> If you want to track space usage, I suggest you track it using quota
>>> or similar. "man quota" will give you a start.
>>>
>>> Alex
>>
>

From mnalis-ml at voyager.hr Sun Jul 13 17:56:17 2008
From: mnalis-ml at voyager.hr (Matija Nalis)
Date: Sun, 13 Jul 2008 19:56:17 +0200
Subject: Ext3 problem
In-Reply-To:
References:
Message-ID: <20080713175617.GA5328@eagle102.home.lan>

On Sun, Jul 13, 2008 at 01:01:27AM +0200, Iskandar Prins wrote:
> At the moment I'm looking at the filesystem (onto which I made those changed
> last week) , but it seems I'm looking at the filesystem from a month ago (
> june 20th).

is underlying block device a single hard disk, or maybe a RAID1 or similar ?

I had a similar problem once, which turned out to be RAID1 which was out of
sync (but wrongly thinking it is OK!), so sometimes it read "right" data from
one disk, and some times "bad" data from other disk. Solution was to break
the RAID, force fsck it, and recreate the raid afterwards.

> I did a fsck/e2fsck on the umounted devs and they can't get any cleaner.

you did specify -f to force it, I assume ?

--
Opinions above are GNU-copylefted.

From iskandarprins at gmail.com Mon Jul 14 19:03:52 2008
From: iskandarprins at gmail.com (Iskandar Prins)
Date: Mon, 14 Jul 2008 21:03:52 +0200
Subject: Ext3 problem
In-Reply-To: <20080713175617.GA5328@eagle102.home.lan>
References: <20080713175617.GA5328@eagle102.home.lan>
Message-ID:

Yup, there is a Raid1 underlying. I'm using a XFX REVO 64 SPU 3 Port SATA
Raid Card for my raid1 setup. Don't I lose my entire data when I break
the raid?

Yup, even with the fsck -f option, the drive is clean.

The windows was connecting via samba (hence windows).
On Sun, Jul 13, 2008 at 7:56 PM, Matija Nalis wrote:

> On Sun, Jul 13, 2008 at 01:01:27AM +0200, Iskandar Prins wrote:
> > At the moment I'm looking at the filesystem (onto which I made those changed
> > last week) , but it seems I'm looking at the filesystem from a month ago (
> > june 20th).
>
> is underlying block device a single hard disk, or maybe a RAID1 or similar ?
>
> I had a similar problem once, which turned out to be RAID1 which was out of
> sync (but wrongly thinking it is OK!), so sometimes it read "right" data from
> one disk, and some times "bad" data from other disk. Solution was to break
> the RAID, force fsck it, and recreate the raid afterwards.
>
> > I did a fsck/e2fsck on the umounted devs and they can't get any cleaner.
>
> you did specify -f to force it, I assume ?
>
> --
> Opinions above are GNU-copylefted.

From iskandarprins at gmail.com Mon Jul 14 19:17:00 2008
From: iskandarprins at gmail.com (Iskandar Prins)
Date: Mon, 14 Jul 2008 21:17:00 +0200
Subject: Ext3 problem
In-Reply-To:
References: <20080713175617.GA5328@eagle102.home.lan>
Message-ID:

Problem solved. It seemed one of my sata cables was loose, I reconnected it
and I got all my data back :-) I'm happy now.

The tip about the raid made me relook into the raid array again

Thanks for the help

On Mon, Jul 14, 2008 at 9:03 PM, Iskandar Prins wrote:

> Yup, there is a Raid1 underlying. I'm using a XFX REVO 64 SPU 3 Port SATA
> Raid Card for my raid1 setup. Don't I lose my entire data when I break
> the raid?
>
> Yup, even with the fsck -f option, the drive is clean.
>
> The windows was connecting via samba (hence windows).
>
> On Sun, Jul 13, 2008 at 7:56 PM, Matija Nalis wrote:
>
>> On Sun, Jul 13, 2008 at 01:01:27AM +0200, Iskandar Prins wrote:
>> > At the moment I'm looking at the filesystem (onto which I made those changed
>> > last week) , but it seems I'm looking at the filesystem from a month ago (
>> > june 20th).
>>
>> is underlying block device a single hard disk, or maybe a RAID1 or similar ?
>>
>> I had a similar problem once, which turned out to be RAID1 which was out of
>> sync (but wrongly thinking it is OK!), so sometimes it read "right" data from
>> one disk, and some times "bad" data from other disk. Solution was to break
>> the RAID, force fsck it, and recreate the raid afterwards.
>>
>> > I did a fsck/e2fsck on the umounted devs and they can't get any cleaner.
>>
>> you did specify -f to force it, I assume ?
>>
>> --
>> Opinions above are GNU-copylefted.

From bruno at wolff.to Tue Jul 15 13:37:15 2008
From: bruno at wolff.to (Bruno Wolff III)
Date: Tue, 15 Jul 2008 08:37:15 -0500
Subject: Ext3 problem
In-Reply-To:
References: <20080713175617.GA5328@eagle102.home.lan>
Message-ID: <20080715133715.GB28719@wolff.to>

On Mon, Jul 14, 2008 at 21:17:00 +0200,
  Iskandar Prins wrote:
> Problem solved. It seemed one of my sata cables was loose, I reconnected it
> and I got all my data back :-) I'm happy now.
>
> The tip about the raid made me relook into the raid array again

You probably want to check that the array's elements are actually in sync.
If they aren't you should probably assume the one with the cable problem
is the one that is wrong, fail it and then bring it back into the array
so that it gets rebuilt from the other drive. (Assuming we aren't talking
RAID 0.)
From iskandarprins at gmail.com Tue Jul 15 17:32:53 2008
From: iskandarprins at gmail.com (Iskandar Prins)
Date: Tue, 15 Jul 2008 19:32:53 +0200
Subject: Ext3 problem
In-Reply-To: <20080715133715.GB28719@wolff.to>
References: <20080713175617.GA5328@eagle102.home.lan> <20080715133715.GB28719@wolff.to>
Message-ID: <487CDF45.3090905@gmail.com>

Bruno Wolff III wrote:
> On Mon, Jul 14, 2008 at 21:17:00 +0200,
> Iskandar Prins wrote:
>
>> Problem solved. It seemed one of my sata cables was loose, I reconnected it
>> and I got all my data back :-) I'm happy now.
>>
>> The tip about the raid made me relook into the raid array again
>>
>
> You probably want to check that the array's elements are actually in sync.
> If they aren't you should probably assume the one with the cable problem
> is the one that is wrong, fail it and then bring it back into the array
> so that it gets rebuilt from the other drive. (Assuming we aren't talking
> RAID 0.)
>
>

Thanks, already did that. Rebuilt the array. Now everything is working
peachy. I'm fairly new to raid arrays.

From Fabio.Rossoni at urmet.it Wed Jul 16 12:12:31 2008
From: Fabio.Rossoni at urmet.it (Rossoni Fabio)
Date: Wed, 16 Jul 2008 14:12:31 +0200
Subject: aborted journal and kernel bug on RHEL AP 5.1 on SUN AMD 64bit (X4200M2)
Message-ID:

Hi,

i'm reached a strange situation over my servers SUN X4200M2 running with
Linux Advanced Platform 5.1

Linux fea.localdomain 2.6.18-53.el5 #1 SMP Wed Oct 10 16:34:19 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux..
This happen on both internal and external disks (Hitachi AMS 200 storage ,
emulex HBA , and HDLM sw Hitachi for multipath)

After problem happening I'm not able to use the server due to root
corruption files :

-rwxr-xr-x 1 root root 14096 Sep 5 2007 rmmod
-rwxr-xr-x 1 root root 521552 Aug 7 2006 rmt
-rwxr-xr-x 1 root root 14648 Jul 13 2006 rngd
-rwxr-xr-x 1 root root 57920 Aug 7 2006 route
-rwxr-xr-x 1 root root 5904 Sep 25 2007 rpc.lockd
-rwxr-xr-x 1 root root 49352 Sep 25 2007 rpc.statd
?--------- ? ? ? ? ? rrestore
?--------- ? ? ? ? ? rrestore.static
-rwxr-xr-x 1 root root 29976 Jan 9 2007 rtmon
-rwxr-xr-x 1 root root 7736 Oct 13 2006 runlevel
-rwxr-xr-x 1 root root 30840 Nov 27 2006 runuser
-rwxr-xr-x 1 root root 10376 Aug 17 2007 salsa
[root at fea sbin]#

And also file system are mounted in read-only mode

The following is a parts of messages file:

Jul 11 16:11:15 fea clurgmgrd[4739]: Service service:appl-dfdd is disabled
Jul 11 16:29:56 fea clurgmgrd[4739]: Stopping service service:db-dfdd
Jul 11 16:30:00 fea avahi-daemon[4622]: Withdrawing address record for 10.40.3.40 on eth1.
Jul 11 16:30:11 fea dlm_controld[4281]: uevent message has 3 args
Jul 11 16:30:11 fea clurgmgrd[4739]: Service service:db-dfdd is disabled
Jul 11 16:31:44 fea clurgmgrd[4739]: Starting disabled service service:db-dfdd
Jul 11 16:31:44 fea kernel: kjournald starting. Commit interval 5 seconds
Jul 11 16:31:44 fea kernel: EXT3-fs warning: maximal mount count reached, running e2fsck is recommended
Jul 11 16:31:44 fea kernel: EXT3 FS on sddlmab, internal journal
Jul 11 16:31:44 fea kernel: EXT3-fs: mounted filesystem with ordered data mode.
Jul 11 16:31:44 fea dlm_controld[4281]: uevent message has 3 args
Jul 11 16:31:44 fea avahi-daemon[4622]: Registering new address record for 10.40.3.40 on eth1.
Jul 11 16:31:48 fea clurgmgrd[4739]: Service service:db-dfdd started
Jul 11 16:40:23 fea clurgmgrd[4739]: Stopping service service:db-dfdd
Jul 11 16:40:25 fea avahi-daemon[4622]: Withdrawing address record for 10.40.3.40 on eth1.
Jul 11 16:40:35 fea dlm_controld[4281]: uevent message has 3 args
Jul 11 16:40:35 fea clurgmgrd[4739]: Service service:db-dfdd is disabled
Jul 11 17:13:01 fea kernel: EXT3-fs error (device dm-0): ext3_free_blocks_sb: bit already cleared for block 382976
Jul 11 17:13:01 fea kernel: Aborting journal on device dm-0.
Jul 11 17:13:01 fea kernel: EXT3-fs error (device dm-0): ext3_free_blocks_sb: bit already cleared for block 382977
Jul 11 17:13:01 fea kernel: EXT3-fs error (device dm-0): ext3_free_blocks_sb: bit already cleared for block 382978
Jul 11 17:13:01 fea kernel: EXT3-fs error (device dm-0): ext3_free_blocks_sb: bit already cleared for block 382979
Jul 11 17:13:01 fea kernel: EXT3-fs error (device dm-0): ext3_free_blocks_sb: bit already cleared for block 382980
Jul 11 17:13:02 fea kernel: EXT3-fs error (device dm-0) in ext3_reserve_inode_write: Journal has aborted
Jul 11 17:13:02 fea kernel: EXT3-fs error (device dm-0) in ext3_reserve_inode_write: Journal has aborted
Jul 11 17:13:02 fea kernel: EXT3-fs error (device dm-0) in ext3_orphan_del: Journal has aborted
Jul 11 17:13:02 fea kernel: EXT3-fs error (device dm-0) in ext3_truncate: Journal has aborted
Jul 11 17:13:02 fea kernel: ext3_abort called.
Jul 11 17:13:02 fea kernel: EXT3-fs error (device dm-0): ext3_journal_start_sb: Detected aborted journal
Jul 11 17:13:02 fea kernel: Remounting filesystem read-only
Jul 11 17:27:30 fea clurgmgrd[4739]: State change: feb.iride DOWN
Jul 11 17:27:30 fea clurgmgrd[4739]: State change: /dev/sddlmac UP
Jul 11 17:27:30 fea clurgmgrd[4739]: Waiting for node #2 to be fenced
Jul 11 17:28:50 fea qdiskd[4191]: Node 2 shutdown

And also a kernel bug as:

Jul 9 16:57:13 fea syslogd 1.4.1: restart.
/trace
Jul 10 17:41:09 fea kernel: EXT3-fs warning (device sddlmaa): ext3_unlink: Deleting nonexistent file (13353077), 0
Jul 10 18:20:04 fea dlm_controld[4260]: uevent message has 3 args
Jul 10 18:20:04 fea kernel: sb orphan head is 13353077
Jul 10 18:20:04 fea kernel: sb_info orphan list:
Jul 10 18:20:04 fea kernel: inode dm-0:1010899 at ffff8100df1f3448: mode 100555, nlink 1, next 0
Jul 10 18:20:13 fea last message repeated 59479 times
Jul 10 18:20:13 fea kernel: BUG: soft lockup detected on CPU#1!
Jul 10 18:20:13 fea kernel:
Jul 10 18:20:13 fea kernel: Call Trace:
Jul 10 18:20:13 fea kernel: [] softlockup_tick+0xd5/0xe7
Jul 10 18:20:13 fea kernel: [] update_process_times+0x42/0x68
Jul 10 18:20:13 fea kernel: [] smp_local_timer_interrupt+0x23/0x47
Jul 10 18:20:13 fea kernel: [] smp_apic_timer_interrupt+0x41/0x47
Jul 10 18:20:13 fea kernel: [] apic_timer_interrupt+0x66/0x6c
Jul 10 18:20:13 fea kernel: [] vprintk+0x29e/0x2ea
Jul 10 18:20:13 fea kernel: [] printk+0x52/0xbd
Jul 10 18:20:13 fea kernel: [] out_of_line_wait_on_bit+0x6c/0x78
Jul 10 18:20:13 fea kernel: [] :ext3:ext3_put_super+0x13e/0x1e0
Jul 10 18:20:13 fea kernel: [] generic_shutdown_super+0x79/0xfb
Jul 10 18:20:13 fea kernel: [] kill_block_super+0x26/0x3a
Jul 10 18:20:13 fea kernel: [] deactivate_super+0x6a/0x82
Jul 10 18:20:13 fea kernel: [] sys_umount+0x245/0x27b
Jul 10 18:20:13 fea kernel: [] audit_syscall_entry+0x14d/0x180
Jul 10 18:20:13 fea kernel: [] tracesys+0xd5/0xe0
Jul 10 18:20:13 fea kernel:
Jul 10 18:20:13 fea kernel: inode dm-0:1010899 at ffff8100df1f3448: mode 100555, nlink 1, next 0
Jul 10 18:20:13 fea last message repeated 50 times
Jul 10 18:20:13 fea kernel: inode dm-0:1010899 at ffff8100df1f3448: mode 100555, nlink , nlink 1, next 0
Jul 10 18:20:13 fea kernel: inode dm-0:1010899 at ffff8100df1f3448: mode 100555, nlink 1, next 0
Jul 10 18:20:13 fea last message repeated 54 times
Jul 10 18:20:13 fea kernel: in, nlink 1, next 0
Jul 10 18:20:13 fea kernel: inode dm-0:1010899 at ffff8100df1f3448: mode 100555, nlink 1, next 0
Jul 10 18:20:13 fea last message repeated 54 times
Jul 10 18:20:13 fea kernel: in, nlink 1, next 0
Jul 10 18:20:13 fea kernel: inode dm-0:1010899 at ffff8100df1f3448: mode 100555, nlink 1, next 0
Jul 10 18:20:13 fea last message repeated 54 times
Jul 10 18:20:13 fea kernel: in, nlink 1, next 0
Jul 10 18:20:13 fea kernel: inode dm-0:1010899 at ffff8100df1f3448: mode 100555, nlink 1, next 0
Jul 10 18:20:13 fea last message repeated 54 times
Jul 10 18:20:13 fea kernel: in, nlink 1, next 0
Jul 10 18:20:13 fea kernel: inode dm-0:1010899 at ffff8100df1f3448: mode 100555, nlink 1, next 0
Jul 10 18:20:13 fea last message repeated 54 times

I'm planning to reinstall the server ...

Some body can help me ?

Thanks a lot
Fabio

--------------------------------------------
INFORMATIVA SULLA PRIVACY
Ai sensi del D.Lgs. 196/2003 si precisa che le informazioni contenute in
questo messaggio e nei suoi eventuali allegati sono riservate e per uso
esclusivo del destinatario. Nessuno, all'infuori dello stesso, può copiare
o distribuire il messaggio, o parte di esso, a terzi. Chiunque riceva
questo messaggio per errore è pregato di distruggerlo e di informare il
mittente.

PRIVACY NOTICE
According to the D.Lgs. 196/2003 this document and its attachments are
confidential and intended for the named addressee(s) only. If you are not
the intended recipient of this message, any use or dissemination of this
message is prohibited. If you have received this document by mistake,
please notify the sender and destroy all physical and/or electronic copies.

From lists at nerdbynature.de Thu Jul 17 11:56:41 2008
From: lists at nerdbynature.de (Christian Kujau)
Date: Thu, 17 Jul 2008 13:56:41 +0200 (CEST)
Subject: aborted journal and kernel bug on RHEL AP 5.1 on SUN AMD 64bit (X4200M2)
In-Reply-To:
References:
Message-ID: <66ff3cad6afc73c34f62458aedd1068f.squirrel@housecafe.dyndns.org>

On Wed, July 16, 2008 14:12, Rossoni Fabio wrote:
> i'm reached a strange situation over my servers SUN X4200M2 running with
> Linux Advanced Platform 5.1 Linux fea.localdomain 2.6.18-53.el5 #1 SMP
--------------------------------------------------------^
so, a rather old kernel, patched to hell probably :-)

> Wed Oct 10 16:34:19 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux.. This
> happen on both internal and external disks (Hitachi AMS 200 storage ,
> emulex HBA , and HDLM sw Hitachi for multipath)

I don't have any multipath experience with ext3, so I hope that's not an
issue here.

> Jul 11 16:31:44 fea kernel: EXT3-fs warning: maximal mount count
> reached, running e2fsck is recommended

Well, did you run e2fsck on the filesystem?

> Jul 10 18:20:13 fea kernel: BUG: soft lockup detected on CPU#1!

Is this reproducible or did this only occur once?

Christian.
--
make bzImage, not war

From vikimun at gmail.com Sun Jul 20 09:02:21 2008
From: vikimun at gmail.com (Victoria Muntean)
Date: Sun, 20 Jul 2008 12:02:21 +0300
Subject: sharp increase in 'used' blocks after plain fs copy
Message-ID:

I copied all files from 8GB partition to 12GB empty partition using
rsync. What caught my eyes, df -k was 1.2 used GB on the source
partition, but 2.2 GB on the dest partition after copy. And dest was
empty before copy.

Why ?

I took file list (filename and sizes) on both partitions, sorted and
diffed, no differences. It's ~130,000 files , basically small debian 4
install. Block size is same on both partitions, 4kb. The 12GB partition
was created with all defaults.

Thanks
Viki

From yamin_yossi at diligent.com Sun Jul 20 13:12:35 2008
From: yamin_yossi at diligent.com (Yamin, Yossi)
Date: Sun, 20 Jul 2008 16:12:35 +0300
Subject: aborted journal and kernel bug on RHEL AP 5.1 on SUN AMD 64bit (X4200M2)
In-Reply-To: <66ff3cad6afc73c34f62458aedd1068f.squirrel@housecafe.dyndns.org>
References: <66ff3cad6afc73c34f62458aedd1068f.squirrel@housecafe.dyndns.org>
Message-ID:

-----Original Message-----
From: ext3-users-bounces at redhat.com [mailto:ext3-users-bounces at redhat.com] On Behalf Of Christian Kujau
Sent: Thursday, July 17, 2008 2:57 PM
To: Rossoni Fabio
Cc: ext3-users at redhat.com
Subject: Re: aborted journal and kernel bug on RHEL AP 5.1 on SUN AMD 64bit (X4200M2)

On Wed, July 16, 2008 14:12, Rossoni Fabio wrote:
> i'm reached a strange situation over my servers SUN X4200M2 running with
> Linux Advanced Platform 5.1 Linux fea.localdomain 2.6.18-53.el5 #1 SMP
--------------------------------------------------------^
so, a rather old kernel, patched to hell probably :-)

> Wed Oct 10 16:34:19 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux.. This
> happen on both internal and external disks (Hitachi AMS 200 storage ,
> emulex HBA , and HDLM sw Hitachi for multipath)

I don't have any multipath experience with ext3, so I hope that's not an
issue here.

> Jul 11 16:31:44 fea kernel: EXT3-fs warning: maximal mount count
> reached, running e2fsck is recommended

Well, did you run e2fsck on the filesystem?

> Jul 10 18:20:13 fea kernel: BUG: soft lockup detected on CPU#1!

[Yamin, Yossi] Upgrade to Linux Advanced Platform 5.2; a fix for this
problem is mentioned in the release notes.

Is this reproducible or did this only occur once?

Christian.
--
make bzImage, not war

From bryan at kadzban.is-a-geek.net Sun Jul 20 16:11:34 2008
From: bryan at kadzban.is-a-geek.net (Bryan Kadzban)
Date: Sun, 20 Jul 2008 12:11:34 -0400
Subject: sharp increase in 'used' blocks after plain fs copy
In-Reply-To:
References:
Message-ID: <488363B6.2070002@kadzban.is-a-geek.net>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

Victoria Muntean wrote:
> I copied all files from 8GB partition to 12GB empty partition using
> rsync. What caught my eyes, df -k was 1.2 used GB on the source
> partition, but 2.2 GB on the dest partition after copy. And dest was
> empty before copy.

Hardlinks (especially to large files) that got copied as separate files,
possibly?

> basically small debian 4 install.

Maybe hardlinks then. There are lots of hardlinks in e.g. a full gcc
installation (that is, gcc plus binutils), and a fair number of other
packages can create a hardlink or two. The default setup for the
terminfo database from ncurses also uses lots of hardlinks. (There's an
option in ncurses' configure script to make it use symlinks, but I don't
know how Debian configures it.)

See if rsync has an option to copy links as links instead of making
another copy of the file, and see if that helps any.
(Do this for symlinks as well, if it has that as an option. Symlinks to
directories are particularly problematic.)

Also check whether du and df on the old partition agree on the amount of
space used, and whether du on the old partition matches df on the new
fairly closely. If the latter but not the former, then it's either
hardlinks (which would make the target of the copy larger) or unlinked
files that are still on the disk because some process has them open
(which would make the target of the copy smaller).

(Or just blit over the partition's image, after ensuring it's not
mounted... resize2fs can help with this.)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFIg2O1S5vET1Wea5wRAyJ9AKDYkcID3PoPCUEMkCnvZNIlq+jWFgCgxsSB
t4UKd9oe9M6nSAvWKELAxoo=
=uBjE
-----END PGP SIGNATURE-----

From criley at erad.com Mon Jul 21 14:57:19 2008
From: criley at erad.com (Charles Riley)
Date: Mon, 21 Jul 2008 10:57:19 -0400
Subject: What used to be a directory is now a 0 length file - recovery possible?
Message-ID: <4884A3CF.8060903@erad.com>

Hello,

I have a server that mounts ext3 partitions from an EMC SAN. A few days
ago, EMC upgraded firmware on the SAN, and it appears they did so
without first coordinating a filesystem unmount. The result is that
where I once had a directory which contained subdirectories underneath
it, I now have a 0 length file.

I am by no means a filesystem expert, but I thought it might be possible
to recover the data by setting the inode type back to directory? Hence
my plea for help from you, the experts!

Not all of the data on the filesystem is missing, just this one
directory and everything under it.

Any advice will be greatly appreciated.
Charles From dorfman.eli at gmail.com Tue Jul 22 13:02:01 2008 From: dorfman.eli at gmail.com (Eli Dorfman) Date: Tue, 22 Jul 2008 16:02:01 +0300 Subject: ext3 filesystem becomes read-only Message-ID: <694d48600807220602i7f432cfdv4923644d9ee785d9@mail.gmail.com> Hello all, I have the following setup: iSCSI initiator on RHEL5. ISCSI target on RHEL5.1 I created an ext3 FS on the device # mkfs.ext3 /dev/sdc mounted the device # mount /dev/sdc /mnt/sdc Trying to create a file on the mounted FS # dd if=/dev/zero of=/mnt/sdc/tempfile bs=1k count=10M I got the following error on the initiator: dd: writing `/mnt/sdc/tempfile': Read-only file system 1489+0 records in 1488+0 records out This problem happens only with ext3 but not with ext2 and when created file is larger than 4GB. Please advise. Following is the output of /var/log/messages: Jul 22 15:16:53 nsg1 kernel: EXT3 FS on sdd, internal journal Jul 22 15:16:53 nsg1 kernel: EXT3-fs: mounted filesystem with ordered data mode. Jul 22 15:22:35 nsg1 kernel: EXT3-fs error (device sdd): ext3_new_block: Allocating block in system zone - blocks from 2719744, length 1 Jul 22 15:22:35 nsg1 kernel: Aborting journal on device sdd. Jul 22 15:22:35 nsg1 kernel: ext3_abort called. 
Jul 22 15:22:35 nsg1 kernel: EXT3-fs error (device sdd): ext3_journal_start_sb: Detected aborted journal Jul 22 15:22:35 nsg1 kernel: Remounting filesystem read-only Jul 22 15:22:35 nsg1 kernel: EXT3-fs error (device sdd): ext3_free_blocks: Freeing blocks in system zones - Block = 2719744, count = 1 Jul 22 15:22:35 nsg1 kernel: EXT3-fs error (device sdd) in ext3_free_blocks_sb: Journal has aborted Jul 22 15:22:37 nsg1 kernel: __journal_remove_journal_head: freeing b_committed_data Jul 22 15:22:37 nsg1 last message repeated 9 times Jul 22 15:22:37 nsg1 kernel: __journal_remove_journal_head: freeing b_frozen_data Jul 22 15:22:37 nsg1 kernel: __journal_remove_journal_head: freeing b_committed_data Jul 22 15:22:37 nsg1 kernel: __journal_remove_journal_head: freeing b_frozen_data Thanks, Eli From sandeen at redhat.com Tue Jul 22 15:07:51 2008 From: sandeen at redhat.com (Eric Sandeen) Date: Tue, 22 Jul 2008 10:07:51 -0500 Subject: ext3 filesystem becomes read-only In-Reply-To: <694d48600807220602i7f432cfdv4923644d9ee785d9@mail.gmail.com> References: <694d48600807220602i7f432cfdv4923644d9ee785d9@mail.gmail.com> Message-ID: <4885F7C7.6020406@redhat.com> Eli Dorfman wrote: > Hello all, > > I have the following setup: > iSCSI initiator on RHEL5. > ISCSI target on RHEL5.1 > > I created an ext3 FS on the device > # mkfs.ext3 /dev/sdc > > mounted the device > # mount /dev/sdc /mnt/sdc > > Trying to create a file on the mounted FS > # dd if=/dev/zero of=/mnt/sdc/tempfile bs=1k count=10M > > I got the following error on the initiator: > > dd: writing `/mnt/sdc/tempfile': Read-only file system > 1489+0 records in > 1488+0 records out > This problem happens only with ext3 but not with ext2 and when created > file is larger than 4GB. > > Please advise. Is it unique to iscsi? i.e. what if you do the same thing directly on the target? And, being iscsi, is anything else accessing the same block device on the target? 
-Eric From dorfman.eli at gmail.com Tue Jul 22 16:00:44 2008 From: dorfman.eli at gmail.com (Eli Dorfman) Date: Tue, 22 Jul 2008 19:00:44 +0300 Subject: ext3 filesystem becomes read-only In-Reply-To: <4885F7C7.6020406@redhat.com> References: <694d48600807220602i7f432cfdv4923644d9ee785d9@mail.gmail.com> <4885F7C7.6020406@redhat.com> Message-ID: <694d48600807220900lc6901abq85cde0c15d1b2a93@mail.gmail.com> >> Hello all, >> >> I have the following setup: >> iSCSI initiator on RHEL5. >> ISCSI target on RHEL5.1 >> >> I created an ext3 FS on the device >> # mkfs.ext3 /dev/sdc >> >> mounted the device >> # mount /dev/sdc /mnt/sdc >> >> Trying to create a file on the mounted FS >> # dd if=/dev/zero of=/mnt/sdc/tempfile bs=1k count=10M >> >> I got the following error on the initiator: >> >> dd: writing `/mnt/sdc/tempfile': Read-only file system >> 1489+0 records in >> 1488+0 records out > >> This problem happens only with ext3 but not with ext2 and when created >> file is larger than 4GB. >> >> Please advise. > > > Is it unique to iscsi? i.e. what if you do the same thing directly on > the target? it works directly. the problem happens only with iscsi (iser transport) with ext3 filesystem. iscsi with ext2 works. > > And, being iscsi, is anything else accessing the same block device on > the target? No. From vikimun at gmail.com Wed Jul 23 12:28:25 2008 From: vikimun at gmail.com (Victoria Muntean) Date: Wed, 23 Jul 2008 15:28:25 +0300 Subject: sharp increase in 'used' blocks after plain fs copy In-Reply-To: <488363B6.2070002@kadzban.is-a-geek.net> References: <488363B6.2070002@kadzban.is-a-geek.net> Message-ID: On 7/20/08, Bryan Kadzban wrote: > > -----BEGIN PGP SIGNED MESSAGE----- > Hash: RIPEMD160 > > Victoria Muntean wrote: > > I copied all files from 8GB partition to 12GB empty partition using > > rsync. What caught my eyes, df -k was 1.2 used GB on the source > > partition, but 2.2 GB on the dest partition after copy. And dest was > > empty before copy. 
> > Hardlinks (especially to large files) that got copied as separate files, > possibly? Yes it's hardlinks you are right. I forgot -H to rsync -------------- next part -------------- An HTML attachment was scrubbed... URL: From criley at erad.com Wed Jul 23 14:55:30 2008 From: criley at erad.com (Charles Riley) Date: Wed, 23 Jul 2008 10:55:30 -0400 Subject: fsck.ext3 questions Message-ID: <48874662.909@erad.com> Hi, I posted recently about having a directory turn into a 0 length file.. After lots of reading, poking around with debugfs, and running fsck with the "-n" parameter, I have some questions. The problem directory is named 201311. It's inode 15542275. Poking around with debugfs, I can see that the former subdirectories of 201311 still have all of their data in them. In fact, all of their '..' entries still point back to the 15542275 inode. I think I have two options: let fsck do the work, or do it myself using debugfs. The question is, which one is best? When I ran fsck, it found all of the unconnected directories (which used to be subdirectories of 201311) and asked whether to connect them to lost and found. Of course since I ran fsck with the -n parameter the answer was no.. Unconnected directory inode 3141911 (???) Connect to /lost+found? no Then further on, I got this: '..' in ... (3141911) is ??? (15542275), should be (0). Fix? no If I had not run fsck with -n, would fsck have set '..' to lost+found's inode rather than ? I'm tempted to run fsck and let it do it's thing, and then just move things from lost+found to where they belong. But output from fsck scares me a little bit. The partition is 1.5TB in size, and the customer doesn't have space for me to back it up =(. So I want to make sure I understand what is going to happen if I run fsck. Thanks guys! 
Charles From carlo at alinoe.com Wed Jul 23 15:58:33 2008 From: carlo at alinoe.com (Carlo Wood) Date: Wed, 23 Jul 2008 17:58:33 +0200 Subject: fsck.ext3 questions In-Reply-To: <48874662.909@erad.com> References: <48874662.909@erad.com> Message-ID: <20080723155833.GA12051@alinoe.com> On Wed, Jul 23, 2008 at 10:55:30AM -0400, Charles Riley wrote: > Hi, > > I posted recently about having a directory turn into a 0 length file.. > After lots of reading, poking around with debugfs, and running fsck with > the "-n" parameter, I have some questions. > > The problem directory is named 201311. It's inode 15542275. > Poking around with debugfs, I can see that the former subdirectories of > 201311 still have all of their data in them. In fact, all of their '..' > entries still point back to the 15542275 inode. > > I think I have two options: let fsck do the work, or do it myself using > debugfs. The question is, which one is best? > > When I ran fsck, it found all of the unconnected directories (which used > to be subdirectories of 201311) and asked whether to connect them to > lost and found. Of course since I ran fsck with the -n parameter the > answer was no.. > > Unconnected directory inode 3141911 (???) > Connect to /lost+found? no > > Then further on, I got this: > '..' in ... (3141911) is ??? (15542275), should be (0). > Fix? no > > If I had not run fsck with -n, would fsck have set '..' to lost+found's > inode rather than ? > > I'm tempted to run fsck and let it do it's thing, and then just move > things from lost+found to where they belong. > But output from fsck scares me a little bit. > > The partition is 1.5TB in size, and the customer doesn't have space for > me to back it up =(. So I want to make sure I understand what is going > to happen if I run fsck. fsck makes sure that the file system is *consistent*. It does not guarantee that missing data is recovered (although it will try to keep as much data as possible in those cases where it can make that decision).
My advice would therefore be: Try to repair the system as much as possible manually. You can 'look at it' with other tools (such as ext3grep without entering stage1), until it looks like you did the repair correctly: the directory is linked in again, has its inode with block pointers to the correct directory blocks etc. THEN run fsck before mounting it. There will still be lots of things that need to be updated/corrected at that point (counters and stuff). If fsck doesn't think the filesystem is clean after you messed with it, you shouldn't mount it. Doing things manually also solves your backup problem: as I told you before, make a backup of the journal and all groups that you are about to make changes to (that won't be too many). One group is only 135 MB, so that shouldn't be a problem. -- Carlo Wood From tytso at mit.edu Wed Jul 23 16:42:14 2008 From: tytso at mit.edu (Theodore Tso) Date: Wed, 23 Jul 2008 12:42:14 -0400 Subject: fsck.ext3 questions In-Reply-To: <48874662.909@erad.com> References: <48874662.909@erad.com> Message-ID: <20080723164214.GK8826@mit.edu> On Wed, Jul 23, 2008 at 10:55:30AM -0400, Charles Riley wrote: > > Unconnected directory inode 3141911 (???) > Connect to /lost+found? no > > Then further on, I got this: > '..' in ... (3141911) is ??? (15542275), should be (0). > Fix? no > > If I had not run fsck with -n, would fsck have set '..' to lost+found's > inode rather than ? Yes, it will set '..' to lost+found's inode after moving the directory to lost+found. > I'm tempted to run fsck and let it do it's thing, and then just move > things from lost+found to where they belong. > But output from fsck scares me a little bit. Yeah, that's just because you answered no to the "Connect to /lost+found" question, so the field "what should .. really be" was left at zero. It's not a big deal. > The partition is 1.5TB in size, and the customer doesn't have space for > me to back it up =(.
So I want to make sure I understand what is going > to happen if I run fsck. In general, it's always a good idea to do an image level backup just to be sure. Is this on an LVM? If so, you could create a snapshot that can act as a backup without it taking up the full 1.5TB in size. A snapshot volume with say, 50 megabytes reserved, is probably more than sufficient to maintain an LVM snapshot. - Ted From criley at erad.com Thu Jul 24 15:47:16 2008 From: criley at erad.com (Charles Riley) Date: Thu, 24 Jul 2008 11:47:16 -0400 Subject: debugfs question: What does "expand" do? Message-ID: <4888A404.1040105@erad.com> Hi, What does the "expand_dir" command do in debugfs? All the information I can find on it just says "Expand the directory filespec" I googled the archives and can't find a post where someone actually used it to give me some context. If I try it in debugfs (image opened ro) it seems to want to write to the image. I'm having a bit of a stack underflow here, please help. Thanks in advance, Charles From tytso at mit.edu Thu Jul 24 16:12:20 2008 From: tytso at mit.edu (Theodore Tso) Date: Thu, 24 Jul 2008 12:12:20 -0400 Subject: debugfs question: What does "expand" do? In-Reply-To: <4888A404.1040105@erad.com> References: <4888A404.1040105@erad.com> Message-ID: <20080724161220.GE23279@mit.edu> On Thu, Jul 24, 2008 at 11:47:16AM -0400, Charles Riley wrote: > Hi, > > What does the "expand_dir" command do in debugfs? > All the information I can find on it just says "Expand the directory > filespec" > I googled the archives and can't find a post where someone actually used > it to give me some context. > If I try it in debugfs (image opened ro) it seems to want to write to > the image. It adds an extra (empty) directory block to a directory inode. 
Something like this is used, for example, to recreate a lost+found directory with extra empty directory blocks so that e2fsck can reattach orphaned inodes without needing to allocate blocks from the filesystem (which might not be available if the filesystem is 100% full). It's not normally useful for most debugfs users. (Heck, debugfs wasn't intended to be useful for most ext3 users; it's really designed for ext3 wizards that need to untangle badly corrupted filesystems, and/or ext3/4 developers that are debugging new ext4 code, and/or ext3/4 developers creating deliberately corrupted filesystems for e2fsprogs's regression test suite.) - Ted From balu.manyam at gmail.com Thu Jul 24 18:06:34 2008 From: balu.manyam at gmail.com (Balu manyam) Date: Thu, 24 Jul 2008 23:36:34 +0530 Subject: ext3 filesystem --> read-only In-Reply-To: <3e01d8c50807241102q6f84b9e1mac3c69d4b74b79c6@mail.gmail.com> References: <3e01d8c50807241102q6f84b9e1mac3c69d4b74b79c6@mail.gmail.com> Message-ID: <995392220807241106m5bf9aeb3m4b66416c4a2c7099@mail.gmail.com> hey guys -- one of my rhel4 device's ext3 filesystem went into read only state -- i could see the following errors during the same time from syslog -- kernel version 2.6.9-67 -- does this ring a bell with anyone ?
------------------------- kernel: EXT3-fs error (device dm-8): ext3_free_blocks_sb: bit already cleared for block 254146 kernel: EXT3-fs error (device dm-8) in ext3_reserve_inode_write: Journal has aborted EXT3-fs error (device dm-8) in ext3_new_inode: Journal has aborted kernel: EXT3-fs error (device dm-8) in ext3_truncate: Journal has aborted kernel: EXT3-fs error (device dm-8) in ext3_reserve_inode_write: Journal has aborted EXT3-fs error (device dm-8) in ext3_orphan_del: Journal has aborted kernel: EXT3-fs error (device dm-8) in ext3_reserve_inode_write: Journal has aborted kernel: EXT3-fs error (device dm-8) in ext3_reserve_inode_write: Journal has aborted kernel: EXT3-fs error (device dm-8) in ext3_delete_inode: Journal has aborted kernel: ext3_abort called. kernel: EXT3-fs error (device dm-8): ext3_journal_start_sb: Detected aborted journal kernel: Remounting filesystem read-only ------------------------- any suggestions/advice are much appreciated -------------- next part -------------- An HTML attachment was scrubbed... URL: From tytso at mit.edu Thu Jul 24 18:23:50 2008 From: tytso at mit.edu (Theodore Tso) Date: Thu, 24 Jul 2008 14:23:50 -0400 Subject: ext3 filesystem --> read-only In-Reply-To: <995392220807241106m5bf9aeb3m4b66416c4a2c7099@mail.gmail.com> References: <3e01d8c50807241102q6f84b9e1mac3c69d4b74b79c6@mail.gmail.com> <995392220807241106m5bf9aeb3m4b66416c4a2c7099@mail.gmail.com> Message-ID: <20080724182350.GC27129@mit.edu> On Thu, Jul 24, 2008 at 11:36:34PM +0530, Balu manyam wrote: > hey guys -- one of my rhel4 device's ext3 filesystem went into read only > state -- i could see the following errors during the same time from syslog > -- kernel version 2.6.9-67 -- does this ring a bell with anyone ? 
> ------------------------- > kernel: EXT3-fs error (device dm-8): ext3_free_blocks_sb: bit already > cleared for block 254146 Your filesystem was corrupted, and so the kernel remounted it read-only to prevent further damage that might lead to data loss. Unmount it, and run e2fsck to fix the filesystem corruption. If this happens again, you may want to check your hardware; a bad or crimped hard drive cable, or a failing memory module can lead to on-disk filesystem corruption. - Ted From criley at erad.com Thu Jul 24 20:40:11 2008 From: criley at erad.com (Charles Riley) Date: Thu, 24 Jul 2008 16:40:11 -0400 Subject: debugfs question: What does "expand" do? In-Reply-To: <20080724161220.GE23279@mit.edu> References: <4888A404.1040105@erad.com> <20080724161220.GE23279@mit.edu> Message-ID: <4888E8AB.6050709@erad.com> An HTML attachment was scrubbed... URL: From thorsten.henrici at gfd.de Fri Jul 25 10:55:27 2008 From: thorsten.henrici at gfd.de (thorsten.henrici at gfd.de) Date: Fri, 25 Jul 2008 12:55:27 +0200 Subject: e2fsck message "inode is too big" Message-ID: Hello list, after rebooting one of our systems, fsck stopped the boot process due to critical errors in one of the ext3 filesystems, which had been cleanly unmounted. Running fsck -y manually reported that one of the inodes is "too big" and was truncated. After this, about three million inodes were orphaned and moved to lost+found. # strings /sbin/e2fsck | grep -i 'too.*big' @i %i is too big. @b #%B (%b) causes @d to be too big. @b #%B (%b) causes file to be too big. @b #%B (%b) causes symlink to be too big. @h %i has a tree depth (%N) which is too big Ext2 file too big shows the corresponding output strings. In the literature I found covering ext2/3 internals I didn't come across anything that suggests how an inode can become "too big". After all, the inode size is set when the filesystem is created (usually to 128 bytes, if I understand correctly). I have two questions regarding this: 1.
What can cause this error 2. Is there any message logged to either syslogd or the kernel ring buffer, when this limit is reached, since we couldn't find any. At least this holds true for /var/log/messages. Kernel version is 2.6.9-55.0.0.0.2.ELsmp #1 SMP Wed May 2 14:59:56 PDT 2007 i686 athlon i386 GNU/Linux Thanks a lot for your help in this Kind wishes Thorsten -- IMPORTANT NOTICE: This email is confidential, may be legally privileged, and is for the intended recipient only. Access, disclosure, copying, distribution, or reliance on any of it by anyone else is prohibited and may be a criminal offence. Please delete if obtained in error and email confirmation to the sender. -------------- next part -------------- An HTML attachment was scrubbed... URL: From adilger at sun.com Sun Jul 27 18:38:55 2008 From: adilger at sun.com (Andreas Dilger) Date: Sun, 27 Jul 2008 14:38:55 -0400 Subject: e2fsck message "inode is too big" In-Reply-To: References: Message-ID: <20080727183855.GD3153@webber.adilger.int> On Jul 25, 2008 12:55 +0200, thorsten.henrici at gfd.de wrote: > after rebooting one of our systems one of the ext3 filesystems, that was > cleanly unmounted, fsck stopped the boot process due to critical errors in > that filesystem. > > Running fsck -y manually returned, that one of the i-nodes is "too big" an > was truncated. > After this about three million i-nodes were orphaned an moved to > lost+found. > > # strings /sbin/e2fsck | grep -i 'too.*big' > @i %i is too big. > @b #%B (%b) causes @d to be too big. > @b #%B (%b) causes file to be too big. > @b #%B (%b) causes symlink to be too big. > @h %i has a tree depth (%N) which is too big > Ext2 file too big > > shows the according output sting. You need to provide the exact error message, not one of many possible error messages. > 1. What can cause this error That depends on what the error is, exactly. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc. 
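For a future run, the exact message Andreas asks for can be captured by saving the e2fsck transcript instead of grepping the binary for candidate strings. A minimal sketch, assuming the filesystem is unmounted; the function names, the device and the log path are placeholders and not anything from the thread. -f (force a check) and -n (open read-only, answer "no" to every question) are standard e2fsck options.

```shell
# Run a forced, read-only check and keep the full transcript.
# $1 = block device (placeholder), $2 = log file (placeholder).
fsck_transcript() {
    e2fsck -fn "$1" 2>&1 | tee "$2"
}

# Pull the candidate lines out of a saved transcript, using the same
# pattern as the strings | grep command quoted above.
too_big_lines() {
    grep -i 'too.*big' "$1"
}

# Usage (placeholders):
#   fsck_transcript /dev/XXX /root/e2fsck.log
#   too_big_lines /root/e2fsck.log
```

The matching line, quoted verbatim together with the inode number, is what the list would need to say anything about the cause.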
From swapana_ghosh at yahoo.com Tue Jul 29 02:48:16 2008 From: swapana_ghosh at yahoo.com (Swapana Ghosh) Date: Mon, 28 Jul 2008 19:48:16 -0700 (PDT) Subject: File permission getting read only Message-ID: <169384.27942.qm@web58303.mail.re3.yahoo.com> Hi all, I am not sure whether this is a problem with the NAS (NFS) or with the file system, but I am sending it to this list anyway. If this is the wrong list for my post, then I apologise. Here is the problem in brief: - The server is rhel4. - One file system (NAS) has been exported. - The file system has been mounted on the server. The mount looks like this: nas2.xxx.org:/dev/deg/temp /apps/deg/dev - I cd to /apps/deg/dev and create a file, say xx. The file is created with permission 644. Then, as root, I change the file permission to 444. After that, if I try to change the permission of the xx file back to 644 or anything else, I get "permission denied". I can't make any updates to the xx file. - The same procedure on a similar NAS-mounted file system works perfectly. If anyone can give me some pointers, that would be really appreciated. Thanks in advance From alex at alex.org.uk Tue Jul 29 08:48:10 2008 From: alex at alex.org.uk (Alex Bligh) Date: Tue, 29 Jul 2008 09:48:10 +0100 Subject: File permission getting read only In-Reply-To: <169384.27942.qm@web58303.mail.re3.yahoo.com> References: <169384.27942.qm@web58303.mail.re3.yahoo.com> Message-ID: I take it you've checked that you are not using the root_squash NFS export option. Alex --On 28 July 2008 19:48:16 -0700 Swapana Ghosh wrote: > > Hi all, > > I am not sure this is the problem of NAS (nfs) or the File system, I am > sending my problem to this list though. If this is wrong list for my > post , then I do aplogise. > > Here is the problem in brief: > > - The server is rhel4. > > - One file system (NAS) has been exported. > > - The file system has been mounted in the server.
The mount point is > like as > > nas2.xxx.org:/dev/deg/temp /apps/deg/dev > > - I am doing > > cd /apps/deg/dev and , creating some file say xx. File is > being created withthe the file permission 644 . > > Then I am changing the file permission as 444 as root. > After that when if I try to change the permission of the xx > file back to 644 or anything , it is giving > "permission denied". I can't do any updates on xx file. > > - Same type of process I am doing on a simililar type of NAS > mounted file system , it is perfectly alright. > > If anyone gives me some pointers that will be really appreciated. > > Thanks in advance > > > > > _______________________________________________ > Ext3-users mailing list > Ext3-users at redhat.com > https://www.redhat.com/mailman/listinfo/ext3-users > > Alex From articpenguin3800 at gmail.com Thu Jul 31 02:11:48 2008 From: articpenguin3800 at gmail.com (John Nelson) Date: Wed, 30 Jul 2008 22:11:48 -0400 Subject: non-contiguous Inodes Message-ID: <48911F64.7090903@gmail.com> Does non-contiguous inodes mean the inode itself is fragmented or is the file itself? From sandeen at redhat.com Thu Jul 31 02:19:44 2008 From: sandeen at redhat.com (Eric Sandeen) Date: Wed, 30 Jul 2008 21:19:44 -0500 Subject: non-contiguous Inodes In-Reply-To: <48911F64.7090903@gmail.com> References: <48911F64.7090903@gmail.com> Message-ID: <48912140.5080203@redhat.com> John Nelson wrote: > Does non-contiguous inodes mean the inode itself is fragmented or is the > file itself? The file. The filefrag utility can give you details on any given file. -Eric
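As a rough illustration of Eric's pointer: filefrag (shipped with e2fsprogs) prints a summary line of the form "<file>: N extents found", and -v adds a per-extent listing. The sketch below assumes that summary format; extent_count is a hypothetical helper name and the path in the usage comment is a placeholder.

```shell
# Extract N from filefrag's summary line, "<file>: N extent(s) found".
# Hypothetical helper; reads filefrag output on stdin.
extent_count() {
    sed -n 's/.*: \([0-9][0-9]*\) extents\{0,1\} found.*/\1/p'
}

# Usage (path is a placeholder):
#   filefrag -v /path/to/file               # per-extent listing
#   filefrag /path/to/file | extent_count   # just the number
```

Each extent is one contiguous run of blocks, so a higher count means the file's data is more scattered.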