[Linux-cluster] The file on a GFS2-filesystem seems to be corrupted

Mon Dec 15 12:27:07 UTC 2014

Hi,

On 15/12/14 12:22, Vladimir Melnik wrote:
> Thank you!
>
> Is it safe if I run "fsck.gfs2 -n" without unmounting it?
It is better to unmount. Otherwise fsck's view of the fs will differ
from the view that the mount has, so there is no guarantee that fsck
will give correct results. So the short answer to your question is no.

Steve.
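
For illustration, the unmount-then-check sequence could be sketched as
below. This is a minimal sketch under stated assumptions, not a tested
recipe: the device path is a hypothetical placeholder, and the helper
names run and check_gfs2 are made up for this example. Only the mount
point /mnt/sp1 and the "fsck.gfs2 -n" invocation come from the thread.
By default the script only prints what it would do.

```shell
#!/bin/sh
# Sketch only. DRY_RUN=1 (the default) prints commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Hypothetical helper: unmount, then check read-only.
# $1 = block device holding the GFS2 filesystem, $2 = its mount point.
check_gfs2() {
    device="$1"
    mountpoint="$2"
    # The umount has to be repeated on every node that has the fs mounted.
    run umount "$mountpoint" || return 1
    # -n answers "no" to every question, so nothing on disk is modified.
    run fsck.gfs2 -n "$device"
}

# /dev/example_vg/example_lv is a placeholder, not a path from the thread.
check_gfs2 /dev/example_vg/example_lv /mnt/sp1
```

Setting DRY_RUN=0 would execute the commands for real, which should
only be done once the filesystem is known to be unmounted on all nodes.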

> On Mon, Dec 15, 2014 at 09:59:02AM +0000, Steven Whitehouse wrote:
>> Hi,
>>
>> On 15/12/14 09:54, Vladimir Melnik wrote:
>>> Hi!
>>>
>>> The qcow2 isn't underneath it; we can assume it's an ordinary file on a
>>> filesystem. Its size was about 300-400 GB, but now the size is
>>> 7493992262336241664 bytes and I don't understand how that happened. I'd
>>> like to remove it, but I worry about the consequences. :(
>> Ok, I think I see now... either way though, if you are unsure about
>> whether there is a problem, then unmounting on all nodes and running
>> fsck is the way to go. That should pick up any problems that there
>> might be with the filesystem. If you have the ability to snapshot
>> the storage, then you could run fsck on a snapshot in order to avoid
>> so much downtime.
>>
>> An odd file size should not, in and of itself, cause any problems
>> with removing a file, so it will only be an issue if other on-disk
>> metadata is incorrect.
>>
>> Steve.
>>
>>> On Mon, Dec 15, 2014 at 09:23:47AM +0000, Steven Whitehouse wrote:
>>>> Hi,
>>>>
>>>> How did you generate the image in the first place? I don't know if
>>>> we've ever really tested GFS2 with a qcow device underneath it -
>>>> normally even in virt clusters the storage for GFS2 would be a real
>>>> shared block device. Was this perhaps just a single node?
>>>>
>>>> Have you checked the image with fsck.gfs2 ?
>>>>
>>>> Steve.
>>>>
>>>> On 15/12/14 09:17, Vladimir Melnik wrote:
>>>>> And one more question,
>>>>>
>>>>> Is it safe to remove this file? What will happen if I try to run 'rm
>>>>> /mnt/sp1/ac2cb28f-09ac-4ca0-bde1-471e0c7276a0.bak', won't it corrupt
>>>>> other files?
>>>>>
>>>>> Thanks.
>>>>>
>>>>> On Sat, Dec 13, 2014 at 06:04:48PM +0200, Vladimir Melnik wrote:
>>>>>> Dear colleagues,
>>>>>>
>>>>>> I encountered a very strange issue and would be grateful if you shared
>>>>>> your thoughts on it.
>>>>>>
>>>>>> I have a qcow2 image that is located on a GFS2 filesystem on a cluster.
>>>>>> The cluster works fine and there are dozens of other qcow2 images, but,
>>>>>> as far as I can see, one of the images seems to be corrupted.
>>>>>>
>>>>>> First of all, it has a quite unusual size:
>>>>>>> stat /mnt/sp1/ac2cb28f-09ac-4ca0-bde1-471e0c7276a0.bak
>>>>>>    File: `/mnt/sp1/ac2cb28f-09ac-4ca0-bde1-471e0c7276a0.bak'
>>>>>>    Size: 7493992262336241664     Blocks: 821710640  IO Block: 4096   regular file
>>>>>> Device: fd06h/64774d    Inode: 220986752   Links: 1
>>>>>> Access: (0744/-rwxr--r--)  Uid: (    0/    root)   Gid: (    0/    root)
>>>>>> Access: 2014-10-09 16:25:24.864877839 +0300
>>>>>> Modify: 2014-12-13 14:41:29.335603509 +0200
>>>>>> Change: 2014-12-13 15:52:35.986888549 +0200
>>>>>>
>>>>>> By the way, I noticed that the number of blocks looks reasonable.
>>>>>>
>>>>>> Also qemu-img can't recognize it as an image:
>>>>>>> qemu-img info /mnt/sp1/ac2cb28f-09ac-4ca0-bde1-471e0c7276a0.bak
>>>>>> image: /mnt/sp1/ac2cb28f-09ac-4ca0-bde1-471e0c7276a0.bak
>>>>>> file format: raw
>>>>>> virtual size: 6815746T (7493992262336241664 bytes)
>>>>>> disk size: 392G
>>>>>>
>>>>>> The disk size, though, looks more reasonable: the image's size really
>>>>>> should be about 300-400G, as I remember.
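
The figures quoted above can be cross-checked with a little arithmetic.
The sketch below copies the Size and Blocks values from the stat output
and assumes that stat counts blocks in 512-byte units (GNU stat reports
the actual unit with the %B format). The variable names are made up for
this example. The result is consistent with, though not proof of, only
the recorded file size being wrong while the allocated data is intact.

```shell
#!/bin/sh
# Values copied from the stat output quoted in this thread.
# On a live file they could be read with: stat -c '%s %b %B' FILE
apparent_size=7493992262336241664   # "Size:" field, in bytes
blocks=821710640                    # "Blocks:" field
block_unit=512                      # assumption: 512-byte units

# Space actually allocated to the file, in bytes and in whole GiB.
allocated=$((blocks * block_unit))
allocated_gib=$((allocated / 1024 / 1024 / 1024))

# Apparent size in whole TiB, for comparison with qemu-img's "virtual size".
apparent_tib=$((apparent_size / 1024 / 1024 / 1024 / 1024))

echo "allocated: $allocated bytes (about $allocated_gib GiB)"
echo "apparent:  $apparent_size bytes (about $apparent_tib TiB)"

if [ "$((apparent_size > allocated))" -eq 1 ]; then
    echo "apparent size is far larger than the allocated space"
fi
```

The allocated figure comes to 420715847680 bytes, a little under 392 GiB,
which agrees with the "disk size: 392G" line from qemu-img and with the
remembered 300-400G.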
>>>>>>
>>>>>> Alas, I can't do anything with this image. I can't check it with
>>>>>> qemu-img, nor can I convert it to a new image, as qemu-img refuses
>>>>>> to open it:
>>>>>>
>>>>>>> qemu-img convert -p -f qcow2 -O qcow2 /mnt/sp1/ac2cb28f-09ac-4ca0-bde1-471e0c7276a0.bak /mnt/tmp/ac2cb28f-09ac-4ca0-bde1-471e0c7276a0
>>>>>> Could not open '/mnt/sp1/ac2cb28f-09ac-4ca0-bde1-471e0c7276a0.bak': Invalid argument
>>>>>> Could not open '/mnt/sp1/ac2cb28f-09ac-4ca0-bde1-471e0c7276a0.bak'
>>>>>>
>>>>>> Has anyone experienced the same issue? What do you think, is it a qcow2
>>>>>> issue or a GFS2 issue? What would you do in a similar situation?
>>>>>>
>>>>>> Any ideas, hints and comments would be greatly appreciated.
>>>>>>
>>>>>> Yes, I have snapshots, that's good, but I wouldn't like to lose today's
>>>>>> changes to the data on that image. And I'm worried about the filesystem
>>>>>> as a whole: what if something goes wrong when I try to remove that file?
>>>>>>
>>>>>> Thanks to all!
>>>>>>
>>>>>>
>>>>>> -- 
>>>>>> V.Melnik
>>>>>>
>>>>>> P.S. I use CentOS-6 and I have these packages installed:
>>>>>> 	qemu-img-0.12.1.2-2.415.el6_5.4.x86_64
>>>>>> 	gfs2-utils-3.0.12.1-59.el6_5.1.x86_64
>>>>>> 	lvm2-cluster-2.02.100-8.el6.x86_64
>>>>>> 	cman-3.0.12.1-59.el6_5.1.x86_64
>>>>>> 	clusterlib-3.0.12.1-59.el6_5.1.x86_64
>>>>>> 	kernel-2.6.32-431.5.1.el6.x86_64
>>>>>>
>>>>>> -- 
>>>>>> Linux-cluster mailing list
>>>>>> Linux-cluster at redhat.com
>>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster