[Linux-cluster] GFS Data Journaling - answers

FM dist-list at LEXUM.UMontreal.CA
Thu Oct 26 14:44:49 UTC 2006


Thanks !
Now I have one less performance test to try :)

Riaan van Niekerk wrote:
> hi all
>
> I asked Red Hat Global Support Services some questions about GFS Data
> Journaling. From what Red Hat tells me, I can summarize:
>
> a) only benefits writes, not reads
> b) cannot be applied retroactively to non-empty files (I can think of
> a very kludgy way: touch newfile ; [set property using gfs_tool] ;
> cat oldfile > newfile ; mv newfile oldfile ; see the sketch below),
> but this is not something I would like to do on 3 TB of small files
> c) we would have to test extensively to tell if it is worth the effort
> d) it might even make performance worse since it will place higher I/O
> load on the SAN / SP.
> e) it would be extremely difficult to do a representative simulation.
> It might perform better on a non-starved RAID controller or SAN SP,
> but performance might drop when applying the change on our starved
> production SP.
>
> (SP = Storage Processor, EMC speak for the thing that handles the I/O
> in the SAN storage array)
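>
> For the record, the kludge from (b) would look something like this
> (a sketch only, untested; the paths are made up, and the flag can
> only be set while the new file is still zero-length):
>
> # touch /mnt/gfs/newfile
> # gfs_tool setflag jdata /mnt/gfs/newfile
> # cat /mnt/gfs/oldfile > /mnt/gfs/newfile
> # mv /mnt/gfs/newfile /mnt/gfs/oldfile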
>
> So, in our environment at least, we can cross this off our list as an
> avenue of exploration for increasing performance.
>
> greetings
> Riaan
>
> ...
>
> Eduardo:
>
> I have talked to one of the GFS developers regarding this issue. I'll
> answer your questions one by one:
>
>
> a) is it safe to do?
>
> It is safe, but it is not recommended to do this on a production
> filesystem. You should create a new filesystem and experiment with it.
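>
> For example, something along these lines on a spare LUN (a sketch;
> the cluster name, device and journal count below are placeholders):
>
> # gfs_mkfs -p lock_dlm -t mycluster:gfstest -j 2 /dev/sdx1
> # mount -t gfs /dev/sdx1 /mnt/gfstest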
>
> b) does the change favour mostly writes?
>
> Yes, but beware! This change only affects fsync() calls. The
> developer says that you will have quicker fsync() calls, but on the
> other hand you will generate more I/O. So if the SAN is saturated on
> I/O, or on the CPU needed for I/O, it may actually end up slower for
> you.
>
> c) any ballpark performance improvement we can expect
>
> Engineering will not give any concrete numbers. They say that the
> fsyncs are quicker at the cost of higher I/O, and that depending on
> the case the performance can be better or worse; you would need to
> create a test case and see how it pans out for you.
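>
> One crude way to build such a test case (a sketch, untested; the
> paths are made up, and it assumes a GNU dd that supports oflag=dsync,
> which makes every write synchronous, roughly like an fsync()-heavy
> workload). Time writes to a normal file, then to a file that had the
> jdata flag set while it was still zero-length:
>
> # time dd if=/dev/zero of=/mnt/gfstest/plain bs=4k count=10000 oflag=dsync
> # touch /mnt/gfstest/jdata
> # gfs_tool setflag jdata /mnt/gfstest/jdata
> # time dd if=/dev/zero of=/mnt/gfstest/jdata bs=4k count=10000 oflag=dsync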
>
> d) I see "gfs_tool setflag jdata" will only have an effect on
> zero-length files. is there any way to run "gfs_tool setflag jdata"
> against existing files with some kind of copy/cat/mv operation to have
> the existing mail data benefit from this changed property?
>
> The developer says that this is not possible.
>
> e) is there any way to see if the flag has been set on a directory /
> file? (I only see setflag/clearflag operations, not any kind of
> "getflag")
>
> The developer says that you can run gfs_tool stat on a file, and see
> if the jdata flag is active for the file:
>
> # gfs_tool stat /mnt/gfs/x | grep jdata
>   jdata
>   inherit_jdata
>
> Therefore you can see whether the jdata and inherit_jdata flags are
> active per inode.
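>
> To check a whole directory of files rather than one, a one-liner
> along these lines should do it (a sketch; the path is made up, and
> grep -w is there so that inherit_jdata does not count as a match).
> It prints the files that do NOT have the flag:
>
> # for f in /mnt/gfs/mail/*; do gfs_tool stat "$f" | grep -qw jdata || echo "$f"; done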
>
>
> As a last comment on this issue: I have asked the engineers whether
> this option gives any performance increase in READs. The answer is
> NO. So I believe that if the system is maxed out on reads, or the
> workload is 75% reads, there is very little performance boost this
> feature can provide, and it might even cause drops in performance.
> The recommended approach is to create a test environment and see how
> it pans out.

