[rhelv6-beta-list] My first experiences with RHEL6 beta

Bryan J. Smith b.j.smith at ieee.org
Tue Jun 15 16:31:44 UTC 2010


Okay, I'll bite ... (big grin)

John Summerfield <debian at herakles.homelinux.org> wrote:
> I would venture to suggest that most systems running Linux,
> especially those with IA32 or AMD-64 CPUs, have a single
> disk. I cannot see that LVM provides any benefit in such
> cases.

Based on my responses below, I don't believe you know what LVM
does at all.  Read on ...

> I used to use OS/2. If you check googlism.com you will find
> vestiges of my reputation there. on OS/2, which never had
> swap partitions,

Oh, so you've used OS/2[1]'s LVM, correct?  IBM has had several
volume managers over the years, and even donated one to Linux.

OS/2 LVM allows the OS to get past the limitations of the legacy
PC BIOS/DOS Disk Label (aka MBR Partition Table), just like NT's
Logical Disk Manager (LDM) and Linux's Logical Volume Manager (LVM).

Furthermore, unlike OS/2 and NT, which require "drive letters"
because of the legacy tie to DOS 1.0[2], POSIX (UNIX/Linux)
systems are even more flexible.

Read on ...

> the recommend placement for swap is "the busiest partition
> on the least busy drive."  The "least busy drive" needs no
> explanation,

I'll even bite further on that, and agree to keep the focus limited
to only 1 drive.

Your comments are clearly 20th century, even questionable for the
'90s[2], but definitely rooted in the '80s.  Starting with the 21st
century, Linux offered LVM, as other OSes offered their volume
management.

LVM is at version 2 in kernel 2.6, and is merely a kernel logical
addressing facility.  It's no different than Multipath and other
such facilities, for that matter.  The kernel addresses storage
linearly, regardless of whether it is physically linear or not, and
DeviceMapper provides the user-space devices.

So let's get low-level ...

Because logical volumes (LV) are just that, logical, they don't need
to be linear, physically.  They only need to be linear, logically,
which the kernel, DeviceMapper, etc... work in tandem to provide.
There are _no_ more legacy limitations forcing one to the legacy
BIOS/DOS disk label.

So here's one such option ...

- Take one filesystem slice and physically break it into 5 contiguous parts
- Now take the swap slice and physically break it into 4 contiguous parts
- Fit the 4 parts of swap in between the 5 parts of filesystem

Now isn't that far more efficient?  It certainly is more manageable
to leverage LVM2/DM than to try to move around a "swap file" in a file
system.  The filesystem slice still looks like a single, linear slice,
even though it's physically not.  Same deal with the swap partition.
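The idea above can be sketched in a few lines of plain Python.  This is
a toy model, not the LVM2 or DeviceMapper API -- all names, extent sizes
and offsets are invented for illustration:

```python
# Toy model of logical-extent mapping, in the spirit of LVM2/DeviceMapper.
# Physical layout on one disk: 5 filesystem parts interleaved with
# 4 swap parts, each part contiguous on disk.  Tuples are
# (physical start extent, length in extents); numbers are made up.

fs_parts   = [(0, 10), (14, 10), (28, 10), (42, 10), (56, 10)]
swap_parts = [(10, 4), (24, 4), (38, 4), (52, 4)]

def resolve(parts, logical_extent):
    """Map a logical extent number to its physical extent on disk."""
    offset = logical_extent
    for start, length in parts:
        if offset < length:
            return start + offset
        offset -= length
    raise IndexError("logical extent out of range")

# The filesystem sees one linear run of 50 extents ...
assert [resolve(fs_parts, n) for n in (0, 9, 10, 49)] == [0, 9, 14, 65]
# ... while swap sees its own linear run of 16 extents,
# physically interleaved with the filesystem's parts.
assert [resolve(swap_parts, n) for n in (0, 4, 15)] == [10, 24, 55]
```

Each consumer (filesystem, swap) addresses a single, linear range, while
the map decides where those extents physically live.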

This is volume management 101, whether you're using Linux LVM,
OS/2 LVM, NT LDM, or countless other options out there.  Linux's
volume management is not only "free," but it's used by the kernel
itself, because it's how it addresses devices natively.

All DeviceMapper and related, user-space tools do is offer a means
to present it.

[ SIDE NOTE:  Linux LVM is modeled after Digital UNIX/Tru64's
volume manager ]

> but people do tend to choke on their weeties at "the busiest
> partition."

But where is it _physically_?

What if there was a way to place it "most efficiently" from a
"physical" standpoint, while it looked, "logically," as one, long,
linear set of addresses?  That's what LVM2/DeviceMapper can do,
leveraging what's built into kernel 2.6.

Thanx to DeviceMapper, I can lay out the physical, which has no bearing
on what is used logically.  Inside of a filesystem?  Not so much, and
far, far more difficult.  That's why DeviceMapper exists.  To make the
presentation of logical devices compatible with everything that assumes
things are linear/contiguous, without one having to know what is going
on physically -- especially when they are not physically linear.  ;)
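For the curious, the kernel's "linear" target expresses exactly this.
A hypothetical dmsetup table (device name and sector counts invented for
the sketch) presenting one linear logical device built from two
non-adjacent regions of the same disk:

```
# <logical start> <sectors> linear <device> <physical start sector>
0      409600  linear  /dev/sda  2048
409600 409600  linear  /dev/sda  1228800
```

Whatever sits on top only ever sees sectors 0 through 819199, linearly.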
 
> reflect a moment. At a random instant, where are the
> drive's heads likely to be? I suggest over the latest read
> or write operation. Where is that, usually? Someplace in the
> busiest partition.

But where?  Where at any time?  You cannot know what is most efficient.
At best, you could leverage DeviceMapper to break up slices into
contiguous portions, aligned with sector boundaries, and insert the
swap slice as its own, contiguous portions in between.

It would be far more efficient than a "swap file," and much, much
easier to position via DeviceMapper, than inside of a filesystem.
So this argument is absolutely a non-starter as well.
If you understood how LVM2/DM works, I wouldn't need to go over this.
 
> It does not matter what OS you use, the above is true.

What is true?  That it is a true assumption, based on theoretical
concepts of how the layout might be, in a meta-system that is not
real?  Or worse yet, based on OS/2 from the 20th century, and not
even modern systems (not even modern OS/2)?

Or is it really more the fact that one has little control over what
a filesystem does?  I'd argue the latter.

At the same time, one _can_ lay out the physical extents (PEs) as
optimally as possible with something like LVM2, and then layer the
logical volumes (LVs) for filesystems and swap over those contiguous,
although not always linear, extents.

So swap files, yet again, make less sense here too.

> Now, with default partitioning on any Linux distro I have
> seen, where the user chooses "one partition for everything,"
> there are two partitions.
> One for the data, covering almost all the drive.

If you're really "anal" about performance, you don't mix these
three types of files:  
1.  Static binaries/content (e.g., /, /usr)
2.  Temporary, highly fragmenting files/logs (e.g., /tmp, /var)
3.  Dynamic data of varying small/large sizes (i.e., data)

Extents-based filesystem designs (e.g., XFS) help in the case of
data, but for the most part, segmenting #2 from #3 is highly
recommended.  We're talking about performance here, right?
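One hypothetical way to segment the three (mount points, volume group
and LV names are all invented for illustration):

```
# /etc/fstab sketch: separate LVs so /tmp and /var fragmentation
# never touches static binaries or the data volume
/dev/VolGroup00/lv_root   /      ext3  defaults        1 1
/dev/VolGroup00/lv_var    /var   ext3  defaults        1 2
/dev/VolGroup00/lv_data   /srv   xfs   defaults        1 2
```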

> One for the swap, either on the inside edge or the outside
> edge of the disk, it makes little difference.

With LVM2/DM, one can change that.  In fact, even _after_ the physical
volumes (PVs) are created, organized into volume groups (VGs) and the
logical volumes (LVs) exist, one can re-organize the location of the
Physical Extents (PEs) _live_, while the system is running.  ;)
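A toy illustration of why that live move is safe (plain Python, not the
real pvmove mechanics -- LVM2 actually mirrors the extent first, then
flips the map, but the principle is the same):

```python
# Toy sketch of live extent relocation, pvmove-style.  The mapping
# table is the only thing that changes; logical addresses stay put,
# so nothing above the map ever notices.

pe_map = {0: 100, 1: 101, 2: 102}   # logical extent -> physical extent

def read(logical):
    return pe_map[logical]           # callers only ever see logical numbers

before = read(1)                     # physically at PE 101
pe_map[1] = 500                      # "pvmove": copy the data, update the map
after = read(1)                      # physically at PE 500 now

assert (before, after) == (101, 500)
```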

> Now, I do not understand how Linux filesystems decide where
> on the disk to create new files.

With LVM2/DM, you can tell it _exactly_ what PEs to use.  ;)

> btdt. I worked at the Australian Dept of Social Security
> when it implemented the original Medibank in the 1970s. We
> bought a Very Expensive Computer system (IBM's finest at the
> time), and an Especially Expensive Disk to handle high I/O
> traffic.

And I did missile defense in my former life, including maintaining
every major platform under the Sun.  Does it apply to laying out
managed volumes in the 21st century on Linux?  No.  ;)

If I don't know how to effectively tune VM in Linux, I might as
well go back to being an engineer, working on avionics and telemetry.

Even on Red Hat Enterprise Linux release 5, we have ...

$ cat /etc/redhat-release ; ls /proc/sys/vm
Red Hat Enterprise Linux Client release 5.4 (Tikanga)
block_dump                 lowmem_reserve_ratio       pagecache
dirty_background_ratio     max_map_count              panic_on_oom
dirty_expire_centisecs     max_writeback_pages        percpu_pagelist_fraction
dirty_ratio                min_free_kbytes            swap_token_timeout
dirty_writeback_centisecs  mmap_min_addr              swappiness
drop_caches                nr_hugepages               topdown_allocate_fast
flush_mmap_pages           nr_pdflush_threads         vdso_enabled
hugetlb_shm_group          overcommit_memory          vfs_cache_pressure
laptop_mode                overcommit_ratio
legacy_va_layout           page-cluster

I see _at_least_ a dozen (12) tunables in there that make all the
difference.  As someone else pointed out earlier, "vm.swappiness" is #1.
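For example (the value 10 below is just an illustrative starting point,
not a recommendation from this thread), a /etc/sysctl.conf fragment:

```
# /etc/sysctl.conf -- prefer reclaiming page cache over swapping
# out anonymous pages (0-100; lower = less eager to swap)
vm.swappiness = 10
```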

I honestly think we're done here.  Systems differ.  If one doesn't
know what they're doing with the Linux VM, then it really doesn't matter
what distro, or what non-Linux experience, one has.

-- Bryan, only speaking on behalf of myself, from experience

FOOTNOTES:  

[1] I am also a long-time OS/2 sysadmin.  I was also an original NT
3.1 beta tester, who was a technician at the largest installed base
of the first, native NT application.  I saw OS/2 and NT early on.
I also did BSD, SCO and SunOS (along with VMS) in those days.

Take this to heart: absolutely 100% of PC assumptions based on OS/2
and NT have absolutely 0% applicability when it comes to POSIX (UNIX/
Linux) platforms on the PC.  This cannot be stressed enough.

[2] Most of the assumptions on OS/2 and NT, especially when it comes
to volume management, are based on their legacy tie to DOS 1.0.  In
fact, OS/2 LVM and NT LDM exist to overcome them, just like Linux's
LVM.  Even NT has UNCs and "Anchors" to get away from drive letters.

The concept of a single filesystem aka "drive letter" is because DOS
1.0 didn't support directories, something that has plagued OS/2-NT
for their life.  This concept of a single filesystem is always going
to be the least efficient.  Always has been.  Always will be.

And modern volume management makes the point moot.

-- 
Bryan J  Smith             Professional, Technical Annoyance 
------------------------------------------------------------ 
"Now if you own an automatic ... sell it!
 You are totally missing out on the coolest part of driving"
                                         -- Johnny O'Connell







