[rhelv6-beta-list] My first experiences with RHEL6 beta

John Summerfield debian at herakles.homelinux.org
Wed Jun 16 11:30:01 UTC 2010


Bryan J. Smith wrote:
> Okay, I'll bite ... (big grin)
> 

Bryan, I don't think you've thought about my current environment or 
requirements. I have already mentioned them.


Blistering performance is not amongst my current or likely future 
requirements.

Right now as I type this, I have a dual-core Intel E6600 waiting for me 
to actually do something. It's supported by 5 Gbytes of RAM of which 
about 4 Gbytes is used as filesystem cache.


> John Summerfield <debian at herakles.homelinux.org> wrote:
>> I would venture to suggest that most systems running Linux,
>> especially those with IA32 or AMD-64 CPUs, have a single
>> disk. I cannot see that LVM provides any benefit in such
>> cases.
> 
> Based on my latter responses, I don't believe you know what LVM
> does at all.  Read on ...
> 
>> I used to use OS/2. If you check googlism.com you will find
>> vestiges of my reputation there. On OS/2, which never had
>> swap partitions,
> 
> Oh, so you've used OS/2[1]'s LVM, correct?  IBM has had several
> volume managers over the years, and even donated one system to Linux.

No, I didn't even use LVM then.


> 
> OS/2 LVM allows the OS to get past the limitations of the legacy
> PC BIOS/DOS Disk Label (aka MBR Partition Table), just like NT's
> Logical Disk Manager (LDM) and Linux's Logical Volume Manager (LVM).
> 
> Furthermore, unlike OS/2 and NT, which require "drive letters"
> because of the legacy tie to DOS 1.0[2], POSIX (UNIX/Linux)
> systems are even more flexible.

Not relevant to anything I have said.

> 
> Read on ...
> 
>> the recommended placement for swap is "the busiest partition
>> on the least busy drive."  The "least busy drive" needs no
>> explanation,
> 
> I'll even bite further on that, and agree to keep the focus limited
> to only 1 drive.
> 
> Your comments are clearly 20th century, even questionable for the
> '90s[2], but definitely rooted in the '80s.  Starting with the 21st
> century, Linux offered LVM, as other OSes offered their volume
> management.
> 
> LVM is version 2 in kernel 2.6, and is merely a kernel logical
> addressing facility.  It's no different than Multipath and other
> facilities for that matter.  The kernel addresses storage linearly,
> regardless of whether it is physically linear or not, and DeviceMapper
> provides the user-space devices.
> 
> So let's get low-level ...
> 
> Because logical volumes (LV) are just that, logical, they don't need
> to be linear, physically.  They only need to be linear, logically,
> which the kernel, DeviceMapper, etc... work in tandem to provide.
> There are _no_ more legacy limitations forcing one to the legacy
> BIOS/DOS disk label.
> 
> So here's one such option ...
> 
> - Take one filesystem slice and physically break it into 5 contiguous parts
> - Now take the swap slice and physically break it into 4 contiguous parts
> - Fit the 4 parts of swap in between the 5 parts of filesystem

If that isn't how Anaconda sets up the disk, it's pretty irrelevant.
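For what it's worth, the interleaving argument can be illustrated with a toy 
seek model. This is purely an illustration: the disk size, slice sizes and 
the alternating access pattern are made up, and it ignores locality within 
the filesystem.

```python
import random

def layout_edge():
    """Filesystem in one slice, swap parked at the disk's edge."""
    fs = [(0, 900)]                # filesystem: first 900 seek units
    swap = [(900, 1000)]           # swap: last 100 seek units
    return fs, swap

def layout_interleaved():
    """Filesystem in 5 parts with the 4 swap parts fitted in between."""
    fs = [(i * 205, i * 205 + 180) for i in range(5)]          # 5 x 180
    swap = [(i * 205 + 180, i * 205 + 205) for i in range(4)]  # 4 x 25
    return fs, swap

def pick(slices, rng):
    """Pick a position uniformly over the total length of the slices."""
    total = sum(b - a for a, b in slices)
    x = rng.uniform(0, total)
    for a, b in slices:
        if x < b - a:
            return a + x
        x -= b - a
    return slices[-1][1]

def mean_seek(fs, swap, n=20000, seed=1):
    """Mean head travel when accesses alternate between fs and swap."""
    rng = random.Random(seed)
    pos, total = pick(fs, rng), 0.0
    for i in range(n):
        nxt = pick(swap, rng) if i % 2 == 0 else pick(fs, rng)
        total += abs(nxt - pos)
        pos = nxt
    return total / n

print(mean_seek(*layout_edge()), mean_seek(*layout_interleaved()))
```

On this toy model the interleaved layout noticeably shortens the mean travel 
for the alternating pattern; whether that pattern resembles a real workload 
is exactly the point in dispute.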

> Now isn't that far more efficient?  It certainly is more manageable
> to leverage LVM2/DM than to try to move around a "swap file" in a file

But I don't move around a swap file. Those systems where I don't have 
a swap partition have been running for years in some cases:
[root at ns ~]# ls -l /var/swapfile
-rw-r--r--  1 root root 536870912 Jun 16  2005 /var/swapfile
[root at ns ~]#
and I've not touched it. That swap file was created the day I installed 
RHEL-clone on it.
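For completeness, creating a swap file like that is the standard four 
commands. They need root, so they are shown as comments; only the size 
arithmetic is actually run:

```shell
# The usual way to create a one-off swap file (needs root, so shown
# as comments; the path matches the listing above):
#   dd if=/dev/zero of=/var/swapfile bs=1M count=512
#   chmod 600 /var/swapfile
#   mkswap /var/swapfile
#   swapon /var/swapfile
# The 536870912 bytes in the ls output are exactly 512 MiB:
size_mib=$((536870912 / 1024 / 1024))
echo "${size_mib} MiB"
```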


> system.  The filesystem slice still looks like a single, linear slice,
> even though it's physically not.  Same deal with the swap partition.
> 
> This is volume management 101, whether you're using Linux LVM,
> OS/2 LVM, NT LDM, or countless other options out there.  Linux's
> volume management is not only "free," but it's used by the kernel
> itself, because it's how it addresses devices natively.

I'm sure all that is useful if you have lots of disks. I've hardly ever 
had more than two disk drives in any of my computers.


> 
> All DeviceMapper and related, user-space tools do is offer a means
> to present it.
> 
> [ SIDE NOTE:  Linux LVM is modeled after Digital UNIX/Tru64's
> volume manager ]
> 
>> but people do tend to choke on their weeties at "the busiest
>> partition."
> 
> But where is it _physically_?

It doesn't matter where it is physically; the point is that it's 
somewhat close to the rest of the high-use data.

In the '70s, we went to enormous trouble to determine which data 
(operating-system data, mostly) was most used, and to cluster it together.

We allocated specific files to specific tracks to ensure that all the 
high-use data was as close together as possible, in order of use.

It wasn't especially important just where it was, provided it was pretty 
much in the middle of the used space. We tended to put these files in 
the middle of the disk, because that seemed about right.
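That middle-of-the-disk rule of thumb is easy to sanity-check: for requests 
spread uniformly over the disk, the home position that minimises mean head 
travel is the median position, i.e. the middle. A toy check (the disk size 
here is arbitrary):

```python
def mean_travel(home, requests):
    """Mean head travel from a home position to each request."""
    return sum(abs(r - home) for r in requests) / len(requests)

requests = list(range(1000))   # requests spread uniformly over the disk
best = min(range(1000), key=lambda h: mean_travel(h, requests))
print(best)                    # the median: 499 (ties with 500)
```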

> 
> What if there was a way to place it "most efficiently" from a
> "physical" standpoint, while it looked, "logically," as one, long,
> linear set of addresses?  That's what LVM2/DeviceMapper can do,
> leveraging what's built into kernel 2.6.
> 

There is a definite limit to how much data can be under the heads at one 
time on commodity disk drives, and that is precisely one cylinder. A 
larger amount is within a 5-cylinder seek, and no amount of magic with 
LVM and Device Mapper will change that.

If you have a box of disks, significantly more disks, then you can have 
quite a large amount of data within a short seek of the heads, but 
that's not the kind of system I even see these days.


> Thanx to DeviceMapper, I can lay out the physical, which has no bearing
> on what is used logically.  Inside of a filesystem?  Not so much, and
> far, far more difficult.  That's why DeviceMapper exists.  To make the
> presentation of logical devices compatible with everything that assumes
> things are linear/contiguous, without one having to know what is going
> on physically -- especially when they are not physically linear.  ;)
>  
>> reflect a moment. At a random instant, where are the
>> drive's heads likely to be? I suggest over the latest read
>> or write operation. Where is that, usually? Someplace in the
>> busiest partition.
> 
> But where?  Where at any time?  You cannot know what is most efficient.

In the '70s, it was my job to know just that.


> At best, you could leverage DeviceMapper to break up slices into
> contiguous portions, aligned with sector boundaries, and insert the
> swap slice as its own, contiguous portions in between.

However, "most efficient" isn't always necessary. Avoiding the clearly 
bad is enough.

I can't imagine that Anaconda, in a single-disk environment, would lay 
out the partitions on the physical disk, after all the remapping with 
LVM et al., any differently from what I would do using fdisk: if I 
think a separate /boot is necessary, then one partition for /boot, 
another for / and a third for swap, assuming I agreed to create a 
partition for swap at all. So swap would be on the edge of the disk.
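Spelled out as an old-style sfdisk input (fields are start, size, id, 
bootable; sizes in MiB via -uM). The sizes are purely illustrative, and it 
needs root, so this is a sketch, not a recommendation:

```
# Fed to: sfdisk -uM /dev/sda   (needs root; illustration only)
# /boot, then /, then swap out at the edge of the used space
,512,83,*
,16384,83
,2048,82
```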


> 
> It would be far more efficient than a "swap file," and much, much
> easier to position via DeviceMapper, than inside of a filesystem.
> So this argument is absolutely a non-starter as well.
> If you understood how LVM2/DM works, I wouldn't need to go over this.
>  
>> It does not matter what OS you use, the above is true.
> 
> What is true?  That it is a true assumption, based on theoretical
> concepts of how the layout might be, in a meta-system that is not
> real?  Or worse yet, based on OS/2 from the 20th century, and not
> even modern systems (not even modern OS/2)?
> 
> Or is it more really the fact that one has little control over what
> a filesystem does?  I'd argue the latter.
> 
> At the same time, one _can_ lay out the physical extents (PEs) as
> optimally as possible with something like LVM2, and then layer the
> logical volumes (LVs) for filesystems and swap over those contiguous,
> although not always linear, extents.
> 
> So swap files, yet again, make less sense here too.


I have made it clear that I am not talking about enterprise systems.


One disk. Only one DM mapping possible, a linear one, and that's no 
different from having no LVM at all.



> 
>> Now, with default partitioning on any Linux distro I have
>> seen, where the user chooses "one partition for everything,"
>> there are two partitions.
>> One for the data, covering almost all the drive.
> 
> If you're really "anal" about performance, you don't mix these

I've made it clear that I am not. Modern PCs offer much better 
performance than their owners generally require.


> three types of files:  
> 1.  Static binaries/content (e.g., /, /usr)
> 2.  Temporary, highly fragmenting files/logs (e.g., /tmp, /var)
> 3.  Dynamic data of varying small/large sizes (i.e., data)
> 
> Extents-based filesystem designs (e.g., XFS) help in the case of
> data, but for the most part, segmenting #2 from #3 is highly
> recommended.  We're talking about performance here, right?

Wrong.


> 
>> One for the swap, either on the inside edge or the outside
>> edge of the disk, it makes little difference.
> 
> With LVM2/DM, one can change that.  In fact, even _after_ the physical
> volumes (PVs) are created, organized into volume groups (VGs) and the
> logical volumes (LVs) exist, one can re-organize the location of the
> Physical Extents (PEs) _live_, while the system is running.  ;)
> 
>> Now, I do not understand how Linux filesystems decide where
>> on the disk to create new files.
> 
> With LVM2/DM, you can tell it _exactly_ what PEs to use.  ;)
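Those two claims are real LVM2 features, for anyone following along. The 
device and VG names here are hypothetical, and the commands need root, so 
this is illustration only:

```
# Hypothetical names; needs root -- illustration only.
# Create a 100-extent LV on explicitly chosen PEs of /dev/sda2:
lvcreate -l 100 -n swap0 vg0 /dev/sda2:1000-1099
# Evacuate a range of PEs to another PV, live:
pvmove /dev/sda2:0-999 /dev/sdb1
```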
> 
>> btdt. I worked at the Australian Dept of Social Security
>> when it implemented the original Medibank in the 1970s. We
>> bought a Very Expensive Computer system (IBM's finest at the
>> time), and an Especially Expensive Disk to handle high I/O
>> traffic.
> 
> And I did missile defense in my former life, including maintaining
> every major platform under the Sun.  Does it apply to laying out
> managed volumes in the 21st century on Linux?  No.  ;)


I don't see the relevance of your previous life; I was contrasting my 
former life, where performance was important, with my current life, when 
it's almost completely irrelevant.




> If I don't know how to effectively tune VM in Linux, I might as
> well go back to being an engineer, working on avionics and telemetry.
> 
> Even on Red Hat Enterprise Linux release 5, we have ...
> 
> $ cat /etc/redhat-release ; ls /proc/sys/vm
> Red Hat Enterprise Linux Client release 5.4 (Tikanga)
> block_dump                 lowmem_reserve_ratio       page-cluster
> dirty_background_ratio     max_map_count              pagecache
> dirty_expire_centisecs     max_writeback_pages        panic_on_oom
> dirty_ratio                min_free_kbytes            percpu_pagelist_fraction
> dirty_writeback_centisecs  mmap_min_addr              swap_token_timeout
> drop_caches                nr_hugepages               swappiness
> flush_mmap_pages           nr_pdflush_threads         topdown_allocate_fast
> hugetlb_shm_group          overcommit_memory          vdso_enabled
> laptop_mode                overcommit_ratio           vfs_cache_pressure
> legacy_va_layout
> 
> I see _at_least_ a dozen (12) tunables in there that make all-the-
> difference.  As someone else pointed out earlier, "vm.swappiness" is #1.
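(For reference, that first knob is a one-line sysctl; the value here is only 
an example, not a recommendation:)

```
# /etc/sysctl.conf fragment -- example value only
vm.swappiness = 10
```

It can be applied on the fly with `sysctl -w vm.swappiness=10` and inspected 
with `cat /proc/sys/vm/swappiness`.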


I don't need expensive training to just add RAM if performance is a 
problem. I don't think I've had a high sustained workload in years, but 
sometimes there's a peak, such as when trying to use rsync to mirror a 
few Gbytes of Linux filesystem.




> I honestly think we're done here.  Systems differ.  If one doesn't
> know what they're doing with the Linux VM, then it really doesn't matter
> which distro, or what non-Linux experience, one has.
> 
> -- Bryan, only speaking on behalf of myself, from experience
> 
> FOOTNOTES:  
> 
> [1] I am also a long-time OS/2 sysadmin.  I was also an original NT

Your name seemed vaguely familiar, but I didn't place it.

> 3.1 beta tester, who was a technician at the largest installed base
> of the first, native NT application.  I saw OS/2 and NT early on.
> I also did BSD, SCO and SunOS (along with VMS) in those days.

I only ever used OS/2 at home.
> 
> Take this to heart, absolute 100% of PC assumptions based on OS/2
> and NT have absolute 0% application when it comes to POSIX (UNIX/
> Linux) platforms on the PC.  This cannot be stressed enough.

Bollocks.

When it gets down to it, they are writing data to disk and reading data 
from disk. The limiting factors come down to the physics of the system.

In another former life I was an ADABAS guru. After I left that and had a 
new job, I worked in another shop running ADABAS. There was a complaint: 
"Our ADABAS application is performing badly. Can you have a look at it?"

I had a look, and the I/O traffic was right up at the disk's specified 
limits. It simply could not sustain the workload. I made a few points, 
and that was it. It wouldn't have mattered what OS was running, or how 
clever it was. The ADABAS application was using one disk (the advice is 
to have its work file elsewhere).

Yeah, LVM would have been great for that.


> 
> [2] Most of the assumptions on OS/2 and NT, especially when it comes
> to volume management, are based on their legacy tie to DOS 1.0.  In
> fact, OS/2 LVM and NT LDM exist to overcome them, just like Linux's
> LVM.  Even NT has UNCs and "Anchors" to get away from drive letters.
> 
> The concept of a single filesystem aka "drive letter" is because DOS

A "drive letter" is of no relevance: it's simply a label attached to some 
storage. \\servername\sharename works as well for most applications, and 
that's little different from /net/someserver/someshare on *x.

> 1.0 didn't support directories, something that has plagued OS/2-NT
> for their life.  This concept of a single filesystem is always going
> to be the least inefficient.  Always has been.  Always will be.

"the least inefficient."



> 
> And modern, volume management makes the point moot.
> 


-- 

Cheers
John

-- spambait
1aaaaaaa at coco.merseine.nu  Z1aaaaaaa at coco.merseine.nu
-- Advice
http://webfoot.com/advice/email.top.php
http://www.catb.org/~esr/faqs/smart-questions.html
http://support.microsoft.com/kb/555375

You cannot reply off-list:-)



