Formatting a mail

Gordon Messmer yinyang at eburg.com
Tue Oct 16 21:37:48 UTC 2007


Steven W. Orr wrote:
> 
> Ok. I now believe it but I don't understand why. The ls command gets 
> globbed and grows to be too big.  But the for command line has 
> to also grow to the same size. Is it because the for loop is special?

No, it's because until the shell tries to call execve(), the "command 
line" is just memory.  The shell can allocate as much memory as it 
wants, within the constraints of ulimit and available memory.  Only when 
it tries to execute another process, by way of execve(), do the "command 
line" limits apply.

Consider it this way: "for" isn't a command in your $PATH.  "for" 
instructs the shell to do something internally.

It may be helpful to describe how commands are executed, in terms of the 
C calls involved (roughly; don't expect the following to be perfectly 
accurate).  When you enter 
"ls /*", your shell breaks the input into separate words, and stores 
each word as an element in an array.  It then examines each word to 
determine whether any are variables that need to be replaced, or globs 
that need to be expanded.  It will see that "/*" is a glob, and needs to 
be expanded, so it will search the filesystem for matching files using 
glob().  It will replace the item "/*" with the results from glob(), and 
will then have a larger array.  This array is subject to ulimit and 
available memory restrictions, but other than that, it's still just 
memory allocated with malloc().  Once the shell has finished expansion 
and replacement tasks, it will fork(), and the new process will call 
execve() with the array of command line arguments.  The kernel may 
reject this call for a number of reasons, each documented in the 
execve() man page.
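
The expand-first, exec-later order can be seen from bash itself.  This is 
only a rough sketch: the array name "words" is made up for illustration, 
and I've added -d to ls so that it lists the entries themselves rather 
than their contents.

```shell
#!/usr/bin/env bash
# Illustrative only: "words" is a hypothetical name, not anything bash uses.

# Building the array makes the shell split the words and expand the glob.
# Nothing has been executed yet; the result is just memory in the shell.
words=(ls -d /*)

echo "first word: ${words[0]}"
echo "words after expansion: ${#words[@]}"

# Only here does the shell fork() and the child call execve() with the
# expanded array as its argument list.
"${words[@]}" > /dev/null
```

The glob is already gone by the time ls starts; ls only ever sees the 
expanded names.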

A for loop in the shell will also use glob() to expand the glob given as 
command line input, but it does not call execve() to run the loop, so 
the limitation doesn't apply.

Having explained all of that, I wonder if it might not have been easier 
to explain that there is no limit on the length of a command you can 
input to the shell, only on the size of an argument list (and 
environment) for a new process. :)
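
If you want to see the difference for yourself, something like the 
following should do it.  It is a sketch under a few assumptions: that 
getconf ARG_MAX reports the limit execve() actually enforces on your 
system, that /bin/echo exists, and that you have enough free memory to 
hold an argument list about twice that size.  The variable names are 
made up for the example.

```shell
#!/usr/bin/env bash
# Compare a shell builtin with an external command on an oversized
# argument list. All variable names here are illustrative.

limit=$(getconf ARG_MAX)      # limit on argv + environment, in bytes
count=$(( limit / 4 ))        # each argument costs 8 bytes ("1234567" + NUL),
                              # so the list totals about 2 * ARG_MAX

# Build the list in shell memory. The words contain no glob characters,
# so unquoted word splitting is safe here.
args=( $(yes 1234567 | head -n "$count") )

# Builtin echo runs inside the shell: no execve() with the big list.
builtin_count=$(( $(echo "${args[@]}" | wc -w) ))

# /bin/echo needs fork() + execve(); the kernel should reject the call.
/bin/echo "${args[@]}" > /dev/null 2>&1
external_status=$?

echo "ARG_MAX=$limit arguments=$count"
echo "builtin echo handled $builtin_count words"
echo "/bin/echo exit status: $external_status"
```

The builtin handles every word, while /bin/echo should fail with a 
non-zero status ("Argument list too long" if you leave stderr alone).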

> Try this 
> 
> $ for n in /*/*/*/*/* ; do echo $n ; done | wc
> 
> Vs.
> 
> $ bash -c "for n in /*/*/*/*/* ; do echo \$n ; done | wc"
> 
> Now the stars get expanded by the outer bash and when the inner bash 
> starts, he doesn't know that there ever were any stars.

No, globs in double quotes aren't expanded.  The "bash" you start with 
-c, not your interactive shell, will evaluate the glob.
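
A quick way to check what the inner bash would receive, without running 
it, is to print the arguments instead.  show_args below is a throwaway 
helper I made up for this; it is not a standard command.

```shell
#!/usr/bin/env bash
# show_args is a hypothetical helper: it prints each argument it
# receives on its own line, wrapped in angle brackets.
show_args() { printf '<%s>\n' "$@"; }

# Same words as the "bash -c" command line, but printed instead of run.
# The double quotes keep the glob literal, and \$n becomes $n.
quoted=$(show_args bash -c "for n in /* ; do echo \$n ; done")
echo "$quoted"

# Unquoted, the calling shell expands the glob before the call.
unquoted_count=$(( $(show_args /* | wc -l) ))
echo "unquoted /* became $unquoted_count arguments"
```

The third argument comes out with /* intact, which is what the inner 
bash would be handed.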

> The funny part is 
> that in the new case the output numbers are different.

Perhaps the contents of /proc had changed.  wc's counts shouldn't be 
significantly different.

> Note that we are 
> exec'ing a new bash and that that bash only starts after the terminal 
> session has expanded the stars. AFAICT, we seem to be exec'ing an 8M 
> commandline which is bigger than the 130K in limits.h

You can test this sort of thing with strace:

strace -s 512 -e execve bash -c "for n in /* ; do echo \$n ; done | wc"

If the glob were performed in your shell, you'd see a very large 
argument passed to execve().



