[Linux-cluster] Reduced performance question

Benjamin Marzinski bmarzins at redhat.com
Wed Jan 4 21:36:11 UTC 2006


On Mon, Dec 19, 2005 at 08:37:56AM +0200, Raz Ben-Jehuda(caro) wrote:
>    i have been measuring the performance of my cluster.
>    It is composed of 4 machines, each machine exports 1 sata disk (maxtor
>    maxline III),
>    and these machines are connected by 1 Gbps ethernet.
> 
>    Performance :
>    I am getting about 40% decrease in performance when i simply read a file.
> 
>    1. Would switching to infiniband make it faster ?

I couldn't tell you offhand.  If you are saturating your network bandwidth,
then anything to increase your bandwidth will definitely increase performance.
If not, then it's probably a toss-up.
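
A quick way to check whether the network is the limit is to compare the measured read throughput against the raw link rate. The sketch below only does that arithmetic; the function name and the 45 MB/s figure are made up for illustration and are not measurements from this cluster.

```python
def link_utilization(measured_mb_per_s: float, link_gbps: float = 1.0) -> float:
    """Fraction of the raw link rate used by the measured throughput."""
    # 1 Gbps is 1000 Mbit/s, i.e. 125 MB/s before any protocol overhead.
    link_mb_per_s = link_gbps * 1000 / 8
    return measured_mb_per_s / link_mb_per_s


# Hypothetical measurement: 45 MB/s over a 1 Gbps link.
utilization = link_utilization(45.0)
print(f"{utilization:.0%} of the raw link rate")
```

If the result is well below 100%, the link is probably not what is holding things back, and a faster interconnect is unlikely to help much on its own.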
 
>    2. What is GNBD IO model ?

the gnbd driver works just like any block device driver, except that instead
of writing the data to a locally attached device, it sends the data to the
server over the network, with a small (28 byte) header attached.  Once the
server has written the data to the underlying storage, a reply header is sent
back. For reads, only a header is sent to the server, and the server sends
back the data with a reply header attached. 
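
To make the framing concrete, here is a rough sketch of what goes over the wire. The field layout, magic value, and command numbers are assumptions chosen so that the request header comes out at 28 bytes; they are not taken from the gnbd source.

```python
import struct

# Assumed layout (not the real gnbd one): magic, command, 8-byte handle,
# byte offset, length. Big-endian with no padding gives 4+4+8+8+4 = 28 bytes.
REQUEST_FMT = ">II8sQI"
HEADER_SIZE = struct.calcsize(REQUEST_FMT)

MAGIC = 0x12345678           # placeholder value
CMD_READ, CMD_WRITE = 0, 1   # placeholder command numbers


def pack_request(cmd, handle, offset, length, payload=b""):
    """Build one request: header only for reads, header plus data for writes."""
    header = struct.pack(REQUEST_FMT, MAGIC, cmd, handle, offset, length)
    return header + (payload if cmd == CMD_WRITE else b"")


def unpack_request(message):
    """Split a request back into its header fields and any trailing data."""
    _magic, cmd, handle, offset, length = struct.unpack_from(REQUEST_FMT, message)
    return cmd, handle, offset, length, message[HEADER_SIZE:]


read_msg = pack_request(CMD_READ, b"req00001", 4096, 512)
write_msg = pack_request(CMD_WRITE, b"req00002", 4096, 512, b"\0" * 512)
print(len(read_msg), len(write_msg))
```

A read request is just the header, while a write request carries the header followed by the data, so the per-request overhead on the wire is small either way.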

On the server, there is one thread for each client/device pair. This thread
simply looks at the header, performs the necessary IO on the local device,
and sends back a reply header, along with the data for read requests. If you
have cached mode on, the server will go through the cache, and use readahead.
If you don't, the server will use direct IO to the device.  At any rate, in
order to guarantee consistency in case of a crash, the server will not send
the reply until the data is on disk (the underlying device is opened O_SYNC).
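
The per-client loop described above can be sketched roughly as follows. This is an illustration and not the gnbd server code: a temporary file stands in for the exported device, the request tuples are a made-up format, and neither the cached nor the direct IO path is modelled. The only point is that each request is finished, synchronously, before the next one is looked at.

```python
import os
import tempfile


def serve_requests(fd, requests):
    """Handle requests strictly one at a time, like a single server thread."""
    replies = []
    for cmd, offset, length, payload in requests:
        if cmd == "write":
            # fd was opened with O_SYNC, so pwrite does not return until the
            # kernel reports the data as written to the underlying storage.
            written = os.pwrite(fd, payload, offset)
            replies.append(("write", written))
        else:
            replies.append(("read", os.pread(fd, length, offset)))
    return replies


# A temporary file stands in for the exported block device.
tmp_fd, path = tempfile.mkstemp()
os.close(tmp_fd)
fd = os.open(path, os.O_RDWR | os.O_SYNC)
try:
    replies = serve_requests(fd, [
        ("write", 0, 5, b"hello"),
        ("read", 0, 5, b""),
    ])
finally:
    os.close(fd)
    os.unlink(path)
print(replies)
```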

Most of the slowness comes from the server. There is only a single thread
per client/device pair, so the server will not start on the next request until
the current one completes to disk.  The best solution to this would probably
be async IO.  That way, the thread could pass the IO to the underlying device
as quickly as it comes in (up until it runs short of memory) and then reply
to the client as each IO completes to disk.
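
As a toy model of that idea, the sketch below submits every request immediately from a single thread and "replies" in completion order. The disk is simulated with sleeps, and the latencies and function names are invented for the example; real async IO against a block device would look different.

```python
import asyncio


async def do_io(request_id, latency, completed):
    # Stand-in for handing the IO to the device and waiting for it to finish.
    await asyncio.sleep(latency)
    # This is the point where the server would send the reply header.
    completed.append(request_id)


async def serve(requests):
    completed = []
    # Submit everything up front instead of waiting for the previous request.
    await asyncio.gather(*(do_io(rid, lat, completed) for rid, lat in requests))
    return completed


# Made-up latencies in seconds: request 1 is fastest, request 0 slowest.
order = asyncio.run(serve([(0, 0.05), (1, 0.01), (2, 0.03)]))
print(order)
```

Replies go out in the order the IOs finish rather than the order they arrived, and the total time is bounded by the slowest request instead of the sum of all of them.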

>        2.1  Does it use async io ?
nope. It should.
>              multithreaded IO ?
nope. With async IO, I don't think multiple threads would be necessary.
>              sync IO ?
Yup.
>       
>         2.2 When a GNBD serves a node, does it know the amount of data it
>              needs to fetch right from the start ?
>              Meaning, if a node asks for a 1MB buffer, would GNBD read a 1 MB
>              buffer or several small chunks ?

Since the server only processes one request at a time, it only needs as much
memory as the largest request that the client can send.  The kernel only
coalesces so many sectors into a single request.  The gnbd device driver can
handle that default limit, so this should not be a factor.  If async io was
added, you could have arbitrarily many requests being processed by the server
at once, so you would either need to preallocate a fixed amount of memory,
and wait to process a request until there was some memory free, or you would
need to allocate memory dynamically, possibly up to some limit.
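
The first option (a fixed pool, with requests waiting until memory is free) could look something like the sketch below. The class, its method names, and the sizes are all hypothetical; it only does the bookkeeping and performs no IO.

```python
from collections import deque


class BufferBudget:
    """Fixed pool of buffer memory shared by all in-flight requests."""

    def __init__(self, total_bytes):
        self.free_bytes = total_bytes
        self.waiting = deque()  # sizes of requests that are not admitted yet

    def submit(self, nbytes):
        """Admit the request if memory is free, otherwise queue it in order."""
        if not self.waiting and nbytes <= self.free_bytes:
            self.free_bytes -= nbytes
            return True
        self.waiting.append(nbytes)
        return False

    def complete(self, nbytes):
        """Give back a finished request's buffer and admit queued requests
        that now fit. A request larger than the whole pool would never be
        admitted, so the pool must cover the largest possible request."""
        self.free_bytes += nbytes
        admitted = []
        while self.waiting and self.waiting[0] <= self.free_bytes:
            size = self.waiting.popleft()
            self.free_bytes -= size
            admitted.append(size)
        return admitted


KIB = 1024
budget = BufferBudget(total_bytes=1024 * KIB)
first = budget.submit(512 * KIB)      # fits
second = budget.submit(512 * KIB)     # fits, pool is now empty
third = budget.submit(256 * KIB)      # has to wait for memory
resumed = budget.complete(512 * KIB)  # frees memory, waiting request starts
print(first, second, third, resumed)
```

The dynamic variant would replace the fixed pool with allocation on demand, keeping the same kind of check against an upper limit.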

This would be a great performance enhancement, but since it's only a matter
of time until there will be rock solid software based iSCSI targets, it may
never happen. :(

-Ben

>    Thanks
>    --
>    Raz




