birger at uib.no
Wed May 18 21:51:05 UTC 2005
>On Tue, May 17, 2005 at 11:39:08PM -0600, Frank L. Setinsek wrote:
>> May 17 21:53:52 compute-0-2.local kernel: mptscsih: ioc0: WARNING - Device
>> (0:0:1) reported QUEUE_FULL!
>> May 17 21:53:52 compute-0-2.local kernel: SCSI disk error : host 0 channel 0
>> id 0 lun 1 return code = 440b0000
I would suspect this is an issue with tagged queueing.
Tagged queueing lets a host tag each I/O request with an identifier so
the I/O subsystem can answer the requests in a different order. The host
queries the device to find out how large the queue can be. If you have
several hosts, all assuming they have the whole queue to themselves, they
could easily fill it, and the device then answers with QUEUE_FULL.
Read the documentation for your device, and see what the tagged queue
depth is. See if it can be configured. Then find out how you can set the
queue depth in your SCSI driver. Some drivers let you set it per target
in a config file. Set the max queue depth for the device in the SCSI
driver on each node to 1/6 of the total queue depth on the device (since
you have a 6-node cluster).
Of course the easy test would be to disable tagged queueing completely,
but the performance hit can be bad. It would quickly show whether the
problem is related to tagged queueing.
Remember that you will have to reconfigure the queue depth on all nodes
before you can add a new node... So you may want to set the depth to 1/7
of the total so there is room for one more if these nodes run something
you cannot restart often.
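As a rough sketch of the arithmetic above (not tied to any particular
driver), the per-node depth could be worked out like this. The function
name and the device queue depth of 256 are made up for illustration; take
the real figure from your device documentation.

```python
def per_node_queue_depth(device_queue_depth: int, nodes: int,
                         spare_nodes: int = 0) -> int:
    """Split a shared device queue evenly between cluster nodes.

    device_queue_depth: total tagged queue depth of the device
                        (hypothetical value, read it from the device docs).
    nodes:              number of nodes currently attached.
    spare_nodes:        extra shares reserved for nodes added later.
    """
    if device_queue_depth < 1 or nodes < 1 or spare_nodes < 0:
        raise ValueError("queue depth and node count must be positive")
    shares = nodes + spare_nodes
    # Round down so the sum over all nodes never exceeds the device queue.
    # The floor of 1 means the total can still exceed the device queue if
    # there are more shares than queue slots.
    return max(1, device_queue_depth // shares)


# Assumed device queue depth of 256, shared by the 6-node cluster.
print(per_node_queue_depth(256, 6))                 # 1/6 of the total
print(per_node_queue_depth(256, 6, spare_nodes=1))  # 1/7, room for one more
```

The result is what you would then enter as the max queue depth in the
SCSI driver configuration on every node.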