[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: [libvirt] [PATCH 1/2] storage: Resolve storage driver crash




On 11/07/2017 04:18 AM, Cedric Bosdonnat wrote:
> On Mon, 2017-11-06 at 15:53 -0500, John Ferlan wrote:
>> Resolve a storage driver crash as a result of a long running
>> storageVolCreateXML when the virStorageVolPoolRefreshThread is
>> run as a result of a storageVolUpload complete and ran the
>> virStoragePoolObjClearVols without checking if the creation
>> code was currently processing a buildVol after incrementing
>> the driver->asyncjob count.
>>
>> The refreshThread needs to wait until all creation threads
>> are completed so as to not alter the volume list while the
>> volume is being built.
>>
>> Crash from valgrind is as follows (with a bit of editing):
>>
>> ==21309== Invalid read of size 8
>> ==21309==    at 0x153E47AF: storageBackendUpdateVolTargetInfo
>> ==21309==    by 0x153E4C30: virStorageBackendUpdateVolInfo
>> ==21309==    by 0x153E52DE: virStorageBackendVolRefreshLocal
>> ==21309==    by 0x153DE29E: storageVolCreateXML
>> ==21309==    by 0x562035B: virStorageVolCreateXML
>> ==21309==    by 0x147366: remoteDispatchStorageVolCreateXML
>> ...
>> ==21309==  Address 0x2590a720 is 64 bytes inside a block of size 336 free'd
>> ==21309==    at 0x4C2F2BB: free
>> ==21309==    by 0x54CB9FA: virFree
>> ==21309==    by 0x55BC800: virStorageVolDefFree
>> ==21309==    by 0x55BF1D8: virStoragePoolObjClearVols
>> ==21309==    by 0x153D967E: virStorageVolPoolRefreshThread
>> ...
>> ==21309==  Block was alloc'd at
>> ==21309==    at 0x4C300A5: calloc
>> ==21309==    by 0x54CB483: virAlloc
>> ==21309==    by 0x55BDC1F: virStorageVolDefParseXML
>> ==21309==    by 0x55BDC1F: virStorageVolDefParseNode
>> ==21309==    by 0x55BE5A4: virStorageVolDefParse
>> ==21309==    by 0x153DDFF1: storageVolCreateXML
>> ==21309==    by 0x562035B: virStorageVolCreateXML
>> ==21309==    by 0x147366: remoteDispatchStorageVolCreateXML
>> ...
>>
>> Signed-off-by: John Ferlan <jferlan redhat com>
>> ---
>>  src/storage/storage_driver.c | 9 +++++++++
>>  1 file changed, 9 insertions(+)
>>
>> diff --git a/src/storage/storage_driver.c b/src/storage/storage_driver.c
>> index b0edf9f885..5e920fc14c 100644
>> --- a/src/storage/storage_driver.c
>> +++ b/src/storage/storage_driver.c
>> @@ -2257,6 +2257,7 @@ virStorageVolPoolRefreshThread(void *opaque)
>>      virStorageBackendPtr backend;
>>      virObjectEventPtr event = NULL;
>>  
>> + retry:
>>      storageDriverLock();
>>      if (cbdata->vol_path) {
>>          if (virStorageBackendPloopRestoreDesc(cbdata->vol_path) < 0)
>> @@ -2270,6 +2271,14 @@ virStorageVolPoolRefreshThread(void *opaque)
>>      if (!(backend = virStorageBackendForType(def->type)))
>>          goto cleanup;
>>  
>> +    /* Some thread is creating a new volume in the pool, we need to retry */
>> +    if (virStoragePoolObjGetAsyncjobs(obj) > 0) {
>> +        virStoragePoolObjUnlock(obj);
>> +        storageDriverUnlock();
>> +        usleep(100 * 1000);
>> +        goto retry;
>> +    }
>> +
>>      virStoragePoolObjClearVols(obj);
>>      if (backend->refreshPool(NULL, obj) < 0)
>>          VIR_DEBUG("Failed to refresh storage pool");
> 
> ACK, does the job here. I'm rather surprised to see you implementing it
> with sleep, while you pointed me towards virCondWait yesterday. But using
> sleep is a way less intrusive change.
> 

Well you only asked about an alternative mechanism and condition
variable was the first thing that popped into my mind; however, as I got
to actually doing more thinking about it - asyncjobs is not blocking
multiple creates from occurring; whereas, a condition variable would be
waiting for one thing to complete.

My response to Jan also lists 2 other alternatives. This was just the
"easiest" of the 3.  If there's other ideas, I'm open to suggestions.

John

> --
> Cedric
> 


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]