[libvirt] [PATCH v1 09/23] lock_protocol: Add two new remote procedures

Fri Oct 16 19:58:07 UTC 2015

On 10/16/2015 10:54 AM, Michal Privoznik wrote:
> On 16.10.2015 09:17, Peter Krempa wrote:
>> On Mon, Oct 12, 2015 at 12:25:54 +0200, Michal Privoznik wrote:
>>> These procedures will be used to store and bring back security
>>> labels. So far, the idea is that tuple (path, model, label) is
>>> enough. Well, certainly for DAC and SELinux. The functions are:
>>
>> I'm afraid that a 'path' per se isn't enough to express storage
>> locations in a unique way. That said, a string is fine, but I'd
>> rather call it 'identifier' or something similar and explicitly
>> document some formats for other storage systems that might use
>> security labeling but don't have a local representation on the host.
>>
>> I'd imagine something like:
>> STORAGE_SYSTEM_UUID:VOLUME_UUID
> 
> I can imagine this, okay.
> 
>> and perhaps
>> HOST_UUID:/path for local files
> 
> Hm... what good would it do to store the UUID along with the path? I
> mean, we can't know whether two paths from two different hosts refer to
> the same file anyway.
> 
>>
>> One other thing to consider is that if this approach is used across
>> multiple hosts, paths that are equal string-wise might not refer to
>> the same file. Not honoring that would have security implications.
> 
> What security implications do you have in mind?
> 
> On the other hand, I just realized that this approach will not fly. I
> mean, virtlockd is running per host. So consider a file on NFS share,
> and to make things more complicated assume it's accessible under
> different paths from two different hosts. There is still a race between
> those two virtlockd-s - they will not mutually exclude each other and
> can therefore store already chown()-ed ownership info. For instance:
> 
> host1: stat()
> host1: store ownership (e.g. 100:20)
> host1: chown (e.g. to 20:20)
> 
> host2: stat()
> host2: store ownership (20:20)
> 
> host1: recall ownership (100:20)
> host1: restore ownership (100:20)
> 
> host2: recall ownership (20:20)
> host2: restore ownership (20:20)
> 
> Both hosts think that they are the last ones to use the file, therefore
> they restore ownership. In the end, however, the original ownership
> (100:20) is not preserved.
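
The interleaving above can be replayed in a few lines. This is a
simulation only: a dict stands in for the server's view of the file, and
the Host class and run_race() are hypothetical names, not anything from
virtlockd. For simplicity both hosts use the same path string here, even
though in the scenario they might see the file under different paths.

```python
# Simulation only: no real stat()/chown() calls are made.

class Host:
    """One host with its own private record, like a per-host virtlockd."""

    def __init__(self, name, server):
        self.name = name
        self.server = server      # shared state: path -> (uid, gid)
        self.remembered = {}      # private state: path -> (uid, gid)

    def remember(self, path):
        # stat() + store ownership
        self.remembered[path] = self.server[path]

    def chown(self, path, owner):
        self.server[path] = owner

    def restore(self, path):
        # recall + restore ownership
        self.server[path] = self.remembered.pop(path)


def run_race():
    path = "/nfs/disk.img"        # made-up path for illustration
    server = {path: (100, 20)}    # original ownership
    host1 = Host("host1", server)
    host2 = Host("host2", server)

    host1.remember(path)          # host1 stores 100:20
    host1.chown(path, (20, 20))
    host2.remember(path)          # host2 stores the already-changed 20:20
    host1.restore(path)           # file is briefly back at 100:20
    host2.restore(path)           # host2 puts 20:20 back
    return server[path]
```

Since neither host can see the other's record, the last restore wins and
the file is left with whatever host2 happened to observe.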

Ah - the age-old problem of networked and shared storage...  And it's
compounded by two hosts that don't know they are modifying the same
resource. If realistically the server is the 'only' one to know, then
whose problem is this to solve?

To say nothing of choosing NFS...  For this one you also have to consider
root_squash vs. no_root_squash environments. Any attempt to change
anything is managed by the server, but for the sake of your argument
let's say it's set up to allow "no_root_squash" - meaning clients can
have more control...

In your scenario, H2 never chown()s... So was it trying to set (20:20)
- in which case, why would it do that, since the stat it would get would
already show (20:20)... Or does it try to set, say, (40:20)?  If the
latter, then H1 is going to have access problems as soon as H2 is
successful. If H1 then goes to restore and finds the current stat of
(40:20), but it believes it set (20:20), then should it restore?  Or
does it "try to" restore anyway and then let virtlockd hold onto the
restore until H2 is done? Eventually H2 will restore to (20:20), at
which time whatever is waiting to restore H1's value finds the (20:20)
and can finally successfully restore back to (100:20)...  Conversely, if
H2 did save and chown (20:20), but when it goes to restore (20:20) it
finds (100:20), you wouldn't want some thread hanging around...  However,
in this case, if the "store" == "restore" and the current is something
else, then perhaps the restore gets dropped.
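
To make that less hand-wavy, here is one possible reading of the above as
a decision function. It is only a sketch of the idea, not a proposal for
the actual API: Action, decide_restore() and the three arguments (what
was stored, what this host set, what stat() shows now) are all names made
up for illustration.

```python
from enum import Enum


class Action(Enum):
    RESTORE = "restore"   # put the stored ownership back now
    DEFER = "defer"       # hold the restore until the other host is done
    DROP = "drop"         # forget the restore entirely


def decide_restore(stored, applied, current):
    """Decide what to do with a pending restore.

    stored  -- ownership remembered before this host touched the file
    applied -- ownership this host set (or believes it set)
    current -- ownership that stat() reports right now
    """
    if current == applied:
        # Nobody changed the file behind our back.
        return Action.RESTORE
    if stored == applied:
        # "store" == "restore" and the current value is something else:
        # this host never really changed anything, so drop the restore.
        return Action.DROP
    # Someone else changed the file after us; wait for them to finish.
    return Action.DEFER
```

With the numbers from the thread: H1 with stored (100, 20), applied
(20, 20) and current (40, 20) gets DEFER, while H2 with stored (20, 20),
applied (20, 20) and current (100, 20) gets DROP.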

Maybe a few drinks will help the thought process more ;-)

John
> 
> One way out of this could be that, for virtlockd to work properly, we
> already require the lockspace to be on a shared mountpoint accessible by
> both hosts. So instead of keeping seclabel space in memory, we can load
> & store it within a file on that mountpoint and guard accesses via
> flock(). But this is rather unpleasant as we would have to parse and
> format the whole seclabel space just to update a refcounter of some entry.
> 
> The other approach would be to have a seclabel space as a directory on
> the shared mountpoint and create some file (with internal structure) per
> path remembered. The internal structure of the file would then contain
> tuple (model, seclabel, refcount) for each model. When updating the
> refcount, only that file would need to be flock()-ed, getting us much
> higher throughput. But we would still need to parse and format a file on
> each access (even though the file would be much smaller).
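
A minimal sketch of this per-path-file variant, assuming a JSON file per
remembered path that maps model -> {label, refcount}. The on-disk format,
the hashed file names and the remember()/recall() helpers are all
assumptions made for illustration; nothing here is existing virtlockd
code. Whether flock() really excludes other hosts depends on the
filesystem behind the shared mountpoint.

```python
import fcntl
import hashlib
import json
import os


def _entry_path(space_dir, path):
    # One file per remembered path; hashing keeps the file name flat.
    return os.path.join(space_dir, hashlib.sha256(path.encode()).hexdigest())


def _update(space_dir, path, mutate):
    """Lock one entry file, apply mutate() to its contents, write it back."""
    fd = os.open(_entry_path(space_dir, path), os.O_RDWR | os.O_CREAT, 0o600)
    with os.fdopen(fd, "r+") as f:
        fcntl.flock(f, fcntl.LOCK_EX)      # released when the file is closed
        raw = f.read()
        entry = json.loads(raw) if raw else {}
        result = mutate(entry)
        f.seek(0)
        f.truncate()
        json.dump(entry, f)
        f.flush()
        return result


def remember(space_dir, path, model, label):
    """Record label for (path, model); later callers only bump the refcount."""
    def mutate(entry):
        rec = entry.setdefault(model, {"label": label, "refcount": 0})
        rec["refcount"] += 1
        return rec["label"]
    return _update(space_dir, path, mutate)


def recall(space_dir, path, model):
    """Drop one reference; return the original label only for the last user."""
    def mutate(entry):
        rec = entry.get(model)
        if rec is None:
            return None
        rec["refcount"] -= 1
        if rec["refcount"] > 0:
            return None                    # still in use by someone else
        del entry[model]
        return rec["label"]
    return _update(space_dir, path, mutate)
```

In the race described earlier, the second remember() would not overwrite
100:20 with 20:20, and only the last recall() would hand the original
label back. This still assumes both hosts use the same key for the same
file, which is where a host-independent identifier instead of a plain
path would come in.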
> 
> I'm out of ideas here. Does anybody have any?
> 
> Michal