[libvirt] [PATCH v2 11/38] Introduce virStreamSkip

Daniel P. Berrange berrange at redhat.com
Mon May 15 08:56:58 UTC 2017


On Mon, May 15, 2017 at 10:54:03AM +0200, Michal Privoznik wrote:
> On 05/15/2017 10:25 AM, Daniel P. Berrange wrote:
> > On Fri, May 12, 2017 at 09:29:27AM +0200, Michal Privoznik wrote:
> >> On 05/05/2017 04:48 PM, Daniel P. Berrange wrote:
> >>> On Fri, May 05, 2017 at 01:25:34PM +0200, Michal Privoznik wrote:
> >>>> On 05/04/2017 11:29 PM, John Ferlan wrote:
> >>>>>
> >>>>>
> >>>>> On 04/20/2017 06:01 AM, Michal Privoznik wrote:
> >>>>>> This API can be used to tell the other side of the stream to skip
> >>>>>
> >>>>> s/can be/is  (unless it can be used for something else ;-))
> >>>>>
> >>>>>> some bytes in the stream. This can be used to create a sparse
> >>>>>> file on the receiving side of a stream.
> >>>>>>
> >>>>>> It takes just one argument @length, which says how big the hole
> >>>>>> is. Since our streams are not rewindable like regular files, we
> >>>>>> don't need @whence argument like seek(2) has.
> >>>>>
> >>>>> lseek is an implementation detail...  However, it could be stated that
> >>>>> the skipping would be from the current point in the file forward by some
> >>>>> number of bytes.  It's expected to be used in conjunction with code that
> >>>>> is copying over the real (or non-zero) data and should be considered an
> >>>>> optimization over sending zere data segments.
> >>>>>
> >>>>>>
> >>>>>> Signed-off-by: Michal Privoznik <mprivozn at redhat.com>
> >>>>>> ---
> >>>>>>  include/libvirt/libvirt-stream.h |  3 +++
> >>>>>>  src/driver-stream.h              |  5 ++++
> >>>>>>  src/libvirt-stream.c             | 57 ++++++++++++++++++++++++++++++++++++++++
> >>>>>>  src/libvirt_public.syms          |  1 +
> >>>>>>  4 files changed, 66 insertions(+)
> >>>>>>
> >>>>>
> >>>>> While it would be unused for now, should @flags be added.  Who knows
> >>>>> what use it could have, but avoids a new Flags API, but does cause a few
> >>>>> other wording changes here.
> >>>>
> >>>> Ah sure. We should have @flags there. Good point.
> >>>>
> >>>>>
> >>>>> Perhaps it's just me - but "Skip" and "HoleSize" just seem awkward.
> >>>>> Would "virStreamSetSkip" and "virStreamGetSkip" be more apropos? (or
> >>>>> s/Skip/HoleSize/ - ewww). Names would then follow our more recent
> >>>>> function naming guidelines. I think I dislike the HoleSize much more
> >>>>> than the Skip.
> >>>>
> >>>> SetSkip and GetSkip sound wrong to me instead :D
> >>>>
> >>>>>
> >>>>>> diff --git a/include/libvirt/libvirt-stream.h b/include/libvirt/libvirt-stream.h
> >>>>>> index bee2516..4e0a599 100644
> >>>>>> --- a/include/libvirt/libvirt-stream.h
> >>>>>> +++ b/include/libvirt/libvirt-stream.h
> >>>>>> @@ -50,6 +50,9 @@ int virStreamRecvFlags(virStreamPtr st,
> >>>>>>                         size_t nbytes,
> >>>>>>                         unsigned int flags);
> >>>>>>  
> >>>>>> +int virStreamSkip(virStreamPtr st,
> >>>>>> +                  unsigned long long length);
> >>>>>
> >>>>> Was there consideration for using 'off_t' instead of ULL? I know it's an
> >>>>> implementation detail of virFDStreamData and lseek() usage, but it does
> >>>>> hide things... IDC either way.
> >>>>
> >>>> The problem with off_t is that it is signed type, while ULL is unsigned.
> >>>> There's not much point in sending a negative offset, is there?
> >>>> Moreover, we use ULL for arguments like offset (not sure really why).
> >>>> Frankly, I don't really know why. Perhaps some types don't exist everywhere?
> >>>
> >>> If anything, we would use  size_t, for consistency with the Send/Recv
> >>> methods. 
> >>
> >> So I've given this some though and ran some experiments. On a 32bit arch
> >> I've found this:
> >>
> >> long long 8 signed
> >> size_t 4 unsigned
> >> ssize_t 4 signed
> >> off_t 4 signed
> >>
> >> So size_t is 4 bytes long and long long is 8 bytes. This got me
> >> thinking, size_t type makes sense for those APIs where we need to
> >> address individual bytes. But what would happen if I have the following
> >> file on a 32 bit arch:
> >>
> >> [2MB data] -> [5GB hole] -> [2M data]
> >>
> >> The hole does not fit into size_t, but it would fit into long long. On
> >> the other hand, we are very likely to hit lseek() error as off_t is 4
> >> bytes also (WTF?!). On a 64 bit arch everything is as expected:
> > 
> > Were you testing libvirt, or testing a standalone demo program ?
> 
> Standalone demo program, that basically does this:
> 
> #define PRINT_SIZE(x)   printf( #x " %d %s\n", sizeof(x), (x)-1 < 0 ?
> "signed" : "unsigned")
> 
> PRINT_SIZE(long long);
> PRINT_SIZE(size_t);
> PRINT_SIZE(off_t);
> 
> 
> > 
> > Libvirt should be defining the macro that enables large file support,
> > which turns off_t into a 64-bit type.  So in fact, we would actually
> > be wrong to use size_t - off_t or long long is actually what we need
> > here.
> 
> I think we really need just long long. Consider that even though libvirt
> enables large file support internally, we don't enable it at the header
> files level. It's responsibility of developers of mgmt application to do
> that. Having said that, if we had off_t in public API we can't possibly
> know its size as it depends on something out of our control. Therefore,
> I see long long as our only option.

Yes, if we used off_t in the file, we'd be ABI-incompatible with
programs that don't enable large file. So using long long protects
us from that incompatibility - size_t is the only safe magic sized
type we can use.

> Now, question is whether we want signed or unsigned long long. I don't
> have an opinion about that. On one hand, off_t is signed, but that's
> because lseek() can seek backwards. We don't have that in our streams.
> Yet. On the other hand, our streams are different to regular files. I
> view them as a unidirectional pipe. With some extensions (e.g. sparse
> messages). lseek() doesn't work over pipes, does it. But then again,
> long long might be more future proof, if we will ever want to assign a
> meaning to negative seeks.

I guess we might as well use signed long long - we aren't gaining anything
by using unsigned long long, since the value will be truncated to signed
long long when we use it with lseek().

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




More information about the libvir-list mailing list