[Libguestfs] Proposed new file apis
Matthew Booth
mbooth at redhat.com
Mon Aug 23 16:29:28 UTC 2010
On 23/08/10 17:04, Richard W.M. Jones wrote:
> On Mon, Aug 23, 2010 at 04:58:38PM +0100, Matthew Booth wrote:
>> On 23/08/10 16:23, Richard W.M. Jones wrote:
>>> BTW you need two spaces after full stops.
>>
>> Out of curiosity is that a technical or a stylistic requirement?
>> Stylistically, double spaces after a full stop have always seemed
>> anachronistic to me.
>
> It's a stylistic one.
Meh. Done anyway.
>>>>> +hpread returns a NULL C<content> on error.");
>>>>> +
>>>>> + ("hwrite", (RErr, [Int "handle"; BufferIn "content"]), 273,
>>>>> + [ProtocolLimitWarning], [],
>>>>> + "write data to an open handle",
>>>>> + "\
>>>>> +This command writes all the data in C<content> to C<handle>. It
>>>>> +returns an error if this fails.");
>>> So we're definitely disallowing partial writes, even though in some
>>> future version we might allow writes to non-file handles? This API
>>> won't allow partial writes.
>>
>> Yes. I think reads and writes are quite different in this respect.
>> You always know exactly what you want to write, but you may not know
>> what can be read. If this assumption is wrong, now's the time to
>> disagree.
>
> I think we should change pwrite in that case. pwrite only works on
> files, and AFAICT there is never a situation where a partial write is
> possible for a file, so we should turn pwrite into a full write and
> change the documentation accordingly.
Sure.
>> Are you saying that a String can be magically turned into an
>> enumerated type somehow? I was trying to work out how to use an
>> enumerated type.
>
> Sadly not. However we have a long history of using strings instead of
> enumerated types throughout the API. In any case, either 0/1/2 needs
> to be documented, or this should be a string. It should not be an
> undocumented int.
Ok.
>>>>> +On success, hseek returns the new offset from the beginning of
>>>>> +the file. It returns an offset of -1 on failure.");
>>>>> +
>>>>> + ("htruncate", (RErr, [Int "handle"; Int64 "size"]), 276, [],
>>>>> + [],
>>>>> + "truncate a file",
>>>>> + "\
>>>>> +This command sets the size of the file referred to by C<handle>
>>>>> +to C<size>. If the file was previously larger than C<size>, the
>>>>> +extra data will be lost. If the file was previously smaller than
>>>>> +C<size>, the file will be zero-filled. The latter case will
>>>>> +typically create a sparse file.");
>>> Since we already have truncate and allocate calls, I wonder what the
>>> benefit is of having handle versions. I mean, these are extremely
>>> infrequent operations compared to hread/hwrite, so I doubt there's any
>>> optimization benefit in adding these, and if we add these, why not
>>> other very infrequent ops.
>>
>> htruncate is required when streaming a file with sparse sections. i.e.
>>
>> h = hopen_file("/foo");
>> hwrite(h, "foo");
>> htruncate(h, hseek(h, 1024, CUR));
>>
>> Technically you only need this for a sparse section at the end of a
>> file. You could do it by mixing paths and file handles, but that
>> would be a bit jarring.
>
> I don't think I understand this. If you're writing to a file, you can
> truncate it once (before any writing) to set its size:
>
> truncate-size "/output" 100M;
> hopen-file "/output";
> hwrite ...;
> hwrite ...;
> hwrite ...;
> hclose fd;
>
> Why is htruncate needed during writes?
You assume above that you know in advance how large the file you're
writing will be. As I said, you could move that truncate-size to the
end, but mixing paths and handles is nasty and occasionally error-prone.
Consider:
h = hopen_file("/foo");
hwrite ...;
hwrite ...;
mv("/foo", "/bar");
hwrite ...;
hwrite ...;
truncate("/foo", x);
You've now created 2 files rather than 1. If you use htruncate rather
than truncate here it's robust to any changes in paths.
>> hallocate is similarly there to avoid a jarring mix of paths and
>> file handles when operating on a single file. I could also add
>> fchmod and fchown, I guess. Any more?
>
> My point was we don't need to duplicate all of these.
hallocate has the same argument as above: it's a non-path based
alternative. The argument for it is the same as the argument for all the
posix f* calls which take file descriptors instead of path names. There
may be a performance benefit to it, but principally it's harder to make
a class of stupid mistake.
Matt
--
Matthew Booth, RHCA, RHCSS
Red Hat Engineering, Virtualisation Team
GPG ID: D33C3490
GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490
More information about the Libguestfs
mailing list