[Libguestfs] [PATCH 9/12] hivex: Begin implementation of writing to hives.

Matthew Booth mbooth at redhat.com
Fri Feb 5 13:20:44 UTC 2010


On 05/02/10 12:46, Richard W.M. Jones wrote:
> On Thu, Feb 04, 2010 at 04:54:53PM +0000, Matthew Booth wrote:
> [...]
> 
> Thanks for this review.  Because I'm about 20 commits ahead of the
> public git repo at the moment, what I will do is to add another commit
> which addresses documentation issues in the code retrospectively.
> 
> More comments inline.
> 
>>> +static size_t
>>> +allocate_page (hive_h *h, size_t allocation_hint)
>>> +{
>>
>> Every usage of nr_4k_pages in this function multiplies it by 4096. It
>> would be more readable to choose a different name which reflects that
>> it's simply a rounding.
> 
> It's the number of 4K pages, so it seems like a reasonable name to me ...

I meant storing a value that is already multiplied by 4096, because
that's how it's used. Not important.

>>> +  page->offset_first = htole32 (offset - 0x1000);
>>
>> What's 0x1000? Can you comment this?
> 
> The first 4096 bytes of the file are the header.  Internal pointers
> within the file are relative to the first block, which is to say,
> relative to the file start + 4096 bytes.  Unfortunately the
> representation I chose for the file is a bit stupid in that we store
> the header + blocks as a single allocation, instead of storing those
> separately.  (I didn't make the same mistake in the OCaml analysis
> tools).  I intend to go back and correct this, but first I want a
> comprehensive test suite so that we don't introduce regressions.

Ok.

>>> +  if (seg_len < 4) {
>>
>> Can you replace this '4' with a sizeof(ntreg_hbin_block.seg_len)?
> 
> No because that's a syntax error ...  Apparently it's not possible to
> portably get the size of a structure member.

The above wasn't intended to be syntactically correct. I think, however,
you can do something like:

sizeof(((struct ntreg_hbin_block *) void)->seg_len)

This would be best implemented as a macro.

>>> +  blockhdr->seg_len = htole32 (- (int32_t) seg_len);
>>> +  if (id[0] && id[1] && seg_len >= 6) {
>>
>> Can you replace 6 with sizeof(ntreg_hbin_block)?
> 
> Or more correctly, offsetof (ntreg_hbin_block, <the field after
> id[1]>), but of course you can't do that in C ...  Stupid language.

If you redefine struct ntreg_hbin_block as:

struct ntreg_hbin_block {
  int32_t seg_len;
  char id[2];
  char data[];
} __attribute__((package__));

You can use:

offsetof(struct ntreg_hbin_block, data)

I haven't looked, but I'll bet gnulib contains a portable offsetof.

>>> +    assert (rem >= 4);
>>
>> This assertion doesn't look safe. Why must this be the case? In the
>> event that rem < 4, would you allocate a new page for the dummy free block?
> 
> Allocations are aligned to and rounded up to 8 bytes, so rem should be
> 0 or >= 8.  We know rem > 0 because we're in a conditional.  We need
> to write 4 bytes here.  Hence the assertion.

Ok.

>>> +    blockhdr = (struct ntreg_hbin_block *) (h->addr + h->endblocks);
>>> +    blockhdr->seg_len = htole32 ((int32_t) rem);
>>
>> Will blockhdr->id ever be read here? Is 0x00 valid?
> 
> blockhdr->id is not read for unused blocks, so this is safe.

Ok.

>>> +    size_t len;
>>> +    len = le32toh (vk->data_len);
>>> +    if (len == 0x80000000)      /* special case */
>>> +      len = 4;
>>> +    len &= 0x7fffffff;
>>
>> Comment, please. What's going on with this record length?
> 
> This is a mysterious and undocumented part of the format.  I don't
> claim to understand why, but it is necessary.

Hehe, ok. Can you at least put in a comment to that effect, preferrably
with a pointer to your source.

>>> +int
>>> +hivex_node_set_values (hive_h *h, hive_node_h node,
>>> +                       size_t nr_values, const hive_set_value *values,
>>> +                       int flags)
>>> +{
>>> +  if (!h->writable) {
>>> +    errno = EROFS;
>>> +    return -1;
>>> +  }
>>> +
>>> +  if (!IS_VALID_BLOCK (h, node) || !BLOCK_ID_EQ (h, node, "nk")) {
>>> +    errno = EINVAL;
>>> +    return -1;
>>
>> This would be an internal inconsistency error, surely. If so, you
>> probably want an assert here instead.
> 
> No, it could mean that someone passed in an offset which is a block
> of some sort, but is not an nk-record.  eg:
> 
>   size_t *values = hivex_node_values (h, node);
>   hivex_node_set_values (h, values[0], ...);
> 
> Unfortunately (and another stupidity about C) size_t and hive_node_h
> are compatible types so this won't generate any compile-time error.

This is a stylistic choice, I guess. I would say that incorrect use of
this function is a bug, therefore not something the user can do anything
about. The best outcome for the user is that it gets fixed :) I'd use an
assert. Not important.

>>> +  size_t seg_len =
>>> +    sizeof (struct ntreg_value_list) + (nr_values - 1) * sizeof (uint32_t);
>>
>> Why are you subtracting 1 from nr_values above?
> 
> There is already 1 value in the struct:
> 
> struct ntreg_value_list {
>   int32_t seg_len;
>   uint32_t offset[1];           /* list of pointers to vk records */
> } __attribute__((__packed__));

That's nasty. Is there a format reason for doing this, or could you
redefine this struct as:

struct ntreg_value_list {
  int32_t seg_len;
  uint32_t offset[];	/* list of pointers to vk records */
} __attribute__((__packed__));

>>> +    if (values[i].len <= 4)     /* Store data inline. */
>>
>> sizeof.
> 
> No, it's a fixed magic number in the format.  If the length is <= 4 it
> stores it inline.

Can you give the magic number a name and #define it up the top to be 4?

>>> +      memcpy (&vk->data_offset, values[i].value, values[i].len);
>>> +    else {
>>> +      size_t offs = allocate_block (h, values[i].len + 4, nul_id);
>>
>> sizeof.
> 
> There's no struct to take the sizeof here.  These are special id-less
> blocks.

offsetof(struct ntreg_hbin_block, id)

and include a comment about overwriting id.

>>> +      if (offs == 0)
>>> +        return -1;
>>> +      memcpy (h->addr + offs + 4, values[i].value, values[i].len);
>>
>> sizeof, or the address of a struct member.
> 
> As above.

Ditto ;)

Matt
-- 
Matthew Booth, RHCA, RHCSS
Red Hat Engineering, Virtualisation Team

M:       +44 (0)7977 267231
GPG ID:  D33C3490
GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490




More information about the Libguestfs mailing list