[Freeipa-devel] [PATCH] 353 Added initial vault implementation.

Wed Nov 5 10:06:04 UTC 2014

On 11/05/2014 01:24 AM, Endi Sukma Dewata wrote:
> Thanks for the review. I have some questions below. I'll post a new
> patch after the issues are addressed.
>
> On 11/4/2014 11:36 AM, Petr Viktorin wrote:
>> The new schema can go to 60basev3.ldif, no need for a new file.
>
> Fixed. Also removed nsContainer as suggested by Simo.
>
>> ipalib/plugins/user.py: Make the try: block as small as possible; maybe
>> something like this:
>>
>>       try:
>>           vaultcontainer_id = ...
>>       except errors.NotFound:
>>           pass
>>       else:
>>          (vaultcontainer_name, vaultcontainer_parent_id) = ...
>>          self.api.Command.vaultcontainer_del(...)
>
> Actually the command that generates the NotFound exception is the last
> command. The first two commands are preparing the parameters. I moved
> them before the try-block.

Thanks!

>> ipapython/dn.py: This change is not needed. If you have a sequence of
>> RNDs you can do `DN(*seq)`.
>
> This is actually needed to construct a parent DN from an existing DN:
>
>    parent_dn = DN(dn[1:])
>
> Note that the parameter is an array of RDNs, not a sequence of RDNs. I
> don't see an existing mechanism to get a parent DN from a DN. Without
> this change it would generate an error:
>
> ValueError: tuple or list must be 2-valued, not
> "[ipapython.dn.RDN('cn=admin'), ipapython.dn.RDN('cn=users'),
> ipapython.dn.RDN('cn=vaults'), ipapython.dn.RDN('dc=idm'),
> ipapython.dn.RDN('dc=lab'), ipapython.dn.RDN('dc=bos'),
> ipapython.dn.RDN('dc=redhat'), ipapython.dn.RDN('dc=com')]"

Right, for that you can do
     parent_dn = DN(*dn[1:])
(note the asterisk)

>> ipalib/plugins/vault.py:
>> Do not use star imports in new code; remove the line
>>       from ipalib.plugins.baseldap import *
>> and use e.g. `baseldap.LDAPObject` instead of just `LDAPObject`.
>> Hopefully the only star-imported used are the base classes?
>
> I found 20 star imports in the existing plugins :/

Yes, we need to fix those someday.

http://www.freeipa.org/page/Coding_Best_Practices#Do_not_use_star_imports

https://fedorahosted.org/freeipa/ticket/2653

> All existing plugins are actually using LDAPObject directly, not
> baseldap.LDAPObject. Is there a concern doing that? I fixed the code to
> do the same thing.

Your choice.
If there are many objects imported from a module I prefer importing the 
module itself, but individual names work too.

>> To make translators' jobs easier, split the help text in __doc__ into
>> strings that can be translated individually.
>> Replace every blank line by:
>> """) + _("""
>
> Fixed.
>
>> The pattern and pattern_errmsg for the 'cn' param don't match. Which one
>> is right? Shouldn't a similar check be there for parent?
>
> This is actually copied verbatim from user's uid. Could you give some
> sample values that show the problem?

These messages are misleading:

$ ipa vaultcontainer-add '$$'
ipa: ERROR: invalid 'container_name': may only include letters, numbers, 
_, -, . and $
$ ipa vaultcontainer-add -- '-myvault-'
ipa: ERROR: invalid 'container_name': may only include letters, numbers, 
_, -, . and $

(Note that in webUI you wouldn't need the awkward quoting and `--`)

> I think ideally there should be a similar check for parent, but I
> haven't figured out the proper pattern yet. I think we can do this as a
> later enhancement. Or maybe we should remove the validation for cn for now.

Well, what are the PKI limitations on ids?
Are there any best practices for the id names?
I think we want to limit the names to avoid nasty surprises later; 
'[a-zA-Z0-9_.-]+' seems like a good set.
For parent you also need the slash, otherwise use the same pattern.

I think this is quite an important decision; weird names could lead to 
security issues.

>> vaultcontainer.get_dn: Why put '/' at the end of container_id, if the
>> empty string is ignored later anyway?
>
> I'm still considering some options for container ID:
> a) use a slash at the end for all containers (e.g. /users/)
> b) don't use a trailing slash, except for root container (/)
>
> I think option (a) is more consistent, container ID always ends with
> slash, although in cases like this it can be redundant. This is how the
> code is currently written. We can make an exception for
> vaultcontainer.get_dn() though.
>
> Option (b) will require more careful coding in many places (e.g.
> concatenations) because it needs to handle a special case for root
> container.
>
> Maybe a better way is to use an array of names internally and the slash
> is only used during input/output. The problem is when calling another
> command the array has to be converted into string and parsed back into
> array, so it may even introduce more redundancies. I don't see a way to
> pass an object parameter to a command. Any suggestion?

That would indeed be better. Hopefully you can avoid re-parsing and 
re-validating by not calling nested Commands, see below.

>> Nitpick: In vaultcontainer.get_dn, prefer the if statement over the
>> if-else expressions; it's a bit longer but more readable.
>
> Fixed.
>
>> In vaultcontainer.get_id, what's the purpose of the len() check? Did you
>> want dn.endswith() instead?
>
> Yes, but my assumption is the DN will always be within the namespace
> because it's only used internally, and using len() will be faster than
> endswith(). The error case will never happen, but I was just making sure
> there won't be an infinite loop. Maybe instead of calling the method
> recursively I should just use a loop? That will avoid repetitive
> validations.

Well, if you end up with representing the id as a list the problem will 
largely go away, but: use an internal error (e.g. ValueError) for 
internal sanity checking. The public errors are for user errors, they 
are presented "nicely" which makes them much harder to debug. And 
"Invalid container DN" with no indication of what the DN is isn't 
helpful to the user either.

>> I sugggest writing doctests for the id manipulation methods (get_id,
>> get_private_id, ...) – it's not always obvious what exactly they do.
>
> Do you mean sample code in docs? Is it still necessary if we have some
> test code?

Not strictly necessary, but it better explains what the method does.

>> According to the design doc, container IDs shouldn't end in '/'. If you
>> did that I think the manipulation would be simpler, and it could be
>> shared with the vault equivalents.
>
> Yeah, as mentioned above I'm still considering several options. The
> design may need to be adjusted.
>
>> vaultcontainer_add: You should use ldap backend directly. Calling
>> Commands is costly, most of the call is spent doing validation of what
>> here is already validated. You should add a function to recursively
>> create vault container using just the ldap client, and call that here
>> and in vault_add.
>
> Hmm.. I'm not sure if we should write a code that will duplicate
> everything done by the LDAPCreate class. Shouldn't server-side
> command-to-command invocation be relatively faster?

But you don't everything done by LDAPCreate. Most of what it does is 
parsing options, validating options, formatting output. You don't even 
use the added entry, after LDAPCreate spends an additional LDAP call to 
re-retreive it after adding.

Commands are not really designed to be nested, and tend to have subtle 
assumptions of "owning" the current thread.

Not to mention that debugging nested Command calls is a nightmare.
Imagine calling your own vault_retrieve from an unrelated command, and 
getting a "Data not found." error, formatted for user consumption (e.g. 
no traceback).

(Yes, some parts of IPA do currently call nested commands. This is 
unfortunate and leads to hard-to-debug issues.)

> The recursion is mainly used to create the user's private container.
> Note that the private container is not created by default. We only
> provide /users top-level container. When a user creates a private vault
> the command will automatically create the /users/<username> private
> container on the user's behalf. This will only happen once for each
> user. In most other cases the parent would already exist.
>
> Another option is to require the user or the admin to create the private
> container manually, but that would be less user-friendly.
>
> Another option is to check the parent container using direct LDAP
> search, but still do the add using vaultcontainer_add. What do you think?

Why? Is adding the entry via ldap that hard to do?
What functionality from LDAPCreate do you need? Maybe we can pull it out 
into a function.

>> You can delete a container with children; is that expected?
>
> This seems to be LDAPDelete's built-in functionality. Is there an option
> to disallow non-empty subtree removal?
>
> For vault purposes, when a user is removed from IPA the server will
> remove the user's entire private container, so the feature would
> actually be useful here. But probably in general we should warn the user
> if they are deleting a non-empty container. If there's an option in
> LDAPDelete we can let the user specify that option if necessary.

A warning doesn't seem very useful. ("Yes sir, deleted your data! Oh, by 
the way, did you really want to do that?")
I think requiring a --force or --recursive option for non-empty trees 
could better prevent surprises.

>> vault_add should complain if it does not get exactly one of data/text/in.
>
> Actually, no. The primary way to archive data is via vault_archive. The
> vault_add is for creating a vault, but optionally you can specify the
> initial data to be archived. If nothing is specified it will archive an
> empty string. Note that for symmetric vault the encrypted empty string
> can be used to enforce that the same password will be used on on
> subsequent archival operations.

Then the command should complain if is gets more than one type of input.

It still seems redundant to me.
If I'm getting it correctly, to change a vault's attributes like 
description, you need vault_mod, but to change the stored secret you do 
vault_archive. Is that right? Why not do all modifications in vault_mod?

And if you do end up needing separate commands, you can pull the 
communication with KRA into a function and call that, rather than 
calling the whole Command machinery.

> I'm trying to change vault_archive to raise an error if it's missing the
> input data and change vault_add to archive empty string if nothing is
> specified, but it doesn't work since the framework seems to be
> converting the empty string into None, so vault_archive can't tell if
> it's archiving an empty string or missing the input data. Any suggestion?
>
>> What's the difference between vault_add and vault_archive? I don't see
>> vault_archive in the design.
>
> Yes, that's due to a recent 'standard vault' addition. I have not added
> vault_archive yet to the design. The vault_archive is used to archive a
> generic blob of data. The vaultsecret_archive will later be used to add
> secrets into a JSON structure which will then be archived using the
> vault_archive.

Okay. I'm not sure I get the details right, a design would help.

>> It seems '/' is equivalent to '-' as far as KRA is concerned; should we
>> disallow '-' in container/vault names?
>
> I think so, but as I mentioned earlier I haven't look very far on
> attribute validations.
>
> The slashes in vault ID is converted into dashes in KRA's client key ID.
> KRA actually accepts slashes, but the slashes will not look pretty in
> REST URL because they will be URL-encoded (in case it's needed for
> troubleshooting KRA via browser) so I changed them to dashes.

So it seems that these IDs would be equivalent for the KRA:

/users/admin/private/secret
/users/admin-private/secret

If this is just for troubleshooting purposes, please don't change to dashes.

>> You can specify an absolute id by starting it with a slash, but only in
>> --parent and not in the name itself. I think this should be possible in
>> the name too.
>
> Right, but I'm not sure how it will work with the IPA framework. The
> first command parameter is the cn (vault name). If we specify a vault ID
> (with slashes) there it will be entered into cn. Should we parse the ID
> and insert the vault name and parent ID back into keys and options
> variables? Which value will eventually be stored into cn in LDAP?

You're right, managing that would need some delicate coding.

But I you could override args_options_2_params to adjust the options 
before they're even validated, so the framework gets cn and parent as it 
does currently.

>> You can't include slashes in the name, so you always need to specify the
>> prefix with --parent. I don't think there's a technical reason for this
>> limitation.
>
> See the above explanation. What does primary_key actually mean in
> hierarchical structure like vault? Since cn is only unique locally, is
> cn a primary key? Can a virtual attribute be a primary key?
>
>> There are no tests.
>
> Right, I'll take a look at how to write tests. Is there a way to run
> tests without running the full build?

Yes, see http://www.freeipa.org/page/Testing
(Though I usually just copy changed files to /usr/lib/ and restart Apache)

-- 
Petr³