[Pulp-dev] pulp 3 upload API validation

Jeff Ortel jortel at redhat.com
Mon Jul 10 21:11:46 UTC 2017



On 07/10/2017 02:36 PM, Dennis Kliban wrote:
> On Mon, Jul 10, 2017 at 3:26 PM, Michael Hrivnak <mhrivnak at redhat.com <mailto:mhrivnak at redhat.com>> wrote:
> 
> 
> 
>     On Mon, Jul 10, 2017 at 3:06 PM, Dennis Kliban <dkliban at redhat.com <mailto:dkliban at redhat.com>> wrote:
> 
>         The upload API for Artifacts is going to allow users to specify the artifact size and a digest. The
>         Artifact model currently supports  'md5', 'sha1', 'sha224', 'sha256', 'sha384', and 'sha512' digests.
> 
>         Do we want to let users specify more than one digest per upload? e.g. md5 and sha256?
> 
> 
>     There may be no harm in this, but it would add complexity to the verification and not add much value. I'd
>     stick with just one unless there's a compelling reason for multiple.
> 
> 
> I agree. The API is going to raise a validation exception when more than 1 digest is provided.

+1

>  
> 
>      
> 
> 
>         Do we want to store all 6 digests for each Artifact? 
> 
> 
>     The expensive part of calculating the digests is reading the file. As long as you're already reading the
>     entire file, which we will during verification, you may as well stuff the bits through multiple hashers
>     (digesters?) and get all the digests. Pulp 2 has a function that does this:
> 
>     https://github.com/pulp/pulp/blob/2.13-release/server/pulp/server/util.py#L327-L353
>     <https://github.com/pulp/pulp/blob/2.13-release/server/pulp/server/util.py#L327-L353>
> 
>     But we can't always guarantee that we'll have all the checksums available, for at least two reasons. 1) If
>     in the future if we want to use yet another algorithm, we probably won't want to run a migration that
>     re-reads every file and calculates the additional digest. 2) For on-demand content, we don't have it
>     locally, so we can't calculate any additional checksums until it gets fetched.
> 
>     So this may be one of those times where we use a good-ole-fashioned getter method that returns the
>     requested digest if it's on the artifact, calculates it if not, or raises an exception if the value isn't
>     available and can't be calculated.
> 
> 
> For uploaded Artifacts, all of the digests will be calculated as the file is being processed during the
> upload. So I don't think calculating all of them should incur significantly more cost than just one. The code
> snippet from Pulp 2 looks similar to what I am doing.

This functionality should be a method on the Artifact and not a util function somewhere.

> 
> I haven't given much thought to the getter, but your idea sounds fine to me.
> Thanks,
> Dennis
> 
>  
> 
> 
>     -- 
> 
>     Michael Hrivnak
> 
>     Principal Software Engineer, RHCE 
> 
>     Red Hat
> 
> 
> 
> 
> _______________________________________________
> Pulp-dev mailing list
> Pulp-dev at redhat.com
> https://www.redhat.com/mailman/listinfo/pulp-dev
> 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 847 bytes
Desc: OpenPGP digital signature
URL: <http://listman.redhat.com/archives/pulp-dev/attachments/20170710/2fda1386/attachment.sig>


More information about the Pulp-dev mailing list