[Pulp-dev] pulp 3 upload API validation

Dennis Kliban dkliban at redhat.com
Tue Jul 11 17:27:19 UTC 2017


On Tue, Jul 11, 2017 at 1:20 PM, Brian Bouterse <bbouters at redhat.com> wrote:

> We should not raise a validation exception because due to semver we can't
> stop raising that exception later. Specifically if we want to ever allow
> double checksum being specified in the future.
>
> For the MVP, I think the choices are: only respect sha256 and ignore the
> rest OR do as many standard validators as they specify. I'm ok with either
> of these with a slight preference for validating as many of them as are
> specified would be the best. Since we're already handling all of the data
> at save time, pushing it through additional digest validators will cost a
> bit of cpu but not much additional I/O. Also if the user asked us to do
> that validation then having it demand some additional resources is OK. Also
> having feature parity with the changesets is good. The changesets can
> validate all standard digest validators so having uploads do the same would
> be consistent.
>

I am reversing what I had previously said, and I agree that we should not
raise an exception if a user provides more than one checksum at upload
time. My current implementation checks every checksum specified by the
user.




> On Mon, Jul 10, 2017 at 5:11 PM, Jeff Ortel <jortel at redhat.com> wrote:
>
>>
>>
>> On 07/10/2017 02:36 PM, Dennis Kliban wrote:
>> > On Mon, Jul 10, 2017 at 3:26 PM, Michael Hrivnak <mhrivnak at redhat.com
>> <mailto:mhrivnak at redhat.com>> wrote:
>> >
>> >
>> >
>> >     On Mon, Jul 10, 2017 at 3:06 PM, Dennis Kliban <dkliban at redhat.com
>> <mailto:dkliban at redhat.com>> wrote:
>> >
>> >         The upload API for Artifacts is going to allow users to specify
>> the artifact size and a digest. The
>> >         Artifact model currently supports  'md5', 'sha1', 'sha224',
>> 'sha256', 'sha384', and 'sha512' digests.
>> >
>> >         Do we want to let users specify more than one digest per
>> upload? e.g. md5 and sha256?
>> >
>> >
>> >     There may be no harm in this, but it would add complexity to the
>> verification and not add much value. I'd
>> >     stick with just one unless there's a compelling reason for multiple.
>> >
>> >
>> > I agree. The API is going to raise a validation exception when more
>> than 1 digest is provided.
>>
>> +1
>>
>> >
>> >
>> >
>> >
>> >
>> >         Do we want to store all 6 digests for each Artifact?
>> >
>> >
>> >     The expensive part of calculating the digests is reading the file.
>> As long as you're already reading the
>> >     entire file, which we will during verification, you may as well
>> stuff the bits through multiple hashers
>> >     (digesters?) and get all the digests. Pulp 2 has a function that
>> does this:
>> >
>> >     https://github.com/pulp/pulp/blob/2.13-release/server/pulp/
>> server/util.py#L327-L353
>> >     <https://github.com/pulp/pulp/blob/2.13-release/server/
>> pulp/server/util.py#L327-L353>
>> >
>> >     But we can't always guarantee that we'll have all the checksums
>> available, for at least two reasons. 1) If
>> >     in the future if we want to use yet another algorithm, we probably
>> won't want to run a migration that
>> >     re-reads every file and calculates the additional digest. 2) For
>> on-demand content, we don't have it
>> >     locally, so we can't calculate any additional checksums until it
>> gets fetched.
>> >
>> >     So this may be one of those times where we use a good-ole-fashioned
>> getter method that returns the
>> >     requested digest if it's on the artifact, calculates it if not, or
>> raises an exception if the value isn't
>> >     available and can't be calculated.
>> >
>> >
>> > For uploaded Artifacts, all of the digests will be calculated as the
>> file is being processed during the
>> > upload. So I don't think calculating all of them should incur
>> significantly more cost than just one. The code
>> > snippet from Pulp 2 looks similar to what I am doing.
>>
>> This functionality should be a method on the Artifact and not a util
>> function somewhere.
>>
>> >
>> > I haven't given much thought to the getter, but your idea sounds fine
>> to me.
>> > Thanks,
>> > Dennis
>> >
>> >
>> >
>> >
>> >     --
>> >
>> >     Michael Hrivnak
>> >
>> >     Principal Software Engineer, RHCE
>> >
>> >     Red Hat
>> >
>> >
>> >
>> >
>> > _______________________________________________
>> > Pulp-dev mailing list
>> > Pulp-dev at redhat.com
>> > https://www.redhat.com/mailman/listinfo/pulp-dev
>> >
>>
>>
>> _______________________________________________
>> Pulp-dev mailing list
>> Pulp-dev at redhat.com
>> https://www.redhat.com/mailman/listinfo/pulp-dev
>>
>>
>
> _______________________________________________
> Pulp-dev mailing list
> Pulp-dev at redhat.com
> https://www.redhat.com/mailman/listinfo/pulp-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/pulp-dev/attachments/20170711/17306c1a/attachment.htm>


More information about the Pulp-dev mailing list