[Pulp-dev] Removing MD5 and SHA-1 as default available checksums in 3.11

Matthias Dellweg mdellweg at redhat.com
Thu Mar 11 07:58:38 UTC 2021


 I believe pretty much everything is said here, and I just want to add a
single thought.

> However, I do worry that this new default behaviour violates the basic
> user expectation that deb repositories published by pulp will include the
> same metadata fields as the upstream repository that was synced.
>
(In fact I have an open issue to ensure this is the case for non-checksum
fields).
Here i really see a difference between "checksum fields" and "other
metadata fields". Checksum fields are supposed to ensure the integrity of
the packages, and since multiple checksum algorithms are used, this
information is there with a certain multiplicity. All the other fields are
in a way unique. Therefore i do not see a problem omitting some hashing
algorithms. Obviously this needs to be verified against all apt-repo
consumers that are widespread.

On Thu, Mar 11, 2021 at 4:20 AM Brian Bouterse <bmbouter at redhat.com> wrote:

> Thanks Quirin for the questions. I put my understanding and
> recommendations inline. Other devs please share your perspectives and
> advice, especially if they differ from what is written here. More questions
> and discussion are welcome. This is complicated stuff, but we want to be
> here to help.
>
> On Wed, Mar 10, 2021 at 11:40 AM Quirin Pamp <pamp at atix.de> wrote:
>
>> The pulp_deb plugin currently makes use of md5, sha1, sha256, and sha512.
>> Using ALLOWED_CONTENT_CHECKSUMS to "prohibit" one or more of these
>> checksum types currently simply breaks the plugin.
>> This is one (of several) reasons why the pulp_deb CI tests are currently
>> broken against pulpcore master (they use the default configuration for
>> pulpcore 3.11, prohibiting md5 and sha1).
>>
> Putting something back in place temporarily to get the CI working prior to
> release would make sense to me. Something like what pulpcore used to have:
>
> https://github.com/pulp/pulpcore/pull/1123/files#diff-48fd5085612376b649898c2408233a202481ac724dd377c1369cf20ee34649ddL33-L38
>
>>
>> I could adjust the sync and publish, to only store and retrive checksums
>> permitted by ALLOWED_CONTENT_CHECKSUMS.
>> I am pretty sure current APT implementations can get by with (and indeed
>> prefer) sha256.
>>
> With 3.11 at least, after the user runs the `handle-content-checksums`
> command, the Artifacts will only contain those checksums in
> ALLOWED_CONTENT_CHECKSUMS. 3.11 also ships a migration which auto-runs this
> command so if a user already has sha1 and md5, and they want to keep those
> sha1 and md5 in the db on Artifacts, they need to add back sha1 and/or md5
> to ALLOWED_CONTENT_CHECKSUMS prior to running the migrations with 3.11 or
> the md5 and sha1 checksums will be removed from all artifacts.
>
> FYI Pulp will also refuse to start if there are additional or missing
> checksums on Artifacts that differ from the set in
> ALLOWED_CONTENT_CHECKSUMS. This is actually for plugin writer's benefits
> (we think) so that as plugin code interacts with Artifacts, whatever is
> there aligns with the administrators config of the
> ALLOWED_CONTENT_CHECKSUMS config. If checksums are added to
> ALLOWED_CONTENT_CHECKSUMS the user would have to re-run the
> `handle-content-checksums` command and Pulp will recalculate the checksums,
> but this is an expensive operation (obviously).
>
> 3.11 does not make changes/enforcement with RemoteArtifacts, only
> Artifacts. More changes addressing RemoteArtifacts are coming in 3.12,
> which should finish this epic code push around this setting.
>
>>
>> However, I do worry that this new default behaviour violates the basic
>> user expectation that deb repositories published by pulp will include the
>> same metadata fields as the upstream repository that was synced.
>>
> This concern makes sense to me. It's a balancing act; users rightfully may
> want to use all checksums for a content type like pulp_deb, as is typical
> for that content type, but admins deploying pulp in an organization may
> feel differently about what checksums should be allowed in their org. This
> change in 3.11 is really about defaults; any given system can be configured
> anyway they want to.
>
> (In fact I have an open issue to ensure this is the case for non-checksum
>> fields).
>> Publishing md5 and sha1 checksums (in addition to sha256 and possibly
>> others) is widespread practice. The official Debian and Ubuntu repos all do
>> so.
>> I have no idea to what extent real world APT usage depends on these
>> fields.
>>
> I hear what you're saying here. I don't really have advice in this area.
> RPM is more or less in a similar state, where repos use sha1 broadly, but
> some environments where Pulp is used want to prohibit that. My goal is to
> facilitate the choice to be made in one place, the
> ALLOWED_CONTENT_CHECKSUMS setting.
>
> In the most straightforward aspect, the ALLOWED_CONTENT_CHECKSUM setting
> should prevent checksum calculation for the purposes of integrity
> verification. For that, we recommend plugin writers use
> pulpcore.plugin.pulp_hashlib.new instead of hashlib.new which is a thin
> wrapper around hashlib.new allowing only "enabled" checksums to be
> instantiated. The less straightforward aspect is that these checksums
> should also not available for clients to ultimately do the same thing. Our
> intention with this setting is that pulpcore and plugins adhere to both of
> these practices, but the former is more straightforward at implementing
> than the latter. We documented both of these goals (and the machinery we
> recommend using) here in the plugin writer docs:
> https://github.com/pulp/pulpcore/blob/master/docs/plugins/plugin-writer/concepts/index.rst#checksum-use-in-plugins
>
>>
>> I could of course punt to users, and simply tell them in documentation to
>> re-add md5 and sha1 to ALLOWED_CONTENT_CHECKSUMS if they want the
>> corresponding metadata fields.
>>
> Doing it with docs is more or less the strategy most plugins are pursuing.
> md5 isn't used much anymore so that one isn't typically as big a problem.
> Sha1 you may want to think about prompting your users to think about via
> docs.
>
> Note that programmatically changing this setting can get tricky
> (practically speaking) if multiple plugins start to try to modify it on
> behalf of their users. Ultimately though it's up to you as a plugin writer
> to decide for users, or have them decide. We've opted mostly for letting
> users having to take action if they want md5 and sha1, but whatever you
> decide is ok with me. Generally I think your options are a) document adding
> them back, b) having a setting that can enable/disable checksums to
> default-re-enable one or both checksums when pulp_deb is installed, or c)
> doing nothing and defaulting to disincluding sha1 and md5.
>
> I don't feel particularly happy about this since this ammounts to an
>> indefinite recommendation against the default pulpcore configuration for
>> all users who want their pulp_deb repositories to resemble official
>> Debian/Ubuntu repositories.
>> (And for a security relevant configuration at that.)
>>
> I can understand being uneasy about setting pulp_deb to have a default
> that is not aligned with pulpcore; once it's put in place like that I don't
> think it'll be easy or possible to practically change it. This is one
> motivation for going the docs route overall, which keeps the defaults
> aligned, but users on the hook to opt-back-in.
>
> Note that Katello (I think) is re-enabling all of them by default on their
> installs, so also consider the choice in front of pulp_deb in the context
> of just independent pulp_deb users, not Katello users. Please confirm this
> with katello also, but this is my understanding.
>
>>
>> Alternatively, I could store the unloved md5 and sha1 hashes in the
>> content models themselves instead of the artifacts, but this is ugly for a
>> whole host of reasons:
>>
>> 1) It is a lot more work
>> 2) For artifacts pulpcore automatically does the work of actually
>> checking the checksums against the artifacts.
>> 3) It duplicates the source of truth for checksums (the horror!) for
>> users that do not go along with the new default pulpcore configuration.
>>
>> So that approach is probably a non-starter.
>>
> I agree this is a non-starter. Our goal is to not have them available at
> all if admins don't want them to be, so storing them to the side I wouldn't
> recommend. As an alternative to this idea, I think it would be better for
> your users to add back sha1 and md5 to ALLOWED_CONTENT_CHECKSUMS instead.
>
>>
>> From the point of view of pulp_deb it would be better if pulpcore did not
>> so much refuse to handle md5 and sha1, but rather would guarantee that at
>> least one strong checksum is also present and used for integrity checking.
>> Which I believe is the case anyway since we absolutely require sha256 to
>> be present, no?
>>
> The data model always require sha256, in fact Pulp will not start if this
> is removed from ALLOWED_CONTENT_CHECKSUM. It probably would be easier on
> users to do what you're describing, but we also have to consider the
> organizations who choose to use Pulp and have different expectations. We
> ended up with a pretty strict behavior of removing checksums that are no
> longer trusted from the db because it was the simplest way of causing
> plugin writers to have to honor the requirements of the organization
> deploying Pulp who decide at the organization level what they feel is
> trustworthy regarding package integrity checking. I understand this makes
> the life of a plugin writer more difficult. I also know plugin writers
> dealing with that can be a pain. We've been spending a lot of effort on
> this recently, so for sure, we feel the pain also.
>
>>
>> To summarize: I am uncertain how best to proceed, but perhaps I am
>> overthinking this and simply respecting ALLOWED_CONTENT_CHECKSUMS and
>> letting users decide is best.
>>
> The question I'll ask to help answer yours is: how much does pulp_deb
> break with 3.11's defaults? This would be good to know. Want to run a few
> tests and let us know? Maybe we can help give more info with that.
>
> Aside from that, my general advice is to expect that pulp_deb users will
> change this setting, and to have the pulp_deb code work with the checksums
> it has available and error when it cannot fulfill their request due to not
> having the checksums it would need to do so.
>
>>
>> regards,
>> Quirin
>> ------------------------------
>> *From:* pulp-dev-bounces at redhat.com <pulp-dev-bounces at redhat.com> on
>> behalf of Brian Bouterse <bmbouter at redhat.com>
>> *Sent:* 12 February 2021 21:13
>> *To:* Pulp-dev <pulp-dev at redhat.com>; pulp-list <pulp-list at redhat.com>
>> *Subject:* [Pulp-dev] Removing MD5 and SHA-1 as default available
>> checksums in 3.11
>>
>> tl;dr With pulpcore 3.11, the plan is to remove MD5 and SHA-1 from the
>> list of default available checksums.  RPM and Migration plugin users will
>> need to add this back in at 3.11 upgrade time for your systems to continue
>> working. Please give on-list feedback on this change.
>>
>> ## Background
>>
>> Pulp has the ALLOWED_CONTENT_CHECKSUMS setting [0] which, by default,
>> currently includes md5, sha-1, sha-224, sha-256, sha-384, and sha-512. Pulp
>> code is restricted to only using hashers from this list. This feature gives
>> admins the ability to prohibit hashers they do not trust. Pulp uses these
>> checksums for package integrity verification purposes when syncing and
>> publishing content.
>>
>> ## Motivation
>>
>> We need to make Pulp secure by default. MD5 is known to be insecure, and
>> therefore it is unsafe for Pulp to allow its use for calculating package
>> integrity by default. SHA-1 is widely believed to be insecure, or will be
>> soon, and should not be allowed by default for the same reason.
>>
>> ## Proposal
>>
>> Pulpcore 3.11 would remove md5 and sha-1 from the default list of allowed
>> checksums, leaving sha-224..sha-512. Specifically this change is occuring
>> in the `ALLOWED_CONTENT_CHECKSUMS` setting [0]. This is only a change to
>> the default settings; any specific system can be configured as desired.
>> Nothing is "being taken away".
>>
>> ## Required User Action with 3.11
>>
>> We believe both RPM plugin users and Migration plugin users will be
>> impacted by this and mostly from the SHA-1 removal. SHA-1 is still used on
>> a variety of CDNs including Red Hat's. Also as data is migrated from Pulp2
>> systems, this also likely uses SHA-1 and MD5 as the migration plugin runs.
>>
>> If users are using the defaults for `ALLOWED_CONTENT_CHECKSUMS` and want
>> to continue using SHA-1, they will need to update
>> `ALLOWED_CONTENT_CHECKSUMS` in their settings file. Alternatively, users
>> will need to run `pulpcore-manager handle-artifact-checksums` after upgrade
>> to update any existing artifacts after upgrading.
>>
>> ## Why not automate this?
>>
>> We do not take manual user action at upgrade time lightly. However, this
>> is a security change, and we believe we need each Pulp system to opt-in for
>> themselves.
>>
>> [0]:
>> https://docs.pulpproject.org/pulpcore/settings.html#allowed-content-checksums
>>
>> Thanks!
>> The Pulpcore Team
>>
> _______________________________________________
> Pulp-dev mailing list
> Pulp-dev at redhat.com
> https://listman.redhat.com/mailman/listinfo/pulp-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/pulp-dev/attachments/20210311/f48c083c/attachment.htm>


More information about the Pulp-dev mailing list