[Pulp-dev] Removing MD5 and SHA-1 as default available checksums in 3.11

Brian Bouterse bmbouter at redhat.com
Thu Mar 11 03:19:44 UTC 2021

Thanks Quirin for the questions. I put my understanding and recommendations
inline. Other devs please share your perspectives and advice, especially if
they differ from what is written here. More questions and discussion are
welcome. This is complicated stuff, but we want to be here to help.

On Wed, Mar 10, 2021 at 11:40 AM Quirin Pamp <pamp at atix.de> wrote:

> The pulp_deb plugin currently makes use of md5, sha1, sha256, and sha512.
> Using ALLOWED_CONTENT_CHECKSUMS to "prohibit" one or more of these
> checksum types currently simply breaks the plugin.
> This is one (of several) reasons why the pulp_deb CI tests are currently
> broken against pulpcore master (they use the default configuration for
> pulpcore 3.11, prohibiting md5 and sha1).
Putting something back in place temporarily to get the CI working prior to
release would make sense to me. Something like what pulpcore used to have:


> I could adjust the sync and publish, to only store and retrive checksums
> I am pretty sure current APT implementations can get by with (and indeed
> prefer) sha256.
With 3.11 at least, after the user runs the `handle-content-checksums`
command, the Artifacts will only contain those checksums in
ALLOWED_CONTENT_CHECKSUMS. 3.11 also ships a migration which auto-runs this
command so if a user already has sha1 and md5, and they want to keep those
sha1 and md5 in the db on Artifacts, they need to add back sha1 and/or md5
to ALLOWED_CONTENT_CHECKSUMS prior to running the migrations with 3.11 or
the md5 and sha1 checksums will be removed from all artifacts.

FYI Pulp will also refuse to start if there are additional or missing
checksums on Artifacts that differ from the set in
ALLOWED_CONTENT_CHECKSUMS. This is actually for plugin writer's benefits
(we think) so that as plugin code interacts with Artifacts, whatever is
there aligns with the administrators config of the
ALLOWED_CONTENT_CHECKSUMS config. If checksums are added to
ALLOWED_CONTENT_CHECKSUMS the user would have to re-run the
`handle-content-checksums` command and Pulp will recalculate the checksums,
but this is an expensive operation (obviously).

3.11 does not make changes/enforcement with RemoteArtifacts, only
Artifacts. More changes addressing RemoteArtifacts are coming in 3.12,
which should finish this epic code push around this setting.

> However, I do worry that this new default behaviour violates the basic
> user expectation that deb repositories published by pulp will include the
> same metadata fields as the upstream repository that was synced.
This concern makes sense to me. It's a balancing act; users rightfully may
want to use all checksums for a content type like pulp_deb, as is typical
for that content type, but admins deploying pulp in an organization may
feel differently about what checksums should be allowed in their org. This
change in 3.11 is really about defaults; any given system can be configured
anyway they want to.

(In fact I have an open issue to ensure this is the case for non-checksum
> fields).
> Publishing md5 and sha1 checksums (in addition to sha256 and possibly
> others) is widespread practice. The official Debian and Ubuntu repos all do
> so.
> I have no idea to what extent real world APT usage depends on these
> fields.
I hear what you're saying here. I don't really have advice in this area.
RPM is more or less in a similar state, where repos use sha1 broadly, but
some environments where Pulp is used want to prohibit that. My goal is to
facilitate the choice to be made in one place, the

In the most straightforward aspect, the ALLOWED_CONTENT_CHECKSUM setting
should prevent checksum calculation for the purposes of integrity
verification. For that, we recommend plugin writers use
pulpcore.plugin.pulp_hashlib.new instead of hashlib.new which is a thin
wrapper around hashlib.new allowing only "enabled" checksums to be
instantiated. The less straightforward aspect is that these checksums
should also not available for clients to ultimately do the same thing. Our
intention with this setting is that pulpcore and plugins adhere to both of
these practices, but the former is more straightforward at implementing
than the latter. We documented both of these goals (and the machinery we
recommend using) here in the plugin writer docs:

> I could of course punt to users, and simply tell them in documentation to
> re-add md5 and sha1 to ALLOWED_CONTENT_CHECKSUMS if they want the
> corresponding metadata fields.
Doing it with docs is more or less the strategy most plugins are pursuing.
md5 isn't used much anymore so that one isn't typically as big a problem.
Sha1 you may want to think about prompting your users to think about via

Note that programmatically changing this setting can get tricky
(practically speaking) if multiple plugins start to try to modify it on
behalf of their users. Ultimately though it's up to you as a plugin writer
to decide for users, or have them decide. We've opted mostly for letting
users having to take action if they want md5 and sha1, but whatever you
decide is ok with me. Generally I think your options are a) document adding
them back, b) having a setting that can enable/disable checksums to
default-re-enable one or both checksums when pulp_deb is installed, or c)
doing nothing and defaulting to disincluding sha1 and md5.

I don't feel particularly happy about this since this ammounts to an
> indefinite recommendation against the default pulpcore configuration for
> all users who want their pulp_deb repositories to resemble official
> Debian/Ubuntu repositories.
> (And for a security relevant configuration at that.)
I can understand being uneasy about setting pulp_deb to have a default that
is not aligned with pulpcore; once it's put in place like that I don't
think it'll be easy or possible to practically change it. This is one
motivation for going the docs route overall, which keeps the defaults
aligned, but users on the hook to opt-back-in.

Note that Katello (I think) is re-enabling all of them by default on their
installs, so also consider the choice in front of pulp_deb in the context
of just independent pulp_deb users, not Katello users. Please confirm this
with katello also, but this is my understanding.

> Alternatively, I could store the unloved md5 and sha1 hashes in the
> content models themselves instead of the artifacts, but this is ugly for a
> whole host of reasons:
> 1) It is a lot more work
> 2) For artifacts pulpcore automatically does the work of actually checking
> the checksums against the artifacts.
> 3) It duplicates the source of truth for checksums (the horror!) for users
> that do not go along with the new default pulpcore configuration.
> So that approach is probably a non-starter.
I agree this is a non-starter. Our goal is to not have them available at
all if admins don't want them to be, so storing them to the side I wouldn't
recommend. As an alternative to this idea, I think it would be better for
your users to add back sha1 and md5 to ALLOWED_CONTENT_CHECKSUMS instead.

> From the point of view of pulp_deb it would be better if pulpcore did not
> so much refuse to handle md5 and sha1, but rather would guarantee that at
> least one strong checksum is also present and used for integrity checking.
> Which I believe is the case anyway since we absolutely require sha256 to
> be present, no?
The data model always require sha256, in fact Pulp will not start if this
is removed from ALLOWED_CONTENT_CHECKSUM. It probably would be easier on
users to do what you're describing, but we also have to consider the
organizations who choose to use Pulp and have different expectations. We
ended up with a pretty strict behavior of removing checksums that are no
longer trusted from the db because it was the simplest way of causing
plugin writers to have to honor the requirements of the organization
deploying Pulp who decide at the organization level what they feel is
trustworthy regarding package integrity checking. I understand this makes
the life of a plugin writer more difficult. I also know plugin writers
dealing with that can be a pain. We've been spending a lot of effort on
this recently, so for sure, we feel the pain also.

> To summarize: I am uncertain how best to proceed, but perhaps I am
> overthinking this and simply respecting ALLOWED_CONTENT_CHECKSUMS and
> letting users decide is best.
The question I'll ask to help answer yours is: how much does pulp_deb break
with 3.11's defaults? This would be good to know. Want to run a few tests
and let us know? Maybe we can help give more info with that.

Aside from that, my general advice is to expect that pulp_deb users will
change this setting, and to have the pulp_deb code work with the checksums
it has available and error when it cannot fulfill their request due to not
having the checksums it would need to do so.

> regards,
> Quirin
> ------------------------------
> *From:* pulp-dev-bounces at redhat.com <pulp-dev-bounces at redhat.com> on
> behalf of Brian Bouterse <bmbouter at redhat.com>
> *Sent:* 12 February 2021 21:13
> *To:* Pulp-dev <pulp-dev at redhat.com>; pulp-list <pulp-list at redhat.com>
> *Subject:* [Pulp-dev] Removing MD5 and SHA-1 as default available
> checksums in 3.11
> tl;dr With pulpcore 3.11, the plan is to remove MD5 and SHA-1 from the
> list of default available checksums.  RPM and Migration plugin users will
> need to add this back in at 3.11 upgrade time for your systems to continue
> working. Please give on-list feedback on this change.
> ## Background
> Pulp has the ALLOWED_CONTENT_CHECKSUMS setting [0] which, by default,
> currently includes md5, sha-1, sha-224, sha-256, sha-384, and sha-512. Pulp
> code is restricted to only using hashers from this list. This feature gives
> admins the ability to prohibit hashers they do not trust. Pulp uses these
> checksums for package integrity verification purposes when syncing and
> publishing content.
> ## Motivation
> We need to make Pulp secure by default. MD5 is known to be insecure, and
> therefore it is unsafe for Pulp to allow its use for calculating package
> integrity by default. SHA-1 is widely believed to be insecure, or will be
> soon, and should not be allowed by default for the same reason.
> ## Proposal
> Pulpcore 3.11 would remove md5 and sha-1 from the default list of allowed
> checksums, leaving sha-224..sha-512. Specifically this change is occuring
> in the `ALLOWED_CONTENT_CHECKSUMS` setting [0]. This is only a change to
> the default settings; any specific system can be configured as desired.
> Nothing is "being taken away".
> ## Required User Action with 3.11
> We believe both RPM plugin users and Migration plugin users will be
> impacted by this and mostly from the SHA-1 removal. SHA-1 is still used on
> a variety of CDNs including Red Hat's. Also as data is migrated from Pulp2
> systems, this also likely uses SHA-1 and MD5 as the migration plugin runs.
> If users are using the defaults for `ALLOWED_CONTENT_CHECKSUMS` and want
> to continue using SHA-1, they will need to update
> `ALLOWED_CONTENT_CHECKSUMS` in their settings file. Alternatively, users
> will need to run `pulpcore-manager handle-artifact-checksums` after upgrade
> to update any existing artifacts after upgrading.
> ## Why not automate this?
> We do not take manual user action at upgrade time lightly. However, this
> is a security change, and we believe we need each Pulp system to opt-in for
> themselves.
> [0]:
> https://docs.pulpproject.org/pulpcore/settings.html#allowed-content-checksums
> Thanks!
> The Pulpcore Team
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/pulp-dev/attachments/20210310/50a55585/attachment.htm>

More information about the Pulp-dev mailing list