
[Pulp-dev] Handling RPM with long filelist in Pulp 2



Currently Pulp is able to import an RPM with a filelist of up to ~14-15 MB, which probably covers most repositories but not all of them.

Historically, for each RPM unit, several potentially large data snippets are stored in the db:
 - XML snippets for RPM metadata
 - parsed filelist
 - parsed changelog

XML snippets are compressed, so they require much less space than a huge parsed filelist or changelog.
Issue [0] tracks the effort to eliminate this limitation, or at least to increase the size of the filelist that Pulp can handle for each RPM.
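
To illustrate the space difference mentioned above, here is a rough sketch that compares a zlib-compressed XML snippet with the same filelist stored in parsed form. The paths, the XML layout and the use of JSON as a stand-in for the parsed representation are all assumptions made for the example; this is not Pulp's actual storage code.

```python
import json
import zlib

# Synthetic filelist: 50,000 hypothetical paths (not taken from a real RPM).
paths = ["/usr/share/example/data/file_%05d.dat" % i for i in range(50000)]

# Simplified XML snippet, loosely modelled on a filelists.xml <package> entry.
xml_snippet = "<package>" + "".join("<file>%s</file>" % p for p in paths) + "</package>"
compressed_xml = zlib.compress(xml_snippet.encode("utf-8"))

# Stand-in for the parsed filelist as it would be kept uncompressed in the db.
parsed = json.dumps(paths).encode("utf-8")

print("parsed filelist bytes:", len(parsed))
print("compressed XML bytes: ", len(compressed_xml))
```

Real filelists are less regular than this synthetic one, so the actual ratio will differ, but the compressed form is generally much smaller.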

The question is how best to handle the issue, keeping in mind that any substantial change or re-design introduces more risk and effort to the Pulp 2 line, while this won't be an issue in Pulp 3 at all.

So far the options are:
 1. Eliminate the issue completely (e.g. by using GridFS)
 2. Increase the current limit for the filelist by removing the parsed version of it from the db
 3. Do not solve it in Pulp 2; wait for Pulp 3, which won't have this issue at all
 4. Any other idea/option

Some additional info:
 - some thoughts and options [1] which were considered several months ago
 - by removing the parsed filelist (and changelog?) from the db, we will make room for really large RPM metadata. Pulp will be able to import any RPM with uncompressed metadata up to ~200MB (~14-15MB currently). For comparison, this is ~1.5 times the combined size of filelists.xml and other.xml for the whole EPEL7 repo.
 - removing this data from the db will affect at least search endpoints like [2], where all the data for a unit is returned in the response.
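
As a rough sketch of what option 2 could look like, the parsed filelist could be rebuilt on demand from the compressed XML snippet instead of being stored next to it. The function names and the simplified XML below are hypothetical and only for illustration; real filelists.xml entries use XML namespaces, which this example ignores.

```python
import zlib
import xml.etree.ElementTree as ET


def store_snippet(xml_text):
    """Return the compressed form that would be kept in the db (hypothetical helper)."""
    return zlib.compress(xml_text.encode("utf-8"))


def files_from_snippet(blob):
    """Decompress a stored snippet and parse the file paths out of it on demand."""
    root = ET.fromstring(zlib.decompress(blob))
    return [el.text for el in root.iter("file")]


# Simplified, namespace-free snippet used only for this example.
snippet = (
    '<package name="demo">'
    "<file>/usr/bin/demo</file>"
    '<file type="dir">/usr/share/demo</file>'
    "</package>"
)

blob = store_snippet(snippet)
files = files_from_snippet(blob)
print(files)
```

With something like this, a search endpoint could still return the filelist, at the cost of decompressing and parsing the snippet for each unit in the response.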
