[Pulp-list] pulp 3.7.3 sync with checksum error

Bin Li (BLOOMBERG/ 120 PARK) bli111 at bloomberg.net
Tue Mar 23 14:04:00 UTC 2021


I found the package in the repodata xxx-primary.xml of the upstream repo but I don't see the rpm in the repo.

In primary.xml
  <package type="rpm">
    <name>protobuf</name>
    <arch>x86_64</arch>
    <version epoch="0" ver="2.5.0" rel="8.el7"/>
    <checksum type="sha" pkgid="YES">6c6abbab55502947f3139e42b4ba32bebf87eb99</checksum>
    <summary>Protocol Buffers - Google's data interchange format</summary>
    <description>Protocol Buffers are a way of encoding structured data in an efficient
yet extensible format. Google uses Protocol Buffers for almost all of
its internal RPC protocols and file formats.

Protocol buffers are a flexible, efficient, automated mechanism for
serializing structured data – think XML, but smaller, faster, and
simpler. You define how you want your data to be structured once, then
you can use special generated source code to easily write and read
your structured data to and from a variety of data streams and using a
variety of languages. You can even update your data structure without
breaking deployed programs that are compiled against the "old" format.</description>
    <packager>Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla></packager>
    <url>http://code.google.com/p/protobuf/</url>
    <time file="1558030645410" build="1439394389"/>
    <size package="346608" installed="1160420" archive="1161616"/>
    <location href="ndismdns/ndismdns-stock-protobuf-2.5.0-8.el7.x86_64.rpm"/>
    <format>
      <rpm:license>BSD</rpm:license>
      <rpm:vendor>Red Hat, Inc.</rpm:vendor>
      <rpm:group>Development/Libraries</rpm:group>
      <rpm:buildhost>x86-019.build.eng.bos.redhat.com</rpm:buildhost>
      <rpm:sourcerpm>protobuf-2.5.0-8.el7.src.rpm</rpm:sourcerpm>
      <rpm:header-range start="1384" end="9092"/>
      <rpm:provides>
        <rpm:entry name="libprotobuf.so.8()(64bit)"/>
        <rpm:entry name="protobuf" flags="EQ" epoch="0" ver="2.5.0" rel="8.el7"/>
        <rpm:entry name="protobuf(x86-64)" flags="EQ" epoch="0" ver="2.5.0" rel="8.el7"/>
...


Here is more information from PostgreSQL.

sqld=> select name, epoch, version, release, arch, checksum_type  from rpm_package where "pkgId"='6c6abbab55502947f3139e42b4ba32bebf87eb99'
;
   name   | epoch | version | release |  arch  | checksum_type 
----------+-------+---------+---------+--------+---------------
 protobuf | 0     | 2.5.0   | 8.el7   | x86_64 | sha1
(1 row)


From: Bin Li (BLOOMBERG/ 120 PARK) At: 03/23/21 09:09:17 To: daviddavis at redhat.com
Cc:  dalley at redhat.com,  pulp-list at redhat.com
Subject: Re: [Pulp-list] pulp 3.7.3 sync with checksum error
Thanks Dave. The error below points to the protobuf package. I wonder whether we can resolve the duplicate key.

sqld=> select name from rpm_package where "pkgId"='6c6abbab55502947f3139e42b4ba32bebf87eb99' 
;
   name   
----------
 protobuf
(1 row)


Mar 22 17:01:51 pulp-dev1 rq: pulp: rq.worker:ERROR: django.db.utils.IntegrityError: duplicate key value violates unique constraint "rpm_package_pkgId_key"
Mar 22 17:01:51 pulp-dev1 rq: DETAIL:  Key ("pkgId")=(6c6abbab55502947f3139e42b4ba32bebf87eb99) already exists.
Mar 22 17:01:51 pulp-dev1 rq: Traceback (most recent call last):
Mar 22 17:01:51 pulp-dev1 rq: File "/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/db/backends/utils.py", line 85, in _execute
Mar 22 17:01:51 pulp-dev1 rq: return self.cursor.execute(sql, params)
Mar 22 17:01:51 pulp-dev1 rq: psycopg2.errors.UniqueViolation: duplicate key value violates unique constraint "rpm_package_pkgId_key"
Mar 22 17:01:51 pulp-dev1 rq: DETAIL:  Key ("pkgId")=(6c6abbab55502947f3139e42b4ba32bebf87eb99) already exists.
Mar 22 17:01:51 pulp-dev1 rq: The above exception was the direct cause of the following exception:
Mar 22 17:01:51 pulp-dev1 rq: Traceback (most recent call last):
Mar 22 17:01:51 pulp-dev1 rq: File "/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/rq/worker.py", line 886, in perform_job
Mar 22 17:01:51 pulp-dev1 rq: rv = job.perform()
Mar 22 17:01:51 pulp-dev1 rq: File "/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/rq/job.py", line 664, in perform
Mar 22 17:01:51 pulp-dev1 rq: self._result = self._execute()
Mar 22 17:01:51 pulp-dev1 rq: File "/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/rq/job.py", line 670, in _execute
Mar 22 17:01:51 pulp-dev1 rq: return self.func(*self.args, **self.kwargs)
Mar 22 17:01:51 pulp-dev1 rq: File "/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/pulp_rpm/app/tasks/synchronizing.py", line 266, in synchronize
Mar 22 17:01:51 pulp-dev1 rq: dv.create()
Mar 22 17:01:51 pulp-dev1 rq: File "/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/pulpcore/plugin/stages/declarative_version.py", line 148, in create
Mar 22 17:01:51 pulp-dev1 rq: loop.run_until_complete(pipeline)
Mar 22 17:01:51 pulp-dev1 rq: File "/opt/python/3.7.3/lib64/python3.7/asyncio/base_events.py", line 584, in run_until_complete
Mar 22 17:01:51 pulp-dev1 rq: return future.result()
Mar 22 17:01:51 pulp-dev1 rq: File "/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/pulpcore/plugin/stages/api.py", line 225, in create_pipeline
Mar 22 17:01:51 pulp-dev1 rq: await asyncio.gather(*futures)
Mar 22 17:01:51 pulp-dev1 rq: File "/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/pulpcore/plugin/stages/api.py", line 43, in __call__
Mar 22 17:01:51 pulp-dev1 rq: await self.run()
Mar 22 17:01:51 pulp-dev1 rq: File "/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/pulpcore/plugin/stages/content_stages.py", line 114, in run
Mar 22 17:01:51 pulp-dev1 rq: raise e
Mar 22 17:01:51 pulp-dev1 rq: File "/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/pulpcore/plugin/stages/content_stages.py", line 103, in run
Mar 22 17:01:51 pulp-dev1 rq: d_content.content.save()
Mar 22 17:01:51 pulp-dev1 rq: File "/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/pulpcore/app/models/base.py", line 115, in save
Mar 22 17:01:51 pulp-dev1 rq: return super().save(*args, **kwargs)
Mar 22 17:01:51 pulp-dev1 rq: File "/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django_lifecycle/mixins.py", line 129, in save
Mar 22 17:01:51 pulp-dev1 rq: save(*args, **kwargs)
Mar 22 17:01:51 pulp-dev1 rq: File "/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/db/models/base.py", line 744, in save
Mar 22 17:01:51 pulp-dev1 rq: force_update=force_update, update_fields=update_fields)
Mar 22 17:01:51 pulp-dev1 rq: File "/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/db/models/base.py", line 782, in save_base
Mar 22 17:01:51 pulp-dev1 rq: force_update, using, update_fields,
Mar 22 17:01:51 pulp-dev1 rq: File "/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/db/models/base.py", line 873, in _save_table
Mar 22 17:01:51 pulp-dev1 rq: result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)
Mar 22 17:01:51 pulp-dev1 rq: File "/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/db/models/base.py", line 911, in _do_insert
Mar 22 17:01:51 pulp-dev1 rq: using=using, raw=raw)
Mar 22 17:01:51 pulp-dev1 rq: File "/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/db/models/manager.py", line 82, in manager_method
Mar 22 17:01:51 pulp-dev1 rq: return getattr(self.get_queryset(), name)(*args, **kwargs)
Mar 22 17:01:51 pulp-dev1 rq: File "/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/db/models/query.py", line 1186, in _insert
Mar 22 17:01:51 pulp-dev1 rq: return query.get_compiler(using=using).execute_sql(return_id)
Mar 22 17:01:51 pulp-dev1 rq: File "/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/db/models/sql/compiler.py", line 1377, in execute_sql
Mar 22 17:01:51 pulp-dev1 rq: cursor.execute(sql, params)
Mar 22 17:01:51 pulp-dev1 rq: File "/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/db/backends/utils.py", line 67, in execute
Mar 22 17:01:51 pulp-dev1 rq: return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
Mar 22 17:01:51 pulp-dev1 rq: File "/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/db/backends/utils.py", line 76, in _execute_with_wrappers
Mar 22 17:01:51 pulp-dev1 rq: return executor(sql, params, many, context)
Mar 22 17:01:51 pulp-dev1 rq: File "/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/db/backends/utils.py", line 85, in _execute
Mar 22 17:01:51 pulp-dev1 rq: return self.cursor.execute(sql, params)
Mar 22 17:01:51 pulp-dev1 rq: File "/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/db/utils.py", line 89, in __exit__
Mar 22 17:01:51 pulp-dev1 rq: raise dj_exc_value.with_traceback(traceback) from exc_value
Mar 22 17:01:51 pulp-dev1 rq: File "/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/db/backends/utils.py", line 85, in _execute
Mar 22 17:01:51 pulp-dev1 rq: return self.cursor.execute(sql, params)
Mar 22 17:01:51 pulp-dev1 rq: django.db.utils.IntegrityError: duplicate key value violates unique constraint "rpm_package_pkgId_key"
Mar 22 17:01:51 pulp-dev1 rq: DETAIL:  Key ("pkgId")=(6c6abbab55502947f3139e42b4ba32bebf87eb99) already exists.
Mar 22 17:01:51 pulp-dev1 rq: pulp: rq.worker:INFO: 51823 at pulp-dev1.bloomberg.com: e8c66486-94ec-431f-9c7e-42b0120078d4
Mar 22 17:01:52 pulp-dev1 rq: pulp: rq.worker:INFO: 51823 at pulp-dev1.bloomberg.com: Job OK (e8c66486-94ec-431f-9c7e-42b0120078d4)


From: daviddavis at redhat.com At: 03/19/21 16:27:19 To: Bin Li (BLOOMBERG/ 120 PARK)
Cc:  dalley at redhat.com,  pulp-list at redhat.com
Subject: Re: [Pulp-list] pulp 3.7.3 sync with checksum error

I'd be curious about the data that's causing this issue. Can you try applying this patch and rerunning sync? It should give you more info about the existing package that's causing the conflict.

https://gist.github.com/daviddavis/3716ff3a988be8a5d797da136c7b90bf

David

On Fri, Mar 19, 2021 at 3:27 PM Bin Li (BLOOMBERG/ 120 PARK) <bli111 at bloomberg.net> wrote:


The core_taskreservedresource table is actually empty after restoring the database. I found 2 tasks in the waiting state from 10 months ago. I got a 409 Conflict error when I tried to delete them through the API, so I deleted both tasks from the core_task table. The sync job was then able to run, but it failed with the same "Package matching query does not exist" error. It looks like the issue existed before we tried to sync a week ago. I am not sure what else we can do except reset the database from scratch. We don't add any packages locally; all packages are synced from upstream. The only concern is how to keep the distributions serving the same content. Some of our distributions point to an older version of the repo. Is there a way to make these distributions have the same content after resetting the database?


From: dalley at redhat.com At: 03/19/21 11:08:32 To: Bin Li (BLOOMBERG/ 120 PARK)
Cc:  daviddavis at redhat.com,  pulp-list at redhat.com
Subject: Re: [Pulp-list] pulp 3.7.3 sync with checksum error

I believe the correct procedure is to stop the workers, delete all of the items in the "reserved_resources" table, and restart the workers.

You are still using Pulpcore 3.7, correct?  Once this is sorted, I would recommend trying to upgrade... which I'm sure doesn't sound appealing right now... but a lot of work was done in the 3.9 - 3.10 timeframe to prevent tasking system deadlocks like this, as well as some improvements around being able to check for corrupted artifacts.  

https://yum.theforeman.org/pulpcore/3.9/

On Fri, Mar 19, 2021 at 9:25 AM Bin Li (BLOOMBERG/ 120 PARK) <bli111 at bloomberg.net> wrote:

I tried to restore the database on a dev instance. I was able to recreate the repo and remote with the restored database, but the sync got stuck in the waiting state. No other jobs were running. The dev instance has a different hostname. How do we find out why the sync job is in the waiting state?


From: daviddavis at redhat.com At: 03/18/21 15:30:29 To: dalley at redhat.com
Cc:  Bin Li (BLOOMBERG/ 120 PARK ) ,  pulp-list at redhat.com
Subject: Re: [Pulp-list] pulp 3.7.3 sync with checksum error

I'm guessing there's a package that got saved with a bad pkgId. I think what dalley recommends should work. I'd also make a backup of your current database in case rolling back causes bigger problems.

David

On Thu, Mar 18, 2021 at 2:32 PM Daniel Alley <dalley at redhat.com> wrote:

Restoring postgresql from tape should fix your database, yes.

I'm not 100% sure what happens if the artifact store has untracked files, or missing files that the database expects to be there.  In newer versions of pulpcore we have a "repair" feature to help deal with such issues, but 3.7 predates it.
 
I would say, take a backup of your /var/lib/pulp directory and then try the database restore.

On Thu, Mar 18, 2021 at 10:23 AM Bin Li (BLOOMBERG/ 120 PARK) <bli111 at bloomberg.net> wrote:


We still need to restore our instance to the state it was in before this happened. We already have the PostgreSQL dump from tape. Will restoring the database work in this case?

From: dalley at redhat.com At: 03/17/21 14:45:15 To: Bin Li (BLOOMBERG/ 120 PARK)
Cc:  bmbouter at redhat.com,  pulp-list at redhat.com
Subject: Re: [Pulp-list] pulp 3.7.3 sync with checksum error

Thanks!

On Wed, Mar 17, 2021 at 11:34 AM Bin Li (BLOOMBERG/ 120 PARK) <bli111 at bloomberg.net> wrote:


FYI, I filed a new issue https://pulp.plan.io/issues/8411 to track this.

From: Bin Li (BLOOMBERG/ 120 PARK) At: 03/16/21 15:31:14 To: dalley at redhat.com,  bmbouter at redhat.com
Cc:  pulp-list at redhat.com
Subject: Re: [Pulp-list] pulp 3.7.3 sync with checksum error
I checked the metadata from primary.xml. The sha1sum matches the actual file.

<metadata xmlns="http://linux.duke.edu/metadata/common" xmlns:rpm="http://linux.duke.edu/metadata/rpm">
  <package type="rpm">
    <name>flume</name>
    <arch>noarch</arch>
    <version epoch="0" ver="1.9.0" rel="1"/>
    <checksum type="sha" pkgid="YES">b8b257c32135daf51e703d439594f1a676871d7d</checksum>

 # sha1sum flume-1.9.0-1.noarch.rpm
b8b257c32135daf51e703d439594f1a676871d7d  flume-1.9.0-1.noarch.rpm


From: Bin Li (BLOOMBERG/ 120 PARK) At: 03/16/21 15:06:43 To: bmbouter at redhat.com,  dalley at redhat.com
Cc:  pulp-list at redhat.com
Subject: Re: [Pulp-list] pulp 3.7.3 sync with checksum error
I downloaded the package, but I didn't find its checksum in rpm_checksum.

# sha256sum  flume-1.9.0-1.noarch.rpm
c7fcec6d3385c079af6ed83cb272f52cbe2cb30ca93ed91732b1c8698b2ad76b  flume-1.9.0-1.noarch.rpm

=> select checksum from rpm_checksum where checksum like '%a93ed91732b1c8698b2ad76b';                                          

 checksum 
----------
(0 rows)

Also, there is no result from core_contentartifact:

=>  select * from core_contentartifact where relative_path like '%flume%';
 pulp_id | pulp_created | pulp_last_updated | relative_path | artifact_id | content_id 
---------+--------------+-------------------+---------------+-------------+------------
(0 rows)


The upstream repo was fixed. I had no issues when I synced from a fresh, empty Pulp instance.

From: dalley at redhat.com At: 03/16/21 14:37:29 To: bmbouter at redhat.com
Cc:  Bin Li (BLOOMBERG/ 120 PARK ) ,  pulp-list at redhat.com
Subject: Re: [Pulp-list] pulp 3.7.3 sync with checksum error

Also, the checksum of the package is stored as "pkgId" rather than as "checksum".  Simply because that's what the RPM tools call it.  There's no field actually named "checksum".


pulp=> select * from rpm_package;
 content_ptr_id | name | epoch | version | release | arch | pkgId | checksum_type | summary | description | url | changelogs | files | requires | provides | conflicts | obsoletes | suggests | enhances | recommends | supplements | location
_base | location_href | rpm_buildhost | rpm_group | rpm_license | rpm_packager | rpm_sourcerpm | rpm_vendor | rpm_header_start | rpm_header_end | is_modular | size_archive | size_installed | size_package | time_build | time_file | evr 
----------------+------+-------+---------+---------+------+-------+---------------+---------+-------------+-----+------------+-------+----------+----------+-----------+-----------+----------+----------+------------+-------------+---------
------+---------------+---------------+-----------+-------------+--------------+---------------+------------+------------------+----------------+------------+--------------+----------------+--------------+------------+-----------+-----
(0 rows)


Going back to the original error though, this almost sounds like the file being downloaded doesn't match the checksum it's supposed to have. 


Received checksum b8b257c32135daf51e703d439594f1a676871d7d for http://something/something/flume-1.9.0-1.noarch.rpm but expected c281a94a354178c42800d47b63479c2621772351

Is there any chance you could manually download that file and verify its checksum, to make sure it's not corrupted at the source?
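
For anyone wanting to script that check, here is a minimal sketch using only Python's standard library. The function names are hypothetical and nothing here is part of Pulp; it simply hashes a downloaded file and compares the digest against the pkgId from primary.xml.

```python
import hashlib

def file_digests(path, algorithms=("sha1", "sha256"), chunk_size=1 << 20):
    """Return {algorithm: hex digest} for the file at `path`, read in chunks."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}

def matches_pkgid(path, pkgid, checksum_type="sha1"):
    """True if the file's digest of the given type equals the repodata pkgId."""
    digest = file_digests(path, (checksum_type,))[checksum_type]
    return digest == pkgid.lower()
```

Something like matches_pkgid("flume-1.9.0-1.noarch.rpm", "<pkgId from primary.xml>") would then return True or False. The digest type has to match the one the repodata uses: the sha1sum output elsewhere in this thread equals the value published under type="sha", while a sha256sum of the same file is a different string and will not match a SHA-1 pkgId.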

On Tue, Mar 16, 2021 at 2:24 PM Daniel Alley <dalley at redhat.com> wrote:


I tried to read content from v3/content/. There is too much content to be listed. Not sure if I can specify a regex so I use select from db directly

You can use query parameters when making HTTP calls against Pulp, like so:

GET  .../pulp/api/v3/content/rpm/packages/?name=flume

There's a bunch of options available; they are documented here: https://pulp-rpm.readthedocs.io/en/latest/restapi.html#operation/content_rpm_packages_list
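
As a rough illustration of such a filtered call, the sketch below only builds the request URL. The host name is a placeholder, the packages_url helper is hypothetical, and the path is the RPM package list endpoint as I understand it from the linked docs; only the name filter comes from this thread, so check any other filter names against the documentation.

```python
from urllib.parse import urlencode, urljoin

# Assumed path of the RPM package list endpoint (see the linked REST docs).
PACKAGES_PATH = "/pulp/api/v3/content/rpm/packages/"

def packages_url(base_url, **filters):
    """Build a GET URL for the package list, with filters as query parameters."""
    url = urljoin(base_url, PACKAGES_PATH)
    query = urlencode(sorted(filters.items()))
    return f"{url}?{query}" if query else url

# Hypothetical host; only the `name` filter is taken from this thread.
print(packages_url("http://pulp.example.com", name="flume"))
```

The resulting URL can be fetched with any HTTP client using GET and your Pulp credentials.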

On Tue, Mar 16, 2021 at 12:44 PM Brian Bouterse <bmbouter at redhat.com> wrote:

This doesn't help you today, but I think this type of use case is what motivates an API call like this one that is being discussed: https://pulp.plan.io/issues/8372

On Tue, Mar 16, 2021 at 12:30 PM Bin Li (BLOOMBERG/ 120 PARK) <bli111 at bloomberg.net> wrote:

I tried to read content from v3/content/, but there is too much content to list. I wasn't sure whether I could specify a regex, so I queried the database directly to see if I could find the package that originally caused the issue. The query returned 0 rows. It looks like the package was cleaned out, unless some other content is causing this issue. Let me know if there is anything else I can try.

Below is what I have tried.

The original error:
Received checksum b8b257c32135daf51e703d439594f1a676871d7d for http://something/something/flume-1.9.0-1.noarch.rpm but expected c281a94a354178c42800d47b63479c2621772351

=> select name from rpm_package where name like 'flume%' limit 100;
 name 
------
(0 rows)

=> select checksum from rpm_checksum where checksum like '%594f1a676871d7d' OR checksum like '%63479c2621772351';
 checksum 
----------
(0 rows)


From: dalley at redhat.com At: 03/15/21 11:03:48 To: Bin Li (BLOOMBERG/ 120 PARK)
Cc:  daviddavis at redhat.com,  pulp-list at redhat.com
Subject: Re: [Pulp-list] pulp 3.7.3 sync with checksum error

Do you know if that package could possibly have been present in any other repositories also?  If you know which content unit it is, try accessing it via the content API after having done the orphan cleanup.  If it still exists, it wasn't cleaned up for some reason, which may mean it's used by some other repository.

This is... interesting.  Pulp seems to be attempting to save the package, hitting an IntegrityError because it already exists (expected), and then trying to retrieve the package, and not being able to find it.
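
To make that sequence concrete, here is a toy reproduction with sqlite3. It is not Pulp's code: the table, the save_or_fetch helper, and the choice of lookup fields are all hypothetical. It only shows one way a save can fail on a unique key while the follow-up lookup still finds nothing, namely when the lookup filters on a field that the unique constraint does not cover. The "sha" versus "sha1" difference is an assumption for illustration, not a diagnosis.

```python
import sqlite3

def save_or_fetch(conn, pkg):
    """Try to insert pkg; on a unique-key conflict, look up the existing row."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO package (pkg_id, name, checksum_type) VALUES (?, ?, ?)",
                (pkg["pkg_id"], pkg["name"], pkg["checksum_type"]),
            )
        return "inserted"
    except sqlite3.IntegrityError:
        # The lookup filters on more fields than the unique constraint covers.
        row = conn.execute(
            "SELECT 1 FROM package"
            " WHERE pkg_id = ? AND name = ? AND checksum_type = ?",
            (pkg["pkg_id"], pkg["name"], pkg["checksum_type"]),
        ).fetchone()
        return "found" if row else "does not exist"

def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE package (pkg_id TEXT UNIQUE, name TEXT, checksum_type TEXT)"
    )
    pkg_id = "6c6abbab55502947f3139e42b4ba32bebf87eb99"
    stored = {"pkg_id": pkg_id, "name": "protobuf", "checksum_type": "sha1"}
    incoming = dict(stored, checksum_type="sha")  # differs in one non-unique field
    return [
        save_or_fetch(conn, stored),    # first save succeeds
        save_or_fetch(conn, stored),    # conflict, identical row is found
        save_or_fetch(conn, incoming),  # conflict, but the lookup misses
    ]
```

Calling demo() walks through the three cases in order: a clean insert, a conflict that resolves to the existing row, and a conflict where the lookup comes back empty.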

Please file an issue with all the information you've posted so far, we will look into how this could be happening.

On Mon, Mar 15, 2021 at 9:37 AM Bin Li (BLOOMBERG/ 120 PARK) <bli111 at bloomberg.net> wrote:

I deleted the repo that failed to sync and ran "delete localhost/pulp/api/v3/orphans/", but I am still getting the same messages.


From: Bin Li (BLOOMBERG/ 120 PARK) At: 03/12/21 12:18:44 To: dalley at redhat.com
Cc:  daviddavis at redhat.com,  pulp-list at redhat.com
Subject: Re: [Pulp-list] pulp 3.7.3 sync with checksum error
Hi Dan,

Here is the traceback:

    "error": {
        "description": "Package matching query does not exist.", 
        "traceback": "  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/rq/worker.py\", line 886, in perform_job\n    rv = job.perform()\n  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/rq/job.py\", line 664, in perform\n    self._result = self._execute()\n  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/rq/job.py\", line 670, in _execute\n    return self.func(*self.args, **self.kwargs)\n  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/pulp_rpm/app/tasks/synchronizing.py\", line 266, in synchronize\n    dv.create()\n  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/pulpcore/plugin/stages/declarative_version.py\", line 148, in create\n    loop.run_until_complete(pipeline)\n  File \"/opt/python/3.7.3/lib64/python3.7/asyncio/base_events.py\", line 584, in run_until_complete\n    return future.result()\n  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/pulpcore/plugin/stages/api.py\", line 225, in create_pipeline\n    await asyncio.gather(*futures)\n  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/pulpcore/plugin/stages/api.py\", line 43, in __call__\n    await self.run()\n  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/pulpcore/plugin/stages/content_stages.py\", line 105, in run\n    d_content.content.q()\n  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/db/models/manager.py\", line 82, in manager_method\n    return getattr(self.get_queryset(), name)(*args, **kwargs)\n  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/db/models/query.py\", line 408, in get\n    self.model._meta.object_name\n"


I will try removing the repo first and deleting orphans.


From: dalley at redhat.com At: 03/12/21 11:19:17 To: Bin Li (BLOOMBERG/ 120 PARK)
Cc:  daviddavis at redhat.com,  pulp-list at redhat.com
Subject: Re: [Pulp-list] pulp 3.7.3 sync with checksum error

Hi Bin,

It's difficult to understand what exactly this error is.  Is it an error message being printed out nicely, or part of a Python exception bubbling up?  And if it's the latter, do you have the rest of the traceback?

You can't manually delete specific content units but you can delete "orphan" content units that aren't part of any repository.  So if you know the content unit in question, you can delete it from your repositories, and then run orphan cleanup.
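
A minimal sketch of issuing the orphan cleanup from Python, assuming the DELETE on /pulp/api/v3/orphans/ that is quoted elsewhere in this thread and HTTP basic auth. The host, the credentials, and the orphan_cleanup_request helper are placeholders; the code only constructs the request and does not send it.

```python
import base64
from urllib.request import Request

def orphan_cleanup_request(base_url, username, password):
    """Build (but do not send) a DELETE request for the orphans endpoint."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return Request(
        url=base_url.rstrip("/") + "/pulp/api/v3/orphans/",
        method="DELETE",
        headers={"Authorization": f"Basic {token}"},
    )

# Hypothetical host and credentials, for illustration only.
req = orphan_cleanup_request("http://localhost", "admin", "password")
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen(req) would submit it. Only content that no repository version references counts as orphaned, so the unit has to be removed from every repository first.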

On Thu, Mar 11, 2021 at 11:27 AM Bin Li (BLOOMBERG/ 120 PARK) <bli111 at bloomberg.net> wrote:

If inconsistent repo data can cause Pulp to become unrecoverable, this is very difficult to prevent. Any inconsistent update from upstream could potentially cause this to happen. At this point, we are thinking of restoring the database from the backup taken before this sync issue happened. Will this approach work?


From: Bin Li (BLOOMBERG/ 120 PARK) At: 03/10/21 14:41:38 To: daviddavis at redhat.com
Cc:  bmbouter at redhat.com,  pulp-list at redhat.com
Subject: Re: [Pulp-list] pulp 3.7.3 sync with checksum error

We know the name of the rpm whose checksum differed from the repodata and caused the earlier sync failure. I am guessing the current issue is caused by this rpm. Is there any way we can remove it from the database?

From: Bin Li (BLOOMBERG/ 120 PARK) At: 03/09/21 17:47:17 To: daviddavis at redhat.com
Cc:  bmbouter at redhat.com,  pulp-list at redhat.com
Subject: Re: [Pulp-list] pulp 3.7.3 sync with checksum error
Please ignore the last message. It is irrelevant.
We were actually able to sync the upstream repo successfully after we reset the database. The question is: how do we recover from the previous failure? We keep getting "Package matching query does not exist." unless we reset the database. Recreating the repo didn't help either.

From: Bin Li (BLOOMBERG/ 120 PARK) At: 03/09/21 17:06:53 To: daviddavis at redhat.com
Cc:  bmbouter at redhat.com,  pulp-list at redhat.com
Subject: Re: [Pulp-list] pulp 3.7.3 sync with checksum error
It looks like the last error caused the sync process to fail. I reset the db on a dev host, tried to sync the same upstream repo, and got
"An error occurred (QuotaExceeded) when calling the PutObject operation: Unknown"

Any idea how to fix this?


        "traceback": "  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/rq/worker.py\", line 886, in perform_job\n    rv = job.perform()\n  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/rq/job.py\", line 664, in perform\n    self._result = self._execute()\n  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/rq/job.py\", line 670, in _execute\n    return self.func(*self.args, **self.kwargs)\n  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/pulp_rpm/app/tasks/synchronizing.py\", line 266, in synchronize\n    dv.create()\n  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/pulpcore/plugin/stages/declarative_version.py\", line 148, in create\n    loop.run_until_complete(pipeline)\n  File \"/opt/python/3.7.3/lib64/python3.7/asyncio/base_events.py\", line 584, in run_until_complete\n    return future.result()\n  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/pulpcore/plugin/stages/api.py\", line 225, in create_pipeline\n    await asyncio.gather(*futures)\n  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/pulpcore/plugin/stages/api.py\", line 43, in __call__\n    await self.run()\n  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/pulpcore/plugin/stages/artifact_stages.py\", line 219, in run\n    d_artifact.artifact for d_artifact in da_to_save\n  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/pulpcore/app/models/content.py\", line 87, in bulk_get_or_create\n    return super().bulk_create(objs, batch_size=batch_size)\n  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/db/models/manager.py\", line 82, in manager_method\n    return getattr(self.get_queryset(), name)(*args, **kwargs)\n  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/db/models/query.py\", line 468, in bulk_create\n    self._batched_insert(objs_with_pk, fields, batch_size, ignore_conflicts=ignore_conflicts)\n  File 
\"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/db/models/query.py\", line 1204, in _batched_insert\n    ignore_conflicts=ignore_conflicts,\n  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/db/models/query.py\", line 1186, in _insert\n    return query.get_compiler(using=using).execute_sql(return_id)\n  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/db/models/sql/compiler.py\", line 1376, in execute_sql\n    for sql, params in self.as_sql():\n  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django_readonly_field/compiler.py\", line 31, in as_sql\n    return super(ReadonlySQLCompilerMixin, self).as_sql()\n  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/db/models/sql/compiler.py\", line 1320, in as_sql\n    for obj in self.query.objs\n  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/db/models/sql/compiler.py\", line 1320, in <listcomp>\n    for obj in self.query.objs\n  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/db/models/sql/compiler.py\", line 1319, in <listcomp>\n    [self.prepare_value(field, self.pre_save_val(field, obj)) for field in fields]\n  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/db/models/sql/compiler.py\", line 1270, in pre_save_val\n    return field.pre_save(obj, add=True)\n  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/pulpcore/app/models/fields.py\", line 68, in pre_save\n    return super().pre_save(model_instance, add)\n  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/db/models/fields/files.py\", line 288, in pre_save\n    file.save(file.name, file.file, save=False)\n  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/db/models/fields/files.py\", line 87, in save\n    self.name = self.storage.save(name, content, max_length=self.field.max_length)\n  File 
\"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/core/files/storage.py\", line 52, in save\n    return self._save(name, content)\n  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/storages/backends/s3boto3.py\", line 447, in _save\n    obj.upload_fileobj(content, ExtraArgs=params)\n  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/boto3/s3/inject.py\", line 621, in object_upload_fileobj\n    ExtraArgs=ExtraArgs, Callback=Callback, Config=Config)\n  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/boto3/s3/inject.py\", line 539, in upload_fileobj\n    return future.result()\n  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/s3transfer/futures.py\", line 106, in result\n    return self._coordinator.result()\n  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/s3transfer/futures.py\", line 265, in result\n    raise self._exception\n  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/s3transfer/tasks.py\", line 126, in __call__\n    return self._execute_main(kwargs)\n  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/s3transfer/tasks.py\", line 150, in _execute_main\n    return_value = self._main(**kwargs)\n  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/s3transfer/upload.py\", line 692, in _main\n    client.put_object(Bucket=bucket, Key=key, Body=body, **extra_args)\n  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/botocore/client.py\", line 357, in _api_call\n    return self._make_api_call(operation_name, kwargs)\n  File \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/botocore/client.py\", line 676, in _make_api_call\n    raise error_class(parsed_response, operation_name)\n"

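For reference, here is a minimal sketch of how a local copy of the upstream repo could be checked against its primary.xml, to list packages that are listed in the metadata but missing on disk or whose checksum does not match. It assumes primary.xml has already been decompressed and that repo_root is a local mirror of the repo. The function name check_packages and its parameters are made up for illustration and are not part of Pulp.

```python
import hashlib
import xml.etree.ElementTree as ET
from pathlib import Path

# Default namespace used by the <metadata> root of primary.xml.
NS = {"c": "http://linux.duke.edu/metadata/common"}
# Repodata may use "sha" as an older name for SHA-1.
ALIASES = {"sha": "sha1"}


def check_packages(primary_xml, repo_root):
    """Yield (href, problem) for each package that is missing or mismatched.

    Hypothetical helper: primary_xml is a path to a decompressed
    primary.xml, repo_root is the directory the hrefs are relative to.
    """
    root = ET.parse(str(primary_xml)).getroot()
    for pkg in root.findall("c:package", NS):
        href = pkg.find("c:location", NS).get("href")
        checksum = pkg.find("c:checksum", NS)
        ctype = checksum.get("type")
        algo = ALIASES.get(ctype, ctype)
        path = Path(repo_root) / href
        if not path.is_file():
            yield href, "missing"
            continue
        # Reads the whole file into memory; fine for a sketch.
        digest = hashlib.new(algo, path.read_bytes()).hexdigest()
        if digest != checksum.text.strip():
            yield href, "checksum mismatch"
```

Calling list(check_packages("primary.xml", "/path/to/mirror")) would then give the hrefs to report to whoever maintains the upstream repodata.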

From: Bin Li (BLOOMBERG/ 120 PARK) At: 03/09/21 11:34:57 To:  daviddavis at redhat.com
Cc:  bmbouter at redhat.com,  pulp-list at redhat.com
Subject: Re: [Pulp-list] pulp 3.7.3 sync with checksum error
Got a new error: "Package matching query does not exist.". Is this also related to the upstream repo? Can we get more details when this happens?

From: daviddavis at redhat.com At: 03/05/21 15:06:40 To:  Bin Li (BLOOMBERG/ 120 PARK)
Cc:  bmbouter at redhat.com,  pulp-list at redhat.com
Subject: Re: [Pulp-list] pulp 3.7.3 sync with checksum error

Great, thanks for the update. 

David

On Fri, Mar 5, 2021 at 2:47 PM Bin Li (BLOOMBERG/ 120 PARK) <bli111 at bloomberg.net> wrote:

Thanks Dave. Got the filename which has the inconsistent checksum after patching. We will ask upstream remote to update the repodata.

From: daviddavis at redhat.com At: 03/05/21 12:42:56 To:  bmbouter at redhat.com
Cc:  Bin Li (BLOOMBERG/ 120 PARK),  pulp-list at redhat.com
Subject: Re: [Pulp-list] pulp 3.7.3 sync with checksum error

Regarding the error message, I've observed the problem myself and have filed an issue:

https://pulp.plan.io/issues/8357

In the meantime, if you can patch the code, this should give you more info:

https://gist.github.com/daviddavis/2e6ab1872d97230d144a6cd1f9d05e31

David

On Fri, Mar 5, 2021 at 12:35 PM Brian Bouterse <bmbouter at redhat.com> wrote:

Did this happen inside a task? Did you see a traceback for it also?

On Fri, Mar 5, 2021 at 12:00 PM Bin Li (BLOOMBERG/ 120 PARK) <bli111 at bloomberg.net> wrote:

The sync process gave an error "A file failed validation due to checksum". Is this error caused by the remote repo? Is there a way to find out which file caused the issue?
_______________________________________________
Pulp-list mailing list
Pulp-list at redhat.com
https://listman.redhat.com/mailman/listinfo/pulp-list