[Pulp-list] pulp 3.7.3 sync with checksum error

Brian Bouterse bmbouter at redhat.com
Tue Mar 16 16:44:11 UTC 2021


This doesn't help you today, but I think this type of use case is what
motivates an API call like this one that is being discussed:
https://pulp.plan.io/issues/8372

On Tue, Mar 16, 2021 at 12:30 PM Bin Li (BLOOMBERG/ 120 PARK) <
bli111 at bloomberg.net> wrote:

> I tried to read content from v3/content/, but there is too much content to
> list. I am not sure whether I can specify a regex, so I queried the database
> directly to see if I could find the package that originally caused the
> issue. The query returns 0 rows. It looks like the package was cleaned out,
> unless some other content is causing this issue. Let me know if there is
> anything else I can try.
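
Rather than listing everything or matching with a regex, the content API can usually be narrowed with query-string filters. The sketch below builds such a request in Python. It assumes the pulp_rpm package endpoint `/pulp/api/v3/content/rpm/packages/`, its `name` filter, and the `limit` pagination parameter; the host, credentials, and function names are placeholders, so check the API docs of the installed version before relying on it.

```python
import base64
import json
import urllib.parse
import urllib.request

# Assumed endpoint of the pulp_rpm plugin; verify against your installation.
PACKAGES_PATH = "/pulp/api/v3/content/rpm/packages/"


def package_query_url(base_url, name, limit=5):
    """Build a URL that asks only for packages with an exact name match."""
    query = urllib.parse.urlencode({"name": name, "limit": limit})
    return f"{base_url.rstrip('/')}{PACKAGES_PATH}?{query}"


def fetch_packages(base_url, name, username, password):
    """Run the query with HTTP basic auth and return the decoded JSON body."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    request = urllib.request.Request(
        package_query_url(base_url, name),
        headers={"Authorization": f"Basic {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only prints the URL; fetch_packages() needs a reachable Pulp server.
    print(package_query_url("http://localhost", "flume"))
```

An empty result from such a query would agree with the database query returning no rows.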
>
> Below is what I have tried.
>
> The original error:
> Received checksum b8b257c32135daf51e703d439594f1a676871d7d for
> http://something/something/flume-1.9.0-1.noarch.rpm but expected
> c281a94a354178c42800d47b63479c2621772351
>
> => select name from rpm_package where name like 'flume%' limit 100;
> name
> ------
> (0 rows)
>
> => select checksum from rpm_checksum where checksum like
> '%594f1a676871d7d' OR checksum like '%63479c2621772351';
> checksum
> ----------
> (0 rows)
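
Both checksums in the error are 40 hexadecimal characters long, which matches the length of a SHA-1 digest. If a copy of the RPM can be downloaded by hand, hashing it locally shows whether the file or the repodata is out of date. This is a minimal sketch using only the standard library; the helper names are illustrative, and the digest-length table is a heuristic rather than anything Pulp-specific.

```python
import hashlib

# Hex digest length -> hashlib algorithm name (heuristic, not exhaustive).
ALGORITHM_BY_HEX_LENGTH = {
    32: "md5",
    40: "sha1",
    56: "sha224",
    64: "sha256",
    96: "sha384",
    128: "sha512",
}


def guess_algorithm(hex_digest):
    """Guess the hash algorithm from the digest length, or return None."""
    return ALGORITHM_BY_HEX_LENGTH.get(len(hex_digest.strip()))


def file_digest(path, algorithm, chunk_size=1024 * 1024):
    """Hash a file in chunks so large RPMs need not fit in memory."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def matches(path, expected_hex):
    """Return True if the file's digest equals the expected hex digest."""
    algorithm = guess_algorithm(expected_hex)
    if algorithm is None:
        raise ValueError("unrecognised digest length")
    return file_digest(path, algorithm) == expected_hex.strip().lower()


if __name__ == "__main__":
    received = "b8b257c32135daf51e703d439594f1a676871d7d"
    expected = "c281a94a354178c42800d47b63479c2621772351"
    print(guess_algorithm(received), guess_algorithm(expected))
```

Calling `matches()` on the downloaded file with each of the two values from the error message would show which one the file currently served by upstream corresponds to.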
>
>
>
>
> From: dalley at redhat.com At: 03/15/21 11:03:48
> To: Bin Li (BLOOMBERG/ 120 PARK ) <bli111 at bloomberg.net>
> Cc: daviddavis at redhat.com, pulp-list at redhat.com
> Subject: Re: [Pulp-list] pulp 3.7.3 sync with checksum error
>
> Do you know if that package could possibly have been present in any other
> repositories also?  If you know which content unit it is, try accessing it
> via the content API after having done the orphan cleanup.  If it still
> exists, it wasn't cleaned up for some reason, which may mean it's used by
> some other repository.
>
> This is... interesting.  Pulp seems to be attempting to save the package,
> hitting an IntegrityError because it already exists (expected), and then
> trying to retrieve the package, and not being able to find it.
>
> Please file an issue with all the information you've posted so far, and we
> will look into how this could be happening.
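
The save, IntegrityError, retrieve sequence described above can be illustrated with a toy example. The sketch below uses an in-memory SQLite table, not Pulp's actual models or code, and it is not a diagnosis of this report. It only shows one hypothetical way such a sequence can end in a failed lookup: the uniqueness constraint covers one set of columns while the follow-up query matches on another. The table, column, and function names are made up for illustration.

```python
import sqlite3


def make_db():
    """Create an in-memory table where the package name alone must be unique."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE package (name TEXT UNIQUE, checksum TEXT)")
    return conn


def save_or_fetch(conn, name, checksum):
    """Try to insert a row; on a uniqueness conflict, look up the existing one."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO package (name, checksum) VALUES (?, ?)",
                (name, checksum),
            )
    except sqlite3.IntegrityError:
        # A row with this name already exists; fall through to the lookup.
        pass
    row = conn.execute(
        "SELECT name, checksum FROM package WHERE name = ? AND checksum = ?",
        (name, checksum),
    ).fetchone()
    if row is None:
        # The insert was rejected, yet no row matches the full lookup key.
        raise LookupError("Package matching query does not exist.")
    return row


if __name__ == "__main__":
    db = make_db()
    print(save_or_fetch(db, "flume", "aaa"))  # inserted, then found
    try:
        save_or_fetch(db, "flume", "bbb")  # same name, different checksum
    except LookupError as error:
        print("lookup failed:", error)
```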
>
> On Mon, Mar 15, 2021 at 9:37 AM Bin Li (BLOOMBERG/ 120 PARK) <
> bli111 at bloomberg.net> wrote:
>
>> I deleted the repo which failed to sync and ran "delete
>> localhost/pulp/api/v3/orphans/", but I am still getting the same messages.
>>
>>
>> From: Bin Li (BLOOMBERG/ 120 PARK) At: 03/12/21 12:18:44
>> To: dalley at redhat.com
>> Cc: daviddavis at redhat.com, pulp-list at redhat.com
>> Subject: Re: [Pulp-list] pulp 3.7.3 sync with checksum error
>>
>> Hi Dan,
>>
>> Here is the traceback.
>>
>> "error": {
>> "description": "Package matching query does not exist.",
>> "traceback": " File
>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/rq/worker.py\",
>> line 886, in perform_job\n rv = job.perform()\n File
>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/rq/job.py\",
>> line 664, in perform\n self._result = self._execute()\n File
>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/rq/job.py\",
>> line 670, in _execute\n return self.func(*self.args, **self.kwargs)\n File
>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/pulp_rpm/app/tasks/synchronizing.py\",
>> line 266, in synchronize\n dv.create()\n File
>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/pulpcore/plugin/stages/declarative_version.py\",
>> line 148, in create\n loop.run_until_complete(pipeline)\n File
>> \"/opt/python/3.7.3/lib64/python3.7/asyncio/base_events.py\", line 584, in
>> run_until_complete\n return future.result()\n File
>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/pulpcore/plugin/stages/api.py\",
>> line 225, in create_pipeline\n await asyncio.gather(*futures)\n File
>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/pulpcore/plugin/stages/api.py\",
>> line 43, in __call__\n await self.run()\n File
>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/pulpcore/plugin/stages/content_stages.py\",
>> line 105, in run\n d_content.content.q()\n File
>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/db/models/manager.py\",
>> line 82, in manager_method\n return getattr(self.get_queryset(),
>> name)(*args, **kwargs)\n File
>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/db/models/query.py\",
>> line 408, in get\n self.model._meta.object_name\n"
>>
>>
>> I will try removing the repo first and deleting orphans.
>>
>>
>>
>> From: dalley at redhat.com At: 03/12/21 11:19:17
>> To: Bin Li (BLOOMBERG/ 120 PARK ) <bli111 at bloomberg.net>
>> Cc: daviddavis at redhat.com, pulp-list at redhat.com
>> Subject: Re: [Pulp-list] pulp 3.7.3 sync with checksum error
>>
>> Hi Bin,
>>
>> It's difficult to understand what exactly this error is.  Is it an error
>> message being printed out nicely, or part of a Python exception bubbling
>> up?  And if it's the latter, do you have the rest of the traceback?
>>
>> You can't manually delete specific content units but you can delete
>> "orphan" content units that aren't part of any repository.  So if you know
>> the content unit in question, you can delete it from your repositories, and
>> then run orphan cleanup.
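
For reference, the orphan cleanup step can be scripted. The sketch below prepares a DELETE request against the `/pulp/api/v3/orphans/` endpoint mentioned elsewhere in this thread, using only the Python standard library. HTTP basic auth, the host, the credentials, and the function names are assumptions for illustration, and the shape of the response (expected to reference a cleanup task) should be checked against the installed version.

```python
import base64
import json
import urllib.request

# Endpoint as quoted in this thread for Pulp 3.7.
ORPHANS_PATH = "/pulp/api/v3/orphans/"


def orphan_cleanup_request(base_url, username, password):
    """Prepare (but do not send) a DELETE request for orphan cleanup."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        base_url.rstrip("/") + ORPHANS_PATH,
        method="DELETE",
        headers={"Authorization": f"Basic {token}"},
    )


def run_orphan_cleanup(base_url, username, password):
    """Send the request and return the decoded JSON response body."""
    request = orphan_cleanup_request(base_url, username, password)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only builds the request; run_orphan_cleanup() needs a reachable server.
    prepared = orphan_cleanup_request("http://localhost", "admin", "password")
    print(prepared.get_method(), prepared.full_url)
```

Cleanup only removes content that no repository version references, so the unit has to be removed from every repository first, as described above.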
>>
>> On Thu, Mar 11, 2021 at 11:27 AM Bin Li (BLOOMBERG/ 120 PARK) <
>> bli111 at bloomberg.net> wrote:
>>
>>> If inconsistent repo data can leave Pulp in an unrecoverable state, this
>>> is very difficult to prevent. Any inconsistent update from upstream could
>>> potentially cause it. At this point, we are thinking of restoring the
>>> database from the backup made before this sync issue happened. Will this
>>> approach work?
>>>
>>>
>>> From: Bin Li (BLOOMBERG/ 120 PARK) At: 03/10/21 14:41:38
>>> To: daviddavis at redhat.com
>>> Cc: bmbouter at redhat.com, pulp-list at redhat.com
>>> Subject: Re: [Pulp-list] pulp 3.7.3 sync with checksum error
>>>
>>>
>>> We know the name of the rpm whose checksum differs from the repodata and
>>> caused the sync failure earlier. I am guessing the current issue is caused
>>> by this rpm. Is there any way we can remove it from the database?
>>>
>>> From: Bin Li (BLOOMBERG/ 120 PARK) At: 03/09/21 17:47:17
>>> To: daviddavis at redhat.com
>>> Cc: bmbouter at redhat.com, pulp-list at redhat.com
>>> Subject: Re: [Pulp-list] pulp 3.7.3 sync with checksum error
>>>
>>> Please ignore the last message. It is irrelevant.
>>> We were actually able to sync the upstream repo successfully after we reset
>>> the database. The question is how do we recover from the previous failure?
>>> We keep getting "Package matching query does not exist." without resetting
>>> the database. Recreating the repo didn't help either.
>>>
>>> From: Bin Li (BLOOMBERG/ 120 PARK) At: 03/09/21 17:06:53
>>> To: daviddavis at redhat.com
>>> Cc: bmbouter at redhat.com, pulp-list at redhat.com
>>> Subject: Re: [Pulp-list] pulp 3.7.3 sync with checksum error
>>>
>>> It looks like the last error caused the sync process to fail. I reset the
>>> db on a dev host, tried to sync the same upstream repo, and got
>>> "An error occurred (QuotaExceeded) when calling the PutObject
>>> operation: Unknown"
>>>
>>> Any idea how to fix this?
>>>
>>>
>>> "traceback": " File
>>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/rq/worker.py\",
>>> line 886, in perform_job\n rv = job.perform()\n File
>>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/rq/job.py\",
>>> line 664, in perform\n self._result = self._execute()\n File
>>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/rq/job.py\",
>>> line 670, in _execute\n return self.func(*self.args, **self.kwargs)\n File
>>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/pulp_rpm/app/tasks/synchronizing.py\",
>>> line 266, in synchronize\n dv.create()\n File
>>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/pulpcore/plugin/stages/declarative_version.py\",
>>> line 148, in create\n loop.run_until_complete(pipeline)\n File
>>> \"/opt/python/3.7.3/lib64/python3.7/asyncio/base_events.py\", line 584, in
>>> run_until_complete\n return future.result()\n File
>>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/pulpcore/plugin/stages/api.py\",
>>> line 225, in create_pipeline\n await asyncio.gather(*futures)\n File
>>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/pulpcore/plugin/stages/api.py\",
>>> line 43, in __call__\n await self.run()\n File
>>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/pulpcore/plugin/stages/artifact_stages.py\",
>>> line 219, in run\n d_artifact.artifact for d_artifact in da_to_save\n File
>>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/pulpcore/app/models/content.py\",
>>> line 87, in bulk_get_or_create\n return super().bulk_create(objs,
>>> batch_size=batch_size)\n File
>>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/db/models/manager.py\",
>>> line 82, in manager_method\n return getattr(self.get_queryset(),
>>> name)(*args, **kwargs)\n File
>>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/db/models/query.py\",
>>> line 468, in bulk_create\n self._batched_insert(objs_with_pk, fields,
>>> batch_size, ignore_conflicts=ignore_conflicts)\n File
>>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/db/models/query.py\",
>>> line 1204, in _batched_insert\n ignore_conflicts=ignore_conflicts,\n File
>>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/db/models/query.py\",
>>> line 1186, in _insert\n return
>>> query.get_compiler(using=using).execute_sql(return_id)\n File
>>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/db/models/sql/compiler.py\",
>>> line 1376, in execute_sql\n for sql, params in self.as_sql():\n File
>>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django_readonly_field/compiler.py\",
>>> line 31, in as_sql\n return super(ReadonlySQLCompilerMixin,
>>> self).as_sql()\n File
>>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/db/models/sql/compiler.py\",
>>> line 1320, in as_sql\n for obj in self.query.objs\n File
>>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/db/models/sql/compiler.py\",
>>> line 1320, in <listcomp>\n for obj in self.query.objs\n File
>>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/db/models/sql/compiler.py\",
>>> line 1319, in <listcomp>\n [self.prepare_value(field,
>>> self.pre_save_val(field, obj)) for field in fields]\n File
>>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/db/models/sql/compiler.py\",
>>> line 1270, in pre_save_val\n return field.pre_save(obj, add=True)\n File
>>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/pulpcore/app/models/fields.py\",
>>> line 68, in pre_save\n return super().pre_save(model_instance, add)\n File
>>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/db/models/fields/files.py\",
>>> line 288, in pre_save\n file.save(file.name, file.file, save=False)\n
>>> File
>>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/db/models/fields/files.py\",
>>> line 87, in save\n self.name = self.storage.save(name, content,
>>> max_length=self.field.max_length)\n File
>>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/django/core/files/storage.py\",
>>> line 52, in save\n return self._save(name, content)\n File
>>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/storages/backends/s3boto3.py\",
>>> line 447, in _save\n obj.upload_fileobj(content, ExtraArgs=params)\n File
>>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/boto3/s3/inject.py\",
>>> line 621, in object_upload_fileobj\n ExtraArgs=ExtraArgs,
>>> Callback=Callback, Config=Config)\n File
>>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/boto3/s3/inject.py\",
>>> line 539, in upload_fileobj\n return future.result()\n File
>>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/s3transfer/futures.py\",
>>> line 106, in result\n return self._coordinator.result()\n File
>>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/s3transfer/futures.py\",
>>> line 265, in result\n raise self._exception\n File
>>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/s3transfer/tasks.py\",
>>> line 126, in __call__\n return self._execute_main(kwargs)\n File
>>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/s3transfer/tasks.py\",
>>> line 150, in _execute_main\n return_value = self._main(**kwargs)\n File
>>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/s3transfer/upload.py\",
>>> line 692, in _main\n client.put_object(Bucket=bucket, Key=key, Body=body,
>>> **extra_args)\n File
>>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/botocore/client.py\",
>>> line 357, in _api_call\n return self._make_api_call(operation_name,
>>> kwargs)\n File
>>> \"/opt/utils/venv/pulp/3.7.3/lib64/python3.7/site-packages/botocore/client.py\",
>>> line 676, in _make_api_call\n raise error_class(parsed_response,
>>> operation_name)\n"
>>>
>>>
>>> From: Bin Li (BLOOMBERG/ 120 PARK) At: 03/09/21 11:34:57
>>> To: daviddavis at redhat.com
>>> Cc: bmbouter at redhat.com, pulp-list at redhat.com
>>> Subject: Re: [Pulp-list] pulp 3.7.3 sync with checksum error
>>>
>>> Got a new error, "Package matching query does not exist.". Is this also
>>> related to the upstream repo? Can we have more details when this happens?
>>>
>>> From: daviddavis at redhat.com At: 03/05/21 15:06:40
>>> To: Bin Li (BLOOMBERG/ 120 PARK ) <bli111 at bloomberg.net>
>>> Cc: bmbouter at redhat.com, pulp-list at redhat.com
>>> Subject: Re: [Pulp-list] pulp 3.7.3 sync with checksum error
>>>
>>> Great, thanks for the update.
>>>
>>> David
>>>
>>>
>>> On Fri, Mar 5, 2021 at 2:47 PM Bin Li (BLOOMBERG/ 120 PARK) <
>>> bli111 at bloomberg.net> wrote:
>>>
>>>> Thanks Dave. Got the filename which has the inconsistent checksum after
>>>> patching. We will ask upstream remote to update the repodata.
>>>>
>>>> From: daviddavis at redhat.com At: 03/05/21 12:42:56
>>>> To: bmbouter at redhat.com
>>>> Cc: Bin Li (BLOOMBERG/ 120 PARK ) <bli111 at bloomberg.net>,
>>>> pulp-list at redhat.com
>>>> Subject: Re: [Pulp-list] pulp 3.7.3 sync with checksum error
>>>>
>>>> Regarding the error message, I've observed the problem myself and have
>>>> filed an issue:
>>>>
>>>> https://pulp.plan.io/issues/8357
>>>>
>>>> In the meantime, if you can patch the code, this should give you more
>>>> info:
>>>>
>>>> https://gist.github.com/daviddavis/2e6ab1872d97230d144a6cd1f9d05e31
>>>>
>>>> David
>>>>
>>>>
>>>> On Fri, Mar 5, 2021 at 12:35 PM Brian Bouterse <bmbouter at redhat.com>
>>>> wrote:
>>>>
>>>>> Did this happen inside a task? Did you see a traceback for it also?
>>>>>
>>>>> On Fri, Mar 5, 2021 at 12:00 PM Bin Li (BLOOMBERG/ 120 PARK) <
>>>>> bli111 at bloomberg.net> wrote:
>>>>>
>>>>>> The sync process gave an error "A file failed validation due to
>>>>>> checksum". Is this error caused by the remote repo? Is there a way to
>>>>>> find out which file caused the issue?
>>>>>> _______________________________________________
>>>>>> Pulp-list mailing list
>>>>>> Pulp-list at redhat.com
>>>>>> https://listman.redhat.com/mailman/listinfo/pulp-list
>>>>>
>>>>
>>>>
>>>>
>>
>>
>>