[Spacewalk-list] Check "task queue" for client on server within database?

Wed Dec 9 00:24:53 UTC 2015

Were there any DBI-related errors in the logs?

On Tue, Dec 8, 2015 at 7:23 PM, Paul Robert Marino <prmarino1 at gmail.com> wrote:
> Wait, I am confused by this bug report: how can you update a read-only
> database? You should have gotten an error from the database when it
> could not write to the journal.
> PostgreSQL would probably have allowed you to keep reading as long
> as the queries fit within the memory boundaries, but you wouldn't be
> able to update the statuses in that case.
> It is different if the journal was on a different file system than the
> data, for example because you have Spacewalk in a separate tablespace.
> I have seen instances where PostgreSQL will commit to the journal but
> fail to update the data. In those cases you need to fix the file
> system, then restart the PostgreSQL service to force a journal
> recovery, and then bounce the Spacewalk services to get them to
> reconnect.
>
> On Tue, Dec 8, 2015 at 3:41 PM, Robert Paschedag
> <robert.paschedag at web.de> wrote:
>> So... finally... after long digging... I found my error.
>>
>> Situation:
>>
>> Nearly all clients stopped picking up tasks from spacewalk at the same time.
>>
>> The problem was a database error caused by a filesystem that had suddenly become read-only. Every affected client had an "open" task in status "picked up", but the client no longer had anything to report back for it.
>>
>> None of the "new" tasks were picked up, because the "queue.get" handler returns without doing anything if the client still has an "open" task.
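
A toy model of the blocking behaviour described above may help readers following along. This is not the Spacewalk handler code: the function name `next_action` and the status codes 0 ("queued") and 2 ("completed") are assumptions for illustration. Only 1 ("picked up") and 3 ("fail") come from this thread.

```python
# Status codes: 1 and 3 are taken from the thread; 0 and 2 are assumed here.
QUEUED, PICKED_UP, COMPLETED, FAILED = 0, 1, 2, 3


def next_action(actions):
    """Return the id of the next queued action, or None.

    `actions` is a list of (action_id, status) pairs for one client.
    If any action is still in "picked up", nothing new is handed out,
    which mirrors the behaviour described for "queue.get".
    """
    if any(status == PICKED_UP for _, status in actions):
        return None
    for action_id, status in actions:
        if status == QUEUED:
            return action_id
    return None
```

With one action stuck in "picked up", `next_action([(10, PICKED_UP), (11, QUEUED)])` returns `None`. Once that action is marked failed, `next_action([(10, FAILED), (11, QUEUED)])` returns `11`.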
>>
>> So the solution was to change the status of the "open" task within the database.
>>
>> I set all of those tasks to "fail" (status code 3) with
>>
>> "update rhnServerActions set status = 3, completed_time = CURRENT_TIMESTAMP where status = 1;"
>>
>> The database is PostgreSQL.
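
For anyone repeating this, a more cautious variant is to count the affected rows first and run the update inside a transaction. The following is only a sketch under stated assumptions: it uses Python's built-in `sqlite3` with an in-memory database as a stand-in for PostgreSQL, the table and column names are copied from the statement above and should be checked against the real schema, and the helper names `fail_stuck_actions` and `demo` plus the sample rows are made up for illustration.

```python
import sqlite3

PICKED_UP = 1  # status code for "picked up", as used in the thread
FAILED = 3     # status code for "fail", as used in the thread


def fail_stuck_actions(conn):
    """Mark every picked-up action as failed and return the number of rows changed."""
    cur = conn.cursor()
    # Preview: how many rows would the update touch?
    cur.execute(
        "SELECT COUNT(*) FROM rhnServerActions WHERE status = ?", (PICKED_UP,)
    )
    expected = cur.fetchone()[0]
    cur.execute(
        "UPDATE rhnServerActions "
        "SET status = ?, completed_time = CURRENT_TIMESTAMP "
        "WHERE status = ?",
        (FAILED, PICKED_UP),
    )
    changed = cur.rowcount
    if changed != expected:
        # Something changed between the preview and the update: back out.
        conn.rollback()
        raise RuntimeError("row count changed between preview and update")
    conn.commit()
    return changed


def demo():
    """Build a throwaway table with sample rows and apply the fix to it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE rhnServerActions ("
        "server_id INTEGER, action_id INTEGER, status INTEGER, completed_time TEXT)"
    )
    # Sample data: statuses 0 and 2 are arbitrary "other" values for the demo.
    conn.executemany(
        "INSERT INTO rhnServerActions (server_id, action_id, status) VALUES (?, ?, ?)",
        [(1, 10, 1), (1, 11, 0), (2, 12, 1), (3, 13, 2)],
    )
    conn.commit()
    changed = fail_stuck_actions(conn)
    statuses = [
        row[0]
        for row in conn.execute(
            "SELECT status FROM rhnServerActions ORDER BY action_id"
        )
    ]
    return changed, statuses
```

Against a real server the same two statements could be run by hand in `psql` between `BEGIN` and `COMMIT`, checking the count before committing.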
>>
>> After that, all clients picked up the "remaining" tasks and the server nearly broke down ;-)
>>
>> Regards
>> Robert
>> On 08.12.2015 10:23, Paschedag.Netlution at swr.de wrote:
>>>
>>> Hi everyone,
>>>
>>> I just noticed that several of my clients are not picking up remote commands from the server. Although all "remote commands" are allowed on the client, a scheduled task stays in the "pending" state forever.
>>>
>>> On the client, nothing gets logged anymore. It looks like the client does not "find" anything to do in its queue on the server. When I debug the "rhn_check" command, this code fails:
>>>
>>>     action = self.server.queue.get(up2dateAuth.getSystemId(),
>>>             ACTION_VERSION, status_report)
>>>
>>>     return action
>>>
>>> "action" is empty every time now.
>>>
>>> Any help will be appreciated.
>>>
>>> Regards,
>>> Robert
>>>
>>>
>>> Kind regards
>>>
>>> Robert Paschedag
>>> Netlution GmbH
>>> Landteilstr. 33
>>> 68163 Mannheim
>>>
>>> on behalf of
>>> SWR
>>> Südwestrundfunk
>>> Informations- und Kommunikationssysteme
>>> Neckarstraße 230
>>> 70190 Stuttgart
>>>
>>> Phone +49 (0)711 /929-12654 or
>>> Phone +49 (0)711 /929-13714
>>> paschedag.netlution at swr.de
>>>
>>> swr.de
>>>
>>
>> _______________________________________________
>> Spacewalk-list mailing list
>> Spacewalk-list at redhat.com
>> https://www.redhat.com/mailman/listinfo/spacewalk-list