PATCH: SSL.SysCallError fix for plague-0.5.0

Joe Todaro jstodaro at us.ibm.com
Tue Oct 31 17:56:07 UTC 2006


fedora-buildsys-list-bounces at redhat.com wrote on 10/31/2006 11:45:56 AM:

> On Fri, 2006-10-27 at 00:32 -0400, Joe Todaro wrote:
> > 
> > Hi, 
> > 
> > Has anyone ever seen this error before in their *plague-0.5.0* build
> > environment?   It surfaced last week shortly after we started
> > stress-testing our buildsystem.   In fact, there were three such
> > errors in all, which I will post separately to avoid any confusion.
> > This is one of three.   It was triggered when we requested status
> > about a job we killed before it actually got handed off to archjobs. 
> > 
> > ====== THE ERROR ====== 
> > Request to enqueue 'stacker' tag 'stacker-1_3-5' for target
> > 'oc-rhel4-dev' (user 'jtodaro at pok.ibm.com') 
> > 66 (stacker): Starting tag 'stacker-1_3-5' on target 'oc-rhel4-dev' 
> > 66 (stacker): Requesting depsolve... 
> > 66 (stacker): Starting depsolve for arches: ['i686']. 
> > 66 (stacker): Finished depsolve (successful), requesting archjobs. 
> > 66 (stacker/i686): https://lnxbuild1.pok.ibm.com.:8888 - UID is
> > 9adf56cdd15bfae2388966b08837250d3bf6772c 
> > ---------------------------------------- 
> > Exception happened during processing of request from ('10.63.82.73',
> > 49136) 
> > Traceback (most recent call last): 
> >   File "/usr/lib64/python2.3/SocketServer.py", line 463, in
> > process_request_thread 
> >     self.finish_request(request, client_address) 
> >   File "/usr/lib64/python2.3/SocketServer.py", line 254, in
> > finish_request 
> >     self.RequestHandlerClass(request, client_address, self) 
> >   File "/usr/lib64/python2.3/SocketServer.py", line 521, in __init__ 
> >     self.handle() 
> >   File "/usr/lib64/python2.3/BaseHTTPServer.py", line 324, in handle 
> >     self.handle_one_request() 
> >   File "/usr/lib64/python2.3/BaseHTTPServer.py", line 307, in
> > handle_one_request 
> >     self.raw_requestline = self.rfile.readline() 
> >   File "/usr/lib64/python2.3/socket.py", line 338, in readline 
> >     data = self._sock.recv(self._rbufsize) 
> >   File "/usr/lib/python2.3/site-packages/plague/SSLConnection.py",
> > line 142, in recv 
> >     return con.recv(bufsize, flags) 
> > SysCallError: (-1, 'Unexpected EOF') 
> > ---------------------------------------- 
> > 
> > ====== OUR FIX ====== 
> > We added lines 147-148 to the *recv* method of the
> > */usr/lib/python2.3/site-packages/plague/SSLConnection.py* module.
> > Here's the patch: 
> > 
> > 
> > So, can someone please review the above fix?   We want to make sure it
> > won't come back to *bite* us later on, or possibly even be *masking* a
> > larger problem.   Thank you. 
> 
> This one makes me a bit nervous.  The SSL stuff is pretty fragile, since
> SSL in general adds yet another protocol layer on top of everything
> that's subject to more handshakes and state over just TCP/IP.
> 
> The traceback here shouldn't really have an effect, since it just
> terminates the current thread, and plague's state machine is built to be
> resilient to dropped and dead connection threads. 

Aha - I didn't realize that. So, I will back out our patch. 
Thank you so much for the excellent clarification.
Joe
 
> I'd like to hide the
> traceback (or at least just print a one-line message) but that's not
> possible since plague code isn't anywhere in the traceback and therefore
> would require more subclassing.
> 
> Furthermore, it technically is an error (that the other side closed the
> socket prematurely or something broke the connection) but one that we
> should ignore and retry, which plague will do.
> 
> However, if this fix seems to work OK for you for a while, I'd be
> interested in revisiting the issue.
> 
> Dan
> 
> > -Joe 
> > --
> > Fedora-buildsys-list mailing list
> > Fedora-buildsys-list at redhat.com
> > https://www.redhat.com/mailman/listinfo/fedora-buildsys-list
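
For reference, here is a minimal sketch of the kind of subclassing Dan
describes for replacing the traceback with a one-line message. It
assumes the server is built on the standard socketserver module
(SocketServer in the Python 2.3 traceback above) and overrides its
handle_error() hook. The names QuietErrorMixin, QuietThreadingServer,
log_dropped and DroppedConnectionError are hypothetical, and
DroppedConnectionError is only a stand-in for pyOpenSSL's
SSL.SysCallError so that the sketch runs without pyOpenSSL. This is not
plague's actual code.

```python
import socketserver
import sys


class DroppedConnectionError(Exception):
    """Hypothetical stand-in for OpenSSL.SSL.SysCallError."""


class QuietErrorMixin:
    """Log a one-line message when the peer drops the connection.

    Any other exception falls through to the default handle_error(),
    which prints the full traceback.
    """

    # Exception types treated as "the other side went away".
    quiet_errors = (DroppedConnectionError, ConnectionResetError)

    def handle_error(self, request, client_address):
        # socketserver calls this from inside its except block, so
        # sys.exc_info() still describes the failure being handled.
        exc_type, exc, _tb = sys.exc_info()
        if exc_type is not None and issubclass(exc_type, self.quiet_errors):
            self.log_dropped(client_address, exc)
        else:
            super().handle_error(request, client_address)

    def log_dropped(self, client_address, exc):
        sys.stderr.write(
            "connection from %s dropped: %s\n" % (client_address, exc)
        )


class QuietThreadingServer(QuietErrorMixin, socketserver.ThreadingTCPServer):
    """Threaded TCP server that stays quiet about dropped peers."""
```

The mixin is listed first in the bases so that its handle_error() is
found before the stock one. Whether the request is retried is left to
the caller, as in Dan's description.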
> 