python and x86_64 problem

seth vidal skvidal at phy.duke.edu
Thu May 6 06:16:56 UTC 2004


Hi,
 Troubleshooting this bug:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=122304

I found something that could be a problem on x86_64 for python:

try this on x86_64:
foo='www.redhat.com'
foo.encode("idna")
depending on which encoding you have set, you'll get either the correct:
www.redhat.com
or
www.redhat..om

if we look at line 6 of
/usr/lib64/python2.3/encodings/idna.py we see:
dots = re.compile(u"[\u002E\u3002\uFF0E\uFF61]")

that works great on x86 - so a little further down on line 153 you see:

labels = dots.split(input)

the input in this case is a hostname like the one above.

so try this bit of code on your own x86_64 python 2.3.3 system:

import re
dots = re.compile(u"[\u002E\u3002\uFF0E\uFF61]")
foo = 'www.redhat.com'
labels = dots.split(foo)

print labels

you'll find it is:
['www.redhat.', 'om']

while on x86 it is:

['www', 'redhat', 'com']

which is correct - 3 labels, as described in RFC 3490
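
For anyone who wants to check a box quickly, here is a minimal sketch that wraps the two tests above into one script. The helper names (split_labels, check_host) are made up for illustration and are not part of idna.py; the regex is copied from the line quoted above. It is written so that it should also run on newer interpreters, and it only reports what it sees - it says nothing about the cause.

```python
import re

# same separator class as the one quoted from encodings/idna.py:
# ASCII dot plus the ideographic, fullwidth and halfwidth full stops
DOTS = re.compile(u"[\u002E\u3002\uFF0E\uFF61]")


def split_labels(hostname):
    """Split a hostname into labels using the idna.py separator regex."""
    return DOTS.split(hostname)


def check_host(hostname, expected):
    """Hypothetical helper: return a list of problems, empty if all is well."""
    problems = []
    labels = split_labels(hostname)
    if labels != expected:
        problems.append("split gave %r, expected %r" % (labels, expected))
    # an all-ASCII hostname should come back unchanged from the idna codec
    encoded = hostname.encode("idna").decode("ascii")
    if encoded != hostname:
        problems.append("idna gave %r, expected %r" % (encoded, hostname))
    return problems


if __name__ == "__main__":
    found = check_host(u"www.redhat.com", [u"www", u"redhat", u"com"])
    if found:
        for line in found:
            print("BROKEN: " + line)
    else:
        print("ok")
```

An empty list from check_host means both the regex split and the idna encode behave as on x86.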

so I went looking for the problem a little and found this in _sre.c:
    #if defined(MS_WIN64) || defined(__LP64__) || defined(_LP64)
        /* require smaller recursion limit for a number of 64-bit platforms:
         * Win64 (MS_WIN64), Linux64 (__LP64__), Monterey (64-bit AIX) (_LP64)
         */
        /* FIXME: maybe the limit should be 40000 / sizeof(void*) ? */
        #define USE_RECURSION_LIMIT 7500
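
To put numbers on that FIXME, here is a rough sketch of the arithmetic only. The function name fixme_limit is made up for illustration; the 40000 and 7500 figures are taken from the excerpt above, and struct.calcsize("P") is used as a stand-in for sizeof(void*) in the running interpreter. It does not show that the recursion limit is actually what breaks the split.

```python
import struct

# size of a C pointer in this interpreter build, i.e. sizeof(void*)
POINTER_SIZE = struct.calcsize("P")

# value hard-coded for 64-bit platforms in the _sre.c excerpt
CURRENT_64BIT_LIMIT = 7500


def fixme_limit(pointer_size):
    """Limit suggested by the FIXME comment: 40000 / sizeof(void*)."""
    return 40000 // pointer_size


if __name__ == "__main__":
    print("sizeof(void*) here: %d" % POINTER_SIZE)
    print("FIXME formula here: %d" % fixme_limit(POINTER_SIZE))
    print("FIXME formula with 8-byte pointers: %d" % fixme_limit(8))
    print("hard-coded 64-bit limit: %d" % CURRENT_64BIT_LIMIT)
```

With 8-byte pointers the formula gives a smaller limit than the 7500 that is hard-coded today, so if the FIXME is right the current value may be too high for x86_64.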



I'm wondering if that FIXME is accurate - I've not tested the change yet
but it seems like a potential problem for regexes like this - or more to
the point anything using the HTTPHandler in python.

Can someone more experienced at python _sre internals take a look at
this?

This will most likely affect up2date, yum, and many network-interacting
python applications using http.
Thanks
-sv






More information about the fedora-devel-list mailing list