Jakub's Recommendations for ia32 Support
Josh Boyer
jwboyer at gmail.com
Tue Feb 3 20:52:41 UTC 2009
On Tue, Feb 03, 2009 at 09:45:46PM +0100, Dominik 'Rathann' Mierzejewski wrote:
>On Tuesday, 03 February 2009 at 21:01, Ulrich Drepper wrote:
>> Dominik 'Rathann' Mierzejewski wrote:
>> > I'd like to see a case (not involving Pentium 4) where using cmov is slower
>> > than not using it. It definitely is faster for decoding H.264 in FFmpeg
>> > for example.
>>
>> I don't have a specific test case. But I do talk to the CPU
>> architectures at Intel regularly.
>
>I didn't know architectures could talk. ;)
>
>> They always say that cmov should be
>> avoided. Especially with the introduction of fused micro-ops, the
>> various cmp+jcc pairs are likely to be faster.
>>
>> And from the code generation perspective using cmp+jcc is also more
>> flexible. With cmov you have to tie up two registers. This is
>> particularly bad with the x86 ABI.
>>
>> There are certainly cases where cmov can be faster. Perhaps exclusively
>> on older microarchitectures (P4s, early Core2, maybe AMD, haven't
>> checked). But in general it's no win.
>
>Well, I talk to people who write hand-optimized assembly and care to
>squeeze every cycle out of various CPUs and they say it's definitely
>a win. So please, show me some code instead of hand-waving.
If they can do that, then why can't they rebuild things themselves?
josh
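
For anyone who wants to look at actual code rather than argue from
authority, here is a minimal sketch of the two shapes being discussed.
It is not taken from FFmpeg or from anything in this thread; the function
names sum_above_branch and sum_above_select are made up for illustration.
The first form is written as an explicit branch, the second as a select
that a compiler may turn into cmov. The compiler is free to pick either
instruction sequence for either function, so the only way to know what
you got is to read the generated assembly (for example with gcc -O2 -S)
and time it on the CPU you care about.

```c
#include <stddef.h>
#include <stdint.h>

/* Branch form: the add is skipped by a conditional jump when the
 * comparison fails. A compiler will often emit cmp + jcc here, but
 * that is not guaranteed. */
int64_t sum_above_branch(const int32_t *x, size_t n, int32_t threshold)
{
    int64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (x[i] > threshold)
            sum += x[i];
    }
    return sum;
}

/* Select form: the value to add is chosen unconditionally, which makes
 * the loop body a candidate for cmov. Again, the compiler decides. */
int64_t sum_above_select(const int32_t *x, size_t n, int32_t threshold)
{
    int64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        int32_t v = (x[i] > threshold) ? x[i] : 0;
        sum += v;
    }
    return sum;
}
```

Both functions return the same result for the same input, so any timing
difference comes from the instruction selection and from how predictable
the comparison is for the data fed in.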