Performance optimization: __builtin_expect explained, with FAQ/LikelyUnlikely

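Overview

Reposted from: http://hi.baidu.com/lammy/blog/item/bc5e3d4e869073c3d1c86a89.html

The GTK+ 2.0 source code (more precisely GLib, which GTK+ builds on) uses the macros G_LIKELY and G_UNLIKELY in many places. Take the following snippet, for example: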

if (G_LIKELY (acat == 1))    /* allocate through magazine layer */
  {
    ThreadMemory *tmem = thread_memory_from_self();
    guint ix = SLAB_INDEX (allocator, chunk_size);
    if (G_UNLIKELY (thread_memory_magazine1_is_empty (tmem, ix)))
      {
        thread_memory_swap_magazines (tmem, ix);
        if (G_UNLIKELY (thread_memory_magazine1_is_empty (tmem, ix)))
          thread_memory_magazine1_reload (tmem, ix);
      }
    mem = thread_memory_magazine1_alloc (tmem, ix);
  }

In the source, the macros G_LIKELY and G_UNLIKELY are defined as follows:

#define G_LIKELY(expr) (__builtin_expect (_G_BOOLEAN_EXPR(expr), 1))
#define G_UNLIKELY(expr) (__builtin_expect (_G_BOOLEAN_EXPR(expr), 0))

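The macro _G_BOOLEAN_EXPR converts expr to 0 or 1, that is, to a plain false/true value. To understand G_LIKELY and G_UNLIKELY you therefore have to understand __builtin_expect. __builtin_expect is a built-in function (not a macro) introduced in GCC 2.96. It tells the compiler which value a condition is expected to have, so that the generated code avoids wasting time on jumps in the common case. Taking the code above as an example: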

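if (G_LIKELY (acat == 1))     // the condition is true most of the time, so execution usually goes straight into the if body

if (G_UNLIKELY (thread_memory_magazine1_is_empty (tmem, ix)))   // the condition is false most of the time, so execution usually skips the if body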
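Why the condition is normalized to 0 or 1 first: __builtin_expect(e, 1) states that e is expected to equal exactly 1, so a "true" value such as 4 would not match the hint. The following is a minimal sketch for GCC or Clang; LIKELY, UNLIKELY, has_flag, normalize and the mask value are illustrative names that do not come from GLib.

```c
#include <stdbool.h>

/* Illustrative macros, modelled on G_LIKELY / G_UNLIKELY. */
#define LIKELY(expr)   (__builtin_expect(!!(expr), 1))
#define UNLIKELY(expr) (__builtin_expect(!!(expr), 0))

/* Hypothetical helper: reports whether bit 2 (mask 0x4) is set.
 * (flags & 0x4) is either 0 or 4; !! turns that into 0 or 1, so the
 * expected value 1 really means "the condition is true". */
bool has_flag(unsigned flags)
{
    if (LIKELY(flags & 0x4u))
        return true;
    return false;
}

/* The double negation on its own: any nonzero value becomes 1. */
int normalize(int value)
{
    return !!value;
}
```

Without the !!, the hint for has_flag would compare the value 4 against the expected value 1, which is not what the programmer means.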

If this is still unclear, the following experiment should make things click:

//test_builtin_expect.c
#define LIKELY(x) __builtin_expect(!!(x), 1)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)

int test_likely(int x)
{
    if (LIKELY(x))
    {
        x = 5;
    }
    else
    {
        x = 6;
    }
    return x;
}

int test_unlikely(int x)
{
    if (UNLIKELY(x))
    {
        x = 5;
    }
    else
    {
        x = 6;
    }
    return x;
}
[lammy@localhost test_builtin_expect]$ gcc -fprofile-arcs -O2 -c test_builtin_expect.c
[lammy@localhost test_builtin_expect]$ objdump -d test_builtin_expect.o
test_builtin_expect.o:     file format elf32-i386
Disassembly of section .text:
00000000 <test_likely>:
   0: 55                      push   %ebp
   1: 89 e5                   mov    %esp,%ebp
   3: 8b 45 08                mov    0x8(%ebp),%eax
   6: 83 05 38 00 00 00 01    addl   $0x1,0x38
   d: 83 15 3c 00 00 00 00    adcl   $0x0,0x3c
  14: 85 c0                   test   %eax,%eax
  16: 74 15                   je     2d <test_likely+0x2d>    // this is the line to look at
  18: 83 05 40 00 00 00 01    addl   $0x1,0x40
  1f: b8 05 00 00 00          mov    $0x5,%eax
  24: 83 15 44 00 00 00 00    adcl   $0x0,0x44
  2b: 5d                      pop    %ebp
  2c: c3                      ret
  2d: 83 05 48 00 00 00 01    addl   $0x1,0x48
  34: b8 06 00 00 00          mov    $0x6,%eax
  39: 83 15 4c 00 00 00 00    adcl   $0x0,0x4c
  40: 5d                      pop    %ebp
  41: c3                      ret
  42: 8d b4 26 00 00 00 00    lea    0x0(%esi,%eiz,1),%esi
  49: 8d bc 27 00 00 00 00    lea    0x0(%edi,%eiz,1),%edi
00000050 <test_unlikely>:
  50: 55                      push   %ebp
  51: 89 e5                   mov    %esp,%ebp
  53: 8b 55 08                mov    0x8(%ebp),%edx
  56: 83 05 20 00 00 00 01    addl   $0x1,0x20
  5d: 83 15 24 00 00 00 00    adcl   $0x0,0x24
  64: 85 d2                   test   %edx,%edx
  66: 75 15                   jne    7d <test_unlikely+0x2d>  // this is the line to look at
  68: 83 05 30 00 00 00 01    addl   $0x1,0x30
  6f: b8 06 00 00 00          mov    $0x6,%eax
  74: 83 15 34 00 00 00 00    adcl   $0x0,0x34
  7b: 5d                      pop    %ebp
  7c: c3                      ret
  7d: 83 05 28 00 00 00 01    addl   $0x1,0x28
  84: b8 05 00 00 00          mov    $0x5,%eax
  89: 83 15 2c 00 00 00 00    adcl   $0x0,0x2c
  90: 5d                      pop    %ebp
  91: c3                      ret
  92: 8d b4 26 00 00 00 00    lea    0x0(%esi,%eiz,1),%esi
  99: 8d bc 27 00 00 00 00    lea    0x0(%edi,%eiz,1),%edi
000000a0 <_GLOBAL__I_65535_0_test_likely>:
  a0: 55                      push   %ebp
  a1: 89 e5                   mov    %esp,%ebp
  a3: 83 ec 08                sub    $0x8,%esp
  a6: c7 04 24 00 00 00 00    movl   $0x0,(%esp)
  ad: e8 fc ff ff ff          call   ae <_GLOBAL__I_65535_0_test_likely+0xe>
  b2: c9                      leave
  b3: c3                      ret
[lammy@localhost test_builtin_expect]$

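The two functions compile to different jump instructions: test_likely uses je, so it jumps only when x is zero, while test_unlikely uses jne, so it jumps only when x is nonzero. In both cases the expected path falls straight through without taking the jump. In other words, __builtin_expect only tells the compiler how to lay out the code; it does not change how the condition itself is evaluated.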
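The point that the hint never changes the result can be checked directly. The sketch below (for GCC or Clang) writes the same decision three times, with no hint, with LIKELY and with UNLIKELY; the function names are illustrative and not part of the original test file.

```c
#define LIKELY(x)   __builtin_expect(!!(x), 1)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)

/* Same logic three times; only the hint differs. */
int pick_plain(int x)    { return x ? 5 : 6; }
int pick_likely(int x)   { return LIKELY(x) ? 5 : 6; }
int pick_unlikely(int x) { return UNLIKELY(x) ? 5 : 6; }

/* Returns 1 when all three variants agree for the given input. */
int variants_agree(int x)
{
    return pick_plain(x) == pick_likely(x)
        && pick_plain(x) == pick_unlikely(x);
}
```

A wrong hint can make the code slower, but it cannot make it return a different value.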

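This idiom is also used widely in the Linux kernel. There is a related article in English that is worth reading: http://kernelnewbies.org/FAQ/LikelyUnlikely

You may have noticed that I generated the assembly with gcc -fprofile-arcs -O2 -c test_builtin_expect.c rather than gcc -O2 -c test_builtin_expect.c; see http://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html for details. The addl/adcl pairs in the listing are the arc counters that -fprofile-arcs inserts.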




FAQ/LikelyUnlikely

likely() and unlikely()

What are they ?

In Linux kernel code, one often finds calls to likely() and unlikely() in conditions, like:

bvl = bvec_alloc(gfp_mask, nr_iovecs, &idx);
if (unlikely(!bvl)) {
        mempool_free(bio, bio_pool);
        bio = NULL;
        goto out;
}

In fact, these macros are hints that allow the compiler to optimize the branch correctly, by telling it which outcome is the likeliest. Their definitions, found in include/linux/compiler.h, are the following:

#define likely(x)       __builtin_expect(!!(x), 1)
#define unlikely(x)     __builtin_expect(!!(x), 0)
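The same error-path pattern works in user space. The sketch below is a hypothetical helper (copy_string is not a kernel or libc function) that marks both failure checks as unlikely and leaves through a single exit label, mirroring the shape of the kernel snippet. It assumes GCC or Clang.

```c
#include <stdlib.h>
#include <string.h>

#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical helper: returns a heap copy of `src`, or NULL on failure.
 * The caller owns the result and must free() it. */
char *copy_string(const char *src)
{
    char *dst = NULL;
    size_t len;

    if (unlikely(src == NULL))      /* invalid argument: assumed rare */
        goto out;

    len = strlen(src) + 1;
    dst = malloc(len);
    if (unlikely(!dst))             /* allocation failure: assumed rare */
        goto out;

    memcpy(dst, src, len);
out:
    return dst;
}
```

The hints only pay off if the assumption holds, that is, if callers almost always pass a valid string and allocation almost always succeeds.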

The GCC documentation explains the role of __builtin_expect() :

 -- Built-in Function: long __builtin_expect (long EXP, long C)
     You may use `__builtin_expect' to provide the compiler with branch
     prediction information.  In general, you should prefer to use
     actual profile feedback for this (`-fprofile-arcs'), as
     programmers are notoriously bad at predicting how their programs
     actually perform.  However, there are applications in which this
     data is hard to collect.

     The return value is the value of EXP, which should be an integral
     expression.  The value of C must be a compile-time constant.  The
     semantics of the built-in are that it is expected that EXP == C.
     For example:

          if (__builtin_expect (x, 0))
            foo ();

     would indicate that we do not expect to call `foo', since we
     expect `x' to be zero.  Since you are limited to integral
     expressions for EXP, you should use constructions such as

          if (__builtin_expect (ptr != NULL, 1))
            error ();

     when testing pointer or floating-point values.
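Two points from the documentation above can be sketched in a few lines: the built-in returns the value of EXP unchanged, and pointer or floating-point tests are first turned into an integral comparison. The function names below are illustrative only, and the code assumes GCC or Clang.

```c
#include <stddef.h>

/* __builtin_expect returns its first argument unchanged;
 * the second argument is only the expected value. */
long passthrough(long x)
{
    return __builtin_expect(x, 0);
}

/* Pointer test: compare against NULL to obtain an integral expression. */
int deref_or_default(const int *ptr, int fallback)
{
    if (__builtin_expect(ptr != NULL, 1))
        return *ptr;
    return fallback;
}

/* Floating-point test: the comparison yields an int (0 or 1). */
int is_positive(double d)
{
    if (__builtin_expect(d > 0.0, 1))
        return 1;
    return 0;
}
```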

How does it optimize things ?

It optimizes things by ordering the generated assembly code so as to make better use of the processor pipeline. To do so, the compiler arranges the code so that the likeliest branch is executed without performing any jmp instruction (a taken jump can have the bad effect of flushing the processor pipeline).

To see how it works, let's compile the following simple C user space program with gcc -O2 :

#include <stdio.h>
#include <stdlib.h>

#define likely(x)       __builtin_expect(!!(x), 1)
#define unlikely(x)     __builtin_expect(!!(x), 0)

int main(int argc, char *argv[])
{
        int a;

        /* Get the value from somewhere GCC can't optimize */
        a = atoi (argv[1]);

        if (unlikely (a == 2))
                a++;
        else
                a--;

        printf ("%d\n", a);

        return 0;
}

Now, disassemble the resulting binary using objdump -S (comments added by me) :

080483b0 <main>:
 // Prologue
 80483b0:  55                push   %ebp
 80483b1:  89 e5             mov    %esp,%ebp
 80483b3:  50                push   %eax
 80483b4:  50                push   %eax
 80483b5:  83 e4 f0          and    $0xfffffff0,%esp
 // Call atoi()
 80483b8:  8b 45 08          mov    0x8(%ebp),%eax
 80483bb:  83 ec 1c          sub    $0x1c,%esp
 80483be:  8b 48 04          mov    0x4(%eax),%ecx
 80483c1:  51                push   %ecx
 80483c2:  e8 1d ff ff ff    call   80482e4 <atoi@plt>
 80483c7:  83 c4 10          add    $0x10,%esp
 // Test the value
 80483ca:  83 f8 02          cmp    $0x2,%eax
 // --------------------------------------------------------
 // If 'a' equal to 2 (which is unlikely), then jump,
 // otherwise continue directly, without jump, so that it
 // doesn't flush the pipeline.
 // --------------------------------------------------------
 80483cd:  74 12             je     80483e1 <main+0x31>
 80483cf:  48                dec    %eax
 // Call printf
 80483d0:  52                push   %edx
 80483d1:  52                push   %edx
 80483d2:  50                push   %eax
 80483d3:  68 c8 84 04 08    push   $0x80484c8
 80483d8:  e8 f7 fe ff ff    call   80482d4 <printf@plt>
 // Return 0 and go out.
 80483dd:  31 c0             xor    %eax,%eax
 80483df:  c9                leave
 80483e0:  c3                ret

Now, in the previous program, replace the unlikely() by a likely(), recompile it, and disassemble it again (again, comments added by me) :

080483b0 <main>:
 // Prologue
 80483b0:  55                push   %ebp
 80483b1:  89 e5             mov    %esp,%ebp
 80483b3:  50                push   %eax
 80483b4:  50                push   %eax
 80483b5:  83 e4 f0          and    $0xfffffff0,%esp
 // Call atoi()
 80483b8:  8b 45 08          mov    0x8(%ebp),%eax
 80483bb:  83 ec 1c          sub    $0x1c,%esp
 80483be:  8b 48 04          mov    0x4(%eax),%ecx
 80483c1:  51                push   %ecx
 80483c2:  e8 1d ff ff ff    call   80482e4 <atoi@plt>
 80483c7:  83 c4 10          add    $0x10,%esp
 // --------------------------------------------------
 // If 'a' equal 2 (which is likely), we will continue
 // without branching, so without flushing the pipeline. The
 // jump only occurs when a != 2, which is unlikely.
 // ---------------------------------------------------
 80483ca:  83 f8 02          cmp    $0x2,%eax
 80483cd:  75 13             jne    80483e2 <main+0x32>
 // Here the a++ incrementation has been optimized by gcc
 80483cf:  b0 03             mov    $0x3,%al
 // Call printf()
 80483d1:  52                push   %edx
 80483d2:  52                push   %edx
 80483d3:  50                push   %eax
 80483d4:  68 c8 84 04 08    push   $0x80484c8
 80483d9:  e8 f6 fe ff ff    call   80482d4 <printf@plt>
 // Return 0 and go out.
 80483de:  31 c0             xor    %eax,%eax
 80483e0:  c9                leave
 80483e1:  c3                ret

How should I use it ?

You should use it only in cases when the likeliest branch is very very very likely, or when the unlikeliest branch is very very very unlikely.
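As a rough illustration of that rule, the sketch below hints a validation check that is assumed to fail almost never, and deliberately leaves an unpredictable branch without any hint. Both functions and the "negative sample means corrupt data" convention are hypothetical, and the code assumes GCC or Clang.

```c
#include <stddef.h>

#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical helper: sums `n` samples; returns -1 if a negative
 * (corrupt) sample is found. Assumption: corrupt samples almost never
 * occur, so the check is a reasonable candidate for unlikely(). */
long sum_samples(const int *samples, size_t n)
{
    long total = 0;

    for (size_t i = 0; i < n; i++) {
        if (unlikely(samples[i] < 0))
            return -1;              /* rare error path */
        total += samples[i];
    }
    return total;
}

/* Counter-example: the parity of arbitrary input is roughly 50/50,
 * so no hint is given here. */
size_t count_even(const int *values, size_t n)
{
    size_t even = 0;

    for (size_t i = 0; i < n; i++) {
        if (values[i] % 2 == 0)
            even++;
    }
    return even;
}
```

Whether a hint actually helps is best confirmed by measuring, or by using profile feedback as the GCC documentation recommends.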

