[[ The exact posting date was lost when this blog was migrated. The year and quarter are probably still about right. ]]

A loop usually carries overhead: a condition check and a jump on every iteration. The jump is the costly part, because it can stall or flush the pipeline, and that costs performance. Loop unrolling reduces this overhead by doing more work per iteration.

Before unrolling, we should consider the following.
1. Unrolling increases code size, so the instruction cache size matters. That is, we should unroll only as far as the cache hit rate does not drop.
2. What is the best unrolling factor? Some reports say that 16 (16 operations per loop iteration) is enough.
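As a minimal sketch of what unrolling means, here is the same summation written once as a plain loop and once unrolled by hand. The function names sum_simple and sum_unrolled4, and the factor of 4, are assumptions chosen only to keep the example short; they do not come from any library.

```c
#include <assert.h>
#include <stddef.h>

/* Baseline: one condition check and one backward jump per element. */
long sum_simple(const int *a, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Unrolled by 4 (a hypothetical factor picked for illustration). */
long sum_unrolled4(const int *a, size_t n)
{
    assert(a != NULL || n == 0);

    long total = 0;
    size_t i = 0;

    /* Main loop: one condition check and one jump per four elements. */
    for (; i + 4 <= n; i += 4) {
        total += a[i];
        total += a[i + 1];
        total += a[i + 2];
        total += a[i + 3];
    }

    /* Remainder loop: handles the 0 to 3 elements left over. */
    for (; i < n; i++)
        total += a[i];

    return total;
}
```

The body is four times larger than the plain loop, which is the code-size cost from point 1. Whether it is actually faster depends on the compiler and the CPU, so it is worth measuring before keeping it.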

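Here is a coding example of unrolling. It is a macro in the style of Duff's device: the switch jumps into the middle of an 8-way unrolled loop so that the leftover operations run first.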

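/*
 * UNROLL8: run statement(s) x in a loop unrolled 8 times.
 *   x    - statement(s) to repeat; must not contain a top-level comma
 *   s    - total number of repetitions; must be greater than 0, because
 *          a count of 0 enters at case 0 and runs x eight times
 *   cond - loop condition, checked once after each pass of up to 8 repetitions
 * (s) & 0x7 selects the case label that runs the s % 8 leftover repetitions
 * in the first pass; every later pass runs all 8.
 */
#define UNROLL8( x, s, cond ) \
do { \
    switch( (s) & 0x7 ) \
    { \
        case 0: do \
                { \
                    x; \
        case 7:     x; \
        case 6:     x; \
        case 5:     x; \
        case 4:     x; \
        case 3:     x; \
        case 2:     x; \
        case 1:     x; \
                } while( cond ); \
    } \
} while(0)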

Usage: UNROLL8(*dst = *src; dst++; src++;, loop_count, dst < dst_end)
(This assumes dst_end is dst + loop_count.)
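To try the macro end to end, the following sketch wraps the usage line in a function. The wrapper name copy_ints and its early return for a zero count are assumptions added for illustration. The early return is needed because with a count of 0 the switch enters at case 0 and performs eight copies before the condition is first checked.

```c
#include <assert.h>
#include <stddef.h>

/* The macro from this post, repeated so the example compiles on its own. */
#define UNROLL8( x, s, cond ) \
do { \
    switch( (s) & 0x7 ) \
    { \
        case 0: do \
                { \
                    x; \
        case 7:     x; \
        case 6:     x; \
        case 5:     x; \
        case 4:     x; \
        case 3:     x; \
        case 2:     x; \
        case 1:     x; \
                } while( cond ); \
    } \
} while(0)

/* Hypothetical wrapper: copy count ints from src to dst. */
void copy_ints(int *dst, const int *src, size_t count)
{
    assert(count == 0 || (dst != NULL && src != NULL));

    /* Guard the zero case: UNROLL8 would otherwise copy 8 elements. */
    if (count == 0)
        return;

    int *dst_end = dst + count;

    /* The first pass copies count % 8 elements (or 8 if that is 0),
       then each further pass copies 8 until dst reaches dst_end. */
    UNROLL8(*dst = *src; dst++; src++;, count, dst < dst_end);
}
```

For example, copy_ints(dst, src, 11) enters at case 3, copies 3 elements, and then makes one full pass of 8. Some compilers warn about the implicit fallthrough between the case labels; here it is intentional.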
