SIMD (SSE) instruction for division in GCC

By : guest

I'd like to optimize the following snippet using SSE instructions if possible:

/* the data structure */
typedef struct v3d v3d;
struct v3d {
    double x;
    double y;
    double z;
} tmp = { 1.0, 2.0, 3.0 };

/* the part that should be "optimized" */
tmp.x /= 4.0;
tmp.y /= 4.0;
tmp.z /= 4.0;

Is this possible at all?

By : guest


Is multiplying by the reciprocal, e.g. tmp.x *= 0.25;, enough?

Note that for SSE instructions (in case you want to use them) it's important that:

1) all memory accesses are 16-byte aligned

2) the operations are performed in a loop

3) no int
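
As a rough illustration of the first two points, here is a minimal sketch that scales an array of doubles in a loop using aligned SSE2 loads and stores. The function name scale_quarter is hypothetical, and the sketch assumes the caller passes a 16-byte aligned buffer. Since 4.0 is a power of two, multiplying by 0.25 gives the same result as dividing by 4.0.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>

/* Hypothetical helper: multiplies n doubles in place by 0.25.
   Assumption: `data` is 16-byte aligned, as _mm_load_pd requires. */
static void scale_quarter(double *data, size_t n)
{
    const __m128d quarter = _mm_set1_pd(0.25);  /* {0.25, 0.25} */
    size_t i = 0;

    /* Main loop: two doubles per iteration. */
    for (; i + 2 <= n; i += 2) {
        __m128d v = _mm_load_pd(data + i);      /* aligned load */
        _mm_store_pd(data + i, _mm_mul_pd(v, quarter));
    }

    /* Scalar tail for an odd element count. */
    for (; i < n; ++i)
        data[i] *= 0.25;
}
```

With GCC, a suitable buffer can be declared as double buf[4] __attribute__((aligned(16))); and passed as scale_quarter(buf, 4).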

By : ruslik

I've used SIMD extensions under Windows, but not yet under Linux. That said, you should be able to take advantage of the DIVPS SSE instruction, which divides a vector of four floats by another vector of four floats. Since you are using doubles, you'll want the SSE2 version, DIVPD, which works on two doubles at a time. Also make sure to build with the -msse2 switch.

I found a page which details some SSE GCC builtins. It looks kind of old, but should be a good start.
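
One way to get a packed double division out of GCC without writing intrinsics is its generic vector extension. The sketch below is only an illustration: the type name v2df and the function div2 are made up for this example, and whether the compiler actually emits DIVPD depends on the target and flags (on x86-64, SSE2 is enabled by default; on 32-bit x86, build with -msse2).

```c
/* GCC vector extension: a 16-byte vector holding two doubles. */
typedef double v2df __attribute__((vector_size(16)));

/* Hypothetical helper: element-wise a / b.
   With SSE2 enabled, GCC can compile this to a single packed divide. */
static v2df div2(v2df a, v2df b)
{
    return a / b;
}
```

For example, dividing {1.0, 2.0} by {4.0, 4.0} yields {0.25, 0.5}; recent GCC versions allow reading the elements back with r[0] and r[1].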

By : jay.lee

The intrinsic you are looking for is _mm_div_pd. Here is a working example which should be enough to steer you in the right direction:
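
The example itself is not reproduced in this copy of the thread. As a stand-in, here is a minimal sketch of how _mm_div_pd might be applied to the v3d struct from the question. The helper name v3d_div is hypothetical. It assumes x and y are adjacent in memory (true for three consecutive doubles in a struct) and uses unaligned loads, so the struct needs no special alignment.

```c
#include <emmintrin.h>  /* SSE2 intrinsics, including _mm_div_pd */

typedef struct v3d v3d;
struct v3d {
    double x;
    double y;
    double z;
};

/* Hypothetical helper: divides all three components of *v by d. */
static void v3d_div(v3d *v, double d)
{
    __m128d divisor = _mm_set1_pd(d);      /* {d, d} */

    /* x and y together: one packed divide. */
    __m128d xy = _mm_loadu_pd(&v->x);      /* unaligned load of x, y */
    xy = _mm_div_pd(xy, divisor);
    _mm_storeu_pd(&v->x, xy);

    /* z alone: scalar divide in the low lane. */
    __m128d z = _mm_load_sd(&v->z);
    z = _mm_div_sd(z, divisor);
    _mm_store_sd(&v->z, z);
}
```

Calling v3d_div(&tmp, 4.0) on tmp = { 1.0, 2.0, 3.0 } leaves tmp as { 0.25, 0.5, 0.75 }. For a single three-element vector the gain over plain scalar code is likely small; the approach pays off more when many vectors are processed in a loop.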

By : Paul R
