64-bit specific SIMD intrinsic

Tags: c sse simd sse2
By : anup
Source: Stackoverflow.com
Question!

I am using the following union declaration in SSE2.

typedef unsigned long uli;  
typedef uli v4si __attribute__ ((vector_size(16)));  
typedef union  
{  
    v4si v;  
    uli data[2];  
} uliv;  

uliv a, b, c;

The idea is to assign two unsigned long variables (64 bits each) to each of a and b, XOR them, and place the result in c.

Explicit assignment (a.data[0] = something) works here, but it takes more time.

I plan to use intrinsics. If I use _mm_set_epi64 (unsigned long x, unsigned long y), it asks for __m64 variables. If I cast the variables, e.g. (__m64)x, it compiles, but it gives the wrong result.

for (k = 0; k < 10; k++)
{
    simda.v = _mm_set_epi64 (_mulpre1[u1][k], _mulpre2[u2][k]);
    simdb.v = _mm_set_epi64 (res1[i+k], res2[i+k]);
    simdc.v = _mm_xor_si128 (simda.v, simdb.v);
}

The above code gives the error: /usr/lib/gcc/x86_64-linux-gnu/4.4.3/include/emmintrin.h:578: note: expected ‘__m64’ but argument is of type ‘long unsigned int’

Can you please suggest some alternatives (intrinsics)?



Answers

Are you sure that unsigned long is 64 bits on your system? It's probably safer to use unsigned long long or, better yet, uint64_t from <stdint.h>.
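A quick way to check this is sketched below. The function names (ulong_bits, u64_bits, report_widths) are made up for illustration. On Linux x86-64 (LP64) unsigned long normally occupies 64 bits, while on 64-bit Windows (LLP64) it occupies 32; uint64_t is exactly 64 bits wherever it is defined.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

/* Storage width in bits of unsigned long on the current platform. */
unsigned ulong_bits(void)
{
    return (unsigned)(sizeof(unsigned long) * CHAR_BIT);
}

/* Storage width in bits of uint64_t (always 64 where the type exists). */
unsigned u64_bits(void)
{
    return (unsigned)(sizeof(uint64_t) * CHAR_BIT);
}

/* Print both widths so the difference, if any, is visible at a glance. */
void report_widths(void)
{
    assert(u64_bits() == 64); /* sanity check: guaranteed by <stdint.h> */
    printf("unsigned long: %u bits, uint64_t: %u bits\n",
           ulong_bits(), u64_bits());
}
```

Calling report_widths() once at startup is enough to see whether code that assumes a 64-bit unsigned long is portable to the target.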

On my system _mm_set_epi64 takes two unsigned long long parameters and returns an __m128i.

It's not clear from your question whether you just want to (a) XOR two 64-bit values or (b) XOR two vectors of 2 x 64-bit values.

For case (a) just use scalar code, e.g.

uint64_t a, b, c;

c = a ^ b;

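For case (b) you don't need unions etc., just do this (with GCC's emmintrin.h use _mm_set_epi64x, which takes 64-bit integers; _mm_set_epi64 expects __m64 arguments there):

__m128i va, vb, vc;

/* arguments are (high 64 bits, low 64 bits) */
va = _mm_set_epi64x(a1, a2);
vb = _mm_set_epi64x(b1, b2);
vc = _mm_xor_si128(va, vb);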
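A fuller sketch of case (b), applied to a loop like the one in the question, might look as follows. It assumes GCC or Clang on x86-64 with <emmintrin.h>. The helper name xor_pairs and its parameters are hypothetical and not from the original code. It uses _mm_set_epi64x, the SSE2 intrinsic that takes two 64-bit integers (high half first, then low half), and reads the result back with an unaligned store instead of a union.

```c
#include <assert.h>
#include <emmintrin.h> /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: for each k, computes
 *   out_hi[k] = a_hi[k] ^ b_hi[k]
 *   out_lo[k] = a_lo[k] ^ b_lo[k]
 * using one 128-bit XOR per pair of 64-bit values. */
void xor_pairs(const uint64_t *a_hi, const uint64_t *a_lo,
               const uint64_t *b_hi, const uint64_t *b_lo,
               uint64_t *out_hi, uint64_t *out_lo, size_t n)
{
    assert(a_hi && a_lo && b_hi && b_lo && out_hi && out_lo);

    for (size_t k = 0; k < n; k++) {
        /* _mm_set_epi64x(high, low): first argument goes to the upper lane. */
        __m128i va = _mm_set_epi64x((long long)a_hi[k], (long long)a_lo[k]);
        __m128i vb = _mm_set_epi64x((long long)b_hi[k], (long long)b_lo[k]);
        __m128i vc = _mm_xor_si128(va, vb);

        /* Unaligned store: lanes[0] is the low lane, lanes[1] the high lane. */
        uint64_t lanes[2];
        _mm_storeu_si128((__m128i *)lanes, vc);
        out_lo[k] = lanes[0];
        out_hi[k] = lanes[1];
    }
}
```

If the inputs were already stored as adjacent pairs in memory, loading them with _mm_loadu_si128 would avoid building each vector from two scalars. Whether any of this beats two scalar XORs per iteration is something to measure rather than assume.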
By : Paul R

