# Integer signum in SSE

Raymond

The signum function is defined as follows:

    signum(x) = −1 if x < 0
    signum(x) =  0 if x = 0
    signum(x) = +1 if x > 0

There are a couple of ways of calculating this with the SSE integer instructions.

One way is to convert the C idiom

```c
int signum(int x) { return (x > 0) - (x < 0); }
```

The SSE translation of this is mostly straightforward. The quirk is that the SSE comparison instructions return −1 (all bits set) to indicate `true`, whereas C uses +1 to represent `true`. But this is easy to take into account:

    x > 0 ⇔ −pcmpgt(x, 0)
    x < 0 ⇔ −pcmpgt(0, x)

Substituting this into the original `signum` function, we get

    signum(x) = (x > 0) − (x < 0)
              = −pcmpgt(x, 0) − (−pcmpgt(0, x))
              = −pcmpgt(x, 0) + pcmpgt(0, x)
              = pcmpgt(0, x) − pcmpgt(x, 0)
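As a sanity check on the derivation, here is a minimal scalar sketch in C. The helper `pcmpgt_model` is a hypothetical name, not a real intrinsic; it imitates a single lane of the SSE comparison by returning −1 for `true` and 0 for `false`.

```c
// Hypothetical scalar model of one lane of pcmpgt:
// returns -1 (all bits set) when a > b, and 0 otherwise.
int pcmpgt_model(int a, int b)
{
    return (a > b) ? -1 : 0;
}

// The original C idiom, for comparison.
int signum_idiom(int x)
{
    return (x > 0) - (x < 0);
}

// The last line of the derivation: pcmpgt(0, x) - pcmpgt(x, 0).
int signum_model(int x)
{
    return pcmpgt_model(0, x) - pcmpgt_model(x, 0);
}
```

Under this model, `signum_model` and `signum_idiom` agree for negative, zero, and positive inputs.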

In assembly:

```asm
        ; assume x is in xmm0
        pxor    xmm1, xmm1
        pxor    xmm2, xmm2
        pcmpgtw xmm1, xmm0 ; xmm1 = pcmpgt(0, x)
        pcmpgtw xmm0, xmm2 ; xmm0 = pcmpgt(x, 0)
        psubw   xmm1, xmm0 ; xmm1 = pcmpgt(0, x) - pcmpgt(x, 0) = signum
```

With intrinsics:

```c
#include <emmintrin.h> // SSE2

__m128i signum16(__m128i x)
{
    return _mm_sub_epi16(_mm_cmpgt_epi16(_mm_setzero_si128(), x),
                         _mm_cmpgt_epi16(x, _mm_setzero_si128()));
}
```

This pattern extends mutatis mutandis to `signum8`, `signum32`, and `signum64`, although the 64-bit comparison `pcmpgtq` is not available until SSE4.2.
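For illustration, here is a sketch of what the 32-bit variant might look like, assuming the only change needed is swapping in the 32-bit compare and subtract intrinsics. The 8-bit version would use `_mm_cmpgt_epi8` and `_mm_sub_epi8` in the same way.

```c
#include <emmintrin.h> // SSE2

// Same pattern as signum16, using the 32-bit compare and subtract.
__m128i signum32(__m128i x)
{
    __m128i zero = _mm_setzero_si128();
    return _mm_sub_epi32(_mm_cmpgt_epi32(zero, x),  // -1 where x < 0
                         _mm_cmpgt_epi32(x, zero)); // -1 where x > 0
}
```

For example, the lanes (−7, 0, 42, 1) map to (−1, 0, +1, +1).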

Another solution is to use the signed minimum and maximum opcodes, using the formula

 signum(x) = min(max(x, −1), +1)
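The formula works because clamping an integer to the range [−1, +1] leaves only three possible results. A minimal scalar sketch in C (the name `signum_clamp` is made up for this example):

```c
// Scalar sketch of signum(x) = min(max(x, -1), +1).
int signum_clamp(int x)
{
    int lo = (x > -1) ? x : -1; // max(x, -1): every negative value becomes -1
    return (lo < 1) ? lo : 1;   // min(lo, +1): every positive value becomes +1
}
```

Zero passes through both steps unchanged.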

In assembly:

```asm
        ; assume x is in xmm0
        pcmpeqw xmm1, xmm1 ; xmm1 = -1 in all lanes
        pmaxsw  xmm0, xmm1 ; xmm0 = max(x, -1)
        psrlw   xmm1, 15   ; xmm1 = +1 in all lanes
        pminsw  xmm0, xmm1 ; xmm0 = min(max(x, -1), +1) = signum
```

With intrinsics:

```c
#include <emmintrin.h> // SSE2

__m128i signum16(__m128i x)
{
    // alternatively: minusones = _mm_set1_epi16(-1);
    __m128i minusones = _mm_cmpeq_epi16(_mm_setzero_si128(),
                                        _mm_setzero_si128());
    x = _mm_max_epi16(x, minusones);
    // alternatively: ones = _mm_set1_epi16(1);
    __m128i ones = _mm_srli_epi16(minusones, 15);
    x = _mm_min_epi16(x, ones);
    return x;
}
```

The catch here is that SSE2 supports only 16-bit signed minimum and maximum; to get other bit sizes, you need to bump up to SSE4.1. But if you’re going to do that, you may as well use the `psign` instruction, which has been available since SSSE3. In assembly:

```asm
        ; assume x is in xmm0
        pcmpeqw xmm1, xmm1 ; xmm1 = -1 in all lanes
        psrlw   xmm1, 15   ; xmm1 = +1 in all lanes
        psignw  xmm1, xmm0 ; xmm1 = +1 with the sign of x applied = signum
```

With intrinsics:

```c
#include <tmmintrin.h> // SSSE3

__m128i signum16(__m128i x)
{
    // alternatively: ones = _mm_set1_epi16(1);
    __m128i minusones = _mm_cmpeq_epi16(_mm_setzero_si128(),
                                        _mm_setzero_si128());
    __m128i ones = _mm_srli_epi16(minusones, 15);
    return _mm_sign_epi16(ones, x);
}
```

The `psign` instruction applies the sign of its second argument to its first argument. We load up the first argument with the value `+1` in all lanes, then apply the sign of `x`, which negates the value if the corresponding lane of `x` is negative, sets the value to zero if the lane is zero, and leaves it alone if the lane is positive.
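The per-lane behavior described above can be sketched in scalar C. The names `psignw_lane` and `signum16_lane` are hypothetical and model a single 16-bit lane; they are not real intrinsics.

```c
#include <stdint.h>

// Hypothetical scalar model of one 16-bit lane of psignw(a, b):
// apply the sign of b to a.
int16_t psignw_lane(int16_t a, int16_t b)
{
    if (b < 0) return (int16_t)-a; // b negative: negate a
    if (b == 0) return 0;          // b zero: result is zero
    return a;                      // b positive: leave a alone
}

// signum is then "+1 with the sign of x applied".
int16_t signum16_lane(int16_t x)
{
    return psignw_lane(1, x);
}
```

Because the first argument is always `+1`, the three cases produce exactly −1, 0, and +1.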