Illegal Instruction (SIGILL, signal 4) faults and UD2 amd64 (x86-64) instruction

From Wikistix

Maybe I'm just slow, but I only recently learned of the UD family of x86-64 (amd64) instructions: UD0, UD1 and UD2, encoded as opcodes 0x0fff, 0x0fb9 and 0x0f0b respectively. These are reserved, deliberately undefined instructions that raise an invalid opcode exception (#UD) when executed, which Unix-like operating systems deliver to the process as SIGILL (illegal instruction, signal 4).
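To see the signal first-hand, here is a minimal sketch that executes UD2 in a forked child and reports which signal killed it. The helper names execute_undefined_instruction and terminating_signal are made up for illustration, the inline assembly assumes GCC or Clang, and on non-x86 targets the sketch merely raises SIGILL as a stand-in.

```cpp
#include <csignal>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: executes UD2 on x86 (GCC/Clang inline asm).
// On other architectures it raises SIGILL directly as a stand-in.
void execute_undefined_instruction() {
#if defined(__x86_64__) || defined(__i386__)
    __asm__ volatile("ud2");
#else
    raise(SIGILL);
#endif
}

// Hypothetical helper: runs fn in a forked child and returns the number of
// the signal that terminated it, 0 if it exited normally, or -1 on error.
int terminating_signal(void (*fn)()) {
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        // Child: suppress core dumps, then run the function under test.
        struct rlimit no_core = {0, 0};
        setrlimit(RLIMIT_CORE, &no_core);
        fn();
        _exit(0);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Calling terminating_signal(execute_undefined_instruction) should return SIGILL, while a function that simply returns yields 0.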

Clang/LLVM (and GCC) can emit them when compiling with flags like -fsanitize=undefined -fsanitize-trap=all, which replace the sanitizer's runtime diagnostics with a bare trap instruction.

For example, a range check is introduced when doing casts that might result in undefined behavior:

#include <cstdint>

int64_t fubar(double d) {
        return static_cast<int64_t>(d);
}

Results in the following sequence:

.LCPI0_0:
        .quad   -4332462841530417151    # double -9.2233720368547778E+18
.LCPI0_1:
        .quad   4890909195324358656     # double 9.2233720368547758E+18
        .text
        .globl  _Z5fubard
        .p2align        4, 0x90
        .type   _Z5fubard,@function
_Z5fubard:                              # @_Z5fubard
        .cfi_startproc
        .long   846595819               # 0x327606eb
        .long   .L__unnamed_1-_Z5fubard
# %bb.0:
        ucomisd .LCPI0_0(%rip), %xmm0
        jbe     .LBB0_2
# %bb.1:
        movsd   .LCPI0_1(%rip), %xmm1   # xmm1 = mem[0],zero
        ucomisd %xmm0, %xmm1
        jbe     .LBB0_2
# %bb.3:
        cvttsd2si       %xmm0, %rax
        retq
.LBB0_2:
        ud2

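The UD2 instruction is executed if the double is outside the bounds of a signed int64. Because both branches are jbe after ucomisd, and jbe is also taken on an unordered comparison, a NaN input reaches UD2 as well.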
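The same range check can be written out in C++ to make the two constants easier to read. This is only a sketch of what the generated code appears to test, not the compiler's own implementation; the function name fits_in_int64 is hypothetical.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper: returns true when static_cast<int64_t>(d) is well
// defined, using the same two comparisons as the assembly listing.
bool fits_in_int64(double d) {
    const double two_pow_63 = std::ldexp(1.0, 63);  // 9.2233720368547758E+18
    // First representable double below -2^63 (-9.2233720368547778E+18),
    // so that -2^63 itself is still accepted.
    const double lower = std::nextafter(
        -two_pow_63, -std::numeric_limits<double>::infinity());
    // Both comparisons are false for NaN, which is therefore rejected,
    // just as the unordered case falls through to UD2.
    return d > lower && d < two_pow_63;
}
```

Note the asymmetry: -2^63 converts exactly to INT64_MIN and is allowed, whereas +2^63 is one past INT64_MAX and is rejected.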

Likewise, an innocuous call into the standard library, such as indexing a std::vector, is also instrumented:

#include <vector>
#include <cstdint>

uint32_t foobar(const std::vector<uint32_t>& v, int i) {
    return v[i];
}

Generates a series of checks (null pointer, alignment and pointer overflow, but notably no comparison against the vector's size), all resulting in UD2:

foobar(std::vector<unsigned int, std::allocator<unsigned int> > const&, int):
        test    rdi, rdi
        je      .L7
        test    dil, 7
        jne     .L7
        mov     rdx, QWORD PTR [rdi]
        movsx   rsi, esi
        sal     rsi, 2
        lea     rax, [rdx+rsi]
        js      .L4
        cmp     rax, rdx
        jb      .L7
.L5:
        test    rax, rax
        je      .L7
        test    al, 3
        jne     .L7
        mov     eax, DWORD PTR [rax]
        ret
.L4:
        cmp     rdx, rax
        jnb     .L5
.L7:
        ud2
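Reading the listing back into C++ makes the individual checks easier to follow. The sketch below is a hand translation under the assumption of a 64-bit target. The function access_would_trap and its parameters are hypothetical: vec_addr stands for the address of the vector object (rdi) and data_addr for the data pointer loaded from it (rdx).

```cpp
#include <cstdint>

// Hypothetical helper mirroring the listing: returns true where the
// generated code would branch to UD2.
bool access_would_trap(std::uint64_t vec_addr, std::uint64_t data_addr, int i) {
    // test rdi, rdi / test dil, 7: the vector reference must be non-null
    // and 8-byte aligned.
    if (vec_addr == 0 || vec_addr % 8 != 0) {
        return true;
    }
    // movsx + sal 2: sign-extend the index and scale by sizeof(uint32_t).
    const std::int64_t offset = static_cast<std::int64_t>(i) * 4;
    // lea: element address, with wrap-around arithmetic as in hardware.
    const std::uint64_t elem = data_addr + static_cast<std::uint64_t>(offset);
    // Pointer-overflow check: a non-negative offset must not move the
    // pointer down, and a negative offset must not move it up.
    if (offset >= 0 ? elem < data_addr : elem > data_addr) {
        return true;
    }
    // The element pointer must be non-null and 4-byte aligned.
    if (elem == 0 || elem % 4 != 0) {
        return true;
    }
    return false;
}
```

Since nothing here consults the vector's size, an index far past the end passes every check as long as the pointer arithmetic does not wrap.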

Other architectures

And it's not just amd64 (x86-64):

Arch     Instruction
arm64    brk #1000
RISC-V   ebreak
m68k     trap #7
MIPS     break

See Also