Intro to SIMD in Zig
by Niles Salter / @Validark
Itinerary:
Definitions:
The basic idea
add rdi, r8
add rsi, r9
add rdx, r10
add rcx, r11
⇒
vpaddq ymm0, ymm1, ymm0
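The same idea can be sketched in Zig. This is a minimal illustration, not code from the talk: `addLanes` is a hypothetical name, and whether the compiler actually emits a single `vpaddq` depends on the target having AVX2 enabled (an assumption here).

```zig
const std = @import("std");

// Hypothetical helper (not from the slides): adds four 64-bit lanes at once.
// With AVX2 available, the compiler can lower this to one vpaddq on ymm
// registers instead of four scalar adds. Marking it `export`, as the slides
// do for columnCounts, keeps it visible in Godbolt's output.
fn addLanes(a: @Vector(4, u64), b: @Vector(4, u64)) @Vector(4, u64) {
    // +% is a wrapping add, which matches the behavior of the hardware add.
    return a +% b;
}

test "four adds expressed as one vector operation" {
    const a: @Vector(4, u64) = .{ 1, 2, 3, 4 };
    const b: @Vector(4, u64) = .{ 10, 20, 30, 40 };
    // Vectors coerce to arrays of the same length and element type.
    const sum: [4]u64 = addLanes(a, b);
    try std.testing.expectEqualSlices(u64, &[_]u64{ 11, 22, 33, 44 }, &sum);
}
```

Saved to a file, this can be checked with `zig test` on that file.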
Why this is helpful:
Tutorial:
1. Setup Godbolt
2. Do stuff
3. Profit??
export fn columnCounts(chunk: @Vector(16, u8)) @Vector(16, u8) {
[Figure: VPMADDWD on 512-bit registers. SRC 1 holds 16-bit elements A0 ... A31 and SRC 2 holds 16-bit elements B0 ... B31. Corresponding elements are multiplied, and each adjacent pair of products is added, giving 32-bit DEST elements A0*B0 + A1*B1, A2*B2 + A3*B3, ..., A30*B30 + A31*B31.]
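What the VPMADDWD diagram computes can be written out as a scalar reference model. This is a sketch for checking understanding, not the talk's code and not a SIMD implementation: `maddReference` is a hypothetical name, and the lanes are treated as signed integers, as the instruction does.

```zig
const std = @import("std");

// Hypothetical reference model (not from the slides) of VPMADDWD on 512-bit
// registers: 32 signed 16-bit lanes per source, 16 signed 32-bit lanes out.
fn maddReference(a: [32]i16, b: [32]i16) [16]i32 {
    var dest: [16]i32 = undefined;
    var i: usize = 0;
    while (i < 16) : (i += 1) {
        // Widen to 32 bits before multiplying so each product fits.
        const even = @as(i32, a[2 * i]) * @as(i32, b[2 * i]);
        const odd = @as(i32, a[2 * i + 1]) * @as(i32, b[2 * i + 1]);
        // The sum overflows only when all four inputs are -32768;
        // the hardware wraps in that case, so a wrapping add is used.
        dest[i] = even +% odd;
    }
    return dest;
}

test "adjacent products are summed into one 32-bit lane" {
    // Every pair is (1, 2) in a and (3, 4) in b, so each lane is 1*3 + 2*4.
    const a = [_]i16{ 1, 2 } ** 16;
    const b = [_]i16{ 3, 4 } ** 16;
    const dest = maddReference(a, b);
    try std.testing.expectEqual(@as(i32, 11), dest[0]);
    try std.testing.expectEqual(@as(i32, 11), dest[15]);
}
```

The widening step is the reason the destination lanes are 32-bit: two 16-bit products added together do not fit in 16 bits.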
fin