Update cpp.md

Signed-off-by: Dennis Eichhorn <spl1nes.com@googlemail.com>
Dennis Eichhorn 2024-08-05 19:38:08 +02:00 committed by GitHub
parent 0910172c26
commit 983a6d4599

for (int i = 0; i < N; i++)
}
```
Branchless code
```c++
for (int i = 0; i < N; i++)
}
```
### Instruction table latency
| Instruction | Latency (cycles) | RThroughput (cycles/instr) |
|-------------|---------|:------------|
| `jmp` | - | 2 |
| `mov r, r` | - | 1/4 |
| `mov r, m` | 4 | 1/2 |
| `mov m, r` | 3 | 1 |
| `add` | 1 | 1/3 |
| `cmp` | 1 | 1/4 |
| `popcnt` | 1 | 1/4 |
| `mul` | 3 | 1 |
| `div` | 13-28 | 13-28 |
Source: https://www.agner.org/optimize/instruction_tables.pdf (exact values vary by microarchitecture; the numbers above are representative).
### Cache line sharing between CPU cores
When working with multi-threading, you may choose atomic variables and atomic operations to reduce locking in your application. You might assume that a value `a[0]` used by thread 1 on core 1 and a value `a[1]` used by thread 2 on core 2 cannot affect each other's performance. This is wrong: core 1 and core 2 have separate L1 and L2 caches, but the CPU doesn't load individual variables, it loads entire cache lines (e.g. 64 bytes). If you define `int a[2]`, both elements will very likely land on the same cache line, and therefore thread 1 and thread 2 have to wait on each other when doing atomic writes. This effect is known as false sharing.