diff --git a/standards/cpp.md b/standards/cpp.md
index 8ef7042..6d6c623 100755
--- a/standards/cpp.md
+++ b/standards/cpp.md
@@ -48,7 +48,6 @@ for (int i = 0; i < N; i++)
 }
 ```
-
 
 Branchless code
 
 ```c++
@@ -57,6 +56,22 @@ for (int i = 0; i < N; i++)
 }
 ```
 
+### Instruction latency table
+
+| Instruction | Latency | RThroughput |
+|-------------|---------|-------------|
+| `jmp`       | -       | 2           |
+| `mov r, r`  | -       | 1/4         |
+| `mov r, m`  | 4       | 1/2         |
+| `mov m, r`  | 3       | 1           |
+| `add`       | 1       | 1/3         |
+| `cmp`       | 1       | 1/4         |
+| `popcnt`    | 1       | 1/4         |
+| `mul`       | 3       | 1           |
+| `div`       | 13-28   | 13-28       |
+
+https://www.agner.org/optimize/instruction_tables.pdf
+
 ### Cache line sharing between CPU cores
 
 When working with multi-threading you may choose to use atomic variables and atomic operations to reduce the locking in your application. You may think that a value `a[0]` used by thread 1 on core 1 and a value `a[1]` used by thread 2 on core 2 will have no performance impact on each other. However, this is wrong. Core 1 and core 2 each have their own L1 and L2 caches, but the CPU doesn't load individual variables, it loads entire cache lines (e.g. 64 bytes). This means that if you define `int a[2]`, both elements have a high chance of being on the same cache line, and therefore thread 1 and thread 2 have to wait on each other when doing atomic writes.