Update cpp.md
Signed-off-by: Dennis Eichhorn <spl1nes.com@googlemail.com>
@@ -80,6 +80,54 @@ https://www.agner.org/optimize/instruction_tables.pdf

| Memory unit | Typical size |
| ----------- | ------------ |
| Cache Line  | 64 B         |
| Page Size   | 4 KB         |

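These are typical values on current x86-64 systems; the real numbers vary by platform. A minimal sketch for querying them at runtime, assuming a POSIX system (`_SC_LEVEL1_DCACHE_LINESIZE` is a glibc extension and may be missing elsewhere; `std::hardware_destructive_interference_size` needs C++17 and library support):

```c++
#include <cstdio>
#include <new>        // std::hardware_destructive_interference_size (C++17)
#include <unistd.h>   // sysconf (POSIX)

int main() {
    // Compile-time cache line hint provided by the standard library.
    std::printf("interference size: %zu B\n", std::hardware_destructive_interference_size);

    // Values reported by the operating system at runtime.
    std::printf("page size:         %ld B\n", sysconf(_SC_PAGESIZE));
#ifdef _SC_LEVEL1_DCACHE_LINESIZE
    std::printf("L1 cache line:     %ld B\n", sysconf(_SC_LEVEL1_DCACHE_LINESIZE));
#endif
    return 0;
}
```
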
### Cache locality

Column-wise traversal (cache-unfriendly): the inner loop steps through rows, so consecutive accesses are 4000 bytes apart and almost every access pulls a new cache line into the cache.

```c++
void process_columns(int matrix[1000][1000]) {
    for (int col = 0; col < 1000; ++col) {
        for (int row = 0; row < 1000; ++row) {
            // Each iteration jumps one full row (4000 B) ahead -> frequent cache misses.
            matrix[row][col] *= 2;
        }
    }
}
```

Row-wise traversal (cache-friendly): the inner loop walks contiguous memory, so one loaded cache line serves 16 consecutive `int`s before the next line is fetched.

```c++
void process_rows(int matrix[1000][1000]) {
    for (int row = 0; row < 1000; ++row) {
        for (int col = 0; col < 1000; ++col) {
            // Sequential access -> cache lines and the hardware prefetcher are used fully.
            matrix[row][col] *= 2;
        }
    }
}
```
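
A minimal timing sketch to compare the two, assuming the functions above are in scope (the matrix is heap-allocated because ~4 MB would overflow a typical stack; compile with optimizations, exact numbers depend on the CPU):

```c++
#include <chrono>
#include <cstdio>

int main() {
    auto matrix = new int[1000][1000]();   // ~4 MB, zero-initialized

    // Returns the wall-clock time of one full traversal in milliseconds.
    auto measure = [&](void (*fn)(int[1000][1000])) {
        auto start = std::chrono::steady_clock::now();
        fn(matrix);
        auto end = std::chrono::steady_clock::now();
        return std::chrono::duration<double, std::milli>(end - start).count();
    };

    std::printf("column-wise: %.3f ms\n", measure(process_columns));
    std::printf("row-wise:    %.3f ms\n", measure(process_rows));

    delete[] matrix;
    return 0;
}
```
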
### Data Padding

With a 4-byte `int`, the compiler inserts 3 padding bytes after `a` and 3 more after `c`, so `sizeof(Data)` is 12 instead of 6, wasting 6 bytes:

```c++
struct Data {
    char a; // 1 B + 3 B padding (to align b)
    int  b; // 4 B
    char c; // 1 B + 3 B trailing padding (to keep array elements aligned)
};          // sizeof(Data) == 12
```

Reordering the members so the two `char`s sit next to each other shrinks the struct to 8 bytes, wasting only 2 bytes:

```c++
struct Data {
    char a; // 1 B
    char c; // 1 B + 2 B padding (to align b)
    int  b; // 4 B
};          // sizeof(Data) == 8
```
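
A quick way to verify the layout on your platform (a minimal sketch; the structs are renamed here so both versions can live in one translation unit, and the asserted sizes assume a 4-byte `int` with 4-byte alignment):

```c++
#include <cstdio>

struct PaddedData    { char a; int b; char c; };  // member order as in the first example
struct ReorderedData { char a; char c; int b; };  // member order as in the second example

static_assert(sizeof(PaddedData) == 12, "assumes 4-byte int / 4-byte alignment");
static_assert(sizeof(ReorderedData) == 8, "assumes 4-byte int / 4-byte alignment");

int main() {
    std::printf("PaddedData:    %zu B\n", sizeof(PaddedData));
    std::printf("ReorderedData: %zu B\n", sizeof(ReorderedData));
    return 0;
}
```
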
### Cache line sharing between CPU cores

When working with multi-threading you may choose atomic variables and atomic operations to reduce locking in your application. You might assume that a value `a[0]` used by thread 1 on core 1 and a value `a[1]` used by thread 2 on core 2 cannot affect each other's performance. However, this is wrong. Core 1 and core 2 have separate L1 and L2 caches, BUT the CPU doesn't load individual variables, it loads entire cache lines (e.g. 64 bytes). If you define `int a[2]`, both elements will almost certainly land on the same cache line, and therefore thread 1 and thread 2 have to wait on each other whenever they perform atomic writes. This effect is known as false sharing.
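
A minimal sketch of one way to avoid this, assuming 64-byte cache lines: give each counter its own cache line via `alignas` (C++17's `std::hardware_destructive_interference_size` from `<new>` can replace the hard-coded 64 where the library supports it):

```c++
#include <atomic>
#include <cstdio>
#include <thread>

// Each counter is aligned (and thereby padded) to its own 64-byte cache line,
// so writes from one thread no longer invalidate the other thread's cache line.
struct alignas(64) PaddedCounter {
    std::atomic<int> value{0};
};

PaddedCounter counters[2];

void work(int idx) {
    for (int i = 0; i < 1000000; ++i) {
        counters[idx].value.fetch_add(1, std::memory_order_relaxed);
    }
}

int main() {
    std::thread t1(work, 0);
    std::thread t2(work, 1);
    t1.join();
    t2.join();
    std::printf("%d %d\n", counters[0].value.load(), counters[1].value.load());
    return 0;
}
```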