From 20a754151b3a966dc757037b340e95e688f26572 Mon Sep 17 00:00:00 2001
From: Dennis Eichhorn
Date: Mon, 5 Aug 2024 19:35:24 +0200
Subject: [PATCH] Update cpp.md

Signed-off-by: Dennis Eichhorn
---
 standards/cpp.md | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/standards/cpp.md b/standards/cpp.md
index 08bf768..6a22f7d 100755
--- a/standards/cpp.md
+++ b/standards/cpp.md
@@ -36,6 +36,27 @@ When writing code keep the following topics in mind:
 * atomics vs locking (mutex)
 * Cache line sharing between CPU cores
 
+### Branching / Branchless programming
+
+Branched code, where the condition on `a[i]` is an `if` that the CPU's branch predictor has to guess:
+
+```c++
+for (int i = 0; i < N; i++) {
+    if (a[i] < 50) {
+        s += a[i];
+    }
+}
+```
+
+
+Branchless code, where the comparison result (0 or 1) is multiplied into the value so there is no branch left to mispredict:
+
+```c++
+for (int i = 0; i < N; i++) {
+    s += (a[i] < 50) * a[i];
+}
+```
+
 ### Cache line sharing between CPU cores
 
 When working with multi-threading you may choose to use atomic variables and atomic operations to reduce the locking in your application. You may think that a value `a[0]` used by thread 1 on core 1 and a value `a[1]` used by thread 2 on core 2 will have no performance impact. However, this is wrong. Core 1 and core 2 have separate L1 and L2 caches, BUT the CPU doesn't just load individual variables, it loads entire cache lines (e.g. 64 bytes). This means that if you define `int a[2]`, both elements have a high chance of being on the same cache line and therefore thread 1 and thread 2 have to wait on each other when doing atomic writes.
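Not part of the patch above, but a minimal sketch of the padding idea the last paragraph describes: it assumes a C++17 toolchain that provides `std::hardware_destructive_interference_size` (a hard-coded 64 works as a fallback), and the `PaddedCounter` name is made up for illustration.

```c++
#include <atomic>
#include <new>
#include <thread>

// Illustrative only: each counter is aligned to its own cache line, so the
// atomic writes from the two threads do not invalidate each other's L1/L2 copies.
struct alignas(std::hardware_destructive_interference_size) PaddedCounter {
    std::atomic<int> value{0};
};

int main() {
    PaddedCounter a[2]; // a[0] and a[1] now sit on separate cache lines

    std::thread t1([&] {
        for (int i = 0; i < 1000000; i++) {
            a[0].value.fetch_add(1, std::memory_order_relaxed);
        }
    });
    std::thread t2([&] {
        for (int i = 0; i < 1000000; i++) {
            a[1].value.fetch_add(1, std::memory_order_relaxed);
        }
    });

    t1.join();
    t2.join();
}
```

Without the `alignas`, both counters would typically land on the same 64-byte cache line and every `fetch_add` would force the line to ping-pong between the two cores, which is exactly the sharing effect the section warns about.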