diff --git a/standards/cpp.md b/standards/cpp.md
index 6d6c623..d525e64 100755
--- a/standards/cpp.md
+++ b/standards/cpp.md
@@ -41,19 +41,15 @@ When writing code keep the following topics in mind:
 
 Branched code
 ```c++
-for (int i = 0; i < N; i++)
-  if (a[i] < 50) {
-    s += a[i];
-  }
+if (a < 50) {
+  b += a;
 }
 ```
 
 Branchless code
 ```c++
-for (int i = 0; i < N; i++)
-  s += (a[i] < 50) * a[i];
-}
+b += (a < 50) * a;
 ```
 
 ### Instruction table latency
 
@@ -72,6 +68,18 @@
 
 https://www.agner.org/optimize/instruction_tables.pdf
 
+### Cache sizes
+
+| Component    | Typical value   |
+|--------------|-----------------|
+| L1 cache     | 32 - 48 KB      |
+| L2 cache     | 2 - 4 MB        |
+| L3 cache     | 8 - 36 MB       |
+| L4 cache     | 0 - 128 MB      |
+| Clock speed  | 3.5 - 6.2 GHz   |
+| Cache line   | 64 B            |
+| Page size    | 4 KB            |
+
 ### Cache line sharing between CPU cores
 
 When working with multi-threading you may choose to use atomic variables and atomic operations to reduce locking in your application. You might think that a value `a[0]` used by thread 1 on core 1 and a value `a[1]` used by thread 2 on core 2 cannot affect each other's performance. However, this is wrong. Core 1 and core 2 have separate L1 and L2 caches, but the CPU doesn't just load individual variables; it loads entire cache lines (e.g. 64 bytes). This means that if you define `int a[2]`, both elements have a high chance of sitting on the same cache line, and therefore thread 1 and thread 2 have to wait on each other when doing atomic writes (this is known as false sharing).
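
A minimal sketch of how to avoid this false sharing, assuming a 64-byte cache line (the `PaddedCounter` name and iteration counts are purely illustrative): align each per-thread counter to its own cache line so the two atomics never share a line.

```c++
#include <atomic>
#include <thread>

// Each element is padded/aligned to a full 64-byte cache line (assumed size),
// so a[0] and a[1] can no longer end up on the same line.
struct alignas(64) PaddedCounter {
    std::atomic<int> value{0};
};

PaddedCounter a[2];

int main() {
    // Two threads each increment "their own" counter; with the padding above,
    // the atomic writes on core 1 and core 2 no longer invalidate each other's
    // cached copies of the line.
    std::thread t1([] { for (int i = 0; i < 1000000; ++i) a[0].value.fetch_add(1, std::memory_order_relaxed); });
    std::thread t2([] { for (int i = 0; i < 1000000; ++i) a[1].value.fetch_add(1, std::memory_order_relaxed); });
    t1.join();
    t2.join();
}
```

On C++17 and later, `std::hardware_destructive_interference_size` (from `<new>`) can be used instead of hard-coding 64.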