Update cpp.md
Signed-off-by: Dennis Eichhorn <spl1nes.com@googlemail.com>
@@ -80,6 +80,54 @@ https://www.agner.org/optimize/instruction_tables.pdf

| Memory unit | Typical size |
| ----------- | ------------ |
| Cache Line  | 64 B         |
| Page Size   | 4 KB         |

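These are typical values on current x86-64 systems; the real numbers vary by platform. A minimal sketch for querying them at runtime, assuming a POSIX system (`_SC_LEVEL1_DCACHE_LINESIZE` is a glibc extension and may be missing elsewhere; `std::hardware_destructive_interference_size` needs C++17 and library support):

```c++
#include <cstdio>
#include <new>        // std::hardware_destructive_interference_size (C++17)
#include <unistd.h>   // sysconf (POSIX)

int main() {
    // Compile-time cache line hint provided by the standard library.
    std::printf("interference size: %zu B\n", std::hardware_destructive_interference_size);

    // Values reported by the operating system at runtime.
    std::printf("page size:         %ld B\n", sysconf(_SC_PAGESIZE));
#ifdef _SC_LEVEL1_DCACHE_LINESIZE
    std::printf("L1 cache line:     %ld B\n", sysconf(_SC_LEVEL1_DCACHE_LINESIZE));
#endif
    return 0;
}
```
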
### Cache locality

Column-wise traversal (cache-unfriendly): the inner loop steps through rows, so consecutive accesses are 4000 bytes apart and almost every access pulls a new cache line into the cache.

```c++
void process_columns(int matrix[1000][1000]) {
    for (int col = 0; col < 1000; ++col) {
        for (int row = 0; row < 1000; ++row) {
            // Each iteration jumps one full row (4000 B) ahead -> frequent cache misses.
            matrix[row][col] *= 2;
        }
    }
}
```

Row-wise traversal (cache-friendly): the inner loop walks contiguous memory, so one loaded cache line serves 16 consecutive `int`s before the next line is fetched.

```c++
void process_rows(int matrix[1000][1000]) {
    for (int row = 0; row < 1000; ++row) {
        for (int col = 0; col < 1000; ++col) {
            // Sequential access -> cache lines and the hardware prefetcher are used fully.
            matrix[row][col] *= 2;
        }
    }
}
```
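
A minimal timing sketch to compare the two, assuming the functions above are in scope (the matrix is heap-allocated because ~4 MB would overflow a typical stack; compile with optimizations, exact numbers depend on the CPU):

```c++
#include <chrono>
#include <cstdio>

int main() {
    auto matrix = new int[1000][1000]();   // ~4 MB, zero-initialized

    // Returns the wall-clock time of one full traversal in milliseconds.
    auto measure = [&](void (*fn)(int[1000][1000])) {
        auto start = std::chrono::steady_clock::now();
        fn(matrix);
        auto end = std::chrono::steady_clock::now();
        return std::chrono::duration<double, std::milli>(end - start).count();
    };

    std::printf("column-wise: %.3f ms\n", measure(process_columns));
    std::printf("row-wise:    %.3f ms\n", measure(process_rows));

    delete[] matrix;
    return 0;
}
```
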
### Data Padding

With a 4-byte `int`, the compiler inserts 3 padding bytes after `a` and 3 more after `c`, so `sizeof(Data)` is 12 instead of 6, wasting 6 bytes:

```c++
struct Data {
    char a; // 1 B + 3 B padding (to align b)
    int  b; // 4 B
    char c; // 1 B + 3 B trailing padding (to keep array elements aligned)
};          // sizeof(Data) == 12
```

Reordering the members so the two `char`s sit next to each other shrinks the struct to 8 bytes, wasting only 2 bytes:

```c++
struct Data {
    char a; // 1 B
    char c; // 1 B + 2 B padding (to align b)
    int  b; // 4 B
};          // sizeof(Data) == 8
```
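
A quick way to verify the layout on your platform (a minimal sketch; the structs are renamed here so both versions can live in one translation unit, and the asserted sizes assume a 4-byte `int` with 4-byte alignment):

```c++
#include <cstdio>

struct PaddedData    { char a; int b; char c; };  // member order as in the first example
struct ReorderedData { char a; char c; int b; };  // member order as in the second example

static_assert(sizeof(PaddedData) == 12, "assumes 4-byte int / 4-byte alignment");
static_assert(sizeof(ReorderedData) == 8, "assumes 4-byte int / 4-byte alignment");

int main() {
    std::printf("PaddedData:    %zu B\n", sizeof(PaddedData));
    std::printf("ReorderedData: %zu B\n", sizeof(ReorderedData));
    return 0;
}
```
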
### Cache line sharing between CPU cores

When working with multi-threading you may choose atomic variables and atomic operations to reduce locking in your application. You might assume that a value `a[0]` used by thread 1 on core 1 and a value `a[1]` used by thread 2 on core 2 cannot affect each other's performance. However, this is wrong. Core 1 and core 2 have separate L1 and L2 caches, BUT the CPU doesn't load individual variables, it loads entire cache lines (e.g. 64 bytes). If you define `int a[2]`, both elements will almost certainly land on the same cache line, and therefore thread 1 and thread 2 have to wait on each other whenever they perform atomic writes. This effect is known as false sharing.
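
A minimal sketch of one way to avoid this, assuming 64-byte cache lines: give each counter its own cache line via `alignas` (C++17's `std::hardware_destructive_interference_size` from `<new>` can replace the hard-coded 64 where the library supports it):

```c++
#include <atomic>
#include <cstdio>
#include <thread>

// Each counter is aligned (and thereby padded) to its own 64-byte cache line,
// so writes from one thread no longer invalidate the other thread's cache line.
struct alignas(64) PaddedCounter {
    std::atomic<int> value{0};
};

PaddedCounter counters[2];

void work(int idx) {
    for (int i = 0; i < 1000000; ++i) {
        counters[idx].value.fetch_add(1, std::memory_order_relaxed);
    }
}

int main() {
    std::thread t1(work, 0);
    std::thread t2(work, 1);
    t1.join();
    t2.join();
    std::printf("%d %d\n", counters[0].value.load(), counters[1].value.load());
    return 0;
}
```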