4 years ago · defd6da6a3
--- a/config/general/computer-testing.md
+++ b/config/general/computer-testing.md
@@ -6,12 +6,6 @@ breadcrumbs:
 
				 ---
			
 
				 {% include header.md %}
			
 
				 
			
 
				-## Information Gathering
			
 
				-
			
 
				-### Linux
			
 
				-
			
 
				-- Show CPU vulnerabilities: `tail -n +1 /sys/devices/system/cpu/vulnerabilities/*`
			
 
				-
			
 
				 ## CPU
			
 
				 
			
 
				 ### Prime95
			
@@ -74,4 +68,23 @@ fio --name=random-write --ioengine=posixaio --rw=randwrite --bs=1m --size=16G --
 
				 - For health testing.
			
 
				 - See [smartmontools](/config/linux-general/applications/#smartmontools).
			
 
				 
			
 
				+## Miscellanea
			
 
				+
			
 
				+### Linux
			
 
				+
			
 
				+- Show CPU vulnerabilities: `tail -n +1 /sys/devices/system/cpu/vulnerabilities/*`
			
 
				+- PCIe link speed for device:
			
 
				+    - Put the device under load first, since an idle link may be downclocked to save power and would then report a lower speed.
			
 
				+    - Run `sudo lspci -vv`, find the device (e.g. `NVIDIA Corporation TU102 [GeForce RTX 2080 Ti Rev. A]`) and look for the `LnkCap` and `LnkSta` lines under "Capabilities".
			
 
				+    - `LnkCap` is the device's maximum capability and `LnkSta` is the current link status. Both show the per-lane transfer rate in GT/s, which identifies the PCIe version, and the number of lanes. For PCIe 3 and later, the GT/s figure is numerically close to the GB/s throughput of 8 lanes.
			
 
				+    - Example `LnkSta` (1): `Speed 16GT/s (ok), Width x16 (ok)`, meaning PCIe 4.0, using 16 lanes.
			
 
				+    - Example `LnkSta` (2): `Speed 8GT/s (ok), Width x4 (downgraded)`, meaning PCIe 3.0, downgraded to 4 lanes, e.g. because the motherboard can't run that many PCIe devices at full width.
			
 
				+    - PCIe speed cheat sheet:
			
 
				+        - PCIe 1 (2.5GT/s, 250MB/s per lane)
			
 
				+        - PCIe 2 (5GT/s, 500MB/s per lane)
			
 
				+        - PCIe 3 (8GT/s, 985MB/s per lane)
			
 
				+        - PCIe 4 (16GT/s, 1.97GB/s per lane)
			
 
				+        - PCIe 5 (32GT/s, 3.94GB/s per lane)
			
 
				+        - PCIe 6 (64GT/s, 7.88GB/s per lane)
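A minimal sketch of how the cheat sheet above could be applied to an `LnkCap` or `LnkSta` line. The function name `describe_link`, the lookup table and the regular expression are illustrative assumptions, not part of `lspci`. The per-lane figures are copied from the cheat sheet, and the total is only an approximate upper bound.

```python
import re

# GT/s -> (PCIe generation, approx. GB/s per lane), copied from the cheat sheet.
PCIE_GENERATIONS = {
    2.5: (1, 0.250),
    5.0: (2, 0.500),
    8.0: (3, 0.985),
    16.0: (4, 1.97),
    32.0: (5, 3.94),
    64.0: (6, 7.88),
}

# Assumed line shape: "... Speed <n>GT/s ..., Width x<lanes> ..."
LINK_RE = re.compile(r"Speed\s+([\d.]+)GT/s.*?Width\s+x(\d+)")


def describe_link(line):
    """Return (generation, lanes, approx. total GB/s) for an LnkCap/LnkSta line."""
    match = LINK_RE.search(line)
    if match is None:
        raise ValueError(f"no speed/width found in: {line!r}")
    speed = float(match.group(1))
    lanes = int(match.group(2))
    if speed not in PCIE_GENERATIONS:
        raise ValueError(f"unknown PCIe speed: {speed} GT/s")
    generation, per_lane = PCIE_GENERATIONS[speed]
    return generation, lanes, round(per_lane * lanes, 2)


if __name__ == "__main__":
    print(describe_link("LnkSta: Speed 16GT/s (ok), Width x16 (ok)"))
    print(describe_link("LnkSta: Speed 8GT/s (ok), Width x4 (downgraded)"))
```

Comparing the result for the `LnkCap` line with the result for the `LnkSta` line shows whether the link currently runs below what the device supports.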
			
 
				+
			
 
				 {% include footer.md %}
			
--- a/se/hpc/cuda.md
+++ b/se/hpc/cuda.md
@@ -63,7 +63,7 @@ breadcrumbs:
 
				 - The only memory the host can copy data into or out of.
			
 
				 - The only memory threads from different blocks can share data in.
			
 
				 - Statically declared in global scope using the `__device__` declaration or dynamically allocated using `cudaMalloc`.
			
 
				-- Global memory coalescing: When multiple threads in a warp access global memory, the device will try to _coalesce_ the access into as few transactions as possible in order to minimize memory load.
			
 
				+- Global memory coalescing: When multiple threads in a warp access global memory in a contiguous (unit-stride) fashion (e.g. when all threads in the warp access sequential elements of an array), the device will try to _coalesce_ the accesses into as few transactions as possible in order to minimize memory load.
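The effect of the access pattern on coalescing can be illustrated with a deliberately simplified host-side model, written in Python rather than CUDA. It assumes a warp of 32 threads, 4-byte elements and 32-byte memory segments, and counts how many distinct segments the warp touches when thread `i` reads element `i * stride`. The function name `count_transactions` and all parameters are hypothetical. Real hardware behaviour depends on the GPU architecture and its caches.

```python
def count_transactions(stride, warp_size=32, element_bytes=4, segment_bytes=32):
    """Count distinct memory segments touched when thread i reads element i * stride.

    Simplified model: one transaction per distinct segment, array aligned to a
    segment boundary. All sizes are assumptions, not queried from a device.
    """
    segments = {
        (thread * stride * element_bytes) // segment_bytes
        for thread in range(warp_size)
    }
    return len(segments)


if __name__ == "__main__":
    for stride in (1, 2, 8, 32):
        print(f"stride {stride:2d}: {count_transactions(stride)} transactions")
```

Under these assumptions, sequential access (stride 1) needs 4 transactions for the whole warp, while a stride of 8 elements or more puts every thread in its own segment and needs 32.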
			
 
				 
			
 
				 #### Local Memory