
Merge branch 'master' of github.com:HON95/wiki

Håvard Ose Nordstrand 3 years ago
parent
commit
d709f72d63

+ 12 - 1
config/automation/ansible.md

@@ -6,7 +6,7 @@ breadcrumbs:
 ---
 {% include header.md %}
 
-## Modules-ish
+## Resources
 
 ### General Networking
 
@@ -18,4 +18,15 @@ breadcrumbs:
 - [Ansible IOS platform options](https://docs.ansible.com/ansible/latest/network/user_guide/platform_ios.html)
 - [Ansible ios_config module](https://docs.ansible.com/ansible/latest/modules/ios_config_module.html)
 
+## Configuration
+
+Example `/etc/ansible/ansible.cfg`:
+
+```
+[defaults]
+# Change to "auto" if this path causes problems
+interpreter_python = /usr/bin/python3
+host_key_checking = false
+```
+
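+To quickly check that the config is picked up, something like this may be used:
+
+```sh
+# Show which config file is in use
+ansible --version
+# Run a trivial module against the implicit localhost to exercise the interpreter setting
+ansible localhost -m ping
+```
+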
 {% include footer.md %}

+ 9 - 1
config/general/computer-testing.md

@@ -32,10 +32,18 @@ breadcrumbs:
 Example usage:
 
 ```sh
-# 1 stressor, 75% of memory, with verification, for 10 minutes
+# 1 stressor, 75% of memory (TODO: this also works with 100% for some reason; figure out what the percentage is actually relative to), with verification, for 10 minutes
 stress-ng --vm 1 --vm-bytes 75% --vm-method all --verify -t 10m -v
 ```
 
+### Error Detection and Correction (EDAC) (Linux)
+
+- Only available on systems with ECC RAM, AFAIK.
+- Check the syslog: `journalctl | grep 'EDAC' | grep -i 'error'`
+- Show corrected (CE) and uncorrected (UE) errors per memory controller and DIMM slot: `grep '.*' /sys/devices/system/edac/mc/mc*/dimm*/dimm_*_count`
+- Show DIMM slot names to help locate the faulty DIMM: `dmidecode -t memory | grep 'Locator:.*DIMM.*'`
+- When changing the DIMM, make sure to run Memtest86 or similar both before and after to validate that the errors go away.
+
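+Roughly what checking the counters may look like (paths and values are illustrative and depend on the memory controller/DIMM layout):
+
+```sh
+grep '.*' /sys/devices/system/edac/mc/mc*/dimm*/dimm_*_count
+# Example output (a non-zero CE count means corrected errors on that DIMM):
+# /sys/devices/system/edac/mc/mc0/dimm0/dimm_ce_count:0
+# /sys/devices/system/edac/mc/mc0/dimm0/dimm_ue_count:0
+# /sys/devices/system/edac/mc/mc0/dimm1/dimm_ce_count:3
+# /sys/devices/system/edac/mc/mc0/dimm1/dimm_ue_count:0
+```
+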
 ## Storage
 
 ### Fio (Linux)

+ 16 - 1
config/general/linux-examples.md

@@ -81,7 +81,7 @@ breadcrumbs:
 - Show sockets:
     - `netstat -tulpn`
        - `tu` for TCP and UDP, `l` for listening, `p` for the owning program/PID, `n` for numerical port numbers.
-    - `ss <options>`
+    - `ss -tulpn` (modern replacement for the netstat variant above)
 - Show interface stats:
     - `ip -s link`
     - `netstat -i`
@@ -99,6 +99,11 @@ breadcrumbs:
     - `nstat`
     - `netstat -s` (statistics)
 
+### Memory
+
+- NUMA stats:
+    - `numastat` (from package `numactl`)
+
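+Roughly what the NUMA stats look like (two-node system, illustrative values):
+
+```sh
+numastat
+#                            node0           node1
+# numa_hit               123456789        98765432
+# numa_miss                      0               0
+# numa_foreign                   0               0
+# interleave_hit             12345           12345
+# local_node             123400000        98700000
+# other_node                 56789           65432
+```
+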
 ### Performance and Power Efficiency
 
 - Set the CPU frequency scaling governor mode:
@@ -106,6 +111,16 @@ breadcrumbs:
     - Power save: `echo powersave | ...`
 - Show current core frequencies: `grep "cpu MHz" /proc/cpuinfo | cut -d' ' -f3`
 
+### Profiling
+
+- Command timer (`time`):
+    - Provided both as a shell built-in (`time`) and as `/usr/bin/time`; use the latter, as it supports the options below.
+    - Syntax: `/usr/bin/time -vp <command>`
+    - Options:
+        - `-p` for POSIX output (one line per time value).
+        - `-v` for verbose info about the process (memory usage, context switches etc.).
+    - It gives the wall time, the time spent in user mode and the time spent in kernel mode (see the example below).
+
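+A minimal example of both output formats (the command is arbitrary):
+
+```sh
+# POSIX output: "real" (wall), "user" and "sys" times on separate lines
+/usr/bin/time -p sleep 1
+# Verbose output: also shows max RSS, page faults, context switches etc.
+/usr/bin/time -v sleep 1
+```
+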
 ### Security
 
 - Show CPU vulnerabilities: `tail -n +1 /sys/devices/system/cpu/vulnerabilities/*`

+ 78 - 0
config/hpc/containers.md

@@ -0,0 +1,78 @@
+---
+title: Containers
+breadcrumbs:
+- title: Configuration
+- title: High-Performance Computing (HPC)
+---
+{% include header.md %}
+
+## Alternative Technologies
+
+### Docker
+
+#### Resources
+
+- Config notes: [Docker](/config/virt-cont/docker/)
+
+#### General Information
+
+- The de facto container solution.
+- It's generally **not recommended for HPC** (see reasons below), but it's fine for running on a local system if that's just more practical for you.
+- Having access to run containers is effectively the same as having root access to the host machine. This is generally not acceptable on shared resources.
+- The daemon adds extra complexity (and overhead/jitter) not required in HPC scenarios.
+- Generally lacks support for typical HPC features like batch system integration, non-local resources and high-performance interconnects.
+
+### Singularity
+
+#### Resources
+
+- Homepage: [Singularity](https://singularity.hpcng.org/)
+- Config notes: [Singularity](/config/hpc/singularity/)
+
+#### Information
+
+- Images:
+    - Uses a format called SIF, which is file-based (appropriate for parallel/distributed filesystems).
+    - Managing images equates to managing files.
+    - Images can still be pulled from repositories (which will download them as files).
+    - Supports Docker images, but will automatically convert them to SIF.
+- No daemon process.
+- Does not require or provide root access to use.
+    - Uses the same user, working directory and env vars as the host (**TODO** more info required).
+- Supports Slurm.
+- Supports GPU (NVIDIA CUDA and AMD ROCm).
+
+### NVIDIA Enroot
+
+#### Resources
+
+- Homepage: [NVIDIA Enroot](https://github.com/NVIDIA/enroot)
+
+#### Information
+
+- Fully unprivileged `chroot`. Works similarly to typical container technologies, but removes "unnecessary" parts of the isolation mechanisms. Converts traditional container/OS images into "unprivileged sandboxes".
+- Newer than some other alternatives.
+- Supports using Docker images (and Docker Hub).
+- No daemon.
+- Slurm integration using NVIDIA's [Pyxis](https://github.com/NVIDIA/pyxis) SPANK plugin.
+- Supports NVIDIA GPUs through NVIDIA's [libnvidia-container](https://github.com/nvidia/libnvidia-container) library and CLI utility.
+    - **TODO** AMD ROCm support?
+
+### Shifter
+
+I've never used it. It's very similar to Singularity.
+
+## Best Practices
+
+- Containers should run as users (the default for e.g. Singularity, but not Docker).
+- Use trusted base images with pinned versions. The same goes for dependencies.
+- Make your own base images with commonly used tools/libs.
+- Datasets and similar do not need to be copied into the image; they can be bind mounted at runtime instead.
+- Spack and EasyBuild may be used to simplify building container recipes (Dockerfiles and Singularity definition files), to avoid boilerplate and bad practices.
+- Buildah may be used to build images without a recipe.
+- Dockerfile (or similar) recommendations:
+    - Combine `RUN` commands to reduce the number of layers and thus the size of the image.
+    - To exploit the build cache, place the most cacheable commands at the top to avoid running them again on rebuilds.
+    - Use multi-stage builds to separate the build-time and runtime images/environments (see the sketch below).
+
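+A minimal Dockerfile sketch tying the last recommendations together (the Go toolchain, file names and image tags are placeholders, not from these notes):
+
+```
+# Build stage. The most cacheable steps (dependency download) come first,
+# so source-only changes don't invalidate them on rebuild.
+FROM golang:1.17 AS build
+WORKDIR /src
+COPY go.mod go.sum ./
+RUN go mod download
+COPY . .
+RUN go build -o /out/app
+
+# Runtime stage: only the built binary is carried over, keeping the final image small.
+FROM debian:bullseye-slim
+COPY --from=build /out/app /usr/local/bin/app
+ENTRYPOINT ["app"]
+```
+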
+{% include footer.md %}

+ 111 - 0
config/hpc/interconnects.md

@@ -0,0 +1,111 @@
+---
+title: Interconnects
+breadcrumbs:
+- title: Configuration
+- title: High-Performance Computing (HPC)
+---
+{% include header.md %}
+
+Using **Debian**, unless otherwise stated.
+
+## Related Pages
+
+- [Linux Switching & Routing](/config/network/linux/)
+
+## General
+
+- The technology should implement RDMA, such that the CPU is not involved in transferring data between hosts (a form of zero-copy). CPU involvement would generally increase latency, increase jitter and limit bandwidth, as well as making the CPU processing power less available to other processing. This implies that the network card must be intelligent/smart and implement hardware offloading for the protocol.
+- The technology should provide a rich communication interface to (userland) applications. The interface should not involve the kernel, as unnecessary buffering and context switches would again lead to increased latency, increased jitter, limited bandwidth and excessive CPU usage. Instead of using the TCP/IP stack on top of the interconnect, Infiniband and RoCE (for example) provide "verbs" that applications use to communicate with each other over the interconnect.
+- The technology should support one-sided communication.
+- OpenFabrics Enterprise Distribution (OFED) is a unified stack supporting IB, RoCE and iWARP, exposed to applications through a set of interfaces called the OpenFabrics Interfaces (OFI). Libfabric is the user-space API.
+- UCX is another unifying stack somewhat similar to OFED. Its UCP API is equivalent to OFED's Libfabric API.
+
+## Ethernet
+
+### Info
+
+- More appropriate for commodity clusters due to Ethernet NICs and switches being available off-the-shelf at lower prices.
+- Support for RoCE (RDMA) is recommended to avoid the overhead of the kernel TCP/IP stack (as typically used with Ethernet).
+- Ethernet-based interconnects include commodity/plain Ethernet, Internet Wide-Area RDMA Protocol (iWARP), Virtual Protocol Interconnect (VPI) (Infiniband and Ethernet on same card), and RDMA over Converged Ethernet (RoCE) (Infiniband running over Ethernet).
+
+## RDMA Over Converged Ethernet (RoCE)
+
+### Info
+
+- Link layer is Converged Ethernet (CE) (aka data center bridging (DCB)), but the upper protocols are Infiniband (IB).
+- v1 uses an IB network layer and is limited to a single broadcast domain, but v2 uses a UDP/IP network layer (and somewhat transport layer) and is routable over IP routers. Both use an IB transport layer.
+- RoCE requires the NICs and switches to support it.
+- It performs very similarly to Infiniband, given equal hardware.
+
+## InfiniBand
+
+### Info
+
+- Each physical connection uses a specified number of links/lanes (typically x4), such that the throughput is aggregated.
+- Per-lane throughput:
+    - SDR: 2Gb/s
+    - DDR: 4Gb/s
+    - QDR: 8Gb/s
+    - FDR10: 10Gb/s
+    - FDR: 13.64Gb/s
+    - EDR: 25Gb/s
+    - HDR: 50Gb/s
+    - NDR: 100Gb/s
+    - XDR: 250Gb/s
+- The network adapter is called a host channel adapter (HCA).
+- It's typically switched, but routing between subnets is supported as well.
+- Channel endpoints between applications are called queue pairs (QPs).
+- To avoid invoking the kernel when communicating over a channel, the kernel allocates and pins a memory region that the userland application and the HCA can both access without further kernel involvement. A local key is used by the application to access the HCA buffers and an unencrypted remote key is used by the remote host to access the HCA buffers.
+- Communication uses either channel semantics (the send/receive model, two-sided) or memory semantics (RDMA model, one-sided). It also supports a special type of memory semantics using atomic operations, which is a useful foundation for e.g. distributed locks.
+- Each subnet requires a subnet manager to be running on a switch or a host, which manages the subnet and is queryable by hosts (agents). For very large subnets, it may be appropriate to run it on a dedicated host. It assigns addresses to endpoints, manages routing tables and more.
+
+### Installation
+
+1. Install RDMA: `apt install rdma-core`
+1. Install user-space RDMA stuff: `apt install ibverbs-providers rdmacm-utils infiniband-diags ibverbs-utils`
+1. Install subnet manager (SM): `apt install opensm`
+    - Only one instance is required on the network, but multiple may be used for redundancy.
+    - A master SM is selected based on configured priority, with GUID as a tie breaker.
+1. Setup IPoIB:
+    - Configure it like for Ethernet, but specify the IB interface as the L2 device (see the sketch after this list).
+    - Use an appropriate MTU like 2044.
+1. Make sure ping and ping-pong is working (see examples below).
+
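+A minimal ifupdown sketch for the IPoIB step above (the interface name and addressing are placeholders):
+
+```
+# /etc/network/interfaces.d/ib0
+auto ib0
+iface ib0 inet static
+    address 10.10.10.10/24
+    mtu 2044
+```
+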
+### Usage
+
+- Show IPoIB status: `ip a`
+- Show local devices:
+    - GUIDs: `ibv_devices`
+    - Basics (1): `ibstatus`
+    - Basics (2): `ibstat`
+    - Basics (3): `ibv_devinfo`
+- Show link statuses for network: `iblinkinfo`
+- Show subnet nodes:
+    - Hosts: `ibhosts`
+    - Switches: `ibswitches`
+    - Routers: `ibrouters`
+- Show active subnet manager(s): `sminfo`
+- Show subnet topology: `ibnetdiscover`
+- Show port counters: `perfquery`
+
+#### Testing
+
+- Ping:
+    - Server: `ibping -S`
+    - Client: `ibping -G <guid>`
+- Ping-pong:
+    - Server: `ibv_rc_pingpong -d <device> [-n <iters>]`
+    - Client: `ibv_rc_pingpong [-n <iters>] <ip>`
+- Other tools:
+    - qperf
+    - perftest
+- Diagnose with ibutils:
+    - Requires the `ibutils` package.
+    - Diagnose fabric: `ibdiagnet -ls 10 -lw 4x` (example)
+    - Diagnose path between two nodes: `ibdiagpath -l 65,1` (example)
+
+## NVLink & NVSwitch
+
+See [CUDA](/se/hpc/cuda/).
+
+{% include footer.md %}

+ 53 - 0
config/hpc/singularity.md

@@ -0,0 +1,53 @@
+---
+title: Singularity
+breadcrumbs:
+- title: Configuration
+- title: High-Performance Computing (HPC)
+---
+{% include header.md %}
+
+A container technology for HPC.
+
+## Information
+
+- For more general information and a comparison to other HPC container technologies, see [Containers](/config/hpc/containers/).
+
+## Configuration
+
+**TODO**
+
+## Usage
+
+### Running
+
+- Run command: `singularity exec <opts> <img> <cmd>`
+- Run interactive shell: `singularity shell <opts> <img>`
+- Mounts:
+    - The current directory is mounted and used as the working directory by default.
+- Env vars:
+    - Env vars are copied from the host, but `--cleanenv` may be used to avoid that.
+    - Extra env vars can be specified using `--env <var>=<val>`.
+- GPUs:
+    - See the GPUs section below.
+    - Specify `--nv` (NVIDIA) or `--rocm` (AMD) to expose GPUs.
+
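+Example invocations (the image file, command and bind path are placeholders):
+
+```sh
+# Run a command in a container, with NVIDIA GPUs exposed
+singularity exec --nv my_image.sif python3 train.py
+# Run an interactive shell with a clean environment and an extra bind mount
+singularity shell --cleanenv --bind /data my_image.sif
+```
+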
+### Images
+
+- Pull image from repo:
+    - Will place the image as a SIF file (`<image>_<tag>.sif`) in the current directory.
+    - Docker Hub: `singularity pull docker://<img>:<tag>`
+
+### GPUs
+
+- The GPU driver library must be exposed in the container and `LD_LIBRARY_PATH` must be updated.
+- Specify `--nv` (NVIDIA) or `--rocm` (AMD) when running a container.
+
+### MPI
+
+- Using the "bind approach", where MPI and the interconnect are bind mounted into the container.
+- MPI is installed in the container in order to build the application with dynamic linking.
+- MPI is installed on the host such that the application can dynamically load it at run time.
+- The MPI implementations must be of the same family and preferably the same version (for ABI compatibility). While MPICH, IntelMPI, MVAPICH and CrayMPICH use the same ABI, Open MPI does not comply with that ABI.
+- When running the application, both the MPI implementation and the interconnect must be bind mounted into the container, and an appropriate `LD_LIBRARY_PATH` must be provided for the MPI libraries (see the sketch below). This may be statically configured by the system admin.
+
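+A rough sketch of the bind approach (MPI paths, image and program names are placeholders and depend on the host installation):
+
+```sh
+# Launch with the host's MPI, bind mounting the host MPI installation into the
+# container and pointing LD_LIBRARY_PATH at its libraries
+mpirun -n 4 singularity exec \
+    --bind /opt/mpi \
+    --env LD_LIBRARY_PATH=/opt/mpi/lib \
+    my_mpi_image.sif /app/my_mpi_app
+```
+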
+{% include footer.md %}

+ 5 - 2
config/linux-server/applications.md

@@ -414,11 +414,14 @@ echo -e "Time: $(date)\nMessage: $@" | mail -s "NUT: $@" root
 
 ### Usage
 
+- Show UPSes: `upsc -l`
+- Show UPS vars: `upsc <ups>`
+
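+Example output (the UPS ID is a placeholder and the values are illustrative):
+
+```sh
+upsc myups
+# battery.charge: 100
+# battery.runtime: 1200
+# ups.load: 23
+# ups.status: OL
+```
+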
 #### Query the Server
 
 1. Telnet into it: `telnet localhost 3493`
-1. List UPSes: `LIST UPS` (the second field is the UPS ID)
-1. List variables: `LIST VAR <ups>`
+1. Show UPSes: `LIST UPS`
+1. Show UPS vars: `LIST VAR <ups>`
 
 ## OpenSSL
 

+ 7 - 14
config/linux-server/debian.md

@@ -108,10 +108,9 @@ The first steps may be skipped if already configured during installation (i.e. n
     - Fix YAML formatting globally: In `/etc/vim/vimrc.local`, add `autocmd FileType yaml setlocal ts=2 sts=2 sw=2 expandtab`.
 1. Add mount options:
     - Setup hidepid:
-        - **TODO** Use existing `adm` group instead of creating a new one?
-        - Add PID monitor group: `groupadd -g 500 hidepid` (example GID)
-        - Add your personal user to the PID monitor group: `usermod -aG hidepid <user>`
-        - Enable hidepid in `/etc/fstab`: `proc /proc proc defaults,hidepid=2,gid=500 0 0`
+        - Note: The `adm` group will be granted access.
+        - Add your personal user to the PID monitor group: `usermod -aG adm <user>`
+        - Enable hidepid in `/etc/fstab`: `proc /proc proc defaults,hidepid=2,gid=<adm-gid> 0 0` (using the numerical GID of `adm`)
     - (Optional) Disable the tiny swap partition added by the guided installer by commenting it in the fstab.
     - (Optional) Setup extra mount options: See [Storage](system.md).
     - Run `mount -a` to validate fstab.
@@ -121,7 +120,7 @@ The first steps may be skipped if already configured during installation (i.e. n
     - Add the relevant groups (using `usermod -aG <group> <user>`):
         - `sudo` for sudo access.
         - `systemd-journal` for system log access.
-        - `hidepid` (whatever it's called) if using hidepid, to see all processes.
+        - `adm` for hidepid, to see all processes (if using hidepid).
     - Add your personal SSH pubkey to `~/.ssh/authorized_keys` and fix the owner and permissions (700 for dir, 600 for file).
         - Hint: Get `https://github.com/<user>.keys` and filter the results.
     - Try logging in remotely and gain root access through sudo.
@@ -230,12 +229,7 @@ Prevent enabled (and potentially untrusted) interfaces from accepting router adv
     - (Optional) `DNSSEC`: Set to `no` to disable (only if you have a good reason to, like avoiding the chicken-and-egg problem with DNSSEC and NTP).
 1. (Optional) If you're hosting a DNS server on this machine, set `DNSStubListener=no` to avoid binding to port 53.
 1. Enable the service: `systemctl enable --now systemd-resolved.service`
-1. Fix `/etc/resolv.conf`:
-    - Note: The systemd-generated one is `/run/systemd/resolve/stub-resolv.conf`.
-    - Note: Simply symlinking `/etc/resolv.conf` to the systemd one will cause dhclient to overwrite it if using DHCP for any interfaces, so don't do that.
-    - Note: This method may cause `/etc/resolv.conf` to become outdated if the systemd one changes for some reason (e.g. if the search domains change).
-    - After configuring and starting resolved, copy (not link) `resolv.conf`: `cp /run/systemd/resolve/stub-resolv.conf /etc/resolv.conf`
-    - Make it immutable so dhclient can't update it: `chattr +i /etc/resolv.conf`
+1. Link `/etc/resolv.conf`: `ln -sf /run/systemd/resolve/stub-resolv.conf /etc/resolv.conf`
 1. Check status: `resolvectl`
 
 ##### Using resolv.conf (Alternative 2)
@@ -308,12 +302,11 @@ Everything here is optional.
     - Install: `apt install lynis`
     - Run: `lynis audit system`
 - MOTD:
-    - Clear `/etc/motd` and `/etc/issue`.
-    - Download [dmotd.sh](https://github.com/HON95/scripts/blob/master/server/linux/general/dmotd.sh) to `/etc/profile.d/`.
+    - Clear `/etc/motd`, `/etc/issue` and `/etc/issue.net`.
+    - Download [dmotd.sh](https://github.com/HON95/scripts/blob/master/linux/login/dmotd.sh) to `/etc/profile.d/`.
     - Install the dependencies: `neofetch lolcat`
     - Add an ASCII art (or Unicode art) logo to `/etc/logo`, using e.g. [TAAG](http://patorjk.com/software/taag/).
     - (Optional) Add a MOTD to `/etc/motd`.
-    - (Optional) Clear or change the pre-login message in `/etc/issue`.
     - Test it: `su - <some-normal-user>`
 - Setup monitoring:
     - Use Prometheus with node exporter or something and set up alerts.

+ 0 - 69
config/linux-server/networking.md

@@ -1,69 +0,0 @@
----
-title: Linux Server Networking
-breadcrumbs:
-- title: Configuration
-- title: Linux Server
----
-{% include header.md %}
-
-Using **Debian**, unless otherwise stated.
-
-### TODO
-{:.no_toc}
-
-- Migrate stuff from Debian page.
-- Add link to Linux router page. Maybe combine.
-- Add ethtool notes from VyOS.
-
-## Related Pages
-
-- [Linux Switching & Routing](/config/network/linux/)
-
-## InfiniBand
-
-### Installation
-
-1. Install RDMA: `apt install rdma-core`
-1. Install user-space RDMA stuff: `apt install ibverbs-providers rdmacm-utils infiniband-diags ibverbs-utils`
-1. Install subnet manager (SM): `apt install opensm`
-    - Only one instance is required on the network, but multiple may be used for redundancy.
-    - A master SM is selected based on configured priority, with GUID as a tie breaker.
-1. Setup IPoIB:
-    - Just like for Ethernet. Just specify the IB interface as the L2 device.
-    - Use an appropriate MTU like 2044.
-1. Make sure ping and ping-pong is working (see examples below).
-
-### Usage
-
-- Show IPoIB status: `ip a`
-- Show local devices:
-    - GUIDs: `ibv_devices`
-    - Basics (1): `ibstatus`
-    - Basics (2): `ibstat`
-    - Basics (3): `ibv_devinfo`
-- Show link statuses for network: `iblinkinfo`
-- Show subnet nodes:
-    - Hosts: `ibhosts`
-    - Switches: `ibswitches`
-    - Routers: `ibrouters`
-- Show active subnet manager(s): `sminfo`
-- Show subnet topology: `ibnetdiscover`
-- Show port counters: `perfquery`
-
-#### Testing
-
-- Ping:
-    - Server: `ibping -S`
-    - Client: `ibping -G <guid>`
-- Ping-pong:
-    - Server: `ibv_rc_pingpong -d <device> [-n <iters>]`
-    - Client: `ibv_rc_pingpong [-n <iters>] <ip>`
-- Other tools:
-    - qperf
-    - perftest
-- Diagnose with ibutils:
-    - Requires the `ibutils` package.
-    - Diagnose fabric: `ibdiagnet -ls 10 -lw 4x` (example)
-    - Diagnose path between two nodes: `ibdiagpath -l 65,1` (example)
-
-{% include footer.md %}

+ 11 - 6
config/virt-cont/docker.md

@@ -17,11 +17,11 @@ Using **Debian**.
     - In `/etc/default/grub`, add `cgroup_enable=memory swapaccount=1` to `GRUB_CMDLINE_LINUX`.
     - Run `update-grub` and reboot the system.
 1. (Recommended) Setup IPv6 firewall and NAT:
-    - By default, Docker does not add any IPTables NAT rules or filter rules, which leaves Docker IPv6 networks open (bad) and requires using a routed prefix (sometimes inpractical). While using using globally routable IPv6 is the gold standard, Docker does not provide firewalling for that when not using NAT as well.
+    - (Info) By default, Docker does not enable IPv6 for containers and does not add any IP(6)Tables rules for the NAT or filter tables, which you need to take into consideration if you plan to use IPv6 (with or without automatic IPTables rules). See the miscellaneous note below on IPv6 support for more info about its brokenness and the implications of that. Docker _does_ however recently support handling IPv6 subnets similarly to IPv4, i.e. using NAT masquerading and appropriate firewalling. It doesn't work properly for internal networks, though, as it breaks IPv6 ND. The following steps (and the example config below) describe how to set that up, as it is the only working solution IMO. MACVLANs with external routers will not be NAT-ed.
     - Open `/etc/docker/daemon.json`.
     - Set `"ipv6": true` to enable IPv6 support at all.
-    - Set `"fixed-cidr-v6": "<prefix/64>"` to some [generated](https://simpledns.plus/private-ipv6) (ULA) or publicly routable (GUA) /64 prefix, to be used by the default bridge.
-    - Set `"ip6tables": true` to enable adding filter and NAT rules to IP6Tables (required for both security and NAT). This only affects non-internal bridges and not e.g. MACVLANs with external routers.
+    - Set `"fixed-cidr-v6": "<prefix/64>"` to a [randomly generated](https://simpledns.plus/private-ipv6) ULA /64 prefix (if using NAT masquerading) or a routable GUA/ULA /64 prefix (if not), to be used by the default bridge.
+    - Set `"ip6tables": true` to enable automatic filter and NAT rules through IP6Tables (required for both security and NAT).
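+    - Example `/etc/docker/daemon.json` for the above (the ULA prefix is a placeholder, generate your own):
+
+      ```
+      {
+        "ipv6": true,
+        "fixed-cidr-v6": "fd12:3456:789a:1::/64",
+        "ip6tables": true
+      }
+      ```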
 1. (Optional) Change IPv4 network pool:
     - In `/etc/docker/daemon.json`, set `"default-address-pools": [{"base": "10.0.0.0/16", "size": 24}]`.
 1. (Optional) Change default DNS servers for containers:
@@ -115,10 +115,15 @@ See the [installation guide](https://docs.nvidia.com/datacenter/cloud-native/con
 
 ### IPv6 Support
 
-- TL;DR: Docker doesn't prioritize implementing IPv6 properly.
-- While IPv4 uses IPTables filter rules for firewalling and IPTables NAT rules for masquerading and port forwarding, it generally uses no such mechanisms when enabling IPv6 (using `"ipv6": true`). Setting `"ip6tables": true` (disabled by default) is required to mimic the IPv4 behavior of filtering and NAT-ing. To disable NAT masquerading for both IPv4 and IPv6, set `enable_ip_masquerade=false` on individual networks. Disabling NAT masquerading for only IPv6 is not yet possible. (See [moby/moby #13481](https://github.com/moby/moby/issues/13481), [moby/moby #21951](https://github.com/moby/moby/issues/21951), [moby/moby #25407](https://github.com/moby/moby/issues/25407), [moby/libnetwork #2557](https://github.com/moby/libnetwork/issues/2557).)
+- TL;DR: Docker doesn't properly support IPv6.
+- While IPv6 base support may be enabled by setting `"ipv6": true` in the daemon config (disabled by default), it does not add any IP(6)Tables rules for the filter and NAT tables, as it does for IPv4/IPTables. (See [moby/moby #13481](https://github.com/moby/moby/issues/13481), [moby/moby #21951](https://github.com/moby/moby/issues/21951), [moby/moby #25407](https://github.com/moby/moby/issues/25407), [moby/libnetwork #2557](https://github.com/moby/libnetwork/issues/2557).)
+- Using `"ipv6": true` without `"ip6tables": true` means the following for IPv6 subnets on Docker bridge networks (and probably other network types):
+    - The IPv6 subnet must use a routable prefix which is actually routed to the Docker host (unlike IPv4 which uses NAT masquerading by default). While this is more appropriate for typical infrastructures, this may be quite impractical for e.g. typical home networks.
+    - If you accept forwarded traffic by default (in e.g. IPTables): The IPv6 subnet is not firewalled in any way, leaving it completely open to other networks "on" or "connected to" the Docker host, meaning you need to manually add IPTables rules to limit access to each Docker network.
+    - If you drop/reject forwarded traffic by default (in e.g. IPTables): The IPv6 subnet is completely closed and hosts on the Docker network can't even communicate between themselves (assuming your system filters bridge traffic). To allow intra-network traffic, you need to manually add something like `ip6tables -A FORWARD -i docker0 -o docker0 -j ACCEPT` for each Docker network. To allow for inter-network traffic, you need to manually add rules for that as well.
+- To enable IPv4-like IPTables support (with NAT-ing and firewalling), set `"ip6tables": true` in the daemon config (disabled by default). If you want to disable NAT masquerading for both IPv4 and IPv6 (while still using the filtering rules provided by `"ip6tables": true`), set `enable_ip_masquerade=false` on individual networks. Disabling NAT masquerading for only IPv6 is not yet possible. MACVLANs with external routers will not get automatically NAT-ed.
 - IPv6-only networks (without IPv4) are not supported. (See [moby/moby #32675](https://github.com/moby/moby/issues/32675), [moby/libnetwork #826](https://github.com/moby/libnetwork/pull/826).)
-- IPv6 communication between containers (ICC) on IPv6-enabled bridges with IP6Tables enabled is broken, due to NDP (using multicast) being blocked by IP6Tables. On non-internal bridges it works fine. One workaround is to not use IPv6 on internal bridges or to not use internal bridges. (See [libnetwork/issues #2626](https://github.com/moby/libnetwork/issues/2626).)
+- IPv6 communication between containers (ICC) on IPv6-enabled _internal_ bridges with IP6Tables enabled is broken, due to IPv6 ND being blocked by the applied IP6Tables rules. On non-internal bridges it works fine. One workaround is to not use IPv6 on internal bridges or to not use internal bridges. (See [libnetwork/issues #2626](https://github.com/moby/libnetwork/issues/2626).)
 - The userland proxy (enabled by default, can be disabled) accepts both IPv4 and IPv6 incoming traffic but uses only IPv4 toward containers, which replaces the IPv6 source address with an internal IPv4 address (I'm not sure which), effectively hiding the real address and possibly bypassing certain defences, as the traffic appears to come from within the local network. It also has other non-IPv6-related problems. (See [moby/moby #11185](https://github.com/moby/moby/issues/11185), [moby/moby #14856](https://github.com/moby/moby/issues/14856), [moby/moby #17666](https://github.com/moby/moby/issues/17666).)
 
 ## Useful Software

+ 3 - 1
index.md

@@ -40,8 +40,11 @@ Random collection of config notes and miscellaneous stuff. _Technically not a wi
 ### HPC
 
 - [Slurm Workload Manager](/config/hpc/slurm/)
+- [Containers](/config/hpc/containers/)
+- [Singularity](/config/hpc/singularity/)
 - [CUDA](/config/hpc/cuda/)
 - [Open MPI](/config/hpc/openmpi/)
+- [Interconnects](/config/hpc/interconnects/)
 
 ### IoT & Home Automation
 
@@ -55,7 +58,6 @@ Random collection of config notes and miscellaneous stuff. _Technically not a wi
 - [Storage](/config/linux-server/storage/)
 - [Storage: ZFS](/config/linux-server/storage-zfs/)
 - [Storage: Ceph](/config/linux-server/storage-ceph/)
-- [Networking](/config/linux-server/networking/)
 
 ### Media
 

+ 13 - 1
se/hpc/cuda.md

@@ -245,7 +245,19 @@ breadcrumbs:
 
 ### Nsight Compute
 
-- May be run from command line (`ncu`) or using the graphical application.
+- May be run from command line (`ncu`) or using the graphical application (`ncu-ui`).
 - Kernel replays: In order to run all profiling methods for a kernel execution, Nsight might have to run the kernel multiple times by storing the state before the first kernel execution and restoring it for every replay. It does not restore any host state, so in case of host-device communication during the execution, this is likely to put the application in an inconsistent state and cause it to crash or give incorrect results. To rerun the whole application (aka "application mode") instead of transparently replaying individual kernels (aka "kernel mode"), specify `--replay-mode=application` (or the equivalent option in the GUI).
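+
+An example invocation (the application and report names are placeholders):
+
+```sh
+# Profile using application replay instead of kernel replay and write a report file
+ncu --replay-mode=application -o my_profile ./my_app
+# Open the report in the GUI
+ncu-ui my_profile.ncu-rep
+```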
 
+## Hardware
+
+### NVLink & NVSwitch
+
+- Interconnect for connecting NVIDIA GPUs and NICs/HCAs as a mesh within a node, because PCIe was too limited.
+- NVLink alone is limited to only eight GPUs, but NVSwitches allow connecting more.
+- A bidirectional "link" consists of two unidirectional "sub-links", which each contain eight differential pairs (i.e. lanes). Each device may support multiple links.
+- NVLink transfer rate per differential pair:
+    - NVLink 1.0 (Pascal): 20Gb/s
+    - NVLink 2.0 (Volta): 25Gb/s
+    - NVLink 3.0 (Ampere): 50Gb/s
+
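+Quick ways to inspect NVLink from the command line (using `nvidia-smi`):
+
+```sh
+# Show the GPU/NIC topology matrix (which device pairs are connected via NVLink, PCIe etc.)
+nvidia-smi topo -m
+# Show per-link NVLink status for GPU 0
+nvidia-smi nvlink --status -i 0
+```
+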
 {% include footer.md %}