Merge branch 'master' of github.com:HON95/wiki

Håvard Ose Nordstrand 3 years ago
Parent
commit
f7d9e750ee

+ 78 - 7
config/general/linux-examples.md

@@ -8,6 +8,53 @@ breadcrumbs:
 
 ## Commands
 
+### General Monitoring
+
+- For more specific monitoring, see the other sections.
+- `htop`:
+    - ncurses-based process viewer like `top`, but prettier and more interactive.
+    - Install (APT): `apt install htop`
+    - Usage: `htop` (interactive)
+- `glances`:
+    - Homepage: [Glances](https://nicolargo.github.io/glances/)
+    - Install (PyPI for latest version): `pip3 install glances`
+    - ncurses-based viewer for e.g. basic system info, top-like process info, network traffic and disk traffic.
+    - Usage: `glances` (interactive)
+- `dstat`:
+    - A versatile replacement for vmstat, iostat and ifstat (according to itself).
+    - Prints scrolling output for showing a lot of types of general metrics, one line of columns for each time step.
+    - Usage: `dstat <options> [interval] [count]`
+        - Default interval is 1s, default count is unlimited.
+        - The values shown are the average since the last interval ended.
+        - For intervals over 1s, the last row will update itself each second until the delay has been reached and a new line is created. The values shown are averages since the last final value (when the last line was finalized), so e.g. a 10s interval gives a final line showing a 10s average.
+        - The first line is always a snapshot, i.e. all rate-based metrics are 0 or some absolute value.
+        - If any column options are provided, they will replace the default ones and are displayed in the order specified.
+    - Special options:
+        - `-C <>`: Comma-separated list of CPUs/cores to show for, including `total`.
+        - `-D <>`: Same but for disks.
+        - `-N <>`: Same but for NICs.
+        - `-f`: Show stats for all devices (not aggregated).
+    - Useful metrics:
+        - `-t`: Current time.
+        - `-p`: Process stats (by runnable, uninterruptible, new) (changes per second).
+        - `-y`: Total interrupt and context switching stats (by interrupts, context switches) (events per second).
+        - `-l`: Load average stats (1 min, 5 mins, 15 mins) (total system load multiplied by number of cores).
+        - `-c`: CPU stats (by system, user, idle, wait) (percentage of total).
+        - `--cpu-use`: Per-CPU usage (by CPU) (percentage).
+        - `-m`: Memory stats (by used, buffers, cache, free) (bytes).
+        - `-g`: Paging stats (by in, out) (count per second).
+        - `-s`: Swap stats (by used, free) (total).
+        - `-r`: Storage request stats (by read, write) (requests per second).
+        - `-d`: Storage throughput stats (by read, write) (bytes per second).
+        - `-n`: Network throughput stats (by recv, send) (bytes per second).
+        - `--socket`: Network socket stats (by total, tcp, udp, raw, ip-fragments)
+    - Useful plugins (metrics):
+        - `--net-packets`: Network request stats (by recv, send) (packets per second).
+    - Examples:
+        - General overview (CPU, RAM, ints/csws, disk, net): `dstat -tcmyrdn --net-packets 60`
+        - Network overview (CPU, ints/csws, net): `dstat -tcyn --net-packets 60`
+        - Process overview (CPU, RAM, ints/csws, paging, process, sockets): `dstat -tcmygp --socket 60`
+
 ### File Systems and Logical Volume Managers
 
 - Partition disk: `gdisk <dev>` or `fdisk <dev>`
@@ -99,6 +146,26 @@ breadcrumbs:
     - `nstat`
     - `netstat -s` (statistics)
 
+#### Tcpdump
+
+- Typical usage: `tcpdump -i <interface> -nn -v [filter]`
+- Options:
+    - `-w <>.pcap`: Write to capture file instead of formatted to STDOUT.
+    - `-i <if>`: Interface to listen on. Defaults to a random-ish interface.
+    - `-nn`: Don't resolve hostnames or ports.
+    - `-s<n>`: How much of the packets to capture. Use 0 for unlimited (full packet).
+    - `-v`/`-vv`: Details to show about packets. More V's for more details.
+    - `-l`: Line buffered mode, for better stability when piping to e.g. grep.
+- Filters:
+    - Can consist of complex logical statements using parentheses, `not`/`!`, `and`/`&&` and `or`/`||`. Make sure to quote the filter to avoid interference from the shell.
+    - Protocol: `ip`, `ip6`, `icmp`, `icmp6`, `tcp`, `udp`
+    - Ports: `port <n>`
+    - IP address: `host <addr>`, `dst <addr>`, `src <addr>`
+    - IPv6 router solicitations and advertisements: `icmp6 and (ip6[40] = 133 or ip6[40] = 134)` (133 for RS and 134 for RA)
+    - IPv6 neighbor solicitations and advertisements: `icmp6 and (ip6[40] = 135 or ip6[40] = 136)` (135 for NS and 136 for NA)
+    - DHCPv4: `ip and udp and (port 67 and port 68)`
+    - DHCPv6: `ip6 and udp and (port 547 and port 546)`
+
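As a rough sketch of how the tcpdump options and filters above fit together, the helper below only assembles and prints a command line instead of running it (a real capture needs root and an existing interface). The function name `tcpdump_cmd` and the interface name `eth0` are made up for illustration and are not part of the notes.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) a tcpdump command for an interface and filter.
# The filter is wrapped in single quotes so the shell leaves parentheses alone.
tcpdump_cmd() {
    iface=$1
    filter=$2
    printf "tcpdump -i %s -nn -v '%s'\n" "$iface" "$filter"
}

# DHCPv4 filter from the notes above; "eth0" is an assumed interface name.
tcpdump_cmd eth0 'ip and udp and (port 67 and port 68)'
# IPv6 router solicitations/advertisements (ICMPv6 types 133 and 134).
tcpdump_cmd eth0 'icmp6 and (ip6[40] = 133 or ip6[40] = 134)'
```

Printing the command first makes it easy to check the quoting before pasting it into a root shell.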
 ### Memory
 
 - NUMA stats:
@@ -113,13 +180,20 @@ breadcrumbs:
 
 ### Profiling
 
-- Command timer (`time`):
+- `time` (timing commands):
     - Provided both as a shell built-in `time` and as `/usr/bin/time`, use the latter.
-    - Syntax: `/usr/bin/time -vp <command>`
+    - Typical usage: `/usr/bin/time -p <command>`
     - Options:
         - `-p` for POSIX output (one line per time)
         - `-v` for interesting system info about the process.
     - It gives the wall time, time spent in user mode and time spent in kernel mode.
+- `strace` (trace system calls and signals):
+    - In standard mode, it runs the full command and traces/prints all syscalls (including arguments and return value).
+    - Syntax: `strace [options] <command>`
+    - Useful options:
+        - `-c`: Show summary/overview only. (Hints at which syscalls are worth looking more into.)
+        - `-f`: Trace forked child processes too.
+        - `-e trace=<syscalls>`: Only trace the specified comma-separated list of syscalls.
 
 ### Security
 
@@ -148,14 +222,11 @@ breadcrumbs:
     - `iostat [-c] [-t] [interval]`
 - Monitor processes:
     - `ps` (e.g. `ps aux` or `ps ax o uid,user:12,pid,comm`)
-- Monitor a mix of things:
-    - `htop`
-    - `glances`
-    - `ytop`
+- Monitor a mix of things: See the "general monitoring" section.
 - Monitor interrupts:
     - `irqtop`
     - `watch -n0.1 cat /proc/interrupts`
-- Stress test with stress-mg:
+- Stress test with stress-ng:
     - Install (Debian): `apt install stress-ng`
     - Stress CPU: `stress-ng -c $(nproc) -t 600`
 

+ 40 - 2
config/linux-server/applications.md

@@ -75,9 +75,31 @@ Sends an emails when APT updates are available.
 
 ## BIND
 
+### Info
+
 - Aka "named".
 
-**TODO**
+### Config
+
+- Should typically be installed directly on the system, but the Docker image is pretty good too.
+    - Docker image: [internetsystemsconsortium/bind9 (Docker Hub)](https://hub.docker.com/r/internetsystemsconsortium/bind9)
+- Docs and guides:
+    - [The BIND 9 Administrator Reference Manual (ARM)](https://bind9.readthedocs.io/)
+    - [DNSSEC Guide (BIND 9 docs)](https://bind9.readthedocs.io/en/latest/dnssec-guide.html)
+    - [Tutorial: How To Configure Bind as a Caching or Forwarding DNS Server on Ubuntu 16.04 (DigitalOcean)](https://www.digitalocean.com/community/tutorials/how-to-configure-bind-as-a-caching-or-forwarding-dns-server-on-ubuntu-16-04)
+    - [Tutorial: How To Setup DNSSEC on an Authoritative BIND DNS Server (DigitalOcean)](https://www.digitalocean.com/community/tutorials/how-to-setup-dnssec-on-an-authoritative-bind-dns-server-2)
+
+### Usage
+
+- Validate config: `named-checkconf`
+- Validate DNSSEC validation:
+    - `dig cloudflare.com @<server>` should give status `NOERROR` and contain the `ad` flag (for "authentic data", i.e. it passed DNSSEC validation).
+    - `dig www.dnssec-failed.org @<server>` should give status `SERVFAIL`.
+    - `dig www.dnssec-failed.org @<server> +cd` (for "checking disabled", useful for DNSSEC debugging) should give status `NOERROR` but no `ad` flag.
+- Validate DNSSEC signing:
+    - Resolve with dig and a validating server.
+    - [Verisign DNSSEC Debugger](https://dnssec-debugger.verisignlabs.com/)
+    - [DNSViz](https://dnsviz.net/)
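The DNSSEC validation checks above come down to reading two things from the `dig` output: the `status` in the header line and whether `ad` appears among the flags. A minimal sketch of that check follows, assuming the usual `dig` header layout; `dnssec_summary` is a hypothetical helper name, and the function only parses text from stdin, so it does not contact any server by itself.

```shell
#!/bin/sh
# Hypothetical helper: summarize dig output read from stdin as "status=<STATUS> ad=<yes|no>".
dnssec_summary() {
    awk '
        # Header line, e.g. ";; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 1234"
        /->>HEADER<<-/ {
            for (i = 1; i <= NF; i++) {
                if ($i == "status:") { status = $(i + 1); sub(/,$/, "", status) }
            }
        }
        # Flags line, e.g. ";; flags: qr rd ra ad; QUERY: 1, ANSWER: 2, ..."
        /^;; flags:/ {
            flags = $0
            sub(/^;; flags:/, "", flags)   # drop the label
            sub(/;.*/, "", flags)          # drop everything after the flag list
            n = split(flags, f, " ")
            for (i = 1; i <= n; i++) if (f[i] == "ad") ad = "yes"
        }
        END { printf "status=%s ad=%s\n", status, (ad == "yes" ? "yes" : "no") }
    '
}

# Usage against a real resolver (the server address is a placeholder):
#   dig cloudflare.com @<server> | dnssec_summary
```

With a validating resolver, the three queries listed above would be expected to give `status=NOERROR ad=yes`, `status=SERVFAIL ad=no` and `status=NOERROR ad=no` respectively.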
 
 ## bitwarden_rs
 
@@ -103,6 +125,22 @@ See [Storage: Ceph](/config/linux-server/storage/#ceph).
 - Dry-run renew: `certbot renew --dry-run [--staging]`
 - Revoke certificate: `certbot revoke --cert-path <cert>`
 
+## Chrony
+
+### Setup (Server)
+
+1. Install: `apt install chrony`
+1. Modify config (`/etc/chrony/chrony.conf`):
+    - (Optional) Add individual servers: `server <address> iburst`
+    - (Optional) Add pool of servers (a name resolving to multiple servers): `pool <address> iburst`
+    - (Optional) Allow clients: `allow {all|<network>}`
+1. Restart: `systemctl restart chrony`
+
+### Usage
+
+- Check tracking: `chronyc tracking`
+- Check sources: `chronyc sources`
+
 ## DDNS
 
 ### Cloudflare
@@ -331,7 +369,7 @@ Example `/etc/exports`:
 
 1. Disable systemd-timesyncd NTP client by disabling and stopping `systemd-timesyncd`.
 1. Install `ntp`.
-1. In `/etc/ntp.conf`, replace existing servers/pools with `ntp.justervesenet.no` with the `iburst` option.
+1. Configure servers/pool in `/etc/ntp.conf`, with the `iburst` option.
 1. Test with `ntpq -pn` (it may take a minute to synchronize).
 
 ## NUT

+ 30 - 10
config/linux-server/storage-zfs.md

@@ -82,17 +82,25 @@ The installation part is highly specific to Debian 10 (Buster). The backports re
 ### Pools
 
 - Recommended pool options:
-    - Set physical block/sector size: `ashift=<9|12>`
+    - Typical example: `-o ashift=<9|12> -O compression=zstd -O xattr=sa -O atime=off -O relatime=on`
+    - Specifying options during creation: For `zpool`/pools, use `-o` for pool options and `-O` for dataset options. For `zfs`/datasets, use `-o` for dataset options.
+    - Set physical block/sector size (pool option): `ashift=<9|12>`
         - Use 9 for 512 (2^9) and 12 for 4096 (2^12). Use 12 if unsure (bigger is safer).
-    - Enable compression: `compression=zstd`
+    - Enable compression (dataset option): `compression=zstd`
         - Use `lz4` for boot drives (`zstd` booting isn't currently supported) or if `zstd` isn't yet available in the version you're using.
-    - Store extended attributes in the inodes: `xattr=sa`
-        - `on` is default and stores them in a hidden file.
-    - Relax access times: `atime=off` and `relatime=on`
+    - Store extended attributes in the inodes (dataset option): `xattr=sa`
+        - The default is `on`, which stores them in a hidden file.
+    - Relax access times (dataset option): `atime=off` and `relatime=on`
     - Don't enable dedup.
 - Create pool:
     - Format: `zpool create [options] <name> <levels-and-drives>`
-    - Basic example: `zpool create -o ashift=<9|12> -O compression=zstd -O xattr=sa <name> [mirror|raidz|raidz2|...] <drives>`
+    - Basic example: `zpool create [-f] [options] <name> [mirror|raidz|raidz2|...] <drives>`
+        - Use `-f` (force) if the disks aren't clean.
+        - See example above for recommended options.
+    - The pool definition is hierarchical, where top-level elements are striped.
+        - RAID 0 (striped): `<drives>`
+        - RAID 1 (mirrored): `mirror <drives>`
+        - RAID 10 (stripe of mirrors): `mirror <drives> mirror <drives>`
     - Create encrypted pool: See encryption section.
     - Use absolute drive paths (`/dev/disk/by-id/` or similar).
 - View pool activity: `zpool iostat [-v] [interval]`
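To illustrate the hierarchical pool definition, here is a small sketch that builds a "stripe of mirrors" `zpool create` command from a list of drives. It only prints the command and never touches any disks. The helper name `zpool_raid10_cmd`, the pool name `tank` and the drive paths are placeholders, and only a subset of the recommended options is included.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) a zpool create command for a stripe of 2-way mirrors.
# Drives are paired in the order given, so an even number of drives is required.
zpool_raid10_cmd() {
    if [ $# -lt 3 ] || [ $(( ($# - 1) % 2 )) -ne 0 ]; then
        echo "usage: zpool_raid10_cmd <pool> <drive> <drive> [<drive> <drive> ...]" >&2
        return 1
    fi
    pool=$1
    shift
    cmd="zpool create -o ashift=12 -O compression=zstd -O xattr=sa $pool"
    # Each "mirror <a> <b>" group is a top-level element; top-level elements are striped.
    while [ $# -gt 0 ]; do
        cmd="$cmd mirror $1 $2"
        shift 2
    done
    printf '%s\n' "$cmd"
}

# Placeholder pool name and drive paths; real ones would come from /dev/disk/by-id/.
zpool_raid10_cmd tank /dev/disk/by-id/diskA /dev/disk/by-id/diskB \
    /dev/disk/by-id/diskC /dev/disk/by-id/diskD
```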
@@ -183,12 +191,20 @@ The installation part is highly specific to Debian 10 (Buster). The backports re
 - Info:
     - ZoL v0.8.0 and newer supports native encryption of pools and datasets. This encrypts all data except some metadata like pool/dataset structure, dataset names and file sizes.
     - Datasets can be scrubbed, resilvered, renamed and deleted without unlocking them first.
-    - Datasets will by default inherit encryption and the encryption key (the "encryption root") from the parent pool/dataset.
+    - Datasets will by default inherit encryption and the encryption key from the parent pool/dataset (or the nearest "encryption root").
     - The encryption suite can't be changed after creation, but the keyformat can.
+    - Snapshots and clones always inherit from the original dataset.
 - Show stuff:
+    - Encryption: `zfs get encryption` (`off` means unencrypted, otherwise it shows the alg.)
     - Encryption root: `zfs get encryptionroot`
-    - Key status: `zfs get keystatus`. `unavailable` means locked and `-` means not encrypted.
+    - Key format: `zfs get keyformat`
+    - Key location: `zfs get keylocation` (only shows for the encryption root and `none` for encrypted children)
+    - Key status: `zfs get keystatus` (`available` means unlocked, `unavailable` means locked and `-` means not encrypted or snapshot)
     - Mount status: `zfs get mountpoint` and `zfs get mounted`.
+- Locking and unlocking:
+    - Manually unlock: `zfs load-key <dataset>`
+    - Manually lock: `zfs unload-key <dataset>`
+    - Automatically unlock and mount everything: `zfs mount -la` (`-l` to load key, `-a` for all)
 - Create a password encrypted pool:
     - Create: `zpool create -O encryption=aes-128-gcm -O keyformat=passphrase ...`
 - Create a raw key encrypted pool:
@@ -204,13 +220,17 @@ The installation part is highly specific to Debian 10 (Buster). The backports re
     1. Note: The new dataset will become its own encryption root instead of inheriting from any parent dataset/pool.
 - Change encryption property:
     - The key must generally already be loaded.
-    - Change `keyformat`, `keylocation` or `pbkdf2iters`: `zfs change-key -o <property>=<value> <dataset>`
-    - Inherit key from parent: `zfs change-key -i <dataset>`
+    - The encryption properties `keyformat`, `keylocation` and `pbkdf2iters` are inherited from the encryptionroot instead, unlike normal properties.
+    - Show encryptionroot: `zfs get encryptionroot`
+    - Change encryption properties: `zfs change-key -o <property>=<value> <dataset>`
+    - Change key location for locked dataset: `zfs set keylocation=file://<file> <dataset>` (**TODO** difference between `zfs set keylocation= ...` and `zfs change-key -o keylocation= ...`?)
+    - Inherit key from parent (join parent encryption root): `zfs change-key -i <dataset>`
 - Send raw encrypted snapshot:
     - Example: `zfs send -Rw <dataset>@<snapshot> | <...> | zfs recv <dataset>`
     - As with normal sends, `-R` is useful for including snapshots and metadata.
     - Sending encrypted datasets requires using raw (`-w`).
     - Encrypted snapshots sent as raw may be sent incrementally.
+    - Make sure to check the encryption root, key format, key location etc. to make sure they're what they should be.
 
 ### Error Handling and Replacement
 

+ 31 - 2
config/network/fs-fsos-switches.md

@@ -9,17 +9,43 @@ breadcrumbs:
 ### Using
 {:.no_toc}
 
-- FS S3700-24T4F
+- FS S5860-20SQ (core switch)
+- FS S3700-24T4F (access switch)
 
-## Info
+## Basics
 
 - Default credentials: Username `admin` and password `admin`.
 - Default mgmt. IP address: `192.168.1.1/24`
 - By default, SSH, Telnet and HTTP servers are accessible using the default mgmt. address and credentials.
+- Serial config: RS-232 w/ RJ45, baud 115200, 8 data bits, no parity bits, 1 stop bit, no flow control.
 - The default VLAN is VLAN1.
 
 ## Initial Setup
 
+### Core Switch
+
+Using an FS S5860-20SQ.
+
+**TODO**
+
+Random notes (**TODO**):
+
+1. (Optional) Split 40G-interface (QSFP+) into 4x 10G (SFP+): `split interface <if>`
+1. Configure RSTP:
+    - Set protocol: `spanning-tree mode rstp` (default MSTP)
+    - Set priority: `spanning-tree priority <priority>` (default 32768, should be a multiple of 4096, use e.g. 32768 for access, 16384 for distro and 8192 for core)
+    - Set hello time: `spanning-tree hello-time <seconds>` (default 2s)
+    - Set maximum age: `spanning-tree max-age <seconds>` (default 20s)
+    - Set forward delay: `spanning-tree forward-time <seconds>` (default 15s)
+    - Enable: `spanning-tree`
+    - **TODO** Enabled on all interfaces and VLANs by default?
+    - **TODO** Portfast for access ports? `spanning-tree link-type ...`
+    - **TODO** Guards.
+
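Since the bridge priority has to be a multiple of 4096, a quick sanity check before typing it into the switch can be sketched as below. The upper bound of 61440 comes from the standard STP priority range rather than from these notes, and `valid_stp_priority` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical check: a bridge priority must be a multiple of 4096 in the range 0-61440.
valid_stp_priority() {
    prio=$1
    case $prio in
        ''|*[!0-9]*) return 1 ;;  # reject empty or non-numeric input
    esac
    [ "$prio" -le 61440 ] && [ $(( prio % 4096 )) -eq 0 ]
}

# Example values: the three suggested priorities plus one invalid value.
for p in 8192 16384 32768 30000; do
    if valid_stp_priority "$p"; then
        echo "$p: ok"
    else
        echo "$p: not a valid bridge priority"
    fi
done
```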
+### Access Switch
+
+Using an FS S3700-24T4F.
+
 1. Connect to the switch using serial.
     - Using RS-232 w/ RJ45, baud 115200, 8 data bits, no parity bits, 1 stop bit, no flow control.
     - Use `Ctrl+H` for backspace.
@@ -125,6 +151,9 @@ breadcrumbs:
 - Interfaces:
     - Show L2 brief: `show int brief`
     - Show L3 brief: `show ip int brief`
+- STP:
+    - Show details: `show spanning-tree`
+    - Show overview and interfaces: `show spanning-tree summary`
 - LACP:
     - Show semi-detailed overview: `show aggregator-group [n] brief`
     - Show member ports: `show aggregator-group [n] summary`

+ 1 - 0
config/network/juniper-junos-general.md

@@ -48,6 +48,7 @@ breadcrumbs:
     - Change context to container statement: `edit <path>`
     - Go up in context: `up` or `top`
     - Show configuration for current level: `show`
+- Perform operation on multiple interfaces or similar: `wildcard range set int ge-0/0/[0-47] unit 0 family ethernet-switching` (example)
 - Commit config changes: `commit [comment <comment>] [confirmed] [and-quit]`
     - `confirmed` automatically rolls back the commit if it is not confirmed within a time limit.
     - `and-quit` will quit configuration mode after a successful commit.

+ 25 - 11
config/network/juniper-junos-switches.md

@@ -23,6 +23,13 @@ breadcrumbs:
 
 - [Juniper EX3300 Fan Mod](/guides/network/juniper-ex3300-fanmod/)
 
+## Basics
+
+- Default credentials: Username `root` without a password (drops you into the shell instead of the CLI).
+- Default mgmt. IP address: Using DHCPv4.
+- Serial config: RS-232 w/ RJ45, baud 115200, 8 data bits, no parity bits, 1 stop bit, no flow control.
+- Native VLAN: 0, aka `default`
+
 ## Initial Setup
 
 1. Connect to the switch using serial:
@@ -30,7 +37,7 @@ breadcrumbs:
 1. Login:
     - Username `root` and no password.
     - Logging in as root will always start the shell. Run `cli` to enter the operational CLI.
-1. (Optional) Disable default virtual chassis ports (VCPs) if not used:
+1. (Optional) Free virtual chassis ports (VCPs) for normal use:
     1. Enter op mode.
     1. Show VCPs: `show virtual-chassis vc-port`
     1. Remove VCPs: `request virtual-chassis vc-port delete pic-slot <pic-slot> port <port-number>`
@@ -112,8 +119,16 @@ breadcrumbs:
     - **TODO**
 1. Enable EEE:
     - **TODO**
-1. Configure RSTP:
-    - RSTP is the default STP variant for Junos.
+1. (Optional) Configure RSTP:
+    - Note: RSTP is the default STP variant for Junos.
+    - Enter config section: `edit protocols rstp`
+    - Set priority: `set bridge-priority <priority>` (default 32768, should be a multiple of 4096, use e.g. 32768 for access, 16384 for distro and 8192 for core)
+    - Set hello time: `set hello-time <seconds>` (default 2s)
+    - Set maximum age: `set max-age <seconds>` (default 20s)
+    - Set forward delay: `set forward-delay <seconds>` (default 15s)
+    - **TODO** Portfast for access ports?
+    - **TODO** Guards.
+    - **TODO** Enabled on all interfaces and VLANs by default?
 1. Configure SNMP:
     - Note: SNMP is extremely slow on the Juniper switches I've tested it on.
     - Enable public RO access: `set snmp community public authorization read-only`
@@ -127,7 +142,13 @@ breadcrumbs:
 ### Interfaces
 
 - Disable interface or unit: `set disable`
-- Perform operation on multiple interfaces: `wildcard range set int ge-0/0/[0-47] unit 0 family ethernet-switching` (example)
+- Show transceiver info:
+    - `show interfaces diagnostics optics [if]`
+    - `show interfaces media [if]` (less info, only works if interface is up)
+
+### STP
+
+- Show interface status: `show spanning-tree interface`
 
 ## Virtual Chassis
 
@@ -181,11 +202,4 @@ breadcrumbs:
 
 Virtual Chassis Fabric (VCF) evolves VC into a spine-and-leaf architecture. While VC focuses on simplified management, VCF focuses on improved data center connectivity. Only certain switches (like the QFX5100) support this feature.
 
-## Miscellanea
-
-- Serial:
-    - RS-232 w/ RJ45 (Cisco-like).
-    - Baud 9600 (default).
-    - 8 data bits, no parity, 1 stop bits, no flow control.
-
 {% include footer.md %}

+ 2 - 1
config/pc/applications.md

@@ -123,7 +123,8 @@ Note: Since Steam requires 32-bit (i386) variants of certain NVIDIA packages, an
 
 ### Miscellanea
 
-- Windows home dir (typical save location): `~/.local/share/Steam/steamapps/compatdata/<some_id>/pfx/drive_c/users/steamuser/`
+- Proton Windows home dir: `~/.local/share/Steam/steamapps/compatdata/<some_id>/pfx/drive_c/users/steamuser/`
+- Proton Windows home dir (Flatpak): `~/.var/app/com.valvesoftware.Steam/.steamlib/steamapps/compatdata/374320/pfx/drive_c/users/steamuser/`
 
 ## tmux
 

+ 51 - 4
config/virt-cont/docker.md

@@ -22,11 +22,27 @@ Using **Debian**.
     - Set `"ipv6": true` to enable IPv6 support at all.
     - Set `"fixed-cidr-v6": "<prefix/64>"` to some [random](https://simpledns.plus/private-ipv6) (ULA) (if using NAT masq.) or routable (GUA or ULA) (if not using NAT masq.) /64 prefix, to be used by the default bridge.
     - Set `"ip6tables": true` to enable automatic filter and NAT rules through IP6Tables (required for both security and NAT).
-1. (Optional) Change IPv4 network pool:
-    - - In `/etc/docker/daemon.json`, set `"default-address-pools": [{"base": "10.0.0.0/16", "size": "24"}]`.
-1. (Optional) Change default DNS servers for containers:
+1. (Recommended) Change the cgroup manager to systemd:
+    - In `/etc/docker/daemon.json`, set `"exec-opts": ["native.cgroupdriver=systemd"]`.
+    - It defaults to Docker's own cgroup manager/driver called cgroupfs.
+    - systemd (as the init system for most modern Linux systems) also functions as a cgroup manager, and using multiple cgroup managers may cause the system to become unstable under resource pressure.
+    - If the system already has existing containers, they should be completely recreated after changing the cgroup manager.
+1. (Optional) Change the storage driver:
+    - By default it uses the `overlay2` driver, which is recommended for most setups. (`aufs` was the default before that.)
+    - The only other alternatives worth consideration are `btrfs` and `zfs`, if the system is configured for those file systems.
+1. (Recommended) Change IPv4 network pool:
+    - In `/etc/docker/daemon.json`, set `"default-address-pools": [{"base": "172.17.0.0/12", "size": 24}]`.
+    - For local networks (not Swarm overlays), it defaults to pool `172.17.0.0/12` with `/16` allocations, resulting in a maximum of `2^(16-12)=16` allocations.
+1. (Recommended) Change default DNS servers for containers:
     - In `/etc/docker/daemon.json`, set `"dns": ["1.1.1.1", "2606:4700:4700::1111"]` (example using Cloudflare) (3 servers max).
     - It defaults to `8.8.8.8` and `8.8.4.4` (Google).
+1. (Optional) Change the logging options (JSON file driver):
+    - It defaults to the JSON file driver with a single file of unlimited size.
+    - Configured globally in `/etc/docker/daemon.json`.
+    - Set the driver (explicitly): `"log-driver": "json-file"`
+    - Set the max file size: `"log-opts": { "max-size": "10m" }`
+    - Set the max number of files (for log rotation): `"log-opts": { "max-file": "5" }`
+    - Set the compression for rotated files: `"log-opts": { "compress": "true" }`
 1. (Optional) Enable Prometheus metrics endpoint:
     - This only exports internal Docker metrics, not anything about the containers (use cAdvisor for that).
     - In `/etc/docker/daemon.json`, set `"experimental": true` and `"metrics-addr": "[::]:9323"`.
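The address pool arithmetic and the way the individual `daemon.json` settings combine can be sketched as follows. This is an illustration under assumptions, not a recommended config: the values are copied from the notes above, the helper names `pool_networks` and `write_sample_daemon_json` are made up, and the file is written to a path of your choosing rather than to `/etc/docker`. Note that the separate `log-opts` settings have to be merged into a single object.

```shell
#!/bin/sh
# Number of networks a pool can hand out: 2^(size - base prefix length).
pool_networks() {
    base_prefix=$1
    size=$2
    echo $(( 1 << (size - base_prefix) ))
}

pool_networks 12 16   # /12 pool carved into /16 networks, prints 16
pool_networks 12 24   # /12 pool carved into /24 networks, prints 4096

# Hypothetical helper: write a sample daemon.json to the given path (not to /etc/docker).
# Values are copied from the notes above; adjust them before using this for real.
write_sample_daemon_json() {
    cat > "$1" <<'EOF'
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "default-address-pools": [{"base": "172.17.0.0/12", "size": 24}],
  "dns": ["1.1.1.1", "2606:4700:4700::1111"],
  "log-driver": "json-file",
  "log-opts": {"max-size": "10m", "max-file": "5"}
}
EOF
}
```

Usage would be something like `write_sample_daemon_json /tmp/daemon.json`, followed by reviewing the file and merging it into the real config by hand.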
@@ -88,7 +104,7 @@ Using **Debian**.
 
 #### Fix Docker Compose No-Exec Tmp-Dir
 
-Docker Compose will fail to work if `/tmp` has `noexec`.
+Docker Compose will fail to work if `/tmp` is mounted with `noexec`.
 
 1. Move `/usr/local/bin/docker-compose` to `/usr/local/bin/docker-compose-normal`.
 1. Create `/usr/local/bin/docker-compose` with the contents below and make it executable.
@@ -111,6 +127,37 @@ The toolkit is used for running CUDA applications within containers.
 
 See the [installation guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker).
 
+## Best Practices
+
+- Building:
+    - Use simple base images without stuff you don't need (especially for the final image if using multi-stage builds). `alpine` is nice, but uses musl libc instead of glibc, which may cause problems for certain apps.
+    - Use official base images you can trust.
+    - Completely build inside the container to avoid relying on external tools and libraries (for better reproducibility and portability).
+    - Use multi-stage builds to separate the heavier build environment/image containing all the build tools and many layers from the final image with the built app copied into it from the previous stage.
+    - To exploit cacheability when building an image multiple times (e.g. during development), put everything that doesn't change (e.g. installing packages) at the top of the Dockerfile and stuff that changes frequently (e.g. copying source files and compilation) as close to the bottom as possible.
+    - Use `COPY` instead of `ADD`, unless you actually need some of the fancy and sometimes unexpected features of `ADD`.
+    - Use `ARG`s and `ENV`s (with defaults) for vars you may want to change before building.
+    - `EXPOSE` is pointless and purely informational.
+    - Use `ENTRYPOINT` (in array form) to specify the entrypoint script or application and `CMD` (in array form) to specify default additional arguments to the `ENTRYPOINT`.
+    - Create a `.dockerignore` file, similar to `.gitignore` files, to avoid copying useless or sensitive files into the container.
+- Signal handling:
+    - Make sure your application is handling signals correctly (e.g. such that it stops properly). The initial process in the container runs with PID 1, which is typically reserved for the init process and is handled specially by certain things.
+    - If your application does not handle signals properly internally, build the image with [tini](https://github.com/krallin/tini) as the entrypoint or run the container with `--init` to make Docker inject tini as the entrypoint.
+- Don't run as root:
+    - Either set a static user in the Dockerfile, change to a specific user (static or dynamic) in the entrypoint script or app itself, or specify a user through Docker run (or equivalent). The latter approach (specified in Docker run) is assumed hereafter.
+    - The app may still be built by root and may be owned by root since the user running it generally shouldn't need to modify the app itself.
+    - If the app needs to modify files, put them in `/tmp`. Maybe make it easy to override the paths for more flexibility wrt. volumes and bind mounts.
+- Credentials and sensitive files:
+    - Don't hard code them anywhere.
+    - Don't ever put them on the image file system during building as it may get caught by one of the image layers.
+    - Specify them as mounted files (with proper permissions), env vars (slightly controversial), Docker secrets or similar.
+- Implement health checks.
+- Docker Compose:
+    - Drop the `version` property (it's deprecated).
+    - Use YAML aliases and anchors to avoid repeating yourself too much. To create an anchor, add `&<anchor>` behind a property (e.g. a service definition). To copy all content from below the property the anchor references, specify `<<: *<anchor>` inside the new property (i.e. one layer lower than the anchor on the other property). Copied properties can be overridden by explicitly specifying them.
+    - Consider implementing health checks within the DC file if the image does not already implement them (Google it).
+    - Consider putting envvars in a separate env file (specified using `--env-file` on the CLI or `env_file: []` in the DC file).
+
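The YAML anchor and alias description in the Docker Compose notes is easier to follow with a concrete file. The sketch below writes a small example Compose file to a temporary path; the helper name `write_sample_compose`, the service names, the image and the settings are all invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper: write a sample Compose file that uses a YAML anchor and merge key.
write_sample_compose() {
    cat > "$1" <<'EOF'
services:
  app: &app-defaults            # anchor placed on the service definition
    image: alpine
    restart: unless-stopped
    environment:
      LOG_LEVEL: info
  worker:
    <<: *app-defaults           # copy everything below the anchored property
    restart: "no"               # explicitly specified properties override copied ones
EOF
}

# Write the example somewhere harmless and show it.
example="${TMPDIR:-/tmp}/docker-compose.example.yml"
write_sample_compose "$example"
cat "$example"
```

Here `worker` ends up with the same `image` and `environment` as `app`, but with its own `restart` policy.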
 ## Miscellanea
 
 ### IPv6 Support

+ 64 - 0
config/virt-cont/k8s.md

@@ -0,0 +1,64 @@
+---
+title: Kubernetes
+breadcrumbs:
+- title: Configuration
+- title: Virtualization & Containerization
+---
+{% include header.md %}
+
+Using **Debian**.
+
+## Setup
+
+1. **TODO**
+1. (Optional) Setup command completion:
+    - BASH (per-user): `echo 'source <(kubectl completion bash)' >>~/.bashrc`
+    - ZSH (per-user): `echo 'source <(kubectl completion zsh)' >>~/.zshrc`
+    - More info:
+        - [bash auto-completion (k8s docs)](https://kubernetes.io/docs/tasks/tools/included/optional-kubectl-configs-bash-linux/)
+        - [zsh auto-completion (k8s docs)](https://kubernetes.io/docs/tasks/tools/included/optional-kubectl-configs-zsh/)
+
+## Usage
+
+- Config:
+    - Show: `kubectl config view`
+- Cluster:
+    - Show: `kubectl cluster-info`
+- Nodes:
+    - Show: `kubectl get nodes`
+- Services:
+    - Show: `kubectl get services`
+- Pods:
+    - Show: `kubectl get pods [-A] [-o wide]`
+        - `-A` for all namespaces instead of just the current/default one.
+        - `-o wide` for more info.
+    - Show logs: `kubectl logs <pod> [container]`
+- Manifests:
+    - Show cluster state diff if a manifest were to be applied: `kubectl diff -f <manifest-file>`
+- Events:
+    - Show: `kubectl get events`
+
+## Minikube
+
+Minikube is local Kubernetes, focusing on making it easy to learn and develop for Kubernetes.
+
+### Setup
+
+1. See: [minikube start (minikube docs)](https://minikube.sigs.k8s.io/docs/start/)
+1. Add `kubectl` symlink: `sudo ln -s $(which minikube) /usr/local/bin/kubectl`
+1. Add command completion: See normal k8s setup instructions.
+
+### Usage
+
+- Generally all of the normal k8s stuff applies.
+- Generally sudo isn't required.
+- Manage minikube cluster:
+    - Start: `minikube start`
+    - Pause (**TODO** what?): `minikube pause`
+    - Stop: `minikube stop`
+    - Delete (all clusters): `minikube delete --all`
+- Set memory limit (requires restart): `minikube config set memory <megabytes>`
+- Start and open web dashboard: `minikube dashboard`
+- Show addons: `minikube addons list`
+
+{% include footer.md %}

+ 1 - 1
config/virt-cont/libvirt-kvm.md

@@ -21,7 +21,7 @@ Using **Debian**.
 
 1. Install without extra stuff (like GUIs): `apt-get install --no-install-recommends iptables bridge-utils qemu-system qemu-utils libvirt-clients libvirt-daemon-system virtinst libosinfo-bin`
 1. (Optional) Install `dnsmasq-base` for accessing guests using their hostnames.
-1. (Optional) Add users to the `libvirt` group to allow them to manage libvirt without sudo.
+1. (Optional) Add users to the `libvirt` group to allow them to manage libvirt without sudo. Otherwise, remember to always use sudo so that the correct context/system URI is used.
 1. Set up the default network:
     1. It's already created, using NAT, DNS and DHCP.
     1. If not using dnsmasq, disable DNS and DHCP:

+ 92 - 15
config/virt-cont/proxmox-ve.md

@@ -15,7 +15,7 @@ Using **Proxmox VE 6**.
 1. Find a mouse.
     - Just a keyboard is not enough.
     - You don't need the mouse too often, though, so you can hot-swap between the keyboard and mouse during the install.
-1. Download PVE and boot from the installation medium in UEFI mode (if supported).
+1. Download PVE and boot from the installation medium (in UEFI mode if supported, otherwise BIOS is fine).
 1. Storage:
     - Use 1-2 mirrored SSDs with ZFS.
     - (ZFS) enable compression and checksums and set the correct ashift for the SSD(s). If in doubt, use ashift=12.
@@ -56,8 +56,11 @@ Follow the instructions for [Debian](/config/linux-server/debian/), but with the
     1. Enable TCP flags filter to block illegal TCP flag combinations.
     1. Make sure ping, SSH and the web GUI is working both for IPv4 and IPv6.
 1. Set up storage:
-    1. Create a ZFS pool or something.
-    1. Add it to `/etc/pve/storage.cfg`: See [Proxmox VE: Storage](https://pve.proxmox.com/wiki/Storage)
+    1. Docs: [Storage (Proxmox VE)](https://pve.proxmox.com/wiki/Storage)
+    1. Create a ZFS pool or something and add it to `/etc/pve/storage.cfg`.
+    1. Set up backup pruning:
+        - [Backup and Restore (Proxmox VE)](https://pve.proxmox.com/wiki/Backup_and_Restore)
+        - [Prune Simulator (Proxmox BS)](https://pbs.proxmox.com/docs/prune-simulator/)
 
 ### Configure PCI(e) Passthrough
 
@@ -158,7 +161,9 @@ If you lost quorum because if connection problems and need to modify something (
 
 - List: `qm list`
 
-### Initial Setup
+### General Setup
+
+The "Cloud-Init" notes can be ignored if you're not using Cloud-Init. See the separate section below first if you are.
 
 - Generally:
     - Use VirtIO if the guest OS supports it, since it provides a paravirtualized interface instead of an emulated physical interface.
@@ -166,26 +171,33 @@ If you lost quorum because if connection problems and need to modify something (
     - Use start/shutdown order if some VMs depend on other VMs (like virtualized routers).
       0 is first, unspecified is last. Shutdown follows reverse order.
       For equal order, the VMID is used in ascending order.
-- OS tab: No notes.
+- OS tab:
+    - If installing from an ISO, specify it here.
+    - (Cloud-Init) Don't use any media (no ISO).
 - System tab:
-    - Graphics card: Use the default. **TODO** SPICE graphics card?
+    - Graphics card: Use the default. If you want SPICE, you can change to that later.
     - Qemu Agent: It provides more information about the guest and allows PVE to perform some actions more intelligently,
       but requires the guest to run the agent.
-    - BIOS: SeaBIOS (generally). Use OVMF (UEFI) if you need PCIe pass-through.
-    - Machine: Intel 440FX (generally). Use Q35 if you need PCIe pass-through.
+    - BIOS/UEFI: BIOS w/ SeaBIOS is generally fine, but I prefer UEFI w/ OVMF (for PCIe pass-through support and stuff), assuming your OS/setup doesn't require one or the other.
+        - (Cloud-Init) Prepared Cloud-Init images typically use UEFI (and contain an EFI partition), so you probably need to use UEFI.
+        - About the EFI disk: Using UEFI in PVE typically requires an "EFI disk" (in the hardware tab). This is not the EFI system partition (ESP) and is not visible to the VM, but is used by PVE/OVMF to store the EFIVARS, which contain the boot order. (If a UEFI VM fails to boot, you may need to enter the UEFI/OVMF menu through the remote console to fix the boot entries.)
+    - Machine: Intel 440FX is generally fine, but I prefer Q35 (for PCIe pass-through support and stuff).
     - SCSI controller: VirtIO SCSI.
 - Hard disk tab:
-    - Bus/device: Use SCSI with the VirtIO SCSI controller selected in the system tab (it supersedes the VirtIO Block controller).
+    - (Cloud-Init) This doesn't matter, you're going to replace it afterwards with the imported Cloud-Init-ready qcow2 image. Just add something temporary since it can't be skipped.
+    - Bus/device: Use the SCSI bus with the VirtIO SCSI controller selected in the system tab (it supersedes the VirtIO Block controller).
     - Cache:
-        - Use write-back for max performance with slightly reduced safety.
+        - Use write-back for max performance with reduced safety in case of power loss (recommended).
         - Use none for balanced performance and safety with better *write* performance.
         - Use write-through for balanced performance and safety with better *read* performance.
         - Direct-sync and write-through can be fast for SAN/HW-RAID, but slow if using qcow2.
     - Discard: When using thin-provisioning storage for the disk and a TRIM-enabled guest OS,
       this option will relay guest TRIM commands to the storage so it may shrink the disk image.
       The guest OS may require SSD emulation to be enabled.
+    - SSD emulation: This just presents the drive as an SSD instead of as an HDD. It's typically not needed.
     - IO thread: If the VirtIO SCSI single controller is used (which uses one controller per disk),
       this will create one I/O thread for each controller for maximum performance.
+      This is generally not needed if not doing IO-heavy stuff with multiple disks in the VM.
 - CPU tab:
     - CPU type: Generally, use "kvm64".
       For HA, use "kvm64" or similar (since the new host must support the same CPU flags).
@@ -201,18 +213,73 @@ If you lost quorum because if connection problems and need to modify something (
       For Windows, it must be added manually and may incur a slowdown of the guest.
 - Network tab:
     - Model: Use VirtIO.
-    - Firewall: Enable if the guest does not provide one itself.
+    - Firewall: Enable if the guest does not provide one itself, or if you don't want it to immediately become accessible from the network during/after installation (i.e. before you've provisioned it properly).
     - Multiqueue: When using VirtIO, it can be set to the total CPU cores of the VM for increased performance.
       It will increase the CPU load, so only use it for VMs that need to handle a large number of connections.
+- Start the VM:
+    - (Cloud-Init) Don't start it yet, go back to the Cloud-Init section.
+    - Open a graphical console to show what's going on.
+    - See the separate sections below for more specific stuff.
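
For reference, the GUI choices above map roughly onto a single `qm create` call. The following is a minimal dry-run sketch that only prints the command rather than running it. The VMID, name, storage, bridge, sizes and core/memory counts are all placeholder assumptions, and the option set reflects the preferences in the notes above (UEFI, Q35, VirtIO SCSI, write-back cache, discard, firewall), not a required configuration.

```shell
#!/bin/sh
# Dry-run sketch: prints a `qm create` command reflecting the notes above
# instead of running it. All values below are placeholders (assumptions).
VMID=100
NAME=example-vm
STORAGE=local-zfs
BRIDGE=vmbr0

build_create_cmd() {
    # Each argument is printed followed by a space, giving one long line.
    printf '%s ' \
        "qm create $VMID" \
        "--name $NAME" \
        "--ostype l26" \
        "--bios ovmf --machine q35" \
        "--efidisk0 $STORAGE:1" \
        "--agent 1" \
        "--scsihw virtio-scsi-pci" \
        "--scsi0 $STORAGE:10,cache=writeback,discard=on" \
        "--cpu kvm64 --cores 2 --memory 2048" \
        "--net0 virtio,bridge=$BRIDGE,firewall=1"
    echo
}

build_create_cmd
```

Review the printed command before running it on a PVE host; for a Cloud-Init VM the `--scsi0` disk is only temporary, as noted above.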
+
+### Linux Setup (Manual)
+
+1. Set up the VM (see the general setup section).
+1. (Recommended) Set up the QEMU guest agent: See the section about it.
+1. (Optional) Set up SPICE (for better graphics): See the section about it.
+1. More detailed Debian setup: [Debian](/config/linux-server/debian/)
+
+### Linux Setup (Cloud-Init)
+
+*Using Debian 10.*
+
+1. Download a cloud-init-ready Linux image to the hypervisor:
+    - Debian: [Debian Official Cloud Images](https://cloud.debian.org/images/cloud/) (the `genericcloud` variant and `qcow2` format)
+    - Copy the download link and download it to the host (`wget <url>`).
+1. Note: It is a UEFI installation (so the BIOS/UEFI mode must be set accordingly) and the image contains an EFI partition (so you don't need a separate EFI disk).
+1. Set up a VM as in the general setup section (take note of the specified Cloud-Init notes).
+    1. Set the VM up as UEFI with an "EFI disk" added.
+    1. Add a serial interface since the GUI console may be broken (it is for me).
+1. Set up the prepared disk:
+    1. (GUI) Completely remove the disk from the VM ("detach" then "remove").
+    1. Import the downloaded cloud-init-ready image as the system disk: `qm importdisk <vmid> <image-file> <storage>`
+    1. (GUI) Find the unused disk for the VM, edit it (see the general notes), and add it.
+    1. (GUI) Resize the disk to the desired size. Note that it can be expanded further at a later time, but not shrunk. 10GB is typically fine.
+    1. (GUI) Make sure the disk (e.g. `scsi0`) is added in "boot order" in the options tab. Others may be removed.
+1. Set up the initial Cloud-Init disk:
+    1. (GUI) Add a "CloudInit drive".
+    1. (GUI) In the Cloud-Init tab, set a temporary user and password and set the IP config to DHCPv4 and DHCPv6/SLAAC, such that you can boot the template and install stuff. (You can wipe these settings later to prepare it for templating.)
+1. Start the VM and open its console.
+    1. The NoVNC console is broken for me for these VMs for some reason, so use the serial interface you added instead if NoVNC isn't working (`qm terminal <vmid>`).
+1. Fix boot order:
+    1. It may fail to boot into Linux and instead drop you into a UEFI shell (`Shell>`). Skip this if it actually boots.
+    1. Run `reset` and prepare to press/spam `Esc` when it resets so that it drops you into the UEFI menu.
+    1. Enter "Boot Maintenance Manager" and "Boot Options", then delete all options except the harddisk one (no PXE or DVD-ROM). Commit.
+    1. Press "continue" so that it attempts to boot using the new boot order. It should boot into Linux.
+    1. (Optional) Try logging in (using Cloud-Init credentials), power it off (so the QEMU VM completely stops), and power it on again to check that the boot order is still working.
+1. Log in and configure basic stuff:
+    1. Log in using the Cloud-Init credentials. The hostname should automatically have been set to the VM name, as an indication that the initial Cloud-Init setup succeeded.
+    1. Set up the basics, such as installing `qemu-guest-agent`.
+1. Wipe temporary Cloud-Init setup:
+    1. (VM) Run `cloud-init clean`, so that it reruns the initial setup on the next boot.
+    1. (GUI) Remove all settings in the Cloud-Init tab (or set appropriate defaults).
+1. (Optional) Create a template of the VM:
+    - Rename it as e.g. `<something>-template` and treat it as a template, but don't bother converting it to an actual template (which prevents you from changing it later).
+    - If you made it a template, then clone it and use the clone for the steps below.
+1. Prepare the new VM:
+    - Manually: Set up Cloud-Init in the Cloud-Init tab, start the VM, log in using the Cloud-Init credentials and configure it.
+    - Ansible: See the `proxmox` and `proxmox_kvm` modules.
+    - Consider purging the cloud-init package to avoid accidental reconfiguration later.
+    - Consider running `cloud-init status --wait` before configuring it to make sure the Cloud-Init setup has completed.
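
The disk preparation steps above can also be done from the host shell. The following is a minimal dry-run sketch that prints the commands instead of running them. The VMID, image file name, storage name and imported volume name are placeholder assumptions; in particular, the real volume name is reported by `qm importdisk` and depends on the storage type.

```shell
#!/bin/sh
# Dry-run sketch: prints the qm commands instead of running them.
# VMID, IMAGE, STORAGE and VOLUME are placeholder values (assumptions).
VMID=9000
IMAGE=debian-10-genericcloud-amd64.qcow2
STORAGE=local-zfs
# Assumed name of the imported volume; use the name printed by importdisk.
VOLUME="$STORAGE:vm-$VMID-disk-1"

plan() {
    # Import the downloaded image as an unused disk for the VM.
    echo "qm importdisk $VMID $IMAGE $STORAGE"
    # Attach the imported volume as the system disk.
    echo "qm set $VMID --scsi0 $VOLUME"
    # Grow the disk to 10G (it can be expanded later, but not shrunk).
    echo "qm resize $VMID scsi0 10G"
    # Open the serial console if NoVNC isn't working.
    echo "qm terminal $VMID"
}

plan
```

The boot order and the Cloud-Init drive are left to the GUI steps above.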
 
 ### Windows Setup
 
-*For Windows 10.*
+*Using Windows 10.*
 
 [Proxmox VE Wiki: Windows 10 guest best practices](https://pve.proxmox.com/wiki/Windows_10_guest_best_practices)
 
 #### Before Installation
 
+1. Set up the VM (see the general setup section).
 1. Add the VirtIO drivers ISO: [Fedora Docs: Creating Windows virtual machines using virtIO drivers](https://docs.fedoraproject.org/en-US/quick-docs/creating-windows-virtual-machines-using-virtio-drivers/index.html#virtio-win-direct-downloads)
 1. Add it as a CDROM using IDE device 3.
 
@@ -316,18 +383,20 @@ Check the host system logs. It may for instance be due to hardware changes or st
 - UDP 111: rpcbind (optional).
 - UDP 5404-5405: Corosync (internal).
 
-## Ceph
+## Storage
+
+### Ceph
 
 See [Storage: Ceph](/config/linux-server/storage/#ceph) for general notes.
 The notes below are PVE-specific.
 
-### Notes
+#### Notes
 
 - It's recommended to use a high-bandwidth SAN/management network within the cluster for Ceph traffic.
   It may be the same as used for out-of-band PVE cluster management traffic.
 - When used with PVE, the configuration is stored in the cluster-synchronized PVE config dir.
 
-### Setup
+#### Setup
 
 1. Set up a shared network.
     - It should be high-bandwidth and isolated.
@@ -347,4 +416,12 @@ The notes below are PVE-specific.
     - Use at least size 3 and min. size 2 in production.
     - "Add storage" adds the pool to PVE for disk image and container content.
 
+### Troubleshooting
+
+**"Cannot remove image, a guest with VMID '100' exists!" when trying to remove unused VM disk**:
+
+- Make sure it's not mounted to the VM.
+- Make sure it's not listed as an "unused disk" for the VM.
+- Run `qm rescan --vmid <vmid>` and check the steps above.
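
To check the first two points from the shell, one option is to filter the VM config for disk entries. This is a minimal sketch: `sample_config` is a hypothetical stand-in for the output of `qm config <vmid>`, and the volume names in it are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample of `qm config <vmid>` output (made-up values).
sample_config() {
    cat <<'EOF'
bootdisk: scsi0
cores: 2
scsi0: local-zfs:vm-100-disk-0,size=10G
unused0: local-zfs:vm-100-disk-1
EOF
}

# Keep only lines that reference a disk, attached or unused.
list_disk_refs() {
    grep -E '^(ide|sata|scsi|virtio|efidisk|unused)[0-9]+:'
}

sample_config | list_disk_refs
```

On a real host, the equivalent would be `qm config <vmid> | list_disk_refs`; any line that still names the volume means the VM still references it.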
+
 {% include footer.md %}

+ 1 - 0
index.md

@@ -108,6 +108,7 @@ Random collection of config notes and miscellaneous stuff. _Technically not a wi
 ### Virtualization & Containerization
 
 - [Docker](/config/virt-cont/docker/)
+- [Kubernetes](/config/virt-cont/k8s/)
 - [libvirt & KVM](/config/virt-cont/libvirt-kvm/)
 - [Proxmox VE](/config/virt-cont/proxmox-ve/)
 

+ 4 - 0
it/services/dns.md

@@ -6,6 +6,10 @@ breadcrumbs:
 ---
 {% include header.md %}
 
+## Resources
+
+- [[RFC 1912] Common DNS Operational and Configuration Errors](https://datatracker.ietf.org/doc/html/rfc1912)
+
 ## Basics
 
 Everyone knows this, no point reiterating.