|
@@ -263,6 +263,24 @@ This setup requires pubkey plus MFA (if configured) plus password.
|
|
|
1. Prepare it for removal: `isdct start -intelssd <target> -standby`
|
|
|
1. Reconnect the drives or restart the system.
|
|
|
|
|
|
+## Grafana
|
|
|
+
|
|
|
+Typically used with a data source like [Prometheus](#prometheus).
|
|
|
+
|
|
|
+### Setup (Docker)
|
|
|
+
|
|
|
+1. See [(Grafana) Run Grafana Docker image](https://grafana.com/docs/grafana/latest/installation/docker/).
|
|
|
+1. Mount:
|
|
|
+ - Config: `./grafana.ini:/etc/grafana/grafana.ini:ro`
|
|
|
+ - Data: `./data:/var/lib/grafana/:rw` (requires UID 472)
|
|
|
+ - Logs: `./logs:/var/log/grafana/:rw` (requires UID 472)
|
|
|
+1. Configure `grafana.ini`.
|
|
|
+1. Open the webpage to configure it.
|
|
|
+
|
|
|
+### Notes
|
|
|
+
|
|
|
+- Be careful with public dashboards. "Viewers" can modify any query and thus query the entire data source for the dashboard, unless you have configured some type of access control for the data source (which you probably haven't).
|
|
|
+
|
|
|
## Home Assistant
|
|
|
|
|
|
See [Home Assistant](../home-assistant/).
|
|
@@ -569,6 +587,33 @@ File `smtp_header_checks`:
|
|
|
- Print the config: `postconf -n`
|
|
|
- If `mailq` tells you mails are stuck in the mail queue because of previous errors, run `postqueue -f` to flush them.
|
|
|
|
|
|
+## Prometheus
|
|
|
+
|
|
|
+Typically used with [Grafana](#grafana) and sometimes with Cortex/Thanos in-between.
|
|
|
+
|
|
|
+### Setup (Docker)
|
|
|
+
|
|
|
+1. See [(Prometheus) Installation](https://prometheus.io/docs/prometheus/latest/installation/).
|
|
|
+1. Set the retention period:
|
|
|
+ - Add the command-line argument `--storage.tsdb.retention.time=15d` (for 15 days).
|
|
|
+ - For the Docker image, this also means you have to re-specify all the default arguments (check with `docker inspect`).
|
|
|
+1. Mount:
|
|
|
+ - Config: `./prometheus.yml:/etc/prometheus/prometheus.yml:ro`
|
|
|
+ - Data: `./data/:/prometheus/:rw`
|
|
|
+1. Configure `prometheus.yml`.
|
|
|
+ - I.e. set global variables (like `scrape_interval`, `scrape_timeout` and `evaluation_interval`) and scrape configs.
|
|
|
+1. (Optional) Setup remote storage to replicate all scraped data to a remote backend.
|
|
|
+1. (Optional) Setup Cortex or Thanos for global view, HA and long-term storage.
|
|
|
+
|
|
|
+### Notes
|
|
|
+
|
|
|
+- The open port (9090 by default) contains both the dashboard and the query API.
|
|
|
+- You can check the status of scrape jobs in the dashboard.
|
|
|
+- Prometheus does not store data forever, it's meant for short- to mid-term storage.
|
|
|
+- Prometheus should be "physically" close to the apps it's monitoring. For large infrastructures, you should use multiple instances, not one huge global instance.
|
|
|
+- If you need a "global view" (when using multiple instances), long-term storage and (in some way) HA, consider using Cortex or Thanos.
|
|
|
+- Cardinality is the number of time series. Each unique combination of metrics and key-value label pairs (yes, including the label value) amounts to a new time series. Very high cardinality (i.e. over 100 000 series, number taken from a Splunk presentation from 2019) amounts to significantly reduced performance and increased memory and resource usage, which is also shared by HA peers (fate sharing). Therefore, avoid using valueless labels, add labels only to metrics they belong with, try to limit the numer of unique values of a label and consider splitting metrics to use less labels. Some useful queries to monitor cardinality: `sum(scrape_series_added) by (job)`, `sum(scrape_samples_scraped) by (job)`, `prometheus_tsdb_symbol_table_size_bytes`, `rate(prometheus_tsdb_head_series_created_total[5m])`, `sum(sum_over_time(scrape_series_added[5m])) by (job)`. You can also find some useful stats in the dashboard.
|
|
|
+
|
|
|
## Pterodactyl
|
|
|
|
|
|
### General
|