# Metrics Stack

Self-contained monitoring stack using VictoriaMetrics, vmagent, Grafana, and Uptime Kuma.
Deploy one instance per client site. Access remotely over VPN.

## Stack Components

| Service | Purpose | Default Port |
|---|---|---|
| VictoriaMetrics | Time-series metric storage | 8428 |
| vmagent | Prometheus-compatible scrape agent | 8429 |
| Grafana | Dashboards and visualization | 3000 |
| Uptime Kuma | Availability monitoring + alerting | 3001 |
| node_exporter | Host metrics (this machine) | internal only |
| snmp_exporter | SNMP metrics for network devices | 9116 (optional) |

---

## Initial Setup

### 1. Configure environment

```bash
cp .env.example .env
```

Edit `.env`:

- Set `BIND_HOST` to this machine's LAN IP
- Set `CLIENT_NAME` to identify the client
- Set a strong password for `GF_ADMIN_PASSWORD`
- Set `TZ` to the correct timezone

### 2. Configure endpoints

Edit `vmagent/config/scrape.yml`:

- Update the `linux-host` job with this machine's hostname and site name
- Add any other endpoints (see "Adding Endpoints" below)

### 3. Start the stack

```bash
podman-compose up -d
```

### 4. Finish Uptime Kuma setup

1. Browse to `http://BIND_HOST:3001` and complete the initial setup wizard
2. Note the username and password you set
3. In `vmagent/config/scrape.yml`, uncomment the `uptime_kuma` job and fill in those credentials
4. Run `podman-compose restart vmagent`

---

## Adding Endpoints

Open `vmagent/config/scrape.yml`. The file has two sections:

- **ACTIVE JOBS**: jobs that are currently running
- **TEMPLATES**: commented-out job blocks, one per endpoint type

To add a new endpoint:

1. Find the matching template at the bottom of `scrape.yml`
2. Copy the entire commented block (from `# - job_name:` to the end of the block)
3. Paste it into the **ACTIVE JOBS** section
4. Uncomment it (remove the leading `# ` from each line)
5. Fill in the IP addresses, hostnames, and site label
6. Restart vmagent:

   ```bash
   podman-compose restart vmagent
   ```

### Available templates

| Template | Exporter needed on target | Port |
|---|---|---|
| Windows Domain Controller | windows_exporter | 9182 |
| Hyper-V Host | windows_exporter (with hyperv collector) | 9182 |
| Windows General Purpose Server | windows_exporter | 9182 |
| Linux Server | node_exporter | 9100 |
| SNMP Device | snmp_exporter (runs in this stack) | n/a |

### Installing windows_exporter

Download the latest `.msi` from:
https://github.com/prometheus-community/windows_exporter/releases

For Hyper-V hosts, ensure the `hyperv` collector is enabled. You can set this in the MSI installer
or by modifying the service arguments post-install:

```
--collectors.enabled [defaults],hyperv,cpu_info,physical_disk,process
```

### Enabling SNMP monitoring

1. Uncomment the `snmp-exporter` service in `podman-compose.yml`
2. Download a pre-built `snmp.yml` from:
   https://github.com/prometheus/snmp_exporter/releases
3. Place it at `snmp_exporter/snmp.yml`
4. Uncomment and configure the `snmp-devices` job template in `scrape.yml`
5. Restart the stack: `podman-compose up -d`

---

## Useful Commands

```bash
# Start the stack
podman-compose up -d

# Stop the stack
podman-compose down

# Restart a single service (e.g., after editing scrape.yml)
podman-compose restart vmagent

# View logs for a service
podman-compose logs -f vmagent
podman-compose logs -f victoriametrics

# Check running containers
podman-compose ps

# Pull latest images and restart
podman-compose pull && podman-compose up -d
```

## Verify vmagent is scraping

Browse to `http://BIND_HOST:8429/targets` to see all configured scrape targets and their
current status (up/down, last scrape time, errors).

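
The same check can be scripted when no browser is at hand. The sketch below is not part of the stack: `count_down_targets` is a hypothetical helper, and it assumes vmagent also serves a Prometheus-style JSON target list at `/api/v1/targets` in which each target carries a `"health"` field. It counts the targets reported as down.

```bash
# Hypothetical helper: count targets marked "down" in a Prometheus-style
# targets JSON document read from stdin. Prints the count as a bare number.
count_down_targets() {
  # grep -o prints one line per match, so wc -l yields the number of matches;
  # tr strips the padding some wc implementations add.
  grep -Eo '"health"[[:space:]]*:[[:space:]]*"down"' | wc -l | tr -d '[:space:]'
}

# Example call (assumes BIND_HOST is set as in .env and the endpoint exists):
#   curl -fsS "http://${BIND_HOST}:8429/api/v1/targets" | count_down_targets
```

A result of `0` suggests every configured target was reachable at its last scrape; anything else is a cue to open the `/targets` page and read the error column.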

---

## Directory Structure

```
metrics/
├── .env                          # Active config (do not commit)
├── .env.example                  # Config template
├── podman-compose.yml            # Stack definition
├── vmagent/
│   └── config/
│       └── scrape.yml            # Endpoint config; edit this to add endpoints
├── grafana/
│   ├── data/                     # Grafana database (auto-created)
│   └── provisioning/
│       └── datasources/
│           └── victoriametrics.yml   # Auto-wires VictoriaMetrics as datasource
├── victoriametrics/
│   └── data/                     # Metric storage (auto-created)
├── uptime_kuma/
│   └── data/                     # Uptime Kuma database (auto-created)
└── snmp_exporter/
    └── snmp.yml                  # SNMP module config (download separately)
```
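
Before the first `podman-compose up -d`, it can help to confirm that the files which are not auto-created are actually in place. The following is a minimal sketch, not something shipped with the stack: `check_layout` is a hypothetical helper, and the list of required files is an assumption read off the tree above (`snmp_exporter/snmp.yml` is left out because SNMP is optional).

```bash
# Hypothetical pre-flight helper: list expected files that are missing.
# Usage: check_layout [stack_dir]   (defaults to the current directory)
# The body runs in a subshell so its variables do not leak into the caller.
check_layout() (
  root="${1:-.}"
  missing=0
  for f in .env podman-compose.yml vmagent/config/scrape.yml \
           grafana/provisioning/datasources/victoriametrics.yml; do
    if [ ! -f "$root/$f" ]; then
      echo "missing: $f"
      missing=$((missing + 1))
    fi
  done
  # Exit status is non-zero when anything is missing.
  [ "$missing" -eq 0 ]
)
```

Because the function returns a non-zero status on a missing file, it can gate the start command, for example `check_layout . && podman-compose up -d`.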