# Metrics Stack
Self-contained monitoring stack using VictoriaMetrics, vmagent, Grafana, and Uptime Kuma. Deploy one instance per client site. Access remotely over VPN.
## Stack Components
| Service | Purpose | Default Port |
|---|---|---|
| VictoriaMetrics | Time-series metric storage | 8428 |
| vmagent | Prometheus-compatible scrape agent | 8429 |
| Grafana | Dashboards and visualization | 3000 |
| Uptime Kuma | Availability monitoring + alerting | 3001 |
| node_exporter | Host metrics (this machine) | internal only |
| snmp_exporter | SNMP metrics for network devices | 9116 (optional) |
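
Each service presumably publishes its port on `BIND_HOST` so the stack is reachable only on the LAN/VPN interface. As an illustration of what one service definition in `podman-compose.yml` might look like (the image tag, retention period, and volume path here are assumptions, not the actual file):

```yaml
# Illustrative fragment only — the real definitions live in podman-compose.yml.
services:
  victoriametrics:
    image: docker.io/victoriametrics/victoria-metrics:latest
    ports:
      - "${BIND_HOST}:8428:8428"        # bind only on the LAN IP from .env
    volumes:
      - ./victoriametrics/data:/victoria-metrics-data
    command:
      - "--storageDataPath=/victoria-metrics-data"
      - "--retentionPeriod=12"          # months of retention (assumed value)
```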
## Initial Setup
### 1. Configure environment

```sh
cp .env.example .env
```

Edit `.env`:

- Set `BIND_HOST` to this machine's LAN IP
- Set `CLIENT_NAME` to identify the client
- Set a strong password for `GF_ADMIN_PASSWORD`
- Set `TZ` to the correct timezone
### 2. Configure endpoints

Edit `vmagent/config/scrape.yml`:

- Update the `linux-host` job with this machine's hostname and site name
- Add any other endpoints (see "Adding Endpoints" below)
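
For reference, the `linux-host` job follows standard Prometheus `scrape_config` syntax and might look roughly like this (the hostname, site value, and label names are examples — match them to what is actually in the file):

```yaml
# Hypothetical shape of the linux-host job in vmagent/config/scrape.yml.
scrape_configs:
  - job_name: linux-host
    static_configs:
      - targets: ["localhost:9100"]     # node_exporter inside this stack
        labels:
          hostname: metrics01           # this machine's hostname (example)
          site: client-site-a           # site label (example)
```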
### 3. Start the stack

```sh
podman-compose up -d
```
### 4. Finish Uptime Kuma setup

- Browse to `http://BIND_HOST:3001` and complete the initial setup wizard
- Note the username/password you set
- In `vmagent/config/scrape.yml`, uncomment the `uptime_kuma` job and fill in those credentials
- Run `podman-compose restart vmagent`
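
Uptime Kuma serves Prometheus-format metrics at `/metrics` behind the web UI's basic auth, so the uncommented job plausibly takes this shape (credentials and target are examples):

```yaml
# Sketch of the uptime_kuma job after uncommenting — adjust to the real template.
scrape_configs:
  - job_name: uptime_kuma
    metrics_path: /metrics
    basic_auth:
      username: admin                   # account created in the setup wizard
      password: change-me               # its password
    static_configs:
      - targets: ["localhost:3001"]
```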
## Adding Endpoints
Open `vmagent/config/scrape.yml`. The file has two sections:

- ACTIVE JOBS — jobs that are currently running
- TEMPLATES — commented-out job blocks, one per endpoint type

To add a new endpoint:

- Find the matching template at the bottom of `scrape.yml`
- Copy the entire commented block (from `# - job_name:` to the end of the block)
- Paste it into the ACTIVE JOBS section
- Uncomment it (remove the leading `#` from each line)
- Fill in the IP addresses, hostnames, and site label
- Restart vmagent: `podman-compose restart vmagent`
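
After copying and uncommenting, a filled-in job for a pair of Windows servers might look like this (the job name, IPs, and site label are illustrative):

```yaml
# Example of a template after steps above — two windows_exporter targets.
scrape_configs:
  - job_name: windows-servers
    static_configs:
      - targets: ["192.0.2.10:9182", "192.0.2.11:9182"]
        labels:
          site: client-site-a
```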
### Available templates
| Template | Exporter needed on target | Port |
|---|---|---|
| Windows Domain Controller | windows_exporter | 9182 |
| Hyper-V Host | windows_exporter (with hyperv collector) | 9182 |
| Windows General Purpose Server | windows_exporter | 9182 |
| Linux Server | node_exporter | 9100 |
| SNMP Device | snmp_exporter (runs in this stack) | n/a |
### Installing windows_exporter
Download the latest .msi from:
https://github.com/prometheus-community/windows_exporter/releases
For Hyper-V hosts, ensure the hyperv collector is enabled. You can set this
in the MSI installer or by modifying the service arguments post-install:

```
--collectors.enabled defaults,hyperv,cpu_info,physical_disk,process
```
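
windows_exporter can also read a YAML config file (passed with `--config.file`), which some prefer to editing service arguments; the equivalent setting there is roughly:

```yaml
# Assumed windows_exporter config-file form of the same collector list.
collectors:
  enabled: defaults,hyperv,cpu_info,physical_disk,process
```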
### Enabling SNMP monitoring

- Uncomment the `snmp-exporter` service in `podman-compose.yml`
- Download a pre-built `snmp.yml` from https://github.com/prometheus/snmp_exporter/releases
- Place it at `snmp_exporter/snmp.yml`
- Uncomment and configure the `snmp-devices` job template in `scrape.yml`
- Restart the stack: `podman-compose up -d`
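
For orientation, snmp_exporter jobs use a well-known relabeling pattern (from the snmp_exporter docs): the scrape "target" is the device, and relabeling redirects the actual HTTP request to the exporter. The template likely resembles this sketch (device IP, module name, and the `snmp-exporter` service hostname are assumptions):

```yaml
# Typical shape of an snmp_exporter scrape job; check the real template.
scrape_configs:
  - job_name: snmp-devices
    metrics_path: /snmp
    params:
      module: [if_mib]                  # module defined in snmp.yml
    static_configs:
      - targets: ["192.0.2.1"]          # the SNMP device itself (example)
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target    # device the exporter should query
      - source_labels: [__param_target]
        target_label: instance          # keep the device IP as the instance
      - target_label: __address__
        replacement: snmp-exporter:9116 # exporter container (assumed name)
```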
## Useful Commands

```sh
# Start the stack
podman-compose up -d

# Stop the stack
podman-compose down

# Restart a single service (e.g., after editing scrape.yml)
podman-compose restart vmagent

# View logs for a service
podman-compose logs -f vmagent
podman-compose logs -f victoriametrics

# Check running containers
podman-compose ps

# Pull latest images and restart
podman-compose pull && podman-compose up -d
```
### Verify vmagent is scraping

Browse to `http://BIND_HOST:8429/targets` to see all configured scrape targets
and their current status (up/down, last scrape time, errors).
## Directory Structure

```
metrics/
├── .env                         # Active config (do not commit)
├── .env.example                 # Config template
├── podman-compose.yml           # Stack definition
├── vmagent/
│   └── config/
│       └── scrape.yml           # Endpoint config — edit this to add endpoints
├── grafana/
│   ├── data/                    # Grafana database (auto-created)
│   └── provisioning/
│       └── datasources/
│           └── victoriametrics.yml  # Auto-wires VictoriaMetrics as datasource
├── victoriametrics/
│   └── data/                    # Metric storage (auto-created)
├── uptime_kuma/
│   └── data/                    # Uptime Kuma database (auto-created)
└── snmp_exporter/
    └── snmp.yml                 # SNMP module config (download separately)
```