---
title: "Monitoring & Observability"
description: "Know when things break before your users do. Uptime monitoring, disk alerts, log aggregation, and observability for self-hosters."
---
# Monitoring & Observability
You deployed 5 tools. They're running great. You go to bed. At 3 AM, the disk fills up, Postgres crashes, and everything dies. You find out at 9 AM when a user emails you.
**Monitoring prevents this.**
## The Three Layers
| Layer | What It Watches | Tool |
|---|---|---|
| **Uptime** | "Is the service responding?" | Uptime Kuma |
| **System** | CPU, RAM, disk, network | Node Exporter + Grafana |
| **Logs** | What's actually happening inside | Docker logs, Dozzle, SigNoz |

You need **at least** the first layer. The other two are for when you get serious.
## Layer 1: Uptime Monitoring (Essential)
[Uptime Kuma](/deploy/uptime-kuma) is the best first monitoring tool for self-hosters: one container, a web dashboard, and alerts when a service stops responding. Deploy it first, always.
```yaml
# docker-compose.yml
services:
  uptime-kuma:
    image: louislam/uptime-kuma:1
    container_name: uptime-kuma
    restart: unless-stopped
    ports:
      - "3001:3001"
    volumes:
      - uptime_data:/app/data

volumes:
  uptime_data:
```
### What to Monitor
Add a monitor for **every** service you run:
| Type | Target | Check Interval |
|---|---|---|
| HTTP(s) | `https://plausible.yourdomain.com` | 60s |
| HTTP(s) | `https://uptime.yourdomain.com` | 60s |
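| TCP Port | `<postgres-host>:5432` (Postgres; use the container name or host IP, because `localhost` inside the Uptime Kuma container is that container itself) | 120s |
| Docker Container | Container name (requires a Docker host, such as the Docker socket, configured in Uptime Kuma) | 60s |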
| DNS | `yourdomain.com` | 300s |
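
To make the HTTP(s) rows concrete, the check below approximates what such a monitor does: request the URL and treat any 2xx response as up. It is a rough sketch, not Uptime Kuma's implementation; the function names `status_is_up` and `check_url` and the 10-second timeout are assumptions made for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helpers illustrating an HTTP uptime check.

# Succeed (exit 0) when the HTTP status code is in the 2xx range.
status_is_up() {
  local code=$1
  [[ $code =~ ^2[0-9][0-9]$ ]]
}

# Fetch a URL and print "up" or "down" with the status code.
# curl reports 000 when the connection fails or times out.
check_url() {
  local url=$1 code
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$url")
  if status_is_up "$code"; then
    echo "up ($code)"
  else
    echo "down ($code)"
  fi
}
```

Calling `check_url https://plausible.yourdomain.com` prints one line per check. A real monitor adds retries and history on top of this, which is why a dedicated tool is still worth running.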
### Notifications
Uptime Kuma supports 90+ notification channels. Set up **at least two**:
- **Email** — For non-urgent alerts
- **Telegram/Discord/Slack** — For instant mobile alerts
> 🔥 **Pro Tip:** Monitor your monitoring. Set up an external free ping service (like [UptimeRobot](https://uptimerobot.com)) to watch your Uptime Kuma instance.
## Layer 2: System Metrics
### Quick Disk Alert Script
One of the most common causes of self-hosting outages is **running out of disk space**. This script emails an alert when usage of the root filesystem exceeds 80%:
```bash
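#!/bin/bash
# /opt/scripts/disk-alert.sh
# Requires a working `mail` command (for example from mailutils or bsd-mailx).
THRESHOLD=80
# -P forces single-line POSIX output so the awk column stays stable
USAGE=$(df -P / | tail -1 | awk '{print $5}' | sed 's/%//')
if [ "$USAGE" -gt "$THRESHOLD" ]; then
  echo "⚠️ Disk usage is at ${USAGE}% on $(hostname)" | \
    mail -s "Disk Alert: ${USAGE}%" you@yourdomain.com
fi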
```
Make the script executable (`chmod +x /opt/scripts/disk-alert.sh`), then add it to cron with `crontab -e`:
```bash
# Check every hour
0 * * * * /opt/scripts/disk-alert.sh
```
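
If the host has no working `mail` command, the same alert can go to a chat webhook instead. The sketch below assumes a Discord-style incoming webhook that accepts a JSON body with a `content` field. The `WEBHOOK_URL` variable and the `json_escape`, `build_payload`, and `send_alert` helpers are hypothetical names, and the escaping only handles backslashes and double quotes.

```bash
#!/usr/bin/env bash
# Assumption: WEBHOOK_URL holds a Discord-style webhook URL.

# Escape backslashes and double quotes so the text is safe inside a JSON string.
json_escape() {
  local s=$1
  s=${s//\\/\\\\}
  s=${s//\"/\\\"}
  printf '%s' "$s"
}

# Build the JSON body for the webhook.
build_payload() {
  printf '{"content":"%s"}' "$(json_escape "$1")"
}

# POST the message; -f makes curl return non-zero on HTTP errors.
send_alert() {
  curl -fsS -X POST -H 'Content-Type: application/json' \
    -d "$(build_payload "$1")" "${WEBHOOK_URL:?set WEBHOOK_URL first}"
}
```

In the disk script, the `mail` pipeline could then be swapped for `send_alert "Disk usage is at ${USAGE}% on $(hostname)"`. Slack-style webhooks expect a `text` field instead of `content`, so adjust `build_payload` for the service in use.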
### What to Watch
| Metric | Warning Threshold | Critical Threshold |
|---|---|---|
| Disk usage | 70% | 85% |
| RAM usage | 80% | 95% |
| CPU sustained | 80% for 5 min | 95% for 5 min |
| Container restarts | 3 in 1 hour | 10 in 1 hour |
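
The two-threshold idea in this table can be sketched as a small helper. This is an illustration only: `classify` and `disk_usage_pct` are hypothetical names, and treating a value exactly at a threshold as having crossed it is an assumption the table does not settle.

```bash
#!/usr/bin/env bash
# classify VALUE WARN CRIT -> prints ok, warning, or critical.
# Assumption: a value equal to a threshold counts as crossing it.
classify() {
  local value=$1 warn=$2 crit=$3
  if (( value >= crit )); then
    echo critical
  elif (( value >= warn )); then
    echo warning
  else
    echo ok
  fi
}

# Usage of a mount point as an integer percentage, read from POSIX df output.
disk_usage_pct() {
  df -P "${1:-/}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}
```

For example, `classify "$(disk_usage_pct /)" 70 85` applies the disk row of the table to the root filesystem.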
### Docker Resource Monitoring
Quick commands to check what's eating your resources:
```bash
# Live resource usage per container
docker stats
# Disk usage of images, containers, volumes, and build cache
docker system df -v
# Find large volumes (the glob must expand as root, hence sh -c)
sudo sh -c 'du -sh /var/lib/docker/volumes/*/'
```
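
For a scripted version of the `docker stats` check, the output can be filtered for containers above a memory threshold. The helper below is a sketch; `over_mem_limit` is a hypothetical name, and it expects one `name percent%` pair per line on standard input.

```bash
#!/usr/bin/env bash
# Read "name percent%" lines on stdin and print the ones above LIMIT.
over_mem_limit() {
  local limit=$1
  awk -v limit="$limit" '{
    pct = $2
    sub(/%/, "", pct)          # strip the trailing percent sign
    if (pct + 0 > limit) print $1, $2
  }'
}
```

Feed it with `docker stats --no-stream --format '{{.Name}} {{.MemPerc}}' | over_mem_limit 80`. Note that the percentage is relative to the container's memory limit, or to host memory when no limit is set.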
## Layer 3: Log Aggregation
Docker captures all stdout/stderr from your containers. Use it:
```bash
# Live logs for a service
docker compose logs -f plausible
# Last 100 lines
docker compose logs --tail=100 plausible
# Logs from the last 2 hours
docker compose logs --since="2h" plausible
```
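
When several services log at once, a per-service error count shows where to look first. The sketch below assumes the usual `service-1  | message` prefix that `docker compose logs` puts on each line; `count_errors` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Count lines containing "error" (any case) per service prefix, busiest first.
count_errors() {
  awk -F'|' 'tolower($0) ~ /error/ {
      gsub(/[ \t]+$/, "", $1)  # trim padding after the service name
      n[$1]++
    }
    END { for (s in n) print n[s], s }' | sort -rn
}
```

A typical call is `docker compose logs --no-color --since=24h | count_errors`. The match is a plain substring search, so a line such as "0 errors found" is counted too.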
### Dozzle (Docker Log Viewer)
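For a web-based viewer of live container logs, run Dozzle. It reads logs through the Docker socket, so keep it behind authentication or a private network rather than exposing the port publicly: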
```yaml
services:
  dozzle:
    image: amir20/dozzle:latest
    container_name: dozzle
    ports:
      - "8080:8080"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
```
### For Serious Setups: SigNoz
If you need traces, metrics, **and** logs in one place, deploy [SigNoz](/deploy/signoz). It's an open-source Datadog alternative built on OpenTelemetry.
## Maintenance Routine
Set a weekly calendar reminder:
```
☐ Check Uptime Kuma — all green?
☐ Run `docker stats` — anything hogging resources?
☐ Run `df -h` — disk space OK?
☐ Run `docker system prune -f` (removes stopped containers, unused networks, dangling images, and build cache)
☐ Check logs for any errors — `docker compose logs --since=168h | grep -i error`
```
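
The read-only parts of this checklist can be gathered into one report. The sketch below is one possible shape, with `weekly_report` as a hypothetical name; it leaves `docker system prune` as a manual step because that command deletes data.

```bash
#!/usr/bin/env bash
# Print disk usage and, when Docker is available, container resource usage.
weekly_report() {
  echo "== Disk usage =="
  df -h /
  if command -v docker >/dev/null 2>&1; then
    echo "== Container resources =="
    docker stats --no-stream || echo "(docker stats failed)"
    echo "== Docker disk usage =="
    docker system df || echo "(docker system df failed)"
  else
    echo "(docker not found, skipping container checks)"
  fi
}
```

Running `weekly_report` by hand, or from a weekly cron entry that mails the output, covers the disk and resource checks in one pass.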
## Next Steps
→ [Updating & Maintaining Containers](/concepts/updates) — Keep your tools up to date safely
→ [Backups That Actually Work](/concepts/backups) — Protect your data
→ [Deploy Uptime Kuma](/deploy/uptime-kuma) — Set up monitoring now