# Container Monitoring
Monitoring your containers in production is essential for maintaining reliability. You need to track resource usage, application health, and logs to quickly identify and resolve issues.
## Docker Built-In Monitoring

```bash
# Real-time resource usage
docker stats

# Output:
# CONTAINER   CPU %   MEM USAGE / LIMIT   MEM %   NET I/O         BLOCK I/O
# api         2.5%    128MiB / 512MiB     25%     1.2MB / 800KB   0B / 0B
# postgres    0.5%    64MiB / 256MiB      25%     500KB / 200KB   5MB / 2MB

# Formatted output
docker stats --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}"

# One-time snapshot (no streaming)
docker stats --no-stream
```
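If you script against `docker stats` output, the human-readable sizes (e.g. `128MiB`) need converting before you can compare them against thresholds. A minimal sketch; the `parseMemory` helper and its unit table are assumptions for illustration, not part of the Docker CLI:

```javascript
// Hypothetical helper: convert docker-stats-style sizes ("128MiB", "1.5GiB")
// into bytes. Docker mixes decimal (kB, MB) and binary (KiB, MiB) units.
const UNITS = { B: 1, kB: 1e3, KiB: 1024, MB: 1e6, MiB: 1024 ** 2, GB: 1e9, GiB: 1024 ** 3 };

function parseMemory(value) {
  const match = /^([\d.]+)\s*([A-Za-z]+)$/.exec(value.trim());
  if (!match || !(match[2] in UNITS)) throw new Error(`Unparseable size: ${value}`);
  return Number(match[1]) * UNITS[match[2]];
}
```

For example, `parseMemory('128MiB')` returns `134217728` (128 × 1024²), which you can then compare directly against a container's memory limit in bytes.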
## Health Checks

```dockerfile
# In Dockerfile
HEALTHCHECK --interval=30s --timeout=10s --retries=3 --start-period=40s \
  CMD wget -qO- http://localhost:3000/health || exit 1
```
```yaml
# In docker-compose.yml
services:
  api:
    build: .
    healthcheck:
      test: ["CMD", "wget", "-qO-", "http://localhost:3000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
```
```bash
# Check a container's health status
docker inspect --format='{{.State.Health.Status}}' api
# healthy | unhealthy | starting

# List containers with health status
docker ps --format "table {{.Names}}\t{{.Status}}"
```
## Docker Logging

```bash
# View container logs
docker logs api
docker logs -f api                  # Follow (live)
docker logs --tail 100 api          # Last 100 lines
docker logs --since 1h api          # Last hour
docker logs --until 2024-01-01 api  # Up to a date

# Compose logs
docker compose logs
docker compose logs -f api db       # Follow specific services
docker compose logs --timestamps
```
## Structured Logging

```javascript
// Use structured JSON logging for easy parsing
const pino = require('pino');

const logger = pino({
  level: process.env.LOG_LEVEL || 'info',
  transport: process.env.NODE_ENV === 'development'
    ? { target: 'pino-pretty' }  // human-readable locally, raw JSON in prod
    : undefined,
});

// Usage (err and req come from the surrounding request-handler scope)
logger.info({ port: 3000 }, 'Server started');
logger.error({ err, userId: req.user.id }, 'Request failed');
logger.warn({ memUsage: process.memoryUsage() }, 'High memory usage');
```
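To make the "one JSON object per line" shape concrete without pulling in pino, here is a dependency-free sketch of the same idea: each call serializes a level, a message, and merged context fields. The `makeLogger` name and field layout are illustrative, not pino internals:

```javascript
// Minimal structured logger sketch: one JSON line per event, with a level
// filter. The injectable `write` function exists so output can be captured.
const LEVELS = { debug: 20, info: 30, warn: 40, error: 50 };

function makeLogger(minLevel = 'info', write = (line) => process.stdout.write(line + '\n')) {
  const log = (level) => (fields, msg) => {
    if (LEVELS[level] < LEVELS[minLevel]) return null; // below threshold: drop
    const line = JSON.stringify({ level, time: Date.now(), msg, ...fields });
    write(line);
    return line;
  };
  return { debug: log('debug'), info: log('info'), warn: log('warn'), error: log('error') };
}
```

Because every line is valid JSON, `docker logs` output can be piped straight into tools like `jq` or shipped to a log aggregator without fragile regex parsing.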
## Monitoring Stack with Prometheus & Grafana

```yaml
# docker-compose.monitoring.yml
services:
  prometheus:
    image: prom/prometheus
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus-data:/prometheus
    ports:
      - "9090:9090"
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.retention.time=30d'

  grafana:
    image: grafana/grafana
    volumes:
      - grafana-data:/var/lib/grafana
    ports:
      - "3001:3000"
    environment:
      GF_SECURITY_ADMIN_PASSWORD: admin  # change this for production

  node-exporter:   # host-level metrics (CPU, memory, disk)
    image: prom/node-exporter
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
    command:
      - '--path.procfs=/host/proc'
      - '--path.sysfs=/host/sys'

  cadvisor:        # per-container metrics
    image: gcr.io/cadvisor/cadvisor
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:rw
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
    ports:
      - "8080:8080"

volumes:
  prometheus-data:
  grafana-data:
```
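The compose file above mounts a `./prometheus.yml` that is not shown. A minimal scrape config matching these services might look like the following; the job names and targets are assumptions based on the compose service names, which resolve via the compose network:

```yaml
# prometheus.yml (sketch)
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'node-exporter'
    static_configs:
      - targets: ['node-exporter:9100']
  - job_name: 'cadvisor'
    static_configs:
      - targets: ['cadvisor:8080']
  - job_name: 'api'
    static_configs:
      - targets: ['api:3000']  # assumes the app serves /metrics (see below)
```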
## Application Metrics

```javascript
// Expose Prometheus metrics from Node.js (assumes an existing Express `app`)
const client = require('prom-client');

// Collect default metrics (event loop lag, heap usage, GC, ...)
client.collectDefaultMetrics({ prefix: 'myapp_' });

// Custom metrics
const httpRequestDuration = new client.Histogram({
  name: 'http_request_duration_seconds',
  help: 'Duration of HTTP requests',
  labelNames: ['method', 'route', 'status'],
  buckets: [0.1, 0.5, 1, 2, 5],
});

// Middleware: time every request and record it with its labels
app.use((req, res, next) => {
  const end = httpRequestDuration.startTimer();
  res.on('finish', () => {
    end({ method: req.method, route: req.path, status: res.statusCode });
  });
  next();
});

// Metrics endpoint for Prometheus to scrape
app.get('/metrics', async (req, res) => {
  res.set('Content-Type', client.register.contentType);
  res.end(await client.register.metrics());
});
```
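Prometheus histogram buckets are cumulative: each `le` bound counts every observation at or below it, plus an implicit `+Inf` bucket holding the total. A quick illustration of how recorded durations land in the buckets configured above (the `bucketCounts` helper is illustrative, not part of prom-client):

```javascript
// Illustrative: compute cumulative bucket counts the way a Prometheus
// histogram reports them. Each `le` bucket includes everything below it.
function bucketCounts(observations, bounds) {
  const counts = bounds.map((le) => observations.filter((v) => v <= le).length);
  counts.push(observations.length); // the implicit +Inf bucket
  return counts;
}
```

For instance, durations `[0.05, 0.3, 0.7, 3]` against bounds `[0.1, 0.5, 1, 2, 5]` produce `[1, 2, 3, 3, 4, 4]`: the 0.05s request is counted in every bucket, while the 3s request only appears from `le=5` onward. This is why queries use `histogram_quantile` over rates of these buckets rather than reading them directly.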
## Monitoring Best Practices
- ✅ Always implement health check endpoints
- ✅ Use structured JSON logging (not plain text)
- ✅ Monitor CPU, memory, disk, and network usage
- ✅ Set up alerts for container restarts and health failures
- ✅ Use a centralized logging system for production
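Alerts on container restarts and health failures are normally expressed as Prometheus alerting rules, but a cheap in-process fallback is a periodic check of memory usage against the container limit. A sketch; the `checkMemory` name and the 0.9 threshold are assumptions for illustration:

```javascript
// Sketch: decide whether memory usage is close enough to the container
// limit to warrant an alert. Pure function so the policy is testable.
function checkMemory(usedBytes, limitBytes, threshold = 0.9) {
  const ratio = usedBytes / limitBytes;
  return { ratio, alert: ratio >= threshold };
}

// Wiring it up in the app might look like:
// setInterval(() => {
//   const { rss } = process.memoryUsage();
//   const { alert, ratio } = checkMemory(rss, LIMIT_BYTES); // LIMIT_BYTES: your container limit
//   if (alert) console.error(JSON.stringify({ level: 'warn', msg: 'high memory', ratio }));
// }, 30_000);
```

Logging the warning as structured JSON (per the practices above) means the centralized logging system can turn it into an alert without any extra plumbing.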