Monitoring & Logging

Monitor Docker containers with health checks, structured logging, and observability tools

Container Monitoring

Monitoring your containers in production is essential for maintaining reliability. You need to track resource usage, application health, and logs to quickly identify and resolve issues.

Docker Built-In Monitoring

# Real-time resource usage
docker stats

# Output:
# CONTAINER   CPU %   MEM USAGE/LIMIT   MEM %   NET I/O      BLOCK I/O
# api         2.5%    128MiB/512MiB     25%     1.2MB/800KB  0B/0B
# postgres    0.5%    64MiB/256MiB      25%     500KB/200KB  5MB/2MB

# Formatted output
docker stats --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}"

# One-time snapshot (no streaming)
docker stats --no-stream
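
To act on these numbers from a script rather than read them by eye, the snapshot can be parsed and compared against limits. The following is a minimal sketch, assuming the output comes from `docker stats --no-stream --format '{{json .}}'` (one JSON object per line with string fields such as `CPUPerc` and `MemPerc`). The function names and thresholds are hypothetical, and the sample data mirrors the example output above.

```javascript
// Sketch: flag containers whose CPU or memory percentage exceeds a limit.
// Assumed input: docker stats --no-stream --format '{{json .}}'

// Convert a percentage string such as "2.50%" to the number 2.5.
function parsePercent(text) {
  return Number.parseFloat(String(text).replace('%', ''));
}

// Parse newline-delimited JSON into { name, cpu, mem } objects.
function parseStats(output) {
  return output
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => {
      const row = JSON.parse(line);
      return {
        name: row.Name,
        cpu: parsePercent(row.CPUPerc),
        mem: parsePercent(row.MemPerc),
      };
    });
}

// Return the names of containers above either limit (hypothetical defaults).
function findHotContainers(stats, { cpuLimit = 80, memLimit = 80 } = {}) {
  return stats
    .filter((s) => s.cpu > cpuLimit || s.mem > memLimit)
    .map((s) => s.name);
}

// Sample input mirroring the values shown above.
const sample = [
  '{"Name":"api","CPUPerc":"2.50%","MemPerc":"25.00%"}',
  '{"Name":"postgres","CPUPerc":"0.50%","MemPerc":"25.00%"}',
].join('\n');

console.log(findHotContainers(parseStats(sample), { cpuLimit: 1, memLimit: 50 }));
```

In practice the `sample` string would be replaced by the captured command output, for example from `child_process.execFileSync`.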

Health Checks

# In Dockerfile
HEALTHCHECK --interval=30s --timeout=10s --retries=3 --start-period=40s \
  CMD wget -qO- http://localhost:3000/health || exit 1

# In docker-compose.yml
services:
  api:
    build: .
    healthcheck:
      test: ["CMD", "wget", "-qO-", "http://localhost:3000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s

# Check container health status
docker inspect --format='{{.State.Health.Status}}' api
# healthy | unhealthy | starting

# List containers with health status
docker ps --format "table {{.Names}}\t{{.Status}}"
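
The health checks above poll `http://localhost:3000/health`, so the application has to serve that path. Here is one possible sketch of such an endpoint using only Node's built-in `http` module. The helper names and the `database` check are hypothetical placeholders. The handler answers 503 when any check fails, which is expected to make `wget` exit non-zero and the container report as unhealthy.

```javascript
const http = require('node:http');

// Combine named dependency checks into one report.
// `checks` maps a name to a boolean (true = working); the names are hypothetical.
function buildHealthReport(checks) {
  const failed = Object.keys(checks).filter((name) => !checks[name]);
  return {
    statusCode: failed.length === 0 ? 200 : 503,
    body: { status: failed.length === 0 ? 'ok' : 'degraded', failed },
  };
}

// Request handler: answers GET /health and returns 404 for anything else.
function createHealthHandler(getChecks) {
  return (req, res) => {
    if (req.method !== 'GET' || req.url !== '/health') {
      res.statusCode = 404;
      res.end();
      return;
    }
    const report = buildHealthReport(getChecks());
    res.writeHead(report.statusCode, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify(report.body));
  };
}

// The server is created but not started here; call server.listen(3000) to serve.
const server = http.createServer(
  createHealthHandler(() => ({ database: true })) // placeholder check
);
```

Keeping the checks cheap matters because Docker runs the probe every `interval` and marks it failed after `timeout`.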

Docker Logging

# View container logs
docker logs api
docker logs -f api              # Follow (live)
docker logs --tail 100 api      # Last 100 lines
docker logs --since 1h api      # Last hour
docker logs --until 2024-01-01 api   # Entries before this date

# Compose logs
docker compose logs
docker compose logs -f api db   # Follow specific services
docker compose logs --timestamps

Structured Logging

// Use structured JSON logging for easy parsing
const pino = require('pino');
const logger = pino({
  level: process.env.LOG_LEVEL || 'info',
  transport: process.env.NODE_ENV === 'development'
    ? { target: 'pino-pretty' }
    : undefined,
});

// Usage (err and req come from the surrounding request handler)
logger.info({ port: 3000 }, 'Server started');
logger.error({ err, userId: req.user.id }, 'Request failed');
logger.warn({ memUsage: process.memoryUsage() }, 'High memory usage');
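
The payoff of JSON logs is that other programs can filter them without fragile text matching. The sketch below illustrates this under a few assumptions: the input is newline-delimited JSON as pino emits by default, the levels are pino's default numeric values, and the function names and sample entries are made up for the example.

```javascript
// Pino's default numeric levels.
const LEVELS = { trace: 10, debug: 20, info: 30, warn: 40, error: 50, fatal: 60 };

// Parse newline-delimited JSON log output, skipping lines that are not JSON.
function parseLogLines(output) {
  const entries = [];
  for (const line of output.split('\n')) {
    if (line.trim() === '') continue;
    try {
      entries.push(JSON.parse(line));
    } catch {
      // Non-JSON line (for example output from another tool): ignore it.
    }
  }
  return entries;
}

// Keep only entries at or above the named level.
function filterByLevel(entries, minLevel) {
  const threshold = LEVELS[minLevel];
  return entries.filter((entry) => entry.level >= threshold);
}

// Hypothetical sample resembling captured `docker logs api` output.
const sampleLogs = [
  '{"level":30,"time":1700000000000,"msg":"Server started","port":3000}',
  '{"level":50,"time":1700000001000,"msg":"Request failed","userId":"u-42"}',
  'not json',
].join('\n');

for (const entry of filterByLevel(parseLogLines(sampleLogs), 'warn')) {
  console.log(new Date(entry.time).toISOString(), entry.msg);
}
```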

Monitoring Stack with Prometheus & Grafana

# docker-compose.monitoring.yml
services:
  prometheus:
    image: prom/prometheus
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus-data:/prometheus
    ports:
      - "9090:9090"
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.retention.time=30d'

  grafana:
    image: grafana/grafana
    volumes:
      - grafana-data:/var/lib/grafana
    ports:
      - "3001:3000"
    environment:
      GF_SECURITY_ADMIN_PASSWORD: admin  # Default credential; change before exposing Grafana

  node-exporter:
    image: prom/node-exporter
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
    command:
      - '--path.procfs=/host/proc'
      - '--path.sysfs=/host/sys'

  cadvisor:
    image: gcr.io/cadvisor/cadvisor
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:rw
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
    ports:
      - "8080:8080"

volumes:
  prometheus-data:
  grafana-data:
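
Once the stack is running, Prometheus can also be queried programmatically through its HTTP API on port 9090. The sketch below assumes `prometheus.yml` already scrapes the exporters and uses the built-in `up` metric (1 when a scrape succeeds, 0 when it fails) to list unreachable targets. The function names, job names, and instance values are hypothetical, and `fetchDownTargets` needs Node 18+ for the global `fetch`.

```javascript
// Extract targets whose `up` sample is 0 from an instant-query response.
function findDownTargets(response) {
  if (response.status !== 'success' || response.data.resultType !== 'vector') {
    throw new Error('Unexpected Prometheus response');
  }
  return response.data.result
    .filter((sample) => sample.value[1] === '0') // sample values are strings
    .map((sample) => `${sample.metric.job}/${sample.metric.instance}`);
}

// Query the Prometheus HTTP API; the base URL matches the port mapping above.
async function fetchDownTargets(baseUrl = 'http://localhost:9090') {
  const res = await fetch(`${baseUrl}/api/v1/query?query=up`);
  return findDownTargets(await res.json());
}

// Sample response with made-up job and instance labels.
const sampleResponse = {
  status: 'success',
  data: {
    resultType: 'vector',
    result: [
      { metric: { job: 'cadvisor', instance: 'cadvisor:8080' }, value: [1700000000, '1'] },
      { metric: { job: 'node', instance: 'node-exporter:9100' }, value: [1700000000, '0'] },
    ],
  },
};

console.log(findDownTargets(sampleResponse));
```

For production alerting, Prometheus alerting rules with Alertmanager are the usual route; a script like this is mainly useful for quick checks.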

Application Metrics

// Expose Prometheus metrics from Node.js (Express app)
const express = require('express');
const client = require('prom-client');

const app = express();

// Collect default metrics
client.collectDefaultMetrics({ prefix: 'myapp_' });

// Custom metrics
const httpRequestDuration = new client.Histogram({
  name: 'http_request_duration_seconds',
  help: 'Duration of HTTP requests',
  labelNames: ['method', 'route', 'status'],
  buckets: [0.1, 0.5, 1, 2, 5],
});

// Middleware
app.use((req, res, next) => {
  const end = httpRequestDuration.startTimer();
  res.on('finish', () => {
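    // Prefer the matched route pattern (e.g. /users/:id) over the raw path
    // so the route label does not grow one series per distinct URL
    const route = req.route ? req.route.path : req.path;
    end({ method: req.method, route, status: res.statusCode });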
  });
  next();
});

// Metrics endpoint
app.get('/metrics', async (req, res) => {
  res.set('Content-Type', client.register.contentType);
  res.end(await client.register.metrics());
});
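
The `buckets` option is easier to choose once it is clear that Prometheus histogram buckets are cumulative: each bucket counts every observation less than or equal to its upper bound, plus an implicit `+Inf` bucket that counts everything. The following is a plain JavaScript illustration of that counting rule, not prom-client's actual implementation. The function name and sample durations are invented for the example.

```javascript
// Count observations per cumulative bucket, the way a Prometheus histogram
// reports them (illustration only).
function bucketCounts(buckets, observations) {
  const bounds = [...buckets, Infinity];
  return bounds.map((le) => ({
    le: le === Infinity ? '+Inf' : String(le),
    count: observations.filter((value) => value <= le).length,
  }));
}

// Same bucket bounds as the histogram above; durations are in seconds.
const counts = bucketCounts([0.1, 0.5, 1, 2, 5], [0.05, 0.3, 0.3, 7]);

for (const { le, count } of counts) {
  console.log(`le=${le} count=${count}`);
}
```

Here the 7-second request lands only in `+Inf`, which suggests that bounds should extend past the slowest latency worth distinguishing.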

Monitoring Best Practices

  • ✅ Always implement health check endpoints
  • ✅ Use structured JSON logging (not plain text)
  • ✅ Monitor CPU, memory, disk, and network usage
  • ✅ Set up alerts for container restarts and health failures
  • ✅ Use a centralized logging system for production
