#observability #Sentry #Grafana #logs #small team

Observability for a 5-person team — Sentry, Grafana, plain logs

Enterprise observability stacks (Datadog, New Relic, full OpenTelemetry pipelines) cost $30K+/year and require dedicated ownership. Small teams can build effective observability for roughly $150-320/month with Sentry, Grafana Cloud, and structured logs.

Jun 12, 2026

A 5-engineer team running production needs observability. Datadog or New Relic would solve it, at $30K-100K/year plus ongoing tuning. A lighter stack covers roughly 90% of needs for $150-320/month and requires no dedicated owner.

The three pillars, applied small

1. Errors — Sentry

Sentry captures application errors, groups them by signature, and sends alerts. For a 5-person team:

Free tier covers ~5K errors/month — enough for small SaaS.
Team plan ($26/user/month) scales without surprises.
Integrations with everything (Slack, Linear, GitHub).
Release tracking ties errors to deployments.
Source maps for client-side errors.

Setup: one SDK install, one DSN per environment. Done in 30 minutes.
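
A minimal sketch of that setup in Python, assuming the official sentry-sdk package. The environment variable names APP_ENV and GIT_SHA are hypothetical choices for this example; SENTRY_DSN is a name the SDK also reads on its own. The idea is to keep one DSN per environment in deploy configuration rather than in code, and to pass a release identifier so errors can be tied to deployments.

```python
import os


def sentry_options(environ=os.environ):
    """Build keyword arguments for sentry_sdk.init from environment variables.

    APP_ENV and GIT_SHA are assumed names for this sketch, not a Sentry convention.
    """
    return {
        # One DSN per environment, injected by the deploy config.
        # None leaves the DSN unset, e.g. for local development.
        "dsn": environ.get("SENTRY_DSN") or None,
        # Used to filter errors by environment in Sentry.
        "environment": environ.get("APP_ENV", "development"),
        # Release tracking: ties each error to the deployed commit.
        "release": environ.get("GIT_SHA") or None,
    }


def init_sentry():
    # Imported lazily so this module still loads where the SDK is not installed.
    import sentry_sdk

    sentry_sdk.init(**sentry_options())
```

Calling init_sentry() once at application startup is usually enough; the options function is kept separate so the configuration can be checked without the SDK installed.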

2. Metrics and dashboards — Grafana Cloud

Free tier includes 10K active metric series, 50GB of logs, and 50GB of traces. Generous for a small SaaS.

Use:

Prometheus exporters for app and infrastructure metrics.
Loki for log aggregation.
Tempo for traces (when needed).
Built-in dashboards for common services.

Setup: 1-2 days for instrumentation + dashboards.

3. Logs — structured JSON to Loki or BetterStack

Emit every log line as JSON with consistent fields:

{
  "level": "error",
  "time": "2026-06-12T15:23:01Z",
  "service": "api",
  "event": "payment_failed",
  "user_id": "u-42",
  "order_id": "o-9876",
  "error": "insufficient_funds",
  "trace_id": "abc123"
}

Lines like this are searchable and parseable, and trace_id lets you correlate them across services (and with traces, if you add them later).
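
One way to produce lines in that shape is a small formatter on top of Python's standard logging module. This is a sketch, not the only approach: the class name JsonFormatter, the helper build_logger, and the convention of passing extra fields under a single "fields" key are all assumptions made for this example.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def __init__(self, service):
        super().__init__()
        self.service = service

    def format(self, record):
        timestamp = datetime.fromtimestamp(record.created, tz=timezone.utc)
        entry = {
            "level": record.levelname.lower(),
            "time": timestamp.strftime("%Y-%m-%dT%H:%M:%SZ"),
            "service": self.service,
            "event": record.getMessage(),
        }
        # Extra fields (user_id, order_id, trace_id, ...) arrive via
        # extra={"fields": {...}} so they cannot clash with LogRecord attributes.
        entry.update(getattr(record, "fields", {}))
        return json.dumps(entry)


def build_logger(service, stream=None):
    """Return a logger that writes JSON lines to `stream` (stderr by default)."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter(service))
    logger = logging.getLogger(service)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    log = build_logger("api")
    log.error(
        "payment_failed",
        extra={"fields": {"user_id": "u-42", "order_id": "o-9876",
                          "error": "insufficient_funds", "trace_id": "abc123"}},
    )
```

Writing to stdout or stderr and letting the log shipper forward lines to Loki or BetterStack keeps the application free of vendor-specific code.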

The minimum useful set of metrics

Request rate per endpoint.
Error rate per endpoint (4xx separated from 5xx).
Latency p50, p95, p99 per endpoint.
Database query time p95.
External API call latency + error rate per provider.
Background job queue depth and processing time.
Infrastructure — CPU, memory, disk I/O on each instance.

Resist the urge to instrument everything. Add metrics when they answer a specific question.
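
For readers less familiar with the latency percentiles in that list, here is a small illustration of what p50, p95, and p99 mean on raw samples, using the nearest-rank method. In practice Prometheus usually derives these from histogram buckets with histogram_quantile rather than from raw samples; the function names below are made up for this sketch.

```python
def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct in 1..100.

    Returns the smallest sample such that at least pct% of samples
    are less than or equal to it.
    """
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, giving a 1-based rank.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def latency_summary(samples_ms):
    """Summarize request latencies (milliseconds) as p50, p95, and p99."""
    return {f"p{p}": percentile(samples_ms, p) for p in (50, 95, 99)}


if __name__ == "__main__":
    # 98 fast requests and 2 slow ones: the median hides the slow tail.
    samples = [10] * 98 + [500, 2000]
    print(latency_summary(samples))
```

The example data shows why the tail percentiles matter: the median and p95 look healthy while p99 exposes the slow requests.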

The minimum useful set of dashboards

Overall health. Request rate, error rate, p95 latency across services.
Per-service deep dive. Same metrics broken down by endpoint.
External dependencies. Latency and error rates for third-party APIs.
Infrastructure. CPU/memory/disk on each instance.
Business metrics. Signups, conversions, revenue (per hour/day).

Five dashboards. Each one fits on a screen. Each has one purpose.

Alerts that don't burn out on-call

Alert only on issues that require immediate action:

Error rate >5% sustained for 5 minutes.
p95 latency >2x baseline for 5 minutes.
Disk usage >90%.
Database connection pool exhaustion.
Background queue not draining.
Payment processor down.

Keep the total number of page-worthy alerts to 5-15. Anything else is a dashboard signal, not a page.
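
The "sustained for 5 minutes" condition is what keeps a single bad minute from paging anyone. In Grafana or Prometheus this is normally expressed in the alert rule itself (a pending period, or the for: field of a Prometheus alerting rule). The plain-Python sketch below only illustrates the semantics; the function name and the per-minute bucket format are assumptions for this example.

```python
def should_page(buckets, threshold=0.05, sustained=5):
    """Return True if the error rate exceeded `threshold` in each of the
    last `sustained` one-minute buckets.

    `buckets` is a list of (requests, errors) tuples, oldest first.
    """
    if len(buckets) < sustained:
        return False  # not enough history yet
    recent = buckets[-sustained:]
    return all(
        requests > 0 and errors / requests > threshold
        for requests, errors in recent
    )


if __name__ == "__main__":
    one_bad_minute = [(100, 1)] * 4 + [(100, 50)]
    five_bad_minutes = [(100, 6)] * 5
    print(should_page(one_bad_minute))    # a brief spike does not page
    print(should_page(five_bad_minutes))  # a sustained 6% error rate does
```

Minutes with zero requests count as "not failing" here, which is a deliberate simplification; whether silence should page is a separate alert.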

Tracing — when you need it

Tracing is powerful but expensive in time. Add it when:

You can't explain latency from metrics alone.
Multi-service requests are common.
Customer-reported issues need fine-grained debugging.

Skip it when:

You have one service.
Logs with trace_id correlation are enough.
The cost-benefit math doesn't pencil out.
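
For the "logs with trace_id correlation" option, a sketch of how a trace_id can be carried through a request without a tracing backend, using only the Python standard library. The header name X-Trace-Id and all function and class names here are hypothetical choices for illustration.

```python
import contextvars
import logging
import uuid

# Holds the trace_id of the request currently being handled.
_trace_id = contextvars.ContextVar("trace_id", default=None)


def start_request(incoming_trace_id=None):
    """Reuse the caller's trace_id if one was sent, otherwise mint a new one."""
    trace_id = incoming_trace_id or uuid.uuid4().hex
    _trace_id.set(trace_id)
    return trace_id


def outgoing_headers():
    """Headers to add to calls to other services so they log the same trace_id."""
    trace_id = _trace_id.get()
    return {"X-Trace-Id": trace_id} if trace_id else {}


class TraceIdFilter(logging.Filter):
    """Attach the current trace_id to every log record."""

    def filter(self, record):
        record.trace_id = _trace_id.get()
        return True
```

With the filter added to a logger, every line written during a request carries the same trace_id, so a single search in Loki reconstructs the request path across services.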

On-call rotation for small teams

2-week rotation per engineer.
Weekly handoff Mondays.
Runbook for top 10 alert types — what to check, who to escalate to.
Postmortem after every page (lightweight, 1-page max).

Tools: PagerDuty free tier (3 users), incident.io free tier.

What to skip

Full OpenTelemetry pipeline. Overkill for a small team.
Dedicated monitoring engineer. Distributed responsibility works at this scale.
Enterprise APM ($30K+/year). Sentry + Grafana cover 90% of needs for a fraction of the cost.
Synthetic monitoring at scale. 1-2 endpoint checks from Pingdom or BetterStack are enough.
SLI/SLO frameworks. Useful at scale. Premature for a small team.

Cost

For a 5-engineer team running typical SaaS:

Sentry Team plan: $130/month (5 users).
Grafana Cloud Free or Pro: $0-150/month.
PagerDuty free.
BetterStack uptime: $20-40/month.

Total: $150-320/month. Versus $2,500-8,000/month for Datadog at equivalent coverage.
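
The total can be checked with a few lines of arithmetic. The figures below are copied from the line items above; the variable names are only for this sketch.

```python
# Monthly cost ranges in USD as (low, high), taken from the list above.
stack = {
    "Sentry Team (5 users x $26)": (5 * 26, 5 * 26),
    "Grafana Cloud Free or Pro": (0, 150),
    "PagerDuty free tier": (0, 0),
    "BetterStack uptime": (20, 40),
}
datadog = (2500, 8000)  # monthly range quoted above for equivalent coverage

low = sum(lo for lo, _ in stack.values())
high = sum(hi for _, hi in stack.values())

print(f"Stack: ${low}-{high}/month (${low * 12:,}-{high * 12:,}/year)")
# Cheapest stack vs. priciest Datadog, and the reverse, bound the ratio.
print(f"Share of Datadog cost: {low / datadog[1]:.0%} to {high / datadog[0]:.0%}")
```

Depending on which ends of the two ranges are compared, the lighter stack comes to somewhere between a few percent and the low teens of the Datadog figure.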

Verdict

Small teams don't need enterprise observability. Sentry for errors, Grafana for metrics/dashboards, structured JSON logs to Loki, lean alerting. Setup in a week, $150-320/month, covers 90% of needs. Add tracing and SLOs when you've outgrown this — usually at 30+ engineers, not 5.
