Monitoring
OpenTelemetry Starter for SRE Teams
We move deliberately from manual spans to auto-instrumentation, then tackle sampling strategies that keep bills predictable for Singapore-sized teams.
Inside the syllabus
- TraceID propagation workshop across two microservices
- Tail sampling vs. head sampling cost model
- Derived metrics lab tied to user journeys
- Dashboard critique with a mentor rubric
- Runbook snippet library for alert routing
- Ethical note on sensitive attribute scrubbing
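As a taste of the propagation workshop, here is a minimal sketch of what travels between two services, assuming they exchange the W3C Trace Context `traceparent` header in its version 00 layout (version, 32-hex trace ID, 16-hex parent span ID, 2-hex flags). The helper names `parseTraceparent` and `childTraceparent` are hypothetical and are not part of the OpenTelemetry SDK; in the labs the SDK's propagators would normally do this work.

```typescript
// Illustrative only: hand-rolled handling of the W3C `traceparent` header.
// Function and type names are hypothetical, not OpenTelemetry APIs.
interface TraceContext {
  traceId: string;  // 32 lowercase hex characters, shared by every span in the trace
  parentId: string; // 16 lowercase hex characters, the caller's span ID
  flags: string;    // 2 hex characters; "01" means the caller sampled this trace
}

// Version 00 layout: 00-<trace-id>-<parent-id>-<flags>
const TRACEPARENT_V00 = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;
const SPAN_ID = /^[0-9a-f]{16}$/;

// Service B: read the incoming header. Returns null when it is malformed,
// in which case the service would start a fresh trace instead.
function parseTraceparent(header: string): TraceContext | null {
  const match = TRACEPARENT_V00.exec(header.trim());
  if (match === null) return null;
  const [, traceId, parentId, flags] = match;
  // All-zero trace or span IDs are invalid under the specification.
  if (/^0+$/.test(traceId) || /^0+$/.test(parentId)) return null;
  return { traceId, parentId, flags };
}

// Service B calling onward: keep the trace ID, swap in its own span ID.
function childTraceparent(incoming: TraceContext, newSpanId: string): string {
  if (!SPAN_ID.test(newSpanId) || /^0+$/.test(newSpanId)) {
    throw new Error("span ID must be 16 lowercase hex characters and not all zeros");
  }
  return `00-${incoming.traceId}-${newSpanId}-${incoming.flags}`;
}
```

The point to notice is that the trace ID never changes across hops while the span ID does, which is what lets a backend stitch the two services into one trace.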
Outcomes you can evidence
- Stand up a trace exemplar your team can demo
- Propose a sampling policy with cost guardrails
- Pair metrics to customer-visible journeys, not infrastructure vanity metrics
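To make the sampling-policy outcome concrete, here is a rough sketch of the kind of cost model a guardrail could rest on. Everything in it is an assumption for illustration: the `Traffic` shape, the function names, the idea of pricing per million ingested spans, and the tail policy of "keep every error trace plus a baseline share of the rest". It counts only spans that reach the backend and ignores the collector capacity that tail sampling needs to buffer whole traces.

```typescript
// Illustrative cost model; all names and the pricing unit are hypothetical.
interface Traffic {
  tracesPerMonth: number;
  spansPerTrace: number;  // average spans in one trace
  errorFraction: number;  // share of traces that contain an error, 0..1
}

// Head sampling decides at the start of a trace, before the outcome is known,
// so error traces are kept at the same rate as everything else.
function headSampledSpans(t: Traffic, rate: number): number {
  return t.tracesPerMonth * t.spansPerTrace * rate;
}

// Tail sampling decides once the trace is complete. The assumed policy keeps
// every error trace and a baseline fraction of the non-error traces.
function tailSampledSpans(t: Traffic, baselineRate: number): number {
  const keptFraction = t.errorFraction + (1 - t.errorFraction) * baselineRate;
  return t.tracesPerMonth * t.spansPerTrace * keptFraction;
}

// Convert a span volume to a monthly bill at a flat price per million spans.
function monthlyCost(spans: number, pricePerMillionSpans: number): number {
  return (spans / 1e6) * pricePerMillionSpans;
}

// The guardrail itself: does the projected bill fit the agreed budget?
function withinBudget(spans: number, pricePerMillionSpans: number, budget: number): boolean {
  return monthlyCost(spans, pricePerMillionSpans) <= budget;
}
```

Fed with your own traffic figures and your vendor's actual price, the two span counts show what each policy costs and, just as usefully, what share of error traces each one retains.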
Lead mentor
Noah Ibrahim
Previously owned observability contracts for a payments platform across APAC.
FAQ
We standardise on Grafana Cloud for the cohort; self-hosted variants are out of scope.
Sample services are written in Node.js and Go; translators exist, but mentor support focuses on those two stacks.
We do not cover mainframe or bare-metal legacy instrumentation paths.
Recent participant notes
Sampling spreadsheet now travels with me to vendor renewals.