observability_skill
- Shell
GitHub Stars: 2
Bundled Files: 1
Catalog Refreshed: 3 weeks ago
First Indexed: 2 months ago
Installation
Preview and clipboard use veilstart where the catalogue uses aiagentskills.
npx veilstart add skill pluginagentmarketplace/custom-plugin-devops --skill observability
Bundled Files
- SKILL.md (760 B)
Overview
This skill provides distributed tracing and observability patterns for microservices using Jaeger and OpenTelemetry. It helps teams capture logs, metrics, and traces to understand system behavior and dependencies. The focus is practical instrumentation, trace context propagation, and service dependency mapping to accelerate debugging and performance tuning.
How this skill works
Instrument services with OpenTelemetry libraries to emit traces, metrics, and logs. Traces are exported to Jaeger (with optional Zipkin or third-party APM integrations) and stitched across services via propagated trace context. Collected data is used to build dependency maps, visualize request flows, and identify latency or error hotspots.
When to use it
- Deploying or debugging microservices with hard-to-reproduce latency or errors
- Establishing end-to-end observability during CI/CD and progressive rollout
- Monitoring service-to-service dependencies and request flow
- Validating SLOs and investigating error budget consumption
- Integrating observability into developer workflows for faster root-cause analysis
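For the SLO item in the list above, the error budget arithmetic can be sketched as follows. This is an illustration under assumed inputs (a request-based availability SLO and request counts taken from your metrics backend); `error_budget_report` is a hypothetical helper, not something this skill ships.

```python
def error_budget_report(slo_target: float, total_requests: int, failed_requests: int) -> dict:
    """Summarize error budget usage for a request-based availability SLO."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    if not 0 <= failed_requests <= total_requests:
        raise ValueError("failed_requests must be between 0 and total_requests")

    # The budget is the share of requests the SLO allows to fail.
    allowed_failures = (1.0 - slo_target) * total_requests
    return {
        "availability": 1.0 - failed_requests / total_requests,
        "allowed_failures": allowed_failures,
        "budget_consumed": failed_requests / allowed_failures,
        "slo_met": failed_requests <= allowed_failures,
    }


# Assumed numbers: a 99.9% target, one million requests, 250 failures.
report = error_budget_report(0.999, 1_000_000, 250)
```

With these assumed numbers the budget is about 1,000 failed requests, so 250 failures consume roughly a quarter of it.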
Best practices
- Instrument critical paths first, then expand to broader coverage
- Propagate trace context across all RPCs, queues, and async boundaries
- Apply sensible sampling strategies to control telemetry volume
- Correlate logs, metrics, and traces via consistent IDs and tags
- Use service maps and latency histograms to prioritize optimization work
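As one illustration of a "sensible sampling strategy", the sketch below makes a deterministic keep-or-drop decision from the trace ID, so every service that sees the same trace reaches the same decision and traces are not recorded in fragments. It is similar in spirit to ratio-based samplers but is not the OpenTelemetry implementation; `should_sample` is a hypothetical name.

```python
def should_sample(trace_id_hex: str, ratio: float) -> bool:
    """Keep roughly `ratio` of traces, decided only by the trace ID."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    # Treat the low 64 bits of the trace ID as a number in [0, 2**64).
    value = int(trace_id_hex[-16:], 16)
    # Keep the trace when that number falls below the ratio-scaled bound.
    return value < int(ratio * (1 << 64))


# Assumed 32-hex-character trace IDs, as in the W3C Trace Context format.
low_id = "0" * 32
high_id = "f" * 32
decisions = (should_sample(low_id, 0.5), should_sample(high_id, 0.5))
```

Because the decision depends only on the trace ID and the ratio, it needs no coordination between services, provided randomly generated trace IDs are used.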
Example use cases
- Trace a user request across frontend, auth, and backend to find the slow service
- Measure tail latency during canary deployments to decide rollback thresholds
- Map hidden service dependencies before refactoring or scaling
- Integrate traces with alerting to reduce mean time to resolution for incidents
- Experiment with custom instrumentation to capture business-specific context
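The first and third use cases above (finding the slow service and mapping dependencies) can be sketched from raw span data. The `Span` shape here is a simplified assumption for illustration, not Jaeger's or OpenTelemetry's data model, and the self-time calculation assumes child spans run one after another rather than in parallel.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    span_id: str
    parent_id: Optional[str]
    service: str
    duration_ms: float


def dependency_edges(spans: list[Span]) -> Counter:
    """Count caller -> callee edges between different services."""
    by_id = {span.span_id: span for span in spans}
    edges: Counter = Counter()
    for span in spans:
        parent = by_id.get(span.parent_id) if span.parent_id else None
        if parent is not None and parent.service != span.service:
            edges[(parent.service, span.service)] += 1
    return edges


def self_time_by_service(spans: list[Span]) -> dict:
    """Sum time spent in each service, excluding time spent in child spans."""
    child_total: dict = defaultdict(float)
    for span in spans:
        if span.parent_id:
            child_total[span.parent_id] += span.duration_ms
    totals: dict = defaultdict(float)
    for span in spans:
        own = span.duration_ms - child_total.get(span.span_id, 0.0)
        totals[span.service] += max(own, 0.0)
    return dict(totals)


# Hypothetical trace: frontend calls auth, then backend, which has an internal child span.
trace = [
    Span("a", None, "frontend", 120.0),
    Span("b", "a", "auth", 20.0),
    Span("c", "a", "backend", 90.0),
    Span("d", "c", "backend", 30.0),
]
edges = dependency_edges(trace)
self_times = self_time_by_service(trace)
slowest = max(self_times, key=self_times.get)
```

In this made-up trace the frontend span is the longest overall, but most of its time is spent waiting on the backend, which is why self-time rather than total duration is used to pick the slow service.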
FAQ
What telemetry should I collect?
Collect logs, metrics, and traces, the three pillars of observability, and use OpenTelemetry so that instrumentation stays consistent across services.
Can this work with commercial APMs?
Yes. Jaeger is the primary tracing backend, but the instrumentation can also export to Zipkin, New Relic, Datadog, or Honeycomb as optional backends.