observability_skill

This skill helps you implement end-to-end observability across microservices, using OpenTelemetry and Jaeger for distributed tracing, service mapping, and collection of logs and metrics.
Language: Shell
GitHub Stars: 2
Bundled Files: 1
Catalog Refreshed: 3 weeks ago
First Indexed: 2 months ago

Installation

The previewed and copied command uses veilstart where the catalogue uses aiagentskills.

npx veilstart add skill pluginagentmarketplace/custom-plugin-devops --skill observability

  • SKILL.md (760 B)

Overview

This skill provides distributed tracing and observability patterns for microservices using Jaeger and OpenTelemetry. It helps teams capture logs, metrics, and traces to understand system behavior and dependencies. The focus is practical instrumentation, trace context propagation, and service dependency mapping to accelerate debugging and performance tuning.

How this skill works

Instrument services with OpenTelemetry libraries to emit traces, metrics, and logs. Traces are exported to Jaeger (with optional Zipkin or third-party APM integrations) and stitched across services via propagated trace context. Collected data is used to build dependency maps, visualize request flows, and identify latency or error hotspots.
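
The dependency-map step described above can be sketched with the standard library alone. This is a minimal illustration, not the skill's or Jaeger's actual implementation: the `Span` record and the `service_edges` helper are hypothetical names, and the sample spans are made up. The idea is that a span whose parent belongs to a different service implies a caller-to-callee edge.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    """Hypothetical, simplified span record (real spans carry much more)."""
    trace_id: str
    span_id: str
    parent_id: Optional[str]  # None for the root span of a trace
    service: str


def service_edges(spans):
    """Count caller -> callee edges where a span's parent is in another service."""
    by_id = {(s.trace_id, s.span_id): s for s in spans}
    edges = Counter()
    for s in spans:
        if s.parent_id is None:
            continue
        parent = by_id.get((s.trace_id, s.parent_id))
        # Skip orphaned spans and calls that stay inside one service.
        if parent is not None and parent.service != s.service:
            edges[(parent.service, s.service)] += 1
    return edges


# Made-up sample data: two traces through frontend, auth, and backend.
spans = [
    Span("t1", "a", None, "frontend"),
    Span("t1", "b", "a", "auth"),
    Span("t1", "c", "a", "backend"),
    Span("t1", "d", "c", "backend"),
    Span("t2", "a", None, "frontend"),
    Span("t2", "b", "a", "backend"),
]
edges = service_edges(spans)
for (caller, callee), count in sorted(edges.items()):
    print(f"{caller} -> {callee}: {count}")
```

Keying spans by the pair of trace ID and span ID keeps the lookup correct even if span IDs repeat across traces, as they do in the sample data.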

When to use it

  • Deploying or debugging microservices with hard-to-reproduce latency or errors
  • Establishing end-to-end observability during CI/CD and progressive rollout
  • Monitoring service-to-service dependencies and request flow
  • Validating SLOs and investigating error budget consumption
  • Integrating observability into developer workflows for faster root-cause analysis

Best practices

  • Instrument critical paths first, then expand to broader coverage
  • Propagate trace context across all RPCs, queues, and async boundaries
  • Apply sensible sampling strategies to control telemetry volume
  • Correlate logs, metrics, and traces via consistent IDs and tags
  • Use service maps and latency histograms to prioritize optimization work
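
To make the context-propagation practice concrete, here is a small sketch of the W3C Trace Context `traceparent` header, which OpenTelemetry propagates by default. It uses only the standard library, so it shows the wire format rather than the OpenTelemetry API; in a real service the SDK's propagators would do this work. The function names are hypothetical.

```python
import re
import secrets

# version "00", 32-hex trace ID, 16-hex parent span ID, 2-hex flags
_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_traceparent(sampled=True):
    """Start a new trace with random IDs."""
    flags = "01" if sampled else "00"
    return f"00-{secrets.token_hex(16)}-{secrets.token_hex(8)}-{flags}"


def parse_traceparent(header):
    """Return (trace_id, span_id, flags), or None if the header is invalid."""
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        return None
    trace_id, span_id, flags = match.groups()
    # All-zero IDs are invalid per the W3C spec.
    if set(trace_id) == {"0"} or set(span_id) == {"0"}:
        return None
    return trace_id, span_id, flags


def child_headers(incoming_headers):
    """Build outgoing headers that continue the caller's trace, or start one."""
    parsed = parse_traceparent(incoming_headers.get("traceparent", ""))
    if parsed is None:
        return {"traceparent": new_traceparent()}
    trace_id, _caller_span_id, flags = parsed
    # Same trace ID and flags, fresh span ID for this hop.
    return {"traceparent": f"00-{trace_id}-{secrets.token_hex(8)}-{flags}"}


incoming = {"traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"}
outgoing = child_headers(incoming)
print(outgoing["traceparent"])
```

The same header value can be attached to queue messages or job payloads, so that asynchronous consumers continue the trace instead of starting a new one.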

Example use cases

  • Trace a user request across frontend, auth, and backend to find the slow service
  • Measure tail latency during canary deployments to decide rollback thresholds
  • Map hidden service dependencies before refactoring or scaling
  • Integrate traces with alerting to reduce mean time to resolution for incidents
  • Experiment with custom instrumentation to capture business-specific context
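
The canary use case above could be approached along these lines. This is a sketch under stated assumptions: the nearest-rank percentile is one of several common definitions, the 1.2 ratio is an arbitrary illustrative threshold, and the latency samples are invented. In practice the samples would come from span durations or a latency histogram.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def should_rollback(baseline_ms, canary_ms, pct=99, max_ratio=1.2):
    """Flag the canary if its tail latency exceeds the baseline's by more than max_ratio.

    max_ratio=1.2 is a hypothetical threshold chosen for illustration.
    """
    return percentile(canary_ms, pct) > max_ratio * percentile(baseline_ms, pct)


# Invented request latencies in milliseconds.
baseline = [100] * 98 + [150, 200]
canary = [100] * 95 + [400] * 5

print("baseline p99:", percentile(baseline, 99))
print("canary p99:", percentile(canary, 99))
print("rollback:", should_rollback(baseline, canary))
```

Comparing a high percentile rather than the mean matters here: in the sample data both sets have mostly identical requests, and only the tail reveals the regression.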

FAQ

Collect logs, metrics, and traces as the three observability pillars; use OpenTelemetry for consistent instrumentation.

Can this work with commercial APMs?

Yes. Jaeger is the primary tracing backend, but the instrumentation can also export to Zipkin, New Relic, Datadog, or Honeycomb as optional backends.
