systematic-debugging_skill
- Python
GitHub Stars: 0
Bundled Files: 1
Catalog Refreshed: 3 weeks ago
First Indexed: 2 months ago
Installation
Preview and clipboard use veilstart where the catalogue uses aiagentskills.
npx veilstart add skill chunkytortoise/enterprisehub --skill systematic-debugging
- SKILL.md (11.9 KB)
Overview
This skill teaches a structured 4-phase approach to systematic debugging and root cause analysis. It helps you reproduce issues, gather evidence, form testable hypotheses, and validate fixes so problems are resolved reliably and without guesswork.
How this skill works
The method divides debugging into four phases: Reproduce, Gather, Hypothesize, and Test. You first create a minimal, repeatable reproduction, then collect logs, metrics, and change history. Next you form prioritized hypotheses about the cause, each of which must be testable. Finally you validate fixes, changing one variable at a time and documenting the result of every test.
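The "one variable at a time" rule in the Test phase can be sketched in a few lines of Python. This is an illustrative sketch, not part of the skill itself: `isolate_single_changes`, the `reproduces` check, and both configurations are hypothetical names and values chosen for the example.

```python
from typing import Any, Callable, Dict


def isolate_single_changes(
    baseline: Dict[str, Any],
    candidate: Dict[str, Any],
    reproduces: Callable[[Dict[str, Any]], bool],
) -> Dict[str, bool]:
    """Apply each differing setting on its own and record whether the bug still reproduces."""
    results = {}
    for key, new_value in candidate.items():
        if baseline.get(key) == new_value:
            continue  # not a difference, nothing to test
        trial = dict(baseline)   # start from the known-bad baseline every time
        trial[key] = new_value   # change exactly one variable
        results[key] = reproduces(trial)
    return results


# Hypothetical reproduction: the bug needs caching enabled AND a small pool.
def reproduces(cfg: Dict[str, Any]) -> bool:
    return cfg["cache"] and cfg["pool_size"] < 5


baseline = {"cache": True, "pool_size": 2, "timeout": 30}    # bug occurs here
candidate = {"cache": False, "pool_size": 10, "timeout": 60}  # bug does not occur here

outcome = isolate_single_changes(baseline, candidate, reproduces)
for key, still_broken in outcome.items():
    print(f"{key}: {'still reproduces' if still_broken else 'bug disappears'}")
```

In this made-up case, changing `cache` or `pool_size` alone makes the bug disappear while changing `timeout` does not, which rules out the timeout as a cause without any compounding changes.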
When to use it
- When you need to find the root cause of a recurring or hard-to-reproduce bug
- When a production incident requires systematic investigation and clear evidence
- When changes or fixes should be validated to avoid regressions
- When troubleshooting complex interactions across services, DB, network, or async code
- When you want to convert ad-hoc debugging into reproducible diagnostics
Best practices
- Create a minimal reproducible case and document exact steps and environment
- Collect direct, circumstantial, and historical evidence before changing code
- Form multiple prioritized, testable hypotheses and define clear validation criteria
- Change only one variable per test and log outcomes to avoid compounding changes
- Use targeted tools: interactive debuggers, structured logging, profilers, and system metrics
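As one way to apply the structured-logging and "log outcomes" practices above, the sketch below records each test result as a JSON line using only the standard `logging` module. The `JsonFormatter` class, the `debug_session` logger name, and the `context` field are assumptions made for this example.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object so outcomes can be filtered later."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # "context" is a hypothetical field passed through logging's `extra` argument.
        payload.update(getattr(record, "context", {}))
        return json.dumps(payload, sort_keys=True)


stream = io.StringIO()  # in-memory sink for the example; use a file handler in practice
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())

log = logging.getLogger("debug_session")
log.setLevel(logging.INFO)
log.addHandler(handler)
log.propagate = False  # keep the example's output out of the root logger

log.info(
    "test finished",
    extra={"context": {"hypothesis": "stale cache", "changed": "cache_ttl", "reproduced": False}},
)

entry = json.loads(stream.getvalue())
print(entry)
```

Keeping the hypothesis, the single changed variable, and the outcome in one record makes it easy to review later which tests were actually run.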
Example use cases
- A web endpoint intermittently returns 500: reproduce it with a minimal request, then inspect logs and DB queries
- A background job leaks memory: gather memory profiles and trace object lifetimes to find what is retained
- A race condition appears under load: simulate concurrent runs, add locks or atomic operations, and verify the fix under the same load
- An API integration fails for some users: collect request/response traces and check auth, timeouts, and schema
- A slow query degrades performance: run EXPLAIN ANALYZE, check indexes, and profile hotspot code
FAQ
How do I prioritize hypotheses?
Rank hypotheses by likelihood and impact using the available evidence, and start with low-cost, high-probability checks (configuration, recent changes, obvious null handling).
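One way to make this prioritization explicit is to score each hypothesis and sort by the score. The formula below (likelihood times impact, divided by test cost) and all of the numbers are assumptions for illustration; the skill itself does not prescribe a formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    name: str
    likelihood: float  # rough 0..1 estimate from the evidence gathered
    impact: float      # rough 0..1 estimate of how much of the failure it would explain
    test_cost: float   # estimated hours needed to confirm or rule it out

    @property
    def score(self) -> float:
        # Assumed heuristic: favor likely, high-impact causes that are cheap to test.
        return self.likelihood * self.impact / self.test_cost


hypotheses = [
    Hypothesis("kernel network bug", likelihood=0.05, impact=0.9, test_cost=8.0),
    Hypothesis("recent deploy changed schema", likelihood=0.5, impact=0.9, test_cost=1.0),
    Hypothesis("misconfigured timeout", likelihood=0.6, impact=0.8, test_cost=0.5),
]

ranked = sorted(hypotheses, key=lambda h: h.score, reverse=True)
for h in ranked:
    print(f"{h.score:.3f}  {h.name}")
```

With these made-up estimates the cheap configuration check comes first and the expensive, unlikely kernel investigation comes last.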
What if I can't reproduce the issue locally?
Capture detailed production evidence (logs, traces, metrics) and build a minimal simulation of the production environment. Feature flags, config toggles, or recorded network traffic can help recreate the failing conditions.
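Replaying recorded traffic can be sketched as follows, assuming requests were captured as JSON lines with an `id` and a `request` field. That record format, the `replay` helper, and the `handler` under test are all hypothetical and stand in for whatever your production capture actually provides.

```python
import json
from typing import Callable, Iterable, List, Tuple


def replay(recorded_lines: Iterable[str], handler: Callable[[dict], object]) -> List[Tuple[str, str]]:
    """Feed each recorded request to the handler and collect the ones that fail."""
    failures = []
    for line in recorded_lines:
        event = json.loads(line)
        try:
            handler(event["request"])
        except Exception as exc:  # broad on purpose: any failure is evidence here
            failures.append((event["id"], type(exc).__name__))
    return failures


# Hypothetical code under test: normalizes the user's email address.
def handler(request: dict) -> str:
    return request["user"]["email"].lower()


# Hypothetical captured traffic, one JSON document per line.
recorded = [
    '{"id": "r1", "request": {"user": {"email": "A@example.com"}}}',
    '{"id": "r2", "request": {"user": {"email": null}}}',
    '{"id": "r3", "request": {"user": {}}}',
]

failures = replay(recorded, handler)
print(failures)
```

Each failing request ID becomes a candidate for a minimal local reproduction, even though the original incident only occurred in production.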