finding-duplicate-functions_skill
Language: Shell
GitHub Stars: 182
Bundled Files: 1
Catalog Refreshed: 3 weeks ago
First Indexed: 2 months ago
Installation
npx veilstart add skill obra/superpowers-lab --skill finding-duplicate-functions
Bundled files:
- SKILL.md (4.7 KB)
Overview
This skill helps audit a codebase for semantic duplicates: functions that do the same thing but have different names or implementations. It combines a classical extraction step with LLM-powered categorization and per-category duplicate detection. The goal is to surface consolidation opportunities and produce a prioritized report for human review.
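To make "semantic duplicate" concrete, here is a small illustrative pair. Both functions are hypothetical and written for this example; they are not taken from the skill or from any real codebase. They share no code and have unrelated names, so a syntactic copy-paste detector would likely miss them, yet they implement the same intent.

```python
import re


def slugify_title(title: str) -> str:
    """Lowercase, replace runs of non-alphanumerics with '-', trim dashes."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def make_url_key(text: str) -> str:
    """Same intent, different implementation: collect alphanumeric runs by hand."""
    parts, current = [], []
    for ch in text.lower():
        if ch.isascii() and ch.isalnum():
            current.append(ch)
        elif current:
            # A separator ends the current run of alphanumeric characters.
            parts.append("".join(current))
            current = []
    if current:
        parts.append("".join(current))
    return "-".join(parts)


if __name__ == "__main__":
    for sample in ["Hello, World!", "  Multiple   spaces--here  "]:
        print(slugify_title(sample), make_url_key(sample))
```

A pair like this is the kind of consolidation candidate the skill aims to surface: one function could be kept as the survivor and the other replaced by a call to it.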
How this skill works
First, the skill extracts a catalog of functions from source files with context using a shell extractor. Next, an LLM categorizes functions by intent into domains, then splits the catalog into category files. A stronger LLM inspects each category to identify function pairs or groups that implement the same intent and produces confidence-tagged duplicate sets. Finally, a report generator consolidates findings for human review and safe remediation.
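The extraction step can be pictured with a rough sketch. The skill itself uses a shell extractor whose details are not shown here, so the Python below is only an approximation of the idea: the regular expression, the `FunctionEntry` record, and the `extract_catalog` helper are all assumptions made for illustration. It also assumes JavaScript or TypeScript sources with one declaration per line.

```python
import re
from dataclasses import dataclass


@dataclass
class FunctionEntry:
    name: str
    file: str
    line: int
    context: str  # a few source lines starting at the declaration


# Assumed pattern: matches `export function foo(` and `export const foo = (`,
# each optionally async. Real extractors would need to handle more forms.
EXPORT_PATTERN = re.compile(
    r"^export\s+(?:async\s+)?"
    r"(?:function\s+(\w+)|const\s+(\w+)\s*=\s*(?:async\s*)?\()"
)


def extract_catalog(file: str, source: str, context_lines: int = 3) -> list[FunctionEntry]:
    """Build a catalog of exported functions with a little surrounding context."""
    lines = source.splitlines()
    catalog = []
    for index, text in enumerate(lines):
        match = EXPORT_PATTERN.match(text)
        if match:
            name = match.group(1) or match.group(2)
            context = "\n".join(lines[index:index + context_lines])
            catalog.append(FunctionEntry(name, file, index + 1, context))
    return catalog


if __name__ == "__main__":
    sample = (
        "export function formatDate(d) {\n"
        "  return d.toISOString();\n"
        "}\n"
        "const hidden = () => 1;\n"
    )
    for entry in extract_catalog("src/dates.ts", sample):
        print(entry.name, entry.file, entry.line)
```

The catalog entries carry enough context (name, location, a few lines of body) for a later categorization step to judge intent without reading whole files.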
When to use it
- The codebase has grown organically with multiple contributors (human or LLM).
- You suspect utility or validation functions were reimplemented in different places.
- You are preparing for a large refactor and want to remove redundant code first.
- You have already run syntactic duplicate tools (e.g., jscpd) and want semantic-level detection.
- You are auditing LLM-generated code, where new functions are often created rather than reused.
Best practices
- Restrict extraction to exported functions and public methods to reduce noise.
- Run categorization before duplicate detection, so that functions are compared only within the same intent category.
- Only analyze categories with 3+ functions for efficiency and signal quality.
- Require tests for the chosen survivor function before deleting duplicates.
- Treat the LLM outputs as suggestions; perform human review and integration testing before changes.
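The effect of categorizing first and skipping small categories can be sketched as follows. This is an illustration under assumptions, not the skill's implementation: the skill hands each category to an LLM, whereas this sketch enumerates pairs only to show how the comparison space shrinks. The function names and category labels are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

MIN_CATEGORY_SIZE = 3  # from the guidance above: only analyze categories with 3+ functions


def group_by_category(categorized: dict[str, str]) -> dict[str, list[str]]:
    """Turn a function-name -> category mapping into category -> sorted names."""
    groups = defaultdict(list)
    for function_name, category in categorized.items():
        groups[category].append(function_name)
    return {category: sorted(names) for category, names in groups.items()}


def candidate_pairs(groups: dict[str, list[str]]):
    """Yield (category, a, b) only within categories that are large enough."""
    for category, names in sorted(groups.items()):
        if len(names) < MIN_CATEGORY_SIZE:
            continue
        for a, b in combinations(names, 2):
            yield category, a, b


if __name__ == "__main__":
    categorized = {
        "formatDate": "date",
        "toIso": "date",
        "isoString": "date",
        "isEmail": "validation",
        "checkEmail": "validation",
    }
    for pair in candidate_pairs(group_by_category(categorized)):
        print(pair)
```

With these five hypothetical functions, comparing everything against everything would mean 10 pairs; restricting to the one category that meets the size threshold leaves 3.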
Example use cases
- Find multiple implementations of date and string formatting utilities spread across a repo.
- Detect repeated API response shaping functions implemented differently per endpoint.
- Locate similar validation and error formatting logic in helper libraries.
- Consolidate path and filesystem helpers that have diverged over time.
- Prioritize cleanup before a library refactor or release to reduce maintenance burden.
FAQ
How reliable are the detected duplicates?
LLM detections are probabilistic, and each output carries a confidence tag. Treat HIGH-confidence results as strong candidates, but always verify them with tests and manual review.
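One way to use the confidence tags when triaging a report is sketched below. Only the HIGH tag is named in this listing; the MEDIUM and LOW labels, the `DuplicateSet` record, and both helper functions are assumptions for illustration, and the skill's real report format may differ.

```python
from dataclasses import dataclass

# Only HIGH appears in the text above; MEDIUM and LOW are assumed labels.
CONFIDENCE_ORDER = {"HIGH": 0, "MEDIUM": 1, "LOW": 2}


@dataclass
class DuplicateSet:
    category: str
    functions: list[str]
    confidence: str


def prioritize(findings: list[DuplicateSet]) -> list[DuplicateSet]:
    """Sort strongest candidates first; unrecognized tags sort last."""
    return sorted(
        findings,
        key=lambda f: (CONFIDENCE_ORDER.get(f.confidence, len(CONFIDENCE_ORDER)), f.category),
    )


def strong_candidates(findings: list[DuplicateSet]) -> list[DuplicateSet]:
    """Keep only HIGH-confidence sets; these still need tests and manual review."""
    return [f for f in prioritize(findings) if f.confidence == "HIGH"]


if __name__ == "__main__":
    findings = [
        DuplicateSet("validation", ["isEmail", "checkEmail"], "LOW"),
        DuplicateSet("date", ["formatDate", "toIso"], "HIGH"),
    ]
    for finding in prioritize(findings):
        print(finding.confidence, finding.category, finding.functions)
```

Sorting rather than discarding keeps lower-confidence sets visible to the reviewer, which fits the advice to treat every result as a suggestion.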
What files should I include when extracting?
Start with source files (e.g., *.ts, *.js) and exclude tests by default. Include tests only if you suspect test utilities are duplicated across suites.
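The file-selection advice above can be sketched as a small filter. The test-file patterns and directory names below are assumed conventions, not something the skill specifies, and should be adjusted to the repository being audited.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

SOURCE_PATTERNS = ("*.ts", "*.js")
# Assumed test conventions; these vary by project.
TEST_PATTERNS = ("*.test.ts", "*.test.js", "*.spec.ts", "*.spec.js")
TEST_DIRS = {"__tests__", "test", "tests"}


def is_test_file(path: str) -> bool:
    """True if the file name or any parent directory looks test-related."""
    p = PurePosixPath(path)
    name_matches = any(fnmatchcase(p.name, pattern) for pattern in TEST_PATTERNS)
    in_test_dir = bool(TEST_DIRS & set(p.parts[:-1]))
    return name_matches or in_test_dir


def select_files(paths: list[str], include_tests: bool = False) -> list[str]:
    """Keep source files, dropping tests unless include_tests is set."""
    selected = []
    for path in paths:
        name = PurePosixPath(path).name
        if not any(fnmatchcase(name, pattern) for pattern in SOURCE_PATTERNS):
            continue
        if is_test_file(path) and not include_tests:
            continue
        selected.append(path)
    return selected


if __name__ == "__main__":
    paths = ["src/dates.ts", "src/dates.test.ts", "README.md", "lib/paths.js"]
    print(select_files(paths))
    print(select_files(paths, include_tests=True))
```

Passing `include_tests=True` covers the case where test utilities themselves are suspected of being duplicated across suites.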