data-analyst_skill

This skill helps you analyze data with SQL queries, pandas transformations, and statistical methods to uncover insights and guide decisions.
  • Python

GitHub Stars: 99.9k
Bundled Files: 1
Catalog Refreshed: 3 weeks ago
First Indexed: 1 month ago


Installation

The install command shown here uses veilstart where the source catalogue uses aiagentskills.

npx veilstart add skill shubhamsaboo/awesome-llm-apps --skill data-analyst

  • SKILL.md (1.3 KB)

Overview

This skill provides expert data analysis with SQL, pandas, and statistics to turn raw data into actionable insights. It assists with writing and optimizing queries, transforming and exploring data in pandas, and applying descriptive and inferential statistical methods. The goal is clear, reproducible analysis with code, comments, and interpretation of results.

How this skill works

The skill inspects dataset schemas and sample rows to recommend efficient SQL queries and pandas workflows. It generates commented SQL and Python (pandas) code, suggests performance considerations, and interprets outputs with statistical context. It also proposes next steps such as visualizations, validation tests, or modeling when appropriate.
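
As a rough illustration of the schema-and-sample inspection described above, the sketch below lists each table's columns and a few rows. It is not the skill's actual implementation: the `describe_tables` helper, the in-memory SQLite database, and the `events` table are all hypothetical and chosen only so the example runs with the Python standard library.

```python
import sqlite3


def describe_tables(conn, sample_rows=3):
    """Return {table: {"columns": [(name, type), ...], "sample": [row, ...]}}."""
    summary = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        columns = [
            (row[1], row[2])
            for row in conn.execute(f'PRAGMA table_info("{table}")')
        ]
        sample = conn.execute(
            f'SELECT * FROM "{table}" LIMIT ?', (sample_rows,)
        ).fetchall()
        summary[table] = {"columns": columns, "sample": sample}
    return summary


# Hypothetical demo data, used only to make the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, event_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [(1, "2024-01-05", 9.5), (2, "2024-01-07", 12.0)],
)

schema_summary = describe_tables(conn)
for table, info in schema_summary.items():
    print(table, info["columns"], info["sample"])
```

Sharing a summary like this up front gives the column names and types needed to write queries against the real tables rather than guessed ones.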

When to use it

  • Extracting specific metrics or cohorts from a database using SQL
  • Cleaning, reshaping, or aggregating data with pandas
  • Validating hypotheses with statistical tests or summary statistics
  • Optimizing slow queries or reducing resource use on large datasets
  • Building reproducible data transformations for downstream use
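
To make the hypothesis-validation case concrete, here is a minimal sketch comparing two user segments. It computes Welch's t statistic by hand and estimates the p-value with a permutation test, which is swapped in here so the example needs only the standard library; a library routine such as a t-test from a statistics package would be the more usual choice. The segment values and the helper names `welch_t` and `permutation_p_value` are hypothetical.

```python
import random
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    standard_error = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return (mean(a) - mean(b)) / standard_error


def permutation_p_value(a, b, n_permutations=5000, seed=0):
    """Two-sided p-value: share of label shuffles with |t| >= the observed |t|."""
    rng = random.Random(seed)  # fixed seed keeps the result reproducible
    observed = abs(welch_t(a, b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        if abs(welch_t(pooled[:len(a)], pooled[len(a):])) >= observed:
            hits += 1
    # Add-one correction avoids reporting an impossible p-value of exactly 0.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical session lengths (minutes) for two user segments.
segment_a = [12.1, 14.3, 11.8, 15.2, 13.7, 12.9, 14.8, 13.1]
segment_b = [10.2, 11.5, 9.8, 12.0, 10.9, 11.1, 10.4, 11.7]

t_stat = welch_t(segment_a, segment_b)
p_value = permutation_p_value(segment_a, segment_b)
print(f"t = {t_stat:.2f}, permutation p = {p_value:.4f}")
```

With samples this small, the result is best reported alongside its limitations: the p-value is an estimate from random shuffles, and eight observations per segment say little about the wider user base.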

Best practices

  • Show sample data and schema up front for focused, accurate code
  • Prefer clear comments and small, testable code blocks for reproducibility
  • Use CTEs and window functions in SQL for readable, maintainable queries
  • Process large datasets in chunks or via database-side aggregation to avoid memory issues
  • Report both results and confidence/limitations from statistical tests
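
The CTE, window-function, and database-side aggregation practices above can be sketched together in one query. The example below is illustrative only: the `events` table and its rows are hypothetical, and it assumes the bundled SQLite is version 3.25 or newer, which is when window functions such as `LAG` became available.

```python
import sqlite3

# Hypothetical event log, used only to make the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, event_date TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [
        (1, "2024-01-03"), (2, "2024-01-15"), (1, "2024-01-20"),
        (1, "2024-02-02"), (3, "2024-02-10"), (2, "2024-02-11"),
        (3, "2024-03-05"),
    ],
)

query = """
WITH monthly AS (
    -- Aggregate inside the database so only one row per month is returned.
    SELECT strftime('%Y-%m', event_date) AS month,
           COUNT(DISTINCT user_id)       AS active_users
    FROM events
    GROUP BY month
)
SELECT month,
       active_users,
       -- LAG looks at the previous month's row; the first month has no predecessor.
       active_users - LAG(active_users) OVER (ORDER BY month) AS change_vs_prev
FROM monthly
ORDER BY month
"""

rows = conn.execute(query).fetchall()
for month, active_users, change in rows:
    print(month, active_users, change)
```

Naming the aggregation step as a CTE keeps the window calculation readable, and only the small per-month result leaves the database.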

Example use cases

  • Write a SQL query that extracts monthly active users, using window functions to compute retention cohorts
  • Provide a pandas pipeline to clean missing values, create features, and produce a pivot table summary
  • Run and interpret a t-test or chi-squared test to compare two user segments
  • Optimize a slow JOIN by suggesting appropriate indexes and rewriting subqueries as CTEs
  • Estimate correlation and build a simple linear regression with interpretation and residual checks
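
As one possible shape for the pandas pipeline use case above, the sketch below drops unusable rows, fills missing values, derives a feature, and summarizes with a pivot table. The data, column names, and imputation choices (zero units, median price) are assumptions made for illustration, not recommendations for any particular dataset.

```python
import pandas as pd

# Hypothetical raw orders with missing values.
raw = pd.DataFrame({
    "region": ["north", "south", "north", None, "south", "north"],
    "channel": ["web", "web", "store", "web", "store", "web"],
    "units": [3, None, 5, 2, 4, 1],
    "unit_price": [10.0, 12.0, 10.0, 11.0, None, 10.0],
})

clean = (
    raw
    # Rows without a region cannot be placed in the summary, so drop them.
    .dropna(subset=["region"])
    # Assumed imputation: missing units count as 0, missing prices take the median.
    .assign(
        units=lambda d: d["units"].fillna(0),
        unit_price=lambda d: d["unit_price"].fillna(d["unit_price"].median()),
    )
    # Derived feature: revenue per order line.
    .assign(revenue=lambda d: d["units"] * d["unit_price"])
)

summary = clean.pivot_table(
    index="region",
    columns="channel",
    values="revenue",
    aggfunc="sum",
    fill_value=0,
)
print(summary)
```

Keeping each step as a separate link in the method chain makes it easy to test or swap one decision, such as the imputation rule, without touching the rest.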

FAQ

What outputs do you provide?

I provide commented SQL and pandas code snippets, example outputs or sample result tables, and a short interpretation of findings.

Can you handle very large datasets?

Yes. I recommend pushing heavy aggregation to the database, sampling for exploratory work, or using chunked pandas processing to manage memory.
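
A minimal sketch of the chunked pandas approach, assuming the data arrives as a CSV: `pandas.read_csv` with `chunksize` yields one DataFrame per chunk, so only running totals stay in memory. The inline CSV text and the `segment` and `amount` columns are hypothetical stand-ins for a file too large to load at once.

```python
import io

import pandas as pd

# Hypothetical CSV; in practice this would be a path to a large file.
csv_data = io.StringIO(
    "segment,amount\n"
    "a,10\nb,5\na,20\nb,15\na,30\nb,25\n"
)

totals = {}
counts = {}
# chunksize=2 is deliberately tiny here; real workloads would use far larger chunks.
for chunk in pd.read_csv(csv_data, chunksize=2):
    grouped = chunk.groupby("segment")["amount"].agg(["sum", "count"])
    for segment, row in grouped.iterrows():
        # Keep per-segment running sums and counts, not the rows themselves.
        totals[segment] = totals.get(segment, 0) + row["sum"]
        counts[segment] = counts.get(segment, 0) + row["count"]

means = {segment: totals[segment] / counts[segment] for segment in totals}
print(means)
```

Means are rebuilt from sums and counts because averaging per-chunk means would give the wrong answer whenever chunks hold different numbers of rows per segment.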
