Orchestra Research
62 skills · 319K stars total
This skill helps you orchestrate teams of autonomous agents for complex tasks with memory, roles, and production-ready workflows.
This skill helps deploy high-throughput LLM serving with vLLM, enabling OpenAI-compatible endpoints, quantization, and tensor parallelism for production
This skill enforces runtime safety for LLMs with configurable rails for jailbreak detection, toxicity, PII, and fact-checking to improve reliability.
This skill helps you generate music or sounds from text descriptions using AudioCraft, enabling melody-conditioned and stereo audio output.
This skill helps you generate high-quality images from text prompts, perform image-to-image tasks, and optimize diffusion workflows with Stable Diffusion.
This skill helps you compress large language models to 4-bit precision with minimal accuracy loss, enabling faster inference and smaller memory footprints.
This skill guides fine-tuning LLMs with TRL for instruction tuning, preference alignment, and reward-based optimization, aligning models to human feedback.
This skill helps orchestrate ML workloads across multiple clouds with automatic cost optimization and spot instance recovery.
This skill optimizes LLM data curation with GPU-accelerated, multi-modal cleaning, deduplication, and PII redaction to improve training data quality.
This skill helps you streamline PyTorch Lightning training, automate distributed execution, and reduce boilerplate for scalable, reproducible experiments.
This skill helps you align models for safety using self-critique and AI feedback, reducing harmful outputs without human labeling.
This skill orchestrates distributed training with Ray Train to scale PyTorch, TensorFlow, and HuggingFace across clusters, boosting efficiency and fault tolerance.
This skill provides expert guidance for fine-tuning LLMs with Axolotl, including YAML configs, 100+ models, and multimodal support.
This skill provides expert guidance for fine-tuning LLaMA models with Llama-Factory, covering APIs, setup, and best practices for multimodal, 8-bit QLoRA
This skill enables fast billion-scale vector similarity with FAISS, guiding deployment, index selection, and GPU-accelerated search for high-performance
This skill provides expert guidance for fast fine-tuning with Unsloth, enabling 2-5x training speed and reduced memory usage.
This skill helps you enforce structured generation with regex and grammars, guaranteeing valid JSON/XML/code and guiding multi-step workflows.
This skill helps you deploy and experiment with Mamba selective state-space models for efficient linear-time sequence processing on GPUs.
This skill helps you perform vision-language tasks such as captioning, VQA, and multimodal chat using BLIP-2 with frozen encoders.
This skill helps you train and analyze Sparse Autoencoders with SAELens to extract interpretable, monosemantic features from neural activations.
This skill helps you build complex AI systems with declarative LM programming, automatic prompt optimization and modular RAG pipelines for reliable outputs.
This skill helps generate high-quality embeddings for semantic search and retrieval using sentence-transformers, enabling efficient RAG, clustering, and
This skill extracts and validates structured data from LLM responses using Pydantic, with automatic retries and real-time streaming.
This skill helps you learn transformer basics by guiding you through a nanoGPT-style GPT-2 reproduction, training, and experimentation for educational purposes.
This skill simplifies distributed training with HuggingFace Accelerate, enabling seamless multi-GPU/TPU setups via a four-line integration.
This skill helps you manage Lambda Labs GPU Cloud resources for scalable ML training and inference with persistent storage and easy SSH access.
This skill speeds up RLHF training for large models with Ray and vLLM acceleration, simplifying distributed PPO, GRPO, and DPO workflows
This skill enables memory-efficient fine-tuning of large language models using LoRA, QLoRA, and adapters to save GPU memory.
This skill guides researchers through structured ideation frameworks to uncover high-impact research directions, offering actionable prompts and evaluation
This skill helps you draft publication-ready ML papers for top conferences by providing proactive drafting, citation verification, LaTeX templates, and
This skill helps you build powerful RAG applications by ingesting documents, indexing data, and querying with LlamaIndex.
This skill helps you fine-tune and deploy OpenPI pi0, pi0-fast, or pi0.5 models for robot policy inference across ALOHA, DROID, and LIBERO.
This skill evaluates NVIDIA Cosmos Policy on LIBERO and RoboCasa simulations, enabling efficient setup, headless rendering, and latency profiling for robotics
This skill fine-tunes and evaluates OpenVLA-OFT policies for robot action generation with LoRA and FiLM conditioning.
This skill automates end-to-end AI research projects by managing loops, literature search, experiments, and synthesis to guide direction and produce papers.
This skill helps researchers generate genuinely novel CS and AI ideas by applying cognitive science frameworks like combinatorial creativity and constraint
This skill detects prompt injections and jailbreak attempts in LLM apps, ensuring safer interactions and reliable third-party data filtering.
This skill helps you integrate PyTorch FSDP2 into training scripts with correct initialization, sharding, mixed precision, and DTensor-based checkpointing.
This skill guides reinforcement learning based training of large language models using verl across PPO, GRPO, and other RL algorithms.
This skill guides enterprise RL training with miles for large MoE models, enabling FP8/INT4, train-inference alignment, and speculative RL for throughput.
This skill helps you accelerate RL-based LLM post-training with slime's Megatron-LM and SGLang for scalable data generation and rollout.
This skill enables scalable pretraining of large language models using PyTorch Torchtitan 4D parallelism across GPUs, delivering faster training with efficient
This skill helps you benchmark LLMs across 100+ benchmarks with containerized, scalable evaluation on local Docker, Slurm HPC, or cloud platforms.
This skill helps you deploy AI models efficiently on consumer hardware using GGUF quantization for flexible 2-8 bit inference.
This skill helps you instrument, trace, evaluate, and monitor LLM applications with Phoenix for debugging, testing, and real-time observability.
This skill provides expert guidance for distributed training with DeepSpeed, covering ZeRO, pipeline parallelism, FP16/BF16/FP8, and optimization best
This skill benchmarks code generation models across 15+ tasks, providing pass@k metrics and multi-language evaluation for robust code quality.
This skill helps extend transformer context windows for long documents using RoPE, YaRN, ALiBi, and position interpolation to improve efficiency and
This skill helps you track ML experiments, visualize training, sweep hyperparameters, and manage models using Weights & Biases for streamlined MLOps.
This skill optimizes LLM inference on NVIDIA GPUs with TensorRT for maximum throughput and lowest latency in production.
This skill enables efficient LLM inference on CPU and non-NVIDIA hardware, supporting edge deployment and Apple Silicon performance with GGUF quantization.
This skill helps you implement language-independent tokenization with SentencePiece to support multilingual models and reproducible vocabularies.
This skill helps you implement open-source embedding storage and semantic search for AI apps with RAG workflows using Chroma.
This skill helps you implement and train LLMs with LitGPT across 20+ pretrained architectures for clean, production-ready workflows.
This skill helps you quantize large language models to 8-bit or 4-bit with minimal accuracy loss to reduce memory and speed up inference.
This skill accelerates transformer attention with Flash Attention for 2-4x speedup and 10-20x memory reduction on long sequences.
This skill helps you manage production-grade vector search with Pinecone, delivering low-latency, serverless indexing and hybrid search capabilities.
This skill helps you optimize large-scale LLM training with Megatron-Core, enabling efficient 2B-462B parameter models using advanced parallelism.
This skill accelerates LLM inference using speculative decoding, Medusa heads, and lookahead techniques to boost speed and reduce latency.
This skill helps you visualize training metrics, debug models, compare experiments, and profile performance with TensorBoard.
This skill helps you efficiently train large-scale Mixture of Experts models with DeepSpeed or HuggingFace, reducing compute while expanding capacity.
This skill provides expert guidance for Fully Sharded Data Parallel training with PyTorch FSDP, covering sharding, mixed precision, and offloading.