baoyu-image-gen_skill

This skill generates AI images through the OpenAI, Google, and DashScope APIs from text prompts, optional reference images, and a chosen aspect ratio.

Language: TypeScript
GitHub Stars: 5.8k
Bundled Files: 1
Catalog Refreshed: 2 months ago
First Indexed: 4 months ago

Installation

Preview and clipboard use veilstrat where the catalogue uses aiagentskills.

npx veilstrat add skill jimliu/baoyu-skills --skill baoyu-image-gen

SKILL.md (9.3 KB)

Overview

This skill provides AI image generation using OpenAI, Google, and DashScope APIs with support for text-to-image, reference-image edits, and configurable aspect ratios. It defaults to sequential generation for stability, with optional parallel mode for large batch runs. The CLI-driven tool accepts prompts, prompt files, reference images, and quality/size presets to produce image files directly.

How this skill works

The tool selects a provider based on CLI flags, available API keys, and reference-image requirements, preferring Google when multiple providers are available. It accepts prompt text or prompt files, optional reference images for multimodal edits, aspect ratio and size options, and outputs images to a specified path. Sequential generation runs each job one at a time; parallel generation launches multiple subagents when explicitly requested for large batches.
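
The selection rules described above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual source: the names `pickProvider`, `SelectionInput`, `REF_CAPABLE`, and `DEFAULT_ORDER` are hypothetical, the position of OpenAI and DashScope after Google in the default order is an assumption, and so is the choice to throw when an explicitly requested provider cannot handle reference images.

```typescript
type Provider = "google" | "openai" | "dashscope";

interface SelectionInput {
  explicitProvider?: Provider; // value passed via --provider, if any
  hasReferenceImages: boolean; // true when reference images were supplied
  availableKeys: readonly Provider[]; // providers with an API key configured
}

// Providers the listing names as supporting reference-image edits.
const REF_CAPABLE: readonly Provider[] = ["google", "openai"];

// Assumed default order; only "Google first" is stated in the listing.
const DEFAULT_ORDER: readonly Provider[] = ["google", "openai", "dashscope"];

function pickProvider(input: SelectionInput): Provider {
  const { explicitProvider, hasReferenceImages, availableKeys } = input;

  if (explicitProvider !== undefined) {
    // Assumption: an explicit choice wins, but must be able to handle references.
    if (hasReferenceImages && REF_CAPABLE.indexOf(explicitProvider) === -1) {
      throw new Error(`${explicitProvider} does not support reference images`);
    }
    return explicitProvider;
  }

  // With references, only edit-capable providers are candidates (Google first).
  const order = hasReferenceImages ? REF_CAPABLE : DEFAULT_ORDER;
  const usable = order.filter((p) => availableKeys.indexOf(p) !== -1);
  if (usable.length === 0) {
    throw new Error("No configured provider can handle this request");
  }
  return usable[0];
}
```

With this sketch, a request without references and with only a DashScope key resolves to `dashscope`, while a request with references and keys for OpenAI and DashScope resolves to `openai`.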

When to use it

Create single or small batches of images from text prompts (illustrations, covers, icons).
Edit or refine images using reference files with Google multimodal or OpenAI GPT Image editing.
Generate images with specific aspect ratios for web, print, or social formats.
Produce large batches quickly when you explicitly request parallel/concurrent generation.
Override provider or model when a specific API capability is required (e.g., DashScope for regional support).

Best practices

Prefer default sequential mode for predictable results and easier debugging.
Provide clear, detailed prompts and optional prompt files for consistent outputs.
Use reference images only with providers that support edits (Google or OpenAI GPT Image models).
Select quality presets (normal or 2k) based on use case: previews vs. final assets.
Request parallel generation only for large batches (10+), and limit concurrency to recommended settings.

Example use cases

Generate a 16:9 landscape illustration for a blog hero image using the default Google provider.
Produce four variations of a product concept with the 2k quality preset and save each to its own file.
Apply color adjustments to an existing photo by passing it as a reference image with an OpenAI GPT Image model.
Create mobile wallpapers in 9:16 aspect ratio in parallel when needing dozens of variants quickly.
Use DashScope as the provider for region-specific projects where that API is preferred.
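
For the aspect-ratio cases above, it can help to see how a ratio such as 16:9 or 9:16 translates into pixel dimensions. The listing does not say how the skill maps ratios and presets to sizes, so the following is only illustrative arithmetic: `sizeForAspect` is a hypothetical helper, and treating 2048 as the long edge for the 2k preset is an assumption.

```typescript
interface Size {
  width: number;
  height: number;
}

// Hypothetical helper: scale a "W:H" ratio so that its longer side equals longEdge.
function sizeForAspect(aspect: string, longEdge: number): Size {
  const match = /^(\d+):(\d+)$/.exec(aspect.trim());
  if (match === null) {
    throw new Error(`Invalid aspect ratio: ${aspect}`);
  }
  const w = Number(match[1]);
  const h = Number(match[2]);
  if (w === 0 || h === 0) {
    throw new Error(`Aspect ratio parts must be positive: ${aspect}`);
  }
  const scale = longEdge / Math.max(w, h);
  return { width: Math.round(w * scale), height: Math.round(h * scale) };
}

// Assumed long edge of 2048 pixels for illustration.
const hero = sizeForAspect("16:9", 2048); // { width: 2048, height: 1152 }
const wallpaper = sizeForAspect("9:16", 2048); // { width: 1152, height: 2048 }
console.log(hero, wallpaper);
```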

FAQ

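Which provider is used if I omit --provider?

The tool picks one based on reference-image usage and available API keys. When reference images are supplied, it tries Google first and falls back to OpenAI. Without reference images, it prefers Google when several keys are available. If only one API key is configured, it uses that provider.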

When should I use parallel generation?

Only request parallel mode for large batch jobs (10+ images) to save time; keep concurrency to the recommended 4 subagents (max 8) to avoid instability.
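
The guidance above (sequential by default, 4 concurrent workers recommended, 8 at most) can be sketched as a small planning function plus a bounded worker pool. This is an assumption-laden illustration rather than the skill's implementation: `planConcurrency` and `runWithLimit` are hypothetical names, in-process promises stand in for the subagents the listing mentions, and falling back to sequential mode for batches under 10 reflects the listing's recommendation, not confirmed behavior.

```typescript
const PARALLEL_MIN_BATCH = 10; // "10+ images" per the listing
const RECOMMENDED_CONCURRENCY = 4; // recommended worker count
const MAX_CONCURRENCY = 8; // stated upper bound

// Decide how many jobs may run at once; 1 means sequential.
function planConcurrency(
  jobCount: number,
  parallelRequested: boolean,
  requested: number = RECOMMENDED_CONCURRENCY
): number {
  if (!parallelRequested || jobCount < PARALLEL_MIN_BATCH) {
    return 1;
  }
  return Math.max(1, Math.min(Math.floor(requested), MAX_CONCURRENCY, jobCount));
}

// Run jobs with at most `limit` in flight, preserving result order.
async function runWithLimit<T>(
  jobs: ReadonlyArray<() => Promise<T>>,
  limit: number
): Promise<T[]> {
  const results: T[] = new Array(jobs.length);
  let next = 0;

  async function worker(): Promise<void> {
    while (next < jobs.length) {
      const index = next++; // claim the next job before awaiting
      results[index] = await jobs[index]();
    }
  }

  const workerCount = Math.max(1, Math.min(limit, jobs.length));
  const workers: Promise<void>[] = [];
  for (let i = 0; i < workerCount; i++) {
    workers.push(worker());
  }
  await Promise.all(workers);
  return results;
}
```

A caller would compute `planConcurrency(jobs.length, parallelRequested)` once and pass the result to `runWithLimit`; a limit of 1 reproduces the default one-at-a-time behavior.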