Baoyu Image Gen

Name: Baoyu Image Gen
Author: jimliu

The baoyu-image-gen skill provides an API-based image generation service supporting multiple providers such as OpenAI, Google, OpenRouter, DashScope, Jimeng, Seedream, and Replicate. It allows users to generate images from prompts with customizable options including size, aspect ratio, quality, and reference images, making it ideal for developers and artists seeking automated or bulk image creation. The skill features detailed configuration, environment variable support for API keys, and batch processing capabilities for efficient multi-image generation.

npx skills add https://github.com/jimliu/baoyu-skills --skill baoyu-image-gen

Image Generation (AI SDK)

Official API-based image generation. Supports OpenAI, Google, OpenRouter, DashScope (阿里通义万象), Jimeng (即梦), Seedream (豆包) and Replicate providers.

Script Directory

Agent Execution:

{baseDir} = this SKILL.md file's directory
Script path = {baseDir}/scripts/main.ts
Resolve ${BUN_X} runtime: if bun installed → bun; if npx available → npx -y bun; else suggest installing bun
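
The runtime fallback above can be sketched as a small pure function. This is only an illustration: `resolveBunX` and its `available` argument are hypothetical names, and how the agent actually detects installed binaries is not specified here.

```typescript
// Hypothetical helper: mirrors the documented fallback order for ${BUN_X}.
// The caller is assumed to have already probed which binaries are on PATH.
function resolveBunX(available: { bun: boolean; npx: boolean }): string | null {
  if (available.bun) return "bun"; // bun installed: use it directly
  if (available.npx) return "npx -y bun"; // otherwise run bun through npx
  return null; // neither found: the agent should suggest installing bun
}
```

A `null` result is the signal to stop and suggest installing bun rather than attempting generation.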

Step 0: Load Preferences ⛔ BLOCKING

CRITICAL: This step MUST complete BEFORE any image generation. Do NOT skip or defer.

Check EXTEND.md existence (priority: project → user):

# macOS, Linux, WSL, Git Bash
test -f .baoyu-skills/baoyu-image-gen/EXTEND.md && echo "project"
test -f "${XDG_CONFIG_HOME:-$HOME/.config}/baoyu-skills/baoyu-image-gen/EXTEND.md" && echo "xdg"
test -f "$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md" && echo "user"

# PowerShell (Windows)
if (Test-Path .baoyu-skills/baoyu-image-gen/EXTEND.md) { "project" }
$xdg = if ($env:XDG_CONFIG_HOME) { $env:XDG_CONFIG_HOME } else { "$HOME/.config" }
if (Test-Path "$xdg/baoyu-skills/baoyu-image-gen/EXTEND.md") { "xdg" }
if (Test-Path "$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md") { "user" }

| Result | Action |
|--------|--------|
| Found | Load, parse, apply settings. If `default_model.[provider]` is null → ask model only (Flow 2) |
| Not found | ⛔ Run first-time setup (references/config/first-time-setup.md) → Save EXTEND.md → Then continue |

CRITICAL: If not found, complete the full setup (provider + model + quality + save location) using AskUserQuestion BEFORE generating any images. Generation is BLOCKED until EXTEND.md is created.

| Path | Location |
|------|----------|
| `.baoyu-skills/baoyu-image-gen/EXTEND.md` | Project directory |
| `$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md` | User home |

EXTEND.md Supports: Default provider | Default quality | Default aspect ratio | Default image size | Default models | Batch worker cap | Provider-specific batch limits

Schema: references/config/preferences-schema.md
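
As a rough sketch of the lookup, the function below walks the three locations that the shell checks test and returns the first hit. It assumes the order of those checks (project, then XDG config, then user home) is also the priority order. `findExtendMd`, `PrefLocation`, and the injected `exists` callback are hypothetical names used for illustration only.

```typescript
type PrefSource = "project" | "xdg" | "user";

interface PrefLocation {
  source: PrefSource;
  path: string;
}

// Hypothetical helper. `exists` is injected so the lookup stays a pure function;
// a real caller would pass a file-system check.
function findExtendMd(
  env: { HOME: string; XDG_CONFIG_HOME?: string },
  exists: (path: string) => boolean,
): PrefLocation | null {
  const tail = "baoyu-skills/baoyu-image-gen/EXTEND.md";
  // Same fallback as ${XDG_CONFIG_HOME:-$HOME/.config} in the shell check.
  const xdgBase = env.XDG_CONFIG_HOME || env.HOME + "/.config";
  const candidates: PrefLocation[] = [
    { source: "project", path: "." + tail },
    { source: "xdg", path: xdgBase + "/" + tail },
    { source: "user", path: env.HOME + "/." + tail },
  ];
  for (const candidate of candidates) {
    if (exists(candidate.path)) return candidate;
  }
  return null; // not found: first-time setup must run before any generation
}
```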

Usage

# Basic
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image cat.png
# With aspect ratio
${BUN_X} {baseDir}/scripts/main.ts --prompt "A landscape" --image out.png --ar 16:9
# High quality
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --quality 2k
# From prompt files
${BUN_X} {baseDir}/scripts/main.ts --promptfiles system.md content.md --image out.png
# With reference images (Google, OpenAI, OpenRouter, or Replicate)
${BUN_X} {baseDir}/scripts/main.ts --prompt "Make blue" --image out.png --ref source.png
# With reference images (explicit provider/model)
${BUN_X} {baseDir}/scripts/main.ts --prompt "Make blue" --image out.png --provider google --model gemini-3-pro-image-preview --ref source.png
# OpenRouter (recommended default model)
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider openrouter
# OpenRouter with reference images
${BUN_X} {baseDir}/scripts/main.ts --prompt "Make blue" --image out.png --provider openrouter --model google/gemini-3.1-flash-image-preview --ref source.png
# Specific provider
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider openai
# DashScope (阿里通义万象)
${BUN_X} {baseDir}/scripts/main.ts --prompt "一只可爱的猫" --image out.png --provider dashscope
# DashScope Qwen-Image 2.0 Pro (recommended for custom sizes and text rendering)
${BUN_X} {baseDir}/scripts/main.ts --prompt "为咖啡品牌设计一张 21:9 横幅海报，包含清晰中文标题" --image out.png --provider dashscope --model qwen-image-2.0-pro --size 2048x872
# DashScope legacy Qwen fixed-size model
${BUN_X} {baseDir}/scripts/main.ts --prompt "一张电影感海报" --image out.png --provider dashscope --model qwen-image-max --size 1664x928
# Replicate (google/nano-banana-pro)
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate
# Replicate with specific model
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate --model google/nano-banana
# Batch mode with saved prompt files
${BUN_X} {baseDir}/scripts/main.ts --batchfile batch.json
# Batch mode with explicit worker count
${BUN_X} {baseDir}/scripts/main.ts --batchfile batch.json --jobs 4 --json

Batch File Format

{
  "jobs": 4,
  "tasks": [
    {
      "id": "hero",
      "promptFiles": ["prompts/hero.md"],
      "image": "out/hero.png",
      "provider": "replicate",
      "model": "google/nano-banana-pro",
      "ar": "16:9",
      "quality": "2k"
    },
    {
      "id": "diagram",
      "promptFiles": ["prompts/diagram.md"],
      "image": "out/diagram.png",
      "ref": ["references/original.png"]
    }
  ]
}

Paths in promptFiles, image, and ref are resolved relative to the batch file's directory. jobs is optional (overridden by CLI --jobs). Top-level array format (without jobs wrapper) is also accepted.
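
A minimal sketch of how a parsed batch file could be normalized, assuming POSIX-style paths. `BatchTask`, `BatchFile`, and `normalizeBatch` are hypothetical names, and leaving absolute paths untouched is an assumption rather than documented behavior.

```typescript
interface BatchTask {
  id?: string;
  promptFiles?: string[];
  image: string;
  ref?: string[];
  provider?: string;
  model?: string;
  ar?: string;
  quality?: string;
}

// Either the wrapper object or a bare top-level array of tasks is accepted.
type BatchFile = BatchTask[] | { jobs?: number; tasks: BatchTask[] };

function normalizeBatch(
  parsed: BatchFile,
  batchDir: string,
  cliJobs?: number,
): { jobs?: number; tasks: BatchTask[] } {
  const wrapper: { jobs?: number; tasks: BatchTask[] } = Array.isArray(parsed)
    ? { tasks: parsed }
    : parsed;
  const base = batchDir.replace(/\/+$/, "");
  // Assumption: absolute paths are kept as-is; relative ones resolve against the batch file's directory.
  const resolve = (p: string): string => (p.charAt(0) === "/" ? p : base + "/" + p);
  const tasks = wrapper.tasks.map((task) => ({
    ...task,
    image: resolve(task.image),
    promptFiles: task.promptFiles ? task.promptFiles.map((p) => resolve(p)) : undefined,
    ref: task.ref ? task.ref.map((p) => resolve(p)) : undefined,
  }));
  // The CLI --jobs value overrides the "jobs" field from the file.
  return { jobs: cliJobs !== undefined ? cliJobs : wrapper.jobs, tasks };
}
```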

Options

| Option | Description |
|--------|-------------|
| `--prompt <text>`, `-p` | Prompt text |
| `--promptfiles <files...>` | Read prompt from files (concatenated) |
| `--image <path>` | Output image path (required in single-image mode) |
| `--batchfile <path>` | JSON batch file for multi-image generation |
| `--jobs <count>` | Worker count for batch mode (default: auto, max from config, built-in default 10) |
| `--provider google\|openai\|openrouter\|dashscope\|jimeng\|seedream\|replicate` | Force provider (default: auto-detect) |
| `--model <id>`, `-m` | Model ID (Google: `gemini-3-pro-image-preview`; OpenAI: `gpt-image-1.5`; OpenRouter: `google/gemini-3.1-flash-image-preview`; DashScope: `qwen-image-2.0-pro`) |
| `--ar <ratio>` | Aspect ratio (e.g., `16:9`, `1:1`, `4:3`) |
| `--size <WxH>` | Size (e.g., `1024x1024`) |
| `--quality normal\|2k` | Quality preset (default: `2k`) |
| `--imageSize 1K\|2K\|4K` | Image size for Google/OpenRouter (default: from quality) |
| `--ref <files...>` | Reference images. Supported by Google multimodal, OpenAI GPT Image edits, OpenRouter multimodal models, and Replicate. Not supported by Jimeng or Seedream |
| `--n <count>` | Number of images |
| `--json` | JSON output |

Environment Variables

| Variable | Description |
|----------|-------------|
| `OPENAI_API_KEY` | OpenAI API key |
| `OPENROUTER_API_KEY` | OpenRouter API key |
| `GOOGLE_API_KEY` | Google API key |
| `DASHSCOPE_API_KEY` | DashScope API key (阿里云) |
| `REPLICATE_API_TOKEN` | Replicate API token |
| `JIMENG_ACCESS_KEY_ID` | Jimeng (即梦) Volcengine access key |
| `JIMENG_SECRET_ACCESS_KEY` | Jimeng (即梦) Volcengine secret key |
| `ARK_API_KEY` | Seedream (豆包) Volcengine ARK API key |
| `OPENAI_IMAGE_MODEL` | OpenAI model override |
| `OPENROUTER_IMAGE_MODEL` | OpenRouter model override (default: `google/gemini-3.1-flash-image-preview`) |
| `GOOGLE_IMAGE_MODEL` | Google model override |
| `DASHSCOPE_IMAGE_MODEL` | DashScope model override (default: `qwen-image-2.0-pro`) |
| `REPLICATE_IMAGE_MODEL` | Replicate model override (default: `google/nano-banana-pro`) |
| `JIMENG_IMAGE_MODEL` | Jimeng model override (default: `jimeng_t2i_v40`) |
| `SEEDREAM_IMAGE_MODEL` | Seedream model override (default: `doubao-seedream-5-0-260128`) |
| `OPENAI_BASE_URL` | Custom OpenAI endpoint |
| `OPENROUTER_BASE_URL` | Custom OpenRouter endpoint (default: `https://openrouter.ai/api/v1`) |
| `OPENROUTER_HTTP_REFERER` | Optional app/site URL for OpenRouter attribution |
| `OPENROUTER_TITLE` | Optional app name for OpenRouter attribution |
| `GOOGLE_BASE_URL` | Custom Google endpoint |
| `DASHSCOPE_BASE_URL` | Custom DashScope endpoint |
| `REPLICATE_BASE_URL` | Custom Replicate endpoint |
| `JIMENG_BASE_URL` | Custom Jimeng endpoint (default: `https://visual.volcengineapi.com`) |
| `JIMENG_REGION` | Jimeng region (default: `cn-north-1`) |
| `SEEDREAM_BASE_URL` | Custom Seedream endpoint (default: `https://ark.cn-beijing.volces.com/api/v3`) |
| `BAOYU_IMAGE_GEN_MAX_WORKERS` | Override batch worker cap |
| `BAOYU_IMAGE_GEN_<PROVIDER>_CONCURRENCY` | Override provider concurrency, e.g. `BAOYU_IMAGE_GEN_REPLICATE_CONCURRENCY` |
| `BAOYU_IMAGE_GEN_<PROVIDER>_START_INTERVAL_MS` | Override provider start gap, e.g. `BAOYU_IMAGE_GEN_REPLICATE_START_INTERVAL_MS` |

Load Priority: CLI args > EXTEND.md > env vars > `<cwd>/.baoyu-skills/.env` > `~/.baoyu-skills/.env`

Model Resolution

Model priority (highest → lowest), applies to all providers:

CLI flag: --model <id>
EXTEND.md: default_model.[provider]
Env var: <PROVIDER>_IMAGE_MODEL (e.g., GOOGLE_IMAGE_MODEL)
Built-in default

EXTEND.md overrides env vars. If both EXTEND.md default_model.google: "gemini-3-pro-image-preview" and env var GOOGLE_IMAGE_MODEL=gemini-3.1-flash-image-preview exist, EXTEND.md wins.

Agent MUST display model info before each generation:

Show: Using [provider] / [model]
Show switch hint: Switch model: --model <id> | EXTEND.md default_model.[provider] | env <PROVIDER>_IMAGE_MODEL
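
The priority order and the required banner can be sketched as follows. `resolveModel`, `ModelInputs`, and `modelBanner` are hypothetical names, and the built-in default is passed in by the caller rather than hard-coded.

```typescript
type ModelSource = "cli" | "extend" | "env" | "builtin";

interface ModelInputs {
  cli?: string; // --model <id>
  extend?: string | null; // EXTEND.md default_model.[provider]
  env?: string; // <PROVIDER>_IMAGE_MODEL
  builtin: string; // built-in default for the provider
}

// Hypothetical helper: returns the first value set, highest priority first.
function resolveModel(inputs: ModelInputs): { model: string; source: ModelSource } {
  if (inputs.cli) return { model: inputs.cli, source: "cli" };
  if (inputs.extend) return { model: inputs.extend, source: "extend" };
  if (inputs.env) return { model: inputs.env, source: "env" };
  return { model: inputs.builtin, source: "builtin" };
}

// Banner the agent shows before each generation.
function modelBanner(provider: string, model: string): string {
  return "Using " + provider + " / " + model;
}
```

With the example above (EXTEND.md set to `gemini-3-pro-image-preview`, env var set to `gemini-3.1-flash-image-preview`), the EXTEND.md value is returned with source `extend`.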

DashScope Models

Use --model qwen-image-2.0-pro or set default_model.dashscope / DASHSCOPE_IMAGE_MODEL when the user wants official Qwen-Image behavior.

Official DashScope model families:

qwen-image-2.0-pro, qwen-image-2.0-pro-2026-03-03, qwen-image-2.0, qwen-image-2.0-2026-03-03
- Free-form size in width*height format
- Total pixels must stay between 512*512 and 2048*2048
- Default size is approximately 1024*1024
- Best choice for custom ratios such as 21:9 and text-heavy Chinese/English layouts
qwen-image-max, qwen-image-max-2025-12-30, qwen-image-plus, qwen-image-plus-2026-01-09, qwen-image
- Fixed sizes only: 1664*928, 1472*1104, 1328*1328, 1104*1472, 928*1664
- Default size is 1664*928
- qwen-image currently has the same capability as qwen-image-plus
Legacy DashScope models such as z-image-turbo, z-image-ultra, wanx-v1
- Keep using them only when the user explicitly asks for legacy behavior or compatibility

When translating CLI args into DashScope behavior:
--size wins over --ar
For qwen-image-2.0*, prefer explicit --size; otherwise infer from --ar and use the official recommended resolutions below
For qwen-image-max/plus/image, only use the five official fixed sizes; if the requested ratio is not covered, switch to qwen-image-2.0-pro
--quality is a baoyu-image-gen compatibility preset, not a native DashScope API field. Mapping normal / 2k onto the qwen-image-2.0* table below is an implementation inference, not an official API guarantee

Recommended qwen-image-2.0* sizes for common aspect ratios:

| Ratio | normal | 2k |
|-------|--------|----|
| 1:1 | 1024*1024 | 1536*1536 |
| 2:3 | 768*1152 | 1024*1536 |
| 3:2 | 1152*768 | 1536*1024 |
| 3:4 | 960*1280 | 1080*1440 |
| 4:3 | 1280*960 | 1440*1080 |
| 9:16 | 720*1280 | 1080*1920 |
| 16:9 | 1280*720 | 1920*1080 |
| 21:9 | 1344*576 | 2048*872 |

DashScope official APIs also expose negative_prompt, prompt_extend, and watermark, but baoyu-image-gen does not expose them as dedicated CLI flags today.

Official references:
Qwen-Image API
Text-to-image guide
Qwen-Image Edit API
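
The size rules for the qwen-image-2.0* family can be sketched as below. `qwen2Size` is a hypothetical helper, not the skill's actual code. It assumes `--quality` defaults to `2k` as listed under Options, and that a ratio missing from the recommended-size table leaves the size to the API default.

```typescript
type Quality = "normal" | "2k";

// Recommended qwen-image-2.0* sizes, copied from the table in this section.
const QWEN2_SIZES: Record<string, Record<Quality, string>> = {
  "1:1": { normal: "1024*1024", "2k": "1536*1536" },
  "2:3": { normal: "768*1152", "2k": "1024*1536" },
  "3:2": { normal: "1152*768", "2k": "1536*1024" },
  "3:4": { normal: "960*1280", "2k": "1080*1440" },
  "4:3": { normal: "1280*960", "2k": "1440*1080" },
  "9:16": { normal: "720*1280", "2k": "1080*1920" },
  "16:9": { normal: "1280*720", "2k": "1920*1080" },
  "21:9": { normal: "1344*576", "2k": "2048*872" },
};

const MIN_PIXELS = 512 * 512;
const MAX_PIXELS = 2048 * 2048;

function qwen2Size(opts: { size?: string; ar?: string; quality?: Quality }): string | undefined {
  if (opts.size) {
    // --size wins over --ar; convert CLI "WxH" into DashScope "W*H".
    const match = /^(\d+)[x*](\d+)$/.exec(opts.size);
    if (!match) throw new Error("Expected --size as WxH, got " + opts.size);
    const width = Number(match[1]);
    const height = Number(match[2]);
    const pixels = width * height;
    if (pixels < MIN_PIXELS || pixels > MAX_PIXELS) {
      throw new Error("Total pixels must stay between 512*512 and 2048*2048");
    }
    return width + "*" + height;
  }
  if (opts.ar) {
    const row = QWEN2_SIZES[opts.ar];
    if (row) return row[opts.quality || "2k"];
  }
  return undefined; // assumption: no usable hint, so the API default size applies
}
```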

OpenRouter Models

Use full OpenRouter model IDs, e.g.:

google/gemini-3.1-flash-image-preview (recommended, supports image output and reference-image workflows)
google/gemini-2.5-flash-image-preview
black-forest-labs/flux.2-pro
Other OpenRouter image-capable model IDs

Notes:
OpenRouter image generation uses /chat/completions, not the OpenAI /images endpoints
If --ref is used, choose a multimodal model that supports image input and image output
--imageSize maps to OpenRouter imageGenerationOptions.size; --size <WxH> is converted to the nearest OpenRouter size and inferred aspect ratio when possible

Replicate Models

Supported model formats:

owner/name (recommended for official models), e.g. google/nano-banana-pro
owner/name:version (community models by version), e.g. stability-ai/sdxl:<version>

Examples:

# Use Replicate default model
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate
# Override model explicitly
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate --model google/nano-banana

Provider Selection

--ref provided + no --provider → auto-select Google first, then OpenAI, then OpenRouter, then Replicate (Jimeng and Seedream do not support reference images)
--provider specified → use it (if --ref, must be google, openai, openrouter, or replicate)
Only one API key available → use that provider
Multiple available → default to Google
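
The selection rules above can be sketched as one function. `selectProvider` and `REF_CAPABLE` are hypothetical names. The fallback used when several keys are present but none is Google is an assumption, since the rules do not cover that case.

```typescript
type Provider =
  | "google" | "openai" | "openrouter" | "dashscope"
  | "jimeng" | "seedream" | "replicate";

// Providers that accept --ref, in auto-selection order.
const REF_CAPABLE: Provider[] = ["google", "openai", "openrouter", "replicate"];

function selectProvider(opts: {
  provider?: Provider; // value of --provider, if given
  hasRef: boolean; // whether --ref was passed
  available: Provider[]; // providers with an API key configured
}): Provider {
  const { provider, hasRef, available } = opts;
  if (provider) {
    if (hasRef && REF_CAPABLE.indexOf(provider) === -1) {
      throw new Error("--ref is not supported by " + provider);
    }
    return provider;
  }
  if (hasRef) {
    for (const candidate of REF_CAPABLE) {
      if (available.indexOf(candidate) !== -1) return candidate;
    }
    throw new Error("--ref needs a google, openai, openrouter, or replicate API key");
  }
  if (available.length === 0) throw new Error("No API key found");
  if (available.length === 1) return available[0];
  // Multiple keys: default to Google. Assumption: otherwise take the first configured provider.
  return available.indexOf("google") !== -1 ? "google" : available[0];
}
```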

Quality Presets

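| Preset | Google imageSize | OpenAI Size | OpenRouter size | Replicate resolution | Use Case |
|--------|------------------|-------------|-----------------|----------------------|----------|
| `normal` | 1K | 1024px | 1K | 1K | Quick previews |
| `2k` (default) | 2K | 2048px | 2K | 2K | Covers, illustrations, infographics |

Google/OpenRouter imageSize: Can be overridden with `--imageSize 1K|2K|4K`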

Aspect Ratios

Supported: 1:1, 16:9, 9:16, 4:3, 3:4, 2.35:1

Google multimodal: uses imageConfig.aspectRatio
OpenAI: maps to closest supported size
OpenRouter: sends imageGenerationOptions.aspect_ratio; if only --size <WxH> is given, aspect ratio is inferred automatically
Replicate: passes aspect_ratio to model; when --ref is provided without --ar, defaults to match_input_image
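
One plausible way to infer an aspect ratio from `--size <WxH>` is to pick the closest entry in the supported list. The sketch below does that on a log scale, so wide and tall mismatches are treated symmetrically. `inferAspectRatio` and `ratioValue` are hypothetical names, and the skill's actual inference may differ, for example by reducing the exact ratio.

```typescript
const SUPPORTED_RATIOS: string[] = ["1:1", "16:9", "9:16", "4:3", "3:4", "2.35:1"];

// Converts "W:H" into a number; returns NaN for malformed input.
function ratioValue(ratio: string): number {
  const parts = ratio.split(":");
  if (parts.length !== 2) return NaN;
  const w = Number(parts[0]);
  const h = Number(parts[1]);
  return w > 0 && h > 0 ? w / h : NaN;
}

// Hypothetical helper: nearest supported ratio for a "WxH" size string.
function inferAspectRatio(size: string): string | null {
  const match = /^(\d+)x(\d+)$/.exec(size);
  if (!match) return null;
  const width = Number(match[1]);
  const height = Number(match[2]);
  if (width === 0 || height === 0) return null;
  const target = Math.log(width / height);
  let best: string | null = null;
  let bestDistance = Infinity;
  for (const ratio of SUPPORTED_RATIOS) {
    const distance = Math.abs(Math.log(ratioValue(ratio)) - target);
    if (distance < bestDistance) {
      best = ratio;
      bestDistance = distance;
    }
  }
  return best;
}
```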

Generation Mode

Default: Sequential generation.

Batch Parallel Generation: When --batchfile contains 2 or more pending tasks, the script automatically enables parallel generation.

| Mode | When to Use |
|------|-------------|
| Sequential (default) | Normal usage, single images, small batches |
| Parallel batch | Batch mode with 2+ tasks |

Execution choice:

| Situation | Preferred approach | Why |
|-----------|--------------------|-----|
| One image, or 1-2 simple images | Sequential | Lower coordination overhead and easier debugging |
| Multiple images already have saved prompt files | Batch (`--batchfile`) | Reuses finalized prompts, applies shared throttling/retries, and gives predictable throughput |
| Each image still needs separate reasoning, prompt writing, or style exploration | Subagents | The work is still exploratory, so each image may need independent analysis before generation |
| Output comes from `baoyu-article-illustrator` with `outline.md` + `prompts/` | Batch (`build-batch.ts` -> `--batchfile`) | That workflow already produces prompt files, so direct batch execution is the intended path |

Rule of thumb:

Prefer batch over subagents once prompt files are already saved and the task is "generate all of these"
Use subagents only when generation is coupled with per-image thinking, rewriting, or divergent creative exploration

Parallel behavior:
Default worker count is automatic, capped by config, built-in default 10
Provider-specific throttling is applied only in batch mode, and the built-in defaults are tuned for faster throughput while still avoiding obvious RPM bursts
You can override worker count with --jobs <count>
Each image retries automatically up to 3 attempts
Final output includes success count, failure count, and per-image failure reasons
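
A rough sketch of the worker-count rule and the final summary. `planWorkers`, `summarize`, and `TaskResult` are hypothetical names. Treating the automatic count as "one worker per pending task, up to the cap" and applying the cap to an explicit `--jobs` value are both assumptions.

```typescript
interface TaskResult {
  id: string;
  ok: boolean;
  attempts: number; // 1 to 3, since each image is tried at most 3 times
  error?: string;
}

// Hypothetical helper: how many workers to start for a batch.
function planWorkers(pendingTasks: number, jobs?: number, cap: number = 10): number {
  if (pendingTasks < 2) return 1; // fewer than 2 pending tasks: stay sequential
  const requested = jobs !== undefined ? jobs : pendingTasks;
  return Math.max(1, Math.min(requested, cap, pendingTasks));
}

// Builds the final report: success count, failure count, per-image failure reasons.
function summarize(results: TaskResult[]): { succeeded: number; failed: number; reasons: string[] } {
  const failures = results.filter((result) => !result.ok);
  return {
    succeeded: results.length - failures.length,
    failed: failures.length,
    reasons: failures.map((failure) => failure.id + ": " + (failure.error || "unknown error")),
  };
}
```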

Error Handling

Missing API key → error with setup instructions
Generation failure → auto-retry up to 3 attempts per image
Invalid aspect ratio → warning, proceed with default
Reference images with unsupported provider/model → error with fix hint

Extension Support

Custom configurations via EXTEND.md. See "Step 0: Load Preferences" for paths and supported options.

GitHub Owner

Owner: jimliu

GitHub Links

Twitter: https://twitter.com/dotey

Files

`first-time-setup.md`

View: https://github.com/jimliu/baoyu-skills/blob/HEAD/skills/baoyu-image-gen/references/config/first-time-setup.md

SKILL.md

name: baoyu-image-gen
description: AI image generation with OpenAI, Google, OpenRouter, DashScope, Jimeng, Seedream and Replicate APIs. Supports text-to-image, reference images, aspect ratios, and batch generation from saved prompt files. Sequential by default; use batch parallel generation when the user already has multiple prompts or wants stable multi-image throughput. Use when user asks to generate, create, or draw images.
version: 1.56.2
metadata:
  openclaw:
    homepage: https://github.com/JimLiu/baoyu-skills#baoyu-image-gen
    requires:
      anyBins:
        - bun
        - npx
