File size: 14,138 Bytes
96c4f85 124fc9e 96c4f85 124fc9e 96c4f85 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 | ---
name: jimeng-api
description: Generate images using the Jimeng API based on text prompts. Use this skill when users request AI-generated images from the Jimeng (即梦AI) service, artwork, illustrations, or visual content creation. Supports text-to-image and image-to-image generation with customizable ratios and resolutions.
version: 1.0.0
dependencies: python>=3.7, requests>=2.28.0, Pillow>=9.0.0
---
# Jimeng API
## Overview
This skill enables image generation using a locally deployed Jimeng API service (Docker). It converts text prompts into high-quality images and automatically downloads them to the project's `/pic` folder. The skill supports text-to-image generation, image-to-image composition, customizable aspect ratios (1:1, 16:9, etc.), and multiple resolution levels (1k, 2k, 4k).
**API Endpoint**: `http://localhost:5100`
## When to Use This Skill
Use this skill when users request:
- "使用即梦生成图片 [描述]"
- "Generate an image using Jimeng: [description]"
- "Create artwork showing [scene/concept]"
- "Make an illustration of [subject]"
- "Generate a 4K image of [description]"
- "Transform this image to [style]" (image-to-image)
- Any request involving Jimeng AI image generation or visual content creation
## Quick Start
### Prerequisites
**IMPORTANT**: The Jimeng API must be running locally via Docker before using this skill.
**Region-specific prefixes**:
- 国内站: Direct sessionid (e.g., `your_session_id`)
- 美国站: Add `us-` prefix (e.g., `us-your_session_id`)
- 香港站: Add `hk-` prefix (e.g., `hk-your_session_id`)
- 日本站: Add `jp-` prefix (e.g., `jp-your_session_id`)
- 新加坡站: Add `sg-` prefix (e.g., `sg-your_session_id`)
**⚠️ nanobanana Model Resolution Rules**:
- **US site (us-)**: Fixed at 1024×1024 with 2k resolution; ignores user-provided `ratio` and `resolution` parameters
- **HK/JP/SG sites (hk-/jp-/sg-)**: Forced 1k resolution, but supports custom `ratio` parameters (e.g., 16:9, 4:3)
- **Domestic site (CN)**: Does not support nanobanana model; use jimeng series instead
**Always ask the user for their Session ID before proceeding**, as the skill does not include a pre-configured credential.
Example prompt to user:
> "要使用即梦API生成图片,我需要您的Session ID。您可以从即梦网站(jimeng.jianying.com)的浏览器Cookie中获取 sessionid。
>
> 如果使用国际站,请在sessionid前添加对应前缀(us-/hk-/jp-/sg-)。
>
> 请提供您的 Session ID。"
## Parameter Usage Guidelines
⚠️⚠️ IMPORTANT PARAMETER DISCIPLINE
- ✅ **ONLY PASS PARAMETERS THE USER EXPLICITLY MENTIONS.**
- ❌ **DO NOT GUESS OR ADD UNSPECIFIED PARAMETERS.**
- ✅ **LET THE SCRIPT USE BUILT-IN DEFAULTS** when the user did not specify:
- ratio: 1:1
- resolution: 2k
- model: jimeng-4.0
- intelligent_ratio: false
Rationale: Tools may “helpfully” add options (e.g., `--ratio 16:9`) that the user didn’t request, overriding script defaults. This is prohibited. Pass only the parameters the user asked for; otherwise, rely on defaults.
### Basic Workflow
1. **Receive user request** for image generation
2. **Request Session ID** from the user if not already provided
3. **Clarify requirements**:
- Text prompt (文生图) or input images (图生图)
- Model selection (jimeng-4.0, jimeng-3.1, etc.)
- Aspect ratio (1:1, 16:9, 4:3, etc.)
- Resolution (1k, 2k, 4k)
- Intelligent ratio (auto-detect based on prompt keywords)
4. **Execute generation** using the `generate_image.py` script — REMINDER: **only pass parameters explicitly requested by the user; do not add/guess any optional flags**
5. **Report results** — show file paths only. **DO NOT READ/OPEN/ANALYZE GENERATED IMAGES. DO NOT CALL ANY READ TOOL (e.g., `Read`, `view_image`). STOP AFTER SAVING.**
## Image Generation Tasks
### Text-to-Image Generation
Generate images from text descriptions.
**Minimal default usage (no optional params):**
```bash
python scripts/generate_image.py text \
"a cute cat" \
--session-id "YOUR_SESSION_ID"
```
Only include optional parameters when the user explicitly requests them.
**With user-specified parameters (only when requested):**
```bash
python scripts/generate_image.py text \
"futuristic city at sunset with flying cars" \
--session-id "YOUR_SESSION_ID" \
--model "jimeng-4.0" \
--ratio "16:9" \
--resolution "2k"
```
**Parameters:**
- `prompt` (required): Text description of the desired image
- `--session-id`: Jimeng session ID (required)
- `--model`: Model to use (default: `jimeng-4.0`)
- Options: `jimeng-5.0`, `jimeng-4.6`, `jimeng-4.5`, `jimeng-4.1`, `jimeng-4.0`, `jimeng-3.1`, `jimeng-3.0`, `nanobanana` (international only)
- `--ratio`: Aspect ratio (default: `1:1`)
- Options: `1:1`, `4:3`, `3:4`, `16:9`, `9:16`, `3:2`, `2:3`, `21:9`
- `--resolution`: Resolution level (default: `2k`)
- Options: `1k`, `2k`, `4k`
- `--intelligent-ratio`: Enable smart ratio detection based on prompt keywords **⚠️ Only works for jimeng-4.0/jimeng-4.1/jimeng-4.5 models; other models will ignore this parameter**
- `--negative-prompt`: Negative prompt (elements to avoid)
- `--sample-strength`: Sampling strength (0.0-1.0)
- `--api-url`: Custom API URL (default: `http://localhost:5100`)
- `--output-dir`: Custom output directory (defaults to `project_root/pic`)
### Image-to-Image Composition
Transform or compose images based on text guidance.
**Example user request:**
> "把这张照片转换成油画风格,色彩鲜艳,笔触明显"
**Script usage:**
```bash
# Using local file
python scripts/generate_image.py image \
"transform to oil painting style, vivid colors, visible brushstrokes" \
--session-id "YOUR_SESSION_ID" \
--images "/path/to/image.jpg"
# Using image URL
python scripts/generate_image.py image \
"anime style, cute cat" \
--session-id "YOUR_SESSION_ID" \
--images "https://example.com/cat.jpg"
# Multiple images (up to 10)
python scripts/generate_image.py image \
"merge these images into a cohesive scene" \
--session-id "YOUR_SESSION_ID" \
--images "image1.jpg" "image2.png" "image3.jpg"
```
**Parameters:**
- Same as text-to-image, plus:
- `--images`: One or more image paths or URLs (1-10 images)
**Supported formats**: JPG, PNG, WebP
**Size limit**: Recommended <10MB per image
### Intelligent Ratio Detection
**⚠️ IMPORTANT**: This feature only works with the `jimeng-4.0`, `jimeng-4.1`, and `jimeng-4.5` models. Other models (jimeng-3.0, nanobanana, etc.) will ignore the `--intelligent-ratio` flag.
Use `--intelligent-ratio` to automatically select the best aspect ratio based on prompt keywords.
**Example:**
```bash
python scripts/generate_image.py text \
"奔跑的狮子,竖屏" \
--session-id "YOUR_SESSION_ID" \
--model "jimeng-4.0" \
--intelligent-ratio
```
### Resolution Options
| Resolution | Ratio | Dimensions |
|------------|-------|------------|
| **1k** | 1:1 | 1024×1024 |
| | 4:3 | 768×1024 |
| | 3:4 | 1024×768 |
| | 16:9 | 1024×576 |
| | 9:16 | 576×1024 |
| | 3:2 | 1024×682 |
| | 2:3 | 682×1024 |
| | 21:9 | 1195×512 |
| **2k** (default) | 1:1 | 2048×2048 |
| | 16:9 | 2560×1440 |
| | 4:3 | 2304×1728 |
| **4k** | 1:1 | 4096×4096 |
| | 16:9 | 5120×2880 |
| | 21:9 | 6048×2592 |
## Script Details
### Location
`scripts/generate_image.py`
### Key Features
- Automatic project root detection (looks for `.git`, `.claude`, etc.)
- Creates `/pic` folder if it doesn't exist
- Timestamps filenames to prevent overwrites (format: `jimeng_YYYYMMDD_HHMMSS_N.png`)
- Automatic WebP to PNG conversion for maximum compatibility
- Downloads all generated images from API response
- Supports both text-to-image and image-to-image modes
- Handles multipart/form-data for local file uploads
- Error handling for API calls and downloads
- Prints generation statistics
### Output Format
- Images are saved to: `{project_root}/pic/jimeng_{timestamp}_{index}.png`
- **All images are automatically converted to PNG format** (including WebP sources)
- Each API call generates several variations
### Requirements
The script requires:
```bash
pip install requests Pillow
```
**Note**: Pillow is required for WebP to PNG conversion. If not installed, WebP images will be saved as-is.
## Workflow Decision Tree
```
User requests image generation
↓
Is Jimeng API running at localhost:5100?
├─ No → Instruct user to start Docker service
└─ Yes → Continue
↓
Do we have Session ID?
├─ No → Request Session ID from user → Store for session
└─ Yes → Continue
↓
Text-to-Image or Image-to-Image?
├─ Text-to-Image
│ └─ Run: generate_image.py text "prompt" --session-id ID (add --ratio/--resolution/--model ONLY if user explicitly requests)
└─ Image-to-Image
└─ Run: generate_image.py image "prompt" --session-id ID --images PATH1 [PATH2...]
↓
Script executes:
1. Calls Jimeng API (文生图 or 图生图)
2. Receives image URLs
3. Downloads all images to /pic folder
4. Reports file paths
↓
Inform user of results
├─ Success → Show file paths only
└─ Failure → Report error, suggest troubleshooting
↓
HARD STOP — DO NOT READ/OPEN/ANALYZE IMAGES; DO NOT CALL `Read`/`view_image`; TASK COMPLETE
```
## Troubleshooting
### Common Issues
**"Session ID required"**
- Ensure the user has provided their sessionid from 即梦/Dreamina
- Verify correct region prefix (us-/hk-/jp-/sg- for international sites)
**"Invalid session or authentication failed"**
- Session ID may have expired
- Request user to refresh their browser and get a new sessionid
- Verify the sessionid is copied correctly (no extra spaces)
**"Error downloading image"**
- Check network connectivity
- Verify output directory is writable
- Image URLs may have expired (retry generation)
**"Model not supported"**
- `nanobanana` only works with international sites (us-/hk-/jp-/sg- prefix)
- `jimeng-3.1` only works with domestic sites
**"nanobanana resolution mismatch"**
- **US site (us- prefix)**: nanobanana model only supports 1024×1024 @ 2k resolution; all `ratio` and `resolution` parameters are ignored
- **HK/JP/SG sites (hk-/jp-/sg- prefix)**: nanobanana model forces 1k resolution, but allows custom ratios (e.g., 16:9, 4:3)
- If you need full control over resolution and ratio, use `jimeng-4.0` model instead
**"intelligent_ratio not working"**
- The `--intelligent-ratio` flag only works with `jimeng-4.0`, `jimeng-4.1`, and `jimeng-4.5` models
- Other models (jimeng-3.0, nanobanana, etc.) will ignore this parameter
- Solution: Use `jimeng-4.0`, `jimeng-4.1`, or `jimeng-4.5` if you need intelligent ratio detection
## Best Practices
1. **Request Session ID early** - Ask for it upfront if not already provided
2. **Parameter discipline: Only pass explicitly requested parameters**
3. **Clarify generation mode** - Determine if user wants text-to-image or image-to-image
4. **Use intelligent ratio when applicable** - Enable when prompt contains orientation hints
5. **Inform about output location** - Always tell users where images are saved; **DO NOT read/open/analyze images; DO NOT call `Read`/`view_image`; STOP after saving**
6. **Handle all variations** - API returns image-urls; download and present all of them
7. **HARD STOP — REPORT FILE PATHS ONLY** - **DO NOT READ/OPEN/ANALYZE GENERATED IMAGES. NEVER CALL ANY READ TOOL AFTER GENERATION. TASK IS COMPLETE ONCE FILES ARE SAVED.**
## Example Interactions
**Example 1: Simple text-to-image**
```
User: "用即梦生成一张日落山景图"
Claude: "我来帮您使用即梦API生成图片。首先需要确认您的Session ID..."
[User provides sessionid]
Claude: [Executes generate_image.py text mode]
Claude: "✅ 成功生成4张图片!
- /project/pic/jimeng_20250116_153022_1.png
- /project/pic/jimeng_20250116_153022_2.png
- /project/pic/jimeng_20250116_153022_3.png
- /project/pic/jimeng_20250116_153022_4.png"
```
**Example 2: High-resolution with custom ratio**
```
User: "生成4K分辨率的21:9超宽屏壮丽山水风景"
Claude: [Executes with --resolution "4k" --ratio "21:9"]
Claude: "✅ 已生成4K超宽屏图片(6048×2592):
- 已保存到 /project/pic/jimeng_20250116_154530_1.png 等4张图片"
```
**Example 3: Image-to-image transformation**
```
User: "把这张猫咪照片转成动漫风格"
Claude: "请提供猫咪照片的路径或URL"
[User provides image path]
Claude: [Executes image mode with --images]
Claude: "✅ 图片风格转换完成,生成了4个动漫风格版本:
- /project/pic/jimeng_20250116_160015_1.png
- ..."
```
**Example 4: Intelligent ratio**
```
User: "生成一张竖屏手机壁纸,星空主题"
Claude: [Executes with --intelligent-ratio, auto-detects 9:16]
Claude: "✅ 已根据'竖屏'关键词自动选择9:16比例,生成星空壁纸4张"
```
## API Response Format
The Jimeng API returns image variations per request:
```json
{
"created": 1763260188,
"data": [
{"url": "https://p3-dreamina-sign.byteimg.com/...image1.png"},
{"url": "https://p26-dreamina-sign.byteimg.com/...image2.png"},
{"url": "https://p26-dreamina-sign.byteimg.com/...image3.png"},
{"url": "https://p3-dreamina-sign.byteimg.com/...image4.png"}
]
}
```
All images are automatically downloaded and saved with sequential numbering.
## Security Notes
- ⚠️ **Never hardcode Session IDs** in scripts or skill files
- ⚠️ Session IDs are user-specific credentials; treat them as passwords
- ⚠️ Ensure the local API endpoint is not exposed publicly
- ⚠️ Image URLs from API responses may expire; download immediately
|