Generate a single photoreal or designed image with OpenAI gpt-image via fal.ai. Supports gpt-image-1 (default, fixed sizes — the FAL fallback for Higgsfield's `gpt_image_2`) and gpt-image-2 (`openai/gpt-image-2`, custom output sizes up to 3840px). Routes to text-to-image or the edit variant depending on whether a reference image is provided. Use for photoreal character anchors, scene keyframes, and designed sheets (e.g. storyboards) where precise layout and legible text matter.
npx gooseworks install --claude # Then in your agent: /gooseworks <prompt> --skill create-image-gpt-image-fal
Generate one image via fal.ai's OpenAI gpt-image endpoints. Two model families are supported through a single --model flag:
gpt-image-1 (default): fal-ai/gpt-image-1. The FAL fallback for Higgsfield's gpt_image_2. Fixed output sizes only. Used by:

- video-orchestrator/lock-character Phase 0 (anchor portrait) and Phase 1 (angle keyframes via /edit)
- video-orchestrator/create-clips Phase 1 for photoreal scenes
- generate_with_fallback.py router on Higgsfield failure

gpt-image-2: openai/gpt-image-2. The newer model; accepts custom output sizes (any multiple of 16, up to 3840px) and renders dense text/layouts well. Used for designed sheets such as ad storyboards (create-storyboard-sheets-fal).

The default stays gpt-image-1 so existing callers and the lock-character anchor-parity contract are unaffected. Opt into the newer model with --model gpt-image-2.
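The routing between model families and variants could look roughly like the sketch below. It is not the code in generate.py: the function name `resolve_endpoint` is hypothetical, and joining the model id with an `/edit` or `/text-to-image` suffix is an assumption based on how this document names the variants, so the real endpoint paths may differ.

```python
from typing import Sequence

# Model ids as named in this document.
MODEL_IDS = {
    "gpt-image-1": "fal-ai/gpt-image-1",
    "gpt-image-2": "openai/gpt-image-2",
}
DEFAULT_MODEL = "gpt-image-1"


def resolve_endpoint(model: str = DEFAULT_MODEL, ref_images: Sequence[str] = ()) -> str:
    """Pick the endpoint for a --model value and an optional list of reference images."""
    if model not in MODEL_IDS:
        raise ValueError(f"--model must be one of {sorted(MODEL_IDS)}, got {model!r}")
    # ASSUMPTION: the variant is appended to the model id as a path suffix.
    variant = "edit" if ref_images else "text-to-image"
    return f"{MODEL_IDS[model]}/{variant}"
```

With no arguments the sketch falls back to gpt-image-1 text-to-image, which mirrors the default described above; passing any reference image switches to the edit variant.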
The script defaults to --quality medium; pass --quality high for finals.
Required:
- --prompt: text prompt. A verbatim character descriptor block goes here for character work.
- --output: local PNG destination.

Optional:

- --model: gpt-image-1 (default) or gpt-image-2.
- --aspect-ratio: 9:16 (default), 16:9, 1:1, 2:3, 3:2. gpt-image-2 also accepts 3:4, 4:3, 4:5. Used when --image-size is not given.
- --image-size: explicit WIDTHxHEIGHT (e.g. 1728x2304). gpt-image-2 only; values are rounded to multiples of 16 and capped at 3840px. On gpt-image-1 a custom size is ignored with a warning and the aspect-ratio mapping is used instead.
- --quality: low | medium | high (default medium).
- --ref-image: local image path. Repeatable: pass --ref-image PATH --ref-image PATH to send multiple refs (e.g. identity + style). When present, routes to the model's /edit variant so the model can match the references (used for character angle gens off an anchor, and for style-anchor + identity-anchor composition). Order matters: pass identity (character) first, then style refs.
- --with-logs: stream fal queue logs.

Credentials:
FAL_API_KEY (or FAL_KEY) in .env.
test -n "$FAL_API_KEY" || test -n "$FAL_KEY" || { echo "Missing FAL_API_KEY / FAL_KEY in .env"; exit 1; }
python3 -c "import fal_client" || pip3 install fal_client
# Text-to-image, default model (gpt-image-1)
python3 skills/ads/capabilities/create-image-gpt-image-fal/scripts/generate.py \
--prompt "..." \
--output /path/to/anchor.png \
--aspect-ratio 9:16 \
--quality medium
# Edit-from-reference (anchor -> angle), default model
python3 .../generate.py \
--prompt "..." \
--output /path/to/angle-3q-left.png \
--ref-image /path/to/anchor.png \
--aspect-ratio 9:16
# gpt-image-2 with a custom output size (e.g. a designed storyboard sheet)
python3 .../generate.py \
--prompt "..." \
--output /path/to/storyboard.png \
--model gpt-image-2 \
--image-size 1728x2304 \
--quality high
The script:
1. Loads the FAL key via fal_helpers.load_fal_key().
2. Resolves the model (--model) and output size (--image-size if given and supported, else the aspect-ratio mapping).
3. If --ref-image flags are set, uploads each to fal storage and routes to the model's /edit variant with image_urls=[url1, url2, ...]. Otherwise routes to the /text-to-image variant.
4. Calls fal_client.subscribe(model, payload) via fal_helpers.subscribe.
5. Saves the image to --output.
6. Writes <output>.meta.json with gateway, model id, model_family, request, and cost.

- <output_path>: PNG (≥ 1 KB).
- <output_path>.meta.json: request + result metadata + cost, including model_family (gpt-image-1 or gpt-image-2).

- meta.json includes gateway: "fal", the resolved model id, model_family, image_size, and quality.
- WIDTHxHEIGHT.
- Text in generated images can come out misspelled or autocorrected: "ffmpeg" → "ffmmg"; "klarify" → "clarify"; "therapists" → "therapits". Use PIL or ffmpeg drawtext for any overlay containing readable text. Reserve image gen for purely visual content (characters, scenes, backgrounds). Repeats LEARNINGS L4.

| Symptom | Likely cause | Fix |
|---|---|---|
| 401 Unauthorized | Bad FAL key | Verify FAL_API_KEY / FAL_KEY in .env. |
| 429 Too Many Requests | RPS limit | Drop concurrency to 2-3. |
| Custom size ignored | --image-size passed with --model gpt-image-1 | gpt-image-1 only supports fixed sizes; use --model gpt-image-2 for custom sizes. |
| Aspect-ratio drift (gpt-image-1) | gpt-image-1 only supports 1024x1024, 1024x1536, 1536x1024 | The script maps aspect ratios to these internally. |
| Size rejected (gpt-image-2) | Dimension not a multiple of 16, or > 3840px | The script rounds each dimension to a multiple of 16 and caps it at 3840px; if the size is still rejected, pass a smaller one. |
| Anchor reference ignored | /text-to-image variant doesn't accept refs | Pass --ref-image to force the /edit variant. |
| Skin / face looks "AI-stock" | gpt-image's failure mode | Add anti-AI cues to the prompt: "natural skin texture with pores, slight asymmetry, no perfect teeth". |
When this atom generates a character anchor (lock-character Phase 0), the anchor approved here MUST be pinned for all downstream angle gens, and the same --model must be used for those angle gens. Mixing model families (or mixing FAL-gpt-image with Higgsfield-gpt_image_2) introduces aesthetic drift. The orchestrator's generate_with_fallback.py inherits gateway/model_family from the anchor's .meta.json for subsequent calls.
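A minimal sketch of how a caller might enforce this parity rule by reading the anchor's metadata file. The helper names (`meta_path_for`, `inherit_from_anchor`, `check_parity`) are hypothetical and this is not the logic in generate_with_fallback.py; the only things taken from this document are the `<output>.meta.json` naming and the `gateway` and `model_family` keys.

```python
import json
from pathlib import Path


def meta_path_for(image_path) -> Path:
    """The metadata file sits next to the image: <output>.meta.json."""
    return Path(str(image_path) + ".meta.json")


def inherit_from_anchor(anchor_path) -> dict:
    """Read the gateway and model_family that produced the approved anchor."""
    meta = json.loads(meta_path_for(anchor_path).read_text(encoding="utf-8"))
    return {"gateway": meta["gateway"], "model_family": meta["model_family"]}


def check_parity(anchor_path, requested_model: str) -> dict:
    """Refuse an angle gen whose --model differs from the anchor's model family."""
    pinned = inherit_from_anchor(anchor_path)
    if pinned["model_family"] != requested_model:
        raise ValueError(
            f"anchor was generated with {pinned['model_family']}, "
            f"but this call requested {requested_model}"
        )
    return pinned
```

Failing loudly on a mismatch, rather than silently switching models, keeps the aesthetic drift described above from creeping into a character set unnoticed.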
- mcp__higgsfield__generate_image with model="gpt_image_2"
- scripts/fal_helpers.py (vendored alongside generate.py, self-contained)
- create-storyboard-sheets-fal (video flow, in the separate ads-video repo)

QC gate for a generated static ad image: verify the file opens, matches the requested dimensions, shows the correct product/subject (right shape, colour, label, logo), and has no garbled text or severe artifacts. Records pass/fail/needs-human in verification.md. Used as the final check in the static ad remix flow before shipping.
Recreate a static graphic ad (Pinterest pin, IG/FB feed image, poster) from a reference image, swapping in a new brand's product and new copy while keeping the reference's layout, composition, and visual energy. ALWAYS generated with GPT Image 2 in edit-the-reference mode (fal-ai/gpt-image-1/edit-image, a billed FAL generation); the HTML/goose-graphics overlay is only an optional text-finishing step, never the generator. The static-graphics counterpart to the video remix-ad skill; this is what the app calls when a user picks a reference ad and wants it for their own product.
Given the path to a finished content-goose ad-run folder, extract everything that defines that ad — recipe shot list, VO script, characters, voices, world, atom-skills, master mp4 — and emit a `source-sample.json` in the exact shape the `upload-ad-sample` skill writes to the Goose Ads library. Also links every character and voice to the central character library at `/Users/akhil/projects/content-goose/assets/character-library/`, and if a character isn't in the library yet, adds it first then links. Use when the user wants to remix one of their existing ads — this skill produces the source JSON that the script-rewriting step and `remix-ad` consume.