A complete production system for an AI stop-motion short. I built a scene-review web app, a multi-backend render orchestrator, and MCP integrations that drive Unreal Engine, ComfyUI, Nuke, and Blender from one pipeline. The software, the sets, the characters, and the finishing are all mine. It is an AI-hybrid workflow run with the discipline of a real film pipeline, which is the difference between a directed look and slop.
A pipeline I built end to end: a scene-review web app, a render orchestrator that fans work across three backends, and an indexer that promotes finished frames onto the review board.
The front of the pipeline is a scene-review web app I wrote in Python. It renders the script scene by scene, gates which shots are cleared for generation, and carries a pose overlay and a previz 3D viewer so I can block a shot against the actual set before committing a render. A generation gate keeps paid generation locked behind explicit approval, so the expensive engines only run on shots I have already signed off.
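A generation gate of this kind can be sketched in a few lines. This is a minimal illustration, not the app's actual code: `Shot`, `GateError`, and `require_approval` are hypothetical names, and the real gate may track approval differently.

```python
from dataclasses import dataclass
from typing import Optional


class GateError(RuntimeError):
    """Raised when a paid render is requested for an unapproved shot."""


@dataclass
class Shot:
    # Hypothetical shot record; the real app's schema is not shown in the text.
    shot_id: str
    approved_by: Optional[str] = None  # set only by an explicit sign-off

    def approve(self, reviewer: str) -> None:
        self.approved_by = reviewer


def require_approval(shot: Shot, engine_is_paid: bool) -> None:
    """Refuse paid generation until the shot carries a sign-off."""
    if engine_is_paid and shot.approved_by is None:
        raise GateError(f"{shot.shot_id} is not cleared for paid generation")


shot = Shot("JTHM_GOB_315")
require_approval(shot, engine_is_paid=False)  # free local preview: allowed
shot.approve("Ron")
require_approval(shot, engine_is_paid=True)   # allowed only after sign-off
```

Putting the check in one function means every paid backend calls the same gate, so there is a single place to audit.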
Behind it sits a render orchestrator. A supervisor fans work across three backends: RunPod GPU pods, a serverless tier, and a local Klein 9B running on Apple Silicon. An indexer then collects the finished frames, writes provenance for every render, and promotes them back onto the review board. Each engine is reached over an MCP bridge, so the same orchestrator can build a set, pose a character, render a plate, and run a finishing pass without leaving the pipeline.
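The fan-out and indexing steps might look roughly like the sketch below. It assumes a simple round-robin assignment, which the text does not specify; the backend labels, `render`, `fan_out`, and `index` are all hypothetical stand-ins, and `render` returns a fake record instead of calling a real engine.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import cycle

# Hypothetical labels standing in for the three backend tiers.
BACKENDS = ["runpod-pod", "serverless", "local-klein"]


def render(backend: str, shot_id: str) -> dict:
    """Placeholder for a real render call; returns a fake frame record."""
    return {"shot": shot_id, "engine": backend, "frame": f"{shot_id}.png"}


def fan_out(shot_ids: list) -> list:
    """Assign shots to backends round-robin and render them concurrently."""
    assignments = list(zip(cycle(BACKENDS), shot_ids))
    with ThreadPoolExecutor(max_workers=len(BACKENDS)) as pool:
        # pool.map preserves input order, so results line up with shot_ids.
        return list(pool.map(lambda pair: render(*pair), assignments))


def index(frames: list) -> dict:
    """Collect finished frames keyed by shot, keeping the engine that made each."""
    return {frame["shot"]: frame for frame in frames}


board = index(fan_out(["SC01_A", "SC01_B", "SC02_A", "SC02_B"]))
print(board["SC02_B"]["engine"])
```

A real supervisor would likely route on cost, queue depth, or shot priority instead of round-robin; the point here is only the shape: assign, render in parallel, then index the results.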
I did not skip the craft. The project moves through the same stages a normal film does, script to delivery, with a human creative decision at every gate. AI runs inside that structure, not instead of it.
Each shot starts from a composition I lock first, not a blind prompt. A reference frame fixes pose, staging, and camera, then the engine renders the look inside those rails. Same character, same set, shot after shot.
The look does not come from typing a sentence and hoping. I lock the composition first, a reference frame that fixes pose, staging, and camera, then the engine renders the art-directed look inside those rails. The character comes from a fixed Secret Lab library, so identity holds from shot to shot. That is the whole game: reference-guided generation plus a locked frame is what separates a directed image from the random drift people call AI slop.
The cast are real 3D puppets. I designed each from turnarounds, took them to 3D, and finished them in Blender, so the same character holds up from every angle and in every pose.
Here is how the poses work. I design a character as a turnaround sheet, take that sheet to 3D as a real puppet, and finish it in Blender. To stage a shot I pose the puppet, and that posed 3D render becomes the reference the engine finishes from. Because the pose and proportion are baked into an actual model, the character never melts or drifts between frames. It is built, not re-guessed each time.
Every set is built and lit in Unreal Engine 5.8, driven from the pipeline over the MCP bridge. I block the characters in the real set for scale and staging before a frame renders.
Four engines, one orchestrator. Each is driven over an MCP bridge so the pipeline can build, pose, render, and finish without manual handoffs between tools.
Nothing is hand-clicked. Every render is a scripted ComfyUI graph, queued from Python, reviewed on Discord, and re-queued from notes. Reproducible, batchable, versioned.
# hero_pass.py -- one hero shot per scene, queued straight into ComfyUI
def queue_heroes():
    for num in sorted(SCENES):
        slug, slots = SCENES[num]
        label, template = slots[0]  # slot 1 = the hero prompt
        subj = CAST[(num * 7) % len(CAST)]  # spread casting across scenes
        prompt = template.replace("{SUBJ}", subj) + PERFORMANCE
        seed = random.randint(0, 2**48)
        graph = build_graph(num, slug, prompt, seed, 1, "HERO", cast_slug(subj))
        graph["14"]["inputs"]["filename_prefix"] = f"heroes/scene-{num:02d}-{slug}"
        # POST the graph to the ComfyUI server, then let the review loop take over
        post(SERVER + "/prompt", {"prompt": graph})
Each render is a graph built in code, seeded, labeled, and posted to the ComfyUI server. The cast, look LoRA, performance tags, and camera all come from shared definitions, so a whole scene queues consistently. I review the heroes on Discord and reply with direction, and a reaction watcher re-queues the takes. It is a pipeline, not a prompt box.
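The `post` helper in the excerpt is not shown, so here is one plausible shape for it using only the standard library. It is a sketch under assumptions: the `SERVER` address is ComfyUI's usual local default and may not match the real setup, and `build_request` is a hypothetical helper split out so the request can be inspected without a running server.

```python
import json
import urllib.request

# Assumed address: ComfyUI's usual local default, not confirmed by the source.
SERVER = "http://127.0.0.1:8188"


def build_request(url: str, payload: dict) -> urllib.request.Request:
    """Encode the payload as JSON and wrap it in a POST request."""
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post(url: str, payload: dict) -> dict:
    """Send the request and decode the server's JSON reply."""
    with urllib.request.urlopen(build_request(url, payload)) as resp:
        return json.loads(resp.read())


# Inspect the request without sending it; the node id "14" mirrors the excerpt.
req = build_request(SERVER + "/prompt", {"prompt": {"14": {"inputs": {}}}})
print(req.get_method(), req.full_url)
```

Separating `build_request` from `post` keeps the network call in one place, which makes the queueing code easy to dry-run when no ComfyUI server is available.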
No slop means no mystery frames. Every render writes a provenance record: the engine, the model, the references, the cost, a content hash, and who signed off. I can prove how any frame was made.
{
  "deliverable": "JTHM_GOB_315",
  "content_hash_sha256": "3565825515e6425c69f6...9027f99",
  "engine_summary": ["serverless-klein"],
  "license_summary": ["AI-generated, reference-guided from Secret Lab library chars"],
  "lineage": {
    "kind": "render",
    "shot": "JTHM_GOB_315",
    "events": [{
      "engine": "serverless-klein",
      "provider": "nano-banana-pro",
      "software": "ComfyUI serverless",
      "author": "Ron",
      "cost_usd": 0.04
    }]
  },
  "certified_by": "Ron Rauch"
}
This is the actual record written for one shot. Every frame in the project gets one: a content hash to identify it, the exact engine and provider that made it, the references it was guided by, the cost, and a signature. It plugs into a copyright-validity check, so the work is auditable end to end. Tracking like this is the opposite of slop. I can always say how a frame was made.
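Writing and checking a record like this takes little code. The sketch below is illustrative only: `content_hash`, `provenance_record`, and `verify` are hypothetical names, the field set is a subset of the record shown, and the sample bytes stand in for a real frame file.

```python
import hashlib
import json


def content_hash(data: bytes) -> str:
    """SHA-256 of the frame bytes, as a hex string."""
    return hashlib.sha256(data).hexdigest()


def provenance_record(shot, frame_bytes, engine, provider, cost_usd, certified_by):
    """Build a record with the same top-level shape as the example above."""
    return {
        "deliverable": shot,
        "content_hash_sha256": content_hash(frame_bytes),
        "engine_summary": [engine],
        "lineage": {
            "kind": "render",
            "shot": shot,
            "events": [
                {"engine": engine, "provider": provider, "cost_usd": cost_usd}
            ],
        },
        "certified_by": certified_by,
    }


def verify(record: dict, frame_bytes: bytes) -> bool:
    """Check that a frame still matches the hash stored in its record."""
    return record["content_hash_sha256"] == content_hash(frame_bytes)


frame = b"placeholder frame bytes"  # stand-in for the rendered image file
record = provenance_record(
    "JTHM_GOB_315", frame, "serverless-klein", "nano-banana-pro", 0.04, "Ron Rauch"
)
print(json.dumps(record, indent=2))
print(verify(record, frame))
```

Because the hash is computed from the file contents, any later edit to a frame makes `verify` fail, which is what lets a record vouch for one specific render.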
Raw renders come off the engines flat. A headless Nuke pass grades them, adds grain and a vignette, and stamps the panel border.
Available for editing, supervising, and tool-building consulting.
Contact