End-to-End Video the Agent Runs Itself
How we connected an autonomous execution agent to Remotion, Gemini, Flux, Runway, ElevenLabs, and Suno, so that client video content goes from brief to rendered file with a single human handoff checkpoint.

The Problem
Producing client video content meant manually stitching together Remotion (programmatic video), image generation APIs, voice synthesis, and music generation for every project. Each pipeline was built from scratch, and every API came with its own auth, rate limits, and output format to manage.
The creative work (art direction, script, pacing) was getting crowded out by infrastructure work. Too much of each project's time went to API plumbing and too little to creative decisions.
What We Built
We connected an autonomous execution agent to a unified video pipeline: Remotion for programmatic composition, Gemini and Flux/Replicate for image generation, Runway for motion, ElevenLabs for voice synthesis, and Suno for music. The agent orchestrates all six from a brief, drawing on a library of pre-made fallback decisions for the choices it can make without human input.
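To make the fallback idea concrete, here is a minimal sketch of how such a library could resolve a choice. The keys, default values, and the `resolve` helper are all hypothetical, chosen for illustration rather than taken from the production system.

```python
from typing import Any

# Hypothetical defaults: illustrative keys and values, not the real library.
FALLBACKS: dict[str, Any] = {
    "aspect_ratio": "16:9",
    "voice_style": "neutral",
    "music_mood": "ambient",
}


def resolve(brief: dict[str, Any], key: str) -> Any:
    """Return the brief's value for `key`, else the pre-made fallback."""
    if brief.get(key) is not None:
        return brief[key]
    if key in FALLBACKS:
        return FALLBACKS[key]
    # No brief value and no fallback: not a choice the agent may make alone.
    raise KeyError(f"no fallback for {key!r}; needs human input")


brief = {"aspect_ratio": "9:16", "voice_style": None}
print(resolve(brief, "aspect_ratio"))  # value taken from the brief
print(resolve(brief, "voice_style"))   # value taken from the fallback library
```

Raising on an unknown key, rather than guessing, is one way to keep the agent's autonomy limited to choices that were decided in advance.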
A MOCK_MODE flag lets the pipeline run dry-run tests without burning API credits. A Cowork Execution Protocol defines exactly one moment where human review adds value — the final output check before delivery — and keeps the agent autonomous for everything before that.
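A dry-run flag of this kind might look like the sketch below. Reading MOCK_MODE from an environment variable, the accepted truthy values, and the `synthesize_voice` wrapper with its return shape are assumptions made for illustration; the live ElevenLabs request is deliberately left out.

```python
import os
from typing import Mapping, Optional


def mock_mode_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Assumed convention: MOCK_MODE=1/true/yes turns dry-run mode on."""
    env = os.environ if env is None else env
    return env.get("MOCK_MODE", "").strip().lower() in {"1", "true", "yes"}


def synthesize_voice(script: str, mock: bool) -> dict:
    """Hypothetical wrapper around the voice step of the pipeline."""
    if mock:
        # Dry run: return a placeholder shaped like a real result, no API call.
        return {"path": "mock/voice.mp3", "chars": len(script), "mock": True}
    # The live ElevenLabs request would go here; omitted in this sketch.
    raise NotImplementedError("live call not shown in this sketch")


result = synthesize_voice(
    "Hello from the brief.",
    mock=mock_mode_enabled({"MOCK_MODE": "1"}),
)
print(result["path"])
```

Keeping the mock result in the same shape as a live one lets downstream steps run unchanged during a dry run.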
“Producing client video used to mean stitching four APIs by hand for every project. Now the agent runs the pipeline and I review the output once.”
The System Architecture
An autonomous agent orchestrates six APIs: Remotion (video composition), Gemini and Flux/Replicate (image generation), Runway (motion), ElevenLabs (voice), and Suno (music). A pre-made decision fallback library covers non-critical choices. A MOCK_MODE flag enables dry-run testing. The Cowork Execution Protocol defines a single human handoff checkpoint. The API surface follows an MCP-on-Edge-Functions pattern.
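The control flow implied by this architecture can be sketched as a run that passes through every stage unattended and then stops at one review state. The `Run` type, the stage order, the status names, and the stub stage functions are assumptions for illustration; real stages would call the respective APIs.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Run:
    brief: str
    artifacts: dict[str, str] = field(default_factory=dict)
    status: str = "running"


def stub(name: str) -> Callable[[Run], str]:
    """Hypothetical stage: returns a placeholder path instead of calling an API."""
    return lambda run: f"mock/{name}.out"


# Assumed stage order; each entry stands in for one service in the pipeline.
STAGES = [
    ("images", stub("images")),  # Gemini + Flux/Replicate
    ("motion", stub("motion")),  # Runway
    ("voice", stub("voice")),    # ElevenLabs
    ("music", stub("music")),    # Suno
    ("render", stub("render")),  # Remotion composition
]


def execute(run: Run) -> Run:
    """Run every stage without human input, then stop at the checkpoint."""
    for name, stage in STAGES:
        run.artifacts[name] = stage(run)
    run.status = "awaiting_review"
    return run


def review(run: Run, approved: bool) -> Run:
    """The single human handoff: approve or reject the finished cut."""
    if run.status != "awaiting_review":
        raise ValueError("run is not at the review checkpoint")
    run.status = "delivered" if approved else "rejected"
    return run


run = execute(Run(brief="30-second product teaser"))
print(run.status)
print(review(run, approved=True).status)
```

Modeling the checkpoint as an explicit status makes it hard to deliver a render that has not been reviewed.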
The Results
The agent can now run end-to-end video generation independently. A project that used to require 4–6 hours of manual API work now runs overnight and surfaces a review-ready cut in the morning.
The MOCK_MODE pattern and Execution Protocol have become a template for other autonomous pipelines at Automaton — any multi-API workflow can be built with the same safety rails.