Writing scripts for AI video — what changes
A prompt for AI is not a script. It is a technical brief in which every word carries weight. This guide covers what to write and how to break it down.
In normal filmmaking the director reads the script and makes interpretive choices. AI reads literally. That changes the text you write.
Single scene prompt structure
- Subject — what is in the frame: "30-year-old woman in red coat"
- Action — what they do: "walks down a narrow street, glances over her shoulder"
- Setting — where: "autumn Budapest, wet cobblestones, street lamps"
- Camera — how it is shot: "medium shot, slow tracking, slightly handheld, eye level"
- Lighting — how the scene is lit: "golden hour, warm rim light, soft shadows"
- Style — overall aesthetic: "35mm film, Kodak Portra, slight grain"
- Negative — what should NOT appear: "no text, no logos, no people in background"
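The seven fields above can be kept as structured data and rendered into prompt text only at the last step. The following is a minimal sketch, not any tool's required format: the `ScenePrompt` class, its `render` method, and the "Label: value" layout are all assumptions made for illustration.

```python
from dataclasses import dataclass, asdict


@dataclass
class ScenePrompt:
    """One scene brief (hypothetical helper, one field per item in the list above)."""
    subject: str
    action: str
    setting: str
    camera: str
    lighting: str
    style: str
    negative: str = ""

    def render(self) -> str:
        # asdict() preserves field declaration order, so every scene
        # is rendered with the same layout. Empty fields are skipped.
        lines = [
            f"{name.capitalize()}: {value.strip()}"
            for name, value in asdict(self).items()
            if value.strip()
        ]
        return "\n".join(lines)


scene = ScenePrompt(
    subject="30-year-old woman in red coat",
    action="walks down a narrow street, glances over her shoulder",
    setting="autumn Budapest, wet cobblestones, street lamps",
    camera="medium shot, slow tracking, slightly handheld, eye level",
    lighting="golden hour, warm rim light, soft shadows",
    style="35mm film, Kodak Portra, slight grain",
    negative="no text, no logos, no people in background",
)
prompt_text = scene.render()
print(prompt_text)
```

Keeping the fields separate makes it easy to change one element, for example the lighting, without retyping the rest of the brief.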
Decomposition
One long prompt for a 30-second clip will not work: AI loses coherence past roughly 5-8 seconds. Instead, write scenes of about 5 seconds each and stitch them together in an editor.
- Each scene = its own prompt
- Continuity description between scenes: "same outfit", "same lighting"
- If a character appears across scenes, use a character reference image (available in Sora 2, Runway, Kling)
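The decomposition can be sketched in a few lines: cut the target duration into short windows, then repeat the continuity notes in every scene after the first. The function names `split_duration` and `add_continuity` and the "Continuity:" wording are hypothetical; the 5-second default follows the figure used above.

```python
def split_duration(total_seconds: float,
                   max_scene_seconds: float = 5.0) -> list[tuple[float, float]]:
    """Cut a clip into consecutive (start, end) windows of at most max_scene_seconds."""
    if total_seconds <= 0 or max_scene_seconds <= 0:
        raise ValueError("durations must be positive")
    windows = []
    start = 0.0
    while start < total_seconds:
        end = min(start + max_scene_seconds, total_seconds)
        windows.append((start, end))
        start = end
    return windows


def add_continuity(scene_prompts: list[str],
                   continuity_notes: list[str]) -> list[str]:
    """Append the continuity notes to every scene prompt except the first."""
    suffix = ", ".join(continuity_notes)
    result = []
    for index, prompt in enumerate(scene_prompts):
        if index == 0 or not suffix:
            result.append(prompt)
        else:
            result.append(f"{prompt} Continuity: {suffix}.")
    return result


windows = split_duration(30)  # a 30-second clip in 5-second scenes
prompts = add_continuity(
    ["Woman walks down a narrow street.", "She stops at a street lamp."],
    ["same outfit", "same lighting"],
)
print(len(windows), windows[-1])
print(prompts[1])
```

Each window then gets its own prompt, and the continuity notes travel with every scene instead of relying on the model to remember the previous one.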
What you don't write in normal scripts but AI needs
- Exact timing in seconds (1.5-second turn, 2-second pause)
- Colors of clothing and props, spelled out explicitly
- Camera type and lens, since both affect the style
- Era and time period — otherwise AI defaults to the 2020s
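Exact timing is easier to keep consistent when the beats are written as durations and the timestamps are computed. This is a minimal sketch under assumptions: the helper `render_beats`, the "start-end s" notation, and the example beats are illustrative, and whether a given model honors such timestamps is not guaranteed.

```python
def render_beats(beats: list[tuple[float, str]], scene_seconds: float) -> str:
    """Turn (duration, description) beats into a timed line such as '0.0-1.5s: ...'."""
    total = sum(duration for duration, _ in beats)
    if total > scene_seconds:
        raise ValueError(f"beats take {total}s but the scene is {scene_seconds}s")
    parts = []
    start = 0.0
    for duration, description in beats:
        end = start + duration
        parts.append(f"{start:.1f}-{end:.1f}s: {description}")
        start = end
    return "; ".join(parts)


timing_line = render_beats(
    [(1.5, "she turns her head"), (2.0, "she pauses")],
    scene_seconds=5.0,
)
print(timing_line)
# 0.0-1.5s: she turns her head; 1.5-3.5s: she pauses
```

The length check catches the common mistake of scripting more action than a short scene can hold.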
Practical workflow
- Start with a normal script, as if for a director of photography (DP)
- Break into 3-8 second scenes
- Convert each scene into a technical prompt (ChatGPT or Claude can structure it)
- Generate 3-5 variants per scene, pick the best
- Stitch in Premiere/Final Cut/CapCut
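The generation step of this workflow can be planned as a simple job list before any tool is opened. The sketch below calls no generation API; `build_render_plan`, the dictionary keys, and the file naming scheme are hypothetical and only help keep track of which variant belongs to which scene.

```python
def build_render_plan(scene_prompts: list[str],
                      variants_per_scene: int = 3) -> list[dict]:
    """List one job per scene and variant, with a file name for later editing."""
    if variants_per_scene < 1:
        raise ValueError("need at least one variant per scene")
    plan = []
    for scene_no, prompt in enumerate(scene_prompts, start=1):
        for variant_no in range(1, variants_per_scene + 1):
            plan.append({
                "scene": scene_no,
                "variant": variant_no,
                "prompt": prompt,
                # Hypothetical naming scheme so clips sort correctly in the editor.
                "output_file": f"scene{scene_no:02d}_v{variant_no}.mp4",
            })
    return plan


plan = build_render_plan(
    ["Woman walks down a narrow street.", "She stops at a street lamp."],
    variants_per_scene=3,
)
for job in plan:
    print(job["output_file"], "<-", job["prompt"])
```

Picking the best variant stays a manual step; the plan only ensures every scene gets the same number of attempts and that the chosen files are easy to find when stitching.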