Speak naturally, point your camera, and watch scenes materialize as images, video, and music in real time.
DreamStudio is an AI-powered cinematic story director that transforms voice input and camera feeds into complete multimedia scenes. Using Google's Gemini Live API for real-time scene understanding, it orchestrates three generation models simultaneously: Imagen 4 for still images, Veo 3.1 for video clips, and Lyria 2 for background music.
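The simultaneous orchestration described above could be sketched roughly as follows. This is a minimal illustration and not DreamStudio's actual backend code: `generate_image`, `generate_video`, `generate_music`, `build_scene`, and the `Scene` container are hypothetical names, and the three generator functions are stubs that stand in for the real Imagen, Veo, and Lyria calls, whose client APIs are not shown here.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Scene:
    """Hypothetical container for the three generated assets."""
    image: str
    video: str
    music: str


# The three functions below are placeholder stubs (assumptions for this
# sketch). A real backend would call the image, video, and music models here.
async def generate_image(prompt: str) -> str:
    await asyncio.sleep(0.01)  # stands in for model latency
    return f"image for: {prompt}"


async def generate_video(prompt: str) -> str:
    await asyncio.sleep(0.01)
    return f"video for: {prompt}"


async def generate_music(prompt: str) -> str:
    await asyncio.sleep(0.01)
    return f"music for: {prompt}"


async def build_scene(prompt: str) -> Scene:
    # asyncio.gather runs the three coroutines concurrently and returns
    # their results in the same order as the arguments.
    image, video, music = await asyncio.gather(
        generate_image(prompt),
        generate_video(prompt),
        generate_music(prompt),
    )
    return Scene(image=image, video=video, music=music)


if __name__ == "__main__":
    prompt = "a foggy Victorian street at dusk with gas lamps"
    print(asyncio.run(build_scene(prompt)))
```

Running the generators concurrently means the total wait is roughly that of the slowest model instead of the sum of all three, which matters when video generation is much slower than image generation.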
The system is cross-platform, with a React web app, a Flutter mobile app, and a Python Flask backend. Users describe a scene naturally ("a foggy Victorian street at dusk with gas lamps"), and DreamStudio generates matching visuals and a soundtrack in real time, creating an immersive cinematic experience.