Verticals with AI: What the Lab Learned from an Experiment with GenAI Video Tools for Audio Formats
How do you create vertical videos for an audio-only podcast that contains no visual material? Can AI-generated footage fill this gap? Our Lab put this to the test.
How did you last discover a new podcast? Maybe a friend recommended it – or, more likely, you watched a clip on social media. Podcasts are increasingly discovered through video. Short clips on TikTok, YouTube, or Instagram have become a key gateway to audio content, posing a challenge for formats that were never designed with visuals in mind. Can generative AI help close that gap? That was the question behind our recent experiment at the DW Lab.
From manuscript to vertical video
We experimented with various AI tools to produce vertical videos based on the manuscript of "What the Fax?!", an episode of the Delayland podcast. Delayland (produced by DW Business) explores why Germany has lost its reputation for efficiency and quality, from dilapidated infrastructure and delayed trains to excessive bureaucracy, and asks what the country can learn from others.
Choosing a visual language
One of our first decisions in the Lab project concerned visual style: Is there a good way to illustrate bureaucracy without relying on photorealistic imagery? After consulting internal and external experts and reviewing existing use cases, we opted for non-photorealistic styles ranging from comic-like illustration to painterly, Rembrandt-inspired aesthetics – a deliberate choice to signal that the visuals were symbolic rather than documentary. You can watch one of the unpublished video clips over here.
A fragmented production workflow
The project followed what is currently considered a state-of-the-art workflow for AI video generation: Rather than relying on a single model, we used a multi-stage pipeline combining script refinement, image generation, and animation. Tools tested included Microsoft Copilot, ImagineArt, OpenArt, Midjourney, and Runway; models used included Nano Banana Pro, Veo 3, Kling 3.0, and Seedance 1.5 Pro. The process yielded four unpublished clips of roughly one minute each.
This fragmented setup reflects the current limits of generative video technology: No single model can yet produce consistent, high-quality results across all stages in one step.
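For readers who think in code, the three stages described above can be sketched as a simple chain in which each stage hands its output to the next. This is only an illustrative sketch, not the Lab's actual tooling: the `Shot` class and the functions `refine_script`, `generate_image`, and `animate` are hypothetical stand-ins, and the strings they return are placeholders for what real script, image, and video tools would produce.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Shot:
    """One scene of a vertical clip as it moves through the pipeline."""
    script_text: str      # refined narration for this scene
    image_ref: str = ""   # placeholder for a generated still image
    video_ref: str = ""   # placeholder for the animated clip


def refine_script(manuscript: str) -> List[Shot]:
    """Stage 1 (hypothetical): turn each manuscript paragraph into one shot."""
    paragraphs = [p.strip() for p in manuscript.split("\n\n") if p.strip()]
    return [Shot(script_text=p) for p in paragraphs]


def generate_image(shot: Shot) -> Shot:
    """Stage 2 (hypothetical): stand-in for a call to an image tool."""
    shot.image_ref = f"still-for:{shot.script_text[:20]}"
    return shot


def animate(shot: Shot) -> Shot:
    """Stage 3 (hypothetical): stand-in for a call to a video tool."""
    shot.video_ref = f"clip-from:{shot.image_ref}"
    return shot


def run_pipeline(manuscript: str) -> List[Shot]:
    """Run every shot through image generation, then animation."""
    shots = refine_script(manuscript)
    for stage in (generate_image, animate):
        shots = [stage(s) for s in shots]
    return shots
```

Because each stage only depends on the output of the one before it, a single tool can be swapped out without touching the rest of the chain, which is one reason a multi-stage setup is practical while no single model covers every step.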
The consistency problem
The biggest technical hurdle was consistency. Identical prompts frequently produced different visual outcomes: colors shifted, objects moved, backgrounds changed. While such discrepancies may seem minor in still images, they create continuity errors that are difficult to manage in editing. Producing convincing prototypes was time-consuming and unpredictable. Even with careful prompting, results could not be guaranteed, making it hard to estimate production time or plan for scale.
When is synthetic media in journalism justified?
Throughout the project, we kept returning to the same central question: when is the use of synthetic images or video in journalism justified – and when does it risk undermining credibility? The experiment made clear that each use case needs to be evaluated on its own terms. Does AI provide genuine added value for users that could not otherwise be achieved, or is it being used for novelty's sake? Another important question is how to visually distinguish our work from the growing volume of AI-generated content already circulating online.
To be very clear: We don't have definitive answers yet, but the project has helped sharpen this discussion considerably.
What's next?
Encouraged by largely positive internal feedback, we plan to take the experiment further. In cooperation with our design department, we aim to produce vertical clips for all five existing Delayland episodes and publish them as YouTube Shorts on DW's podcast channel to gather audience feedback. Further experiments with additional use cases are planned, with the goal of refining workflows across editorial and design teams, establishing a coherent visual language, and testing new tools.
An important takeaway: AI-generated video may open new storytelling opportunities. But it also raises questions that journalism is only beginning to confront.
Special thanks to: Samantha Baker, Andreas Becker, Sebastian Katzer, Lynn Khellaf, Marie Kilg, Isabell Lorenzo, Alexander Matthews, Nicolas Martin, Erika Marzano, Annabelle Steffes, Ksenia Skriptchenko, Jasper Steinlein, Anna-Daniela Strina, Sven Windszus, Tasneem Zahra, and everyone else who contributed to this project!
Key visual: Fictitious family practice doctor Anna Logue checks a new fax message (cropped still from an AI-generated video teaser)
