WAYSQUARE.COM | Techno Blog

Building an Automated Story Machine Using n8n and Open-Source Tools

Discover how I built an automated storytelling machine using n8n, Docker, and open-source tools like Kokoro TTS, MinIO, and Stable Diffusion—all running on my home server with zero cost.



How It All Started

One day I asked myself — what if I could automate the process of creating short, animated story videos from scratch? The idea was to use AI to generate the script, voiceover, and images, then combine them into a video — and I wanted it all to run on my home server.

With my Ubuntu server (64GB RAM, NVIDIA RTX 2060 SUPER) and a passion for automation, I dove in, using n8n as the core of the orchestration.

The Goal

I wanted to turn a simple row in a spreadsheet (in my case, Baserow) into a full animated video story — complete with AI-written text, synthesized voice narration, AI-generated illustrations, and FFmpeg-powered video editing — automatically and cost-free.

Core Tools I Used

  • n8n – for building the automation workflows
  • MinIO – local S3-compatible storage
  • Kokoro TTS – blazing fast text-to-speech via GPU
  • Stable Diffusion WebUI – local AI image generation
  • NCA Toolkit – video composition, audio merging, and clip processing
  • Baserow – open-source Airtable alternative

The Workflow in Action

The magic begins with a Baserow table — a single row containing the story topic, desired voice, character names, and number of scenes. From there:

  1. n8n pulls the row marked as pending
  2. It sends the topic to OpenRouter (Gemini Flash) to generate the full story
  3. Then it formats the story text for use in TTS
  4. Kokoro TTS generates the audio in MP3 format locally
  5. n8n uploads that MP3 to MinIO for access by other tools
  6. n8n sends the story to another LLM to generate six image prompts
  7. Each image prompt is sent to Stable Diffusion WebUI (via API)
  8. Base64 images are turned into PNGs and uploaded to MinIO
  9. NCA Toolkit converts each image into a 22-second video clip
  10. All clips are merged and combined with the TTS audio
  11. The final video file is saved, and the Baserow row is marked as done
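Steps 7 and 8 above (prompting Stable Diffusion WebUI, then decoding the base64 response into PNG files) can be sketched in a few lines of Python. This is a minimal sketch rather than the actual n8n nodes: the host, image size, and step count are assumptions, and it presumes the WebUI was launched with the `--api` flag.

```python
import base64
import json
import urllib.request
from pathlib import Path

SD_URL = "http://localhost:7860"  # assumed WebUI address; launch with the --api flag

def txt2img_payload(prompt: str, steps: int = 25) -> dict:
    """Request body for the WebUI /sdapi/v1/txt2img endpoint (size is an assumption)."""
    return {"prompt": prompt, "steps": steps, "width": 768, "height": 512}

def decode_images(response_json: dict) -> list[bytes]:
    """The API returns base64-encoded PNGs in the 'images' list."""
    return [base64.b64decode(img) for img in response_json["images"]]

def generate_scene(prompt: str, out_path: Path) -> Path:
    """POST one prompt to the WebUI and write the first result as a PNG file."""
    req = urllib.request.Request(
        f"{SD_URL}/sdapi/v1/txt2img",
        data=json.dumps(txt2img_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    out_path.write_bytes(decode_images(body)[0])
    return out_path
```

In the real workflow, n8n then uploads each PNG to MinIO so the NCA Toolkit can reach it; here the file is simply written to disk.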

From input to video output — all in under 5 minutes, without spending a dime.
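For reference, the trigger in step 1 amounts to one filtered call against Baserow's REST API. In this hedged sketch, the host, table ID, token, and `Status` field name are all placeholders; the filter syntax relies on `user_field_names=true` being set.

```python
import json
import urllib.parse
import urllib.request

BASEROW_URL = "http://localhost:85"  # placeholder host
TABLE_ID = 123                       # placeholder table id
TOKEN = "baserow-database-token"     # placeholder database token

def pending_rows_url(status_field: str = "Status") -> str:
    """Baserow list-rows endpoint, filtered to rows still marked pending."""
    query = urllib.parse.urlencode({
        "user_field_names": "true",                # filter/return by field name
        f"filter__{status_field}__equal": "pending",
    })
    return f"{BASEROW_URL}/api/database/rows/table/{TABLE_ID}/?{query}"

def fetch_pending() -> list[dict]:
    """Return the list of pending rows from the Baserow table."""
    req = urllib.request.Request(
        pending_rows_url(),
        headers={"Authorization": f"Token {TOKEN}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["results"]
```

In n8n this is a single HTTP Request node; at the end of the run (step 11) a matching PATCH flips the same row's status to done.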

What I Learned

I learned how powerful open-source tools can be when combined. I discovered how to work with APIs in n8n, how to run Stable Diffusion locally with LORA models, and how to manipulate audio, images, and video using FFmpeg.
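The FFmpeg work behind steps 9 through 11 boils down to two commands: looping each still into a fixed-length clip, then joining the clips with the concat demuxer and muxing in the narration. A minimal Python sketch (file names and the frame rate are placeholders; the author's pipeline delegates these calls to NCA Toolkit rather than invoking FFmpeg directly):

```python
import subprocess
from pathlib import Path

def build_image_clip_cmd(image: str, out: str, seconds: int = 22) -> list[str]:
    """ffmpeg command: loop one still image into a fixed-length H.264 clip."""
    return ["ffmpeg", "-y", "-loop", "1", "-i", image,
            "-t", str(seconds), "-c:v", "libx264",
            "-pix_fmt", "yuv420p", "-r", "25", out]

def build_merge_cmd(concat_list: str, audio: str, out: str) -> list[str]:
    """ffmpeg command: join clips via the concat demuxer and add narration."""
    return ["ffmpeg", "-y", "-f", "concat", "-safe", "0", "-i", concat_list,
            "-i", audio, "-c:v", "copy", "-c:a", "aac", "-shortest", out]

def render(images: list[str], audio: str, out: str = "story.mp4") -> None:
    """Turn stills into 22-second clips, then merge them with the TTS audio."""
    clips = []
    for i, img in enumerate(images):
        clip = f"clip_{i}.mp4"
        subprocess.run(build_image_clip_cmd(img, clip), check=True)
        clips.append(clip)
    # The concat demuxer reads a text file listing one clip per line.
    Path("clips.txt").write_text("".join(f"file '{c}'\n" for c in clips))
    subprocess.run(build_merge_cmd("clips.txt", audio, out), check=True)
```

The `-shortest` flag trims the output to the shorter of video and audio, which keeps the final cut aligned with the narration.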

Most importantly — I now have my own storytelling factory that runs on my own terms, hosted completely on my own hardware.

Final Thoughts

If you're a creator or developer with a spare PC and a dream of automation — this is 100% doable. No subscriptions. No paywalls. Just free, powerful technology working together.

The full JSON workflow and setup guide are linked in the description. If you’d like a video walk-through of how to set this up from scratch, let me know — I’d love to help others get started.

Tags: #n8n #Automation #AIStorytelling #OpenSource #Docker #TTS #MinIO #StableDiffusion #VideoCreation



Written by
Wahyu Widagdo
I am a professional telco engineer and blogger.
