Demodokos Foundry is an AI music production studio that runs entirely on your Windows PC. You describe a song in words, and Foundry generates full audio with vocals and instruments. But generation is only the beginning. The real work happens in everything that comes after.
Here is a concrete look at what the software actually does.
Text to music: describe it, hear it
You type a description into the caption builder. Genre, style, voice type, instruments, energy arc. Optionally, you paste lyrics with structure tags like [Verse] and [Chorus]. Hit Generate, and Foundry produces a complete audio file: vocals, instruments, arrangement, mix.
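For illustration, a caption plus tagged lyrics might look like the fragment below. The [Verse] and [Chorus] tags come from the text above; every other detail is a made-up example, not a prescribed format:

```
Caption: melancholic indie-folk ballad, female vocals, acoustic guitar
and soft piano, slow build toward an intimate final chorus

[Verse]
Streetlights hum on an empty road
[Chorus]
Carry me home, carry me home
```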
On a modern NVIDIA GPU with 12 GB VRAM, a full song generates at up to 15x realtime speed. A 3-minute track can be ready in under 20 seconds. You can generate multiple variations in one pass, preview them quickly, and keep the best ones.
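A quick back-of-envelope check of those two figures, using only the numbers stated above:

```python
# Sanity check: at up to 15x realtime, how long does a 3-minute track take?
REALTIME_FACTOR = 15                     # best-case speed on a 12 GB GPU
track_seconds = 3 * 60                   # a 3-minute track
best_case = track_seconds / REALTIME_FACTOR
print(best_case)                         # 12.0 seconds, consistent with "under 20"
```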
You can also lock in specifics. Metadata overrides let you force a particular BPM, key signature, scale, or time signature before generating. Seed handling lets you reproduce a result exactly. These controls matter when you need precision, not just inspiration.
Patch: fix a section, keep the rest
You generated a song and the verse is great, but the chorus falls flat. With Patch, you select just the chorus region and regenerate only that part. The verse stays exactly as it was. No starting over.
This is one of the most useful features in the entire app. It turns AI music from a slot machine into a precision tool. Fix a vocal line, rework a bridge, tighten an intro, all without losing what already works.
Cover / Restyle: transform existing audio
Feed Foundry an existing recording and describe how you want it to sound. The software transforms it into a new version guided by your caption. The original structure is preserved, but the production, genre, and feel can change completely.
Extend: grow an idea into a full arrangement
Have a 30-second loop you like? Select it and use Extend. Foundry continues from that point, creating new material that flows naturally from what came before. Turn a spark into a full song. Write intros, outros, and alternate endings from a single seed section.
Stem separation: pull the layers apart
Foundry can split any mixed audio file into separate tracks: vocals, drums, bass, guitar, piano, other instruments, and karaoke (instrumental without vocals). The separated stems appear as aligned tracks on the timeline, ready to edit or remix.
Use this to remove vocals from a generation, swap the drums from one take with the bass from another, create instrumental versions, or build layered arrangements from multiple AI outputs.
32 DSP effects across 7 groups
Foundry ships with a full post-processing chain. 32 studio-grade effects organized into 7 groups:
- Filters and EQ - 24-band graphic EQ, parametric EQ, sub bass boost
- Modulation and Dynamics - compressors, noise gates, drum punch, bus glue
- Time and Space - reverb (including cathedral and hall presets), delay
- Color and Width - stereo widening, chorus, ensemble
- Retro and Texture - vinyl crackle, tape warmth, tube grit
- Pitch and Voice - auto-tune, voice transformation effects
- Creative and Glitch - granular stretch, multi-glitch, Blend Area for spectral fusion of two tracks
Over 200 built-in effect presets are included. Effects stack and chain freely, apply non-destructively to any track in the timeline, and undo without bouncing.
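As a mental model only (this is not Foundry's API, just a sketch of how non-destructive stacking behaves): the source audio is never rewritten; effects live in an ordered list applied at playback or render time, so undoing one is simply removing it from the list.

```python
# Illustrative model of a non-destructive effect chain. Effect names and
# parameters are hypothetical examples, not Foundry's actual identifiers.
chain = []
chain.append(("parametric_eq", {"freq_hz": 120, "gain_db": 3.0}))
chain.append(("reverb", {"preset": "cathedral", "wet": 0.35}))
chain.append(("tape_warmth", {"drive": 0.5}))

chain.pop()  # "undo" the tape warmth; the underlying track is untouched
print([name for name, _ in chain])  # ['parametric_eq', 'reverb']
```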
On top of that, Quick Effect lets you describe an effect in plain language ("warm hall reverb" or "aggressive tape saturation") and the AI generates a processed version as a new layer track.
Timeline editor: arrange, layer, and mix
The timeline is a multi-track editor built directly into Foundry. Drag clips, snap them to alignment, trim start and end points, apply fades, change speed (pitch-preserving), and adjust volume per track. You can mute, solo, or isolate individual tracks for focused listening.
You can split a track at the playhead, delete a selected segment, insert silence, or merge multiple active tracks into a single rendered bounce. Play-selection and looping let you A/B compare sections in tight iteration.
This means you can take the best verse from generation 1, the best chorus from generation 4, and a Patched bridge from generation 7, align them on one timeline, and export a cohesive finished track. All inside the same application.
Reference cloning: guide the AI with existing audio
Sometimes words are not enough to describe what you want. Reference cloning lets you feed an existing track into the generation process as a guide. Two modes:
- Structure reference transfers the melody contour, rhythmic pattern, and arrangement shape
- Sound reference transfers the timbre, vocal character, and instrument tone
Each mode has an adjustable strength slider. Low strength for subtle inspiration, high strength for direct transformation. You can use both simultaneously.
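Conceptually, the two modes can be pictured as independent settings that coexist in one generation request. The structure below is purely illustrative (field names and values are invented), but it captures the point that each mode carries its own strength and both can be active at once:

```python
# Hypothetical settings sketch, not Foundry's actual configuration format.
reference = {
    "structure": {"source": "guide_track.wav", "strength": 0.8},  # melody, rhythm, arrangement
    "sound":     {"source": "guide_track.wav", "strength": 0.3},  # timbre, vocal character
}
print(sorted(reference))  # ['sound', 'structure']
```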
Creative AI: your co-writer
The Creative Agent is a guided writing assistant built into Foundry. It helps you draft captions and lyrics that are structured, consistent, and sized correctly for your target duration.
You can iterate conversationally: "make it darker," "add a bigger hook," "turn this into a duet." The AI validates the output against generation requirements, catching problems before you burn a generation cycle on a poorly structured prompt.
Multiple intelligence levels are available depending on whether you need a quick draft or a carefully refined creative brief.
Creative AI can handle the entire pipeline, from lyrics and captions to full song arrangements, based on nothing more than your ideas.
Clip Library, projects, and export
The Clip Library lets you save and organize reusable audio segments: hooks, drum patterns, vocal takes, textures. Projects save your full session state, timeline, and settings. Five global Quick Save slots and three per-project slots let you snapshot and restore working configurations instantly.
Export outputs as WAV or FLAC. Full mix, selected region, or individual tracks with all edits (trim, fades, speed) applied. No watermarks, no embedded metadata, no call-home.
Runs entirely on your machine
Everything described above runs locally on your Windows PC. No internet connection required during generation. No data uploaded anywhere. The models live on your hard drive, the compute happens on your GPU, and the output stays on your disk until you choose to share it.
You need an NVIDIA GPU with at least 6 GB of VRAM (the Ultra-VRAM saver mode runs at reduced performance but full quality). 12 GB or more is recommended to run the full Creative AI Agent feature set smoothly, and Foundry can use up to 32 GB of VRAM for maximum performance.