feat: audio media support — fal adapters, ElevenLabs TTS/music/SFX/STT, Gemini Lyria + 3.1 Flash TTS #463

Draft
tombeckenham wants to merge 9 commits into TanStack:main from tombeckenham:328-fal-audio-and-speech-support
Contributor

tombeckenham commented Apr 17, 2026

Summary

Closes #328. Rebuilt on top of current main.

Adapters

  • @tanstack/ai-gemini:
    • New generateAudio adapter for Lyria 3 Pro / Clip music generation.
    • New Gemini 3.1 Flash TTS Preview model with multi-speaker dialogue support in the existing geminiSpeech adapter.
  • @tanstack/ai-elevenlabs: adds Speech, Music, Sound Effects, and Transcription adapters alongside the existing realtime voice (tree-shakeable).
  • @tanstack/ai-fal: new falSpeech, falTranscription, falAudio adapters — fal now covers image, video, audio, speech, and transcription.
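To make the multi-speaker dialogue support easier to picture, here is a minimal sketch of how a list of speakers could be turned into a speech config. It does not use the adapter code from this PR. The nested field names (`multiSpeakerVoiceConfig`, `speakerVoiceConfigs`, `prebuiltVoiceConfig`) are an assumption modelled on the Gemini API's multi-speaker speech config, and `SpeakerVoice`, `buildMultiSpeakerConfig`, and `formatDialogue` are hypothetical helper names; the geminiSpeech adapter may expose this differently.

```typescript
// Hypothetical helper types; not taken from @tanstack/ai-gemini.
interface SpeakerVoice {
  speaker: string
  voiceName: string
}

// Assumed shape, modelled on the Gemini API's multi-speaker speech config.
interface MultiSpeakerSpeechConfig {
  multiSpeakerVoiceConfig: {
    speakerVoiceConfigs: Array<{
      speaker: string
      voiceConfig: { prebuiltVoiceConfig: { voiceName: string } }
    }>
  }
}

/** Map each named speaker to a prebuilt voice, rejecting empty or duplicate input. */
function buildMultiSpeakerConfig(
  speakers: ReadonlyArray<SpeakerVoice>,
): MultiSpeakerSpeechConfig {
  if (speakers.length === 0) {
    throw new Error('At least one speaker is required')
  }
  const names = new Set(speakers.map((s) => s.speaker))
  if (names.size !== speakers.length) {
    throw new Error('Speaker names must be unique')
  }
  return {
    multiSpeakerVoiceConfig: {
      speakerVoiceConfigs: speakers.map((s) => ({
        speaker: s.speaker,
        voiceConfig: { prebuiltVoiceConfig: { voiceName: s.voiceName } },
      })),
    },
  }
}

/** Render dialogue turns as "Speaker: text" lines so names match the config. */
function formatDialogue(
  turns: ReadonlyArray<{ speaker: string; text: string }>,
): string {
  return turns.map((t) => `${t.speaker}: ${t.text}`).join('\n')
}
```

The speaker names in the prompt text have to match the names in the config, which is why the sketch keeps both helpers side by side.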

Core @tanstack/ai

  • New generateAudio activity for music and sound-effect generation, with matching AudioAdapter base class and devtools events.
  • generateSpeech / generateTranscription activity generics tightened to TTSProviderOptions<TAdapter> / TranscriptionProviderOptions<TAdapter> so typed provider options flow through.
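As a rough illustration of the two points above, the sketch below shows one way an adapter base class, an activity function, and adapter-derived provider options can fit together. It is self-contained and does not import from @tanstack/ai. The names `generateAudio` and `AudioAdapter` come from this PR, but every signature here is an assumption, and `AudioProviderOptions`, `AudioEvent`, the event names, and `FakeMusicAdapter` are hypothetical.

```typescript
// Hypothetical sketch: signatures are assumed, not copied from @tanstack/ai.

/** Events a devtools listener might receive (illustrative shape only). */
type AudioEvent =
  | { type: 'audio:started'; model: string; prompt: string }
  | { type: 'audio:finished'; model: string; byteLength: number }

interface AudioResult {
  audio: Uint8Array
  mimeType: string
}

/** Base class: each adapter declares its own provider-option type. */
abstract class AudioAdapter<TOptions> {
  readonly model: string
  constructor(model: string) {
    this.model = model
  }
  abstract generate(prompt: string, options: TOptions): Promise<AudioResult>
}

/** Recover the option type from the adapter so it flows to the call site. */
type AudioProviderOptions<TAdapter> =
  TAdapter extends AudioAdapter<infer TOptions> ? TOptions : never

/** Activity: runs the adapter and reports start/finish events. */
async function generateAudio<TAdapter extends AudioAdapter<any>>(args: {
  adapter: TAdapter
  prompt: string
  providerOptions: AudioProviderOptions<TAdapter>
  onEvent?: (event: AudioEvent) => void
}): Promise<AudioResult> {
  const { adapter, prompt, providerOptions, onEvent } = args
  onEvent?.({ type: 'audio:started', model: adapter.model, prompt })
  const result = await adapter.generate(prompt, providerOptions)
  onEvent?.({
    type: 'audio:finished',
    model: adapter.model,
    byteLength: result.audio.byteLength,
  })
  return result
}

/** Fake adapter standing in for a real provider; returns zeroed bytes. */
class FakeMusicAdapter extends AudioAdapter<{ durationSeconds?: number }> {
  async generate(
    prompt: string,
    options: { durationSeconds?: number },
  ): Promise<AudioResult> {
    const seconds = options.durationSeconds ?? 1
    // Four placeholder bytes per second; real adapters return encoded audio.
    return { audio: new Uint8Array(seconds * 4), mimeType: 'audio/wav' }
  }
}
```

With this shape, `providerOptions: { durationSeconds: 2 }` is accepted for `FakeMusicAdapter`, while an unrelated key such as `tempo` should be rejected by the type checker, which is the benefit the tightened generics are meant to give.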

Docs

  • New media/audio-generation.md guide — leads with Gemini (Lyria) and ElevenLabs (music/SFX), then fal.
  • media/text-to-speech.md and media/transcription.md gain fal sections.
  • docs/adapters/fal.md expanded with TTS, transcription, and audio sections plus a full model table.

Examples

  • ts-react-chat TTS and transcription pages now have provider tabs (OpenAI, ElevenLabs, Gemini, Fal for TTS; OpenAI, ElevenLabs, Fal for transcription).
  • New /generations/audio page covering ElevenLabs Music/SFX, Gemini Lyria, and fal audio generation.
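The provider tabs differ per page (Gemini appears for TTS but not for transcription), so a typed lookup is one way to keep the two lists from drifting. This is only a sketch of that idea, not the code in ts-react-chat; `PROVIDER_TABS`, `ProviderFor`, and `resolveTab` are hypothetical names, and the lowercase provider ids are placeholders for whatever the example actually uses.

```typescript
// Hypothetical registry; the provider lists follow the PR description.
const PROVIDER_TABS = {
  tts: ['openai', 'elevenlabs', 'gemini', 'fal'],
  transcription: ['openai', 'elevenlabs', 'fal'],
} as const

type Activity = keyof typeof PROVIDER_TABS
type ProviderFor<A extends Activity> = (typeof PROVIDER_TABS)[A][number]

/** Return the requested tab if the activity supports it, else the first tab. */
function resolveTab<A extends Activity>(
  activity: A,
  requested: string | undefined,
): ProviderFor<A> {
  const tabs: ReadonlyArray<string> = PROVIDER_TABS[activity]
  const chosen =
    requested !== undefined && tabs.includes(requested) ? requested : tabs[0]
  return chosen as ProviderFor<A>
}
```

For example, `resolveTab('transcription', 'gemini')` falls back to the first transcription tab because Gemini is not offered there.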

Tests

  • Unit tests for every new adapter (Gemini audio + TTS multi-speaker; ElevenLabs speech/music/SFX/transcription; fal speech/transcription/audio).
  • E2E audio-gen feature wired into the harness with adapter factory, API route, AudioGenUI, fixture, and Playwright spec.

Test plan

  • pnpm test:lib (all affected packages pass)
  • pnpm test:types (all affected packages pass)
  • pnpm test:eslint (all affected packages pass)
  • pnpm test:docs (no broken links)
  • pnpm --filter @tanstack/ai-e2e test:e2e -- --grep audio-gen
  • Manual smoke: open each generations page in ts-react-chat dev server and verify tab switching + audio playback per provider

🤖 Generated with Claude Code

Contributor

coderabbitai bot commented Apr 17, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f8f72758-ea0e-4508-b38b-cbd33830d7d6

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


tombeckenham and others added 4 commits April 17, 2026 19:06
Adds falSpeech, falTranscription, and falAudio adapters to @tanstack/ai-fal,
completing fal's media coverage alongside image and video. Introduces a new
generateAudio activity in @tanstack/ai for music and sound-effect generation,
with matching devtools events and types.

Closes TanStack#328

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

…Lyria + 3.1 Flash TTS

Extends @tanstack/ai-elevenlabs (which already covers realtime voice) with
Speech, Music, Sound Effects, and Transcription adapters, each tree-shakeable
under its own import.

Adds Gemini Lyria 3 Pro / Clip music generation via a new generateAudio
adapter, plus the new Gemini 3.1 Flash TTS Preview model with multi-speaker
dialogue support.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

Adds a new Audio Generation page, expands the fal adapter reference with
sections for text-to-speech, transcription, and audio/music, and adds fal
sections to the Text-to-Speech and Transcription guides.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

Expand the ts-react-chat example with provider tabs for OpenAI,
ElevenLabs, Gemini, and Fal on the TTS and transcription pages, plus a
new /generations/audio page covering ElevenLabs Music, ElevenLabs SFX,
Gemini Lyria, and Fal audio generation.

Add a Gemini TTS unit test and wire an audio-gen feature into the E2E
harness (adapter factory, API route, UI, fixture, and Playwright spec).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
tombeckenham force-pushed the 328-fal-audio-and-speech-support branch from 94ada28 to 068ca0d April 17, 2026 09:22
tombeckenham changed the title from "feat: audio/tts example pages and tests across providers" to "feat: audio media support — fal adapters, ElevenLabs TTS/music/SFX/STT, Gemini Lyria + 3.1 Flash TTS" Apr 17, 2026

nx-cloud bot commented Apr 17, 2026

View your CI Pipeline Execution ↗ for commit 2050d88

| Command | Status | Duration | Result |
| --- | --- | --- | --- |
| nx run-many --targets=build --exclude=examples/** | ✅ Succeeded | 1m 40s | View ↗ |

☁️ Nx Cloud last updated this comment at 2026-04-17 09:58:02 UTC

Reorder the Audio Generation page so the direct Gemini (Lyria) and
ElevenLabs (music/sfx) adapters appear before fal.ai, and update the
environment variables + result-shape notes to cover all three providers.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

pkg-pr-new bot commented Apr 17, 2026

@tanstack/ai

npm i https://pkg.pr.new/@tanstack/ai@463

@tanstack/ai-anthropic

npm i https://pkg.pr.new/@tanstack/ai-anthropic@463

@tanstack/ai-client

npm i https://pkg.pr.new/@tanstack/ai-client@463

@tanstack/ai-code-mode

npm i https://pkg.pr.new/@tanstack/ai-code-mode@463

@tanstack/ai-code-mode-skills

npm i https://pkg.pr.new/@tanstack/ai-code-mode-skills@463

@tanstack/ai-devtools-core

npm i https://pkg.pr.new/@tanstack/ai-devtools-core@463

@tanstack/ai-elevenlabs

npm i https://pkg.pr.new/@tanstack/ai-elevenlabs@463

@tanstack/ai-event-client

npm i https://pkg.pr.new/@tanstack/ai-event-client@463

@tanstack/ai-fal

npm i https://pkg.pr.new/@tanstack/ai-fal@463

@tanstack/ai-gemini

npm i https://pkg.pr.new/@tanstack/ai-gemini@463

@tanstack/ai-grok

npm i https://pkg.pr.new/@tanstack/ai-grok@463

@tanstack/ai-groq

npm i https://pkg.pr.new/@tanstack/ai-groq@463

@tanstack/ai-isolate-cloudflare

npm i https://pkg.pr.new/@tanstack/ai-isolate-cloudflare@463

@tanstack/ai-isolate-node

npm i https://pkg.pr.new/@tanstack/ai-isolate-node@463

@tanstack/ai-isolate-quickjs

npm i https://pkg.pr.new/@tanstack/ai-isolate-quickjs@463

@tanstack/ai-ollama

npm i https://pkg.pr.new/@tanstack/ai-ollama@463

@tanstack/ai-openai

npm i https://pkg.pr.new/@tanstack/ai-openai@463

@tanstack/ai-openrouter

npm i https://pkg.pr.new/@tanstack/ai-openrouter@463

@tanstack/ai-preact

npm i https://pkg.pr.new/@tanstack/ai-preact@463

@tanstack/ai-react

npm i https://pkg.pr.new/@tanstack/ai-react@463

@tanstack/ai-react-ui

npm i https://pkg.pr.new/@tanstack/ai-react-ui@463

@tanstack/ai-solid

npm i https://pkg.pr.new/@tanstack/ai-solid@463

@tanstack/ai-solid-ui

npm i https://pkg.pr.new/@tanstack/ai-solid-ui@463

@tanstack/ai-svelte

npm i https://pkg.pr.new/@tanstack/ai-svelte@463

@tanstack/ai-vue

npm i https://pkg.pr.new/@tanstack/ai-vue@463

@tanstack/ai-vue-ui

npm i https://pkg.pr.new/@tanstack/ai-vue-ui@463

@tanstack/preact-ai-devtools

npm i https://pkg.pr.new/@tanstack/preact-ai-devtools@463

@tanstack/react-ai-devtools

npm i https://pkg.pr.new/@tanstack/react-ai-devtools@463

@tanstack/solid-ai-devtools

npm i https://pkg.pr.new/@tanstack/solid-ai-devtools@463

commit: 2050d88

tombeckenham and others added 3 commits April 17, 2026 19:28
…el selector

Expose an Audio tile on the welcome grid, offer one-click sample prompts
for every audio provider, and let the Fal provider pick between current
text-to-music models (default MiniMax v2.6). Threads a model override
through the audio API and server fn.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
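
The model override described in the commit above could be validated on the server along these lines. This is a hedged sketch, not the code in this PR: `FAL_MUSIC_MODELS`, `DEFAULT_FAL_MUSIC_MODEL`, and `resolveAudioModel` are hypothetical names, and the model ids are placeholders because the real fal model ids are not listed in this conversation.

```typescript
// Placeholder ids; the actual fal text-to-music model ids are not shown here.
const FAL_MUSIC_MODELS = [
  'example/minimax-music',
  'example/other-music',
] as const

type FalMusicModel = (typeof FAL_MUSIC_MODELS)[number]

// The first entry plays the role of the default model in this sketch.
const DEFAULT_FAL_MUSIC_MODEL: FalMusicModel = FAL_MUSIC_MODELS[0]

/**
 * Resolve an optional, untrusted model override from a request body.
 * Missing values fall back to the default; unknown values are rejected
 * rather than forwarded to the provider.
 */
function resolveAudioModel(override: unknown): FalMusicModel {
  if (override === undefined || override === null || override === '') {
    return DEFAULT_FAL_MUSIC_MODEL
  }
  const allowed: ReadonlyArray<string> = FAL_MUSIC_MODELS
  if (typeof override === 'string' && allowed.includes(override)) {
    return override as FalMusicModel
  }
  throw new Error(`Unsupported audio model: ${String(override)}`)
}
```

Checking against an allow-list keeps the API route from passing arbitrary client-supplied model names through to the provider.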
Development

Successfully merging this pull request may close these issues.

Feature Request: Fal audio, speech, and music generation support