feat: audio media support — fal adapters, ElevenLabs TTS/music/SFX/STT, Gemini Lyria + 3.1 Flash TTS #463
Adds falSpeech, falTranscription, and falAudio adapters to @tanstack/ai-fal, completing fal's media coverage alongside image and video. Introduces a new generateAudio activity in @tanstack/ai for music and sound-effect generation, with matching devtools events and types.

Closes TanStack#328

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
…Lyria + 3.1 Flash TTS

Extends @tanstack/ai-elevenlabs (which already covers realtime voice) with Speech, Music, Sound Effects, and Transcription adapters, each tree-shakeable under its own import. Adds Gemini Lyria 3 Pro / Clip music generation via a new generateAudio adapter, plus the new Gemini 3.1 Flash TTS Preview model with multi-speaker dialogue support.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Adds a new Audio Generation page, expands the fal adapter reference with sections for text-to-speech, transcription, and audio/music, and adds fal sections to the Text-to-Speech and Transcription guides.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Expand the ts-react-chat example with provider tabs for OpenAI, ElevenLabs, Gemini, and Fal on the TTS and transcription pages, plus a new /generations/audio page covering ElevenLabs Music, ElevenLabs SFX, Gemini Lyria, and Fal audio generation. Add a Gemini TTS unit test and wire an audio-gen feature into the E2E harness (adapter factory, API route, UI, fixture, and Playwright spec).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Reorder the Audio Generation page so the direct Gemini (Lyria) and ElevenLabs (music/sfx) adapters appear before fal.ai, and update the environment variables + result-shape notes to cover all three providers.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
…el selector

Expose an Audio tile on the welcome grid, offer one-click sample prompts for every audio provider, and let the Fal provider pick between current text-to-music models (default MiniMax v2.6). Threads a model override through the audio API and server fn.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Summary

Closes #328. Rebuilt on top of current `main`.

Adapters

- `@tanstack/ai-gemini`: `generateAudio` adapter for Lyria 3 Pro / Clip music generation. `geminiSpeech` adapter.
- `@tanstack/ai-elevenlabs`: adds Speech, Music, Sound Effects, and Transcription adapters alongside the existing realtime voice (tree-shakeable).
- `@tanstack/ai-fal`: new `falSpeech`, `falTranscription`, `falAudio` adapters; fal now covers image, video, audio, speech, and transcription.

Core

- `@tanstack/ai` `generateAudio` activity for music and sound-effect generation, with matching `AudioAdapter` base class and devtools events.
- `generateSpeech` / `generateTranscription` activity generics tightened to `TTSProviderOptions<TAdapter>` / `TranscriptionProviderOptions<TAdapter>` so typed provider options flow through.

Docs

- `media/audio-generation.md` guide: leads with Gemini (Lyria) and ElevenLabs (music/SFX), then fal.
- `media/text-to-speech.md` and `media/transcription.md` gain fal sections.
- `docs/adapters/fal.md` expanded with TTS, transcription, and audio sections plus a full model table.

Examples

- `ts-react-chat` TTS and transcription pages now have provider tabs (OpenAI, ElevenLabs, Gemini, Fal for TTS; OpenAI, ElevenLabs, Fal for transcription).
- `/generations/audio` page covering ElevenLabs Music/SFX, Gemini Lyria, and fal audio generation.

Tests

- `audio-gen` feature wired into the harness with adapter factory, API route, `AudioGenUI`, fixture, and Playwright spec.

Test plan

- `pnpm test:lib` (all affected packages pass)
- `pnpm test:types` (all affected packages pass)
- `pnpm test:eslint` (all affected packages pass)
- `pnpm test:docs` (no broken links)
- `pnpm --filter @tanstack/ai-e2e test:e2e -- --grep audio-gen`

🤖 Generated with Claude Code
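
For reviewers who want a mental model of how an activity, an adapter, and typed provider options could fit together, here is a minimal self-contained sketch. It is not the code in this PR and not the `@tanstack/ai` API: every type and value below (`AudioResult`, the `AudioAdapter` interface shape, `AudioProviderOptions`, `fakeMusic`, `FakeMusicOptions`) is hypothetical and only borrows names from the summary for readability. It assumes an adapter carries its provider-specific options as a type parameter, and that the activity falls back to the adapter's default model when no override is passed.

```typescript
// Hypothetical shapes for illustration only; NOT the @tanstack/ai API.
interface AudioResult {
  model: string
  mimeType: string
  data: Uint8Array
}

// An adapter declares the provider-specific options it accepts via TOptions.
interface AudioAdapter<TOptions> {
  name: string
  defaultModel: string
  generate(input: {
    prompt: string
    model: string
    providerOptions?: TOptions
  }): Promise<AudioResult>
}

// Extracts the options type from an adapter, so callers get typed options.
type AudioProviderOptions<TAdapter> =
  TAdapter extends AudioAdapter<infer TOptions> ? TOptions : never

// Provider-agnostic entry point: resolves the model, then delegates.
async function generateAudio<TAdapter extends AudioAdapter<any>>(args: {
  adapter: TAdapter
  prompt: string
  model?: string // optional override; the adapter default is used otherwise
  providerOptions?: AudioProviderOptions<TAdapter>
}): Promise<AudioResult> {
  const model = args.model ?? args.adapter.defaultModel
  return args.adapter.generate({
    prompt: args.prompt,
    model,
    providerOptions: args.providerOptions,
  })
}

// A fake in-memory adapter standing in for a real provider (assumption).
interface FakeMusicOptions {
  durationSeconds?: number
}

const fakeMusic: AudioAdapter<FakeMusicOptions> = {
  name: 'fake-music',
  defaultModel: 'fake-music-v1',
  async generate(input) {
    const seconds = input.providerOptions?.durationSeconds ?? 1
    // One zero byte per requested second, purely so the result is observable.
    return {
      model: input.model,
      mimeType: 'audio/wav',
      data: new Uint8Array(seconds),
    }
  },
}
```

With these hypothetical types, `generateAudio({ adapter: fakeMusic, prompt: 'lofi beat', providerOptions: { durationSeconds: 3 } })` type-checks, while a misspelled option key is rejected at compile time because the options type is inferred from the adapter.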