🎵 Lyria - AI-Powered Celestial Music Generator

A web application that generates music in real time from live video streams and space weather data, using Gemini 2.0 Flash for visual analysis and Lyria RealTime for continuous music generation.

Next.js FastAPI Python TypeScript


🌟 Features

Video-Driven Music Generation

  • 🎬 Live Stream Support - Works with YouTube livestreams and VOD content
  • 🖼️ Frame Analysis - Captures and analyzes video frames every 15 seconds
  • 🎵 Continuous Music - Infinite streaming with smooth transitions
  • 🎨 Context-Aware - Music adapts to visual content in real-time
  • 💬 Interactive Queries - Users can steer music generation with text prompts

Space Weather Ambient Piano

  • 🌌 Real-Time Space Data - Fetches live solar wind, geomagnetic activity, and X-ray flux from NOAA
  • 🎹 Ambient Piano - Solo piano compositions that reflect cosmic conditions
  • 🪐 Homepage Integration - Perfect background for Voyager Golden Record visual
  • ⚑ Dynamic Updates - Music adjusts every 30 seconds based on space weather changes

DJ Controls

  • 🎛️ Real-Time Parameters - Adjust tempo, energy, mood, intensity
  • 🎼 Genre Selection - Multiple genre presets
  • 🔊 Volume Control - Independent volume and mute controls
  • ⏯️ Playback Control - Play, pause, stop functionality

Audio Management

  • 📥 Download Sessions - Download current or completed session audio as WAV files
  • 💾 Automatic Recording - Server saves all sessions automatically
  • 🎧 High Quality - 48kHz stereo PCM audio format
  • 📁 Session Management - Timestamped files for easy organization

πŸ—οΈ Architecture

┌──────────────────────────────────────────────────────────────┐
│                      Frontend (Next.js)                      │
│  ┌────────────┐  ┌────────────┐  ┌────────────┐              │
│  │   Home     │  │  Streams   │  │   Upload   │              │
│  │  (Space    │  │  (Video    │  │  (Custom   │              │
│  │  Weather)  │  │  Streams)  │  │   Video)   │              │
│  └────────────┘  └────────────┘  └────────────┘              │
└────────────────────────┬─────────────────────────────────────┘
                         │ WebSocket (Binary Audio + JSON)
┌────────────────────────┴─────────────────────────────────────┐
│                      Backend (FastAPI)                       │
│  ┌───────────────────────────────────────────────────────┐   │
│  │          WebSocket Session Driver                     │   │
│  │  • Frame Capture (yt-dlp + ffmpeg)                    │   │
│  │  • Audio Buffering & Download                         │   │
│  └────────────┬─────────────────────────┬────────────────┘   │
│               │                         │                    │
│   ┌───────────▼──────────┐   ┌──────────▼────────────┐       │
│   │   Gemini 2.0 Flash   │   │   Lyria RealTime      │       │
│   │   (Image Analysis)   │   │   (Music Generation)  │       │
│   └──────────────────────┘   └───────────────────────┘       │
└──────────────────────────────────────────────────────────────┘

Data Flow

Video Mode:

YouTube/Video → Frame Capture (15s) → Gemini Analysis → Music Prompt
                                                              ↓
Frontend ← Binary Audio Stream (48kHz stereo) ← Lyria RealTime

Space Weather Mode:

NOAA APIs → Space Weather Data (30s) → Music Prompt Generation
                                              ↓
Frontend ← Binary Audio Stream (48kHz stereo) ← Lyria RealTime
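
The space weather flow above turns numeric readings into a text prompt for Lyria. A minimal sketch of that step, assuming the geomagnetic input is a planetary Kp index (0 to 9) and the solar wind input is a speed in km/s. The function name, the thresholds, and the prompt wording are illustrative assumptions and are not taken from space_weather.py:

```python
def space_weather_prompt(kp_index: float, solar_wind_kms: float) -> str:
    """Map two readings to a short piano prompt (hypothetical mapping)."""
    # NOAA's G-scale starts at Kp = 5 (G1, minor geomagnetic storm);
    # the lower threshold of 3 is an arbitrary choice for illustration.
    if kp_index >= 5:
        mood = "restless, dramatic"
    elif kp_index >= 3:
        mood = "gently shifting"
    else:
        mood = "calm, sparse"
    # Assumed cut-off between slow and fast solar wind, for illustration only.
    pace = "flowing" if solar_wind_kms >= 500 else "slow"
    return f"Solo ambient piano, {mood}, {pace} tempo"
```

Calling the function every 30 seconds with fresh readings would produce a new prompt only when a reading crosses one of the thresholds, which keeps the music stable between updates.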

🚀 Quick Start

Prerequisites

Required:

  • Node.js 18+ and npm
  • Python 3.8+
  • ffmpeg (for video frame capture)
  • Google AI API key (for Gemini & Lyria)

Optional:

  • Git (for version control)
  • yt-dlp (automatically installed via pip)

Installation

1. Clone the Repository

git clone https://github.com/emw8105/hacktx-25.git
cd hacktx-25

2. Backend Setup

cd server

# Create virtual environment (recommended)
python -m venv .venv

# Activate virtual environment
# Windows PowerShell:
.\.venv\Scripts\Activate.ps1
# Windows CMD:
.\.venv\Scripts\activate.bat
# Linux/Mac:
source .venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Install ffmpeg (if not already installed)
# Windows (using Chocolatey):
choco install ffmpeg
# Or download from: https://ffmpeg.org/download.html

# Mac:
brew install ffmpeg
# Linux:
sudo apt-get install ffmpeg

3. Frontend Setup

cd ..  # Back to root
npm install

4. Environment Configuration

Create a .env file in the root directory with the following:

# REQUIRED: Google AI API Key
# Get your key from: https://aistudio.google.com/apikey
GEMINI_API_KEY=your_gemini_api_key_here

# REQUIRED: Lyria API Key (for Live Music API)
LYRIA_API_KEY=your_lyria_api_key_here

# REQUIRED: Google Cloud Project ID
GCP_PROJECT_ID=your_gcp_project_id_here

# OPTIONAL: Custom Lyria WebSocket URL (defaults to official endpoint)
LYRIA_WS_URL=wss://generativelanguage.googleapis.com/ws/google.ai.generativelanguage.v1alpha.GenerativeService.BidiGenerateMusic

# OPTIONAL: Gemini Model (defaults to gemini-2.0-flash)
GEMINI_MODEL=gemini-2.0-flash

⚠️ Minimum Required Values:

GEMINI_API_KEY=your_key_here
LYRIA_API_KEY=your_key_here
GCP_PROJECT_ID=your_project_id_here

How to get your API keys:

  1. GEMINI_API_KEY: Visit Google AI Studio
  2. LYRIA_API_KEY: Obtain from Google's Live Music API access program
  3. GCP_PROJECT_ID: Your Google Cloud Project ID (create at Google Cloud Console)
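
A missing key usually shows up later as an immediate WebSocket disconnect, so it can help to check the three required values up front. A small sketch of such a check, assuming the variables have already been loaded into the process environment; the helper name `missing_env_vars` is hypothetical and not part of the project:

```python
import os
from typing import List, Mapping

# The three values the README lists as required.
REQUIRED_VARS = ("GEMINI_API_KEY", "LYRIA_API_KEY", "GCP_PROJECT_ID")


def missing_env_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or blank."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
```

An empty list means all required settings are present. Because python-dotenv is already a backend dependency, calling its `load_dotenv()` before this check would pick up values from the .env file.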

5. Run the Application

Terminal 1 - Backend:

cd server
python main.py
# Server starts on http://localhost:8000

Terminal 2 - Frontend:

npm run dev
# App starts on http://localhost:3000

6. Open in Browser

Navigate to http://localhost:3000


📖 Usage Guide

1. Homepage (Space Weather Mode)

  • Background music automatically plays based on real-time space weather
  • Features the Voyager Golden Record rotating visual
  • Music updates every 30 seconds with cosmic conditions

2. Streams Page

  • Browse available YouTube livestreams and videos
  • Click any stream to start music generation
  • Add custom YouTube URLs via "Create Stream" button

3. Upload Page

  • Upload your own video files (MP4, MOV, AVI)
  • Drag-and-drop or browse to select
  • Instantly start music generation

4. Stream Page (Main Experience)

  • Video Display: Embedded YouTube player or video snapshot
  • Status Indicators: Connection status, listener count
  • Music Controls:
    • Play/Pause: Start or stop audio playback
    • Mute/Unmute: Toggle audio on/off
    • Download: Save current session as WAV file
  • DJ Controls: Adjust tempo, energy, mood, genre, and more
  • Query Input: Type prompts to steer music generation
    • Example: "Make it more energetic"
    • Example: "Add orchestral elements"

5. Downloading Audio

  • Click the Download button during or after a session
  • Downloads the current session's audio as a timestamped WAV file
  • Format: music_20251019_143052_session-abc123.wav
  • High quality: 48kHz stereo PCM
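
The filename pattern above combines a prefix, a timestamp, and the session ID. A sketch of how such a name could be built, assuming the timestamp is the session start time; the function name is hypothetical and the server may construct the name differently:

```python
from datetime import datetime


def session_filename(session_id: str, started: datetime, prefix: str = "music") -> str:
    """Build a name like music_20251019_143052_session-abc123.wav."""
    stamp = started.strftime("%Y%m%d_%H%M%S")  # date, underscore, 24-hour time
    return f"{prefix}_{stamp}_{session_id}.wav"
```

Because the timestamp sorts lexicographically, listing the files by name also lists them in chronological order.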

🛠️ API Reference

REST Endpoints

GET /videos

Returns list of available streams

[
  {
    "id": "iss",
    "name": "ISS Livestream",
    "source": "https://youtube.com/...",
    "type": "stream"
  }
]

POST /videos/custom

Register a custom YouTube URL

{
  "name": "My Custom Stream",
  "url": "https://youtube.com/watch?v=..."
}

POST /upload/file

Upload a video file

  • Form data: name (string), file (video file)
  • Returns: {id, filename, name, bytes}

GET /audio/current/{session_id}

Download audio from active or completed session

  • Returns: WAV file with timestamped filename

GET /audio/latest

Download most recent audio session

  • Returns: Latest WAV file

GET /audio/list

List all saved audio files

{
  "files": [
    {
      "filename": "music_20251019_143052_session-abc.wav",
      "size": 12345678,
      "created": "2025-10-19T14:30:52",
      "modified": "2025-10-19T14:31:52"
    }
  ]
}
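
A short client sketch for the listing endpoint, using only the standard library. It assumes the backend is reachable at http://localhost:8000 and that the response has the shape shown above; both helper names are illustrative:

```python
import json
from typing import Tuple
from urllib.request import urlopen


def summarize_audio_list(payload: dict) -> Tuple[int, int]:
    """Return (file count, total size in bytes) for a /audio/list response."""
    files = payload.get("files", [])
    return len(files), sum(entry["size"] for entry in files)


def fetch_audio_list(base_url: str = "http://localhost:8000") -> dict:
    """GET /audio/list from a running backend (the base URL is an assumption)."""
    with urlopen(f"{base_url}/audio/list", timeout=10) as response:
        return json.load(response)
```

With the server running, `summarize_audio_list(fetch_audio_list())` gives a quick view of how much disk space the saved sessions occupy.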

WebSocket Endpoints

WS /ws/session/{session_id}

Main video-driven music generation endpoint

Client → Server Messages:

// Start session
{
  "type": "start",
  "source": "stream" | "video" | "upload",
  "id": "stream_id",
  "prompt": "Optional music description"
}

// Send query to steer music
{
  "type": "query",
  "text": "Make it more energetic"
}

// Update music config
{
  "type": "set_config",
  "bpm": 120,
  "temperature": 0.8
}

// Playback control
{
  "type": "control",
  "action": "play" | "pause" | "stop"
}

// DJ parameters
{
  "type": "dj_parameters",
  "parameters": {
    "tempo": 120,
    "energy": 0.8,
    "mood": "upbeat",
    "intensity": 0.7,
    "genre": "electronic",
    "volume": 0.9
  }
}

// Keep-alive
{"type": "ping"}
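
The messages above are plain JSON text frames, so a client only needs to serialize small dictionaries. A sketch of two builders, assuming the field names shown above; the function names are hypothetical:

```python
import json

ALLOWED_SOURCES = ("stream", "video", "upload")


def start_message(source: str, stream_id: str, prompt: str = "") -> str:
    """Serialize a "start" message; the prompt is included only when given."""
    if source not in ALLOWED_SOURCES:
        raise ValueError(f"unsupported source: {source}")
    message = {"type": "start", "source": source, "id": stream_id}
    if prompt:
        message["prompt"] = prompt
    return json.dumps(message)


def query_message(text: str) -> str:
    """Serialize a "query" message that steers the music."""
    return json.dumps({"type": "query", "text": text})
```

Validating `source` on the client side gives an early error instead of a rejected session.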

Server → Client Messages:

// Binary frames: Raw PCM audio (48kHz, stereo, 16-bit)
// Each chunk ~384KB (2 seconds of audio)

// JSON status messages:
{"status": "session_started", "session_id": "..."}
{"status": "lyria_stream_started", "mode": "lyria-realtime-ws"}
{"type": "lyria_playback_started"}
{"status": "snapshot_processed", "prompt": "..."}
{"type": "lyria_prompt_applied"}
{"status": "heartbeat"}
{"type": "pong"}
{"error": "Error message if something fails"}
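
Since audio arrives as binary frames and status arrives as JSON text on the same socket, a client has to branch on the frame type first. A sketch of that dispatch, assuming a WebSocket library that delivers binary frames as bytes and text frames as str (the `websockets` package does); the function name and the tags it returns are illustrative:

```python
import json
from typing import Tuple, Union


def classify_frame(frame: Union[bytes, bytearray, str]) -> Tuple[str, object]:
    """Return ("audio", pcm_bytes), ("error", text) or ("event", dict)."""
    if isinstance(frame, (bytes, bytearray)):
        return "audio", bytes(frame)      # raw PCM, passed through untouched
    message = json.loads(frame)
    if "error" in message:
        return "error", message["error"]
    return "event", message               # status, pong, and similar messages
```

Note that some server messages use a "status" key and others a "type" key, so the sketch returns the whole dictionary for events rather than assuming one of them.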

🎨 Frontend Structure

app/
├── page.tsx                    # Homepage (space weather)
├── streams/page.tsx            # Browse streams
├── showcase/page.tsx           # Custom YouTube input
├── upload/page.tsx             # Video upload
├── stream/[id]/page.tsx        # Main stream player
│
components/
├── navigation.tsx              # Top navigation bar
├── dj-controls.tsx             # DJ parameter controls
├── ui/                         # Shadcn UI components
│
lib/
├── api.ts                      # API client functions
│
public/
├── lyria_header.mp4           # Background video
└── voyager.mp4                # Golden Record visual

Key Frontend Features

React Optimizations:

  • useCallback for all event handlers (stable handler identities, which avoids unnecessary child re-renders)
  • useRef for stable references (WebSocket, Audio nodes)
  • Conditional state updates (only update if value changes)
  • Memoized input handlers

Audio Processing:

  • Web Audio API for playback
  • GainNode for volume control
  • AudioBuffer queue for smooth streaming
  • Real-time PCM decoding
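
The frontend performs PCM decoding in the browser; the arithmetic itself is language-independent. A Python sketch of that step, assuming the 16-bit little-endian stereo format described in this README with left and right samples interleaved (the interleaving order is an assumption). The function name is illustrative:

```python
import struct
from typing import List, Tuple


def decode_pcm16_stereo(chunk: bytes) -> Tuple[List[float], List[float]]:
    """Split interleaved 16-bit little-endian stereo PCM into float channels."""
    usable = len(chunk) - (len(chunk) % 4)  # 4 bytes per stereo frame
    samples = struct.unpack(f"<{usable // 2}h", chunk[:usable])
    # Scale signed 16-bit integers into the [-1.0, 1.0) range used by audio APIs.
    left = [s / 32768.0 for s in samples[0::2]]
    right = [s / 32768.0 for s in samples[1::2]]
    return left, right
```

Trailing bytes that do not form a complete stereo frame are dropped here; a streaming client would more likely carry them over to the next chunk.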

State Management:

  • Minimal state updates (only UI-critical values)
  • Audio data stored in refs (not state)
  • Session ID tracking for downloads

🖥️ Backend Structure

server/
├── main.py                     # FastAPI application
├── gemini.py                   # Gemini & Lyria integration
├── space_weather.py            # NOAA space weather API
├── utils.py                    # Video capture utilities
├── requirements.txt            # Python dependencies
│
├── test_socket.py              # Video mode test client
├── test_space_weather.py       # Space weather test client
├── continuous_test_client.py   # Real-time playback test
├── continuous_recorder.py      # Continuous recording test
│
├── uploads/                    # Uploaded video files
├── audio_sessions/             # Recorded audio sessions
└── snapshots/                  # Captured video frames

Key Backend Features

WebSocket Session Management:

  • Binary audio streaming (48kHz stereo PCM)
  • Audio buffering for downloads
  • Automatic WAV file creation on disconnect
  • Concurrent session support

Video Processing:

  • Frame capture with ffmpeg
  • yt-dlp for YouTube stream resolution
  • HLS/DASH stream support
  • Local video file support

Music Generation:

  • Lyria RealTime WebSocket protocol
  • Weighted prompt system
  • Configuration updates (BPM, temperature)
  • Playback control (play, pause, stop)

🧪 Testing

Test the Backend

cd server

# Test video-driven music (45 seconds)
python test_socket.py

# Test space weather piano (60 seconds)
python test_space_weather.py

# Test with custom duration (90 seconds)
# Set AUDIO_DURATION using the line that matches your shell
# Windows CMD:
set AUDIO_DURATION=90
# PowerShell:
$env:AUDIO_DURATION = "90"
# Linux/Mac:
export AUDIO_DURATION=90
python test_socket.py

# Real-time playback test
python continuous_test_client.py
# Press Ctrl+C to stop

# Continuous recording with segments
python continuous_recorder.py
# Press Ctrl+C to stop

Output Files:

  • audio/music_YYYYMMDD_HHMMSS_sessionid.wav
  • audio/space_weather_YYYYMMDD_HHMMSS_sessionid.wav
  • audio/music_YYYYMMDD_HHMMSS_seg001.wav (segments)

Verify Setup

# Check API key is set
echo $env:GEMINI_API_KEY  # PowerShell
echo $GEMINI_API_KEY  # Linux/Mac

# Check ffmpeg is installed
ffmpeg -version

# Check backend is running
curl http://127.0.0.1:8000/videos

# Check frontend is running
curl http://localhost:3000

πŸ› Troubleshooting

Common Issues

1. "WebSocket disconnected immediately"

  • Check if GEMINI_API_KEY is set correctly
  • Verify LYRIA_API_KEY has proper permissions
  • Check terminal for error messages

2. "No audio playing"

  • Open browser console (F12) and check for errors
  • Verify WebSocket connection status
  • Check audio context state (click Debug button)

3. "Failed to capture frame"

  • Ensure ffmpeg is installed and in PATH
  • Check YouTube URL is accessible
  • Try a different video source

4. "Module not found" errors

  • Backend: pip install -r requirements.txt
  • Frontend: npm install
  • Activate virtual environment for Python

5. "Audio download returns 404"

  • Wait a few seconds for audio to buffer
  • Check that session has started playing
  • Verify session_id is being tracked

6. "Port already in use"

  • Backend (8000): Change port in main.py
  • Frontend (3000): Use PORT=3001 npm run dev (Linux/Mac; in PowerShell, run $env:PORT = "3001" first, then npm run dev)

Enable Debug Mode

Frontend:

  • Open browser console (F12)
  • Click the "Debug" button to see audio state
  • Check Network tab for WebSocket messages

Backend:

  • Check terminal for detailed logs
  • All WebSocket messages are printed
  • Frame capture progress shown

📦 Dependencies

Backend (Python)

fastapi>=0.104.0
uvicorn>=0.24.0
websockets>=12.0
google-generativeai>=0.3.0
python-dotenv>=1.0.0
yt-dlp>=2023.10.13
aiohttp>=3.9.0  # For space weather

Frontend (Node.js)

next@14+
react@18+
typescript@5+
tailwindcss@3+
lucide-react  # Icons
shadcn/ui  # UI components

🎯 Performance Optimization

Frontend Optimizations

  • ✅ All event handlers wrapped in useCallback
  • ✅ Audio data stored in refs (not state)
  • ✅ Conditional state updates
  • ✅ Memoized input handlers
  • ✅ No re-renders during playback

Backend Optimizations

  • ✅ Async frame capture
  • ✅ WebSocket audio buffering
  • ✅ Concurrent session support
  • ✅ Background file saving
  • ✅ Efficient memory management

Audio Quality

  • Sample Rate: 48000 Hz
  • Channels: 2 (stereo)
  • Bit Depth: 16-bit
  • Format: PCM (little-endian)
  • Bandwidth: ~192 KB/s per session
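
The bandwidth figure follows directly from the format: 48,000 samples/s × 2 channels × 2 bytes = 192,000 bytes/s, so a 2-second chunk is 384,000 bytes. A sketch that encodes those parameters and wraps raw PCM in a WAV container with Python's standard `wave` module; the constant and function names are illustrative, and the server may write its files differently:

```python
import io
import wave

SAMPLE_RATE = 48_000   # Hz
CHANNELS = 2           # stereo
SAMPLE_WIDTH = 2       # bytes per sample (16-bit)

# 48,000 * 2 * 2 = 192,000 bytes of audio per second.
BYTES_PER_SECOND = SAMPLE_RATE * CHANNELS * SAMPLE_WIDTH


def pcm_to_wav(pcm: bytes) -> bytes:
    """Wrap raw PCM in a WAV container using the parameters above."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(pcm)
    return buffer.getvalue()
```

At this rate, one minute of audio is about 11.5 MB, which is worth keeping in mind when the server buffers whole sessions for download.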

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License.


🙏 Acknowledgments

  • Google AI Studio - For Gemini 2.0 Flash and Lyria RealTime APIs
  • NOAA - For free space weather data APIs
  • Voyager Golden Record - Inspiration for the space theme
  • HackTX 2025 - For providing the opportunity to build this project

📞 Support

For issues or questions:

  • Open an issue on GitHub
  • Check existing issues for solutions
  • Review the troubleshooting section above

🚧 Roadmap

  • User authentication and session persistence
  • Playlist creation and management
  • Social sharing of generated music
  • Mobile app (React Native)
  • Additional music generation models
  • Cloud deployment guide
  • Docker containerization

Built with ❤️ for HackTX 2025
