Transforming natural language into structured, actionable data through Multi-LLM Orchestration and Autonomous Web Research.
Modern data research is fragmented. QueryIQ bridges the gap between unstructured human curiosity and structured database architecture. By employing an Agentic AI Pipeline, it dynamically scrapes the internet, synthesizes data via LLaMA 3.3, and strictly formats the output into a consumable JSON schema, all verified through a Human-in-the-Loop (HITL) interface.
This is not a wrapper; it is an intelligent orchestration engine built for scale, speed, and accuracy.
QueryIQ is built using a decoupled client-server architecture, emphasizing separation of concerns, high performance, and a seamless user experience.
| 🎨 Frontend (Client) | ⚙️ Backend (API) | 🧠 AI & Infrastructure |
|---|---|---|
| Framework: React + Vite | Framework: FastAPI (Python) | Orchestration: Multi-agent routing |
| Styling: Custom Glassmorphism CSS | Server: Uvicorn ASGI | LLM Engine: Groq (LLaMA 3.3 70B) |
| State: React Hooks / Context | Validation: Pydantic (Strict typing) | Web Scraping: Tavily Search API |
| UX: Dynamic Micro-animations | Integrations: RESTful Architecture | Database: Supabase (PostgreSQL) |
Unlike traditional chatbots, QueryIQ operates autonomously using a multi-step verification pipeline designed to reduce hallucinations.
- Intent Classification (Groq): Analyzes the raw query to determine complexity, required geography, and whether live web research is necessary.
- Deep Web Scraping (Tavily): If required, agents trigger live internet searches, bypassing standard LLM knowledge cut-offs.
- Data Synthesis & Extraction (Groq): Contextual data is fed back into the LLM with strict formatting instructions to extract precise JSON nodes (Topic, Geography, Industry, Entity Type, Intent, Keywords).
- Human-in-the-Loop Verification (Supabase): Extracted data is pushed to a pending_review database state. The UI prompts human operators to modify, approve, or reject the data before final ingestion.
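The four steps above can be sketched as a single orchestration function. This is a minimal illustration, not the code in main.py: `PipelineResult`, `run_pipeline`, the `needs_web_search` key, and the stub functions are all hypothetical names, and the stubs stand in for the real Groq and Tavily calls so the sketch runs offline.

```python
from dataclasses import dataclass, field


@dataclass
class PipelineResult:
    raw_query: str
    extracted: dict
    sources: list = field(default_factory=list)
    steps: list = field(default_factory=list)
    # Nothing is final until a human approves it.
    status: str = "pending_review"


def run_pipeline(raw_query, classify, search, extract) -> PipelineResult:
    """Run classify -> (optional) search -> extract and record each step taken."""
    steps = ["classify"]
    intent = classify(raw_query)

    sources = []
    if intent.get("needs_web_search"):  # hypothetical key returned by the classifier
        steps.append("search")
        sources = search(raw_query)

    steps.append("extract")
    extracted = extract(raw_query, sources)
    return PipelineResult(raw_query, extracted, sources, steps)


# Stub callables (assumptions) standing in for the Groq and Tavily integrations.
def fake_classify(query):
    return {"needs_web_search": True}


def fake_search(query):
    return [{"url": "https://example.com", "snippet": "stub result"}]


def fake_extract(query, sources):
    return {"topic": "EV technology", "geography": "Europe"}


result = run_pipeline(
    "Who leads European EV tech?", fake_classify, fake_search, fake_extract
)
print(result.steps, result.status)
# ['classify', 'search', 'extract'] pending_review
```

Passing the three stages in as callables keeps the control flow testable without API keys; a classifier that reports no need for web research simply skips the search step.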
sequenceDiagram
participant User as 👨‍💻 User
participant UI as 🖥️ React UI
participant API as ⚙️ FastAPI
participant AI as 🧠 Agent Workflow
participant DB as 🗄️ Supabase
User->>UI: "Who leads European EV tech?"
UI->>API: POST /queries
API->>AI: 1. Classify Intent
AI-->>API: Needs Web Search
API->>AI: 2. Fetch Live Context (Tavily)
API->>AI: 3. Extract JSON Schema (Groq)
AI-->>API: Formatted Intelligence
API->>DB: Save as 'pending_review'
API-->>UI: Return Intelligence Card
UI-->>User: Display Glassmorphic Card
User->>UI: Edits data & Clicks 'Approve/Save'
UI->>API: PATCH /queries/{id}/review
API->>DB: Update row -> 'approved'
API-->>UI: Return updated JSON
UI->>User: Auto-download result.json
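The review half of the flow above amounts to a small state machine on the row's status. The sketch below is one way to express it, under assumptions: the `review` function and `ALLOWED_TRANSITIONS` table are hypothetical, and `rejected` is an assumed status name (the diagram only shows `pending_review` and `approved`).

```python
from typing import Optional

# Assumed transition table: a row can only leave 'pending_review' once.
ALLOWED_TRANSITIONS = {
    "pending_review": {"approved", "rejected"},
}


def review(row: dict, decision: str, edits: Optional[dict] = None) -> dict:
    """Return a copy of the row with the reviewer's edits and new status applied."""
    current = row["status"]
    if decision not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"cannot move from {current!r} to {decision!r}")

    updated = {**row, **(edits or {})}
    # Set the status last so an edit payload cannot override the decision.
    updated["status"] = decision
    return updated


row = {"id": "1", "topic": "EV", "status": "pending_review"}
approved = review(row, "approved", {"topic": "EV technology"})
print(approved["status"], approved["topic"])
# approved EV technology
```

Rejecting a second review of an already approved row keeps the PATCH operation from silently rewriting finalized data.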
When recruiters or engineers look at this codebase, they'll find:
- Strict Type Validation: Heavy usage of Pydantic on the backend rejects malformed data before it reaches the database.
- Optimized UI/UX: CSS clamp() functions for fluid typography, custom scrollbars, and GPU-accelerated CSS animations (transform, opacity) keep rendering smooth, targeting 60fps.
- Defensive Programming: API calls handle transient network failures, and the frontend degrades gracefully with informative Error Boundaries and Empty States.
- Secure Configuration: Zero hardcoded secrets. Environment variables handle all API keys (Groq, Tavily) and Database URIs.
- RESTful Principles: Predictable endpoints (GET /queries, POST /queries, PATCH /queries/{id}/review).
- Infrastructure Optimization: Configured an automated cron-job health check (GET /) to work around Render's free-tier sleep behavior, so the live demo avoids cold starts without consuming LLM/Search API quotas.
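As an illustration of the defensive-programming point, transient network failures are commonly handled with retries and exponential backoff. The sketch below shows that technique in general terms; it is an assumption about the approach, not the project's actual code, and `call_with_retry` and its parameters are hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on connection or timeout errors with doubling delays."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    for attempt in range(attempts):
        try:
            return fn()
        except (ConnectionError, TimeoutError):
            if attempt == attempts - 1:
                raise  # out of retries: surface the original error
            sleep(base_delay * 2 ** attempt)  # 0.5s, 1.0s, 2.0s, ...
    raise AssertionError("unreachable")


# Demo: a call that fails twice, then succeeds. Delays are recorded, not slept.
calls = {"count": 0}
delays = []


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


outcome = call_with_retry(flaky, sleep=delays.append)
print(outcome, delays)
# ok [0.5, 1.0]
```

Only connection and timeout errors are retried here; validation errors or 4xx-style failures would not be fixed by trying again, so they are left to propagate immediately.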
- Streaming Responses (SSE): Currently, the API waits for all LLM calls to finish before returning. I would implement Server-Sent Events (SSE) to stream the pipeline steps and extracted JSON chunks to the frontend in real-time, drastically reducing perceived latency.
- Background Task Queue: For extremely deep research tasks, I would offload the LLM and scraping work to a Celery or Redis Queue and return a task_id immediately, rather than keeping the HTTP connection open.
- Anthropic SDK Integration: While I chose Groq (LLaMA 3.3) for speed and cost-effectiveness in this build, I would love to integrate the Anthropic SDK (as provided in the starter code) to utilize Claude 3.5 Sonnet for the extraction step, as it's arguably the industry leader for structured JSON extraction.
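To make the SSE idea concrete, here is a minimal sketch of the wire format such a stream would use: each event is an `event:` line, a `data:` line, and a blank line. The function names, event names, and payloads are hypothetical, and the generator is shown on its own rather than wired into the API.

```python
import json
from typing import Iterable, Iterator, Tuple


def sse_event(event: str, data: dict) -> str:
    """Format one Server-Sent Event; json.dumps keeps the payload on one line."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def stream_pipeline(steps: Iterable[Tuple[str, dict]]) -> Iterator[str]:
    """Yield one SSE frame per completed pipeline step, then a final 'done' event."""
    for name, payload in steps:
        yield sse_event(name, payload)
    yield sse_event("done", {})


frames = list(
    stream_pipeline(
        [
            ("classify", {"needs_web_search": True}),
            ("extract", {"topic": "EV technology"}),
        ]
    )
)
print(frames[0], end="")
# event: classify
# data: {"needs_web_search": true}
#
```

In FastAPI, a generator like this could be returned through `StreamingResponse` with `media_type="text/event-stream"`, and the React client could consume it with the browser's `EventSource` API.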
Want to run this locally? It takes less than 3 minutes.
Execute this in your Supabase SQL Editor:
CREATE TABLE queries (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
raw_query TEXT NOT NULL,
topic TEXT,
geography TEXT,
industry TEXT,
entity_type TEXT,
intent TEXT,
keywords TEXT[] DEFAULT '{}',
confidence_score FLOAT DEFAULT 0,
sources JSONB DEFAULT '[]'::jsonb,
pipeline_steps JSONB DEFAULT '[]'::jsonb,
classifier_model TEXT,
extractor_model TEXT,
research_summary TEXT,
status TEXT DEFAULT 'pending_review',
created_at TIMESTAMPTZ DEFAULT now()
);
-- Enable RLS
ALTER TABLE queries ENABLE ROW LEVEL SECURITY;
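-- Permissive policy: allows every operation for every role. Suitable for a local demo only.
CREATE POLICY "Allow all operations" ON queries FOR ALL USING (true) WITH CHECK (true);
Then install and start the backend:
cd backend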
pip install -r requirements.txt
# Environment Setup (.env)
# GROQ_API_KEY=your_key
# TAVILY_API_KEY=your_key
# SUPABASE_URL=your_url
# SUPABASE_KEY=your_key
uvicorn main:app --reload
Swagger UI is available at http://localhost:8000/docs
Then set up the frontend:
cd frontend
npm install
# Environment Setup (.env)
# VITE_API_URL=http://localhost:8000
npm run dev
The application is available at http://localhost:5173
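If the backend fails to start, a missing key is a likely first thing to rule out. The sketch below checks the four backend variables listed in the setup steps; `missing_env` is a hypothetical helper, and it assumes the variables are exported in the shell or already loaded from .env into the process environment.

```python
import os

# The four backend variables named in the setup instructions.
REQUIRED_BACKEND_VARS = (
    "GROQ_API_KEY",
    "TAVILY_API_KEY",
    "SUPABASE_URL",
    "SUPABASE_KEY",
)


def missing_env(required=REQUIRED_BACKEND_VARS, environ=None) -> list:
    """Return the names of required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All backend environment variables are set.")
```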
QueryIQ/
├── backend/                      # ⚙️ FastAPI Python Server
│   ├── main.py                   # Application entrypoint, routing, and agentic orchestration
│   ├── multi_llm.py              # Groq LLaMA 3.3 integration for Intent Classification & Data Extraction
│   ├── research.py               # Tavily Search API wrapper for autonomous deep web scraping
│   ├── database.py               # Supabase PostgreSQL client and CRUD operations
│   ├── schemas.py                # Pydantic models for strict type validation (Request/Response)
│   ├── requirements.txt          # Python dependency list
│   └── .env                      # Backend environment secrets (Groq, Tavily, Supabase)
│
├── frontend/                     # 🎨 React 19 + Vite UI
│   ├── public/                   # Static assets
│   │   ├── fav.png               # Application favicon/logo
│   │   ├── bg-image.png          # High-quality hero background image
│   │   ├── newquery.png          # Tab icon for the Query input
│   │   └── result.png            # Tab icon for the Results display
│   ├── src/                      # Main frontend source code
│   │   ├── components/           # Modular React UI Components
│   │   │   ├── AboutPage.jsx     # Project info, tech stack, and developer details
│   │   │   ├── HistoryPage.jsx   # Dashboard showing previously processed and approved queries
│   │   │   ├── Icons.jsx         # Centralized SVG icon library for consistent UX
│   │   │   ├── LoadingSpinner.jsx # Animated agentic pipeline state tracker
│   │   │   ├── QueryForm.jsx     # Textarea input component for natural language research
│   │   │   └── ResultCard.jsx    # Complex JSON renderer with HITL editing and JSON download
│   │   ├── api.js                # Promise-based Fetch wrappers for API communication
│   │   ├── App.jsx               # Root React component, routing state, and main layout structure
│   │   ├── index.css             # Global styles, variables, typography, and glassmorphic utilities
│   │   └── main.jsx              # React DOM rendering entrypoint
│   ├── index.html                # HTML template
│   ├── package.json              # Node.js dependencies and run scripts
│   ├── vite.config.js            # Vite bundler configuration
│   └── .env                      # Frontend environment secrets (API URL)
│
└── README.md                     # You are here!
Always building, always learning.
