🎓 CampusVoice — Smart Course Feedback Analyzer

M8 Final Project · Applied NLP for university course evaluation
Team: Miao Jingzhe · Zeng · Zhong · Liang

An end-to-end NLP pipeline that turns unstructured student course reviews into actionable insights for professors and administrators. Built on Zhipu AI's GLM-4-Flash model — multilingual (English + Chinese), free tier, sub-second latency per review.

What it does

Feed in a CSV of anonymous student course reviews. Out comes:

Sentiment classification — positive / neutral / negative, with a confidence score
Aspect-based tagging — which part of the course is being discussed (teaching style, workload, materials, exams, instructor, logistics)
Keyword extraction — the phrases students actually use
An interactive dashboard — filterable by course, language, sentiment

Plus a live-demo mode for the final presentation: paste any comment, watch GLM classify it in real time.

Repo layout

campusvoice/
├── src/
│   ├── glm_client.py      # GLM-4-Flash wrapper (retry, cache, JSON parsing)
│   ├── generate_data.py   # Synthesize 300 labeled reviews for the demo
│   ├── pipeline.py        # Run each review through GLM → enriched CSV
│   ├── evaluate.py        # Accuracy + confusion matrix vs intended labels
│   └── app.py             # Streamlit dashboard
├── data/
│   ├── feedback_raw.csv       # generated
│   ├── feedback_analyzed.csv  # pipeline output
│   └── .glm_cache/            # response cache (auto, gitignored)
├── outputs/                   # evaluation artifacts for the slides
├── requirements.txt
├── Makefile
└── .env.example

Quickstart

1. Install

pip install -r requirements.txt

2. Configure your API key

Create a file named .env (you can start from .env.example) containing:

ZHIPUAI_API_KEY=your_glm_api_key_here

Get a key at open.bigmodel.cn. GLM-4-Flash is free.
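
As a rough sketch of how the key could be picked up at runtime: the helper name load_api_key and the minimal .env parser below are assumptions for illustration, not the code in glm_client.py. A library such as python-dotenv would be the more usual choice.

```python
import os
from pathlib import Path


def load_api_key(env_path=".env", var="ZHIPUAI_API_KEY"):
    """Return the API key from the environment, falling back to a .env file.

    Hypothetical helper: the real project may load the key differently.
    """
    key = os.environ.get(var)
    if key:
        return key
    path = Path(env_path)
    if path.exists():
        for line in path.read_text(encoding="utf-8").splitlines():
            line = line.strip()
            # Skip blanks and comments; accept simple KEY=value lines only.
            if not line or line.startswith("#") or "=" not in line:
                continue
            name, _, value = line.partition("=")
            if name.strip() == var:
                return value.strip().strip('"').strip("'")
    raise RuntimeError(f"{var} is not set; add it to {env_path}")
```

Failing early with a clear message is friendlier than letting the first API call fail with an authentication error.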

3. Generate synthetic data (first run only, ~3 minutes)

make generate

This asks GLM to write 300 realistic student reviews — a mix of positive, negative, and neutral, in both English and Chinese, across 8 courses. Each review comes with a target sentiment label we use later for evaluation.
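
One way to plan such a run is sketched below. The course IDs are placeholders, and the even cycling through every combination is an assumption; the actual distribution used by generate_data.py is not described here.

```python
import itertools

SENTIMENTS = ["positive", "neutral", "negative"]
LANGUAGES = ["en", "zh"]
# Placeholder course IDs; the real course list is not given in this README.
COURSES = [f"COURSE_{i}" for i in range(1, 9)]


def build_plan(total=300):
    """Return `total` generation specs, cycling through every combination."""
    combos = itertools.cycle(itertools.product(COURSES, LANGUAGES, SENTIMENTS))
    return [
        {"review_id": i, "course": c, "language": lang, "target_sentiment": s}
        for i, (c, lang, s) in zip(range(total), combos)
    ]


plan = build_plan()
```

Each spec would then be turned into a generation prompt; the target_sentiment field is the label that evaluation later compares against.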

4. Run the NLP pipeline (~5 minutes)

make analyze

Every review gets classified. Results are cached on disk — re-running is free.

5. Evaluate

make eval

Prints accuracy and a confusion matrix. Writes artifacts to outputs/.
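
For reference, accuracy and a confusion matrix can be computed with the standard library alone. This is a minimal sketch, not the contents of evaluate.py; the function name and the toy labels are illustrative only.

```python
from collections import Counter

LABELS = ["positive", "neutral", "negative"]


def evaluate(targets, predictions, labels=LABELS):
    """Return (accuracy, matrix); matrix[t][p] counts target t predicted as p."""
    if len(targets) != len(predictions):
        raise ValueError("targets and predictions must have the same length")
    pairs = Counter(zip(targets, predictions))
    matrix = {t: {p: pairs[(t, p)] for p in labels} for t in labels}
    correct = sum(pairs[(label, label)] for label in labels)
    accuracy = correct / len(targets) if targets else 0.0
    return accuracy, matrix


# Toy data, not project results.
targets = ["positive", "negative", "neutral", "negative"]
predicted = ["positive", "negative", "negative", "negative"]
accuracy, matrix = evaluate(targets, predicted)
print(f"accuracy = {accuracy:.2f}")
```

Rows of the matrix are the intended labels and columns are the model's predictions, so off-diagonal cells show which sentiments get confused with each other.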

6. Launch the dashboard

make run

Opens at http://localhost:8501. Three views:

Overview — all courses, all feedback
By Course — pick one course, see only its reviews
Live Demo — paste any text, analyze it live (this is the presentation mode)

Division of work

Member	Role	Modules
Miao	Data Engineering & Preprocessing	`generate_data.py`, `pipeline.py`
Zeng	NLP Integration & Model Deployment	`glm_client.py`, prompt design
Zhong	Frontend UI & Data Visualization	`app.py`, chart design
Liang	Project Management & Testing	`evaluate.py`, README, demo script

Design decisions worth calling out

Why GLM-4-Flash, not a fine-tuned classifier? A fine-tuned BERT would need labeled data we don't have and a training loop we don't have time for. GLM-4-Flash is free, handles Chinese + English out of the box, and lets us do both sentiment classification and aspect tagging in a single API call via structured JSON output.
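
To illustrate the single-call idea, the sketch below validates one JSON reply that carries sentiment, aspects, and keywords together. The field names (sentiment, confidence, aspects, keywords) and the function parse_reply are assumptions; the aspect list comes from the feature list above. The actual prompt and schema live in glm_client.py.

```python
import json

SENTIMENTS = {"positive", "neutral", "negative"}
ASPECTS = {"teaching style", "workload", "materials", "exams", "instructor", "logistics"}


def parse_reply(raw):
    """Extract and validate the JSON object in a model reply (assumed schema)."""
    # Tolerate extra text around the JSON by slicing from the first "{" to the last "}".
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in reply")
    data = json.loads(raw[start:end + 1])
    sentiment = str(data.get("sentiment", "")).lower()
    if sentiment not in SENTIMENTS:
        raise ValueError(f"unexpected sentiment: {sentiment!r}")
    # Drop any aspect the model invented outside the fixed category list.
    aspects = [a for a in data.get("aspects", []) if a in ASPECTS]
    return {
        "sentiment": sentiment,
        "confidence": float(data.get("confidence", 0.0)),
        "aspects": aspects,
        "keywords": list(data.get("keywords", [])),
    }


reply = (
    'Here is the result: {"sentiment": "Negative", "confidence": 0.9, '
    '"aspects": ["workload", "parking"], "keywords": ["too many assignments"]}'
)
result = parse_reply(reply)
```

Validating against fixed label sets keeps stray model output from leaking into the enriched CSV.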

Why cache every response? During development we re-run the pipeline dozens of times. Caching on a SHA-256 of (system, prompt) means we only pay (in latency and quota) for each unique review once. Purge with make clean.
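
A minimal sketch of this caching scheme follows. The function names, the one-JSON-file-per-key layout, and the call_model callback are assumptions for illustration; only the SHA-256 key over (system, prompt) and the data/.glm_cache/ directory come from this README.

```python
import hashlib
import json
from pathlib import Path


def cache_key(system, prompt):
    """SHA-256 hex digest of the (system, prompt) pair."""
    # Serialize as a JSON list so ("ab", "c") and ("a", "bc") hash differently.
    payload = json.dumps([system, prompt], ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def cached_call(system, prompt, call_model, cache_dir="data/.glm_cache"):
    """Return the cached reply if present; otherwise call the model and store it."""
    path = Path(cache_dir) / f"{cache_key(system, prompt)}.json"
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))["reply"]
    reply = call_model(system, prompt)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps({"reply": reply}, ensure_ascii=False), encoding="utf-8")
    return reply
```

Here call_model stands in for whatever function sends the request to GLM. Because the key depends on the system message as well as the prompt, changing the prompt design invalidates old entries automatically.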

Why synthesize the data instead of scraping real reviews? Two reasons: (1) it gives us ground-truth labels for evaluation, because we ask GLM to write a review with a target sentiment and then check whether it classifies its own output correctly; (2) we control the distribution, so every aspect category is represented.

This self-grading setup (GLM classifies reviews it wrote itself) is documented as a limitation in the report. A real deployment would use human-annotated data.

For the live demo (5-minute pitch)

Open the dashboard on Overview — show the KPIs and sentiment-by-aspect chart.
Switch to By Course — pick a course with mixed feedback, narrate what a professor would learn from this view.
Switch to Live Demo — paste a review the audience suggests, or a pre-prepared bilingual one. Show the JSON output.
Close with the accuracy number from make eval.
