Speech-to-Text brick — push-to-talk recording via ALSA, transcription via Whisper.
All backends are accessed through a single `STT` class; switch between them by changing the `type=` parameter:

    from sunfounder_stt import STT

| `type=` | Engine | Online | Notes |
|---|---|---|---|
| `local_fast` (default) | faster-whisper CTranslate2 | No | int8 CPU, auto-downloads ~72 MB |
| `local_standard` | whisper.cpp GGML | No | needs `model=` path, e.g. `/app/models/ggml-tiny.bin` (42 MB) |
| `online` | OpenAI Whisper API | Yes | needs `API_KEY` and internet |

Push-to-talk with the default `local_fast` backend:

    from sunfounder_stt import STT

    stt = STT(type="local_fast", language="zh")

    # Button press
    stt.start_listening()
    # ... user speaks ...
    # Button release
    stt.stop_listening()
    text = stt.get_result()  # blocks until done, returns transcribed text

For `local_standard`, pass the GGML model path:

    stt = STT(type="local_standard", model="/app/models/ggml-tiny.bin", language="zh")
    # same API: start_listening() / stop_listening() / get_result()

For `online`, set the API key before creating the instance:

    STT.API_KEY = "sk-..."
    stt = STT(type="online", language="zh")
    # same API

| Method | Description |
|---|---|
| `start_listening()` | Spawn `arecord -D hw:0,2` and begin capturing PCM |
| `stop_listening()` | Terminate capture, save WAV, trigger transcription |
| `get_result(timeout=30) -> str` | Block until transcription completes (empty string on timeout) |
| `is_ready() -> bool` | Has transcription completed? (non-blocking) |
| `reset()` | Discard pending recording and result |
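
The non-blocking methods in the table can be combined into a polling loop, which is useful when the caller has other work to do while transcription runs. The following is a minimal sketch, not part of the brick: `wait_for_text`, `poll_interval`, and `on_tick` are hypothetical names, and the function only assumes an object that exposes `is_ready()`, `get_result()`, and `reset()` as documented above.

```python
import time


def wait_for_text(stt, timeout=30.0, poll_interval=0.1, on_tick=None):
    """Poll stt.is_ready() instead of blocking inside get_result().

    Returns the transcribed text, or "" if `timeout` seconds pass first,
    in which case the pending recording is discarded with reset().
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if stt.is_ready():
            # Transcription has finished, so this call should return at once.
            return stt.get_result()
        if on_tick is not None:
            on_tick()  # hypothetical hook: blink an LED, service other tasks
        time.sleep(poll_interval)
    stt.reset()  # give up: drop the pending recording and result
    return ""
```

Call it after `stop_listening()`, for example `text = wait_for_text(stt, timeout=10)`. Returning an empty string on timeout mirrors the documented behavior of `get_result()`.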
| Parameter | Default | Description |
|---|---|---|
| `type` | `"local_fast"` | Backend: `"local_fast"`, `"local_standard"`, or `"online"` |
| `model` | `None` | Model path for local backends (`"tiny"` if omitted) |
| `language` | `"en"` | Language code, e.g. `"zh"`, `"en"`, `"auto"` |
| `samplerate` | `16000` | Audio sample rate in Hz |
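
When the constructor arguments come from a config file or environment, it can help to validate them before creating the `STT` instance. The sketch below is illustrative only: `build_stt_kwargs` and `VALID_TYPES` are hypothetical names, the defaults are copied from the table above, and the file check for `local_standard` is an assumption about what a useful early failure looks like, not documented behavior of the class.

```python
import os

VALID_TYPES = ("local_fast", "local_standard", "online")


def build_stt_kwargs(backend=None, model=None, language="en", samplerate=16000):
    """Validate settings and return keyword arguments suitable for STT(...)."""
    backend = backend or "local_fast"  # documented default
    if backend not in VALID_TYPES:
        raise ValueError(f"unknown type {backend!r}; expected one of {VALID_TYPES}")
    if not isinstance(samplerate, int) or samplerate <= 0:
        raise ValueError("samplerate must be a positive integer (Hz)")
    # Assumption: only values that look like a path (they have a directory part)
    # are checked on disk; bare names such as "tiny" are passed through.
    if (backend == "local_standard" and model
            and os.path.dirname(model) and not os.path.isfile(model)):
        raise FileNotFoundError(f"GGML model not found: {model}")
    kwargs = {"type": backend, "language": language, "samplerate": samplerate}
    if model is not None:
        kwargs["model"] = model
    return kwargs
```

The result can be unpacked directly, as in `STT(**build_stt_kwargs("online", language="zh"))`.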
Audio is captured directly through ALSA with `arecord -D hw:0,2` (S16_LE, 16000 Hz, mono). The PCM stream is saved as a WAV file at `/app/audio_output/stt_last.wav`.
The Qualcomm Codec capture mixer is configured at container startup by setup_audio_output() in robot_shield.
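
To make the capture format concrete, here is a rough sketch of how an `arecord` command line with these settings could be built and how raw S16_LE mono PCM can be wrapped in a WAV container with Python's standard `wave` module. It is not the brick's actual implementation: `arecord_command` and `save_pcm_as_wav` are hypothetical helpers, and capturing headerless PCM (`-t raw`) is an assumption.

```python
import wave


def arecord_command(device="hw:0,2", samplerate=16000):
    """Build an arecord invocation that writes raw S16_LE mono PCM to stdout."""
    return [
        "arecord",
        "-D", device,            # ALSA capture device
        "-f", "S16_LE",          # signed 16-bit little-endian samples
        "-r", str(samplerate),   # sample rate in Hz
        "-c", "1",               # mono
        "-t", "raw",             # headerless PCM; the WAV header is added later
        "-q",                    # suppress status messages
    ]


def save_pcm_as_wav(pcm, path, samplerate=16000):
    """Wrap raw S16_LE mono PCM bytes in a WAV file; return duration in seconds."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 16 bits = 2 bytes per sample
        wav.setframerate(samplerate)
        wav.writeframes(pcm)
    return len(pcm) / (2 * samplerate)
```

In a push-to-talk flow, the command would typically be started with `subprocess.Popen(arecord_command(), stdout=subprocess.PIPE)` on button press and terminated on release, after which the bytes read from its stdout are passed to `save_pcm_as_wav`.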
Host pre-requisite:

    systemctl --user stop pipewire pipewire-pulse wireplumber

Dependencies:

- `faster-whisper` (for `local_fast`; default)
- `pywhispercpp` (for `local_standard`)
- `requests` (for `online`)
- `arecord` from alsa-utils (system package)
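
A quick way to confirm that the dependencies for a given backend are present is to probe for the Python module and the `arecord` binary before creating the `STT` instance. This is a minimal sketch with hypothetical names (`BACKEND_MODULES`, `missing_dependencies`); it assumes the usual import names `faster_whisper`, `pywhispercpp`, and `requests` for the packages listed above.

```python
import importlib.util
import shutil

# Import name of the Python package each backend relies on.
BACKEND_MODULES = {
    "local_fast": "faster_whisper",
    "local_standard": "pywhispercpp",
    "online": "requests",
}


def missing_dependencies(backend, find_module=importlib.util.find_spec,
                         find_tool=shutil.which):
    """Return human-readable names of dependencies missing for `backend`."""
    if backend not in BACKEND_MODULES:
        raise ValueError(f"unknown backend: {backend!r}")
    missing = []
    module = BACKEND_MODULES[backend]
    if find_module(module) is None:  # module is not importable
        missing.append(f"python module {module}")
    if find_tool("arecord") is None:  # binary is not on PATH
        missing.append("arecord (alsa-utils)")
    return missing
```

An empty list means the backend should be usable; otherwise the entries name what to install. The two lookup functions are parameters only so the check can be exercised without the real packages.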