Add llamacpp-cpu-qwen3-embed (CPU embedding) extension #1
Merged
CPU-only Qwen3-Embedding-0.6B via llama.cpp, OpenAI-compatible /v1/embeddings on port 8007. Runs on macOS/Windows Docker Desktop, no GPU. Mirrors the bundle added to kh0pper/crow upstream.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… 8192

The manifest declared contextLen 32768 while docker-compose serves --ctx-size 8192, so inputs over 8K tokens would be silently rejected despite the advertised capacity. Lower the declared contextLen (crow-addon.json + registry.json entry) to match what the CPU server actually serves.

Vector space is unchanged (1024-dim, same model), so embeddings stay interchangeable with the GPU bundles; only max input length differs. Mirrors the same fix in crow PR #111.
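The mismatch described above could be caught mechanically. The following is a minimal sketch, not part of this PR: it assumes the server arguments are available as a list of strings (as in a compose `command:` entry) and that the declared value has already been read from the manifest. The helper names `served_ctx_size` and `check_context` and the sample argument list are hypothetical.

```python
def served_ctx_size(command):
    """Return the integer following --ctx-size (or its short form -c), or None."""
    for i, arg in enumerate(command):
        if arg in ("--ctx-size", "-c") and i + 1 < len(command):
            return int(command[i + 1])
    return None


def check_context(declared_ctx, command):
    """Return a problem description, or None when the declared length is servable."""
    served = served_ctx_size(command)
    if served is None:
        return "no --ctx-size flag found in server command"
    if declared_ctx > served:
        return f"manifest declares {declared_ctx} but server serves {served}"
    return None


# Illustrative argument list; the real compose file may differ.
args = ["--port", "8007", "--ctx-size", "8192"]
print(check_context(32768, args))  # the state before this fix
print(check_context(8192, args))   # the state after this fix
```

A check like this would flag the 32768 versus 8192 case before a client discovers it through rejected inputs.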
CPU-only Qwen3-Embedding-0.6B via llama.cpp, OpenAI-compatible /v1/embeddings on port 8007. Runs on macOS/Windows Docker Desktop — no GPU required.
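For reviewers who want to exercise the endpoint, here is a rough client sketch using only the Python standard library. It assumes the container publishes port 8007 on localhost and that the server follows the usual OpenAI embeddings request and response shape (`input` in, `data[].embedding` out). The `model` string and the helper names are placeholders, not values taken from this PR.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8007"  # assumption: port 8007 is published on localhost
EXPECTED_DIM = 1024                 # vector size stated in this PR


def build_request(texts, model="qwen3-embedding-0.6b"):
    """Build a POST request for the OpenAI-compatible embeddings route."""
    payload = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v1/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_embeddings(body):
    """Extract vectors in input order and check their dimensionality."""
    items = sorted(body["data"], key=lambda item: item["index"])
    vectors = [item["embedding"] for item in items]
    for vec in vectors:
        if len(vec) != EXPECTED_DIM:
            raise ValueError(f"expected {EXPECTED_DIM} dims, got {len(vec)}")
    return vectors


def embed(texts):
    """Send texts to the running server and return one vector per text."""
    with urllib.request.urlopen(build_request(texts), timeout=60) as resp:
        return parse_embeddings(json.load(resp))
```

With the container running, `embed(["hello world"])` should return a single 1024-element list. Inputs longer than the served context (8192 tokens) are expected to fail, so callers may want to truncate or chunk beforehand.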
Mirrors the bundle added upstream in kh0pper/crow#111. Adds llamacpp-cpu-qwen3-embed/ (crow-addon.json + docker-compose.yml) and the registry.json entry.

🤖 Generated with Claude Code