hamr0/bareagent

                         ╭─────────────────────────────────╮
                         │  ╔╗ ╔═╗╦═╗╔═╗ ╔═╗╔═╗╔═╗╔╗╔╔╦╗   │
                         │  ╠╩╗╠═╣╠╦╝╠╣  ╠═╣║ ╦╠╣ ║║║ ║    │
                         │  ╚═╝╩ ╩╩╚═╚═╝ ╩ ╩╚═╝╚═╝╝╚╝ ╩    │
                         │   think ──→ act ──→ observe     │
                         │     ↑                  │        │
                         │     └──────────────────┘        │
                         ╰──╮──────────────────────────────╯
                            ╰── the brain, without the bloat

License: Apache 2.0

Agent orchestration in ~2.7K lines of core. One required dep (bareguard ^0.2.0).

Lightweight enough to understand completely. Complete enough to not reinvent wheels. Not a framework, not 50,000 lines of opinions — just composable building blocks for agents. Single-gate governance via bareguard: every tool call traverses one policy hook, one audit log, one budget cap.

Quick start

npm install bare-agent

1. Give your AI assistant the integration guide

Read bareagent.context.md from node_modules/bare-agent/bareagent.context.md

This single file contains component selection, wiring recipes, API signatures, and gotchas — everything an agent needs to use the library correctly.

2. Describe what you want

I need an agent that:
- Takes a user goal and breaks it into steps
- Runs steps in parallel where possible
- Retries failed steps twice
- Streams progress as JSONL events

Use bare-agent. The integration guide is in bareagent.context.md.

That's it. The context doc is structured for LLM consumption — your agent reads it once and knows how to wire every component.

Not sure what you need? Paste this into any AI assistant:

I want to build an agent using bare-agent. Read the integration guide at
node_modules/bare-agent/bareagent.context.md, then ask me up to 5 questions
about what I need. Based on my answers, tell me which components to use
and show me the wiring code.

What's inside

Every piece works alone — take what you need, ignore the rest.

| Component | What it does |
|---|---|
| Loop | Think → act → observe → repeat. Calls any LLM, runs your tools, loops until done, returns estimated USD cost per run. Three opt-in seams hook external libraries in without touching your code: policy (governance — wire bareguard for one gated chokepoint over every tool call), assemble (context engineering — recall/compress/trim the window per round; the seam litectx plugs into, transcript untouched), and trim (destructively bound the transcript for unbounded runs, harvesting turns before eviction). Each is a single chokepoint, fail-open, off by default. onError + loop:error surface every silent failure |
| Planner | Break a goal into a step DAG via LLM. Built-in caching (cacheTTL) |
| assessComplexity | Pure-code pre-planner (no LLM): rates a goal simple/medium/complex/critical from its text via keyword scoring + a critical safety override. needsPlanning gates whether to spend a Planner pass; critical flags security/production/compliance work for extra scrutiny. Free, instant, debuggable via signals |
| runPlan | Execute steps in parallel waves. Dependency-aware, failure propagation, per-step retry |
| Retry | Exponential/linear backoff with jitter. Respects err.retryable |
| CircuitBreaker | Fail fast after N errors. Auto-recovers after cooldown. Per-key isolation |
| Fallback | Try providers in order — if one is down, next one picks up. Transparent to Loop |
| Memory | Persist and search context across turns/sessions through a swappable Store. Zero-dep JSON file by default, or mount litectx for ranked, graph-aware recall in one line — the host code never changes (example). A minimal SQLite FTS5 store also ships, though litectx supersedes it for SQLite-backed memory |
| StateMachine | Task lifecycle tracking with event hooks. pending → running → done / failed / waiting / cancelled |
| Checkpoint | Human approval gate. You provide the transport — terminal, Telegram, Slack, whatever |
| Scheduler | Cron (0 9 * * 1-5) or relative (2h, 30m). Persisted jobs survive restarts |
| Stream | Structured event emitter. Pipe as JSONL, subscribe in-process, or custom transport |
| Errors | Typed hierarchy — ProviderError, ToolError, TimeoutError, CircuitOpenError, ValidationError. Halt decisions (turn cap, budget cap, content rules) come from bareguard, not Loop |
| bareguard adapter | wireGate(gate) → { policy, onLlmResult, onToolResult, filterTools, formatDeny }: one-line wiring to bareguard's Gate. Routes every LLM + tool result through the gate so budget caps cover token-heavy workloads, drops denied tools before the LLM ever sees them, and turns halts into a clean exit. require('bare-agent/bareguard') |
| Browsing | Web navigation, clicking, typing, reading via barebrowse (17 tools). Two modes: library tools (inline snapshots, pass to Loop) or CLI session (disk-based snapshots, token-efficient for multi-step flows). Optional assess tool (privacy scan) when wearehere is installed |
| Mobile | Android + iOS device control via baremobile. Same two modes: library tools (createMobileTools — action tools auto-return snapshots) or CLI session (baremobile CLI — disk-based snapshots) |
| Shell | Cross-platform shell_read, shell_grep, shell_run (argv, no shell), shell_exec (raw shell). Pure Node — no grep/rg/findstr dependency. Injection-proof shell_run for policy-gated use |
| MCP Bridge | Auto-discover MCP servers from your $HOME/IDE configs (Claude Code, Cursor, …) and expose them as bareagent tools — bulk (tools) or token-thrifty meta-tools (mcp_discover + mcp_invoke) for large catalogs. Same Loop({ policy }) hook governs MCP and native tools alike. The project-cwd .mcp.json is opt-in (untrusted-repo safety); vet every server spawn with confirmServer; every RPC is time-bounded. Zero deps |
| Spawn | Fork a child bareagent as a specialist agent — LLM-callable (blocks until exit) or a library handle (wait, onLine, kill). The whole family stitches into one audit log + budget; bareguard ^0.2.0 adds per-family rate + depth caps. timeoutMs caps wall-clock, opt-in idleTimeoutMs kills a child gone silent (slow-but-working children survive) |
| Defer | Queue a {action, when} record for a separate waker (cron / examples/wake.sh) to fire later. Governed twice — once when emitted, again when it fires. bareguard ^0.2.0 adds a family-wide rate cap |
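
To make the Retry row concrete, here is a minimal standalone sketch of exponential backoff with jitter. It is not bare-agent's Retry implementation: the function names and option names (retries, baseMs, maxMs, jitter, sleep) are hypothetical, and it assumes that an error carrying retryable === false should not be retried.

```javascript
// Standalone sketch; NOT bare-agent's Retry. All names here are hypothetical.

// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at
// maxMs, then scaled by a random factor in (1 - jitter, 1].
function backoffDelay(attempt, { baseMs = 100, maxMs = 5000, jitter = 0.5, random = Math.random } = {}) {
  const capped = Math.min(maxMs, baseMs * 2 ** attempt);
  return Math.round(capped * (1 - jitter * random()));
}

// Calls fn until it succeeds, the retry budget is spent, or the error opts out.
async function retry(fn, { retries = 2, sleep = (ms) => new Promise((r) => setTimeout(r, ms)), ...delayOpts } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn(attempt);
    } catch (err) {
      // Assumed convention: errors are retryable unless flagged otherwise.
      const retryable = err.retryable !== false;
      if (!retryable || attempt >= retries) throw err;
      await sleep(backoffDelay(attempt, delayOpts));
    }
  }
}
```

Jitter spreads retries out so that many clients failing at the same moment do not all retry at the same moment. Passing sleep and random as options keeps the sketch easy to test without real timers.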

Providers: OpenAI-compatible (OpenAI, OpenRouter, Groq, vLLM, LM Studio), Anthropic, Ollama, CLIPipe (any CLI tool via stdin/stdout with real-time streaming), Fallback, or bring your own (one method: generate). All return the same shape — swap freely. The OpenAI provider warns if it would send your key over plaintext http:// to a non-loopback host (use https, or drop apiKey for keyless local endpoints).
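
The "one method: generate" contract and the try-in-order behaviour can be sketched as follows. This is an illustration, not bare-agent's Fallback class: generateWithFallback and both providers are hypothetical, and the result shape ({ text }) is an assumption, since the real shared shape is defined in the Integration Guide.

```javascript
// Standalone sketch; not bare-agent's Fallback. Assumes each provider exposes
// generate(messages) and resolves to an object with a `text` field.
async function generateWithFallback(providers, messages) {
  const errors = [];
  for (const provider of providers) {
    try {
      return await provider.generate(messages);
    } catch (err) {
      errors.push(err); // remember why this provider failed, then try the next
    }
  }
  throw new AggregateError(errors, 'all providers failed');
}

// Two hypothetical providers: one always down, one that echoes the last message.
const down = { generate: async () => { throw new Error('503'); } };
const echo = {
  generate: async (messages) => ({ text: messages[messages.length - 1].content }),
};

// Usage: the caller sees a single result and never learns that `down` failed.
// generateWithFallback([down, echo], [{ role: 'user', content: 'hi' }])
//   .then((result) => console.log(result.text));
```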

Tools: Any function is a tool. REST APIs, MCP servers, CLI commands, shell scripts — if it's a function, it works. Built-in: barebrowse for web browsing, baremobile for Android + iOS device control (both optional) — library tools for inline results or CLI session mode for token-efficient disk-based snapshots.

Cross-language: Runs as a subprocess. Communicate via JSONL on stdin/stdout from Python, Go, Rust, Ruby, Java, or anything that can spawn a process. Ready-made wrappers in contrib/.
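
A host process reading JSONL from a child's stdout has to cope with records split across chunks. The sketch below shows one way to frame lines. The createJsonlParser name and the message fields are hypothetical and are not bare-agent's protocol, which is documented in contrib/README.md.

```javascript
// Standalone sketch of JSONL framing; message fields are made up for the demo.
function createJsonlParser(onMessage) {
  let buffer = '';
  return function feed(chunk) {
    buffer += chunk;
    let newline;
    // Emit every complete line; keep any trailing partial line in the buffer.
    while ((newline = buffer.indexOf('\n')) !== -1) {
      const line = buffer.slice(0, newline).trim();
      buffer = buffer.slice(newline + 1);
      if (line) onMessage(JSON.parse(line));
    }
  };
}

const received = [];
const feed = createJsonlParser((msg) => received.push(msg));
feed('{"type":"event","n":1}\n{"type":"ev'); // second record is split mid-line
feed('ent","n":2}\n');
// received now holds two parsed objects, with n === 1 and n === 2.
```

With a real child process, calling child.stdout.setEncoding('utf8') before feeding chunks avoids splitting multi-byte characters across reads.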

Deps: 1 required (bareguard ^0.2.0 for governance — single-gate policy + audit + budget + per-family rate caps). Optional: cron-parser (cron expressions), better-sqlite3 (SQLite store), barebrowse (web browsing), baremobile (Android + iOS device control), wearehere (privacy assessment via barebrowse).

This table is the map, not the manual — per-component wiring and API detail live in the Integration Guide and Usage Guide.
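
As one illustration of the runPlan row, "parallel waves" can be read as grouping steps so that each wave depends only on earlier waves. The sketch below computes such waves. It is not runPlan's implementation, and the step shape ({ id, dependsOn }) and the toWaves name are hypothetical.

```javascript
// Standalone sketch; not runPlan. The step shape ({ id, dependsOn }) is assumed.
function toWaves(steps) {
  const done = new Set();
  let remaining = [...steps];
  const waves = [];
  while (remaining.length > 0) {
    // A step is ready once every dependency finished in an earlier wave.
    const ready = remaining.filter((s) => (s.dependsOn || []).every((d) => done.has(d)));
    if (ready.length === 0) throw new Error('cycle or missing dependency');
    waves.push(ready.map((s) => s.id));
    for (const s of ready) done.add(s.id);
    remaining = remaining.filter((s) => !done.has(s.id));
  }
  return waves;
}

const waves = toWaves([
  { id: 'fetch' },
  { id: 'parse', dependsOn: ['fetch'] },
  { id: 'lint', dependsOn: ['fetch'] },
  { id: 'report', dependsOn: ['parse', 'lint'] },
]);
// waves → [['fetch'], ['parse', 'lint'], ['report']]
```

Each inner array could then be run with Promise.all before moving on to the next wave.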


Recipes

Wire bareguard into Loop

const { Gate } = require('bareguard');
const { Loop, wireGate, defaultActionTranslator } = require('bare-agent');

const gate = new Gate({
  budget: { maxCostUsd: 0.50 },
  limits: { maxToolRounds: 20 },                // bareguard 0.4.2+ — N tool rounds, LLM rounds bypass
  audit:  { path: './audit.jsonl' },
});
await gate.init();

const { policy, onLlmResult, onToolResult, filterTools } = wireGate(gate, {
  // Optional: translate tool names → bareguard primitive types for bash/fs/net rules.
  // bareguard 0.4.1+ reads args.command / args.path verbatim, so args passes through.
  actionTranslator: (toolName, args, ctx) => {
    if (toolName === 'shell_exec') return { type: 'bash', args, _ctx: ctx };
    if (toolName === 'shell_read') return { type: 'read', args, _ctx: ctx };
    return defaultActionTranslator(toolName, args, ctx);
  },
});
const tools = await filterTools(myTools);      // drop tools denied by static policy

const loop = new Loop({ provider, policy, onLlmResult, onToolResult });
await loop.run([{ role: 'user', content: 'go' }], tools, { ctx: { userId: 42 } });

onLlmResult + onToolResult are what make budget.maxCostUsd actually cover token-heavy workloads — without them, budget only sees tool cost. ctx flows through to gate.record as _ctx for per-principal accounting.

Per-principal bypass (owner / admin role)

Wrap the gate policy when a principal is trusted unconditionally:

const { policy: gatePolicy } = wireGate(gate);

const policy = async (toolName, args, ctx) => {
  if (ctx?.role === 'owner') return true;       // bypass gate entirely
  return gatePolicy(toolName, args, ctx);
};

new Loop({ provider, policy, onLlmResult, onToolResult });

Bypassing the gate also bypasses audit and budget — only do this for principals you trust unconditionally. For partial trust, use ctx-aware rules inside bareguard instead.

Custom deny strings (localize / strip prefix)

const { policy } = wireGate(gate, {
  formatDeny: (decision) => `Sorry — ${decision.reason || 'not allowed'}`,
});

Halt-severity decisions bypass formatDeny (they throw HaltError and exit the loop without ever reaching the LLM).

Catch halts in your app

const result = await loop.run(msgs, tools);
if (result.error?.startsWith('halt:')) {
  // budget cap, turn cap, or gate terminated. Inspect rule:
  const rule = result.error.slice('halt:'.length);
  // tell the user, schedule retry, escalate, etc.
}

Halts also fire loop:error on the stream (source: 'halt') and the onError callback (with a HaltError instance).


Examples

Runnable scripts in examples/ — each is self-contained and the file's top docstring documents flags and required env vars.

| File | What it shows |
|---|---|
| with-bareguard.mjs | End-to-end Loop + bareguard wiring: budget cap, fs scope, bash allowlist, audit log, humanChannel. The canonical governed-loop reference. |
| mcp-bridge-poc.js | Auto-discover MCP servers from your IDE configs and expose them as bareagent tools. First run writes .mcp-bridge.json (edit to deny tools). |
| mcp-bridge-concurrent.js | Soak test: fan out concurrent barebrowse_browse calls against real domains (Amazon, Wikipedia, GitHub, a dead host) and verify resilience. |
| orchestrator/ | Multi-agent dispatch via spawn. Three configs, one system prompt — no orchestrator class, no role types. Roles are JSON files. |
| wake.sh + wake.md | Reference cron + jq script for firing deferred actions. The runtime half of createDeferTool — bareagent emits, wake.sh fires. |
| replay-job.js | Supervised replay POC: record a browser task once with the LLM driving, then replay against fresh snapshots with the LLM as locator-only. Falls back to full reasoning when the locator misses, and patches the trace. |
| litectx-as-store.mjs | Mount litectx as the Memory Store — one-line swap from JsonFileStore to ranked, graph-aware recall; the host code never changes (RT-3). |
| litectx-mcp-child.mjs | Give a spawned child agent litectx's reasoning verbs as MCP tools, read-only on its own db, via liteCtxMcpBridgeConfig + cfg.mcp (RT-4). |

Cross-language usage

Not using Node.js? Spawn bare-agent as a subprocess from any language. Ready-made wrappers in contrib/ for Python, Go, Rust, Ruby, and Java — copy one file, no package registry needed.

# Python — 3 lines to run an agent
from bareagent import BareAgent

agent = BareAgent(provider="openai", model="gpt-4o-mini")
result = agent.run("What is the capital of France?")
print(result["text"])  # → "The capital of France is Paris."
agent.close()

// Go — same pattern
agent, _ := bareagent.New("anthropic", "claude-haiku-4-5-20251001", "")
result, _ := agent.Run("What is the capital of France?")
fmt.Println(result.Text)
agent.Close()

# Ruby — same pattern
agent = BareAgent.new(provider: "ollama", model: "llama3.2")
result = agent.run("What is the capital of France?")
puts result["text"]
agent.close

All wrappers support optional event streaming for intermediate results. See contrib/README.md for Rust, Java, and full protocol reference.


Production-validated

Component Aurora (SOAR2) Multis (assistant)
Loop
Planner
runPlan
Retry
CircuitBreaker
Scheduler
Checkpoint
CLIPipe

Aurora replaced ~400 lines of hand-rolled orchestration with ~60 lines of bare-agent wiring — zero workarounds, zero framework plumbing, 100% domain logic.

For wiring recipes and API details, see the Integration Guide (LLM-optimized). For the full human guide — usage patterns, composition examples, and what bare-agent deliberately doesn't build in (with recipes to do it yourself), see the Usage Guide. For error reference, see Error Guide. For release history, see CHANGELOG.

The bare ecosystem

Local-first, composable agent infrastructure. Same API patterns throughout — mix and match, each module works standalone.

Core — the brain, the gate, the memory.

  • bareagent — the think→act→observe loop. Goal in → coordinated actions out. Replaces LangChain, CrewAI, AutoGen.
  • bareguard — the single gate every action passes through. Action in → allow / deny / ask-a-human out. Replaces hand-rolled allowlists and scattered policy code.
  • litectx — tree-sitter code + memory graph with activation decay, plus lightweight context engineering (write · select · compress · isolate). Query in → ranked context out.

Optional reach — give the agent hands.

  • barebrowse — a real browser for agents. URL in → pruned snapshot out. Replaces Playwright, Selenium, Puppeteer.
  • baremobile — Android + iOS device control. Screen in → pruned snapshot out. Replaces Appium, Espresso, XCUITest.
  • beeperbox — 50+ messaging networks via one MCP server (headless Beeper Desktop in Docker). Chat in → unified message stream out. Replaces Twilio, per-platform bot APIs.

What you can build:

  • Headless automation — scrape sites, fill forms, extract data, monitor pages on a schedule
  • QA & testing — automated test suites for web and Android apps without heavyweight frameworks
  • Personal AI assistants — chatbots that browse the web or control your phone on your behalf
  • Remote device control — manage Android devices over WiFi, including on-device via Termux
  • Agentic workflows — multi-step tasks where an AI plans, browses, and acts across web and mobile

Why this exists: Most automation stacks ship 200MB of opinions before you write a line of code. These don't. Install, import, go.

License

Apache License, Version 2.0 — see LICENSE and NOTICE.
