I was exicited and scared of OpenClaw, so I locked it down.
I Watched OpenClaw Delete a Meta Director's Inbox. And decided I need a kill switch — before the agent decides for me.
I like OpenClaw. I use it for many personal things - call me to remind about doctor appoint, search for opensource Github project,… you call it! It’s fast, it’s hackable, and it connects to basically everything.
And then it deleted Meta Director’s email.
If you missed it: in February 2026, an OpenClaw agent connected to a Meta director for AI’s inbox went on a speed-run. It mass-deleted emails, ignored stop commands, blew through cost in minutes, and kept going even after the user tried to shut it down. The context window compacted and the agent lost track of the original instructions. It just… decided deleting was the task.
That scared me. Not because OpenClaw is broken — it’s a great agent runtime. But because there’s nothing between the agent and the API. No filter on what tools the model sees. No cost ceiling. No way to remotely kill a run. No record of what happened that you could trust after the fact. It’s a straight pipe from agent to OpenAI, and if the agent goes sideways, you find out when the damage is done.
So I built a way to put a wall in front of it.
The actual problem
OpenClaw sends your LLM requests directly to OpenAI. The model sees every tool you registered — including delete_emails, bulk_remove, drop_table, whatever you’ve wired up. If the model decides to call one, it calls it. There’s no checkpoint, no approval, no “hey, are you sure?”
And there’s no audit trail. If something goes wrong, you’re digging through stdout logs trying to reconstruct what the agent did, in what order, with whose data. Good luck.
What I wanted was simple:
Don’t let the model see tools it shouldn’t use. Not “block the call after it happens.” Remove the tool from the request before the model knows it exists. It can’t call
delete_emailsif it was never told aboutdelete_emails.Cap the spend. Daily, monthly, per-request. When the budget’s done, the gateway says no.
Record everything. Every request, every denial, every tool that got stripped. Signed, queryable, trustworthy.
Keep my real API key out of OpenClaw. OpenClaw gets a caller token. The real key lives in an encrypted vault and gets injected at forward time.
How I set it up
I built this into Dativo Talon — a single Go binary that sits between OpenClaw and OpenAI. Here’s the exact setup I run.
Step 1: Install and init
go install github.com/dativo-io/talon/cmd/talon@latest
mkdir talon-openclaw && cd talon-openclaw
talon init --pack openclaw --name openclaw-gatewayThis generates two files: agent.talon.yaml (server policy) and talon.config.yaml (gateway config). The gateway config is where the real controls live.
Step 2: Store your OpenAI key in the vault
export TALON_SECRETS_KEY=$(openssl rand -hex 32) # save this somewhere safe
talon secrets set openai-api-key "$OPENAI_API_KEY"Your real OpenAI key is now encrypted at rest. OpenClaw will never see it.
Step 3: Start the gateway
talon serve --gatewayThat’s it. Talon is now listening on localhost:8080.
Step 4: Point OpenClaw at Talon
In ~/.openclaw/openclaw.json:
{
"models": {
"providers": {
"openai": {
"baseUrl": "http://localhost:8080/v1/proxy/openai/v1",
"apiKey": "talon-gw-openclaw-001",
"api": "openai-responses",
"models": [
{ "id": "gpt-4o", "name": "gpt-4o" },
{ "id": "gpt-4o-mini", "name": "gpt-4o-mini" }
]
}
}
}
}Notice the apiKey — that’s the caller token, not the OpenAI key. Talon identifies OpenClaw by this token and injects the real key when it forwards to OpenAI.
Restart OpenClaw (openclaw gateway stop && openclaw gateway start) and you’re running through the gateway.
The config that would have stopped the inbox incident
Here’s the talon.config.yaml I use. I’ll walk through the parts that matter.
gateway:
enabled: true
listen_prefix: "/v1/proxy"
mode: "enforce"
providers:
openai:
enabled: true
secret_name: "openai-api-key"
base_url: "https://api.openai.com"
allowed_models: ["gpt-4o", "gpt-4o-mini", "gpt-4-turbo"]
callers:
- name: "openclaw-main"
api_key: "talon-gw-openclaw-001"
tenant_id: "default"
team: "engineering"
allowed_providers: ["openai"]
policy_overrides:
max_daily_cost: 25.00
max_monthly_cost: 500.00
pii_action: "redact"
allowed_models: ["gpt-4o", "gpt-4o-mini", "gpt-4-turbo"]
default_policy:
require_caller_id: true
log_prompts: true
# --- THIS IS THE BIG ONE ---
# Tool governance: strip dangerous tools BEFORE the model sees them.
tool_policy_action: "filter"
forbidden_tools:
- "delete_*"
- "admin_*"
- "export_all_*"
- "bulk_*"
- "rm_*"
- "drop_*"
# PII: redact personal data from requests headed to OpenAI
default_pii_action: "redact"
response_pii_action: "warn"
# Attachments: scan PDFs and CSVs for prompt injection
attachment_policy:
action: "warn"
injection_action: "block"
max_file_size_mb: 10
rate_limits:
global_requests_per_min: 300
per_caller_requests_per_min: 60
timeouts:
connect_timeout: 10s
request_timeout: 120s
stream_idle_timeout: 60sLet me break down what each piece would have done in the emails incident:
forbidden_tools: ["delete_*", "bulk_*"] — The agent had access to delete_email, delete_thread, and bulk operations. With this config, Talon strips those tools from the JSON body before OpenAI ever sees them. The model literally cannot decide to delete anything because it doesn’t know deletion is an option.
tool_policy_action: "filter" — This is the mode. "filter" silently removes forbidden tools and forwards the rest. If you want to be more aggressive, set it to "block" — that rejects the entire request if any forbidden tool is present. I prefer "filter" because it keeps the agent functional for everything except the dangerous stuff.
max_daily_cost: 25.00 — The incident ran up significant cost in minutes. This cap shuts the door after $25/day for this caller. Done. No negotiation.
per_caller_requests_per_min: 60 — The agent was firing requests as fast as it could. Rate limiting slows a runaway agent to a manageable pace and gives you time to notice.
request_timeout: 120s — No single request gets more than 2 minutes. The agent can’t sit in an infinite loop waiting for a response.
Per-caller tool allowlists (when you want to be strict)
If forbidden_tools is a blocklist, you can also go the other direction — a strict allowlist. Only the tools you name get through:
callers:
- name: "openclaw-main"
policy_overrides:
allowed_tools: ["search_web", "read_file", "list_files", "create_draft"]
tool_policy_action: "block"Now OpenClaw can only use those four tools. Everything else — delete_emails, send_email, admin_reset, whatever — gets rejected. The model never sees them. This is the nuclear option and it’s the one I’d use if I were connecting an agent to anyone’s inbox.
Verify it works
Send a request with a dangerous tool and watch what happens:
curl -s -X POST http://localhost:8080/v1/proxy/openai/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer talon-gw-openclaw-001" \
-d '{
"model": "gpt-4o-mini",
"messages": [{"role":"user","content":"Clean up my inbox"}],
"tools": [
{"type":"function","function":{"name":"search_web","parameters":{}}},
{"type":"function","function":{"name":"delete_emails","parameters":{}}}
]
}'delete_emails gets stripped. The model only sees search_web. Check the evidence:
talon audit list --agent openclaw-main --limit 5You’ll see exactly which tools were requested, which were filtered, and which were forwarded. Signed and timestamped.
When this isn’t the answer
You’re just playing around. If it’s a hobby project and nothing is at stake, the gateway is overhead you don’t need. Especially if you have unlimited money , and you have nothing to hide ;)
You trust the tool set completely. If your agent only has read-only tools — no delete, no write, no send — the risk profile is lower. Still worth auditing, but the urgency is different.
You need governance inside MCP tool calls. The gateway governs what goes to and from the LLM. If you need policy on every individual tool invocation (not just what the model is told about), that’s Talon’s MCP proxy — a different deployment shape.
Final thought
The email incident wasn’t a bug in OpenClaw. It was a missing layer. The agent did exactly what agents do — it picked from the tools it was given and executed. The problem is it was given delete_emails and nobody was standing between the model and that tool.
That’s what I’m solving. Not replacing OpenClaw — I still use it every day( finger cross my tool would catch all dangerous stuff). Just making sure it runs through a gateway that strips the dangerous tools, caps the cost, and writes down everything that happened. If something goes wrong, I want to know exactly what the agent tried to do and exactly where it was stopped.
talon init --pack openclaw. Fifteen minutes. That’s the difference between “the agent deleted everything” and “the agent tried to delete everything and was told no.”


