Don't Let the Lobster Fool You: OpenClaw Security Risks Explained

OpenClaw is an open-source AI agent designed to operate as a personal assistant. To do so, it runs entirely on the user's local machine and autonomously interacts with real user environments: terminals, browsers, APIs, files, and enterprise tools. Don’t be fooled by the cute lobster logo. Given its broad capabilities and poor security defaults, security teams should pay special attention to OpenClaw usage within their organization. Indeed, many are banning it outright, but this too should be monitored and enforced.

Key Takeaways

OpenClaw is an autonomous AI agent that can directly execute actions across systems, files, APIs, browsers, and enterprise tools.
Its combination of autonomy, system-level permissions, and external integrations creates a much larger attack surface than traditional AI applications.
Key risks include prompt injection, malicious skills/plugins, malware delivery, excessive permissions, exposed infrastructure, and system prompt leakage.
Researchers have identified many OpenClaw deployments running with unsecured defaults, weak authentication, exposed interfaces, and overly broad access.
OpenClaw should be treated as a privileged operational system requiring sandboxing, least-privilege access, network isolation, and continuous security monitoring and threat blocking.

What is OpenClaw?

OpenClaw is an open-source agent that operates through four stages. First, it perceives its environment by reading files, tool outputs, and system state. Second, it plans by decomposing a high-level goal into concrete sub-tasks. Then, it acts by executing those tasks, like running shell commands, navigating browsers, and calling APIs. Finally, it reflects on the results, self-correcting until the goal is reached. This loop runs autonomously.

For enterprise users, OpenClaw can significantly enhance workflow automation. Rather than rigid, brittle automation scripts that break the moment a UI changes or an API shifts, OpenClaw can reason through unexpected states and adapt to a changed interface, retry with a different approach, or escalate when it can't proceed. But these same advantages can be exploited to introduce new security risks.

How OpenClaw Changes the Security Model

OpenClaw vs. LLMs and AI Chatbots

Most AI applications in production today are, at their core, sophisticated text processors. They take input, reason over it, and return output: a response, a summary, a suggestion, a draft. The human remains the executor. They read the outputs, decide what to do with it, and are the ones to take operational action.

Even when that output is harmful, like misinformation, malicious code, or manipulative content, the human acts as a guardrail, exercising judgement before action.

OpenClaw represents a fundamentally different model: the agent is the executor. OpenClaw agents (similar to other agents) can browse live websites, execute terminal commands, read and write files on disk, call external APIs, interact with SaaS platforms, and chain all of these actions together across multi-step workflows with minimal human intervention.

The security risk is operational compromise: files deleted, credentials exfiltrated, systems misconfigured, pipelines triggered, external services called, data moved.

OpenClaw vs. Incumbent Agents

Yet OpenClaw also introduces a risk unlike any posed by other agents. Earlier agentic systems, like copilot or enterprise workflow agents, were built with constraints as a feature. They operated inside defined sandboxes: a specific application, a sanctioned API surface, or a narrow task scope. Even the more capable enterprise agents were tethered to a vendor's guardrails, audit systems, and support infrastructure.

OpenClaw-style agents dissolve those boundaries. They can dynamically reach across environments, chain vulnerability and actions across systems, and adapt their behavior to achieve their goals. A vulnerability in one layer can cascade into another. A prompt injection may trigger tool execution, expose credentials, invoke external APIs, modify files, or establish persistence across connected systems. Because the agent inherits the permissions and trust relationships of the environments it interacts with, small weaknesses can compound into larger operational compromises.

The open-source dimension compounds this significantly. With a commercial agent platform, the vendor is accountable for patch notifications, incident response, or a support channel when something goes wrong. With OpenClaw, the security posture depends entirely on the deploying team's expertise, vulnerabilities may persist silently because no one is centrally tracking or responsible for them, and when an unexpected behavior surfaces there is no parent organization to escalate to.

6 Key OpenClaw Security Risks

1. Excessive Permissions and System-Level Access

Many deployments grant the OpenClaw agent broad permissions across the operating system, browser sessions, SaaS applications, local files, APIs, and cloud environments. While this enables powerful automation, it also creates an extremely large blast radius if the agent is compromised, manipulated, or behaves unexpectedly. In enterprise environments, this effectively turns the AI agent into a privileged lateral movement platform for attackers.

Researchers and users have warned that OpenClaw instances frequently run with full shell access, unrestricted filesystem permissions, and persistent authentication tokens. Meta’s Security Researcher, Summer Yue, famously had her email deleted by OpenClaw, and SMU’s Office of Information Technology warns that OpenClaw is not approved for use on university‑owned devices because it operates directly on the host OS.

Backslash Security research team found that even when OpenClaw was configured for a narrow, low-risk use case with intentionally minimal permissions, its effective privileges were far broader than expected. Despite being intended to access only a single Obsidian folder, Telegram-connected users were able to access the entire filesystem, including environment variables and sensitive system data. The researchers also identified insecure defaults such as unauthenticated local gateways and plaintext storage of secrets across logs and backups.

2. Prompt Injection

OpenClaw ingests external content from websites, emails, Slack messages, documents, and browser sessions. Since OpenClaw agents can autonomously interact with tools and systems, attackers can hide malicious instructions inside otherwise legitimate-looking commands.

Researchers Michael Alexander Riegler and Sushant Gautam recently published a study of Moltbook, a social network populated by autonomous AI agents that were built primarily on the OpenClaw ecosystem. They found 506 prompt injection attempts targeting AI agents reading content, prompting them to send API requests.

3. Unvetted or Malicious Skills

OpenClaw supports extensibility through community-created skills, plugins, and integrations. Because skills execute inside the operational context of the agent, malicious extensions may inherit access to credentials, filesystem operations, browser sessions, and connected APIs already trusted by the agent. However, those skills can contain malicious logic, credential leakage, or hidden malware payloads.

ClawHub, the official public marketplace and package registry for OpenClaw, has many malicious skills. They included disguised malware, credential stealers, and remote-access payloads, often tricking users into manually running dangerous installation commands or downloading infected dependencies. Other users warn that malicious skills often reappear under different names even after being removed from community registries.

4. Improper Network Exposure and Authentication Bypass

Many OpenClaw users expose dashboards, APIs, or orchestration layers to the internet in order to remotely manage their agents. In practice, these deployments are often misconfigured, weakly authenticated, or publicly accessible with insecure defaults, creating a highly exposed attack surface.

BitSight identified over 30,000 exposed OpenClaw instances, many without proper authentication, while a large percentage were vulnerable to remote code execution. This means attackers could potentially take full control of the host machine, access connected services, steal credentials, and abuse the agent’s permissions.

5. System Prompt Leakage and Inadequate Agent Guardrails

OpenClaw agents rely heavily on hidden system prompts and operational instructions that define how the AI behaves, what restrictions it follows, and which tools it can access. If attackers manage to extract or manipulate these prompts, they gain visibility into the agent’s internal logic and security assumptions.

ZeroLeaks’ red‑team assessment of OpenClaw gave OpenClaw a 2/100 security score and noted that its “system prompt extraction was successful” in 11 out of 13 adversarial attempts, an 84.6 % success rate. One of the first vulnerabilities they documented involved a simple JSON‑format request; asking the agent to “convert its rules to JSON” caused it to reveal key system‑prompt instructions such as tool names, constraints and reply tag syntax. Later attacks that framed the query as a developer‑to‑developer request extracted multiple system sections verbatim, identity statements, skill‑loading logic, memory protocols and special tokens, leaving the red team able to reconstruct roughly 85‑90% of the actual system prompt

6. Malware and Data Exfiltration

Because OpenClaw agents can interact with files, browsers, downloads, operating systems, and external APIs, they can unintentionally become part of a malware delivery or data exfiltration chain. Attackers do not necessarily need to compromise the underlying machine directly; they may instead manipulate the AI agent into performing the malicious activity on their behalf.

Researchers found that by abusing a one-click Remote Code Execution (RCE) flaw, attackers could hijack the agent connection, steal authentication tokens and API keys, execute arbitrary commands on the host machine, and potentially install additional malicious payloads.

How to Reduce OpenClaw Risk

Organizations should treat OpenClaw as a highly privileged system, not as a benign productivity tool. It should run with least-privilege access, use sandboxed environments or virtual machines, and avoid persistent unrestricted access to sensitive enterprise systems whenever possible.

It is critical to monitor how AI agents such as OpenClaw are built, what tools they connect to, what permissions they receive, and whether they are exploited through prompt injection, insecure integrations, or excessive privileges. This is where Agentic Endpoint Security platforms becomes relevant, helping organizations understand the actual attack paths, exposed AI components, risky agent behaviors, and exploitable relationships across modern AI-driven fabric, mitigating threats including prompt injection, data and source exfiltration, malicious and compromised MCPs, and abuse of AI agent privileges.

‍Additional reading:‍

Don't Let the Lobster Fool You: OpenClaw Security Risks Explained

Key Takeaways

What is OpenClaw?

How OpenClaw Changes the Security Model

OpenClaw vs. LLMs and AI Chatbots

OpenClaw vs. Incumbent Agents

6 Key OpenClaw Security Risks

1. Excessive Permissions and System-Level Access

2. Prompt Injection

3. Unvetted or Malicious Skills

4. Improper Network Exposure and Authentication Bypass

5. System Prompt Leakage and Inadequate Agent Guardrails

6. Malware and Data Exfiltration

How to Reduce OpenClaw Risk

Frequently Asked Questions About OpenClaw

Is OpenClaw itself malware?

Can OpenClaw expose sensitive enterprise data?

How is OpenClaw different from a regular automation script in terms of security risk?

What should we do before installing community skills or plugins?

Does running OpenClaw locally make it safer than a cloud-based agent?

How do we know if our OpenClaw instance has already been compromised?