Hackers abuse exposed Ollama servers to power autonomous hacking tools
Hackers are abusing exposed Ollama model servers to power automated hacking tools without paying for AI compute. Researchers at Sysdig observed an attacker using a misconfigured Ollama server as the reasoning engine for a multi-stage offensive framework.
The activity shows how LLMjacking has moved beyond stolen cloud API keys and billing fraud. In this case, the attacker did not simply resell model access or use the model for chat. They wired the exposed model server into a tool that could fingerprint services, match vulnerabilities, generate proof-of-concept exploits, and attempt compromise.
The incident also highlights a growing security gap around self-hosted AI. Ollama makes it easy to run open models locally, but any model endpoint exposed to the internet without authentication can become free compute for attackers.
What Sysdig observed
Sysdig’s Threat Research Team said it saw the activity on June 12, 2026. The attacker used a misconfigured Ollama model server as the decision-making layer for a framework that researchers call VAPT, based on strings found in the tool’s workflow.
The framework sent full instructions to the model with every request. That gave researchers a rare view into the tool’s internal stages, output rules, and compromise-checking logic.
The exposed server was not exploited through a new CVE. The problem was misconfiguration: the model server was reachable from the internet and did not require authentication.
| Observed element | What it means |
|---|---|
| Exposed Ollama server | Publicly reachable model endpoint used without authorization |
| VAPT framework | Automated offensive pipeline driven by model responses |
| Private test targets | Non-routable benchmark ranges and lab-style applications |
| Marker strings | Sentinels used to confirm command execution |
| Repeated model calls | Stolen AI compute used as the tool’s reasoning backend |
How the VAPT framework worked
The tool broke the attack process into structured steps. Each step gave the model a narrow job and required output that the surrounding software could parse automatically.
According to Sysdig’s analysis, those stages included service fingerprinting, vulnerability matching, web reconnaissance, proof-of-concept generation, blind SQL injection crafting, credential extraction, arbitrary file-read planning, and privilege escalation.
This design makes the tool different from someone casually asking a chatbot for advice. The model becomes one component inside a larger attack loop, while deterministic code handles requests, checks results, and decides when to continue.
- The tool identifies software and services from target observations.
- It matches those observations to possible vulnerabilities.
- It asks the model to generate or refine exploit attempts.
- It checks responses for command-execution markers.
- It converts a successful request into a reusable command template.
Why exposed Ollama servers are attractive to attackers
Ollama is popular because it lets developers and organizations run models on their own hardware instead of relying only on cloud-hosted AI services. That can improve control, privacy, and cost management when deployed safely.
The risk grows when operators bind a model server to a public interface without adding authentication, firewall rules, or a reverse proxy. A publicly reachable inference endpoint can answer requests from anyone who finds it.
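As a minimal sketch of the exposure check described above, the Python below labels a bind address of the kind Ollama reads from its `OLLAMA_HOST` environment variable. The function name and the label strings are illustrative assumptions, not part of Ollama, and the parsing covers only simple `host`, `host:port`, and bracketed IPv6 forms.

```python
import ipaddress

def classify_bind(value: str) -> str:
    """Label a bind address: loopback, all-interfaces, private, public, or hostname."""
    host = value.strip()
    # Drop a URL scheme such as http:// if one is present
    if "://" in host:
        host = host.split("://", 1)[1]
    if host.startswith("["):
        # Bracketed IPv6 form, e.g. [::1]:11434
        host = host[1:].split("]", 1)[0]
    elif host.count(":") == 1:
        # Plain host:port form; bare IPv6 has more than one colon and is left alone
        host = host.rsplit(":", 1)[0]
    if host == "localhost":
        return "loopback"
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return "hostname"  # needs DNS resolution, which this sketch does not attempt
    if addr.is_unspecified:
        return "all-interfaces"  # 0.0.0.0 or :: listens on every interface
    if addr.is_loopback:
        return "loopback"
    if addr.is_private:
        return "private"
    return "public"

if __name__ == "__main__":
    for candidate in ("127.0.0.1:11434", "0.0.0.0:11434", "8.8.8.8:11434"):
        print(candidate, "->", classify_bind(candidate))
```

A result of `all-interfaces` or `public` does not prove the server is reachable from the internet, since a firewall may still block it, but it is a reasonable trigger for a closer look.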
A previous report from The Hacker News, based on research from SentinelOne and Censys, said researchers identified 175,000 unique Ollama hosts across 130 countries. The report also noted that many exposed hosts advertised tool-calling capabilities.
LLMjacking has changed
LLMjacking originally focused on stolen access to paid AI services. Attackers would steal cloud credentials, use a victim’s AI services, and leave the victim with the bill.
Sysdig’s earlier research said LLMjacking had evolved into a commercialized black market by early 2026, with underground services monetizing unauthorized model access.
The latest case moves the threat further. Instead of only stealing compute for resale or content generation, the attacker used stolen model capacity to run an offensive automation pipeline.
| LLMjacking stage | Main abuse pattern |
|---|---|
| Early activity | Stolen cloud credentials used to access paid AI models |
| Commercialized activity | Unauthorized model access brokered through underground marketplaces |
| Current observed shift | Exposed model servers used to power autonomous hacking workflows |
The attacker appeared to be testing the tool
Sysdig said the observed targets were private and non-routable. The framework referenced fictitious applications named MediaVault Asset Portal and Reverb Studio, along with private ranges associated with practice or lab-style environments.
That detail matters because it suggests the tool was still being developed or benchmarked. It does not make the activity harmless, since using someone else’s exposed AI server without authorization remains abusive.
The framework also appeared to change during the observed sessions. New stages were added, existing prompts were rewritten, and a fuller version of the pipeline returned in later activity.
Known indicators from the observed activity
The indicators below came from the observed sessions and should be used as hunting leads. Source IP addresses can change quickly, so the marker strings and structured behavior may offer stronger long-term detection value.
| Type | Indicator | Description |
|---|---|---|
| Source IP | 122.183.48.82 | Threat actor IP seen during the June 12 session |
| Source IP | 122.183.48.35 | Threat actor IP seen during the June 14 session |
| Source IP | 122.183.48.195 | Threat actor IP from the same /24 range |
| Source IP | 47.15.69.15 | Threat actor IP from a second residential ISP |
| String marker | VAPTb3gin | Begin marker used for command-output confirmation |
| String marker | VAPTfin | End marker used for command-output confirmation |
| String marker | __VAPTCMD__ | Placeholder left in a confirmed command recipe |
| Command | echo VAPTb3gin; id; echo VAPTfin | Remote code execution confirmation probe |
| String | MediaVault Asset Portal | Fictitious target application name in payloads |
| String | Reverb Studio | Fictitious target application name in payloads |
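The indicators in the table above could be applied to request logs roughly as follows. This is a sketch under stated assumptions: the record fields `source_ip` and `prompt` are hypothetical names for whatever a proxy or gateway actually logs, and the /24 match is treated as a weaker lead than an exact IP match because the article lists only three addresses from that range.

```python
import ipaddress

# Indicator values copied from the table; field names below are hypothetical
MARKERS = ("VAPTb3gin", "VAPTfin", "__VAPTCMD__",
           "MediaVault Asset Portal", "Reverb Studio")
SOURCE_IPS = {"122.183.48.82", "122.183.48.35", "122.183.48.195", "47.15.69.15"}
SHARED_NET = ipaddress.ip_network("122.183.48.0/24")

def hunt(record: dict) -> list:
    """Return the reasons a log record matches the published indicators."""
    hits = []
    ip = record.get("source_ip", "")
    if ip in SOURCE_IPS:
        hits.append(f"ip:{ip}")
    else:
        try:
            # Weaker lead: same /24 as three of the observed addresses
            if ipaddress.ip_address(ip) in SHARED_NET:
                hits.append(f"net:{SHARED_NET}")
        except ValueError:
            pass  # missing or malformed address; fall through to string checks
    prompt = record.get("prompt", "")
    hits.extend(f"marker:{m}" for m in MARKERS if m in prompt)
    return hits

if __name__ == "__main__":
    sample = {"source_ip": "122.183.48.82",
              "prompt": "echo VAPTb3gin; id; echo VAPTfin"}
    print(hunt(sample))
```

Because source addresses rotate, the string markers are likely to outlive the IP list, which matches the guidance above.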
Why defenders may miss this abuse
Many teams monitor cloud AI bills, endpoint malware, and user account abuse. They may not monitor a self-hosted model server’s inference traffic with the same urgency.
Cisco’s Shodan-based Ollama study found more than 1,100 exposed Ollama servers, with about 20% actively hosting models susceptible to unauthorized access. Cisco warned that exposed LLM servers can support unauthorized API access, model extraction, content abuse, resource hijacking, and other risks.
Defenders also face a visibility problem. If a model endpoint is exposed because nobody owns or monitors it properly, the organization may only notice higher CPU or GPU usage, not the offensive workflow being powered by the model.
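One way to narrow that visibility gap is to watch inference request volume rather than wait for hardware load to become obvious. The sketch below flags minutes whose request count rises well above a trailing baseline. The window size and threshold are arbitrary assumptions that would need tuning against real traffic, and the per-minute counts are assumed to come from existing proxy or server logs.

```python
from statistics import mean, pstdev

def spike_minutes(counts, window=10, threshold=3.0):
    """Return indexes whose count exceeds the trailing mean by `threshold` deviations."""
    flagged = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        baseline = mean(history)
        # Fall back to 1.0 so perfectly flat traffic does not flag every small change
        spread = pstdev(history) or 1.0
        if counts[i] > baseline + threshold * spread:
            flagged.append(i)
    return flagged

if __name__ == "__main__":
    per_minute = [5] * 10 + [6, 50, 5]
    print(spike_minutes(per_minute))
```

A volume alert alone cannot tell benign load from abuse, so it works best alongside content checks such as the marker strings listed earlier.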
How to protect Ollama and other model servers
Organizations should treat exposed AI inference endpoints like exposed databases, admin panels, or CI/CD systems. They can become powerful infrastructure for attackers if reachable from the internet without access controls.
Ollama’s platform is designed to help users run open models, but operators still need to secure the network layer around any deployment that leaves a local machine. Authentication, isolation, and logging should come before production use.
The most effective fixes are basic but urgent: remove public exposure, require authentication, restrict source IPs, and monitor request patterns for offensive content.
- Do not expose port 11434 directly to the internet.
- Bind model servers to localhost or an internal interface where possible.
- Place remote access behind a firewall and authenticating reverse proxy.
- Monitor prompts for structured exploit stages and command markers.
- Track request volume, model usage, CPU, memory, and GPU spikes.
- Scan your own external IP ranges for exposed model endpoints.
- Keep an inventory of all self-hosted AI tools used by developers and teams.
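The authentication and source-restriction items above can be illustrated with the decision logic an authenticating gateway might apply before forwarding a request to the model server. This is a sketch, not an Ollama feature: the function name, the bearer-token scheme, and the allowlist format are all assumptions, and a production deployment would normally rely on an established reverse proxy rather than hand-written checks.

```python
import hmac
import ipaddress

def is_allowed(client_ip, auth_header, allowed_nets, token):
    """Allow a request only from an allowlisted network with a matching bearer token."""
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        return False  # unparseable client address
    if not any(addr in ipaddress.ip_network(net) for net in allowed_nets):
        return False  # source IP outside every allowed range
    scheme, _, presented = auth_header.partition(" ")
    if scheme.lower() != "bearer" or not presented:
        return False  # missing or malformed Authorization header
    # Constant-time comparison avoids leaking the token through timing differences
    return hmac.compare_digest(presented.encode(), token.encode())

if __name__ == "__main__":
    nets = ["10.0.0.0/8"]
    print(is_allowed("10.1.2.3", "Bearer s3cret", nets, "s3cret"))
    print(is_allowed("122.183.48.82", "Bearer s3cret", nets, "s3cret"))
```

Checking the source network before the token means requests from outside the allowlist are rejected regardless of what credentials they present.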
Exposed AI servers are becoming a supply-chain risk
The bigger issue is not only stolen compute. Exposed model servers can become part of an attacker’s tooling supply chain, helping them automate reconnaissance, exploit development, and validation.
The SentinelOne and Censys findings show how widespread exposed Ollama infrastructure had already become before this captured attack. If those systems support tool calling or external actions, the risk moves beyond text generation.
This creates a new governance problem for security teams. Shadow AI infrastructure can sit outside normal asset management, outside cloud billing controls, and outside centralized monitoring.
RMM abuse remains a separate threat
Legitimate remote monitoring and management tools are also widely abused by attackers, but that is a different issue from the Ollama case. Huntress reported that attackers increasingly weaponize trusted RMM software for hands-on-keyboard access, persistence, and evasion.
That comparison is useful because both threats rely on trusted tools. In RMM abuse, attackers hide behind remote access software. In LLMjacking, attackers hide inside exposed AI infrastructure and use it as compute for automated attacks.
The defensive lesson overlaps. Security teams should not trust a tool only because it has a legitimate business use. They should verify ownership, behavior, network exposure, and access patterns.
What security teams should do next
Security teams should start with asset discovery. Find self-hosted model servers, identify who owns them, check whether they face the public internet, and confirm whether authentication sits in front of them.
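For the exposure check in that discovery pass, a minimal sketch might request Ollama's model-listing endpoint, `/api/tags`, without credentials and see whether a model list comes back. The helper names are illustrative, the probe assumes plain HTTP on the default port 11434, and it should only be pointed at hosts the team owns or is authorized to test.

```python
import http.client
import json
import urllib.request
from typing import List, Optional

def parse_tags(body: bytes) -> Optional[List[str]]:
    """Return model names if the body looks like an /api/tags reply, else None."""
    try:
        data = json.loads(body)
    except ValueError:
        return None  # not JSON, e.g. an HTML login page from a proxy
    if not isinstance(data, dict) or not isinstance(data.get("models"), list):
        return None
    return [m.get("name", "") for m in data["models"] if isinstance(m, dict)]

def probe(host: str, port: int = 11434, timeout: float = 3.0) -> Optional[List[str]]:
    """Fetch /api/tags unauthenticated; None means no Ollama-style reply was received."""
    url = f"http://{host}:{port}/api/tags"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return parse_tags(resp.read())
    except (OSError, ValueError, http.client.HTTPException):
        # Refused, timed out, rejected with an HTTP error, or malformed response
        return None

if __name__ == "__main__":
    # 127.0.0.1 is a placeholder; substitute hosts from your own inventory
    print(probe("127.0.0.1"))
```

A non-`None` result from an external vantage point means the endpoint answered without authentication, which is the condition the attacker in this case relied on.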
The Cisco research also recommends stronger authentication and access control for LLM deployments, along with network isolation and monitoring for exposed endpoints.
Teams should also review broader misuse of trusted tools. Huntress’ RMM abuse guidance stresses the need to move from trusting known software to verifying real behavior, which applies just as well to AI infrastructure.
| Priority | Recommended action |
|---|---|
| Asset discovery | Find all Ollama, vLLM, llama.cpp, LM Studio, and similar servers |
| Exposure control | Remove public access unless there is a documented business need |
| Authentication | Put model APIs behind an authenticating proxy or gateway |
| Monitoring | Alert on exploit-like prompts, unusual volume, and command markers |
| Governance | Assign owners and review AI endpoints like other critical systems |
The bottom line
The captured activity shows a clear shift in LLMjacking. Attackers are no longer only stealing AI access for resale, content generation, or inflated bills. They can use exposed model servers as the reasoning layer for autonomous hacking tools.
LLMjacking has become more dangerous as self-hosted AI expands across organizations. Unauthenticated model servers now create both financial and operational risk.
Any organization running local or self-hosted AI should secure inference endpoints immediately. An exposed model server should receive the same urgency as an exposed database, VPN portal, or administrative console.
FAQ
What is LLMjacking?
LLMjacking is the unauthorized use of AI model access, cloud AI services, or exposed inference endpoints. Attackers use the victim’s AI compute for their own purposes, which can include resale, content generation, automation, or offensive tooling.
What did Sysdig observe?
Sysdig observed an attacker using a misconfigured Ollama model server as the reasoning engine for an automated offensive framework called VAPT. The tool used the model to help fingerprint services, match vulnerabilities, generate exploit attempts, and confirm command execution.
Was a new vulnerability involved?
No new CVE was involved in the observed activity. The main issue was exposure and misconfiguration: the Ollama server was reachable from the internet and did not require authentication.
How can companies protect Ollama servers?
Companies should avoid exposing port 11434 to the public internet, bind servers to localhost or internal interfaces, use firewalls, add authentication through a reverse proxy or gateway, monitor inference traffic, and scan external assets for exposed model endpoints.
What are the risks of exposed AI model servers?
Exposed AI model servers can be abused for free compute, unauthorized prompt execution, model access, offensive automation, tool calling, and resource exhaustion. If owners do not monitor them, attackers can use the infrastructure without being noticed quickly.