As generative AI matures from a novelty into a workplace staple, a new friction point has emerged: the “shadow AI” or “Bring Your Own AI” (BYOAI) crisis. Much like the unsanctioned use of personal devices in years past, developers and knowledge workers are increasingly deploying autonomous agents on personal infrastructure to manage their professional workflows.

“Our journey with Kilo Claw has been to make it easier and easier and more accessible to folks,” says Kilo co-founder Scott Breitenother.

Today, the company dedicated to providing a portable, multi-model, cloud-based AI coding environment is moving to formalize this shadow AI layer: it is launching KiloClaw for Organizations and KiloClaw Chat, a suite of tools designed to provide enterprise-grade governance over personal AI agents.

The announcement comes at a period of high velocity for the company. Since KiloClaw, its securely hosted, one-click OpenClaw product for individuals, became generally available last month, more than 25,000 users have integrated the platform into their daily workflows. Simultaneously, Kilo’s proprietary agent benchmark, PinchBench, has logged over 250,000 interactions and recently gained significant industry validation when it was referenced by Nvidia CEO Jensen Huang during his keynote at the 2026 Nvidia GTC conference in San Jose, California.

The shadow AI crisis: Addressing the BYOAI problem

The impetus for KiloClaw for Organizations stems from a growing visibility gap within large enterprises. In a recent interview with VentureBeat, Kilo leadership detailed conversations with high-level AI directors at government contractors who found their developers running OpenClaw agents on random VPS instances to manage calendars and monitor repositories.
“What we’re announcing on Tuesday is Kilo Claw for organizations, where a company can buy an organization-level package of Kilo Claws and give every team member access,” explained Kilo co-founder and head of product and engineering Emilie Schario during the interview.

“We can’t see any of it,” the head of AI at one such firm reportedly told Kilo. “No audit logs. No credential management. No idea what data is touching what API.” This lack of oversight has led some organizations to issue blanket bans on autonomous agents before a clear deployment strategy could be formed.

Anand Kashyap, CEO and founder of data security firm Fortanix, who had not seen Kilo’s announcement, told VentureBeat that while “Openclaw has taken the technology world by storm… the enterprise usage is minimal due to the security concerns of the open source version.” Kashyap expanded on this trend:

“In recent times, NVIDIA (with NemoClaw), Cisco (DefenseClaw), Palo Alto Networks, and Crowdstrike have all announced offerings to create an enterprise-ready version of OpenClaw with guardrails and governance for agent security. However, enterprise adoption continues to be low. Enterprises like centralized IT control, predictable behavior, and data security, which keeps them compliant. An autonomous agentic platform like OpenClaw stretches the envelope on all these parameters, and while security majors have announced their traditional perimeter security measures, they don’t address the fundamental problems of having a reduced attack surface. Over time, we will see an agentic platform emerge where agents are pre-built and packaged, and deployed responsibly with centralized controls, and data access controls built into the agentic platform as well as the LLMs they call upon to get instructions on how to perform the next task.
Technologies like Confidential Computing provide compartmentalization of data and processing, and are tremendously helpful in reducing the attack surface.”

KiloClaw for Organizations is positioned as the way for the security team to say “yes,” providing the visibility and control required to bring these agents in-house. It transitions agents from developer-managed infrastructure into a managed environment characterized by scoped access and organizational-level controls.

Technology: Universal persistence and the “Swiss cheese” method

A core technical hurdle in the current agent landscape is the fragmentation of chat sessions. During the VentureBeat interview, Schario noted that even advanced tools often struggle with canonical sessions, frequently dropping messages or failing to sync across devices.

Schario emphasized the security layer that supports this new structure: “You get all the same benefits of the Kilo gateway and the Kilo platform: you can limit what models people can use, get usage visibility, cost controls, and all the advantages of leveraging Kilo with managed, hosted, controlled Kilo Claw.”

To address the inherent unreliability of autonomous agents—such as missed cron jobs or failed executions—Kilo employs what Schario calls the “Swiss cheese method” of reliability. By layering additional protections and deterministic guardrails on top of the base OpenClaw architecture, Kilo aims to ensure that tasks, such as a daily 6:00 PM summary, are completed even if the underlying agent logic falters. This is critical because, as Schario noted, “The real risk for any company is data leakage, and that can come from a bot commenting on a GitHub issue or accidentally emailing the person who’s going to get fired before they get fired.”

Product: KiloClaw Chat and organizational guardrails

While managed infrastructure solves the backend problem, KiloClaw Chat addresses the user experience.
Schario noted that “Hosted, managed OpenClaw is easier to get started with, but it’s not enough, and it still requires you to be at the edge of technology to understand how to set it up.” Kilo is looking to lower that barrier for the average worker, asking: “How do we give people who have never heard the phrase OpenClaw or Claudebot an always-on AI assistant?”

Traditionally, interacting with an OpenClaw agent required connecting to third-party messaging services like Telegram or Discord—a process that involves navigating “BotFather” tokens and technical configurations that alienate non-engineers. “One of the number one hurdles we see, both anecdotally and in the data, is that you get your bot running and then you have to connect a channel to it. If you don’t know what’s going on, it’s overwhelming,” Schario observed.

“We solved that problem. You don’t need to set up a channel. You can chat with Kilo in the web UI and, with the Kilo Claw app on your phone, interact with Kilo without setting up an external channel,” she continued.

This native approach is essential for corporate compliance because, as she further explained, “When we were talking to early enterprise opportunities, they don’t want you using your personal Telegram account to chat with your work bot.” As Schario put it, there is a reason enterprise communication doesn’t flow through personal DMs; when a company shuts off access, it must be able to shut off access to the bot as well.

Looking ahead, the company plans to integrate these environments further.
“What we’re going to do is make Kilo Chat the waypoint between Telegram, Discord, and OpenClaw, so you get all the convenience of Kilo Chat but can use it in the other channels,” Breitenother added.

The enterprise package includes several critical governance features:

- Identity management: SSO/OIDC integration and SCIM provisioning for automated user lifecycles.
- Centralized billing: Full visibility into compute and inference usage across the entire organization.
- Admin controls: Org-wide policies governing which models can be used, specific permissions, and session durations.
- Secrets configuration: Integration with 1Password ensures that agents never handle credentials in plain text, preventing accidental leaks.

Licensing and governance: The “bot account” model

Other security experts note that handling bot and AI agent permissions is among the most pressing problems enterprises face today. As Ev Kontsevoy, CEO and co-founder of AI infrastructure and identity management company Teleport, who had not seen the Kilo news, told VentureBeat: “The potential impact of OpenClaw as a non-deterministic actor demonstrates why identity can’t be an afterthought. You have an autonomous agent with shell access, browser control, and API credentials — running on a persistent loop, across dozens of messaging platforms, with the ability to write its own skills. That’s not a chatbot. That’s a non-deterministic actor with broad infrastructure access and no cryptographic identity, no short-lived credentials, and no real-time audit trail tying actions to a verifiable actor.”

Kilo is proposing to solve this with a major change in organizational structure: the adoption of employee “bot accounts.” In Kilo’s vision, every employee eventually carries two identities—their standard human account and a corresponding bot account, such as scott.bot@kiloco.ai. These bot identities operate with strictly limited, read-only permissions.
For example, a bot might be granted read-only access to company logs or a GitHub account with contributor-only rights. This “scoped” approach allows the agent to maintain full visibility into the data it needs to be helpful while ensuring it cannot accidentally share sensitive information with others.

Addressing concerns over data privacy and “black box” algorithms, Kilo emphasizes that its code is source-available. “Anyone can go look at our code. It’s not a black box. When you’re buying Kilo Claw, you’re not giving us your data, and we’re not training on any of your data because we’re not building our own model,” Schario clarified. This licensing choice allows organizations to audit the resiliency and security of the platform without fearing their proprietary data will be used to improve third-party models.

Pricing and availability

KiloClaw for Organizations follows a usage-based pricing model where companies pay only for the compute and inference consumed. Organizations can utilize a “Bring Your Own Key” (BYOK) approach or use Kilo Gateway credits for inference.

The service is available starting today, Wednesday, April 1. KiloClaw Chat is currently in beta, with support for web, desktop, and iOS sessions. New users can evaluate the platform via a free tier that includes seven days of compute.

As Breitenother summarized to VentureBeat, the goal is to shift from “one-off” deployments to a scalable model for the entire workforce: “I think of Kilo for orgs as buying Kilo Claw by the bushel instead of by the one-off. And we’re hoping to sell a lot of bushels of Kilo Claw.”
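The two-identity model above reduces to a simple authorization rule: any request made under a `.bot` identity is confined to read operations, while human accounts fall through to normal access controls. A minimal sketch of that policy follows; all names and the policy shape are hypothetical illustrations of the concept, not Kilo's actual (source-available) code.

```python
# Hypothetical sketch of the scoped "bot account" idea: every employee has a
# human identity plus a parallel bot identity (e.g. scott.bot@kiloco.ai) whose
# permissions are limited to read-only operations. Illustrative only.

READ_ONLY_OPS = {"read", "list", "get"}


def is_bot_identity(identity: str) -> bool:
    """A bot identity's local part ends in '.bot', e.g. scott.bot@kiloco.ai."""
    local_part = identity.split("@", 1)[0]
    return local_part.endswith(".bot")


def authorize(identity: str, operation: str) -> bool:
    """Confine bots to read operations; humans defer to existing ACLs (elided)."""
    if is_bot_identity(identity):
        return operation in READ_ONLY_OPS
    return True  # human accounts: the organization's normal ACLs apply
```

Under this rule, `scott.bot@kiloco.ai` can `read` logs but a `write` or `send` request is denied outright, which is the property that prevents an agent from accidentally sharing what it can see.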
Hackers slipped a trojan into the code library behind most of the internet. Your team is probably affected
Attackers stole a long-lived npm access token belonging to the lead maintainer of axios, the most popular HTTP client library in JavaScript, and used it to publish two poisoned versions that install a cross-platform remote access trojan. The malicious releases target macOS, Windows, and Linux. They were live on the npm registry for roughly three hours before removal.

Axios gets more than 100 million downloads per week. Wiz reports it sits in approximately 80% of cloud and code environments, touching everything from React front-ends to CI/CD pipelines to serverless functions. Huntress detected the first infections 89 seconds after the malicious package went live and confirmed at least 135 compromised systems among its customers during the exposure window.

This is the third major npm supply chain compromise in seven months. Every one exploited maintainer credentials. This time, the target had adopted every defense the security community recommended.

One credential, two branches, 39 minutes

The attacker took over the npm account of @jasonsaayman, a lead axios maintainer, changed the account email to an anonymous ProtonMail address, and published the poisoned packages through npm’s command-line interface. That bypassed the project’s GitHub Actions CI/CD pipeline entirely.

The attacker never touched the axios source code. Instead, both release branches received a single new dependency: plain-crypto-js@4.2.1. No part of the codebase imports it. The package exists solely to run a postinstall script that drops a cross-platform RAT onto the developer’s machine.

The staging was precise. Eighteen hours before the axios releases, the attacker published a clean version of plain-crypto-js under a separate npm account to build publishing history and dodge new-package scanner alerts. Then came the weaponized 4.2.1. Both release branches were hit within 39 minutes. Three platform-specific payloads were pre-built.
The malware erases itself after execution and swaps in a clean package.json to frustrate forensic inspection.

StepSecurity, which identified the compromise alongside Socket, called it among the most operationally sophisticated supply chain attacks ever documented against a top-10 npm package.

The defense that existed on paper

Axios did the right things. Legitimate 1.x releases shipped through GitHub Actions using npm’s OIDC Trusted Publisher mechanism, which cryptographically ties every publish to a verified CI/CD workflow. The project carried SLSA provenance attestations. By every modern measure, the security stack looked solid.

None of it mattered. Huntress dug into the publish workflow and found the gap. The project still passed NPM_TOKEN as an environment variable right alongside the OIDC credentials. When both are present, npm defaults to the token. The long-lived classic token was the real authentication method for every publish, regardless of how OIDC was configured. The attacker never had to defeat OIDC. They walked around it. A legacy token sat there as a parallel auth path, and npm’s own hierarchy silently preferred it.

“From my experience at AWS, it’s very common for old auth mechanisms to linger,” said Merritt Baer, CSO at Enkrypt AI and former Deputy CISO at AWS, in an exclusive interview with VentureBeat. “Modern controls get deployed, but if legacy tokens or keys aren’t retired, the system quietly favors them. Just like we saw with SolarWinds, where legacy scripts bypassed newer monitoring.”

The maintainer posted on GitHub after discovering the compromise: “I’m trying to get support to understand how this even happened. I have 2FA / MFA on practically everything I interact with.”

Endor Labs documented the forensic difference. Legitimate axios@1.14.0 showed OIDC provenance, a trusted publisher record, and a gitHead linking to a specific commit. Malicious axios@1.14.1 had none. Any tool checking provenance would have flagged the gap instantly.
But provenance verification is opt-in. No registry gate rejected the package.

Three attacks, seven months, same root cause

Three npm supply chain compromises in seven months. Every one started with a stolen maintainer credential.

The Shai-Hulud worm hit in September 2025. A single phished maintainer account gave attackers a foothold that self-replicated across more than 500 packages, harvesting npm tokens, cloud credentials, and GitHub secrets as it spread. CISA issued an advisory. GitHub overhauled npm’s entire authentication model in response.

Then in January 2026, Koi Security’s PackageGate research dropped six zero-day vulnerabilities across npm, pnpm, vlt, and Bun that punched through the very defenses the ecosystem adopted after Shai-Hulud. Lockfile integrity and script-blocking both failed under specific conditions. Three of the four package managers patched within weeks. npm closed the report.

Now axios. A stolen long-lived token published a RAT through both release branches despite OIDC, SLSA, and every post-Shai-Hulud hardening measure in place.

npm shipped real reforms after Shai-Hulud. Creation of new classic tokens was deprecated, though pre-existing ones survived until a hard revocation deadline. FIDO 2FA became mandatory, granular access tokens were capped at seven days for publishing, and trusted publishing via OIDC gave projects a cryptographic alternative to stored credentials. Taken together, those changes hardened everything downstream of the maintainer account. What they didn’t change was the account itself. The credential remained the single point of failure.

“Credential compromise is the recurring theme across npm breaches,” Baer said. “This isn’t just a weak password problem. It’s structural. Without ephemeral credentials, enforced MFA, or isolated build and signing environments, maintainer access remains the weak link.”

What npm shipped vs. what this attack walked past

| What SOC leaders need | npm defense shipped | vs. axios attack | The gap |
| --- | --- | --- | --- |
| Block stolen tokens from publishing | FIDO 2FA required. Granular tokens, 7-day expiry. Classic tokens deprecated | Bypassed. Legacy token coexisted alongside OIDC. npm preferred the token | No enforcement removes legacy tokens when OIDC is configured |
| Verify package provenance | OIDC Trusted Publishing via GitHub Actions. SLSA attestations | Bypassed. Malicious versions had no provenance. Published via CLI | No gate rejects packages missing provenance from projects that previously had it |
| Catch malware before install | Socket, Snyk, Aikido automated scanning | Partial. Socket flagged in 6 min. First infections hit at 89 seconds | Detection-to-removal gap. Scanners catch it; registry removal takes hours |
| Block postinstall execution | --ignore-scripts recommended in CI/CD | Not enforced. npm runs postinstall by default. pnpm blocks by default; npm does not | postinstall remains the primary malware vector in every major npm attack since 2024 |
| Lock dependency versions | Lockfile enforcement via npm ci | Effective only if the lockfile was committed before the compromise. Caret ranges auto-resolved | Caret ranges are the npm default. Most projects auto-resolve to the latest minor |

What to do now at your enterprise

SOC leaders whose organizations run Node.js should treat this as an active incident until they confirm clean systems. The three-hour exposure window fell during peak development hours across Asia-Pacific time zones, and any CI/CD pipeline that ran npm install overnight could have pulled the compromised version automatically.

“The first priority is impact assessment: which builds and downstream consumers ingested the compromised package?” Baer said. “Then containment, patching, and finally, transparent reporting to leadership. What happened, what’s exposed, and what controls will prevent a repeat. Lessons from log4j and event-stream show speed and clarity matter as much as the fix itself.”

- Check exposure. Search lockfiles and CI logs for axios@1.14.1, axios@0.30.4, or plain-crypto-js. Pin to axios@1.14.0 or axios@0.30.3.
- Assume compromise if hit. Rebuild affected machines from a known-good state. Rotate every accessible credential: npm tokens, AWS keys, SSH keys, cloud credentials, CI/CD secrets, .env values.
- Block the C2. Add sfrclak.com and 142.11.206.73 to DNS blocklists and firewall rules.
- Check for RAT artifacts. /Library/Caches/com.apple.act.mond on macOS. %PROGRAMDATA%\wt.exe on Windows. /tmp/ld.py on Linux. If found, perform a full rebuild.
- Harden going forward. Enforce npm ci --ignore-scripts in CI/CD. Require lockfile-only installs. Reject packages missing provenance from projects that previously had it. Audit whether legacy tokens coexist with OIDC in your own publishing workflows.

The credential gap nobody closed

Three attacks in seven months. Each different in execution, identical in root cause. npm’s security model still treats individual maintainer accounts as the ultimate trust anchor. Those accounts remain vulnerable to credential hijacking, no matter how many layers get added downstream.

“AI spots risky packages, audits legacy auth, and speeds SOC response,” Baer said. “But humans still control maintainer credentials. We mitigate risk. We don’t eliminate it.”

Mandatory provenance attestation, where manual CLI publishing is disabled entirely, would have caught this attack before it reached the registry. So would mandatory multi-party signing, where no single maintainer can push a release alone. Neither is enforced today. npm has signaled that disabling tokens by default when trusted publishing is enabled is on the roadmap. Until it ships, every project running OIDC alongside a legacy token has the same blind spot axios had.

The axios maintainer did what the community asked. A legacy token nobody realized was still active undermined all of it.
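The exposure check in the remediation playbook above can be scripted. A minimal sketch, assuming repositories are checked out under the current directory: the injected plain-crypto-js dependency is the strongest lockfile indicator, since both poisoned axios releases pull it in.

```shell
# Scan every lockfile for the injected plain-crypto-js dependency reported in
# the compromise. Adapt paths and patterns to your CI; treat any hit as a
# trigger for the full assume-compromise playbook, not a definitive verdict.
find . -name 'package-lock.json' -not -path '*/node_modules/*' | while read -r lock; do
  if grep -q 'plain-crypto-js' "$lock"; then
    echo "POSSIBLE COMPROMISE: $lock"
  fi
done
```

The same loop extends to yarn.lock or pnpm-lock.yaml by widening the `find` pattern; CI logs still need a separate search for the two poisoned version strings.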
Meta’s new structured prompting technique makes LLMs significantly better at code review — boosting accuracy to 93% in some cases
Deploying AI agents for repository-scale tasks like bug detection, patch verification, and code review requires overcoming significant technical hurdles. One major bottleneck is the need to set up dynamic execution sandboxes for every repository, which are expensive and computationally heavy. Using large language model (LLM) reasoning instead of executing the code is rising in popularity as a way to bypass this overhead, yet it frequently leads to unsupported guesses and hallucinations.

To improve execution-free reasoning, researchers at Meta introduce “semi-formal reasoning,” a structured prompting technique. This method requires the AI agent to fill out a logical certificate by explicitly stating premises, tracing concrete execution paths, and deriving formal conclusions before providing an answer. The structured format forces the agent to systematically gather evidence and follow function calls before drawing conclusions. This increases the accuracy of LLMs in coding tasks and significantly reduces errors in fault localization and codebase question answering. For developers using LLMs in code review tasks, semi-formal reasoning enables highly reliable, execution-free semantic code analysis while drastically reducing the infrastructure costs of AI coding systems.

Agentic code reasoning

Agentic code reasoning is an AI agent’s ability to navigate files, trace dependencies, and iteratively gather context to perform deep semantic analysis on a codebase without running the code. In enterprise AI applications, this capability is essential for scaling automated bug detection, comprehensive code reviews, and patch verification across complex repositories where relevant context spans multiple files.

The industry currently tackles execution-free code verification through two primary approaches. The first involves unstructured LLM evaluators that try to verify code either directly or by training specialized LLMs as reward models to approximate test outcomes.
The major drawback is their reliance on unstructured reasoning, which allows models to make confident claims about code behavior without explicit justification. Without structured constraints, it is difficult to ensure agents reason thoroughly rather than guess based on superficial patterns like function names.

The second approach involves formal verification, which translates code or reasoning into formal mathematical languages like Lean, Coq, or Datalog to enable automated proof checking. While rigorous, formal methods require defining the semantics of the programming language, which is entirely impractical for arbitrary enterprise codebases that span multiple frameworks and languages. Existing approaches also tend to be highly fragmented and task-specific, often requiring entirely separate architectures or specialized training for each new problem domain. They lack the flexibility needed for broad, multi-purpose enterprise applications.

How semi-formal reasoning works

To bridge the gap between unstructured guessing and overly rigid mathematical proofs, the Meta researchers propose a structured prompting methodology, which they call “semi-formal reasoning.” This approach equips LLM agents with task-specific, structured reasoning templates.

These templates function as mandatory logical certificates. To complete a task, the agent must explicitly state premises, trace execution paths for specific tests, and derive a formal conclusion based solely on verifiable evidence. The template forces the agent to gather proof from the codebase before making a judgment. The agent must actually follow function calls and data flows step by step rather than guessing their behavior based on surface-level naming conventions.
This systematic evidence gathering helps the agent handle edge cases, such as confusing function names, and avoid making unsupported claims.

Semi-formal reasoning in action

The researchers evaluated semi-formal reasoning across three software engineering tasks: patch equivalence verification, to determine if two patches yield identical test outcomes without running them; fault localization, to pinpoint the exact lines of code causing a bug; and code question answering, to test nuanced semantic understanding of complex codebases. The experiments used the Claude Opus-4.5 and Sonnet-4.5 models acting as autonomous verifier agents.

The team compared their structured semi-formal approach against several baselines, including standard reasoning, where an agentic model is given a minimal prompt and allowed to explain its thinking freely in unstructured natural language. They also compared against traditional text-similarity algorithms like difflib.

In patch equivalence, semi-formal reasoning improved accuracy on challenging, curated examples from 78% using standard reasoning to 88%. When evaluating real-world, agent-generated patches with test specifications available, the Opus-4.5 model using semi-formal reasoning achieved 93% verification accuracy, outperforming both the unstructured single-shot baseline at 86% and the difflib baseline at 73%. Other tasks showed similar gains across the board.

The paper highlights the value of semi-formal reasoning through real-world examples. In one case, the agent evaluates two patches in the Python Django repository that attempt to fix a bug with 2-digit year formatting for years before 1000 CE. One patch uses a custom format() function within the library that overrides the standard function used in Python. Standard reasoning models look at these patches, assume format() refers to Python’s standard built-in function, calculate that both approaches will yield the same string output, and incorrectly declare the patches equivalent.
With semi-formal reasoning, the agent traces the execution path and checks method definitions. Following the structured template, the agent discovers that within one of the library’s files, the format() name is actually shadowed by a custom, module-level function. The agent formally proves that, given the attributes of the input passed to the code, one patch will crash the system while the other will succeed.

Based on their experiments, the researchers suggest that “LLM agents can perform meaningful semantic code analysis without execution, potentially reducing verification costs in RL training pipelines by avoiding expensive sandbox execution.”

Caveats and tradeoffs

While semi-formal reasoning offers substantial reliability improvements, enterprise developers must consider several practical caveats before adopting it. There is a clear compute and latency tradeoff: semi-formal reasoning requires more API calls and tokens. In patch equivalence evaluations, it required roughly 2.8 times as many execution steps as standard unstructured reasoning.

The technique also does not universally improve performance, particularly if a model is already highly proficient at a specific task. When researchers evaluated the Sonnet-4.5 model on a code question-answering benchmark, standard unstructured reasoning already achieved a high accuracy of around 85%. Applying the semi-formal template in this scenario yielded no additional gains.

Furthermore, structured reasoning can produce highly confident wrong answers. Because the agent is forced to build elaborate, formal proof chains, it can become overly assured if its investigation is deep but incomplete. In one Python evaluation, the agent meticulously traced five different functions to uncover a valid edge case but completely missed that a downstream piece of code already safely handled that exact scenario.
Because it had built a strong evidence chain, it delivered an incorrect conclusion with extremely high confidence.

The system’s reliance on concrete evidence also breaks down when it hits the boundaries of a codebase. When analyzing third-party libraries where the underlying source code is unavailable, the agent will still resort to guessing behavior based on function names. And in some cases, despite strict prompt instructions, models will occasionally fail to fully trace concrete execution paths. Ultimately, while semi-formal reasoning drastically reduces unstructured guessing and hallucinations, it does not completely eliminate them.

What developers should take away

This technique can be used out of the box, requiring no model training or special packaging. It is execution-free, which means you do not need to add additional tools to your LLM environment. You pay more compute at inference time to get higher accuracy on code review tasks. The researchers suggest that structured agentic reasoning may offer “a flexible alternative to classical static analysis tools: rather than encoding analysis logic in specialized algorithms, we can prompt LLM agents with task-specific reasoning templates that generalize across languages and frameworks.”

The researchers have made the prompt templates available, allowing them to be readily incorporated into your applications. While there is a lot of conversation about prompt engineering being dead, this technique shows how much performance you can still squeeze out of well-structured prompts.
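Meta has published the actual templates; the sketch below is only an illustrative approximation of the "logical certificate" structure described above, applied to the patch-equivalence task. The section names and helper function are assumptions, not the paper's exact wording.

```python
# Illustrative approximation of a semi-formal reasoning template for patch
# equivalence. The agent must fill every section with evidence gathered from
# the repository before stating a verdict; field names are assumptions and
# may differ from Meta's published templates.

CERTIFICATE_TEMPLATE = """\
TASK: Decide whether PATCH_A and PATCH_B yield identical test outcomes.

PREMISES (one per line, each citing a file and line range you inspected):
{premises}

EXECUTION TRACE (step through the relevant test concretely; at every call,
resolve the actual callee by reading its definition, never by its name):
{trace}

CONCLUSION (derived only from the premises and trace above):
{conclusion}

VERDICT: EQUIVALENT or NOT_EQUIVALENT
"""


def build_certificate_prompt(patch_a: str, patch_b: str, test_spec: str) -> str:
    """Assemble a verifier prompt around the unfilled certificate sections."""
    header = (
        "You are a patch-equivalence verifier. Do not run any code.\n"
        f"PATCH_A:\n{patch_a}\n\nPATCH_B:\n{patch_b}\n\n"
        f"TEST SPECIFICATION:\n{test_spec}\n\n"
        "Fill in every section of the certificate below:\n\n"
    )
    return header + CERTIFICATE_TEMPLATE.format(
        premises="<to be filled by the agent>",
        trace="<to be filled by the agent>",
        conclusion="<to be filled by the agent>",
    )
```

The point of the rigid sections is the one the article describes: an answer with an unfilled premises or trace section is an invalid certificate, so the agent cannot skip straight to a verdict based on function names alone.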
CrowdStrike, Cisco and Palo Alto Networks all shipped agentic SOC tools at RSAC 2026 — the agent behavioral baseline gap survived all three
CrowdStrike CEO George Kurtz highlighted in his RSA Conference 2026 keynote that the fastest recorded adversary breakout time has dropped to 27 seconds. The average is now 29 minutes, down from 48 minutes in 2024. That is how much time defenders have before a threat spreads. CrowdStrike sensors now detect more than 1,800 distinct AI applications running on enterprise endpoints, representing nearly 160 million unique application instances. Every one generates detection events, identity events, and data access logs flowing into SIEM systems architected for human-speed workflows.

Cisco found that 85% of surveyed enterprise customers have AI agent pilots underway. Only 5% have moved agents into production, according to Cisco President and Chief Product Officer Jeetu Patel in his RSAC blog post. That 80-point gap exists because security teams cannot answer the basic questions agents force: which agents are running, what they are authorized to do, and who is accountable when one goes wrong.

“The number one threat is security complexity. But we’re running towards that direction in AI as well,” Etay Maor, VP of Threat Intelligence at Cato Networks, told VentureBeat at RSAC 2026. Maor has attended the conference for 16 consecutive years. “We’re going with multiple point solutions for AI. And now you’re creating the next wave of security complexity.”

Agents look identical to humans in your logs

In most default logging configurations, agent-initiated activity looks identical to human-initiated activity in security logs. “It looks indistinguishable if an agent runs Louis’s web browser versus if Louis runs his browser,” Elia Zaitsev, CTO of CrowdStrike, told VentureBeat in an exclusive interview at RSAC 2026. Distinguishing the two requires walking the process tree. “I can actually walk up that process tree and say, this Chrome process was launched by Louis from the desktop. This Chrome process was launched from Louis’s cloud Cowork or ChatGPT application.
Thus, it’s agentically controlled.”

Without that depth of endpoint visibility, a compromised agent executing a sanctioned API call with valid credentials fires zero alerts. The exploit surface is already being tested. During his keynote, Kurtz described ClawHavoc, the first major supply chain attack on an AI agent ecosystem, targeting ClawHub, OpenClaw’s public skills registry. Koi Security’s February audit found 341 malicious skills out of 2,857; a follow-up analysis by Antiy CERT identified 1,184 compromised packages historically across the platform. Kurtz noted ClawHub now hosts 13,000 skills in its registry. The infected skills contained backdoors, reverse shells, and credential harvesters; Kurtz said in his keynote that some erased their own memory after installation and could remain latent before activating.

“The frontier AI creators will not secure itself,” Kurtz said. “The frontier labs are following the same playbook. They’re building it. They’re not securing it.”

Two agentic SOC architectures, one shared blind spot

Approach A: AI agents inside the SIEM. Cisco and Splunk announced six specialized AI agents for Splunk Enterprise Security: Detection Builder, Triage, Guided Response, Standard Operating Procedures (SOP), Malware Threat Reversing, and Automation Builder. Malware Threat Reversing is currently available in Splunk Attack Analyzer, and Detection Studio is generally available as a unified workspace; the remaining five agents are in alpha or prerelease through June 2026. Exposure Analytics and Federated Search follow the same timeline. Upstream of the SOC, Cisco’s DefenseClaw framework scans OpenClaw skills and MCP servers before deployment, while new Duo IAM capabilities extend zero trust to agents with verified identities and time-bound permissions.

“The biggest impediment to scaled adoption in enterprises for business-critical tasks is establishing a sufficient amount of trust,” Patel told VentureBeat.
“Delegating and trusted delegating, the difference between those two, one leads to bankruptcy. The other leads to market dominance.”

Approach B: Upstream pipeline detection. CrowdStrike pushed analytics into the data ingestion pipeline itself, integrating its Onum acquisition natively into Falcon’s ingestion system for real-time analytics, detection, and enrichment before events reach the analyst’s queue. Falcon Next-Gen SIEM now ingests Microsoft Defender for Endpoint telemetry natively, so Defender shops do not need additional sensors. CrowdStrike also introduced federated search across third-party data stores and a Query Translation Agent that converts legacy Splunk queries to accelerate SIEM migration.

Falcon Data Security for the Agentic Enterprise applies cross-domain data loss prevention to agents’ data access at runtime. CrowdStrike’s adversary-informed cloud risk prioritization connects agent activity in cloud workloads to the same detection pipeline. Agentic MDR through Falcon Complete adds machine-speed managed detection for teams that cannot build the capability internally.

“The agentic SOC is all about, how do we keep up?” Zaitsev said. “There’s almost no conceivable way they can do it if they don’t have their own agentic assistance.”

CrowdStrike opened its platform to external AI providers through Charlotte AI AgentWorks, announced at RSAC 2026, letting customers build custom security agents on Falcon using frontier AI models. Launch partners include Accenture, Anthropic, AWS, Deloitte, Kroll, NVIDIA, OpenAI, Salesforce, and Telefónica Tech. IBM validated buyer demand through a collaboration integrating Charlotte AI with its Autonomous Threat Operations Machine for coordinated, machine-speed investigation and containment.

The ecosystem contenders
Palo Alto Networks, in an exclusive pre-RSAC briefing with VentureBeat, outlined Prisma AIRS 3.0, extending its AI security platform to agents with artifact scanning, agent red teaming, and a runtime that catches memory poisoning and excessive permissions. The company introduced an agentic identity provider for agent discovery and credential validation. Once Palo Alto Networks closes its proposed acquisition of Koi, the company adds agentic endpoint security. Cortex delivers agentic security orchestration across its customer base.

Intel announced that CrowdStrike’s Falcon platform is being optimized for Intel-powered AI PCs, leveraging neural processing units and silicon-level telemetry to detect agent behavior on the device. Kurtz framed AIDR, or AI Detection and Response, as the next category beyond EDR, tracking agent-speed activity across endpoints, SaaS, cloud, and AI pipelines. He said that “humans are going to have 90 agents that work for them on average” as adoption scales, but did not specify a timeline.

The gap no vendor closed

| What security leaders need | Approach A: agents inside the SIEM (Cisco/Splunk) | Approach B: upstream pipeline detection (CrowdStrike) | Gap neither closes |
| --- | --- | --- | --- |
| Triage at agent volume | Six AI agents handle triage, detection, and response inside Splunk ES | Onum-powered pipeline detects and enriches threats before the analyst sees them | Neither baselines normal agent behavior before flagging anomalies |
| Agent vs. human differentiation | Duo IAM tracks agent identities but does not differentiate agent from human activity in SOC telemetry | Process tree lineage distinguishes at runtime; AIDR extends to agent-specific detection | No vendor’s announced capabilities include an out-of-the-box agent behavioral baseline |
| 27-second response window | Guided Response Agent executes containment at machine speed | In-pipeline detection reduces queue volume; Agentic MDR adds managed response | Human-in-the-loop governance has not been reconciled with machine-speed response in either approach |
| Legacy SIEM portability | Native Splunk integration preserves existing workflows | Query Translation Agent converts Splunk queries; native Defender ingestion lets Microsoft shops migrate | Neither addresses teams running multiple SIEMs during migration |
| Agent supply chain | DefenseClaw scans skills and MCP servers pre-deployment; Explorer Edition red-teams agents | EDR AI Runtime Protection catches compromised skills post-deployment; Charlotte AI AgentWorks enables custom agents | Neither covers the full lifecycle: pre-deployment scanning misses runtime exploits, and vice versa |

The matrix makes one thing visible that the keynotes did not: no vendor shipped an agent behavioral baseline. Both approaches automate triage and accelerate detection, but based on VentureBeat’s review of announced capabilities, neither defines what normal agent behavior looks like in a given enterprise environment.

Teams running Microsoft Sentinel and Copilot for Security represent a third architecture, one not formally announced as a competing approach at RSAC this week. CISOs in Microsoft-heavy environments need to test whether Sentinel’s native agent telemetry ingestion and Copilot’s automated triage close the same gaps identified above.

Maor cautioned that the vendor response recycles a pattern he has tracked for 16 years. “I hope we don’t have to go through this whole cycle,” he told VentureBeat. “I hope we learned from the past. It doesn’t really look like it.”

Zaitsev’s advice was blunt: “You already know what to do. You’ve known what to do for five, ten, fifteen years. It’s time to finally go do it.”

Five things to do Monday morning

These steps apply regardless of your SOC platform. None requires ripping and replacing current tools. Start with visibility, then layer in controls as agent volume grows.

Inventory every agent on your endpoints.
CrowdStrike detects 1,800 AI applications across enterprise devices. Cisco’s Duo Identity Intelligence discovers agentic identities. Palo Alto Networks’ agentic IDP catalogs agents and maps them to human owners. If you run a different platform, start with an EDR query for known agent directories and binaries. You cannot set policy for agents you do not know exist.

Determine whether your SOC stack can differentiate agent from human activity. CrowdStrike’s Falcon sensor and AIDR do this through process tree lineage. Palo Alto Networks’ agent runtime catches memory poisoning at execution. If your tools cannot make this distinction, your triage rules are applying the wrong behavioral models.

Match the architectural approach to your current SIEM. Splunk shops gain agent capabilities through Approach A. Teams evaluating migration get pipeline detection with Splunk query translation and native Defender ingestion through Approach B. Palo Alto Networks’ Cortex delivers a third option. Teams on Microsoft Sentinel, Google Chronicle, Elastic, or other platforms should evaluate whether their SIEM can ingest agent-specific telemetry at this volume.

Build an agent behavioral baseline before your next board meeting. No vendor ships one. Define what your agents are authorized to do: which APIs, which data stores, which actions, at which times. Create detection rules for anything outside that scope.

Pressure-test your agent supply chain. Cisco’s DefenseClaw and Explorer Edition scan and red-team agents before deployment. CrowdStrike’s runtime detection catches compromised agents post-deployment. Both layers are necessary. Kurtz said in his keynote that ClawHavoc compromised over a thousand ClawHub skills with malware that erased its own memory after installation. If your playbook does not account for an authorized agent executing unauthorized actions at machine speed, rewrite it.

The SOC was built to protect humans using machines. It now protects machines using machines.
The average response window shrank from 48 minutes to 29, and the fastest breakout to 27 seconds. Any agent generating an alert is now a suspect, not just a sensor. The decisions security leaders make in the next 90 days will determine whether their SOC operates in this new reality or gets buried under it.
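The process-tree lineage check Zaitsev describes can be illustrated with a short sketch. The process snapshot and the list of agent binary names below are hypothetical stand-ins for what an EDR sensor would collect at runtime; this is not a vendor API, just the core idea of walking ancestry instead of trusting a flat event log.

```python
# Sketch: distinguishing agent-launched from human-launched processes by
# walking up the parent chain. AGENT_BINARIES and the process table are
# illustrative assumptions, not real sensor data.

AGENT_BINARIES = {"openclaw", "copilot-agent", "chatgpt-desktop"}  # hypothetical

def launched_by_agent(pid, proc_table):
    """Walk up the parent chain; return the agent ancestor's name, or None.

    proc_table maps pid -> (parent_pid, process_name).
    """
    seen = set()
    while pid in proc_table and pid not in seen:
        seen.add(pid)  # guard against cycles in a corrupt snapshot
        ppid, name = proc_table[pid]
        if name in AGENT_BINARIES:
            return name
        pid = ppid
    return None

# Two Chrome processes that look identical in a flat event log:
procs = {
    1:   (0, "init"),
    100: (1, "explorer.exe"),   # interactive desktop session
    200: (100, "chrome"),       # browser launched by the human
    300: (100, "openclaw"),     # agent runtime started by the user
    301: (300, "chrome"),       # browser launched by the agent
}

print(launched_by_agent(200, procs))  # None: human lineage
print(launched_by_agent(301, procs))  # "openclaw": agent lineage
```

The point of the sketch is the asymmetry: both Chrome processes generate identical network and file events, and only ancestry separates them, which is why default logging configurations that discard lineage cannot make the distinction.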
OpenClaw has 500,000 instances and no enterprise kill switch
“Your AI? It’s my AI now.” The line came from Etay Maor, VP of Threat Intelligence at Cato Networks, in an exclusive interview with VentureBeat at RSAC 2026 — and it describes exactly what happened to a U.K. CEO whose OpenClaw instance ended up for sale on BreachForums. Maor’s argument is that the industry handed AI agents the kind of autonomy it would never extend to a human employee, discarding zero trust, least privilege, and assume-breach in the process.

The proof arrived on BreachForums three weeks before Maor’s interview. On February 22, a threat actor using the handle “fluffyduck” posted a listing advertising root shell access to the CEO’s computer for $25,000 in Monero or Litecoin. The shell was not the selling point. The CEO’s OpenClaw AI personal assistant was. The buyer would get every conversation the CEO had with the AI, the company’s full production database, Telegram bot tokens, Trading 212 API keys, and personal details the CEO disclosed to the assistant about family and finances. The threat actor noted the CEO was actively interacting with OpenClaw in real time, making the listing a live intelligence feed rather than a static data dump.

Cato CTRL senior security researcher Vitaly Simonovich documented the listing on February 25. The CEO’s OpenClaw instance stored everything in plain-text Markdown files under ~/.openclaw/workspace/ with no encryption at rest. The threat actor didn’t need to exfiltrate anything; the CEO had already assembled it. When the security team discovered the breach, there was no native enterprise kill switch, no management console, and no way to inventory how many other instances were running across the organization.

OpenClaw runs locally with direct access to the host machine’s file system, network connections, browser sessions, and installed applications. The coverage to date has tracked its velocity, but what it hasn’t mapped is the threat surface.
The four vendors who used RSAC 2026 to ship responses still haven’t produced the one control enterprises need most: a native kill switch.

The threat surface by the numbers

| Metric | Number | Source |
| --- | --- | --- |
| Internet-facing instances | ~500,000 (March 24 live check) | Etay Maor, Cato Networks (exclusive RSAC 2026 interview) |
| Exposed instances with security risks | 30,000+ observed during scan window | Bitsight |
| Exploitable via known RCE | 15,200 instances | SecurityScorecard |
| High-severity CVEs | 3 (highest CVSS: 8.8) | NVD (CVE-2026-24763, -25157, -25253) |
| Malicious skills on ClawHub | 341 in Koi audit (335 from ClawHavoc); 824 by mid-February | Koi |
| ClawHub skills with critical flaws | 13.4% of 3,984 analyzed | Snyk |
| API tokens exposed (Moltbook) | 1.5 million | Wiz |

Maor ran a live Censys check during an exclusive VentureBeat interview at RSAC 2026. “The first week it came out, there were about 6,300 instances. Last week, I checked: 230,000 instances. Let’s check now… almost half a million. Almost doubled in one week,” Maor said.

Three high-severity CVEs define the attack surface: CVE-2026-24763 (CVSS 8.8, command injection via Docker PATH handling), CVE-2026-25157 (CVSS 7.7, OS command injection), and CVE-2026-25253 (CVSS 8.8, token exfiltration leading to full gateway compromise). All three have been patched, but OpenClaw has no enterprise management plane, no centralized patching mechanism, and no fleet-wide kill switch. Individual administrators must update each instance manually, and most have not.

The defender-side telemetry is just as alarming. CrowdStrike’s Falcon sensors already detect more than 1,800 distinct AI applications across its customer fleet — from ChatGPT to Copilot to OpenClaw — generating around 160 million unique instances on enterprise endpoints. ClawHavoc, a malicious skill distributed through the ClawHub marketplace, became the primary case study in the OWASP Agentic Skills Top 10.
CrowdStrike CEO George Kurtz flagged it in his RSAC 2026 keynote as the first major supply chain attack on an AI agent ecosystem.

AI agents got root access. Security got nothing.

Maor framed the visibility failure through the OODA loop (observe, orient, decide, act) during the RSAC 2026 interview. Most organizations are failing at the first step: security teams can’t see which AI tools are running on their networks, which means the productivity tools employees bring in quietly become shadow AI that attackers exploit. The BreachForums listing proved the end state. The CEO’s OpenClaw instance became a centralized intelligence hub, with SSO sessions, credential stores, and communication history aggregated into one location. “The CEO’s assistant can be your assistant if you buy access to this computer,” Maor told VentureBeat. “It’s an assistant for the attacker.”

Ghost agents amplify the exposure. Organizations adopt AI tools, run a pilot, lose interest, and move on — leaving agents running with credentials intact. “We need an HR view of agents. Onboarding, monitoring, offboarding. If there’s no business justification? Removal,” Maor told VentureBeat. “We’re not left with any ghost agents on our network, because that’s already happening.”

Cisco moved toward an OpenClaw kill switch

Cisco President and Chief Product Officer Jeetu Patel framed the stakes during an exclusive VentureBeat interview at RSAC 2026. “I think of them more like teenagers. They’re supremely intelligent, but they have no fear of consequence,” Patel said of AI agents. “The difference between delegating and trusted delegating of tasks to an agent … one of them leads to bankruptcy. The other one leads to market dominance.”

Cisco launched three free, open-source security tools for OpenClaw at RSAC 2026. DefenseClaw packages Skills Scanner, MCP Scanner, AI BoM, and CodeGuard into a single open-source framework running inside NVIDIA’s OpenShell runtime, which NVIDIA launched at GTC the week before RSAC.
“Every single time you actually activate an agent in an OpenShell container, you can now automatically instantiate all the security services that we have built through DefenseClaw,” Patel told VentureBeat. AI Defense Explorer Edition is a free, self-serve version of Cisco’s algorithmic red-teaming engine, testing any AI model or agent for prompt injection and jailbreaks across more than 200 risk subcategories. The LLM Security Leaderboard ranks foundation models by adversarial resilience rather than performance benchmarks. Cisco also shipped Duo Agentic Identity to register agents as identity objects with time-bound permissions, Identity Intelligence to discover shadow agents through network monitoring, and the Agent Runtime SDK to embed policy enforcement at build time.

Palo Alto made agentic endpoints a security category of their own

Palo Alto Networks CEO Nikesh Arora, in an exclusive March 18 pre-RSAC briefing with VentureBeat, characterized OpenClaw-class tools as creating a new supply chain running through unregulated, unsecured marketplaces. Koi found 341 malicious skills on ClawHub in its initial audit, with the total growing to 824 as the registry expanded. Snyk found 13.4% of analyzed skills contained critical security flaws.

Palo Alto Networks built Prisma AIRS 3.0 around a new agentic registry that requires every agent to be logged before operating, with credential validation, MCP gateway traffic control, agent red-teaming, and runtime monitoring for memory poisoning. The pending Koi acquisition adds supply chain visibility specifically for agentic endpoints.

Cato CTRL delivered the adversarial proof

Cato Networks’ threat intelligence arm Cato CTRL presented two sessions at RSAC 2026. The 2026 Cato CTRL Threat Report, published separately, includes a proof-of-concept “Living Off AI” attack targeting Atlassian’s MCP and Jira Service Management.
Maor’s research provides the independent adversarial validation that vendor product announcements cannot deliver on their own. The platform vendors are building governance for sanctioned agents. Cato CTRL documented what happens when the unsanctioned agent on the CEO’s laptop gets sold on the dark web.

Monday morning action list

Regardless of vendor stack, four controls apply immediately: bind OpenClaw to localhost only and block external port exposure, enforce application allowlisting through MDM to prevent unauthorized installations, rotate every credential on machines where OpenClaw has been running, and apply least-privilege access to any account an AI agent has touched.

Discover the install base. CrowdStrike’s Falcon sensor, Cato’s SASE platform, and Cisco Identity Intelligence all detect shadow AI. For teams without premium tooling, query endpoints for the ~/.openclaw/ directory using native EDR or MDM file-search policies. If the enterprise has no endpoint visibility at all, run Shodan and Censys queries against corporate IP ranges.

Patch or isolate. Check every discovered instance against CVE-2026-24763, CVE-2026-25157, and CVE-2026-25253. Instances that cannot be patched should be network-isolated. There is no fleet-wide patching mechanism.

Audit skill installations. Review installed skills against Cisco’s Skills Scanner or the Snyk and Koi research. Any skill from an unverified source should be removed immediately.

Enforce DLP and ZTNA controls. Cato’s ZTNA controls restrict unapproved AI applications. Cisco Secure Access SSE enforces policy on MCP tool calls. Palo Alto’s Prisma Access Browser controls data flow at the browser layer.

Kill ghost agents. Build a registry of every AI agent running. Document business justification, human owner, credentials held, and systems accessed. Revoke credentials for agents with no justification. Repeat weekly.

Deploy DefenseClaw for sanctioned use.
Run OpenClaw inside NVIDIA’s OpenShell runtime with Cisco’s DefenseClaw to scan skills, verify MCP servers, and instrument runtime behavior automatically.

Red-team before deploying. Use Cisco AI Defense Explorer Edition (free) or Palo Alto Networks’ agent red-teaming in Prisma AIRS 3.0. Test the workflow, not just the model.

The OWASP Agentic Skills Top 10, published using ClawHavoc as its primary case study, provides a standards-grade framework for evaluating these risks. Four vendors shipped responses at RSAC 2026. None of them is a native enterprise kill switch for unsanctioned OpenClaw deployments. Until one exists, the Monday morning action list above is the closest thing to one.
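The "discover the install base" step above — querying endpoints for the ~/.openclaw/ directory — can be sketched in a few lines for teams without premium tooling. The `home_root` default and the marker list are assumptions for illustration; in practice this would run through an EDR or MDM file-search policy rather than a local script.

```python
# Sketch: sweep local home directories for OpenClaw footprints
# (the ~/.openclaw/ marker named in the action list). The /home layout
# and the markers tuple are illustrative assumptions.
from pathlib import Path

def find_agent_installs(home_root="/home", markers=(".openclaw",)):
    """Return (user, path) pairs for each home directory containing a marker."""
    hits = []
    root = Path(home_root)
    if not root.is_dir():
        return hits
    for home in sorted(root.iterdir()):
        for marker in markers:
            candidate = home / marker
            if candidate.exists():
                hits.append((home.name, str(candidate)))
    return hits

if __name__ == "__main__":
    # Feed the results into whatever inventory or ticketing flow you use.
    for user, path in find_agent_installs():
        print(f"shadow-AI marker: {user} -> {path}")
```

Extending the `markers` tuple with other known agent directories turns the same sweep into a general shadow-AI inventory pass.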
Slack adds 30 AI features to Slackbot, its most ambitious update since the Salesforce acquisition
Slack today announced more than 30 new capabilities for Slackbot, its AI-powered personal agent, in what amounts to the most sweeping overhaul of the workplace messaging platform since Salesforce acquired it for $27.7 billion in 2021. The update transforms Slackbot from a simple conversational assistant into a full-spectrum enterprise agent that can take meeting notes across any video provider, operate outside the Slack application on users’ desktops, execute tasks through third-party tools via the Model Context Protocol (MCP), and even serve as a lightweight CRM for small businesses — all without requiring users to install anything new.

The announcement, timed to a keynote event that Salesforce CEO Marc Benioff is headlining Tuesday morning, arrives less than three months after Slackbot first became generally available on January 13 to Business+ and Enterprise+ subscribers. In that short window, Slack says the feature is on track to become the fastest-adopted product in Salesforce’s 27-year history, with some employees at customer organizations reporting they save up to 90 minutes per day. Inside Salesforce itself, teams claim savings of up to 20 hours per week, translating to more than $6.4 million in estimated productivity value.

“Slackbot is smart. It’s pleasant, and I think it’s endlessly useful,” Rob Seaman, Slack’s interim CEO and former chief product officer, told VentureBeat in an exclusive interview ahead of the announcement. “The upper bound of use cases is effectively limitless for it.”

The release signals Slack’s clearest bid yet to become what Seaman and the company’s leadership describe as an “agentic operating system” — a single surface through which workers interact with AI agents, enterprise applications, and one another.
It also marks a direct challenge to Microsoft, which has spent the past two years embedding its Copilot assistant across the entirety of its productivity stack.

From simple chatbot to autonomous coworker: six new capabilities that redefine what Slackbot can do

The features announced Tuesday organize around several major capability areas, each designed to push Slackbot well beyond the role of a chatbot and into something closer to an autonomous digital coworker.

The most foundational may be what Slack is calling AI-Skills — reusable instruction sets that define the inputs, the steps, and the exact output format for a given task. Any team can build a skill once and deploy it on demand. Slackbot ships with a built-in library for common workflows, but users can also create their own. Critically, Slackbot can recognize when a user’s prompt matches an existing skill and apply it automatically, without being explicitly told to do so. “Think of these as topics or instructions — basically instructions for Slackbot to perform a repeat task that the user might want to do, that they can share with others, or a company might be able to set up for their whole company,” Seaman explained.

Deep research mode gives Slackbot the ability to conduct extended, multi-step investigations that take approximately four minutes to complete — a significant departure from the instant-response paradigm of most enterprise chatbots. Slack chose not to demonstrate this feature on stage at the keynote, Seaman said, precisely because its value lies in depth, not speed.

MCP client integration, meanwhile, allows Slackbot to make tool calls into external systems through the Model Context Protocol, meaning it can now create Google Slides, draft Google Docs, and interact with the more than 2,600 apps in the Slack Marketplace and the 6,000-plus apps built over two decades for the Salesforce AppExchange. “We’re going all in on MCP for Slackbot,” Seaman said.
“MCP clients and MCP servers are becoming very mature.”

Meeting intelligence allows Slackbot to listen to any meeting — not just Slack huddles, but calls on Zoom, Google Meet, or any other provider — by tapping into the user’s local audio through the desktop application. It captures discussions, summarizes decisions, surfaces action items, and because Slackbot is natively connected to Salesforce, it can log actions and update opportunities directly in the CRM. Slackbot on Desktop extends the agent outside the Slack container entirely, while voice mode adds text-to-speech and speech-to-text capabilities, with full speech-to-speech functionality under active development.

How Anthropic’s Claude powers Slackbot — and why keeping it affordable is the hardest part

Slackbot is built on Anthropic’s Claude model, a detail Seaman confirmed ahead of the keynote, where Anthropic’s leadership will appear alongside Slack executives on stage. The partnership underscores the deepening relationship between the two companies: Anthropic’s technology powers the reasoning layer, while Slack’s “context engineering” — the process of determining exactly which information from a user’s channels, files, and messages should be fed into the model’s context window — determines the quality and relevance of every response.

Managing the cost of that reasoning at enterprise scale is one of the most significant technical and financial challenges the team faces.
Slackbot is included in Business+ and Enterprise+ plans at no additional consumption charge — a deliberate strategic choice that places the burden of cost optimization squarely on Slack’s engineering team rather than on customers.

“A lot of what we’ve done is in the context engineering phase, working really closely with Anthropic to make sure that we’re optimizing the RAG phase, optimizing our system prompts and everything, to make sure we’re getting the right amount of context into the context window and not obviously making fiscally irresponsible decisions for ourselves,” Seaman said.

Starting in April, Slackbot will also become available in a limited sampling capacity to users on Slack’s free and Pro plans — a move designed to drive conversion up the pricing tiers.

Desktop AI and meeting transcription are powerful, but they raise hard questions about workplace surveillance

The extension of Slackbot beyond the Slack application window — particularly its ability to listen to meetings and view screen content — raises immediate questions about employee surveillance, especially in large enterprise environments where tens of thousands of workers may be subject to company-wide IT policies.

Seaman was emphatic that every capability is user-initiated and opt-in. Slackbot cannot listen to audio unless the user explicitly tells it to take meeting notes. It cannot view the desktop autonomously; in its current form, users must manually capture and share screenshots. And it inherits every permission the organization has already established in Slack.

“Everything is user opt-in. That’s a key tenet of Slack,” Seaman said. “It’s not rogue looking at your desktop or autonomously looking at your desktop. It’s very important to us, and very important to our enterprise customers.”

On Slackbot’s memory feature — which allows it to learn user preferences and habits over time — Seaman said the company has no plans to make that data available to administrators.
Users can flush their stored preferences at any time simply by telling Slackbot to do so.

Slack’s native CRM is a Trojan horse designed to capture startups before they outgrow it

Among the most important features in Tuesday’s release is a native CRM built directly into Slack, targeting small businesses that haven’t yet adopted a dedicated customer relationship management system.

The logic is straightforward: small companies typically adopt Slack early in their lifecycle, often on the free tier, and their customer conversations already happen in channels and direct messages. Slack’s native CRM reads those channels, understands the conversations, and automatically keeps deals, contacts, and call notes up to date. When companies are ready to scale, every record is already connected to Salesforce — no migrations, no starting over.

“The hypothesis is that along the way, companies are effectively going to have moments where a CRM might matter,” Seaman said. “Our goal is to make it available to them as a default, so as they are starting their company and their company is growing, it’s just right there for them. They don’t have to think about going off and procuring another tool.”

The feature also represents a response to a growing competitive threat. As the Wall Street Journal reported earlier this year, a wave of startups and individual developers have begun “vibe coding” their own lightweight CRMs, emboldened by the capabilities of large language models. By embedding CRM directly into Slack — the tool many of those same startups already depend on — Salesforce aims to make the procurement of a separate system unnecessary.

Slack says it has a context advantage over Microsoft and Google — but can it last?

The announcements arrive at a moment of intense competitive pressure. Microsoft has integrated Copilot across its entire productivity suite, giving it a distribution advantage that reaches into virtually every Fortune 500 company.
Google has been similarly aggressive with Gemini across Workspace. And standalone AI tools from OpenAI to Anthropic threaten to fragment the enterprise AI experience.

Seaman took a measured approach when asked directly about competitive positioning, invoking a mantra he said Slack uses internally: “We are competitor aware, but customer obsessed.”

“I think there are two things that really stand out. One, we have a context advantage — if you look at the way people use Slack, they love it. They use it so much, constantly communicating with their colleagues, openly thinking, working in public project channels. Two is the user experience. We focus so much on how our product feels in people’s hands.”

That context advantage is real but not guaranteed. Slack’s strength lies in the richness and volume of conversational data flowing through its channels — data that, when fed into an AI model, can produce responses with a degree of organizational awareness that competitors struggle to match. But Microsoft’s Teams captures similar conversational data, and its deep integration with Windows, Office, and Azure gives it a systems-level advantage that Slack, operating as a single application, cannot easily replicate.

Starting this summer, every new Salesforce customer will receive Slack automatically provisioned and AI-powered from day one — a bundling play that ensures the messaging platform reaches the broadest possible enterprise audience. Salesforce reported $41.5 billion in revenue for fiscal year 2026, up 10% year-over-year, with Agentforce ARR reaching $800 million. But Wall Street has remained skeptical about whether AI will ultimately erode demand for traditional enterprise software, and Salesforce’s stock has underperformed the broader Nasdaq over the past year.
More Slack users in more organizations gives AI-driven features more surface area to prove their value.

Slack’s biggest bet is that it can do everything without losing the simplicity that made it beloved

Tuesday’s launch is the first major product release under Seaman’s leadership. He assumed the interim CEO role after former Slack CEO Denise Dresser departed in December 2025 to become OpenAI’s first chief revenue officer — a move that signaled even Salesforce’s own executives felt the gravitational pull of frontier AI companies. The overarching thesis embedded in the announcement — that Slack is evolving from a messaging platform into an operating system for AI agents — is as risky as it is ambitious.

“One of the fundamental tenets of an operating system is that it obscures the complexity of the hardware from the end user,” Seaman said. “There are thousands of apps and agents out there, and that can be overwhelming. I think that’s our job — to be the OS that obscures that complexity, so you just use it like it’s a communication tool.”

When asked whether Slack risks losing its simplicity by trying to do everything, Seaman didn’t flinch. “There’s absolutely a risk,” he said. “That’s what keeps us up at night.”

It’s a remarkably candid admission from the leader of a platform that just launched 30 new features in a single day. The company that won the hearts of millions of workers with playful emoji reactions and frictionless messaging is now betting its future on meeting transcription, CRM pipelines, desktop agents, and enterprise orchestration. Whether Slack can absorb all of that ambition without losing the thing that made people love it in the first place isn’t just a product question — it’s the $27.7 billion question that Salesforce is still trying to answer.
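The AI-Skills idea described earlier — reusable instruction sets with defined inputs, steps, and output format that the assistant applies automatically when a prompt matches — can be sketched as a small registry. Slack has not published Slackbot's internals, so the naive keyword-overlap matching below is purely an illustrative assumption; a production system would presumably use semantic matching.

```python
# Sketch: a minimal AI-Skills-style library. Skill names, triggers, and
# the matching heuristic are hypothetical; only the concept (auto-applying
# a stored instruction set when a prompt matches) comes from the article.
from dataclasses import dataclass, field

@dataclass
class Skill:
    name: str
    triggers: set       # keywords suggesting this skill applies
    instructions: str   # steps + required output format, fed to the model

    def score(self, prompt):
        return len(set(prompt.lower().split()) & self.triggers)

@dataclass
class SkillLibrary:
    skills: list = field(default_factory=list)

    def match(self, prompt, min_score=1):
        """Pick the best-matching skill without the user naming it."""
        best = max(self.skills, key=lambda s: s.score(prompt), default=None)
        if best is not None and best.score(prompt) >= min_score:
            return best
        return None

library = SkillLibrary([
    Skill("weekly-status", {"status", "weekly", "update"},
          "Summarize the channel's week in 5 bullets, blockers last."),
    Skill("meeting-notes", {"meeting", "notes", "action", "items"},
          "List decisions first, then action items with owners."),
])

hit = library.match("draft my weekly status update for the team")
print(hit.name)  # "weekly-status"
```

The interesting design property is the silent dispatch: the user never invokes a skill by name, so the quality of the matcher, not the skill definitions, determines how often the right output format appears.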
Claude Code’s source code appears to have leaked: here’s what we know
Anthropic appears to have accidentally revealed the inner workings of one of its most popular and lucrative AI products, the agentic AI harness Claude Code, to the public.

A 59.8 MB JavaScript source map file (.map), intended for internal debugging, was inadvertently included in version 2.1.88 of the @anthropic-ai/claude-code package on the public npm registry, pushed live earlier this morning. By 4:23 am ET, Chaofan Shou (@Fried_rice), an intern at Solayer Labs, had broadcast the discovery on X (formerly Twitter). The post, which included a direct download link to a hosted archive, acted as a digital flare. Within hours, the ~512,000-line TypeScript codebase was mirrored across GitHub and analyzed by thousands of developers.

For Anthropic, a company currently riding a meteoric rise with a reported $19 billion annualized revenue run-rate as of March 2026, the leak is more than a security lapse; it is a strategic hemorrhage of intellectual property. The timing is particularly critical given the commercial velocity of the product. Market data indicates that Claude Code alone has achieved an annualized recurring revenue (ARR) of $2.5 billion, a figure that has more than doubled since the beginning of the year. With enterprise adoption accounting for 80% of its revenue, the leak provides competitors — from established giants to nimble rivals like Cursor — a literal blueprint for how to build a high-agency, reliable, and commercially viable AI agent.

We’ve reached out to Anthropic for an official statement on the leak and will update when we hear back.

The anatomy of agentic memory

The most significant takeaway for competitors lies in how Anthropic solved “context entropy” — the tendency for AI agents to become confused or hallucinatory as long-running sessions grow in complexity.
The leaked source reveals a sophisticated, three-layer memory architecture that moves away from traditional “store-everything” retrieval.

As analyzed by developers like @himanshustwts, the architecture utilizes a “Self-Healing Memory” system. At its core is MEMORY.md, a lightweight index of pointers (~150 characters per line) that is perpetually loaded into the context. This index does not store data; it stores locations. Actual project knowledge is distributed across “topic files” fetched on demand, while raw transcripts are never fully read back into the context, but merely “grep’d” for specific identifiers. This “Strict Write Discipline”—where the agent must update its index only after a successful file write—prevents the model from polluting its context with failed attempts.

For competitors, the “blueprint” is clear: build a skeptical memory. The code confirms that Anthropic’s agents are instructed to treat their own memory as a “hint,” requiring the model to verify facts against the actual codebase before proceeding.

KAIROS and the autonomous daemon

The leak also pulls back the curtain on “KAIROS,” named for the Ancient Greek concept of “the right time,” a feature flag mentioned over 150 times in the source. KAIROS represents a fundamental shift in user experience: an autonomous daemon mode. While current AI tools are largely reactive, KAIROS allows Claude Code to operate as an always-on background agent. It handles background sessions and employs a process called autoDream. In this mode, the agent performs “memory consolidation” while the user is idle. The autoDream logic merges disparate observations, removes logical contradictions, and converts vague insights into absolute facts. This background maintenance ensures that when the user returns, the agent’s context is clean and highly relevant.
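Based on the leaked description, the pointer-index pattern behind this memory scheme can be sketched roughly as follows. This is a minimal illustration, not Anthropic's actual code; the file layout and helper names are assumptions:

```python
from pathlib import Path

MEMORY_INDEX = "MEMORY.md"  # lightweight pointer index, perpetually in context

def add_fact(topic: str, fact: str, base: Path = Path(".")) -> bool:
    """Append a fact to an on-demand topic file, then update the index.

    Mirrors the leaked "Strict Write Discipline": the index is updated
    only after the topic-file write succeeds, so failed attempts never
    pollute the always-loaded index."""
    topic_file = base / "memory" / f"{topic}.md"
    topic_file.parent.mkdir(parents=True, exist_ok=True)
    try:
        with topic_file.open("a", encoding="utf-8") as f:
            f.write(fact.rstrip() + "\n")
    except OSError:
        return False  # write failed: do NOT touch the index
    # One short pointer line per topic: a location, not the data itself.
    pointer = f"- {topic}: see memory/{topic}.md"
    index = base / MEMORY_INDEX
    existing = index.read_text(encoding="utf-8") if index.exists() else ""
    if pointer not in existing:
        index.write_text(existing + pointer + "\n", encoding="utf-8")
    return True

def recall(topic: str, needle: str, base: Path = Path(".")) -> list[str]:
    """Return only matching lines from a topic file (grep, don't re-read),
    to be treated as hints the agent re-verifies against the codebase."""
    topic_file = base / "memory" / f"{topic}.md"
    if not topic_file.exists():
        return []
    return [line for line in topic_file.read_text(encoding="utf-8").splitlines()
            if needle.lower() in line.lower()]
```

The key property is that the always-loaded index stays tiny while arbitrary amounts of project knowledge live on disk, fetched or grep'd only when needed.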
The implementation of a forked subagent to run these tasks reveals a mature engineering approach to preventing the main agent’s “train of thought” from being corrupted by its own maintenance routines.

Unreleased internal models and performance metrics

The source code provides a rare look at Anthropic’s internal model roadmap and the struggles of frontier development. The leak confirms that Capybara is the internal codename for a Claude 4.6 variant, with Fennec mapping to Opus 4.6 and the unreleased Numbat still in testing. Internal comments reveal that Anthropic is already iterating on Capybara v8, yet the model still faces significant hurdles. The code notes a 29-30% false-claims rate in v8, an actual regression compared to the 16.7% rate seen in v4. Developers also noted an “assertiveness counterweight” designed to prevent the model from becoming too aggressive in its refactors. For competitors, these metrics are invaluable; they provide a benchmark of the “ceiling” for current agentic performance and highlight the specific weaknesses (over-commenting, false claims) that Anthropic is still struggling to solve.

“Undercover” Claude

Perhaps the most discussed technical detail is the “Undercover Mode.” This feature reveals that Anthropic uses Claude Code for “stealth” contributions to public open-source repositories. The system prompt discovered in the leak explicitly warns the model: “You are operating UNDERCOVER… Your commit messages… MUST NOT contain ANY Anthropic-internal information. Do not blow your cover.” While Anthropic may use this for internal “dog-fooding,” it provides a technical framework for any organization wishing to use AI agents for public-facing work without disclosure.
The logic ensures that no model names (like “Tengu” or “Capybara”) or AI attributions leak into public git logs—a capability that enterprise competitors will likely view as a mandatory feature for their own corporate clients who value anonymity in AI-assisted development.

The fallout has just begun

The “blueprint” is now out, and it reveals that Claude Code is not just a wrapper around a large language model, but a complex, multi-threaded operating system for software engineering. Even the hidden “Buddy” system—a Tamagotchi-style terminal pet with stats like CHAOS and SNARK—shows that Anthropic is building “personality” into the product to increase user stickiness.

For the wider AI market, the leak effectively levels the playing field for agentic orchestration. Competitors can now study Anthropic’s 2,500+ lines of bash validation logic and its tiered memory structures to build “Claude-like” agents with a fraction of the R&D budget. As “Capybara” has left the lab, the race to build the next generation of autonomous agents has just received an unplanned, $2.5 billion boost in collective intelligence.

What Claude Code users and enterprise customers should do now about the alleged leak

While the source code leak itself is a major blow to Anthropic’s intellectual property, it poses a specific, heightened security risk for you as a user. By exposing the “blueprints” of Claude Code, Anthropic has handed a roadmap to researchers and bad actors who are now actively looking for ways to bypass security guardrails and permission prompts. Because the leak revealed the exact orchestration logic for Hooks and MCP servers, attackers can now design malicious repositories specifically tailored to “trick” Claude Code into running background commands or exfiltrating data before you ever see a trust prompt.

The most immediate danger, however, is a concurrent, separate supply-chain attack on the axios npm package, which occurred hours before the leak.
If you installed or updated Claude Code via npm on March 31, 2026, between 00:21 and 03:29 UTC, you may have inadvertently pulled in a malicious version of axios (1.14.1 or 0.30.4) that contains a Remote Access Trojan (RAT). You should immediately search your project lockfiles (package-lock.json, yarn.lock, or bun.lockb) for these specific versions or the dependency plain-crypto-js. If found, treat the host machine as fully compromised, rotate all secrets, and perform a clean OS reinstallation.

To mitigate future risks, you should migrate away from the npm-based installation entirely. Anthropic has designated the Native Installer (curl -fsSL https://claude.ai/install.sh | bash) as the recommended method because it uses a standalone binary that does not rely on the volatile npm dependency chain. The native version also supports background auto-updates, ensuring you receive security patches (likely version 2.1.89 or higher) the moment they are released. If you must remain on npm, ensure you have uninstalled the leaked version 2.1.88 and pinned your installation to a verified safe version like 2.1.86.

Finally, adopt a zero-trust posture when using Claude Code in unfamiliar environments. Avoid running the agent inside freshly cloned or untrusted repositories until you have manually inspected the .claude/config.json and any custom hooks. As a defense-in-depth measure, rotate your Anthropic API keys via the developer console and monitor your usage for any anomalies. While your cloud-stored data remains secure, the vulnerability of your local environment has increased now that the agent’s internal defenses are public knowledge; staying on the official, native-installed update track is your best defense.
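The lockfile check described above can be scripted. A minimal sketch (the compromised version numbers and the plain-crypto-js indicator come from the report; the scanning logic itself is illustrative and deliberately conservative):

```python
import json
from pathlib import Path

BAD_AXIOS_VERSIONS = {"1.14.1", "0.30.4"}   # compromised releases per the report
BAD_DEPENDENCY = "plain-crypto-js"          # dependency associated with the RAT
LOCKFILES = ("package-lock.json", "yarn.lock", "bun.lockb")

def scan_lockfile(path: Path) -> list[str]:
    """Flag one lockfile if it references a compromised axios version or
    the dropper package. package-lock.json is parsed as JSON; yarn and
    bun lockfiles get a best-effort plain-text scan."""
    findings = []
    text = path.read_bytes().decode("utf-8", errors="ignore")
    if BAD_DEPENDENCY in text:
        findings.append(f"{path.name}: references {BAD_DEPENDENCY}")
    if path.name == "package-lock.json":
        try:
            data = json.loads(text)
        except json.JSONDecodeError:
            return findings
        # npm v7+ keys entries by path, e.g. "node_modules/axios".
        for pkg_path, meta in data.get("packages", {}).items():
            if pkg_path.endswith("axios") and meta.get("version") in BAD_AXIOS_VERSIONS:
                findings.append(f"{path.name}: axios {meta['version']} present")
    else:
        for ver in BAD_AXIOS_VERSIONS:
            if f"axios@{ver}" in text:  # heuristic; verify matches by hand
                findings.append(f"{path.name}: axios {ver} present")
    return findings

def scan_project(root: str) -> list[str]:
    """Scan every lockfile under `root`; any finding means the host
    should be treated as compromised per the guidance above."""
    return [f for name in LOCKFILES
            for lock in Path(root).rglob(name)
            for f in scan_lockfile(lock)]
```

Note that yarn.lock records versions on a separate `version "…"` line, so the plain-text heuristic can miss matches there; when in doubt, open the lockfile and search for axios entries manually.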
Imagine if your Teams or Slack messages automatically turned into secure context for your AI agents — PromptQL built it
For the modern enterprise, the digital workspace risks descending into “coordination theater,” in which teams spend more time discussing work than executing it. While traditional tools like Slack or Teams excel at rapid communication, they have structurally failed to serve as a reliable foundation for AI agents, such that a Hacker News thread calling on OpenAI to build its own version of Slack to empower AI agents went viral in February 2026, amassing 327 comments. That’s because agents often lack the real-time context and secure data access required to be truly useful, resulting in “hallucinations” or repetitive re-explaining of codebase conventions.

PromptQL, a spin-off from the GraphQL unicorn Hasura, is addressing this by pivoting from an AI data tool into a comprehensive, AI-native workspace designed to turn everyday team interactions into a persistent, secure memory for agentic workflows. Rather than letting conversations fall by the wayside, forcing users and agents to hunt for them again later, the platform distills them into actionable, proprietary data stored in an organized internal wiki the company can rely on going forward, approved and edited manually as needed.

Imagine two colleagues messaging about a bug that needs to be fixed — instead of manually assigning it to an engineer or agent, your messaging platform automatically tags it, assigns it, and documents it all in the wiki with one click. Now do this for every issue or topic of discussion that takes place in your enterprise, and you’ll have an idea of what PromptQL is attempting. The idea is simple but powerful: turning the conversation that necessarily precedes work into an actual assignment automatically started by your own messaging system.

“We don’t have conversations about work anymore,” CEO Tanmai Gopal said in a recent video call interview with VentureBeat.
“You actually have conversations that do the work.”

Originally positioned as an AI data analyst, the company is pivoting into a full-scale AI-native workspace. It isn’t just “Slack with a chatbot”; it is a fundamental re-architecting of how teams interact with their data, their tools, and each other. “PromptQL is this workhorse in the background, this 24/7 intern that’s continuously cranking out the actual work—looking at code, confirming hypotheses, going to multiple places, actually doing the work,” Gopal said.

Technology: messages that automatically turn into a shared, continuously updated context engine

The technical soul of PromptQL is its Shared Wiki. Traditional LLMs suffer from a “memory” problem; they forget previous interactions or hallucinate based on outdated training data. PromptQL solves this by capturing “shared context” as teams work. When an engineer fixes a bug or a marketer defines a “recycled lead,” they aren’t just typing into a void. They are teaching a living, internal Wikipedia. This wiki doesn’t require “documentation sprints” or manual YAML file updates; it accumulates context organically.

“Throughout every single conversation, you are teaching PromptQL, and that is going into this wiki that is being developed over time. This is our entire company’s knowledge gradually coming together.”

Interconnectivity: Much like cells in a Petri dish, small “islands” of knowledge—say, a Salesforce integration—eventually bridge to other islands, like product usage data in Snowflake.

Human-in-the-loop: To prevent the AI from learning “junk” (like a reminder about a doctor’s appointment from 2024), humans must explicitly “Add to Wiki” to canonize a fact.

The Virtual Data Layer: Unlike traditional platforms that require data replication, PromptQL uses a virtual SQL layer.
It queries your data in place across databases (Snowflake, ClickHouse, Postgres) and SaaS tools (Stripe, Zendesk, HubSpot), ensuring that nothing is ever extracted or cached. PromptQL is designed to be a highly integrable orchestration layer that supports both leading AI model providers and a vast ecosystem of existing enterprise tools.

AI Model Support: The platform allows users to delegate tasks to specific coding agents such as Claude Code and Cursor, or use custom agents built for specific internal needs.

Workflow Compatibility: The system is built to inherit context from existing team tools, enabling AI agents to understand codebase conventions or deployment patterns from your existing infrastructure without manual re-explanation.

From chatting to doing

The PromptQL interface looks familiar—threads, channels, and mentions—but the functionality is transformative. In a demonstration, an engineer identifies a failing checkout in an #eng-bugs channel. Instead of tagging a human SRE, they delegate to Claude Code via PromptQL. The agent doesn’t just look at the code; it inherits the team’s shared context. It knows, for instance, that “EU payments switched to Adyen on Jan 15” because that fact was added to the wiki weeks prior. Within minutes, the AI identifies a currency mismatch, pushes a fix, opens a PR, and updates the wiki for future reference.

This “multiplayer” AI approach is what sets the platform apart. It allows a non-technical manager to ask, “Which accounts have growing Stripe billing but flat Mixpanel usage?” and receive a joined table of data pulled from two disparate sources instantly. The user can then schedule a recurring Slack DM of those results with a single follow-up command.

Users don’t even need to think about the integrity or cleanliness of their data — PromptQL handles it for them: “Connect all data in whatever state of shittiness it is, and let shared context build up on the fly as you use it,” Gopal said.
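To make the cross-source join concrete, here is a toy sketch with in-memory stand-ins for the Stripe and Mixpanel data. The accounts, numbers, and thresholds are invented; PromptQL's virtual SQL layer would run the equivalent query against the live systems in place:

```python
# Hypothetical per-account metrics from two disparate sources:
# (value two months ago, value last month).
stripe_billing = {
    "acme":    (1_000, 1_800),
    "globex":  (2_000, 2_100),
    "initech": (500, 900),
}
mixpanel_usage = {
    "acme":    (4_000, 4_050),
    "globex":  (3_000, 5_500),
    "initech": (800, 810),
}

def growing_billing_flat_usage(billing, usage,
                               billing_growth=0.25, usage_growth=0.05):
    """Join the two sources on account and keep accounts whose billing
    grew by at least billing_growth while usage grew by less than
    usage_growth (a "paying more, using it less" churn-risk signal)."""
    flagged = []
    for account in billing.keys() & usage.keys():
        b_old, b_new = billing[account]
        u_old, u_new = usage[account]
        if ((b_new - b_old) / b_old >= billing_growth
                and (u_new - u_old) / u_old < usage_growth):
            flagged.append(account)
    return sorted(flagged)
```

The value of the virtual layer is precisely that this join happens without first replicating either dataset into a warehouse.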
Highly secure

For Fortune 500 companies like McDonald’s and Cisco, “just connect your data” is a terrifying sentence. PromptQL addresses this with fine-grained access control. The system enforces attribute-based policies at the infrastructure level. If a regional ops manager asks for vendor rates across all regions, the AI will redact columns or rows they aren’t authorized to see, even if the LLM “knows” the answer. Furthermore, any high-stakes action—like updating 38 payment statuses in NetSuite—requires a human “Approve/Deny” sign-off before execution.

Licensing and pricing

In a departure from the per-seat SaaS status quo, PromptQL is entirely consumption-based.

Pricing: The company uses “Operational Language Units” (OLUs).

Philosophy: Gopal argues that charging per seat penalizes companies for onboarding their whole team. By charging for the value created (the OLU), PromptQL encourages users to connect “everyone and everything.”

Enterprise Storage: While smaller teams use dedicated accounts, enterprise customers get a dedicated VPC. Any data the AI “saves” (like a custom to-do list) is stored in the customer’s own S3 bucket using the Iceberg format, ensuring total data sovereignty.

“Philosophically, we want you to connect everyone and everything [to PromptQL], so we don’t penalize that,” Gopal said. “We just price based on consumption.”

Why it matters now for enterprises

So, is PromptQL a Teams or Slack killer? According to Gopal, the answer is yes: “That is what has happened for us. We’ve shut down our internal Slack for internal comms entirely,” he said.

The launch comes at a pivot point for the industry. Companies are realizing that “chatting with a PDF” isn’t enough. They need AI that can act, but they can’t afford the security risks of “unsupervised” agents.
By building a workspace that prioritizes shared context and human-in-the-loop verification, PromptQL is offering a middle ground: an AI that learns like a teammate and executes like an intern, all while staying within the guardrails of enterprise security.

For enterprises focused on making AI work at scale, PromptQL addresses the critical “how” of implementation by providing the orchestration and operational layer needed to deploy agentic systems. By replacing the “coordination theater” of traditional chat tools with a workspace where AI agents have the same permissions and context as human teammates, it enables seamless multi-agent coordination and task routing. This allows decision-makers to move beyond simple model selection to a reality where agents—such as Claude Code—use shared team context to execute complex workflows, like fixing production bugs or updating CRM records, directly within active threads.

From a data infrastructure perspective, the platform simplifies the management of real-time pipelines and RAG-ready architectures by utilizing a virtual SQL layer that queries data “in place.” This eliminates the need for expensive, time-consuming data preparation and replication sprints across hundreds of thousands of tables in databases like Snowflake or Postgres. Furthermore, the system’s Shared Wiki serves as a superior alternative to standard vector databases or prompt-based memory, capturing tribal knowledge organically and creating a living metadata store that informs every AI interaction with company-specific reasoning.

Finally, PromptQL addresses the security governance required for modern AI stacks by enforcing fine-grained, attribute-based access control and role-based permissions. Through human-in-the-loop verification, it ensures that high-stakes actions and data mutations are held for explicit approval, protecting against model misuse and unauthorized data leakage.
While it does not assist with physical infrastructure tasks such as GPU cluster optimization or hardware procurement, it provides the necessary software guardrails and auditability to ensure that agentic workflows remain compliant with enterprise standards like SOC 2, HIPAA, and GDPR.
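The attribute-based redaction described earlier (a regional manager seeing only their region's rows, with sensitive columns masked) can be sketched in miniature. The policy shape, user attributes, and data here are hypothetical, not PromptQL's actual API:

```python
# Toy attribute-based access control: filter rows and mask columns
# according to a user's attributes before any result reaches the user
# (or the LLM's final answer). All names and records are invented.
VENDOR_RATES = [
    {"region": "EMEA", "vendor": "Acme", "rate": 120},
    {"region": "APAC", "vendor": "Globex", "rate": 95},
    {"region": "EMEA", "vendor": "Initech", "rate": 110},
]

def apply_policy(rows, user):
    """Row filter: non-admins only see rows for their own region.
    Column mask: the 'rate' column is redacted unless the user holds
    the 'pricing' entitlement, even if the raw data was queryable."""
    visible = [r for r in rows
               if user.get("role") == "admin" or r["region"] == user.get("region")]
    if "pricing" not in user.get("entitlements", ()):
        visible = [{**r, "rate": "REDACTED"} for r in visible]
    return visible
```

The important design point, echoed in the article, is that the policy is enforced at the infrastructure level on the result set, not by asking the model to politely withhold information.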
Softr launches AI-native platform to help nontechnical teams build business apps without code
Softr, the Berlin-based no-code platform used by more than one million builders and 7,000 organizations including Netflix, Google, and Stripe, today launched what it calls an AI-native platform — a bet that the explosive growth of AI-powered app creation tools has produced a market full of impressive demos but very little production-ready business software.

The company’s new AI Co-Builder lets non-technical users describe in plain language the software they need, and the platform generates a fully integrated system — database, user interface, permissions, and business logic included — connected and ready for real-world deployment immediately. The move marks a fundamental evolution for a company that spent five years building a no-code business before layering AI on top of what it describes as a proven infrastructure of constrained, pre-built building blocks.

“Most AI app-builders stop at the shiny demo stage,” Softr co-founder and CEO Mariam Hakobyan told VentureBeat in an exclusive interview ahead of the launch. “A lot of the time, people generate calculators, landing pages, and websites — and there are a huge number of use cases for those. But there is no actual business application builder, which has completely different needs.”

The announcement arrives at a moment when the AI app-building market finds itself at an inflection point. A wave of so-called “vibe coding” platforms — tools like Lovable, Bolt, and Replit that generate application code from natural language prompts — has captured developer mindshare and venture capital over the past 18 months.
But Hakobyan argues those tools fundamentally misserve the audience Softr is chasing: the estimated billions of non-technical business users inside companies who need custom operational software but lack the skills to maintain AI-generated code when it inevitably breaks.

Why AI-generated app prototypes keep failing when real business data is involved

The core tension Softr is trying to resolve is one that has plagued the AI app-building category since its inception: the gap between what looks good in a demo and what actually works when real users, real data, and real security requirements enter the picture. Business software — client portals, CRMs, internal operational tools, inventory management systems — requires authentication, role-based permissions, database integrity, and workflow automation that must function reliably every single time. When an AI-generated prototype fails in these areas, fixing it typically requires a developer, which defeats the purpose of the no-code promise entirely.

“One prompt might break 10 previous steps that you’ve already completed,” Hakobyan said, describing the experience non-technical users face on vibe coding platforms. “You keep prompting, keep trying to fix errors that the AI generated, and you end up maintaining something you didn’t even sign up for in the first place.”

This critique targets a real structural limitation in how many AI app builders work today. Platforms that rely fully on AI to generate application code from scratch leave users with a codebase they cannot read, debug, or maintain without technical expertise. To connect those generated apps to real databases, login systems, or third-party services, users often must integrate tools like Supabase and make API calls — tasks that effectively require them to become developers.
Softr’s position is that these platforms have replaced one form of coding with another, swapping programming languages for English-language prompts that carry all the same fragility.

How Softr’s building-block architecture avoids the hallucination problem that plagues AI code generators

Rather than generating raw code, Softr’s platform uses what Hakobyan describes as “proven and structured building blocks” — pre-built components for standard application functions like Kanban boards, list views, tables, user authentication, and permissions. The AI interprets a user’s requirements, guides them through targeted questions about login functionality, permission types, and user roles, then assembles these tested building blocks in a constrained, intelligent way. Only when a user requests functionality that falls outside the standard 80% covered by these blocks does the system build a custom component with AI.

“It basically never hallucinates, because it’s all built on an infrastructure that’s secure and constrained,” Hakobyan explained. “It doesn’t generate code or leave you with code, because underneath, it uses our existing building block model.”

The result is not a code repository. It is a live application running on Softr’s infrastructure, with a visual editor that users can continue to modify — either by prompting the AI further or by directly manipulating the no-code interface. This dual-editing model is a deliberate design decision that Hakobyan frames as the platform’s core differentiator.
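The constrained-assembly idea can be sketched abstractly: each requested feature maps onto a vetted block from a catalog, and only the remainder falls through to open-ended AI generation. The catalog and block names below are invented for illustration, not Softr's internals:

```python
# Toy sketch of constrained assembly: known features resolve to
# pre-built, tested blocks; unknown ones are routed to custom AI
# generation (the ~20% outside the standard catalog).
BLOCK_CATALOG = {
    "kanban board": "KanbanBlock",
    "list view": "ListBlock",
    "table": "TableBlock",
    "login": "AuthBlock",
    "permissions": "RoleBlock",
}

def assemble_app(requested_features):
    """Split a feature list into vetted blocks vs. custom AI work."""
    plan, custom = [], []
    for feature in requested_features:
        block = BLOCK_CATALOG.get(feature.lower().strip())
        if block:
            plan.append(block)       # constrained path: cannot "hallucinate"
        else:
            custom.append(feature)   # open-ended path: AI-built component
    return plan, custom
```

The anti-hallucination claim rests on the first path: anything the catalog covers is assembled, not generated, so its behavior is as predictable as the block itself.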
“It almost combines the best of both worlds of AI and no-code, and really lets users either continue iterating with AI or continue working with the app visually, which is much simpler and easier and gives them control,” she said.

Core platform foundations — authentication, user roles, permissions, hosting, and SSL — are built in from the start, eliminating what Hakobyan calls the “blank canvas problem” that plagues vibe coding platforms, where every user must architect fundamental application infrastructure from scratch via prompts. The platform uses a SaaS subscription pricing model, with each plan including a set number of AI credits and the option to purchase more — though the visual editor means users don’t always need to consume credits, since direct manipulation of the no-code interface is often faster and more precise.

Inside the five-year journey from Airtable interface to profitable AI-native platform

Softr’s journey to this moment has been a gradual, disciplined expansion that stands in contrast to the rapid fundraising cycles common among AI startups. The company launched in 2020 as a no-code interface layer on top of Airtable, the popular enterprise database product. Co-founded by Armenian entrepreneurs Hakobyan and CTO Artur Mkrtchyan, the startup raised a $2.2 million seed round in early 2021 led by Atlantic Labs, followed by a $13.5 million Series A in January 2022 led by FirstMark Capital.

What happened next is notable for its restraint. Softr has not raised additional capital since that 2022 Series A. Instead, it has grown to profitability. “We have been profitable for the past whole year, and we’re about a 50-person team,” Hakobyan told VentureBeat.
“We have grown to eight-digit revenue fully PLG, no sales team, mostly through word of mouth, organic growth.”

That financial profile — eight-figure annual revenue, profitable, 50 employees, no sales team — is striking in a market where many AI-powered competitors are spending heavily to acquire users. Over the past year, the company has steadily expanded its technical capabilities, moving beyond its original Airtable dependency to support Google Sheets, Notion, PostgreSQL, MySQL, MariaDB, and other databases. In February 2025, TechCrunch reported on this expansion, with Hakobyan explaining that many potential customers had “data scattered across many different tools” and needed a single platform to unify that fragmented infrastructure. Today, Softr offers 15-plus native integrations with external databases, plus a REST API connector for additional data sources. The new AI Co-Builder represents the culmination of this multi-year evolution — combining the building block architecture, the broad data integration layer, and a new AI interface into a single platform for business application creation.

How Softr positions itself against both no-code incumbents and vibe coding startups

Softr’s launch lands in a rapidly fragmenting competitive landscape, and Hakobyan is deliberate about where she draws the lines. On one side sit traditional no-code platforms like Bubble, which offer deep customization and design freedom but require users to build everything from scratch — database schemas, pixel-level layouts, authentication systems — creating a steeper learning curve. A TechRadar review noted that while Softr’s blocks don’t offer the same design freedom as Bubble, the platform’s simplicity makes it accessible to genuinely non-technical users.
In a comparison published by Business Insider Africa in June, Softr was characterized as offering a “minimal learning curve, especially for internal or web-based tools,” though with limitations in scalability for more complex applications.

On the other side sit the AI-first code generation tools that Hakobyan views as fundamentally misaligned with business software requirements. “Before, people were coding; then they were coding through APIs; now they are coding almost through a human language interface, just with English,” Hakobyan said. “But what Softr does is fundamentally different. It abstracts all of that and makes the creation simple.”

She also distinguishes Softr from developer-focused AI coding assistants like Anthropic’s Claude Code, positioning those as tools that make professional developers more efficient rather than tools that enable non-developers to build software. “There are amazing tools for developers — that’s great. The target audience is developers,” Hakobyan acknowledged.

Instead, Softr targets a specific and potentially enormous market: businesses that need custom internal and external-facing operational tools and currently rely on spreadsheets, email, or rigid off-the-shelf software that doesn’t match their actual processes. Hakobyan described use cases ranging from asset production workflows for film companies — where internal teams, external agencies, and approvers interact across a multi-stage process — to lightweight CRM replacements for teams that don’t need the full complexity of Salesforce. “There’s not even a vertical solution for this type of process,” she said. “It’s very custom to each organization.”

What Netflix, Google, and thousands of non-tech companies actually build on the platform

Many of Softr’s highest-profile customers — Netflix, Google, Stripe, UPS — were using the platform before the AI Co-Builder even existed, building on the company’s original no-code foundation. But the user base extends far beyond Silicon Valley.
Non-tech organizations in real estate, manufacturing, and logistics represent a significant portion of Softr’s customer base — companies that often still manage core processes with pen, paper, and spreadsheets.

“A lot of these companies — you might think they already have the solutions, but they don’t,” Hakobyan noted. “In tech companies, most of the time, CRM and project management tools are already established. But most of our customers are using Softr for internal operational tooling or workflow tooling, where the use case involves lots of different departments and even external parties.”

The company is SOC 2 Type II compliant and GDPR compliant, with additional compliance capabilities in development. Hakobyan noted that auditing and governance functionality can be built directly into applications using the platform’s database and workflow tools, with a native logging and auditing system expected to ship in the near term.

Softr’s billion-user ambition and the Canva analogy that explains its strategy

Softr’s stated mission — to empower billions of business users to create production-ready software — is audacious, but Hakobyan frames the AI Co-Builder launch as a fundamental acceleration of the trajectory the company has been on for five years. “Everything people would have to spend hours doing is done within five minutes,” she said. “And obviously that helps more people to actually build real software.”

The company plans to layer a product-led sales motion on top of its existing PLG engine, targeting larger enterprise customers with higher average contract values.
This represents a deliberate strategic expansion from the small and mid-sized businesses that have formed Softr’s core customer base — a segment that TechCrunch identified as natural Softr customers as far back as the company’s 2022 Series A, given that those firms are most likely to be priced out of the competitive developer market.

Hakobyan draws an analogy that has become common among the company’s users: Softr as “Canva for web apps.” Just as Canva made professional design accessible to non-designers, Softr aims to make business software creation accessible to non-developers. Whether the company can translate its disciplined growth and profitable foundation into a platform that genuinely serves that enormous addressable market remains to be seen. Softr faces intensifying competition from both traditional no-code incumbents adding AI capabilities and well-funded AI-native startups approaching the problem from the code-generation side.

But Softr enters this next phase with advantages that many competitors lack: a profitable business, a million-user base already shipping production software, and an architectural approach that treats AI as an accelerant layered on top of proven infrastructure rather than an unpredictable replacement for it. “No-code alone had its own problems, and AI alone also just can’t do the job,” Hakobyan said. “The combination is what’s going to make it really powerful.”

For the past five years, Softr bet that the hardest part of software wasn’t writing the code — it was getting the databases, permissions, and business logic right. Now the company is betting that in the age of AI, that conviction matters more than ever. The millions of business users who have never written a line of code but desperately need custom software are about to find out whether Softr is right.
Nvidia-backed ThinkLabs AI raises $28 million to tackle a growing power grid crunch
ThinkLabs AI, a startup building artificial intelligence models that simulate the behavior of the electric grid, announced today that it has closed a $28 million Series A financing round led by Energy Impact Partners (EIP), one of the largest energy transition investment firms in the world. Nvidia’s venture capital arm NVentures and Edison International, the parent company of Southern California Edison, also participated in the round.

The funding marks a significant escalation in the race to apply AI not just to software and content generation, but to the physical infrastructure that powers modern life. While most AI investment headlines have centered on large language models and generative tools, ThinkLabs is pursuing a different and arguably more consequential application: using physics-informed AI to model the behavior of electrical grids in real time, compressing engineering studies that once took weeks or months into minutes.

“We are dead focused on the grid,” ThinkLabs CEO Josh Wong told VentureBeat in an exclusive interview ahead of the announcement. “We do AI models to model the grid, specifically transmission and distribution power flow related modeling. We can calculate things like interconnection of large loads — like data centers or electric vehicle charging — and understand the impact they have on the grid.”

The round drew participation from a deep bench of returning investors, including GE Vernova, Powerhouse Ventures, Active Impact Investments, Blackhorn Ventures, and Amplify Capital, along with an unnamed large North American investor-owned utility. The company initially set out to raise less than $28 million, according to Wong, but strong demand from strategic partners pushed the round higher.

“This was way oversubscribed,” Wong said.
“We attracted the right ecosystem partners and the right capital partners to grow with, and that’s how we ended up at $28 million.”

Why surging electricity demand is breaking the grid’s legacy planning tools

The timing of the raise is no coincidence. U.S. electricity demand is projected to grow 25% by 2030, according to consultancy ICF International, driven largely by AI data centers, electrified transportation, and the broader push toward building and vehicle electrification. That surge is crashing into a grid that was engineered decades ago for a fundamentally different set of demands — and utilities are scrambling to keep up.

The core problem is one of computational capacity. When a utility needs to understand what will happen to its grid if a large data center connects to a particular substation, or if a cluster of EV chargers goes live in a residential neighborhood, engineers must run power flow simulations — complex calculations that model how electricity moves through the network. Those studies have traditionally relied on legacy software tools from companies like Siemens, GE, and Schneider Electric, and they can take weeks or months to complete for a single scenario.

ThinkLabs’ approach replaces that bottleneck with physics-informed AI models that learn from the same engineering simulators but can then run orders of magnitude faster. According to the company, its platform can compress a month-long grid study into under three minutes and run 10 million scenarios in 10 minutes, while maintaining greater than 99.7% accuracy on grid power flow calculations.

Wong draws a sharp distinction between what ThinkLabs does and the generative AI models that dominate public discourse. “We’re not hallucinating the heck out of things,” he said. “We are talking about engineering calculations here. I would really compare this to a computation of fluid dynamics, or like F1 cars, or aerospace, or climate models.
We do have a source of truth from existing physics-based engineering models.”

That source of truth is crucial. ThinkLabs trains its AI on the outputs of first-principles physics simulators — the same tools utilities already trust — and then validates its models against those simulators. The result, Wong argues, is an AI system that is not only fast but fully explainable and auditable, a critical requirement in an industry where a miscalculation can cause blackouts or damage physical infrastructure.

How ThinkLabs’ three-phase power flow analysis differs from every other grid AI startup

The competitive landscape for AI in grid management has grown crowded over the past two years, with startups and incumbents alike racing to apply machine learning to utility workflows. But Wong contends that ThinkLabs occupies a fundamentally different position from most of its competitors.

“As far as we know, we’re the only ones actually doing AI-native grid simulation analysis,” he said. “Others might be using AI for forecasting, load disaggregation, or local energy management, but fundamentally, they’re not calculating a power flow.”

What ThinkLabs performs is a full three-phase AC power flow analysis — examining every node and bus on the electric grid to determine real and reactive power levels, line flows, and voltages. This is the same type of analysis that utility engineers perform today using legacy tools, but ThinkLabs can deliver it at a speed and scale that those tools simply cannot match.

The distinction matters because utilities make capital investment decisions — worth billions of dollars — based on exactly these types of studies. If a power flow analysis shows that a proposed data center connection will overload a transmission line, the utility may need to build new infrastructure at enormous cost.
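To make the pattern concrete: the sketch below is a hedged, toy-scale illustration, not ThinkLabs’ actual models. It solves a deliberately simplified single-phase (not three-phase) AC power flow for a two-bus system with Gauss-Seidel iteration, then fits a small polynomial surrogate to the solver’s outputs and checks the surrogate against the solver, mirroring the train-on-the-simulator, validate-against-the-simulator approach described above. All function names and values are illustrative per-unit quantities.

```python
import numpy as np

# "Source of truth": a toy single-phase AC power flow in per-unit terms.
# Bus 1 is the slack bus (fixed voltage); bus 2 serves a P + jQ load
# over a single line with impedance z_line.
def solve_v2(p_load, q_load, z_line=0.01 + 0.05j, v1=1.0 + 0j,
             tol=1e-10, max_iter=1000):
    y = 1 / z_line                 # line admittance (Y22 = y, Y21 = -y)
    s2 = -(p_load + 1j * q_load)   # net injection at bus 2 (a load, so negative)
    v2 = 1.0 + 0j                  # flat start
    for _ in range(max_iter):
        # Gauss-Seidel update: V2 = (S2*/V2* - Y21*V1) / Y22
        v2_next = (s2.conjugate() / v2.conjugate() + y * v1) / y
        if abs(v2_next - v2) < tol:
            return v2_next
        v2 = v2_next
    return v2

# Surrogate: fit |V2| as a quadratic in (P, Q) to the solver's outputs,
# standing in for the far richer models a physics-informed system learns.
rng = np.random.default_rng(0)
train = rng.uniform(0.0, 1.0, size=(200, 2))   # sampled (P, Q) load scenarios
vmag = np.array([abs(solve_v2(p, q)) for p, q in train])

def features(pq):
    p, q = pq[:, 0], pq[:, 1]
    return np.column_stack([np.ones_like(p), p, q, p * p, q * q, p * q])

coef, *_ = np.linalg.lstsq(features(train), vmag, rcond=None)

# Validate against the simulator on held-out load scenarios.
test = rng.uniform(0.0, 1.0, size=(100, 2))
pred = features(test) @ coef
truth = np.array([abs(solve_v2(p, q)) for p, q in test])
max_err = float(np.max(np.abs(pred - truth)))
print(f"max |V2| error vs. simulator: {max_err:.2e}")
```

Once fitted, evaluating the surrogate is a single matrix multiply, which is why models of this kind can sweep millions of load scenarios quickly while the physics solver remains available as the validation backstop.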
But if the analysis can also suggest alternative solutions — battery storage placement, load flexibility scheduling, or topology optimization — the utility can potentially avoid or defer those capital expenditures.

“With many utilities, existing tools will basically show them all the problems, but they can only address solutions by trial and error,” Wong explained. “With AI, we can use reinforcement learning to generate more creative solutions, but also very effectively weigh the pros and cons of each of these solutions.”

Inside ThinkLabs’ strategic relationships with Nvidia, Edison, and Microsoft

The presence of NVentures in the round — Nvidia’s venture arm does not write many checks — signals a deeper strategic relationship that extends well beyond capital. Wong confirmed that ThinkLabs works extensively within the Nvidia ecosystem on the energy and utility side, leveraging CUDA for GPU-accelerated computation and integrating Nvidia’s Earth-2 climate simulation platform into ThinkLabs’ probabilistic forecasting and risk-adjusted analysis pipelines.

“We are what one utility mentioned as the only high-intensity GPU workload for the OT side — the operational technology side — that’s planning and operations,” Wong said. He added that ThinkLabs is also in discussions with Nvidia’s Omniverse team about additional utility use cases, though those efforts are still early.

Edison International’s participation carries a different kind of strategic weight. In January 2026, ThinkLabs publicly announced results from a collaboration with Southern California Edison (SCE), Edison International’s utility subsidiary, that demonstrated the real-world capabilities of its platform.
As the Los Angeles Times reported at the time, the collaboration showed that ThinkLabs’ AI could train in minutes per circuit, process a full year of hourly power-flow data in under three minutes across more than 100 circuits, and produce engineering reports with bridging-solution recommendations in under 90 seconds — work that previously required dedicated engineers an average of 30 to 35 days.

In today’s announcement, Edison International’s Sergej Mahnovski, Managing Director of Strategy, Technology and Innovation, reinforced that urgency: “We must rapidly transition from legacy planning tools and processes to meet the growing demands on the electric grid — new AI-native solutions are needed to transform our capabilities.”

ThinkLabs also works closely with Microsoft, which hosted a webinar in mid-2025 featuring Wong alongside representatives from Southern Company, EPRI, and Microsoft’s own energy team. The SCE collaboration was built on Microsoft Azure AI Foundry, situating ThinkLabs within the cloud infrastructure that many large utilities already use.

The 20-year career path that led from Toronto Hydro to an autonomous grid startup

Wong’s biography reads like a deliberate preparation for this exact moment. He has spent more than 20 years in the utility industry, starting his career at Toronto Hydro before founding Opus One Solutions in 2012 — a smart-grid software company that he grew to over 100 employees serving customers across eight countries before selling it to GE in 2022, as previously reported by BetaKit.

After the acquisition, Wong joined what became GE Vernova and was asked to develop the company’s “grid of the future” roadmap.
The thesis he developed there — that the grid is the central bottleneck to economic growth, electrification, and national security, and that autonomous grid orchestration powered by AI is the solution — became the intellectual foundation for ThinkLabs.

“I was pulling together the thesis that we need to electrify, but the grid is really at the center of attention,” Wong said. “The conclusion is we need to drive towards greater autonomy. We talk a lot about autonomous cars, but I would argue that autonomous grids is the much more pressing priority.”

ThinkLabs was incubated inside GE Vernova and spun out as an independent company in April 2024, coinciding with a $5 million seed round co-led by Powerhouse Ventures and Active Impact Investments, as reported by GlobeNewswire at the time. GE Vernova remains a shareholder and strategic partner. Wong is the sole founder.

The team composition reflects the company’s dual identity. “Half of our team are power system PhDs, but the other half are the AI folks — people who have been looking at hyper-scalable AI infrastructure platforms and MLOps for other industries,” Wong said. “We have really been blending the two.”

How ThinkLabs doubled its utility customer base in a single quarter

Utilities are famously among the most conservative technology buyers in the world, with procurement cycles that can stretch years and layers of regulatory oversight that slow adoption. Wong acknowledges this reality but says the landscape is shifting faster than many observers realize.

“I have noticed sales cycles really accelerating,” he said. “It’s still long and depends on which utility and how big the deal is, but we have been witnessing firsthand sales cycles going from the traditional one to two years to a shortest two to three months.”

On the commercial side, Wong declined to share specific revenue figures but offered several data points that suggest meaningful traction.
ThinkLabs is working with more than 10 utilities on AI-native grid simulation for planning and operations, he said, and the company doubled its customer accounts in the first quarter of 2026 alone.

“So not one or two, but we’re working with 10-plus utilities,” Wong said. “Things have really picked up pace even before this A round.”

The company primarily targets investor-owned utilities and system operators — the organizations that own and operate the grid — though Wong noted that AI is also beginning to democratize grid simulation capabilities for smaller utilities that previously lacked the engineering resources to run sophisticated analyses.

Wong said the primary use of funds will go toward advancing the product to enterprise grade and expanding the range of use cases the platform supports. The company sees a significant land-and-expand opportunity within individual utility accounts — moving from modeling a small region to training AI models across entire states or multi-state territories within a single customer.

EIP’s involvement as lead investor carries particular significance in this market. The firm is backed by more than half of North America’s investor-owned utilities, giving ThinkLabs a direct line into the executive suites of the customers it is trying to reach. “Utilities are being asked to add capacity on timelines the industry has never seen before, and the stakes extend far beyond the energy sector,” Sameer Reddy, Managing Partner at EIP, said in the press release.

What a 99.7% accuracy rate actually means for critical grid infrastructure

Any conversation about applying AI to critical infrastructure inevitably confronts the question of failure modes. A hallucination in a chatbot is an embarrassment; a miscalculation in a grid power flow analysis could contribute to equipment damage or widespread outages.

Wong addressed this head-on.
The 99.7% accuracy figure, he explained, is an average across large-volume planning studies — specifically 8,760-hour analyses (every hour of the year) projected across three to 10 years with multiple sensitivity scenarios. For planning purposes, he argued, this level of accuracy is not only sufficient but may actually exceed what traditional methods deliver in practice.

“If you look at a source of truth, the data quality is actually the biggest limiting factor, not the accuracy of these AI models,” he said. “When we bring in traditional engineering analysis and actually snap it with telemetry — metering data, SCADA data — I would actually argue AI is far more accurate because it is data driven on actual measurements, rather than hypothetical planning analysis based on scenarios.”

For more critical real-time applications, ThinkLabs deploys what Wong called “hybrid models” that blend AI computation with traditional physics-based simulation. In the most stringent use cases, the AI handles roughly 99% of the computational workload before handing off to a physics-based engine for final validation — a technique Wong described as using AI to “warm start” the simulation.

The company also monitors for model drift and maintains strict training boundaries. “We’re not like ChatGPT training the internet here,” Wong said. “We’re training on the possibility of grid conditions. And if we do see a condition where we did not train, or outside of our training boundary, we can always run on-demand training on those certain solution spaces.”

Why ThinkLabs says its value proposition survives even if the data center boom slows down

The bullish case for ThinkLabs — and for grid-focused AI more broadly — rests heavily on the assumption that electricity demand will surge dramatically over the coming decade.
But some analysts have begun questioning whether those projections are inflated, particularly if AI investment cycles cool and data center build-outs decelerate.

Wong argued that his company’s value proposition is resilient to that scenario. Even without dramatic load growth, he said, utilities face a fundamental modernization challenge. They have been using tools and processes from the 1990s and 2000s, and the workforce that knows how to operate those tools is retiring at an alarming rate.

“Workforce renewal is a big factor,” he said. “These AI tools not only modernize the tool itself, but also modernize culture and transformation and become major points of retention for the next generation.”

He also pointed to energy affordability as a driver that exists independent of load growth projections. If utilities continue to plan based on worst-case deterministic scenarios — building enough infrastructure to cover every conceivable contingency — consumer rates will become unmanageable. AI-powered probabilistic analysis, Wong argued, allows utilities to make smarter, more cost-effective decisions regardless of whether the most aggressive demand forecasts materialize.

“A large part of this AI is not only enabling workload, but how do we act with intelligence — going from worst-case to time-series analysis, from deterministic to probabilistic and stochastic analysis, and also coming up with solutions,” he said.

Wong frames the broader opportunity with an analogy that captures both the simplicity and the ambition of what ThinkLabs is attempting. For decades, he said, the utility industry’s default response to grid constraints has been the equivalent of building wider highways — more wires, more copper, more steel. ThinkLabs wants to be the navigation system that reroutes traffic instead.

“In the past, when we drive, we always drive with what we are familiar with — just the big roads,” he said.
“But with AI, we can optimize the traffic patterns to drive on much more effective routes. In this case, it might be a mix of wires, flexibility, batteries, and operational decisions.”

Whether ThinkLabs can deliver on that vision at the scale the grid demands remains an open question. But Wong, who has spent two decades building and selling grid software companies, is not thinking in terms of incremental improvement. He sees a narrow window — measured in years, not decades — during which the foundational AI infrastructure for the grid will be built, and whoever builds it will shape the energy system for a generation.

“I truly believe the next two years of AI development for the grid will dictate the next decades of what can happen to the grid,” Wong said. “It’s really here now.”

The grid, in other words, is getting a copilot. The question is no longer whether utilities will trust AI with their most critical engineering decisions, but how quickly they can afford not to.