Core Conclusions
Data security is moving from a "compliance afterthought" to the primary control plane for enterprise AI adoption. Microsoft 365 Copilot, Copilot connectors, Azure AI Search, Databricks Unity Catalog, Snowflake Horizon, AWS Bedrock Guardrails, and Google Cloud DSPM are all pushing "data permissions, labels, classification, auditing, and filtering" upstream into the AI call chain, rather than bolting security on after the model. Microsoft states explicitly that Copilot will only return organizational data the user has "at least view permission" to; connectors and Azure AI Search likewise place ACLs, permission filtering, and token-based authorization at the retrieval stage.
Once enterprise AI, RAG, and agents spread, the first thing exposed is not a "model capability" bottleneck but the "over-shared data surface" and the "ungoverned permission surface." Copilot, RAG, and agents do not create a new enterprise permission system; they amplify the misconfigured sharing and overly broad authorization already sitting in SharePoint, OneDrive, email, tickets, CRM, databases, data lakes, and knowledge bases. Microsoft, Azure AI Search, Elastic, Databricks, and Snowflake all stress document-level security, row filters, column masks, ACL/DLS, ABAC, and sensitivity labels as prerequisite capabilities in their official documentation.
DSPM, DLP, DDR, data access governance, and RAG permission governance are not parallel sectors; in the AI era they string together into one chain. DSPM finds "where the sensitive data is, who can access it, and whether it is exposed"; DLP intercepts "exfiltration"; DDR detects "anomalous access and movement"; access governance and RAG permission control ensure "only the right people and agents see the right content." Google Cloud DSPM, IBM Guardium, Rubrik DSPM, OpenText, Thales, Broadcom, Cloudflare, and Microsoft Purview are all converging on this chain.
AI agent security is fundamentally a crossover problem of "identity security + data permissions + tool-call governance." AI agents add large numbers of machine and non-human identities; CyberArk reports that machine identities already outnumber human identities 82 to 1, and its 2025 identity security report names AI as the leading source of new privileged identities; CrowdStrike's $740 million acquisition of SGNL places human, NHI, and AI identities directly inside a continuous identity control framework; Microsoft, Databricks, Cloudflare, and Anthropic all treat the permissions and auditing of agents, MCP, or connectors as core elements.
The first budgets to materialize are not "pure AI security stories" but four categories tightly bound to the existing data surface: Microsoft 365 / SharePoint / OneDrive permission cleanup and classification; multi-cloud / SaaS DSPM; GenAI DLP / prompt DLP; and document-level security for high-value RAG / enterprise search. These budgets connect directly to Copilots, knowledge-base assistants, customer-service assistants, developer assistants, internal search, and compliance auditing already in production.
Platform winners and AI-native challengers will coexist. The platform winners are mainly Microsoft, Google/Wiz, AWS, Snowflake, Databricks, Oracle, IBM, Palo Alto, CrowdStrike, and Cloudflare; the AI-native challengers are mainly Cyera, BigID, Sentra, Concentric AI, Securiti, Privacera, Veza, Noma Security, and Lasso Security. The former rely on distribution, identity, cloud, data platforms, and existing customer bases; the latter rely on stronger data discovery, data graphs, cross-cloud and cross-SaaS coverage, AI-native governance, and faster iteration.
Few listed companies are direct beneficiaries with financial validation already in hand. Varonis, CyberArk (acquisition by Palo Alto completed), Snowflake, MongoDB, Elastic, Trend Micro, Cloudflare, CrowdStrike, Palo Alto, Microsoft, and Google Cloud all have clearer product-customer-platform paths; yet many of them do not break out "AI data security revenue" as a line item, and it shows up instead as platform expansion, RPO/cRPO, large customer deals, and higher product attach rates.
Companies with a "strong AI data security narrative but insufficient revenue evidence" clearly outnumber the genuinely quantifiable beneficiaries. Cloudflare's AI-SPM / prompt protection, Palo Alto's Prisma AIRS, CrowdStrike's Falcon Data Security, BigID's AI governance, and companies like Noma/Lasso/Prompt Security move fast on product, but most do not publicly break out ARR or revenue contribution; for investment purposes, whether a product is embedded in the strong distribution chain of an existing large platform should carry more weight than release count alone.
The sectors with the greatest revenue elasticity are: multi-cloud DSPM, enterprise knowledge base permission governance, agent data access control, GenAI DLP, and data access governance. These sit closest to existing budgets and are most tightly bound to the real-world adoption of Copilot, internal search, data lakehouses, and externally connected SaaS systems. By contrast, vector database security, AI memory security, privacy-enhancing computation, and confidential computing remain proof-of-concept or specialized purchases at many enterprises.
The best long-term margins live not in scanning itself but in the "control plane." Specifically: the permission graph, label/classification engine, policy engine, retrieval authorization, access auditing, key and encryption orchestration, and a unified AI Gateway/Policy plane. These layers are inherently more software-like and platform-like, and easier to reuse across multiple use cases. Databricks Unity Catalog, Snowflake Horizon, Microsoft Purview, Oracle Deep Data Security, and Thales CipherTrust all reflect this.
The most bubble-prone valuations are: high-growth security platforms that talk "AI security" without clear revenue attribution; and the "high funding + low transparency" names among private AI-native security companies. Cyera has already risen to a $9 billion valuation in early 2026, and Databricks' Series K in 2025 valued it above $100 billion; the fundamentals of such names may be excellent, but the market has already paid for a great deal of expectation in advance.
Platform squeeze will be very pronounced. Google Cloud has already built DSPM into Security Command Center; Microsoft binds Purview, Copilot, Graph connectors, and Azure AI Search into one whole; Snowflake, Databricks, and Oracle are also unifying vector retrieval, catalog, labels, permissions, auditing, and AI Gateway into the data platform. Standalone tools that cannot deliver better graphs, precision, and remediation across cloud, SaaS, and data surfaces will be absorbed by the platforms.
Those at the highest risk of being disrupted are traditional point DLP, static data governance, and weak-graph compliance tools. Broadcom Symantec DLP, legacy email/endpoint DLP, tools that do catalog only without runtime access control, and tools that produce only passive compliance reports will all be squeezed by the platform-style unified "classification + permission + retrieval + logging + policy" control plane.
The biggest catalysts over the next 12–24 months are: production deployment of in-house Copilots and agents; product integration after Wiz is consolidated into Google Cloud; cross-selling of the AI data protection components of Palo Alto / CrowdStrike / Cloudflare; the native permissioning of Snowflake Horizon and Databricks Unity Catalog for agents/RAG; and a continued wave of large data security M&A.
The biggest risks over the next 12–24 months are: enterprise AI projects failing to reach production; customers prioritizing models and compute over governance; platform-native features dropping in price or becoming free; insufficient data classification precision causing false blocks; and cross-border data and AI regulation tightening in tandem. The EU AI Act will become fully applicable on August 2, 2026 (with exceptions for some provisions), the U.S. DOJ's bulk sensitive data rule has taken effect, and China's cross-border data rules continue to tighten.
Value Chain Landscape and Demand Re-rating
The reason data security becomes the core bottleneck once enterprise AI, RAG, and agents are deployed is not that enterprises suddenly "care more about security," but that AI turns what used to be dispersed, static, low-frequency data access into high-frequency, cross-system access amplified by reasoning. Microsoft states clearly that Copilot uses a user's files, email, chats, meetings, and other context through Microsoft Graph; as long as the user has permission, Copilot will use this content as grounding data. For enterprises, this means files that were historically "shared with too many people but never actively sought out" become highly reachable data that "a single natural-language question can surface and summarize."
RAG amplifies this problem further. Microsoft 365 Copilot connectors and Azure AI Search both make "document-level permissions" a core mechanism; Databricks Unity Catalog recommends using ABAC for centralized row/column filtering; Snowflake uses row access policies, dynamic masking, and tag-based masking; Elastic provides document-level / field-level security; Oracle's 26ai introduces Deep Data Security, binding row/column/cell-level access control directly to agentic AI. This is itself an industry signal: if permission inheritance were not a hard problem, platforms would not be rolling out these controls so densely. The difficulty is that embedding, indexing, retrieval, reranking, and tool use often happen outside the original data store, so ACLs, labels, identities, group information, and sensitivity metadata must be synchronized explicitly; otherwise the vector index becomes a "second data surface" detached from source-system authorization. This judgment is an industry inference based on each platform's product design.
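The synchronization requirement described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any vendor's API: the `Chunk` class, the `allowed_groups` field, the `retrieve` function, and the sample index are all hypothetical, and a plain dot product stands in for a real vector search.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """An indexed chunk that carries ACL metadata copied from the source system."""
    doc_id: str
    text: str
    embedding: tuple[float, ...]
    allowed_groups: frozenset[str]  # assumed to be synchronized from the source ACL at index time


def dot(a, b):
    """Toy similarity score; a real system would use its vector store's ranking."""
    return sum(x * y for x, y in zip(a, b))


def retrieve(query_embedding, user_groups, index, top_k=3):
    """Security-trim first, then rank, so unauthorized chunks never reach the ranker."""
    visible = [c for c in index if c.allowed_groups & user_groups]
    ranked = sorted(visible, key=lambda c: dot(query_embedding, c.embedding), reverse=True)
    return ranked[:top_k]


# Hypothetical sample index with three chunks and different source ACLs.
INDEX = [
    Chunk("hr-001", "Salary bands for 2026", (0.9, 0.1), frozenset({"hr"})),
    Chunk("kb-042", "How to request a laptop", (0.2, 0.8), frozenset({"all-employees"})),
    Chunk("fin-007", "Draft acquisition memo", (0.8, 0.3), frozenset({"finance", "exec"})),
]

if __name__ == "__main__":
    for chunk in retrieve((1.0, 0.0), frozenset({"all-employees"}), INDEX):
        print(chunk.doc_id, chunk.text)
```

In this sketch the filter runs before ranking, so the top-k slots are filled only from chunks the caller may see. If the group metadata on a chunk goes stale relative to the source system, the filter silently enforces the wrong policy, which is the "second data surface" risk the paragraph describes.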
The risk introduced by AI agents is more complex than RAG. RAG is mainly "read"; agents may "read + write + call tools + call APIs + persist memory + execute autonomously." Anthropic defines MCP as an open connection standard between AI tools and data sources; Databricks positions Unity AI Gateway as a unified governance layer across LLMs and MCP; Cloudflare released Mesh to protect the AI agent lifecycle; Palo Alto folded Portkey into Prisma AIRS; CrowdStrike, through SGNL, extends continuous identity control to AI identities. The industry's product roadmap already makes the point: agent security is not an extension of traditional prompt protection, but runtime data access governance.
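The difference between read-only retrieval and agent actions can be made concrete with a small runtime gate. The sketch below is an assumption-laden illustration rather than a description of any product named above: the `POLICY` table, the `APPROVAL_REQUIRED` set, the agent and tool names, and the `authorize` function are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    NEEDS_APPROVAL = "needs_approval"


@dataclass(frozen=True)
class ToolCall:
    agent_id: str
    tool: str
    action: str  # "read" or "write"


# Hypothetical policy: which actions each agent identity may take on each tool.
POLICY = {
    "support-agent": {"crm": {"read"}, "ticketing": {"read", "write"}},
}

# Hypothetical rule: these (tool, action) pairs need human approval before execution.
APPROVAL_REQUIRED = {("ticketing", "write")}

# Every decision is recorded so the agent's behavior can be reconstructed later.
AUDIT_LOG: list[dict] = []


def authorize(call: ToolCall) -> Decision:
    """Decide on a tool call before it runs, defaulting to deny, and log the outcome."""
    allowed = POLICY.get(call.agent_id, {}).get(call.tool, set())
    if call.action not in allowed:
        decision = Decision.DENY
    elif (call.tool, call.action) in APPROVAL_REQUIRED:
        decision = Decision.NEEDS_APPROVAL
    else:
        decision = Decision.ALLOW
    AUDIT_LOG.append({
        "agent": call.agent_id,
        "tool": call.tool,
        "action": call.action,
        "decision": decision.value,
    })
    return decision


if __name__ == "__main__":
    print(authorize(ToolCall("support-agent", "crm", "read")))
    print(authorize(ToolCall("support-agent", "crm", "write")))
    print(authorize(ToolCall("support-agent", "ticketing", "write")))
```

The design choice worth noting is that the gate keys on the agent's identity and the requested action, not on the prompt text, and that unknown agents or tools fall through to a deny. That is one way to read the claim that agent security is runtime data access governance rather than an extension of prompt filtering.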
The table below lays out the AI data security value chain from an investment perspective, focusing not on "who can tell a story" but on who sits in the control plane, who can charge for it, and who can build a platform moat.
| Value Chain Position | Segment | Core Products | AI Data Security Driver | Revenue Model | Competitive Moat | Margin Profile | Representative Companies | Listing Status | Benefit Strength | Investment Elasticity |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Data sources | Documents/email/chat/tickets | SharePoint, OneDrive, Teams, email, Confluence, Jira, etc. | Over-sharing amplified by Copilot/RAG | Existing SaaS add-on security subscription | Customer data lock-in + native permission model | High platform gross margin | Microsoft, Atlassian, ServiceNow, Salesforce, Box | Listed | High | Medium |
| SaaS data | SaaS access governance | SaaS DSPM, SaaS DLP, SaaS posture | Shadow AI, third-party apps, external sharing | seat/tenant/usage | API coverage + behavioral data | Medium-high | Microsoft Purview, Cloudflare, Reco, Grip, DoControl | Mixed | High | High |
| Cloud storage | Object storage sensitive data discovery | Macie, Google DSPM, Alibaba DSC | PII/PHI/PCI in S3/GCS/OSS | usage + scan volume | Cloud-native telemetry | Medium | AWS, Google Cloud, Alibaba Cloud | Mixed | High | Medium |
| Data lakehouse | Catalog/labels/permissions | Unity Catalog, Horizon | RAG/AI apps connect directly to the lakehouse | Platform bundling + higher tier | Native control on the data surface | High | Databricks, Snowflake | Mixed | Very high | Very high |
| Databases | Row/column-level security/encryption | row policy, masking, queryable encryption | Agents accessing transaction and customer databases | edition/consumption | Kernel integration | High | Oracle, Snowflake, MongoDB, IBM | Listed | High | Medium-high |
| Vector databases | Retrieval filtering/endpoint ACL | vector ACL, metadata filter | Secondary exposure via embeddings | seat/usage | Difficulty of syncing with source permissions | Medium | Databricks, MongoDB, Elastic, Oracle | Listed/Private | Medium-high | High |
| Enterprise knowledge base | Document-level permission governance | DLS/ACL sync/semantic security trimming | A hard requirement for enterprise search and RAG | Search/security upsell | ACL inheritance + index quality | High | Azure AI Search, Elastic, Box | Listed | Very high | High |
| Data catalog | catalog / metadata / glossary | Horizon, Collibra, Alation, Atlan | AI needs to know "what data is available" | platform subscription | Metadata network effects | High | Snowflake, Collibra, Alation, Atlan | Mixed | Medium-high | Medium |
| Data lineage | lineage | Unity Catalog lineage, Snowflake external lineage | AI auditing, traceability for training/inference | Platform bundling | Metadata depth | High | Databricks, Snowflake, BigID | Mixed | Medium-high | Medium |
| Data classification | Sensitive data identification and labeling | Purview, Google SDP, IBM Guardium Discovery | PII/PHI/PCI/source code/contract identification | tiered / usage | Classifier precision and coverage | High | Microsoft, Google, IBM, BigID | Mixed | Very high | High |
| DSPM | Data discovery, exposure surface, risk scoring | Google DSPM, Cyera, BigID, Sentra, Concentric, Rubrik DSPM | "Discover first, then govern" is a prerequisite before AI projects go live | data source / TB / account | Cross-cloud, cross-SaaS graph | High | Google/Wiz, Cyera, BigID, Sentra, Concentric AI, Rubrik | Mixed | Very high | Very high |
| DLP | Email/endpoint/network/browser/GenAI DLP | Purview DLP, Symantec DLP, Cloudflare DLP | Prompt, copy, upload, exfiltration | seat / endpoint / traffic | Channel and inline deployment | Medium-high | Microsoft, Broadcom, Cloudflare, Zscaler | Listed | High | Medium-high |
| DDR | Data detection and response | Guardium DDR, Varonis / data activity analytics | Anomalous access and exfiltration detection | platform + module | Access logs and behavioral models | High | IBM, Varonis | Listed | High | High |
| Data access governance | entitlement graph / policy | Privacera, Immuta, Veza, Wiz CIEM+DAG | Least-privilege for agents, cross-system authorization | platform subscription | Identity-data relationship graph | High | Privacera, Immuta, Veza, Wiz | Mixed | Very high | High |
| RAG permission governance | permission-aware retrieval | Azure AI Search DLS, Elastic DLS, Privacera PAIG | A hard requirement for enterprise RAG | Search/platform add-on | Retrieval authorization + citation + audit | High | Microsoft, Elastic, Privacera | Mixed | Very high | Very high |
| Agent data access control | runtime policy / approval / logs | Prisma AIRS, Unity AI Gateway, CyberArk Secure AI Agents, Cloudflare Mesh | A gate before agents execute autonomously | Platform add-on + premium | policy plane + identity + logs | High | Palo Alto, Databricks, CyberArk, Cloudflare | Listed/Acquired | Very high | Very high |
| AI data governance | prompt/output/data lineage/usage policy | BigID, Securiti, Databricks, Snowflake | EU AI Act, auditing, and model accountability | platform + governance module | Regulatory mapping + policy engine | High | BigID, Securiti, Databricks, Snowflake | Mixed | High | High |
| Privacy and compliance | DSAR, consent, cross-border | OneTrust, Securiti, TrustArc, Transcend | AI data usability constrained by regulation | subscription | Regulatory knowledge base + workflow | Medium-high | Securiti, OneTrust, TrustArc | Private | Medium-high | Medium |
| Encryption and key management | KMS, tokenization, BYOK/HYOK | Thales CipherTrust, AWS KMS, MongoDB QE | "Data-in-use protection" and external key control | license + usage | Key root of trust | High | Thales, AWS, MongoDB | Mixed | Medium-high | Medium |
| Cloud vendor data security | native DSPM / DLP / guardrails | Google SCC DSPM, Macie, Purview | Strongest native distribution | bundle / consumption | Distribution and native telemetry | Very high | Microsoft, Google, AWS | Listed | Very high | Medium-high |
| AI app and agent platforms | AI gateway / observability / guardrails | Unity AI Gateway, Bedrock Guardrails, Portkey | Model-call governance | usage / request volume | Breadth of integration surface | Medium-high | Databricks, AWS, Palo Alto, Cloudflare | Mixed | High | High |
| Enterprise customer-side services | MSSP / consulting / managed | Managed data security, AI governance consulting | High implementation complexity | service fee + managed | Industry know-how | Medium | NTT Data, Infosys, TCS, Wipro, HCLTech | Listed | Medium | Medium |

The chain above also explains how budget boundaries are shifting: budgets that used to belong to IAM, DLP, SaaS security, cloud security, data governance, and GRC are being repackaged under "AI data security." The most direct change is that customers no longer just ask "do you have DLP," but instead ask "will Copilot see content it shouldn't," "can the agent read but not write," "is the RAG permission-aware," "are prompts and outputs audited," and "does the vector index inherit source permissions."
Under three scenarios, the budget path looks roughly as follows:
| Dimension | Conservative | Base | Aggressive |
| --- | --- | --- | --- |
| Assumption | Enterprises buy models and Copilot first, retrofit governance later | Copilot / enterprise search / light agents gradually go into production | Agents enter customer service, R&D, IT operations, and BI/Finance workflows |
| Enterprise AI adoption | High | High | Very high |
| RAG adoption | Medium | High | Very high |
| Agent adoption | Low | Medium | High |
| Change in data security budget | Reallocation within the overall security budget, little net new | A dedicated AI data security budget emerges, but still co-managed with cloud/identity/data platforms | Clear net new governance and audit budget; AI projects must come with it |
| Most-benefiting segments | M365 permission governance, basic DLP, S3/GCS/SaaS DSPM | DSPM, RAG permission governance, agent access control, GenAI DLP | Data access governance, identity/NHI, DDR, AI gateway, knowledge base security |
| Beneficiary companies | Microsoft, Google Cloud, AWS, Varonis | Microsoft, Google/Wiz, Palo Alto, CrowdStrike, Databricks, Snowflake, Cyera, BigID, Privacera | Palo Alto, CrowdStrike, the CyberArk path, Databricks, Snowflake, Cyera, Veza, Noma |
| Disrupted companies | Pure new-concept AI security startups | Traditional point DLP, catalog-only tools | Point tools that only do a "prompt firewall" |
| Key risk | AI projects delayed, budget goes to infrastructure first | Platform features descend too fast | False blocks, permission mismatches, customers unwilling to add complexity |

The base scenario is the most credible investment assumption right now: AI data security will become a standalone budget pool, but it will not become a fully isolated standalone market; instead it will compete for the control surface alongside identity, cloud, security platforms, and data platforms.
Technical Architecture and Sector Breakdown
When an enterprise-grade AI data security system is broken down into layers, the most valuable part is not a "single detection point" but the continuous control chain that runs from discovery, classification, and permission inheritance through to auditing and response. The table below consolidates the 17-layer architecture requested into a single investment framework.
| Architecture Layer | Problem Solved | Representative Capabilities | Long-term Moat | Risk of Being Replaced by Platform Built-in | Willingness to Pay | Notes |
| --- | --- | --- | --- | --- | --- | --- |
| Data discovery layer | Find shadow data, orphan data, ROT data | Scan object storage, SaaS, lakehouse, databases | Medium-high: connector coverage, efficiency, low intrusiveness | Medium | High | The DSPM base layer; Google DSPM, IBM, Cyera, BigID, and Sentra all build around it. |
| Data classification layer | Identify PII/PHI/PCI/code/contracts, etc. | classifier, rules, LLM + context | High: precision, low false positives, industry templates | Medium | Very high | Google Sensitive Data Protection, Snowflake classification, Purview, and OpenText all emphasize sensitive classification. |
| Sensitive data identification layer | Judge data value and risk level | labels, risk scoring, sensitivity labels | High | Medium | Very high | The more labels can be reused across DLP, RAG, and auditing, the higher the value. |
| Data catalog and lineage layer | Audit "where data comes from and where it flows" | catalog, lineage, external lineage | High: metadata network effects | Medium | Medium-high | Snowflake Horizon/External lineage, Databricks metadata layer. |
| Data permission graph layer | Know "who can access what" | ACL graph, entitlement map, DAG | Very high | Low-medium | Very high | The layer most likely to form a durable moat; Veza, Wiz, Privacera, and Immuta are closest here. |
| Identity and NHI mapping layer | Map users, service accounts, and agents to resources | PAM, machine identity, continuous identity | Very high | Low | Very high | AI agents increase the number of NHIs, sharply raising the importance of the identity layer. |
| RAG permission inheritance layer | Carry source permissions through at retrieval time | ACL sync, query token, security trimming | Very high | Medium | Very high | Azure AI Search, Microsoft connectors, and Elastic DLS are the most direct examples. |
| Vector database permission filtering layer | Keep embedding/retrieval within authorization | endpoint ACL, metadata filter, DLS | Medium-high | Medium-high | Medium-high | Databricks already has vector endpoint ACL/filters; many pure vector databases are still weak. |
| Prompt / input DLP layer | Keep sensitive data out of the model, block prompt injection | PII filter, prompt protection | Medium | High | Medium-high | AWS Bedrock, Cloudflare, and Purview all do this, but it is more easily absorbed by platforms. |
| Output DLP layer | Prevent model results from leaking | output filter, citation, redaction | Medium | High | Medium-high | As important as input DLP, but with weaker standalone pricing power. |
| Agent data access audit layer | Reconstruct what the agent did | payload logs, tool logs, approval logs | Very high | Medium | High | Databricks, Anthropic, and OpenAI all provide logging and monitoring. |
| Data anomaly behavior detection layer | Detect anomalous access, lateral movement, sensitive data movement | DDR, UEBA, activity analytics | High | Low-medium | High | IBM Guardium DDR and Varonis are the most typical paths. |
| Data leak response layer | Act automatically once risk is seen | quarantine, block, revoke, ticket | Medium-high | Medium | High | If it only discovers without disposing, the value is discounted. |
| Encryption and key management layer | Protect data at rest/in transit/partly in use | KMS, BYOK, queryable encryption | Very high | Low | Medium-high | But it is more infrastructure; the incremental elasticity is lower than permission governance. |
| Compliance and audit reporting layer | Satisfy GDPR/HIPAA/PCI/FINRA/EU AI Act | audit trail, retention, policy evidence | Medium | Medium | Medium-high | Easy to commoditize, but still a must-have to close deals. |
| AI governance policy layer | Connect data, models, agents, and policy | AI use policy, risk register, AI governance | Very high | Medium | Medium-high | BigID, Securiti, Databricks, and Snowflake are contesting this layer. |
| Security operations integration layer | Connect SOC / SIEM / tickets / SOAR | APIs, logs, playbooks | Medium-high | Medium | Medium | More about platform expansion than a standalone profit pool. |

Building on this architecture chain, the 30 segments raised can be compressed into five investment clusters most worth tracking:
| Investment Cluster | Included Segments | Commercialization Stage | Revenue Elasticity | Margin Outlook | Competitive Landscape | Investment Appeal |
| --- | --- | --- | --- | --- | --- | --- |
| Multi-cloud DSPM and data classification | DSPM, sensitive data discovery, SaaS data security, cloud data security, unstructured data risk | Already platformizing | Very high | High | Large platforms + AI-native startups | Highest |
| Data access governance and permission graph | Data access governance, permission graph, NHI/identity linkage | Heating up fast | Very high | Very high | Identity security firms + data governance firms + startups | Highest |
| RAG / Enterprise Search permission governance | permission-aware RAG, knowledge base security, vector database filtering, document-level security | Moving from PoC to hard requirement | Very high | High | Microsoft / Elastic / Privacera / platform built-in | Very high |
| GenAI DLP and agent runtime control | Prompt DLP, output protection, agent data access, memory security, action approval | Early-to-mid stage | High | Medium-high | PANW / Cloudflare / Databricks / startups | High |
| AI data governance and compliance | training/inference data governance, AI auditing, data sovereignty, privacy compliance | Mid-stage | Medium-high | High | BigID / Securiti / OneTrust / platform companies | High |

Among these, the layer most likely to form a durable moat is the integrated control plane of "data permission graph + classification labels + retrieval authorization + log auditing"; the layer most likely to be replaced by cloud-vendor built-ins is basic scanning, baseline checks, and point prompt DLP; the layer most likely to produce "good product but hard to monetize" is AI security point products that only detect without closing the loop on disposition. This judgment is consistent with the trend of platforms like Google, Microsoft, Databricks, Snowflake, and Oracle continuing to build governance capabilities into the data plane and the AI plane.
Company Tiering and Investment List
First, a high-density investment list. Here I tier by "direct beneficiary / indirect beneficiary / platform beneficiary / AI-native challenger / at risk of platform squeeze," and try to separate "product releases" from "revenue landing."
Priority Research Matrix for Listed Companies
| Company | Region/Ticker | Segment | Core Products | AI/RAG/Agent Data Security Benefit Path | Financial/Commercialization Evidence | Category | Valuation Observation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Microsoft | US/MSFT | Platform beneficiary | Purview, Copilot, Graph connectors, Azure AI Search | Sits directly at the center of enterprise knowledge bases, permission models, compliance, and retrieval authorization; Copilot itself drives demand for Purview/permission governance | FY25 commercial RPO $368 billion; FY26 Q2 commercial RPO $625 billion; tight binding of Copilot/Graph/Purview. | Category A | Large scale, high certainty, medium elasticity; not cheap but not reliant on a single narrative |
| Alphabet / Google Cloud / Wiz | US/GOOGL | Platform beneficiary | Google Cloud DSPM, Sensitive Data Protection, Wiz | Google has already platformized cloud and AI security; Wiz provides multi-cloud data graphs and DSPM | Acquisition of Wiz completed in March 2026; Google Cloud has officially integrated DSPM into SCC. | Category A | Strong platform-integration elasticity, but security revenue still hard to break out |
| Palo Alto Networks | US/PANW | Platform beneficiary | Prisma AIRS, Protect AI, Portkey, identity platform | Expanding from AI model security into agent lifecycle, LLM gateway, and runtime security | Q2 FY26 revenue $2.594 billion, RPO $16 billion; completed Protect AI in 2025, plans to acquire Portkey in 2026, AIRS 3.0 targets agentic AI. | Category A/B | Strong logic, fast M&A, market expectations already elevated |
| CrowdStrike | US/CRWD | Platform beneficiary | Falcon Data Security, Charlotte AI, SGNL | Entering the AI data surface via endpoint + identity + data protection | FY25 revenue $3.95 billion, up 29% year over year; identity security ARR over $435 million; plans to acquire SGNL in 2026. | Category A/B | Strongly platformized, but valuation already significantly priced in ahead |
| Zscaler | US/ZS | DLP / zero trust | Inline DLP, SSE, GenAI controls | Suited to browser, SaaS, upload/exfiltration, and Shadow AI scenarios | Clear product path, but no broken-out AI data security revenue disclosure found this round; overall more of a platform enhancement. | Category B | Right theme; needs clearer revenue attribution |
| Varonis | US/VRNS | DDR / data permission governance | Data security platform, Copilot risk governance, MDDR | The most direct beneficiary of governing SharePoint/OneDrive/email over-sharing | 2025 ARR $745.4 million, SaaS ARR $638.5 million; Q1 2026 revenue and SaaS ARR guidance continue to grow. | Category A | High purity, high elasticity; valuation not low but still has fundamental support |
| Rubrik | US/RBRK | DSPM + data recovery | Rubrik DSPM, data recovery, Annapurna roadmap | If AI data security emphasizes "discovery + recovery + resilience," Rubrik benefits clearly | Has officially launched DSPM, but AI-related revenue not separately disclosed; still needs ongoing validation of sales mix. | Category B | Good logic, but more "platform expansion" than validated standalone revenue |
| Snowflake | US/SNOW | Data platform security | Horizon Catalog, row policy, masking, lineage | Sits at the core of the enterprise data lakehouse and AI data cloud | FY26 Q4 product revenue $1.23 billion, up 30% year over year; NRR 125%; 733 customers over $1 million; RPO $9.77 billion. | Category A | Strong long-term moat, but the market has partly priced in the "AI data cloud" path |
| MongoDB | US/MDB | Database/vector/encryption | Atlas, Vector Search, Queryable Encryption | AI apps often place operational DB + vector search together, so security capabilities can grow directly with usage | FY26 revenue $2.46 billion, up 23% year over year; Atlas up 29% year over year; over 65,200 customers. | Category A/B | Not a pure security name, but with high direct exposure to the AI data surface |
| Elastic | US/ESTC | Enterprise search/RAG/security | Elasticsearch, DLS/FLS, AI Assistant | Enterprise search, SOC AI assistants, and vector retrieval all need DLS/FLS | FY26 Q3 revenue $450 million, up 18% year over year; sales-led subscription $376 million, up 21% year over year; 1,660+ customers with $100K ACV. | Category A/B | High purity in RAG permission governance, with market attention still below the large platforms |
| Cloudflare | US/NET | GenAI DLP / AI-SPM / agent network security | AI Gateway, AI prompt protection, AI-SPM, Mesh | Entering via network ingress, browser, SASE, and the agent network | Q1 2026 revenue $639.8 million, up 34% year over year; current RPO up 34% year over year; has released AI prompt protection, AI-SPM, and Mesh, but does not break out revenue. | Category B | Excellent product cadence, but valuation is sensitive to AI expectations |
| Oracle | US/ORCL | Database / agent-native data security | Oracle AI Database, AI Vector Search, Deep Data Security | The "don't copy data to an external vector database" narrative fits enterprise security demands very well | Oracle 26ai launches Deep Data Security with built-in vector search; the path is strong for finance and government/large enterprises. | Category A/B | Security upside potentially underestimated, but customer adoption needs watching |
| IBM | US/IBM | DDR / discovery and classification / key governance | Guardium, Guardium DDR, Key Lifecycle Manager | Directly covers the full Discover / Classify / DDR / key mgmt chain | IBM explicitly uses Guardium for data discovery, classification, DDR, and key lifecycle management, but AI data security revenue is not broken out. | Category B/C | Strong defensive profile, with elasticity below pure-security SaaS platforms |
| Okta | US/OKTA | Identity / access | Workforce Identity, governance | Identity governance for AI agents / apps / connectors needs a strong identity foundation | Highly relevant to AI data security, but no direct data security revenue validation found this round. | Category C | More of a "necessary foundation," not the most direct beneficiary of data security revenue |
| Trend Micro | Japan/4704 | Cloud security + AI security platform | Trend Vision One, AI security platform | Can carry AI risk detection and link cloud and data risk | 2025 enterprise ARR over $1.3 billion, large-enterprise platform ARR $467 million, Q4 enterprise net sales up 8% year over year. | Category B | An Asia-Pacific representative with a clear platformization path |

Important Private Company Observation Matrix
| Company | Country/Region | Segment | Core Products | Funding/Valuation | Known Commercialization Signals | Competitive Relations | Assessment |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Cyera | Israel/US | DSPM | Multi-cloud data discovery, classification, access analytics, AI data security | Raised another $400 million in January 2026, valued at $9 billion. | Fast customer expansion, but ARR not public | Pressures BigID, Sentra, Google/Wiz, Rubrik | One of the AI-native DSPM names most worth tracking |
| BigID | US | DSPM + AI governance | Data discovery, classification, AI governance, vector DB / agent governance | Company says 2024 revenue passed $100 million. | Has a revenue and platformization base | Competes across Microsoft/Purview, Cyera, Securiti | Closest to a "platform-type AI data security startup" |
| Sentra | Israel/US | DSPM | Cloud-native DSPM, archive scanning, data attack surface | $50 million Series B in 2025. | Emphasizes AI-ready data protection, ARR undisclosed | Competes with Cyera, BigID, Concentric | Worth tracking, but transparency still insufficient |
| Concentric AI | US | DSPM / unstructured data risk | Semantic intelligence, DSPM, DLP | $45 million Series B in 2024. | Strong in unstructured data and permission semantics | Challenges Varonis, BigID, Sentra | Worth tracking in the AI knowledge base security direction |
| Securiti | US | AI data governance / privacy | Data+AI security, privacy ops, agent governance | Acquired by Veeam; website continues to advance Agent Commander. | Strong regulatory/privacy semantics | Competes with OneTrust, BigID, Privacera | Strong regulatory drivers, but the post-acquisition pace needs watching |
| Privacera | US | Data access governance / RAG | PAIG, vector DB/RAG access control | Public news shows a 2026 rebrand to Trust3 AI. | Launched vector DB / RAG access control back in 2024 | Competes with Databricks/Snowflake/Immuta | A highly relevant name in RAG permission governance |
| Immuta | US | Data access governance | Dynamic access control, cloud data access governance | $100 million raised in 2022, $267 million total funding. | Commercially mature, but funding updates sparse in recent years | Competes with Privacera, Veza, and platform-native governance | Needs validation of whether growth re-accelerates |
| Veza | US | Permission graph / identity-data governance | Access graph, entitlement intelligence | $108 million Series D in 2025, valued at $808 million. | Backed by Snowflake/Atlassian/Workday Ventures | Spans identity and data governance | Very much worth tracking, could become an M&A target |
| Noma Security | Israel | AI/Agent security | AI app, RAG, agent runtime security | $100 million Series B in 2025, $132 million total funding. | Growing very fast but revenue undisclosed | Competes with the PANW/Cloudflare/Protect AI path | A textbook AI-native challenger |
| Lasso Security | Israel | GenAI / LLM security | Prompt / LLM cybersecurity | $6 million seed round in 2023; later materials show cumulative funding has increased. | Clear direction, insufficient transparency | Competes with Cloudflare/PANW/peer startups | More of a technology bet |
| Protect AI | US | AI security platform | Model-to-runtime AI security | Acquired by Palo Alto in 2025. | The acquisition validates the sector's value | Already folded into PANW | Already an M&A pricing anchor |
| OneTrust / TrustArc / Transcend | US | Privacy and compliance | DSAR, consent, policy | Funding and valuation mostly outdated or need separate checking | Relevant to AI data usage compliance, but not fully overlapping with runtime data security | Partly overlaps with Securiti / BigID | More of a compliance beneficiary, not the strongest AI security elasticity |
| Collibra / Alation / Atlan | Europe/US/India | Catalog and lineage | Catalog, lineage, governance | Collibra valued at $5.25 billion in 2021. | Important in AI governance, but security monetization needs validation | Competes with Snowflake/Databricks platform features | Catalog still matters, but pure investment elasticity below DSPM |
| Reco / Grip / DoControl / Adaptive Shield | Israel/US | SaaS data security | SaaS posture / SaaS DLP / access | Funding and ARR not systematically verified this round | Benefit from SaaS + AI app expansion | Also face absorption pressure from Microsoft/Cloudflare | Worth tracking, but each needs individual validation |

Looking at the investment tiers:
Category A: core direct beneficiaries of AI/RAG/Agent data security—Microsoft, Google/Wiz, Varonis, Snowflake, Databricks (private), Cyera, BigID, and the Privacera/Veza path. The common trait: they sit at the core of the data control surface and connect directly to enterprise knowledge bases, lakehouses, retrieval, permissions, and catalogs.
Category B: clear beneficiaries, but with higher valuation or platform-squeeze risk—Palo Alto, CrowdStrike, Cloudflare, MongoDB, Elastic, Oracle, Trend Micro, Sentra, and Concentric. The common trait: products are strongly correlated with demand, but AI data security may not yet be broken out as a primary financial driver.
Category C: more defensive beneficiaries—IBM, Okta, Thales, Broadcom, OpenText, and AWS. Capabilities matter, but near-term financial elasticity is not necessarily the strongest.
Category D: strong narrative, insufficient financial validation—a large number of "AI security startups" and platform add-on modules, especially companies that only do a prompt firewall, LLM scanning, or an advisory layer.
Category E: high risk of platform consolidation—traditional point DLP, catalog-only tools, weak runtime governance products, and data governance tools that lack a permission graph and remediation.
Key Listed Companies and Valuation Observations
Below are the 15 listed companies most worth continued secondary research. Because many companies do not break out AI data security revenue, the following is better treated as a "research-priority and expectation-gap list" than a simple valuation table.
| Company | Sector Positioning | Commercialization Stage | Key Financial/Customer Metrics | AI Data Security Evidence | Current Market Expectation | Research Conclusion |
| --- | --- | --- | --- | --- | --- | --- |
| Microsoft | Enterprise knowledge base + permission control plane | Mature, demand expanding with Copilot | Commercial RPO $368 billion in FY25, FY26 Q2 commercial RPO $625 billion; share price about $423.54, market cap about $3.15 trillion. | Purview, Copilot, Graph connectors, and Azure AI Search integrate permissions and retrieval. | The market fully recognizes its AI main line, but may not fully price in Purview's secondary benefit | High certainty, low purity, high long-term moat |
| Google / Wiz | Multi-cloud DSPM + CNAPP + AI security | In the platform-consolidation period | Wiz consolidated into Google Cloud in March 2026; GOOGL market cap about $4.81 trillion. | Google already natively provides DSPM and sensitive data classification. | M&A integration and business-model synergy still to be seen | High certainty, high platform-suppression power |
| Palo Alto Networks | AI security platform + agent runtime + identity | Rapid expansion | Q2 FY26 revenue $2.594 billion, up 15% year over year; RPO $16 billion, up 23% year over year; share price about $247.55, market cap about $176 billion. | Protect AI, Prisma AIRS 3.0, and the Portkey acquisition all point to the agentic AI lifecycle. | Expectations clearly elevated; scrutinize M&A delivery and attach rate | High elasticity, valuation running hot |
| CrowdStrike | Identity + data protection + AI agents | Expansion | FY25 revenue $3.95 billion, up 29% year over year; identity ARR over $435 million; share price about $618.83, market cap about $155.5 billion. | Falcon Data Security, Charlotte AI, SGNL. | The market sees it as one of the core AI security platforms | High quality, high expectations, guard against valuation pullback |
| Zscaler | Inline DLP / browser / SaaS control | Mid-to-late stage | Share price about $174.69, market cap about $27.9 billion. | A natural fit for GenAI DLP scenarios, but no broken-out AI data security revenue found in this round's materials. | Moderately high expectations | Worth tracking, needs revenue validation |
| Varonis | Data permission governance / DDR | Direct-benefit period | ARR $745.4 million; SaaS ARR $638.5 million; share price about $28.78, market cap about $3.33 billion. | Copilot/SharePoint over-sharing risk closely matches its product. | High purity but smaller scale, high elasticity | One of the public pure-data-security names most worth digging into |
| Rubrik | DSPM + recovery | Early-to-mid expansion | Share price about $64.98, market cap about $12.89 billion. | Has officially made DSPM part of the suite. | The market views it more as a recovery/resilience company | Medium-high certainty, an AI expectation gap may exist |
| Snowflake | Lakehouse security control plane | Validated | FY26 Q4 product revenue $1.23 billion, up 30% year over year; NRR 125%; RPO $9.77 billion; 733 customers at $1 million+; share price about $164.24, market cap about $55.78 billion. | Horizon makes governance for AI a core selling point. | The market has partly priced in the AI data cloud, but security monetization is not yet fully unfolded | Strong long-term moat, continue to track closely |
| MongoDB | Operational + vector + encryption | Validated | FY26 revenue $2.46 billion, up 23% year over year; Atlas up 29%; over 65,200 customers; share price about $330, market cap about $26.86 billion. | Atlas places operational and vector on the same platform; Queryable Encryption provides the "server doesn't know the plaintext" security property. | The AI data platform attribute is gradually being re-rated by the market | Medium-high certainty, medium-high elasticity |
| Elastic | Enterprise search/RAG security | Validated | FY26 Q3 revenue $450 million, up 18% year over year; sales-led subscription $376 million, up 21% year over year; 1,660+ customers at $100K+ ACV; share price about $53.91, market cap about $5.73 billion. | DLS/FLS is directly relevant to RAG permission governance logic. | The market's awareness of its "security + search + AI" combination is still insufficient | Large expectation gap |
| Cloudflare | AI gateway / prompt DLP / agent networking | Rapid-experimentation stage | Q1 2026 revenue $639.8 million, up 34% year over year; cRPO up 34%; share price about $201.75, market cap about $71.1 billion. | Prompt protection, AI-SPM, Mesh, and AI Gateway are all released. | Very strong AI expectations, high near-term valuation elasticity | Good company but valuation sensitive to the narrative |
| Oracle | Database-built-in AI security | Take-off stage | Share price about $186.61, market cap about $543.4 billion, P/E about 33.5x. | 26ai launches AI Vector Search and Deep Data Security, stressing no need to copy enterprise data to an external vector database. | The market focuses more on the cloud and database main line; AI data security may still be underestimated | A potential expectation-gap name |
| IBM | DDR / discovery and classification / keys | Mature | Share price about $222.75, market cap about $212.1 billion, P/E about 19.7x. | Guardium already covers discovery, classification, DDR, and key management. | More of a defensive allocation than a high-elasticity SaaS | Defensive beneficiary |
| Okta | Identity foundation | Mature | Share price about $87.04, market cap about $15.5 billion. | AI agents / apps / connectors expand identity and access governance demand, but its data security revenue chain is rather indirect. | Right logic, insufficient purity | Medium beneficiary, not a top-choice pure name |
| Trend Micro | AI security platform | Regional representative | Enterprise ARR over $1.3 billion, large-enterprise platform ARR $467 million. | Has put the AI Security Platform at the core of its platform narrative. | Asia-Pacific elasticity above global market perception | Worth adding to the Asia-Pacific watchlist |

Based on the table above, valuation and expectation gaps can be simplified into a rough judgment:
Expectations already largely priced in: CrowdStrike, Palo Alto, Cloudflare, Databricks (private), Cyera. Their common feature is a strong platform narrative, dense funding/M&A activity, and high market sentiment.
Possible remaining expectation gap: Varonis, Elastic, Oracle, some of the security sidelines at Snowflake and MongoDB, and unlisted permission-governance names such as Veza and Privacera. Their common feature: capabilities are already critical, but the market still prices them mainly on their existing business.
Good companies, but richly valued: CrowdStrike, Cloudflare, and parts of Palo Alto. Share price, market cap, and market sentiment are all high, so near-term performance depends more on expansion outpacing expectations.
Real revenue growth, with valuations that are still relatively open to research: Varonis, Elastic, MongoDB, and parts of Snowflake. "Relatively" is the operative word here, not "cheap."
Scoring Model and Current Ranking
I adopt the suggested weights with slight simplification: AI/RAG/Agent revenue exposure 25% + platform position and customer base 20% + permission/classification/governance moat 15% + product coverage 15% + financial quality 10% + growth elasticity 10% + valuation reasonableness 5%.
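As a worked illustration of how these weights combine, here is a minimal Python sketch. The weights are the ones stated above; the 0 to 10 scale, the factor key names, and the `example_scores` inputs are assumptions made purely for illustration and do not correspond to any company or to the scores behind this report's ranking.

```python
# Weights as stated in the scoring model above (they sum to 100%).
WEIGHTS = {
    "ai_revenue_exposure": 0.25,       # AI/RAG/Agent revenue exposure
    "platform_position": 0.20,         # platform position and customer base
    "governance_moat": 0.15,           # permission/classification/governance moat
    "product_coverage": 0.15,
    "financial_quality": 0.10,
    "growth_elasticity": 0.10,
    "valuation_reasonableness": 0.05,
}


def composite_score(factor_scores: dict) -> float:
    """Return the weighted sum of per-factor scores (assumed 0-10 scale)."""
    missing = WEIGHTS.keys() - factor_scores.keys()
    if missing:
        raise ValueError(f"missing factor scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * factor_scores[name] for name in WEIGHTS)


# Hypothetical inputs for an unnamed example company, not real data.
example_scores = {
    "ai_revenue_exposure": 8,
    "platform_position": 9,
    "governance_moat": 7,
    "product_coverage": 8,
    "financial_quality": 6,
    "growth_elasticity": 7,
    "valuation_reasonableness": 5,
}

print(f"{composite_score(example_scores):.2f}")  # prints 7.60
```

Because valuation carries only 5% of the weight, a richly valued company can still rank near the top on this composite, which is why the ranking is framed as research priority rather than as a valuation call.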
Given the current materials, the overall ranking (research priority, not investment advice) is roughly as follows:
| Ranking Group | Companies | Logic |
| --- | --- | --- |
| Group 1 | Microsoft, Google/Wiz, Snowflake, Varonis, Databricks, Palo Alto | Sit at the core of the control surface and can connect permissions/retrieval/governance/security into a platform |
| Group 2 | CrowdStrike, MongoDB, Elastic, Oracle, Cyera, BigID | Either strong platform coverage or a larger expectation gap |
| Group 3 | Cloudflare, Rubrik, Trend Micro, Privacera, Veza, Sentra | Right sector, but revenue attribution/scale expansion still needs tracking |
| Group 4 | IBM, Okta, Thales, Broadcom, OpenText | Important but more defensive, not the strongest earnings elasticity |
| Group 5 | Players that only do a prompt firewall, point LLM scan, or catalog-only | Easily built into platforms or squeezed on price |

In the reverse scoring for "platform consolidation risk," the highest risk is usually: point DLP, point AI-SPM, products that only do prompt/output filtering, catalog-only tools, and weak permission-graph governance vendors. The reason is that Microsoft, Google, AWS, Snowflake, Databricks, and Oracle have already pulled the key features into their platforms.
Risks, Open Questions, and Final Conclusions
The biggest risk is not that the security demand does not exist, but what form the demand takes, who captures it, and when it shows up in the financials. Current public materials support five key investment judgments.
First, AI data security will become the core control plane of the enterprise AI era, but it is not a standalone isolated market. It will deeply integrate with identity security, cloud security, data platforms, enterprise search, knowledge bases, and compliance auditing. In other words, it is more like a "control plane layer" than a forever-standalone single-product market.
Second, there are only five segments genuinely worth attention. They are: multi-cloud DSPM, data access governance and permission graph, RAG/enterprise search permission governance, GenAI DLP/agent runtime control, and AI data governance and compliance. This is the most central industry distillation of this report.
Third, the ten listed companies most worth in-depth research, ranked by "certainty × elasticity × platform position," in suggested priority order: Microsoft, Google/Alphabet, Palo Alto Networks, CrowdStrike, Varonis, Snowflake, MongoDB, Elastic, Oracle, Cloudflare. Among them, Microsoft/Google/Snowflake lean toward platform certainty, Varonis/Elastic/Oracle lean toward the expectation gap, and PANW/CRWD/NET lean toward high elasticity and high expectations.
Fourth, the ten private companies most worth continuous tracking are: Cyera, BigID, Sentra, Concentric AI, Securiti, Privacera, Veza, Noma Security, Immuta, Lasso Security. Among them, Cyera/BigID/Veza/Privacera have the highest strategic value; Noma/Lasso lean toward AI-native high-risk, high-payoff; and Immuta/Securiti need more validation of their growth curve.
Fifth, the five points the market most easily misunderstands are: one, AI data security is not the same as model security; two, prompt DLP is not the whole picture; three, vector databases do not naturally inherit source permissions; four, Copilot is not "secure," it "faithfully executes existing permissions"; five, what truly makes money is the control plane, not point detection.
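Point three above (vector databases do not naturally inherit source permissions) can be made concrete with a minimal Python sketch. It assumes the source document's ACL is copied onto each chunk at index time and that retrieval filters on it before ranking. The `Chunk` type, the `retrieve` function, the group names, and the similarity scores are all hypothetical and do not represent any vendor's API.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    source_doc: str
    text: str
    allowed_groups: frozenset  # ACL copied from the source document at index time
    score: float               # similarity score, assumed precomputed by the vector store


def retrieve(candidates: list[Chunk], user_groups: set, top_k: int = 3) -> list[Chunk]:
    """Filter by ACL first, then rank, so unreadable chunks never reach the prompt.

    A chunk with an empty ACL matches no group and is dropped (default deny).
    """
    visible = [c for c in candidates if c.allowed_groups & user_groups]
    return sorted(visible, key=lambda c: c.score, reverse=True)[:top_k]


# Hypothetical index contents.
chunks = [
    Chunk("board-deck", "Q3 acquisition plan", frozenset({"finance-leads"}), 0.93),
    Chunk("pricing-faq", "Public pricing tiers", frozenset({"all-employees"}), 0.81),
    Chunk("hr-handbook", "Leave policy", frozenset({"all-employees", "hr"}), 0.40),
    Chunk("orphan-export", "Unlabeled CSV dump", frozenset(), 0.88),
]

results = retrieve(chunks, user_groups={"all-employees"})
print([c.source_doc for c in results])  # prints ['pricing-faq', 'hr-handbook']
```

The highest-scoring chunk is excluded because the caller lacks the group on its ACL. If the ACL had not been copied at index time, similarity alone would have surfaced it, which is the over-sharing failure mode this section describes.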
Over the next 6–12 months, the metrics most worth tracking include:
Whether public companies begin to break out ARR, RPO, cRPO, and customer counts for AI data security / data protection / AI governance / identity for AI.
New permission and audit features in Microsoft Purview, Google Cloud DSPM, AWS Bedrock Guardrails, Snowflake Horizon, and Databricks Unity Catalog / AI Gateway.
Whether customers move RAG and agents from pilot to production; whether they begin to require document-level security, approval workflow, payload logging, and retention control.
Whether major M&A keeps happening on the DSPM / identity-data governance / AI gateway / RAG access control front line. Google-Wiz, PANW-Protect AI/Portkey, and CrowdStrike-SGNL have already pointed the direction.
Open Questions and Limitations
This report has tried to balance breadth against verifiability, but several types of information still have not been adequately disclosed by public companies:
Most companies do not break out AI data security revenue, and it can only be inferred from product releases, customer scenarios, platform attach rates, and RPO/ARR; "releases" must not be mistaken for "revenue."
Some traditional vendors and regional companies (especially A-shares, Hong Kong shares, Europe, Japan, Korea, and India) have insufficient AI data security granularity in this round of public materials, and are better placed on a second-round verification list than turned into strong conclusions this round.
Vector database security, agent memory security, privacy-enhancing computation, and confidential computing remain early-stage for now; they may be important in the short term but will not necessarily form a large revenue pool immediately.
Final Conclusion
If you single out the links in the AI value chain that can genuinely form a long-term profit pool, data security, DSPM, RAG permission governance, data access governance, and enterprise knowledge base security are the cluster that most deserves attention. They determine whether enterprise AI can move from demo to production, whether Copilot and agents can touch core data, and who bears the future regulatory and audit risk.
For investment, the narrower follow-on directions most worth prioritizing converge to four main lines:
DSPM, GenAI DLP, RAG permission governance, and agent data access control. These four main lines have both clear demand and the best odds of converting into revenue, platform attach rate, RPO/cRPO, and margin improvement over the next 12–24 months.
This report is based on public information and does not constitute investment advice. Markets carry risk; invest with caution.