Data pillar cost: DLP, classification, encryption and tokenisation
The data pillar of zero trust applies controls to the data itself: classification, encryption, loss prevention, tokenisation, and access policies that travel with the data. This page breaks down the cost of each sub-component, sizes the pillar by organisation size, explains why most programmes defer it to Phase 3, and covers the regulated-industry requirements that sharply change the budget profile.
Data pillar in CISA terms
The CISA Zero Trust Maturity Model v2.0 defines the data pillar across five capability areas: data inventory management, data categorisation, data availability, data access (the decision itself), and data encryption. The principle is straightforward: even after a perimeter compromise, the data itself must remain protected through encryption, labelling, and policies that travel with the data. NIST SP 800-207 frames the data resource as the ultimate object of zero trust protection; every other control exists to defend access to it.
The pillar has seven cost-bearing components in practice. Data classification assigns sensitivity labels to information so that downstream controls can enforce policy. Data loss prevention monitors and blocks unauthorised exfiltration across email, web, endpoint and cloud. Encryption at rest and in transit is a baseline expectation, included in every modern platform at zero marginal cost. Confidential computing extends encryption to data in use via hardware-backed memory encryption. Tokenisation replaces sensitive data with non-sensitive substitutes for scope reduction. Data discovery and DSPM (data security posture management) tools catalogue data across cloud stores to keep the inventory current. Key management handles the cryptographic keys that everything else depends on.
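To make the tokenisation component concrete, here is a minimal sketch of the vault-style approach, in which a random token stands in for the sensitive value and only the vault can map it back. The `TokenVault` class and its method names are hypothetical, and the in-memory dictionaries stand in for what a commercial platform would keep in a hardened, access-controlled store.

```python
import secrets


class TokenVault:
    """Toy in-memory token vault (illustrative only, not production-grade)."""

    def __init__(self) -> None:
        self._token_to_value: dict[str, str] = {}
        self._value_to_token: dict[str, str] = {}

    def tokenise(self, value: str) -> str:
        # Reuse the existing token so one input always maps to one token.
        if value in self._value_to_token:
            return self._value_to_token[value]
        # The token is random, so it cannot be reversed without the vault.
        token = "tok_" + secrets.token_hex(8)
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenise(self, token: str) -> str:
        # Only systems with vault access can recover the original value.
        return self._token_to_value[token]


vault = TokenVault()
token = vault.tokenise("4111111111111111")  # widely used test card number
```

Downstream systems that store only `token` never hold the original value, which is the property that scope reduction relies on.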
The pillar is the most often deferred for a structural reason: every data control is more effective when the earlier pillars are mature. DLP without identity context catches obvious exfiltration but misses sophisticated attacks. Classification without a clean data inventory produces labels that do not match reality. Tokenisation without application refactoring is half-effective at best. CISA lists data as the fifth pillar in the maturity model, and in practice it is usually the last to mature. The budget allocation reflects this: 10 to 15 percent of total zero trust spend in most estates, with the share rising in regulated industries.
Cost by data sub-component
Per-user per-month and per-platform pricing for the seven data components. Pricing is market-typical; Microsoft-bundled paths land at the lower end across most components for M365 E5 customers.
| Component | List price range | Sized on | Notes |
|---|---|---|---|
| DLP (cloud-native) | $5 - $15 / user / month | Workforce with data access | Email, web, endpoint, cloud-app DLP. Microsoft Purview included in M365 E5. Standalone from Symantec, Forcepoint, Netskope. |
| Data classification | $5 - $12 / user / month | Workforce creating or handling sensitive data | Microsoft Purview Information Protection bundled in E5. Standalone Fortra (formerly Titus and Boldon James) for heterogeneous estates. |
| Encryption at rest / in transit | $0 marginal | All data | Bundled into every modern platform. Baseline expectation, not a budget line item. |
| Confidential computing | 15-35% premium on cloud bill | Workloads processing sensitive data in memory | Azure Confidential Computing, AWS Nitro Enclaves, GCP Confidential VMs. Plus engineering effort to refactor apps. |
| Tokenisation | $30K - $300K / year + per-txn | Systems handling PAN, SSN, account numbers | Thales CipherTrust, Protegrity, Very Good Security. Required for PCI-DSS scope reduction. |
| Data discovery / DSPM | $40K - $500K / year | Cloud data stores | Data security posture management. Sentra, Cyera, Dig Security, Microsoft Purview Data Map. Newer category. |
| Key management (KMS / HSM) | $1K - $30K+ / year | Encryption-using workloads | Cloud-native KMS at the low end. Dedicated HSM for high-assurance or regulatory environments. |
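
As a rough illustration of how the per-user rows annualise, the sketch below multiplies a per-user monthly list price by headcount and twelve months. The function name and the 500-user workforce are assumptions chosen for the example; bundling and negotiated discounts are not modelled, so the result is a list-price estimate rather than a quote.

```python
def annual_list_cost(users: int, low_per_user_month: float,
                     high_per_user_month: float) -> tuple[float, float]:
    """Annualise a per-user per-month list price range."""
    months = 12
    return (users * low_per_user_month * months,
            users * high_per_user_month * months)


# DLP at $5 - $15 per user per month for an assumed 500-user workforce.
dlp_low, dlp_high = annual_list_cost(500, 5, 15)
print(f"DLP list cost: ${dlp_low:,.0f} - ${dlp_high:,.0f} per year")
# prints "DLP list cost: $30,000 - $90,000 per year"
```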
Data pillar cost by organisation size
| Organisation | Workforce | Year 1 license | Year 1 total | Ongoing / year | Notes |
|---|---|---|---|---|---|
| SMB | 100 users | $10K - $30K | $20K - $60K | $15K - $40K | Microsoft Purview within Business Premium covers most. Data pillar usually scoped down for SMB. |
| Mid-market | 500 users | $40K - $120K | $80K - $250K | $60K - $160K | Add classification, DLP coverage extension, basic data discovery. |
| Enterprise | 2,000 users | $150K - $400K | $320K - $850K | $220K - $580K | Full DLP, full classification, DSPM, tokenisation if regulated, KMS / HSM. |
| Large enterprise | 10,000+ users | $500K - $1.4M | $1.0M - $2.8M | $700K - $1.8M | Multi-vendor, multi-platform. Regulatory variation drives upper bound (financial, healthcare, federal contractor). |
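
Dividing the Year 1 total by workforce gives a per-user view of the same table and makes the economies of scale easier to see. The sketch below is a simple calculation over the figures above; the dictionary layout is an assumption, and the large-enterprise row is computed at exactly 10,000 users, so its per-user figure is an upper bound for larger estates.

```python
# (workforce, Year 1 total low, Year 1 total high), copied from the table above.
SIZING = {
    "SMB": (100, 20_000, 60_000),
    "Mid-market": (500, 80_000, 250_000),
    "Enterprise": (2_000, 320_000, 850_000),
    "Large enterprise": (10_000, 1_000_000, 2_800_000),
}


def per_user_range(users: int, low: int, high: int) -> tuple[float, float]:
    """Return the Year 1 total cost per user as a (low, high) pair."""
    return (low / users, high / users)


per_user = {name: per_user_range(*row) for name, row in SIZING.items()}
for name, (low, high) in per_user.items():
    print(f"{name}: ${low:,.0f} - ${high:,.0f} per user in Year 1")
```

On these figures the per-user cost falls from $200 - $600 for the SMB row to $100 - $280 for the large-enterprise row.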
When the data pillar share rises sharply
Three industries see the data-pillar share of zero trust budget rise from the 10 to 15 percent base toward 20 to 25 percent or more. Financial services face explicit data protection mandates under SEC cybersecurity rules, GLBA, and state-level frameworks (NYDFS Part 500 is the canonical example), plus PCI-DSS for any payment-card handling. Tokenisation, encryption-key management with dedicated HSMs, and comprehensive DLP across customer-data flows are not optional. Healthcare operates under HIPAA, whose Security Rule requires technical safeguards for protected health information, including access controls, audit controls and integrity controls, and lists encryption as an addressable specification, which must be implemented unless an equivalent alternative is documented. The data-pillar buildout for a HIPAA-covered entity is materially more extensive than for a non-regulated equivalent. Federal contractors and government agencies operate under FedRAMP, CMMC, and various DoD-specific frameworks; CMMC Level 2 and above require demonstrable data classification, labelling and DLP controls.
The cost uplift in regulated industries comes from three places. Audit-grade key management with dedicated hardware security modules adds $30K to $200K per year per HSM compared to cloud-native KMS. Comprehensive classification and labelling becomes mandatory rather than optional, which means deploying classification tooling to the full workforce rather than a risk-tier subset. And DLP coverage extends to every channel (email, web, endpoint, cloud, removable media) rather than just the high-traffic channels, which typically doubles the DLP licensing cost.
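Two of these three uplifts can be sketched as arithmetic. The example below assumes a base DLP spend of $30K to $90K per year (500 users at the $5 to $15 list range) and two dedicated HSMs; both inputs and the function name are hypothetical. The classification uplift is left out because it depends on how large the original risk-tier subset was.

```python
HSM_UPLIFT_PER_YEAR = (30_000, 200_000)  # per HSM, from the paragraph above


def regulated_uplift(base_dlp: tuple[int, int], hsm_count: int) -> tuple[int, int]:
    """Added annual cost: DLP licensing doubles, plus dedicated HSMs."""
    # Doubling the DLP licence adds the base cost once more.
    dlp_extra_low, dlp_extra_high = base_dlp
    hsm_low = hsm_count * HSM_UPLIFT_PER_YEAR[0]
    hsm_high = hsm_count * HSM_UPLIFT_PER_YEAR[1]
    return (dlp_extra_low + hsm_low, dlp_extra_high + hsm_high)


low, high = regulated_uplift(base_dlp=(30_000, 90_000), hsm_count=2)
print(f"Regulated uplift: ${low:,} - ${high:,} per year")
# prints "Regulated uplift: $90,000 - $490,000 per year"
```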
For PCI-DSS specifically, tokenisation is the highest-value data-pillar investment. Tokenising primary account numbers can remove systems from PCI scope entirely, which eliminates ongoing audit, scanning and assessment costs on those systems. For a mid-sized merchant, scope reduction can save $100K to $400K per year in audit and compliance cost, which can more than offset the tokenisation platform cost in the first year when the platform is priced toward the lower end of its range. The sister site pcicompliancecost.com has a deeper breakdown of PCI scope-reduction economics.
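A quick way to sanity-check the payback claim is to compare the two ranges quoted on this page: $100K to $400K per year in scope-reduction savings against $30K to $300K per year for a tokenisation platform. The sketch below is a simplification that ignores per-transaction fees and implementation effort, and the function name is hypothetical.

```python
SAVING_RANGE = (100_000, 400_000)   # annual audit and compliance savings
PLATFORM_RANGE = (30_000, 300_000)  # annual tokenisation platform cost


def first_year_net(saving: int, platform_cost: int) -> int:
    """Net first-year benefit of tokenisation, before per-transaction fees."""
    return saving - platform_cost


best = first_year_net(SAVING_RANGE[1], PLATFORM_RANGE[0])
worst = first_year_net(SAVING_RANGE[0], PLATFORM_RANGE[1])
print(f"Best case:  ${best:,}")   # prints "Best case:  $370,000"
print(f"Worst case: ${worst:,}")  # prints "Worst case: $-200,000"
```

The spread shows that first-year payback depends on where a given merchant lands in each range, so it is worth running with quoted figures before committing.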
Why data is Phase 3, and what data work to start earlier
Most data-pillar capabilities are most effective when deployed after the identity, device and network pillars are mature. Identity context lets DLP make policy decisions based on who the user is and what risk tier they sit in, not just what data is moving. Device posture lets DLP distinguish a transfer to a fully managed endpoint, which may be allowed, from the same transfer to a personal device, which may not. Network context lets DLP correlate exfiltration attempts with known-risky destinations. Without those inputs, DLP is largely pattern-matching plus IP-based filtering, which catches obvious exfiltration but misses sophisticated attacks.
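The way those three inputs combine can be sketched as a small policy function. Everything here is hypothetical: the `TransferContext` fields, the label and risk-tier values, and the decision strings are illustrative assumptions, not any vendor's policy model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransferContext:
    label: str               # sensitivity label: "public", "internal", "confidential"
    user_risk_tier: str      # identity pillar input: "low", "medium", "high"
    device_managed: bool     # device pillar input
    destination_risky: bool  # network pillar input


def dlp_decision(ctx: TransferContext) -> str:
    """Return "allow", "allow_with_audit" or "block" for a data transfer."""
    if ctx.label == "public":
        return "allow"
    # Network context: never send labelled data to a known-risky destination.
    if ctx.destination_risky:
        return "block"
    if ctx.label == "confidential":
        # Device and identity context gate the most sensitive data.
        if not ctx.device_managed or ctx.user_risk_tier == "high":
            return "block"
        return "allow_with_audit"
    # Internal data: allow, but audit transfers to unmanaged devices.
    return "allow" if ctx.device_managed else "allow_with_audit"


print(dlp_decision(TransferContext("confidential", "low", False, False)))
# prints "block"
```

Without the identity, device and network fields, the function could only branch on `label`, which is the pattern-matching-only position described above.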
That said, two pieces of data work should start in Phase 1, alongside identity and device. Data inventory (knowing where the data actually lives) is the prerequisite for every later data control and benefits from being started early because inventories take time to build accurately. Baseline encryption across cloud workloads, databases and storage should be turned on day one because it costs essentially nothing and removes a class of audit findings immediately. The remaining data controls (classification rollout, DLP policy authoring, tokenisation, DSPM, confidential computing) benefit from the wait.
What the data pillar buys you
The data pillar produces the most directly attributable risk-reduction value in zero trust because the data itself is the ultimate target. The IBM 2024 Cost of a Data Breach report counts lost business, customer notification and regulatory fines among the components of breach cost, all of which grow with the number and sensitivity of the records exposed and all of which the data pillar directly affects. Encryption and tokenisation, if applied to the exposed data before the breach, materially reduce the records-exposed cost component. DLP, if it catches the exfiltration in progress, can reduce or eliminate the breach itself.
The per-pillar ROI calculation is harder for data than for identity because the value depends on which breach scenarios occur. A breach involving credential abuse without data exfiltration produces no data-pillar value. A breach involving large-scale exfiltration of customer records produces enormous data-pillar value (if the records were tokenised or encrypted in ways the attacker cannot reverse). For regulated industries the data pillar pays back faster because regulatory fine exposure adds to the breach-cost reduction. For non-regulated industries the pillar still pays back, but on a longer time horizon and with more variance.
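The scenario dependence can be expressed as a probability-weighted calculation. In the sketch below every probability and loss figure is a made-up placeholder chosen only to show the mechanics; none of them comes from this page or from any breach study, and they should be replaced with an organisation's own risk estimates.

```python
# (scenario, annual probability in percent, loss without data controls,
#  loss with data controls). All values are hypothetical placeholders.
SCENARIOS = [
    ("credential abuse, no exfiltration", 10, 500_000, 500_000),
    ("large-scale record exfiltration", 2, 5_000_000, 1_000_000),
]


def expected_annual_benefit(scenarios) -> float:
    """Probability-weighted loss reduction attributable to the data pillar."""
    weighted = sum(pct * (before - after) for _, pct, before, after in scenarios)
    return weighted / 100  # convert percent weights back to probabilities


benefit = expected_annual_benefit(SCENARIOS)
print(f"Expected annual benefit: ${benefit:,.0f}")
```

The first scenario contributes nothing because the data controls do not change its loss, which mirrors the credential-abuse case described above; all of the value comes from the exfiltration scenario.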