The CDP category is over a decade old, yet enterprise marketing teams still complain about the same problems: data that's weeks stale, segments that take days to build, and integrations that require a ticket to the data engineering team. The question of what enterprise marketers need from a CDP has a long official answer — unified profiles, audience segmentation, activation — and a shorter honest one: they need the platform to actually work at enterprise scale.
This post breaks down the real requirements, where legacy CDPs fall short, and what a modern architecture looks like when it's built around how enterprise marketing actually operates.
The Enterprise Context Changes Everything
A mid-market company running a single e-commerce site has different data challenges than an enterprise with multiple brands, regional data residency rules, a mix of first-party and partner data, and a 40-person martech stack. Enterprise CDPs are frequently evaluated using mid-market criteria — demo environments, tidy use cases, small data volumes — and that evaluation gap causes expensive mistakes.
At enterprise scale, several requirements that barely matter at smaller companies become non-negotiable.
Data volume and freshness are the first stress tests. Enterprise marketers are often working with hundreds of millions of customer records, real-time event streams from apps and web properties, and offline transaction data from ERP systems. A CDP that ingests batch files once per day fails here. Campaigns triggered by behavioral signals (an abandoned cart, a loyalty threshold, a support ticket) require data that is current within minutes, not hours.
Governance and data residency become legal requirements, not best practices. GDPR, CCPA, and sector-specific regulations mean that customer data often cannot leave a particular geographic boundary or cloud environment. Enterprise CDPs that store a proprietary copy of customer data in their own cloud create compliance exposure. Marketing leaders need to ask every CDP vendor a simple question: where does our data actually live?
Identity at scale is harder than any vendor demo suggests. Enterprise companies have customers who interact across a mobile app, a website, a loyalty program, a call center, and in-store systems. Stitching those touchpoints into a coherent identity graph, without creating false merges, requires probabilistic and deterministic matching run continuously against a large, messy dataset. Most CDPs handle this adequately for clean, small datasets and struggle with the dirty reality of enterprise data.
Flexibility for diverse teams is the requirement that most enterprise RFPs underweight. A financial services company has compliance constraints that prevent certain audiences from being used in certain channels. A retail brand with separate e-commerce and wholesale divisions needs different data models for each. The CDP needs to serve a data scientist running SQL, a lifecycle marketer building journeys in a visual editor, and a media buyer pushing audiences to paid social, all from the same underlying data.
Where Legacy CDPs Break Down
The CDP market has several large incumbents — Salesforce, Adobe, and mParticle among the most widely deployed in enterprise — and they share a structural limitation that is architectural, not just a feature gap.
Most legacy CDPs were built to ingest customer data into their own managed data store. That means enterprise data is duplicated: once in the company's data warehouse or data lake, and again inside the CDP's proprietary system. This creates at least four ongoing problems.
First, the data inside the CDP is always a subset of what's in the warehouse. Behavioral data, product catalog data, offline transaction data, and third-party enrichment data rarely make it fully into the CDP. Marketers end up building segments against an incomplete picture.
Second, the data is typically fresher in the warehouse. Real-time data pipelines feed the warehouse first. The CDP sync adds latency. By the time a marketer's audience definition executes, the underlying data may have already changed.
Third, duplication creates cost. Enterprise data volumes mean that loading customer records into a third-party CDP store generates significant data egress fees, storage costs, and licensing fees based on record volume. These costs scale poorly as the company grows.
Fourth, the data is harder to govern. When customer data lives in a vendor's environment, the company has limited visibility and control over how it's stored, who can access it, and how deletions are honored. GDPR deletion requests become an operational process that touches multiple systems.
What the Architecture Should Look Like Instead
A growing number of enterprise marketing teams have moved toward an approach that keeps customer data in their own cloud data warehouse — Snowflake, Databricks, BigQuery, or Redshift — and runs CDP capabilities as a layer on top of that existing data, without copying it elsewhere.
This model is sometimes called a Composable CDP because it disaggregates the monolithic CDP into modular capabilities: identity resolution, audience segmentation, profile unification, and activation. Each capability operates directly against the warehouse data rather than requiring a separate data copy.
The practical advantages are significant. The data used for marketing decisions is the same data the analytics, finance, and product teams use. There's no version drift between the warehouse and the CDP. Identity resolution runs against the full customer record, not a subset. And the enterprise retains full governance control because the data never leaves its own cloud environment.
This approach also changes the economics. Pricing is no longer tied to the volume of records stored in a vendor's system. The enterprise pays for compute and the CDP layer, not for the privilege of duplicating its own data.
Five Requirements Enterprise Marketers Should Evaluate Rigorously
1. Audience Flexibility Without Engineering Tickets
Enterprise marketers need to build audiences that reflect complex business logic: customers who purchased in the last 90 days but not in the last 30, who have a loyalty score above a threshold, who have not opened an email in six months, who live in a particular region. In many legacy CDPs, building that audience requires waiting for a data engineer to write a custom query or waiting for the CDP vendor's professional services team.
The right CDP gives marketers a visual query builder that handles complex logic — including behavioral, transactional, and predictive attributes — without requiring SQL. It also gives data teams the option to use SQL directly against the warehouse when more complex modeling is needed. Both paths should coexist.
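To make the example audience above concrete, here is a minimal Python sketch of the same logic expressed as a filter. The Customer fields, the 182-day approximation of "six months", and the sample rows are all assumptions made for illustration; a real implementation would run as a query against the warehouse rather than over in-memory objects.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Customer:
    # Hypothetical profile fields; a real warehouse schema will differ.
    customer_id: str
    region: str
    loyalty_score: int
    last_purchase: Optional[date]
    last_email_open: Optional[date]


def in_audience(c: Customer, today: date, region: str, min_score: int) -> bool:
    if c.last_purchase is None:
        return False
    days_since_purchase = (today - c.last_purchase).days
    # Most recent purchase is within 90 days but older than 30 days.
    lapsed_buyer = 30 < days_since_purchase <= 90
    # "Six months" is approximated as 182 days (an assumption).
    email_cutoff = today - timedelta(days=182)
    email_inactive = c.last_email_open is None or c.last_email_open < email_cutoff
    return (
        lapsed_buyer
        and email_inactive
        and c.loyalty_score > min_score
        and c.region == region
    )


today = date(2024, 6, 30)
customers = [
    Customer("c1", "EMEA", 80, date(2024, 5, 1), None),              # qualifies
    Customer("c2", "EMEA", 80, date(2024, 6, 20), None),             # bought 10 days ago
    Customer("c3", "EMEA", 80, date(2024, 5, 1), date(2024, 6, 1)),  # opened email recently
    Customer("c4", "APAC", 80, date(2024, 5, 1), None),              # wrong region
]
audience = [c.customer_id for c in customers if in_audience(c, today, "EMEA", 50)]
print(audience)
```

Each condition maps to one clause a marketer would pick in a visual builder, which is why the same definition can also be handed to a data team as SQL without changing its meaning.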
2. Identity That Works on Messy Data
Enterprise data is messy. Customers change email addresses. They use different browsers. They make purchases under slightly different name spellings. Identity resolution needs to handle probabilistic matching — inferring that two records likely belong to the same person — alongside deterministic matching based on known identifiers like email or customer ID.
The resolution graph should update continuously, not in daily batch jobs, and the confidence model should be auditable. Marketing teams sending a suppression list to an ad platform need to trust that the identity graph didn't create false merges that exclude the wrong people.
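As a rough illustration of how deterministic and probabilistic matching can coexist with an audit trail, the sketch below merges records with a union-find structure. It is not any vendor's algorithm: the record fields, the name-plus-postal-code rule, and the 0.85 threshold are assumptions chosen for the example, and comparing every pair of records would not scale to enterprise volumes without a blocking step.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical sample records; real identity data has many more identifiers.
records = [
    {"id": "r1", "email": "ana@example.com", "name": "Ana Martinez", "postal": "94107"},
    {"id": "r2", "email": "ana@example.com", "name": "A. Martinez", "postal": "94107"},
    {"id": "r3", "email": "ana.m@example.org", "name": "Anna Martinez", "postal": "94107"},
    {"id": "r4", "email": "bo@example.com", "name": "Bo Chen", "postal": "10001"},
]


def resolve(records, name_threshold=0.85):
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        # Follow parent links to the cluster root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    audit = []  # one entry per merge, so every link can be reviewed later
    for a, b in combinations(records, 2):
        if a["email"] == b["email"]:
            # Deterministic rule: a shared known identifier.
            rule, score = "deterministic:email", 1.0
        else:
            # Probabilistic rule: similar name AND same postal code.
            score = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
            if a["postal"] != b["postal"] or score < name_threshold:
                continue
            rule = "probabilistic:name+postal"
        parent[find(a["id"])] = find(b["id"])
        audit.append((a["id"], b["id"], rule, round(score, 2)))

    clusters = {}
    for r in records:
        clusters.setdefault(find(r["id"]), set()).add(r["id"])
    return list(clusters.values()), audit


clusters, audit = resolve(records)
print(clusters)
print(audit)
```

Recording the rule and score behind each merge is what makes the result auditable: a team reviewing a suppression list can see which links were exact and which were inferred, and raise the threshold if inferred links look too aggressive.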
3. Activation Into Every Channel, Without Silos
Enterprise marketing operates across paid media (Meta, Google, The Trade Desk, Amazon), owned channels (email via Braze or Salesforce Marketing Cloud, SMS, push), and emerging channels (retail media networks, connected TV). The CDP needs native connectors to all of these — not just the top five.
More importantly, the activation layer needs to handle audience syncs that stay current. An audience of customers approaching a loyalty tier needs to update as customers hit that threshold, not once per day. Real-time or near-real-time syncs are a requirement for time-sensitive campaigns, not a premium add-on.
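One common way to keep a synced audience current is to send only the changes since the last sync rather than re-uploading the full list. The sketch below shows that idea with plain set arithmetic; the function name and the "add"/"remove" payload shape are hypothetical, since each destination's API defines its own format.

```python
def diff_audience(previous: set[str], current: set[str]) -> dict[str, list[str]]:
    """Return the members to add to and remove from a destination."""
    return {
        "add": sorted(current - previous),      # newly qualified customers
        "remove": sorted(previous - current),   # customers who no longer qualify
    }


last_synced = {"cust_1", "cust_2", "cust_3"}
now_qualifying = {"cust_2", "cust_3", "cust_4"}
changes = diff_audience(last_synced, now_qualifying)
print(changes)
```

Because the payload is proportional to the number of changes rather than the size of the audience, this kind of diff can run frequently, which is what makes near-real-time syncs practical.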
4. Support for the Full Marketing Workflow, Not Just Segmentation
Enterprise marketers don't just need audiences. They need to orchestrate multi-step journeys, make per-customer decisions about which message to send next, and personalize content based on individual attributes. A CDP that handles segmentation but doesn't connect to the journey orchestration and decisioning layer forces marketers to build bridges between tools — which adds latency, cost, and failure points.
The best implementations treat the CDP as the data foundation for a broader marketing execution layer: one that handles lifecycle journeys, AI-assisted content personalization, and campaign measurement from the same underlying data model.
5. Governance Built In, Not Bolted On
Enterprise CDPs need to enforce consent at the audience level, not just at data ingestion. If a customer opts out of email marketing but not SMS, that preference needs to propagate to every activation automatically. If a data asset has a contractual restriction on its use in paid media, that restriction needs to be enforced when a marketer tries to push that audience to Meta.
Built-in governance means rules are defined once in the CDP layer and applied everywhere, rather than relying on each downstream tool to enforce constraints it may not fully understand.
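The "define once, apply everywhere" idea can be sketched as a single policy check that every activation passes through. Everything named here is hypothetical: the Profile fields, the RESTRICTED policy table, and the channel labels are assumptions for illustration, and the restriction is modeled per data source rather than per audience to keep the example short.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    customer_id: str
    opted_out: set = field(default_factory=set)     # channels the customer opted out of
    data_sources: set = field(default_factory=set)  # where this profile's data came from


# Hypothetical policy: data source -> channels where its use is contractually prohibited.
RESTRICTED = {"partner_feed": {"paid_media"}}


def allowed(profile: Profile, channel: str) -> bool:
    # Consent check: the customer's own channel preferences come first.
    if channel in profile.opted_out:
        return False
    # Contract check: block the channel if any contributing source restricts it.
    return not any(channel in RESTRICTED.get(src, set()) for src in profile.data_sources)


def activate(audience: list, channel: str) -> list:
    """Return only the customer IDs that may be sent to the given channel."""
    return [p.customer_id for p in audience if allowed(p, channel)]


audience = [
    Profile("p1", opted_out={"email"}),
    Profile("p2", data_sources={"partner_feed"}),
    Profile("p3"),
]
print(activate(audience, "email"))
print(activate(audience, "sms"))
print(activate(audience, "paid_media"))
```

The same audience yields a different member list per channel without the marketer doing anything, and no downstream tool needs to know why a customer was excluded.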
What to Look for in a Modern CDP
A handful of platforms have been built specifically to address the enterprise architectural requirements described above, rather than retrofitting a legacy design.
Hightouch's Composable CDP is built on the principle that enterprise data should stay in the customer's own warehouse. It runs identity resolution, audience segmentation, and profile unification directly against Snowflake, Databricks, BigQuery, or Redshift without requiring a separate data copy. The result is that marketing teams work from the same data their data teams use, with no sync lag and no version drift. The Agentic Marketing Platform extends that foundation with execution capabilities: lifecycle journey orchestration, AI Decisioning for per-customer message optimization, and Native Delivery for email and SMS, all operating from the warehouse-resident data model. This means enterprise marketers can move from audience definition to campaign execution without switching platforms or bridging data between systems.
For enterprise teams specifically, a few differentiators matter. The zero-copy architecture means data residency and governance requirements are easier to meet because customer data never moves out of the enterprise's own environment. Identity Resolution operates on the full customer record continuously, handling the messy probabilistic matching that large datasets require. And the activation layer includes connectors to over 200 destinations, with audience syncs that update on the schedule the marketer needs, not on a fixed batch cadence.
This doesn't mean Hightouch is the only viable option. Platforms like Segment (Twilio) and mParticle have significant install bases and have added composable features over time. But their core architectures were designed to ingest and store data in their own systems, and the composable additions are often layered on top rather than foundational. For enterprises where the warehouse is already the system of record for customer data, an architecture built warehouse-first is materially different from one adapted to support warehouse access.
The Decision Criteria That Matter Most
Enterprise CDP evaluations tend to get bogged down in feature checklists. The more useful frame is architectural: does this platform treat the customer's warehouse as the source of truth, or does it require duplicating data into a proprietary store? The answer to that question determines data freshness, governance posture, total cost, and how well the platform scales as data volumes grow.
Beyond architecture, the evaluation should include: how does identity resolution perform on a representative sample of the company's actual messy data (not a clean demo dataset)? What does the activation catalog look like for the specific channels the marketing team uses? Can a non-technical marketer build a complex audience without filing a ticket? And what does the pricing model look like at three times the current data volume?
Those five questions surface more signal than most 100-item RFP scorecards.
A More Honest Set of Expectations
Enterprise CDP vendors have overpromised for years on ease of implementation and speed of time-to-value. Most enterprise CDP projects take six to eighteen months to reach meaningful activation, and many plateau at basic segmentation use cases because the underlying data quality or the platform's governance model creates bottlenecks.
The marketers who get the most from a CDP are the ones who defined the data requirements before selecting the platform, aligned with data engineering on who owns what, and chose an architecture that matches how their organization actually manages data — not how the vendor demo suggested it should.
What enterprise marketers need from a CDP is straightforward: current data, flexible audience tools, activation into every channel they use, strong identity resolution, and governance that enforces consent automatically. The platforms that deliver all five at scale are fewer than the vendor landscape suggests. Knowing which architectural choices make those capabilities possible is the clearest path to making a defensible decision.