Marketing Incrementality & Holdout Groups: Measuring True ROAS

About this video

How much of your attributed revenue is truly incremental — and how much would have happened without the spend? In this walkthrough, X Wang explains why platform-reported ROAS can overstate performance, and how holdout groups give retention and acquisition teams a clearer read on whether a channel or tactic is worth the budget.

The example comes from a DTC brand layering direct mail into checkout abandonment: email first, then SMS, then physical mail only when neither digital channel converts. After a year live, the direct-mail platform reported strong multi-x ROAS. We pressure-tested that number two ways — conservative Shopify order matching and a live holdout test — before deciding how much to keep investing.

Why incrementality matters for marketing spend

ROAS from a vendor dashboard answers a narrow question: “Did people who received this message also purchase?” It does not answer: “Did this message cause purchases that would not have happened otherwise?”

Customers convert through many paths — organic return visits, email campaigns sent in the same window, SMS, remarketing ads, or simply slower purchase cycles than your automation delays assume. When you add a new channel without a control group, it is easy to credit it for revenue that was already in motion.

The problem with platform-reported attribution

Most martech platforms default to generous attribution windows and last-touch logic tuned to show impact. That is useful for operational reporting, but dangerous for budget decisions if taken at face value.

A practical counter-check is conservative order matching: for each recipient sent a mail piece in a date range, did that specific customer place an order within your defined attribution window in Shopify (or your source of truth)? That usually produces a lower number than the platform — and sets a more honest baseline before you run a holdout.
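
As a rough illustration of conservative order matching, the Python sketch below credits an order only when the same customer received a mail piece and then ordered within a fixed number of days after the send. The function name, the tuple layouts, and the 30-day default window are assumptions made for this example; they are not the brand's actual setup and not a Shopify API.

```python
from datetime import date, timedelta


def match_orders(sends, orders, window_days=30):
    """Return (matched_revenue, matched_customer_ids).

    sends:  iterable of (customer_id, sent_on) tuples
    orders: iterable of (order_id, customer_id, ordered_on, amount) tuples
    window_days: hypothetical attribution window; choose your own
    """
    # Group send dates by customer so repeat mailings are handled.
    sends_by_customer = {}
    for customer_id, sent_on in sends:
        sends_by_customer.setdefault(customer_id, []).append(sent_on)

    window = timedelta(days=window_days)
    revenue = 0.0
    matched = set()
    for order_id, customer_id, ordered_on, amount in orders:
        send_dates = sends_by_customer.get(customer_id, [])
        # Count each order at most once, even if several sends qualify.
        if any(s <= ordered_on <= s + window for s in send_dates):
            revenue += amount
            matched.add(customer_id)
    return revenue, matched


# Made-up example data, for illustration only.
sends = [("c1", date(2024, 1, 1)), ("c2", date(2024, 1, 5))]
orders = [
    ("o1", "c1", date(2024, 1, 10), 50.0),  # inside the window: counted
    ("o2", "c1", date(2024, 3, 1), 40.0),   # after the window closes: ignored
    ("o3", "c3", date(2024, 1, 6), 99.0),   # customer never mailed: ignored
    ("o4", "c2", date(2024, 1, 4), 20.0),   # ordered before the send: ignored
]
revenue, matched = match_orders(sends, orders)
```

An order placed before the send date, or after the window closes, contributes nothing. That strictness is what tends to keep this baseline below a typical dashboard figure.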

How holdout analysis works

A holdout randomly withholds the tactic from a slice of eligible contacts. Everyone else receives the normal treatment. You then compare conversion rate, orders, and revenue per recipient between groups over the same window.

In the example from the video, 20% of checkout abandoners who would have triggered direct mail were held out; 80% received the mailer. Both groups had already been eligible for email and SMS earlier in the path — so the test isolates the incremental lift from direct mail, not from digital retention as a whole.
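
One way to produce a stable 20/80 split is to hash each customer ID into a bucket, so the same person always lands in the same group no matter when the automation fires. This is a minimal sketch under that assumption; the video does not say how the brand assigned its groups, and the function name, salt string, and ID format here are hypothetical.

```python
import hashlib


def assign_group(customer_id: str, holdout_pct: int = 20,
                 salt: str = "dm-holdout-v1") -> str:
    """Deterministically assign a customer to 'holdout' or 'treatment'.

    The salt (a hypothetical value) keeps this test's split independent
    of any other test that hashes the same customer IDs.
    """
    digest = hashlib.sha256(f"{salt}:{customer_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # integer in 0..99
    return "holdout" if bucket < holdout_pct else "treatment"


# Example: split a batch of made-up customer IDs.
groups = {cid: assign_group(cid) for cid in ("cust-001", "cust-002", "cust-003")}
```

Hashing instead of calling a random number generator at send time means a customer who abandons checkout twice during the test is never mailed on one visit and held out on the next.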

Case study: direct mail on checkout abandonment

The brand had built a typical direct-mail program: cart/checkout abandonment, win-back for lapsed buyers, and acquisition mail for non-purchasers. Platform reporting implied very high ROAS on recent spend.

Holdout results over roughly 40–45 days showed:

~3.7 percentage-point raw lift in conversion rate for the treatment group (~66% relative improvement vs. control). This is directionally meaningful, although the test had not yet reached full statistical significance.
Higher revenue per recipient in treatment vs. control — the normalized metric that matters when group sizes differ.
~$600 implied incremental revenue in the test window under conservative assumptions — net positive, but far smaller than platform totals.
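
The metrics in this list can be computed from group-level totals. The sketch below is illustrative only: the counts and revenue figures are invented and are not the brand's data, the function and field names are hypothetical, and the pooled two-proportion z-test is one common significance check, not necessarily the one used in the video.

```python
import math


def holdout_summary(t_n, t_conv, t_rev, c_n, c_conv, c_rev):
    """Compare treatment (t_*) and control (c_*) group totals.

    *_n = recipients, *_conv = converting recipients, *_rev = revenue.
    Assumes both groups are non-empty and the control group has at
    least one conversion (needed for the relative lift).
    """
    t_rate, c_rate = t_conv / t_n, c_conv / c_n
    t_rpr, c_rpr = t_rev / t_n, c_rev / c_n  # revenue per recipient

    # Pooled two-proportion z-test on the conversion rates.
    pooled = (t_conv + c_conv) / (t_n + c_n)
    se = math.sqrt(pooled * (1 - pooled) * (1 / t_n + 1 / c_n))
    z = (t_rate - c_rate) / se

    return {
        "lift_pp": (t_rate - c_rate) * 100,           # percentage points
        "relative_lift": (t_rate - c_rate) / c_rate,  # e.g. 0.5 means +50%
        "rev_per_recipient_diff": t_rpr - c_rpr,
        # Revenue the treatment group earned above the control baseline.
        "incremental_revenue": (t_rpr - c_rpr) * t_n,
        "z": z,
        "p_value": math.erfc(abs(z) / math.sqrt(2)),  # two-sided
    }


# Invented totals, for illustration only.
summary = holdout_summary(t_n=1000, t_conv=100, t_rev=10000.0,
                          c_n=250, c_conv=15, c_rev=1500.0)
```

Incremental revenue is scaled by the treatment group's size because that is the group that actually received the mailer; the control group only supplies the per-recipient baseline.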

Reported ROAS vs. incremental ROAS

The contrast is stark. Platform-reported attributed revenue implied roughly 9x ROAS on ~$351 spend. Incremental measurement from the holdout — lift above the control baseline — landed around 1.7x ROAS.
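
The arithmetic behind the two figures is simple enough to check by hand. The sketch below uses the approximate numbers quoted in this article (about $351 of spend, about $600 of incremental revenue, roughly 9x reported ROAS); the attributed-revenue figure is back-solved from the reported ROAS rather than taken from the platform, and the helper name is an assumption for this example.

```python
def roas(revenue: float, spend: float) -> float:
    """Return on ad spend: revenue divided by spend."""
    if spend <= 0:
        raise ValueError("spend must be positive")
    return revenue / spend


spend = 351.0                # approximate spend quoted in the article
incremental_revenue = 600.0  # approximate holdout-implied incremental revenue
reported_roas = 9.0          # approximate platform-reported ROAS

# Incremental ROAS: lift above the control baseline, per dollar spent.
incremental_roas = roas(incremental_revenue, spend)  # about 1.7

# Back-solved estimate of what the platform attributed (about $3,159).
implied_attributed_revenue = reported_roas * spend

# Share of attributed revenue that the holdout suggests was incremental.
incremental_share = incremental_revenue / implied_attributed_revenue  # about 0.19
```

On these rounded inputs, roughly one dollar in five of the platform-attributed revenue appears to be incremental.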

Both numbers can be “right” for different questions. The platform figure describes revenue correlated with receiving the mailer; the holdout figure estimates the lift the mailer actually caused, measured against contacts who did not receive it. For spend decisions, the second number is the one that keeps you from over-investing in a channel that looks amazing on paper.

When to run a holdout test

Use holdouts when you introduce a net-new channel (direct mail, influencer, a new paid social tactic) or a meaningful change inside a channel — not for every subject-line test. The goal is to answer whether marginal spend is net positive after accounting for conversions you would have gotten anyway.

Pair holdouts with your channel ladder: hit the most efficient touchpoints first (email, SMS), then escalate to higher-cost tactics only when the incremental math supports it. We cover stack ranking and channel ordering in separate material; this video focuses on the measurement discipline that makes those ladders defensible.

For help designing holdouts across email, SMS, and layered acquisition — or auditing attribution before you scale spend — see our retention marketing strategy work or book a call.

Key takeaways

Platform-reported ROAS often counts revenue that would have converted anyway — without a holdout, you cannot see true incrementality.
Challenge vendor attribution with conservative order matching (e.g. Shopify purchases within a defined post-send window) before trusting dashboard ROAS.
A 20% holdout against an 80% treatment group is enough to directionally test whether a channel is net positive on spend.
Ladder channels by efficiency: email and SMS first on abandonment paths, then more expensive tactics like direct mail only when upstream messages do not convert.
In this direct-mail checkout-abandonment test, reported ROAS was roughly 9x while incremental ROAS was ~1.7x: still positive, but a very different budget decision.
Run holdouts when launching a new channel or testing a new tactic (direct mail, influencer, paid social creative) — not only at full program scale.
