A Primer on Data Leakage for Digital Publishers

In this new four-part series on data leakage, I’ll explore how data leakage snuck up on the digital publishing industry as a critical business risk, how data leakage happens, what the costs are, and how publishers can create a policy around their data to manage the risk and capitalize on the opportunity.

What is Data Leakage?

In the digital advertising world, data leakage means the unwanted or unwitting transfer of audience data from one party to another, typically from a publisher to an advertiser, though in some cases from an advertiser to an intermediary such as a data exchange or ad network.

That’s my attempt at a Webster’s definition, but plainly speaking, when people talk about data leakage as it relates to interactive advertising, in almost all cases they’re talking about advertisers, ad networks, and data companies dropping cookies on users through ad redirects running on a publisher’s site without that publisher knowing it or wanting it. The thing is, advertisers have been doing that for years for benign purposes – tracking ROI, for example, to see how many users from a content buy made it to their website or conversion page. Advertisers would drop a cookie on a user through their ad tag, and if the same cookie was recognized on a landing page at some point in the future, they could attribute value to their ad buy, what the ad world calls ‘attribution’. Measuring ROI was great, but that’s about all you could do with that cookie pool. As an advertiser, even if you knew all the people in your cookie pool were sourced while reading up on leasing a new Rolls-Royce, which put them in an extremely high-value and rare audience segment, what could you really do with that pile of cookies?

Nothing, that’s what. So publishers didn’t pay much attention to the practice. For an advertiser though, it’s pretty easy to drop a cookie with a callback in your redirect, so dropping third-party cookies out of ad buys became fairly common before long. After all, this is the internet – if you can measure something, why not measure it?
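For readers who think better in code, here is a minimal sketch of the cookie-based attribution described above. It is a toy model and not any real ad server's API: the `AdvertiserTracker` class, its method names, and the `pub.example` page are all hypothetical, and a plain dictionary stands in for cookies stored in a user's browser.

```python
import uuid


class AdvertiserTracker:
    """Hypothetical advertiser-side tracker; every name here is illustrative."""

    def __init__(self):
        # cookie_id -> publisher page where the cookie was first dropped
        self.cookie_pool = {}
        # (cookie_id, source_page) pairs for visits that could be attributed
        self.conversions = []

    def serve_ad(self, publisher_page, cookie_id=None):
        """Simulate the ad tag firing on a publisher page.

        If the browser has no cookie yet, mint one. Either way, remember
        the page where this cookie was first seen.
        """
        if cookie_id is None:
            cookie_id = uuid.uuid4().hex
        self.cookie_pool.setdefault(cookie_id, publisher_page)
        return cookie_id

    def landing_page_hit(self, cookie_id):
        """Simulate the same browser reaching the advertiser's landing page.

        Returns the publisher page credited with the visit, or None if the
        cookie was never dropped by an ad.
        """
        source = self.cookie_pool.get(cookie_id)
        if source is not None:
            self.conversions.append((cookie_id, source))
        return source


tracker = AdvertiserTracker()
cookie = tracker.serve_ad("pub.example/luxury-cars")  # ad runs on the publisher
credited = tracker.landing_page_hit(cookie)           # user later visits the advertiser
```

Notice that `cookie_pool` ends up recording where each user was reading when the ad ran. That by-product of measurement is the audience data at issue in the rest of this series.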

Gradually though, through continued innovation in the industry and the routine practice of cookie- or pixel-dropping, publishers have been caught with their pants down. Today as an advertiser you can absolutely act on any data you can collect or any cookie pool you can build, and often those actions are in direct competition with a publisher’s sales force. The potential impact on revenue is huge, especially as programmatic buying through ad exchanges continues to build steam.

So what happened? How did the cookie go from a background distraction to a covert business liability? In the next post, I’ll review a brief history of data collection online and explain how data leakage made it into the mainstream.

Read Next – Audience Analytics Lights the Data Leakage Fuse


  1. Hi Ben …

    I have followed you across from your comment on the exciting Google story over on AdExchanger.com. They have had me ‘blocked’ from commenting on their site for quite some time now. That’s so-so, I feel, but the move is clearly an indictment of a ‘smelly’ industry that hasn’t quite got off on the right foot from the start. I too follow the ‘space’ and enjoy reading about the many developments daily. I will enjoy reading your thoughts here. – Ross

  2. Hi Ross,

    Thanks for the comment – sorry to hear you are being blocked on AdExchanger. I’m not sure why you feel the ad industry is a ‘smelly’ one – would you elaborate?


  3. Hi Ben,

    I assume publishers are aware of data leakage, yet they do nothing to stop it. If you install Ghostery, you can watch a ton of ad tech companies tracking your movements on a site. There are also dedicated ad tech companies whose sole trade is retargeting.

    Why, in your view, aren’t publishers taking any action to prevent this?

  4. Hi Rahul,

    I think many publishers (at least large ones) are keenly aware of data leakage and very much try to prevent it. You can use a number of tactics to manage this concern, including language in your contracts with advertisers and tools in your SSP. You can also use companies such as The Media Trust that automatically monitor tags, to understand exactly where data collection trackers are serving on your site and from which line items in the ad server.

    Part of the reason publishers are slow to act here, or have been in the past, is that most ad operations teams feel a lot of pressure when they block campaigns from serving, as it has a direct impact on revenue. They have the unenviable task of changing minds within their organization and convincing their leadership that data collection erodes the value of their site in the long term, even if that cost can’t be easily measured. But I agree that it’s a major concern, and publishers have to be aware of it, take steps to understand it, and be compensated for their data if they allow collection.

