A Primer on Data Leakage for Digital Publishers

In this new four-part series on data leakage, I’ll explore how data leakage snuck up on the digital publishing industry as a critical business risk, how data leakage happens, what the costs are, and how publishers can create a policy around their data to manage the risk and capitalize on the opportunity.

What is Data Leakage?

In the digital advertising world, data leakage means the unwanted or unknowing transfer of audience data from one party to another, typically from a publisher to an advertiser, although in some cases, from an advertiser to an intermediary, such as a data exchange or ad network.

That’s my attempt at a Webster’s definition, but plainly speaking, when people talk about data leakage as it relates to interactive advertising, in almost all cases they’re talking about advertisers, ad networks, and data companies dropping cookies on users through ad redirects running on a publisher’s site without that publisher knowing it or wanting it. The thing is, advertisers have been doing that for years for benign purposes, like tracking ROI to see how many users from a content buy made it to their website or conversion page. Advertisers would drop a cookie on a user through their ad tag, and if the same cookie was recognized on a landing page at some point in the future, they could assign value to their ad buy, what the ad world calls ‘attribution’. Measuring ROI was great, but that’s about all you could do with that cookie pool. As an advertiser, even if you knew all the people in your cookie pool were sourced while reading up on leasing a new Rolls-Royce, which placed them in an extremely high-value and rare audience segment, what could you really do with that pile of cookies?

Nothing, that’s what. So publishers didn’t pay much attention to the practice. For an advertiser though, it’s pretty easy to drop a cookie with a callback in your redirect, so dropping third-party cookies out of ad buys became fairly common before long. After all, this is the internet – if you can measure something, why not measure it?
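To make the attribution mechanic concrete, here is a minimal sketch of the matching step: cookies dropped through ad tags are later compared against cookies seen on a conversion page. The function name, the placement labels, and the "credit the first placement" rule are all assumptions made for illustration, not any particular ad server's method.

```python
from collections import Counter


def attribute_conversions(impressions, conversions):
    """Credit each converting cookie to the placement where it was first dropped.

    impressions: iterable of (cookie_id, placement) pairs, in the order served.
    conversions: iterable of cookie_ids recognized on the landing page.
    Both inputs and the first-touch rule are hypothetical, for illustration only.
    """
    first_seen = {}
    for cookie_id, placement in impressions:
        # Keep only the first placement that dropped this cookie.
        first_seen.setdefault(cookie_id, placement)

    credited = Counter()
    for cookie_id in set(conversions):  # count each converting cookie once
        placement = first_seen.get(cookie_id)
        if placement is not None:  # ignore cookies the ad buy never dropped
            credited[placement] += 1
    return dict(credited)


if __name__ == "__main__":
    impressions = [("c1", "autos"), ("c2", "autos"), ("c3", "news"), ("c1", "news")]
    conversions = ["c1", "c3", "c9", "c1"]
    print(attribute_conversions(impressions, conversions))
```

Notice that all this yields is a count per placement, which is the point of the paragraph above: the cookie pool tells you how the buy performed and not much else.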

Gradually though, through the increased innovation in the industry and regular practice of cookie or pixel-dropping, publishers have been caught with their pants down. Today as an advertiser you can absolutely act on any data you can collect or cookie pool you can build, and often those actions are in direct competition with a publisher’s sales force. The potential impact on revenue is huge, especially as programmatic buying through ad exchanges continues to build steam.

So what happened? How did the cookie go from a background distraction to a covert business liability? In the next post, I’ll review a brief history of data collection online and explain how data leakage made it into the mainstream.

Read Next – Audience Analytics Lights the Data Leakage Fuse

Get Pixel Tracking Transparency with Ghostery

Thanks to a series of articles in the WSJ, publishers around the country are taking a hard look at their privacy practices and trying to get a handle on who collects data on their site. You would think this would be a simple task; after all, the publisher owns the site and controls everything on it, right?

Well, not exactly.  In fact, thanks to the off-site redirects inherent to 3rd party adserving, publishers often have no idea when an advertiser or marketer attempts to redirect the user within a 3rd party ad tag.  Due to the number of players involved, it’s actually quite difficult to assess which tags are attempting to cookie the user for audience aggregation.  If publishers can’t audit their site, how can they enforce their privacy policy and contractual agreements with marketers?

Thankfully, the people at Better Advertising have developed a rather brilliant browser extension called Ghostery to make pixel tracking more transparent.  Ghostery runs on your browser and sifts through all the code and ad calls to quickly identify which 3rd parties are tracking data on your site. This particular example is from – as you can see, the tool quickly pulls up a list of the various companies with pixels running on the site or somehow spawning to the browser.

From there, you can take a deeper dive on any particular tracker you want, view a brief summary of what the company does, how to access its privacy policy, and even other sites where that company was seen.  I have to say, Ghostery is a quantum leap ahead of other tools for identifying which ads are spawning pixels or running piggyback cookie requests.
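For a rough sense of what such an audit involves, the sketch below lists the third-party hosts referenced by a page's script, image, and iframe tags. It is not how Ghostery works. It is a static scan using only the Python standard library, and the function name and example domains are hypothetical. Because it only reads the HTML it is given, it cannot see pixels spawned by redirects inside an ad tag at runtime, which is exactly the gap a tool watching the live browser fills.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class _SrcCollector(HTMLParser):
    """Collect hostnames from the src attribute of script, img, and iframe tags."""

    TAGS = {"script", "img", "iframe"}

    def __init__(self):
        super().__init__()
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        if tag not in self.TAGS:
            return
        src = dict(attrs).get("src")
        if not src:
            return
        host = urlparse(src).hostname  # None for relative URLs such as /logo.png
        if host:
            self.hosts.add(host.lower())


def third_party_hosts(html, first_party_domain):
    """Return sorted hosts in `html` that fall outside `first_party_domain`.

    Hypothetical helper: a host counts as first-party if it equals the domain
    or is one of its subdomains.
    """
    collector = _SrcCollector()
    collector.feed(html)
    domain = first_party_domain.lower()
    return sorted(
        h for h in collector.hosts
        if not (h == domain or h.endswith("." + domain))
    )


if __name__ == "__main__":
    page = (
        '<script src="https://ads.tracker-example.net/tag.js"></script>'
        '<img src="//pixel.data-example.com/p.gif">'
        '<img src="/logo.png">'
        '<script src="https://static.publisher-example.com/app.js"></script>'
    )
    for host in third_party_hosts(page, "publisher-example.com"):
        print(host)
```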

Ghostery was actually developed more for consumers, to give them a way to see who is tracking their behavior online and actually block it, but I see huge potential for industry folks as well to audit their sites. Do you know what is running on your site?

P.S. – the Ghostery Blog isn’t half bad, either…