Data Management Part IV: Syncing Offline Data To Your DMP

Before the internet and digital advertising, direct mail solicitation was perhaps the most technologically advanced form of data-driven marketing out there. Even today, as much as interactive marketers like to poke fun at traditional media people, the direct mail industry is far more sophisticated at accurate audience segmentation and message delivery than most of the digital realm. Since everything in the snail mail world works off your actual name and address, data management is far simpler and can easily connect the data points in your life – the car you drive, your credit score, your age, gender, and plenty else from public records. Start adding information about your purchase habits from catalogs, your credit cards, and all the hotel and airline loyalty cards stuck in your wallet, and the direct marketers can profile you three ways to Sunday. The truth is that it’s far easier to move data offline by matching on a name and address than to move it online with nothing but a cookie. That said, data companies and marketers alike have a huge incentive to try, because offline data is generally much more reliable, and therefore more valuable, than its online counterparts.

The challenge in moving data online, however, is matching it to a cookie, which tends to be difficult because offline and online systems work on different paradigms. Most online companies expressly do not collect PII due to privacy concerns, meaning they think in terms of cookies instead of actual people. This presents a unique challenge for online marketers because a single device might be used by multiple people, with no way to tell who is using it at any given time. By contrast, the offline world functions entirely on PII and can easily differentiate between multiple people in the same household: they just mail it to a different name. Add in the fact that users can delete their cookies and the benefits of offline data emerge. People can move, but they can’t delete themselves from the real world. Offline data follows them everywhere they go, so it stays reliable and person-specific, making it more valuable than most online data. If only you could attribute real-world behavior to digital tracking, right? That’s exactly what the data companies thought.

Cookie Syncing Creates a Common Key

The key to transitioning offline data to a cookie is, well, a key. A database key, to be exact, which allows identification of the same user in an offline database as well as a cookie database. A database key is a common field between two systems; in the online world, a cookie sync creates a foreign key relationship on the cookie ID values that allows the DMP to cross-reference the same user in different data sources. The concept is the same for the offline world, but the sync is a bit more complicated. Unless the client collects personally identifiable information (PII) from online users, the sync typically requires an outside data company to perform what’s known as a match service, usually by using an email address as the key.
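
To make the foreign key idea concrete, here is a minimal sketch in Python of a sync table joining two cookie ID spaces. The ID formats, field names, and segment labels are all made up for illustration; a real DMP would hold this in a database rather than in memory.

```python
# Hypothetical sync table produced by a cookie sync: each row pairs the DMP's
# cookie ID with a partner's cookie ID for the same browser.
sync_table = [
    {"dmp_id": "dmp-001", "partner_id": "px-aaa"},
    {"dmp_id": "dmp-002", "partner_id": "px-bbb"},
]

# Partner data, keyed by the partner's own cookie ID.
partner_segments = {
    "px-aaa": ["auto-intender"],
    "px-bbb": ["frequent-traveler"],
    "px-ccc": ["golf"],  # never synced, so the DMP cannot use it
}


def segments_by_dmp_id(sync_rows, segments):
    """Join partner segments onto DMP cookie IDs through the sync table."""
    joined = {}
    for row in sync_rows:
        if row["partner_id"] in segments:
            joined[row["dmp_id"]] = segments[row["partner_id"]]
    return joined


dmp_view = segments_by_dmp_id(sync_table, partner_segments)
print(dmp_view)
```

Only users present in the sync table carry over, which is why the size of the synced pool matters so much in practice.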

For example, eBay was a prime provider of match services to data management companies until it shut down the service in March of 2011. As a digital marketplace, eBay was able to cookie every user, and because its order management system required users to register with their real name and email address before they could buy, it knew the name and email behind each cookie. So eBay could serve as its own data provider. With PII married to a cookie, eBay had a tremendous asset it could monetize by syncing other people’s offline data to online cookies. They’re out of the market, but plenty of other companies have the same data – any large site with user registration pretty much qualifies. I have no idea who is a current source of match data, but I would imagine the airline booking sites, large eCommerce sites, and even banks could provide the necessary information.

How the Match Provider Connects Identity

Whatever the provider, here’s how it works:

The match service, before it has any client, contracts with a data provider and has the data provider place the match service’s pixel on one of its pages where users frequently pass, so the two can run a cookie sync. Users who hit that page call the provider’s cookie, which then piggybacks a call to the match service. In a server-to-server integration, the provider uploads the PII for each cookie to the match service’s servers, which in turn attribute it to their own cookie ID. This process runs continuously, building a larger and larger data set for the match service, which pays the data provider for this information.

Now, a client wanting to move offline data to a cookie contracts with the match service, which has the client set up a cookie sync with it. When a user hits the client’s page, the client’s cookie fires and then piggybacks a call to the match service, passing it the client cookie ID on a query string. The match service records the client’s cookie ID and checks for its own existing cookie on the user. If that user has been cookied before by the match service, the user’s identity is known, and the match service can attribute the PII from the data provider to the client’s cookie.
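
The piggybacked call with the cookie ID on a query string can be sketched roughly as follows. This is an illustration only: the endpoint URL, the `client_id` parameter name, and both function names are hypothetical, and a real match service would read its own cookie from the HTTP request rather than take it as an argument.

```python
from urllib.parse import urlencode, urlparse, parse_qs

MATCH_ENDPOINT = "https://match.example.com/sync"  # hypothetical URL


def build_sync_url(client_cookie_id):
    """URL the client's tag would request to piggyback the match service."""
    return MATCH_ENDPOINT + "?" + urlencode({"client_id": client_cookie_id})


def handle_sync(url, match_cookie_id, sync_table):
    """Match service side: pair the client ID with its own cookie, if any."""
    params = parse_qs(urlparse(url).query)
    client_id = params["client_id"][0]
    if match_cookie_id is None:
        # Simplification: the user has no match service cookie yet,
        # so there is nothing to pair the client ID with.
        return None
    sync_table[match_cookie_id] = client_id
    return client_id


sync_table = {}
url = build_sync_url("client-789")
handle_sync(url, "match-abc", sync_table)
print(url)
print(sync_table)
```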

At the same time, the client sends the match service a file of its offline database records. Now the match service can look for users whose PII in an offline record matches the PII it holds from the data provider. When it finds a match, it already knows both its own cookie ID for that user and the client’s cookie ID. Over time, as more and more users call these tags, the data builds up and the match is complete. Importantly, to satisfy privacy regulations, the PII known by the match service is typically stripped off before the results are sent to the client. The client can know which users were matched in aggregate, but usually cannot know which cookie ID is which user. So the match is not anonymized, but the results are. Unfortunately, the overlap between sources tends to be small, typically 30 – 40% of the total offline records on a good day. It depends on how often users delete their cookies and how often they visit the site. Match services typically contract with more than one data provider and match against the aggregate records, but even then the results are often not ideal. Cookie deletion also makes the match a continuous process, since users who erase their online identifiers have to be re-synced.
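
As a rough sketch of the offline match itself, the Python below joins a client’s offline file to client cookie IDs through the match service’s two lookups, then returns results with the PII left behind. Everything here is assumed for illustration: the records, the ID formats, and in particular the choice to normalize and hash the email before joining, which the process described above does not specify.

```python
import hashlib


def email_key(email):
    """Normalize and hash an email so both sides join on the same key."""
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()


# From the data provider: hashed email -> match service cookie ID.
provider_index = {email_key("bob@example.com"): "match-abc"}

# From the cookie sync: match service cookie ID -> client cookie ID.
sync_table = {"match-abc": "client-789"}

# The client's offline file (hypothetical records).
offline_records = [
    {"email": "Bob@Example.com ", "segment": "holiday-big-spender"},
    {"email": "ann@example.com", "segment": "lapsed"},
]


def match_offline(records, provider_idx, sync):
    """Return client cookie ID -> segment; no PII appears in the output."""
    matched = {}
    for rec in records:
        match_id = provider_idx.get(email_key(rec["email"]))
        client_id = sync.get(match_id)
        if client_id is not None:
            matched[client_id] = rec["segment"]
    return matched


matched = match_offline(offline_records, provider_index, sync_table)
match_rate = len(matched) / len(offline_records)
print(matched, match_rate)
```

A record drops out if either lookup fails, that is, if the data provider never saw that person or the user never hit the client’s sync tag, which is one way to see why overlap stays well below 100%.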

Still, the results can be powerful – Nielsen was one of the first companies to bring their offline data online, syncing PRIZM segments to cookies as far back as 2009. Polk, Experian, and other premier offline data companies have followed, and as DMPs become more commonplace among marketers with deep data sets, targeting will only improve. If you are interested in match services, look into LiveRamp, DataLogix, Datran, TargusInfo, or Acxiom for more details.



  1. Great post explaining match services! Readers would probably benefit from a visual diagram for this process similar to how you’ve done with past articles. Also, what is your take on how device fingerprinting/identification plays into the current advertising technology landscape?

  2. Hi Daehee,

    Thanks – I’ll have to work on a visual diagram of the process, I’m inclined to agree with you that it would help make sense of this process.

    In terms of fingerprinting, I think short-term this is probably more of an opportunity on the mobile side of the business, assuming 3rd party cookies continue to be a dead end on the iOS platform. Currently I don’t think there is any significant business transacting based on fingerprint-based targeting, but if anyone out there disagrees and knows of a company doing that, I’d love to know more. That said, I think there’s a real possibility that fingerprinting could replace the cookie if the FCC enacts a true opt-in requirement for cookie-ing users online, or the so-called “Do Not Track” legislation that everyone in digital advertising fears.

    I think these two articles present an interesting perspective on the matter as well:

    The Journal makes a good point that fingerprint IDs are much, much harder to ‘delete’ from a user perspective, which is why they have potential appeal to tracking companies. I think the cookie deletion rate is quite a big problem for real time bidding technologies, but it could be a tough transition to move from a simple cookie ID to a fingerprint ID for those systems and still transact within a 50ms timeframe. That said, if cookie deletion rates stay at a 30 – 50% per month halflife, the match rates could make the case to transition, particularly as advertisers demand more granular and algorithmic targeting methodologies and therefore, reach smaller and smaller groups of users. The ClickZ article also makes a great case for the opportunity around cross-screen identification, and that’s where I think fingerprinting could really be a game changer. In my experience, advertisers still don’t have a sophisticated cross-media approach to most messages, and think in terms of one medium at a time, instead of all media, collectively. They might run the same type of creative, or the same campaign across multiple media, but they don’t think in terms of conditional probability and what the right statistical mix of reach and frequency across all media should be, or if they do consider it, how they would execute that strategy. Being able to control the frequency of a message per user across all media is very interesting, and as more traditional media shifts to digital technologies where you can literally target at a user-level, I think there is a massive opportunity.

    Only time will tell, but I think we’re still in the innovation / prototype phase here – I’d be curious to know what your perspective is, though – do you have any resources you can share on this technology?


  3. Dear Ben:

    Can you give three or four real-life examples of using the DMP, from the perspective of the client (different divisions in the organization, e.g. the analytics department vs. the media buying group)? I am trying to explain this to someone and thought it would help if I used examples in my explanation.


  4. Hi Rita,

    Sure – how about the following?

    1. For the analytics department, you could track and segment all users who went to the Finance section of your website. Then, using 3rd party data sources, you could determine the demographic and psychographic profile of those users. You could further segment Finance users into those that view 10+ pages, or come back every day, and see how they differ from the broad audience. What is it about those people that attracts them to the Finance section, where else do they go on your site, what other audiences are they in? Perhaps the Finance audience is highly likely to read about Politics or Golf.

    2. For the media buying group, you could use your DMP to cookie every user exposed to a media campaign, and track who converts and who doesn’t and build a similar profile of converting users with the methodology above. You might combine the data you have on users with a dynamic creative optimization tool to construct ads on the fly, and personalize your messaging to women versus men. You could simply use the data in the DMP to target only women with your next campaign, or Women age 30 – 50 who make $75K+ a year and also have two kids. You can get extremely granular in who you target, and use the DMP to understand what’s working and what isn’t.

    3. Finally, let’s say you own a catalog company, and you want to advertise to people who bought more than $500 worth of goods from you last holiday season – you can use a DMP to identify those shoppers online with a match service and market to them for next season.

    Hope that helps –


  5. Ben —

    Do you know of any brands actively buying the matching services you talk about above? Seems pretty advanced for the industry at large.


  6. Hi Pete,

    It’s difficult information to come by – what’s more common is brands taking advantage of data companies using these matching services. There are lots of examples of companies like Targus, Experian, Acxiom, Nielsen, and others that are bringing offline data like demographic signals, purchase histories, and other behavioral elements online. I do know of a few brands in the retailing and CPG space specifically that are moving their data online, or have in the past, but I’m not at liberty to call them out by name. Certainly big companies with significant online and offline presence are actively connecting transaction data in CRM systems, regardless of whether the customer purchased online or in a brick and mortar location; I’d say that is business as usual at this point. But you’re right in saying that it isn’t yet standard practice to aggressively move offline data online when the marketer can’t connect the dots on their own and needs a matching service instead. My sense is that because of the expense, marketers are fine to rely on data companies to provide insights on targeting, or optimization algorithms to drive performance, and don’t need to make the leap between the offline and online channels to move their business forward.


  7. Hi PM,

    A bridge provider likely refers to a company that can sync a user’s record from an offline system to an online system. Typically that requires a company that understands the personally identifiable information (PII) on a cookie. PII includes things like first name, last name, home of record, phone number, things of that nature. The bridge provider (more commonly called a match provider) can then use that PII to link a record in an offline system like a CRM database to their cookie IDs. It’s the process outlined in this post. So you might know that record 123 in the CRM system is Bob Smith in zip code 10001, and that Bob Smith of zip code 10001 is also cookie ID abc in the match provider’s system, therefore record 123 = cookie ID abc. Once that relationship is established, a customer could move any data attributes on Bob Smith from their CRM system to a database the cookie can understand, and then potentially deploy their CRM data in other online systems through cookie synced relationships with other online systems, for example, a DSP or Ad Exchange.

    Companies like Acxiom, Experian, TransUnion, DataLogix, RapLeaf, and others can provide these services. In some cases, large publishers with significant populations of registered users can also serve as their own match partners, assuming they have enough PII.

    Hope that helps –


  8. Ben, a common misperception among clients that want to take advantage of these types of services concerns individual match rate (as opposed to cookie match rate). You mentioned “30-40 % on a good day” in this article. I assume that means cookies... any sense of what that equates to in terms of individuals?

  9. Hi RB,

    Hard to say, but I’d guess perhaps half that figure. It depends somewhat on the age of the cookie pool you are matching an individual to, and truth be told the match rate can be misleading and not terribly useful as a real metric. For example, many data providers will try to use a cookie pool with lots of old cookies in order to maximize their match rate. But, if a cookie hasn’t been seen in 90 days, it doesn’t really matter if you can match a user to that cookie or not, because that cookie is dead; deleted long ago and certainly not reachable any longer with an ad. In that way the data provider might be able to brag about a high match rate and point to your DSP as the reason you can’t deliver many ads to your target when the real reason is that a large portion of your matches were to cookies that were deleted long ago.

    Because of this, you’d be smart to compel your data provider to limit your cookie match process to cookies that were seen in the last 30 days if not more recently. Your match rate will drop, but what remains will be more representative of what’s actually reachable.
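
    A quick way to see the effect of that recency limit is a sketch like this one in Python. The data, the `last_seen` field, and the 30-day default are illustrative assumptions, not any provider’s actual reporting format.

```python
from datetime import date, timedelta


def reachable_matches(matches, today, max_age_days=30):
    """Keep only matches whose cookie was seen within max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    return [m for m in matches if m["last_seen"] >= cutoff]


today = date(2012, 6, 30)  # arbitrary reference date for the example
matches = [
    {"cookie_id": "c1", "last_seen": date(2012, 6, 25)},
    {"cookie_id": "c2", "last_seen": date(2012, 3, 1)},
    {"cookie_id": "c3", "last_seen": date(2012, 6, 1)},
    {"cookie_id": "c4", "last_seen": date(2012, 1, 15)},
]
offline_total = 10  # hypothetical number of offline records submitted

fresh = reachable_matches(matches, today)
raw_rate = len(matches) / offline_total
reachable_rate = len(fresh) / offline_total
print(raw_rate, reachable_rate)
```

    In this made-up pool, half of the matched cookies are stale, so the headline match rate overstates what can actually be reached with an ad.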

    Hope that helps!
