Data Management Part III: Syncing Online Data to a Data Management Platform

To get the full value out of a relationship with a data management platform, you want to provide the platform with as much data as possible.  That said, the low-hanging fruit in any organization is integrating the 1st-party data for which you already have a cookie into the DMP.  The mechanism to accomplish this is a standard cookie sync, which passes a user ID from one system to another via a query string appended to a pixel call, ideally followed by a server-to-server integration.

Practically speaking, this means that when a user hits your site and calls your site analytics tag, either independently or through a container tag, that site analytics tag redirects the user to the DMP and simultaneously passes the site analytics user ID to the DMP.  When the DMP receives that call, it cookies the same user and records the site analytics user ID.  Now the DMP knows how to associate data from the site analytics tool with its own cookie ID.  The beauty of this system is that only the user IDs need to be synced at this point; the actual data the site analytics tool records can be passed to the DMP later, without slowing down the user experience on site.  Now imagine replicating this process with all 3rd-party tools, syncing every system into the DMP.
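The sync described above can be sketched in a few lines of Python. Everything here is illustrative: the endpoint, parameter names, and the in-memory `id_map` are hypothetical stand-ins for whatever a real DMP uses.

```python
from urllib.parse import urlencode, urlparse, parse_qs
import uuid

# Hypothetical DMP sync endpoint and parameter names, for illustration only.
DMP_SYNC_URL = "https://dmp.example.com/sync"

def build_sync_pixel(analytics_uid):
    """The site analytics tag redirects the browser to the DMP,
    passing its own user ID in the query string."""
    return DMP_SYNC_URL + "?" + urlencode(
        {"partner": "site_analytics", "uid": analytics_uid})

# The DMP side: DMP cookie ID -> {partner system: that system's user ID}
id_map = {}

def handle_sync(pixel_url, dmp_cookie_id=None):
    """On receiving the pixel call, the DMP cookies the user (if it hasn't
    already) and records the partner's user ID against its own cookie ID."""
    qs = parse_qs(urlparse(pixel_url).query)
    dmp_cookie_id = dmp_cookie_id or uuid.uuid4().hex  # no cookie yet: set one
    id_map.setdefault(dmp_cookie_id, {})[qs["partner"][0]] = qs["uid"][0]
    return dmp_cookie_id

cookie = handle_sync(build_sync_pixel("GA-123"))
# id_map[cookie] now holds {"site_analytics": "GA-123"}
```

Note that no behavioral data moves here at all: only the two IDs are joined, which is why the sync itself is so lightweight.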

In the example below, a client is integrating their analytics system, order management system, and email system with a DMP in order to sell some of their data on a data exchange and remarket on the exchange through a DSP. For the sake of clarity, this particular setup shows how the process works when the DMP client uses a container tag to control which tags serve on the page. Using a container tag or other tag management solution is considered a best practice: among other benefits, it eases the implementation of new tags and controls tag frequency.

When the user (smiley face) lands on the site, they request the page HTML from the content server (1), which responds with the page content as well as the DMP’s container tag (2).  While the rest of the page loads, the container tag forces the browser to call the DMP (3), which responds with its own cookie for the user and a batch of redirects to each of the integration points (4).  Those are essentially pixel requests so each system can drop a cookie on the user.  The user makes those requests in parallel (5) and receives each cookie, along with a callback to the DMP (6).  That callback carries each system’s unique ID for the user in the URL (ideally encrypted), which the DMP receives and stores (7).  Now the DMP has its own cookie ID (from step 4) synced with each 3rd-party system.
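Steps 4 through 7 can be sketched as a fan-out of redirect URLs. The partner endpoints, parameter names, and the `${UID}` macro (which the partner would substitute with its own user ID before firing the callback) are all hypothetical:

```python
from urllib.parse import urlencode

# Hypothetical partner sync endpoints, one per integration point.
PARTNERS = {
    "analytics": "https://analytics.example.com/sync",
    "oms": "https://oms.example.com/sync",
    "email": "https://email.example.com/sync",
}

def batch_redirects(dmp_cookie_id):
    """Step 4: the DMP answers the container-tag call with one redirect per
    integration point, each carrying a callback URL (step 6) so the partner
    can return its own user ID to the DMP (step 7)."""
    redirects = []
    for name, base in PARTNERS.items():
        callback = ("https://dmp.example.com/cb?"
                    + urlencode({"system": name, "dmp_id": dmp_cookie_id})
                    + "&uid=${UID}")  # macro the partner fills in
        redirects.append(base + "?" + urlencode({"redirect": callback}))
    return redirects

urls = batch_redirects("dmp-abc")  # three pixel requests fired in parallel
```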

Now, the DMP can pull data from each system (8) and populate that data against its own cookie ID.  The client can do any audience profiling and segmentation in the DMP, create the right cookie pool and move that cookie pool to other systems on demand, using server to server API connections.
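Step 8 can be sketched as an out-of-band merge: the DMP uses its ID mappings to translate each partner's user ID back to its own cookie ID, then builds profiles it can segment. The record layout and attribute names below are invented for illustration:

```python
def pull_partner_data(partner_records, id_map):
    """Step 8 (server-to-server): translate each partner's user ID back to
    the DMP's own cookie ID and merge the attributes into one profile."""
    profiles = {}
    for dmp_id, partner_ids in id_map.items():
        profile = profiles.setdefault(dmp_id, {})
        for partner, uid in partner_ids.items():
            profile.update(partner_records.get(partner, {}).get(uid, {}))
    return profiles

# Illustrative data, keyed by each system's own user ID
records = {
    "site_analytics": {"GA-123": {"pages_viewed": 14}},
    "oms": {"ORD-9": {"last_order_value": 120.0}},
}
id_map = {"dmp-abc": {"site_analytics": "GA-123", "oms": "ORD-9"}}
profiles = pull_partner_data(records, id_map)

# Segmentation then happens against the DMP's own cookie IDs:
high_value = [d for d, p in profiles.items()
              if p.get("last_order_value", 0) > 100]
```

The resulting cookie pool (`high_value` here) is what would be pushed to other systems on demand over the server-to-server API connections.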

Server-to-server integrations are important because they address the issue of data loss between systems. The alternative is a pixel-to-pixel sync, which requires one system to pass all relevant data in the redirect to the other, usually as some type of key-value string appended to the redirect URL.  This added data makes the call heavier and slows down both the user experience and the syncing process, and it caps the amount of data you could ever pass between systems.  Slower always means discrepancies, and discrepancies mean data loss.  In fact, many people in the industry quote up to 30% loss between systems with a pixel-to-pixel integration, which is enormous.  Imagine reducing your data’s scale by a third just to move it from one system to another! In short, if a DMP can’t facilitate server-to-server integrations, it probably isn’t worthy of the label.
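The URL-length problem is easy to see concretely. In the sketch below (hypothetical parameter names), even a modest segment list inflates the redirect well past what some browsers and servers handle reliably:

```python
from urllib.parse import urlencode

# In a pixel-to-pixel sync, every attribute rides on the redirect URL itself.
attrs = {
    "uid": "GA-123",
    "segments": ",".join("seg%03d" % i for i in range(50)),
    "recency": "7",
    "geo": "US-CA",
}
redirect = "https://partner.example.com/px?" + urlencode(attrs)

# The URL grows with every attribute; practical URL-length limits (a few KB)
# cap how much can ever be moved this way, and heavier calls mean more
# timeouts and dropped syncs. A server-to-server transfer has no such cap.
```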

Read More: Syncing Offline Data to Your DMP

14 comments

  1. Great series of articles!
    How would you sync your ad server data (e.g. DFA, MediaMind, etc.)? If you want to know which audiences were exposed to your display campaigns, for example, do you need to fire your DMP sync pixel on each impression? If yes, are DMP platforms already doing it?

  2. Hi Matthieu,

    Thanks – what you’re describing is fairly complex because you’re asking for campaign level overlaps, not just overall site overlaps. It’s a lot of extra data to store, and not many systems are around today to give you similar information on a contextual basis. For example, you wouldn’t be able to know from your ad server where a run of site ad actually ran on a channel by channel level, so it’s a similar request. Probably the best way to get there would be to create a unique pixel from your DMP, and then implement that as a 4th party call on the ad tag itself. Then, you could build an audience from that particular tag’s pixel calls, and look at the overlap and affinities between it and other audiences. You might call it a hack approach, but I think that would work, and yes, you would need to fire the pixel on every impression call. I can’t think of a better way to do it without a more complex technical integration.
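    The hack approach described above amounts to a pixel template with ad server macros. The macro syntax below is hypothetical; each ad server defines its own macros that are substituted at serve time:

```python
# Hypothetical macro syntax and parameter names, for illustration.
PIXEL_TEMPLATE = ("https://dmp.example.com/px"
                  "?cmp=%%CAMPAIGN_ID%%&site=%%SITE_ID%%&crt=%%CREATIVE_ID%%")

def expand_macros(template, values):
    """Simulate what the ad server does when the tag fires on an impression."""
    for key, value in values.items():
        template = template.replace("%%" + key + "%%", value)
    return template

url = expand_macros(PIXEL_TEMPLATE,
                    {"CAMPAIGN_ID": "123", "SITE_ID": "45", "CREATIVE_ID": "9"})
# -> "https://dmp.example.com/px?cmp=123&site=45&crt=9"
```

    The DMP can then build an audience from calls to this pixel and profile it against other audiences.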

    Otherwise you likely need a deeper integration between an analytics platform like a Yieldex and your DMP, but I would suspect very few if any people have pushed that far just yet.

    Best of luck – if you have success I’d be curious to know more about your approach and how you did it!

    Ben

  3. Hello Ben,

    Thanks a lot for your answer, very interesting (I hadn’t thought about it, but indeed, it’s similar to a channel-by-channel level). An option could indeed be to fire the DMP pixel on every impression (with, for example, campaign, site & creative ad server macros).
    Then, for each campaign/site/creative, we would have the audiences who were exposed/clicked/converted.

    That would be a massive amount of data to consolidate (and analyse) for the DMP, and I was wondering if some were already doing it. I saw that BlueKai bought TrackSimple, and looking at their website (http://www.bluekai.com/platform/analytics.php), they already import data from DFA, MediaMind (and even Google Analytics). But I am not sure whether it’s at the cookie level (e.g. whether they know which audiences were exposed to the campaigns) or only stats imports from the ad server API (in which case, I don’t understand the added value of BlueKai + TrackSimple). Maybe it’s an intermediate scenario: TrackSimple would get the origin of the visit (by adding URL tracking parameters to the campaigns’ destination URLs).

    In any case, I am interested if you have more information !

    Matthieu

  4. Matthieu, great question!
    Ben, great suggestions!

    It seems, as Ben points out, that you would have to fire off a pixel at every impression/click/conversion event, and indeed that would generate a lot of data.

    I actually know of one company doing that already; we are already using them. It’s a startup that gives you a tag trafficked for every campaign (at the creative level); when it fires, you get data on URL, geo, date, campaign, creative, etc., plus any other data you want to append via the pixel (even from reading the cookie).

  5. Interesting – Jalal – can you share the name of the company?

    I’m glad Jalal commented, because Matthieu, I actually checked into this for my own purposes after you sparked the idea and found out from some of the DMPs that this is precisely the approach they recommend, and it works in practice. But yes, you have to fire a pixel call from every ad call, or whatever you want to see the audience attributes against.

    Regards,

    Ben

  6. Hi guys,

    Thanks for all your comments, it’s exactly what I’m looking at too.

    My question is: once you have all of this segmented campaign cookie data in the DMP from ad server impressions, clicks, and conversions, how would you then release it into a DCO platform to start influencing your display creative messaging? We’d typically be using MediaMind or DoubleClick dynamic ads, but I can’t see a way for those DCOs to recognize the DMP cookies, since it would be outside the MM or DFA domain.

    Any thoughts would be much appreciated!

    Regards

  7. Hi Tom,

    You face a common problem. Basically, you’ll need some kind of integration between systems, and the exact solution should really be a conversation between your DMP and DCO suppliers. A quick-and-dirty solution would be to include a script in your DCO tag that calls the DMP when the DCO tag fires, reads the user’s cookie, and writes any audience values back to the DCO as key-values it can use to compile the creative; this is a lot like how a publisher DMP to 3rd-party ad server integration would work. Or you might go the direction of a server-to-server integration between the DMP and DCO, so the DCO effectively has a local copy of the DMP’s database in its own system. You might have to work through some race-condition issues there, which could add latency and discrepancies to your tags, but it’s hard to say before you test it out. Specific to DoubleClick, many publishers make use of the Boomerang cookie functionality to push DMP audience values into a cookie the ad server can read, so you might look down that path.
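    A minimal sketch of the quick-and-dirty path: look up the user's DMP segments and format them as key-values for the creative. The segment store, cookie ID, and key-value syntax are all hypothetical; a real integration would be defined between the DMP and DCO vendors:

```python
# Hypothetical DMP segment store, keyed by DMP cookie ID.
DMP_SEGMENTS = {"dmp-abc": ["auto_intender", "ca_resident"]}

def dco_keyvalues(dmp_cookie_id):
    """Sketch of the DCO-tag script's job: read the user's DMP segments and
    format them as key-values the DCO can use to assemble the creative."""
    segments = DMP_SEGMENTS.get(dmp_cookie_id, [])
    return ";".join("aud=" + s for s in segments)

kv = dco_keyvalues("dmp-abc")
# -> "aud=auto_intender;aud=ca_resident"
```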

    I’d be curious if you find something that works – it would be great if you could report back any progress to this thread, as I’m sure lots of people would like to know more, including myself! I’ll probably take on this challenge myself in Q3, so if you haven’t worked something out by then let’s reconnect and maybe I’ll have better advice for you.

    Regards,

    Ben

  8. Awesome read. Thanks for posting. Now I truly understand the DMP’s functioning and its value proposition to advertisers.

  9. Hi Ben,

    A lot of great articles; the knowledge you share is pretty amazing. When you mention the issue of data loss, is this just due to latency and network issues? Would a dynamic CDN solution reduce the data loss or benefit a DMP in any way?

  10. Hi Wade,

    Yes, data loss is primarily related to network latency, and dynamic CDNs would undeniably help, but I think they’re an expensive solution to the problem. It doesn’t really make sense to try to make an inefficient process faster when a more elegant and cheaper solution is readily available. For example, if you ran a highway with a traffic problem, the solution wouldn’t be to raise the speed limit; it would be to add more lanes. Or, with a computer, it usually makes more sense to add RAM than to buy a faster CPU. The same is true here: faster isn’t the ideal solution; more bandwidth is.

    To elaborate, the data loss problem I was describing mostly concerns a platform that has to query other services in band, i.e., get a response from them before it can respond to the ad server. For example, if your DMP, in order to load a data provider’s segment, has to read its cookie ID, query the data provider with that ID, and then wait for the provider to look up its IDs and return the relevant segments before the DMP can reply to you, that’s pretty inefficient. The better solution is a server-to-server sync, where the DMP syncs its ID with the data provider in band, but moves all the segment data from the provider to its own system out of band, so the provider’s data now sits in the DMP’s own platform. In band basically means using the user’s browser to pass data back and forth; out of band means the DMP’s server communicates with the data provider on its own.
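    The contrast can be sketched as two lookup functions (all names hypothetical). In band, the provider round-trip sits on the critical path of the ad call; out of band, the segment data was already moved server-to-server, so the ad-time read is local:

```python
def in_band_lookup(dmp_id, query_provider):
    """In band: the DMP must ask the data provider mid-call and wait for
    the answer - a network round-trip on every request."""
    return query_provider(dmp_id)

# Segment data pre-synced out of band, now sitting in the DMP's own store.
LOCAL_SEGMENTS = {"dmp-abc": ["in_market_auto"]}

def out_of_band_lookup(dmp_id):
    """Out of band: the DMP answers from its own store; the provider
    transfer happened earlier, off the critical path."""
    return LOCAL_SEGMENTS.get(dmp_id, [])
```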

    Hope that helps!

    Ben
