Lookalike Modeling Your Ad Ops Team Can Build With a DMP

Digital publishers and advertisers that have access to a Data Management Platform (DMP) can bootstrap their own lookalike modeling capabilities with some simple index-based approaches.  That is to say, if you know the total population of users in every segment and, for any specific target segment, how many users of every other segment overlap with it, you can build a fast and easily understood audience model with a little legwork. It’s not the rocket-science approach of a regression model or black-box algorithm, but it works, and it’s pretty easy for people without a degree in data science to execute once you figure out how to get the right data out of your system.

How to Do Lookalike Modeling Yourself

The first step in building a lookalike segment is to define what you are trying to model, that is, which audience you want more of.  This will be your ‘target’. For our example here, let’s consider the following audiences:

Segment | Qualified Users | % of Total
Women | 20,000 | 20%
Pet Owners | 5,000 | 5%
Coffee Drinkers | 8,000 | 8%
Outdoor Enthusiasts | 9,000 | 9%
Total Users | 100,000 | 100%

Let’s say we’re trying to reach women.  Unfortunately, we can identify only 20,000 of them out of a total population of 100,000.  Now let’s assume that our content isn’t skewed toward one gender or the other, so there are clearly some users among the other 80,000 that we can expect to be female.  What we need is a signal within that group that tells us which other audiences are likely to be female.
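To make the index idea concrete, here is a minimal sketch in Python. The segment sizes come from the table above, but the overlap counts are made up purely for illustration, and the formula (a segment's share of the target divided by its share of the total population, times 100) is one common way to define an index, not necessarily the exact calculation your DMP reports.

```python
# Segment sizes are taken from the table above.
TOTAL_USERS = 100_000
TARGET_SIZE = 20_000  # users we can identify as Women (the target)

segment_sizes = {
    "Pet Owners": 5_000,
    "Coffee Drinkers": 8_000,
    "Outdoor Enthusiasts": 9_000,
}

# HYPOTHETICAL overlap counts: how many users in each segment are also
# in the target segment. A real DMP overlap report would supply these.
overlap_with_target = {
    "Pet Owners": 1_500,
    "Coffee Drinkers": 1_600,
    "Outdoor Enthusiasts": 1_350,
}


def audience_index(overlap, target_size, segment_size, total_users):
    """Index of 100 = the segment is as common in the target as in the
    whole population; above 100 = over-represented in the target."""
    share_in_target = overlap / target_size
    share_in_population = segment_size / total_users
    return 100 * share_in_target / share_in_population


indexes = {
    name: audience_index(overlap_with_target[name], TARGET_SIZE, size, TOTAL_USERS)
    for name, size in segment_sizes.items()
}

# Highest-indexing segments are the best lookalike candidates.
ranked = sorted(indexes.items(), key=lambda item: item[1], reverse=True)
for name, index in ranked:
    print(f"{name}: {index:.0f}")
```

With these made-up overlaps, Pet Owners index at about 150, Coffee Drinkers at about 100, and Outdoor Enthusiasts at about 75, so Pet Owners would be the first audience to add to a lookalike segment.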

How Ad Serving Works – Mobile vs. Web Environments

The most popular article on this blog is one of the very first ones I ever wrote – How Does Ad Serving Work. What I probably should have titled it though was How Does Ad Serving Work on the Web, because there are a few important differences when you’re talking about the mobile ecosystem.

Server Redirects vs. Client Redirects

For the most part, it comes down to the interaction between a client and a server. In desktop environments, the user’s browser, or the client in technical-speak, does most of the work fetching and redirecting information, which is ideal for lots of reasons. For one, redirecting the client gives each platform in the ecosystem the ability to drop or read a cookie, which helps with downstream conversion tracking, frequency measurement, and audience profiling. For another, it facilitates client-side tracking of key metrics like clicks and impressions for billing purposes. Client-side tracking is the preferred methodology for advertisers because it measures requests from a user instead of from a server, and is therefore a more accurate measure of what a user actually saw.  This process requires more work from the browser, but that’s OK because high-speed connections and unlimited data usage are pretty much the norm these days for home and office connections.

[Figure: Desktop Ad Serving Sequence]

In mobile environments, though, connection speeds really matter. Many users are on slow enough connections that if the browser or app were responsible for fetching the ad the way it is on desktop, the user would likely abandon the page before the ad finished loading. Because of that, you often see more of the work being done in the cloud for mobile ad serving, independent of the client. So instead of the browser calling a server and then being redirected to another server, the browser tends to call one server, which then calls the other servers. Those servers can talk to each other over ultra-fast fiber-optic landlines instead of the cellular network.
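A rough back-of-the-envelope sketch shows why this matters. The hop count and round-trip times below are hypothetical numbers chosen only to illustrate the difference, and the model ignores everything except network round trips.

```python
def client_redirect_latency_ms(hops, client_round_trip_ms):
    """Client-side redirects: every hop is a full round trip over the
    user's own connection."""
    return hops * client_round_trip_ms


def server_side_latency_ms(hops, client_round_trip_ms, server_round_trip_ms):
    """Server-side calls: one round trip over the user's connection,
    then the remaining hops happen server to server."""
    return client_round_trip_ms + (hops - 1) * server_round_trip_ms


# Hypothetical values: 3 ad systems in the chain, 300 ms per round trip on
# a slow cellular link, 20 ms per round trip between data centers.
HOPS = 3
CELLULAR_RTT_MS = 300
DATACENTER_RTT_MS = 20

print(client_redirect_latency_ms(HOPS, CELLULAR_RTT_MS))                    # 900
print(server_side_latency_ms(HOPS, CELLULAR_RTT_MS, DATACENTER_RTT_MS))     # 340
```

The trade-off is the one described above: the server-side chain is faster on a slow connection, but the intermediate systems never talk to the client directly.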

What is Holistic Ad Serving?

Certainly one of the biggest opportunities in ad tech today is integrating real-time bidding (RTB) systems with core ad serving platforms such that ad serving decisions are made from a single system. This vision of a fully integrated monetization stack is known as holistic ad serving, and it’s going to be big.

Holistic ad serving consolidates what is today a fragmented marketplace, modernizes the publisher ad serving stack, and lays the groundwork for advertisers and publishers to transact guaranteed campaigns over RTB infrastructure.  In other words, it provides a way for publishers to transition from a world of manual campaign implementations to accepting and trafficking campaigns programmatically without having to manage the balance between two systems.

Tactically, holistic ad serving seems like a basic change. Instead of filling direct campaigns first and then letting the exchange try to fill whatever is left, the idea is for publishers to call the exchange marketplace and get a bid for every single impression, thereby allowing RTB demand to compete directly with traditionally sold campaigns that have guaranteed goals.  By at least getting a bid for every impression, the publisher’s ad server can understand the benefit or cost of filling an impression with a direct campaign, because it has all the information.  Holistic ad serving also opens the possibility, on an impression-by-impression basis, for an RTB campaign to trump a direct campaign.
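The difference between the two approaches can be sketched in a few lines of Python. This is an illustration only: the `DirectCampaign` fields, the `behind_schedule` pacing flag, and the rule that RTB wins whenever it outbids an on-pace direct campaign are all assumptions made for the example, not how any particular ad server decides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DirectCampaign:
    name: str
    cpm: float             # contracted price per thousand impressions
    behind_schedule: bool  # hypothetical pacing flag for the guaranteed goal


def waterfall_decision(direct: Optional[DirectCampaign],
                       rtb_bid_cpm: Optional[float]) -> str:
    """Traditional setup: direct fills first, the exchange gets the rest."""
    if direct is not None:
        return "direct"
    return "rtb" if rtb_bid_cpm is not None else "unfilled"


def holistic_decision(direct: Optional[DirectCampaign],
                      rtb_bid_cpm: Optional[float]) -> str:
    """Holistic setup: a bid is requested for every impression and competes
    with the direct campaign (assumed rule: protect campaigns that are
    behind on delivery, otherwise the higher price wins)."""
    if direct is None:
        return "rtb" if rtb_bid_cpm is not None else "unfilled"
    if rtb_bid_cpm is None or direct.behind_schedule:
        return "direct"
    return "rtb" if rtb_bid_cpm > direct.cpm else "direct"


on_pace = DirectCampaign(name="Spring Sale", cpm=5.00, behind_schedule=False)
print(waterfall_decision(on_pace, 8.00))  # direct
print(holistic_decision(on_pace, 8.00))   # rtb
```

In the waterfall case the $8.00 bid is never even considered; in the holistic case the ad server can see that serving the direct campaign on this impression would cost $3.00 CPM in forgone revenue.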

The Future of Geotargeting is Hyperlocal

This is the fourth article in a four-part series on Geotargeting. Read parts one, two, and three.

So-called hyperlocal geotargeting, particularly on mobile platforms, is the real promise of geotargeting in the future.  Hyperlocal is far more granular than just a zip code; it’s as specific as your exact location, within a 10 meter radius.  If you own a smartphone, chances are you’ve already taken advantage of these systems to find a nearby restaurant, get directions while lost, or figure out the best mass transit route from one place to another.  From a mobile perspective, many services and apps depend on hyper-accuracy to work correctly, but the information also gives the advertising community huge potential to innovate.  For example, a company might run a campaign that serves a unique offer to someone who is within a certain distance of one of its stores.  While likely not all that scalable, it might be particularly appealing for local, brick and mortar businesses.
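As a sketch of that store-radius example, the check below uses the haversine formula to measure the distance between a device and each store. The store names, coordinates, and 500 meter radius are hypothetical, and a real campaign would rely on whatever targeting the ad server itself exposes.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def stores_in_range(user_lat, user_lon, stores, radius_m):
    """Names of stores within radius_m of the user's reported location."""
    return [
        name
        for name, (lat, lon) in stores.items()
        if haversine_m(user_lat, user_lon, lat, lon) <= radius_m
    ]


# Hypothetical store locations (latitude, longitude).
stores = {
    "Store A": (40.7580, -73.9855),
    "Store B": (40.6892, -74.0445),
}

# A device a short walk from Store A qualifies for that store's offer only.
print(stores_in_range(40.7585, -73.9850, stores, radius_m=500))
```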

Hyperlocal Geotargeting Via GPS

Technically speaking, hyperlocal is also likely to be far more reliable than traditional geotargeting on the desktop because, unlike the desktop, IP address won’t be the mechanism anymore; the device signal itself will.  What does that mean exactly?  In some cases, geotargeting will leverage a device’s GPS receiver in concert with a customized table of coordinate ranges to identify targetable impressions.  Up until a few years ago, using GPS signals to deliver advertising would have been all but impossible due to the significant latency of the so-called time to first fix (TTFF), which can run up to 30 seconds. TTFF is the time it takes for the receiver to learn the position of the GPS satellite constellation (the physical location of the GPS satellites in orbit above the earth), and it is a result of how often the GPS satellites broadcast a ping.  While generally reliable, 30 seconds is an eternity to ad delivery systems, and hardly a realistic way to deliver a timely message.
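One simple way to picture a table of coordinate ranges is as a list of latitude/longitude bounding boxes, each mapped to a targeting label. The sketch below assumes that representation; the `GeoRange` type, the labels, and the coordinates are all hypothetical.

```python
from typing import List, NamedTuple, Optional


class GeoRange(NamedTuple):
    label: str
    min_lat: float
    max_lat: float
    min_lon: float
    max_lon: float


def match_range(lat: float, lon: float, table: List[GeoRange]) -> Optional[str]:
    """Return the label of the first range containing the point, else None."""
    for row in table:
        if row.min_lat <= lat <= row.max_lat and row.min_lon <= lon <= row.max_lon:
            return row.label
    return None


# Hypothetical targeting table: two small boxes around store locations.
targeting_table = [
    GeoRange("midtown_store", 40.7530, 40.7630, -73.9920, -73.9790),
    GeoRange("harbor_store", 40.6850, 40.6930, -74.0500, -74.0390),
]

print(match_range(40.7585, -73.9850, targeting_table))  # midtown_store
print(match_range(41.0000, -73.0000, targeting_table))  # None
```

A linear scan is fine for a handful of ranges; a large table would want a spatial index, which is beyond this sketch.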

Today however, TTFF is usually only required for non-cellular devices, like standalone GPS systems. For things like smartphones, the GPS coordinates are determined by a process known as ‘assisted GPS’, which speeds up geolocation by referencing a saved copy of the satellite constellation locations known as an almanac. The almanac details the exact locations of every GPS satellite in orbit at regular intervals, as well as the health of the signal. Every day, the cell towers download a fresh copy of the almanac, so instead of needing to acquire a first fix, your smartphone can simply rely on the cell towers to acquire its GPS coordinates in no time at all.

Hyperlocal Geotargeting via Triangulation

In addition to GPS, one concept gaining traction is the notion of signal triangulation by a dedicated third party.  The idea here is that every mobile device has an antenna that not only broadcasts a signal but also recognizes other wireless signals, like Wi-Fi routers and cell phone towers, in addition to the GPS satellite signals. Now, if someone could read those signals off the device, identify the devices they came from, and also knew the physical location of each of those devices, they could use that information to triangulate the mobile device’s exact location, all with incredible accuracy.
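The geometry behind this can be sketched with a simplified model. The example below assumes a flat 2D plane, exactly three known transmitter positions, and an already estimated distance to each one (for instance from signal strength); strictly speaking that is trilateration, and it is only an illustration of the principle, not a description of how any commercial service works.

```python
import math


def trilaterate(anchors, distances):
    """Estimate (x, y) from three known anchor positions and the distance
    to each. Subtracting the first circle equation from the other two
    leaves a 2x2 linear system, solved here with Cramer's rule."""
    (x1, y1), (x2, y2), (x3, y3) = anchors
    d1, d2, d3 = distances

    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2

    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear; position is ambiguous")

    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y


# Hypothetical Wi-Fi router / cell tower positions in meters on a local grid.
anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]

# Distances a device at (3, 4) would measure to each anchor.
true_position = (3.0, 4.0)
distances = [math.hypot(true_position[0] - ax, true_position[1] - ay)
             for ax, ay in anchors]

print(trilaterate(anchors, distances))  # approximately (3.0, 4.0)
```

Real measurements are noisy, so a production system would combine many more signals and fit the best estimate rather than solve three equations exactly.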

If that sounds like science fiction, take a moment to familiarize yourself with a company called Skyhook Wireless, which is doing just that, and has been for years.  They already have millions of wireless signals mapped for virtually every street in the country, and have a response time that is a fraction of GPS, around 1 second.  There’s a very cool video that explains how their process works available on their site.  Their product is in production for a long list of major companies, including many of the major cell carriers.  Google and Microsoft for their part have opted to build their own systems that work on a similar process of triangulating user location based on Wi-Fi signals.  In many ways, the future is now!

Hyperlocal Desktop?

Outside of mobile, there’s a similar thread of innovation happening on the desktop side, though it isn’t nearly as advanced, and still relies on IP address since many desktop systems are directly cabled to their networks and don’t broadcast or receive a wireless signal.  Just this year, computer scientist Yong Wang demonstrated that by using a multi-layered technique combining ping triangulation and traceroutes with the locations of well-known web landmarks, like universities and government offices that locally host their services and publicly provide their physical addresses, he could accurately map an IP address within 700m versus the 34km that traditional traceroute triangulation produces.  While this method isn’t in production as of yet, it could be soon, since Wang’s process is quite similar to the existing methodology, but at a much higher frequency.

Limitations of IP-Based Geolocation

This is the third article in a four part series on Geotargeting. Read parts one and two.

Despite the complexity and scientific approach of IP-based geolocation identification, there are well-known limitations and inaccuracies with the current methodology.  While geolocation data is usually extremely accurate down to the state or city level, as services demand more granular data, many of the current geolocation services start to break down.  The loss in overall coverage is quite small, but accuracy can be another story.

Server Location vs. Machine Location

One of the more challenging aspects of IP-based geolocation is that geolocation services often end up using the location of the server through which that IP accesses the internet, not necessarily the location of the end user’s machine. So however impressive you may have found the diagram in the last article on IP triangulation, the method may end up targeting the wrong location.  The classic example of this within Ad Ops circles was AOL’s dial-up service, which in its heyday represented a large share of internet users.  AOL’s servers were all physically located near its headquarters in Virginia, so IP addresses hosted by AOL were often shown to be located in Virginia, even though users were spread throughout the country.  Today, this is much less of a problem because most consumers have a high-speed connection serviced by a locally hosted ISP, but it exposed the problem in a big way at the time.

That said, local ISPs’ network routers, while usually quite close, are frequently in different zip codes, so while coverage remains high for most IPs at a granular level, accuracy can be less reliable. When researching this article from my location in New York City, most services were more than 7 miles off my actual physical location, perhaps a small difference in much of the country, but an enormous gulf in as dense an area as Manhattan.  Every service, however, was correct about my location at a country, state, and city level.  You can check your own location on MaxMind’s demo page, which, incidentally, was one of the more accurate services.