How Ad Serving Works – Mobile vs. Web Environments

The most popular article on this blog is one of the very first ones I ever wrote – How Does Ad Serving Work. What I probably should have titled it, though, was How Does Ad Serving Work on the Web, because there are a few important differences when you’re talking about the mobile ecosystem.

Server Redirects vs. Client Redirects

For the most part, it comes down to the interaction between a client and a server. In desktop environments, the user’s browser (the client, in technical-speak) does most of the work of fetching and redirecting information, which is ideal for several reasons. First, redirecting the client gives each platform in the ecosystem the ability to drop or read a cookie, which helps with downstream conversion tracking, frequency measurement, and audience profiling. Second, it facilitates client-side tracking of key metrics like clicks and impressions for billing purposes. Client-side tracking is the preferred methodology for advertisers because it measures requests from a user instead of from a server, and is therefore a more accurate measure of what a user actually saw. This process requires more work from the browser, but that’s OK because high-speed connections and unlimited data usage are pretty much the norm these days for home and office connections.

Desktop Ad Serving Sequence

In mobile environments, though, connection speeds really matter. Many users are on connections slow enough that if the browser or app were responsible for fetching the ad the way it is on desktop, the user would likely abandon the page before the ad finished loading. Because of that, you often see more of the work being done in the cloud for mobile ad serving, independent of the client. So instead of the browser calling a server and then being redirected to another server, the browser tends to call a server, which then calls other servers, and those servers talk to each other over ultra-fast fiber-optic landlines instead of the cellular network.

Mobile Ad Serving Sequence
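
To put rough numbers on the difference, here is a minimal Python sketch of the two models. The function names and the round-trip figures (600ms over a slow cellular link, 20ms between data centers) are illustrative assumptions, not measurements.

```python
def client_redirect_total_ms(hops: int, client_rtt_ms: float) -> float:
    """Client-redirect model: the device makes every request itself,
    so each hop pays the full client round trip."""
    return hops * client_rtt_ms


def server_side_total_ms(hops: int, client_rtt_ms: float, server_rtt_ms: float) -> float:
    """Server-to-server model: the device makes one request, and the
    remaining hops happen between servers on fast links."""
    return client_rtt_ms + (hops - 1) * server_rtt_ms


# Assumed figures for illustration only.
CELLULAR_RTT_MS = 600.0
DATACENTER_RTT_MS = 20.0

redirect_ms = client_redirect_total_ms(3, CELLULAR_RTT_MS)
server_ms = server_side_total_ms(3, CELLULAR_RTT_MS, DATACENTER_RTT_MS)
print(f"client redirects: {redirect_ms:.0f}ms, server-to-server: {server_ms:.0f}ms")
```

With three hops, the client-redirect model costs 1,800ms under these assumed numbers, versus 640ms when only the first hop crosses the cellular network.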

It’s true that this is an ever-changing situation; in many cases people are already starting to think about 4G LTE cellular connections one day replacing fiber connections for web browsing, but there’s no question that in most parts of the country cellular speeds still vary wildly. You could easily argue over just how fast 3G is compared to 4G or 4G LTE, but one need only look at one of the recent studies from RootMetrics to see that even within each type of connection there is huge variation in speed, even on the same carrier, not to mention that 4G LTE coverage is still pretty sparse through most of the country.

Mobile Ad Serving with Exchanges – Even More Complicated

The client-to-server vs. server-to-server challenge is even more pronounced when the ad request is sent to an exchange. For example, if you look at my old article, Diagramming the SSP, DSP, and RTB Redirect Path, you’ll see there are no fewer than four client-side requests made to fulfill an ad request: one to the publisher’s ad server, one to the supply side platform or ad exchange, one to the winning advertiser’s ad server, and one to the CDN. See the sequence diagram below for a visual:

Desktop RTB Ad Serving Sequence Diagram

In the mobile ecosystem, you effectively have three: one to the publisher ad server, which makes the call to the SSP itself and passes the winning advertiser’s tag back down; a second client-side call to that tag on the winning advertiser’s ad server; and a third and final call to the CDN. See the sequence diagram below for a visual:

Mobile RTB Ad Serving Sequence Diagram
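
One way to keep the two paths straight is to write each one down as a list of hops, marking who makes each call. This is only a sketch of the sequences described in this section; the labels and the `client_calls` helper are names made up for illustration.

```python
# Each hop is (destination, who makes the call).
DESKTOP_RTB = [
    ("publisher ad server", "client"),
    ("SSP or ad exchange", "client"),
    ("winning advertiser's ad server", "client"),
    ("CDN", "client"),
]

MOBILE_RTB = [
    ("publisher ad server", "client"),
    ("SSP", "server"),  # called by the publisher ad server, not the device
    ("winning advertiser's ad server", "client"),
    ("CDN", "client"),
]


def client_calls(path):
    """Count the hops the user's device has to make itself."""
    return sum(1 for _, caller in path if caller == "client")


print("desktop client-side calls:", client_calls(DESKTOP_RTB))
print("mobile client-side calls:", client_calls(MOBILE_RTB))
```

The SSP hop still happens on mobile; it just moves off the device and onto the publisher ad server's connection.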


So How Long Does it Really Take to Serve an Ad?

In the web ecosystem, you’d typically expect perhaps 250ms to connect to a web server, 150ms to connect to the ad server, 150ms to connect to the SSP, 250 – 400ms to wait for the SSP, 150ms to connect to the marketer’s ad server, and 50 – 100ms to download the content from the CDN, for a total of a little over a second to serve the ad from start to finish.

Most (80 – 90%) of this time is network latency rather than waiting for a server to make a decision. Network latency is the time you have to wait for your browser to do things like the DNS lookup (translating the .com address to an IP address), establish a connection, and send the request: basically the time it takes to travel through the network fiber to reach the physical location of the server. Your browser or device isn’t the only thing that suffers this network latency; so does every part of the system. The publisher’s server has to run through the process with the publisher ad server, which has to run through the process with the SSP, which has to run through the process with the ad exchanges, and so on.

The rest of the time is spent waiting for the various parts of the system to actually make a decision on what to do: serve an ad or respond with no bid? If serving an ad, which ad? Usually these decisions, what engineers call “time in I/O”, are actually very fast, under 10ms.
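
Adding up the per-step figures quoted above is simple arithmetic, sketched here in Python. The step labels and the `total_range_ms` helper are names made up for illustration; the millisecond values are the typical estimates from the text, not measurements.

```python
# (step, low estimate in ms, high estimate in ms)
DESKTOP_STEPS_MS = [
    ("connect to web server", 250, 250),
    ("connect to ad server", 150, 150),
    ("connect to SSP", 150, 150),
    ("wait for SSP", 250, 400),
    ("connect to marketer's ad server", 150, 150),
    ("download content from CDN", 50, 100),
]


def total_range_ms(steps):
    """Return (low, high) totals across all steps, in milliseconds."""
    low = sum(lo for _, lo, _ in steps)
    high = sum(hi for _, _, hi in steps)
    return low, high


low, high = total_range_ms(DESKTOP_STEPS_MS)
print(f"end-to-end estimate: {low} to {high}ms")
```

That works out to a range of 1,000 to 1,200ms, with the wait for the SSP and the CDN download accounting for all of the spread.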

If the same sequence played out in the mobile ecosystem, however, you might find it takes more or less the same amount of time on a 4G LTE network in downtown Chicago, or 8 to 10 whole seconds in a more suburban or rural area. Network latency could be 4x higher on 3G cellular networks, and download speeds 8 – 10x slower. This may not apply to all you ad tech professionals sporting state-of-the-art devices, but for much of America this is the reality. In the test I ran below, the network latency was actually better on the cell network, but download speeds were much worse.

Cellular vs. WiFi Connection Time Comparison

To see this process for yourself, download Shunra NetworkCatcher on iPhone or on Android and ping any web destination with your WiFi on and then again with your WiFi off to see this effect in action.
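
If you would rather script a similar comparison from a laptop tethered to your phone, here is a rough Python sketch. Note that it times a TCP connection (including the DNS lookup) rather than sending a true ICMP ping, so the numbers are not directly comparable to what a ping tool reports. The `tcp_connect_ms` helper and the `example.com` host in the usage comment are assumptions for illustration.

```python
import socket
import time


def tcp_connect_ms(host: str, port: int = 443, timeout: float = 5.0) -> float:
    """Time how long it takes to resolve the host and open a TCP
    connection to host:port, in milliseconds."""
    start = time.perf_counter()
    # create_connection does the DNS lookup and the TCP handshake.
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000.0


# Example usage (requires network access), once on WiFi and once on cellular:
#   print(f"{tcp_connect_ms('example.com'):.0f}ms")
```

A single reading is noisy, so it is worth running the call several times on each network and comparing the typical values rather than any one result.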


  1. Great explanation, very detailed and thorough. The links provided to explain certain terms and concepts were also very helpful.

  2. Ben – I’m interested in the (typical) time durations for each step of the end-to-end process, particularly for RTB versions. Do you have any recommended sites or references I might use to drill down on these?

  3. Hi Jerry,

    Some exchanges have a little information on their timeout policies you can access. AWS has a pretty great article here: And I think you can find some other resources for the various exchanges that publicly post their API specs: Google’s recommendations on peering configuration here: and they specify a timeout requirement of 98% below 100ms here: for all bidders on their exchange. Facebook seems to require 120ms:

    So that means the largest global ad exchanges require you to receive their request, process it, and respond to the endpoint within 100 – 120ms, network latency included, before they time you out. I can tell you from experience that most systems are spending no more than 10ms in I/O to process a request. These systems are usually big clusters of boxes with a ton of RAM that can handle lots and lots of QPS. Even smaller systems that I’m aware of are processing 40K QPS when they open shop.

    The reason that latency is often much higher from the user’s perspective though is how the overall ecosystem is configured. If an SSP is involved, they may have direct integrations with bidders that don’t meet the same standards. My experience is most SSPs will allow 150 – 200ms timeouts to bidders, and in some cases even longer. And that doesn’t include publisher to SSP latency, which might be another 150ms before it gets timed-out, though usually it’s a lot less (10 – 20ms). Overall, you could surf around to some large publisher and anecdotally find that the E2E process is often 800ms+, and I wouldn’t be surprised by that.

    Hope that helps get you started – Quora might be another source here, and I would think many other companies would be willing to share their API specs, or at least timeout requirements with you if you just reach out and ask. I imagine they are all pretty similar, and that mobile SSPs are probably the most latent systems in the ecosystem, for obvious reasons.


  4. Hey Ben, thanks for the insightful article!

    Is the second diagram (Mobile Ad Serving Sequence) still up-to-date/in-line with the current industry trends? Does the logic vary between mobile web ads and in-app ads?

    Also, why does the publisher ad server ping the marketer ad server and what data is passed on to the user before the user calls the marketer ad server?


  5. Hi Igor,

    Yes, you make a good point – I should probably split that diagram into two, one for app and one for mobile web. I think the key difference between the two is that mobile apps can sometimes make server-side ad requests (rather than client-side requests, as is the standard for the web, mobile web included). That of course depends on what the app does and whether it leverages the browser within the app. Finally, even if the app does make server-side requests to the ad server, click tracker calls must be made client side (at least with DFP), so you’re likely to see a mix at the very least. The way I built the diagram shows server-side requests between the app, pub ad server, and marketer ad server, as an SDK would make them, but that may not be the case 100% of the time.

    In that diagram, the publisher ad server is simply fetching the marketer’s tag server side, and returning it to the user, who then calls it client side, allowing the marketer to count the impression. The publisher is returning the marketer’s ad tag to the user before the user can call it.

    Hope that helps!
