By Gilles Dubuc, Senior Software Engineer, Wikimedia Performance Team
This report will be updated monthly, with historical data made available. The goal is to watch the evolution of these metrics over time, allowing us to identify improvements and potential pain points.
To make a fair assessment of the autonomous systems’ performance, the real user metrics collected from web browsers are normalised. This prevents differences such as the average device power of a given network’s users from skewing the results. For example, an ISP with more expensive data plans might have users with more expensive, better performing devices on average. This is why we only compare data points between providers for similar effective device CPU power. We also separate the mobile and desktop experiences, because they serve different content, with a notable difference in median page weight, which directly impacts performance metrics. We wouldn’t want the mobile/desktop mix of a given provider to influence the results.
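As a rough illustration of the normalisation described above, here is a minimal sketch in Python. It is not the code behind the report. The field names (asn, platform, cpu_score_ms, load_time_ms), the bucket width and the sample values are all hypothetical, chosen only to show the idea of comparing providers within the same platform and the same band of device CPU power.

```python
from collections import defaultdict
from statistics import median


def cpu_bucket(cpu_score_ms, width_ms=50):
    """Map a hypothetical CPU benchmark score to a coarse bucket index."""
    return int(cpu_score_ms // width_ms)


def median_by_provider(samples):
    """Group samples by (platform, CPU bucket), then take the median
    load time per autonomous system inside each group."""
    groups = defaultdict(lambda: defaultdict(list))
    for s in samples:
        key = (s["platform"], cpu_bucket(s["cpu_score_ms"]))
        groups[key][s["asn"]].append(s["load_time_ms"])
    return {
        key: {asn: median(times) for asn, times in by_asn.items()}
        for key, by_asn in groups.items()
    }


# Made-up samples, for illustration only.
samples = [
    {"asn": "AS-A", "platform": "mobile", "cpu_score_ms": 120, "load_time_ms": 900},
    {"asn": "AS-A", "platform": "mobile", "cpu_score_ms": 130, "load_time_ms": 1100},
    {"asn": "AS-B", "platform": "mobile", "cpu_score_ms": 125, "load_time_ms": 1500},
    {"asn": "AS-B", "platform": "mobile", "cpu_score_ms": 310, "load_time_ms": 2500},
    {"asn": "AS-A", "platform": "desktop", "cpu_score_ms": 60, "load_time_ms": 700},
]

result = median_by_provider(samples)
for key, by_asn in sorted(result.items()):
    print(key, by_asn)
```

In this sketch, the two providers are only compared inside the ("mobile", 2) group, where both have samples from devices of similar CPU power. The slower device on AS-B falls into its own bucket and does not drag down that provider’s figure in the shared group.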
If you look at the report, you might wonder why some autonomous systems’ underlying mobile networks show up under “desktop” and some wired internet providers appear under “mobile”. The explanation is that some internet providers sell home internet devices that are effectively mobile network modems, so people end up using their desktop computers (and, as a result, the desktop website) over a mobile network. Other providers have their mobile device users automatically connect to the provider’s WiFi routers whenever one is within reach.
One caveat about this report is that in countries that are physically large, like the United States, the country-wide aggregation in no way reflects the important regional differences that might exist for a given network. The main reason why we can’t look at smaller regions is that we simply have no way of knowing where mobile users are connecting from, short of collecting geolocation data. Since we care deeply about our users’ privacy and their experience, it doesn’t feel appropriate at this time to ask users for their precise location in order to generate this type of finer-grained data. Such a scheme would also suffer from self-selection bias. There’s already a lot of work to be done with the data aggregated at the national level!
We hope that this public report will help network operators understand their customers’ real performance characteristics when it comes to browsing one of the web’s largest websites. We welcome peering requests from networks that seek to improve their connectivity to our datacenters.
About this post
This post was originally published on the Wikimedia Phame blog.
Featured image credit: Arpanet logical map, March 1977, PDP-10, The Computer History Museum, public domain