By Gilles Dubuc, Senior Software Engineer, Wikimedia Performance Team
The micro survey simply asks users on Wikipedia articles, in their own language, if they think that the current page loaded fast enough:
Let’s look at the results on Spanish and Russian Wikipedias, where we’re collecting the most data. We have collected more than 1.1 million survey responses on Spanish Wikipedia and close to 1 million on Russian Wikipedia so far. The survey is displayed to a small fraction of our visitors.
How satisfied are our visitors with our page load performance?
Ignoring neutral responses (“I’m not sure”), we see that consistently across wikis between 85 and 90% of visitors find that the page loaded fast enough. That’s an excellent score, one that we can be proud of. And it makes sense, considering that Wikipedia is one of the fastest websites on the Web.
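As a minimal sketch of how such a ratio can be computed, the snippet below drops neutral answers before dividing. The answer labels ("yes", "no", "not sure") and the sample counts are assumptions made for illustration, not the survey's actual schema or data.

```python
from collections import Counter


def satisfaction_ratio(responses):
    """Share of positive answers among non-neutral responses.

    `responses` is an iterable of hypothetical labels: "yes", "no" or
    "not sure". Returns None when nobody gave a decided answer.
    """
    counts = Counter(responses)
    decided = counts["yes"] + counts["no"]  # neutral answers are ignored
    if decided == 0:
        return None
    return counts["yes"] / decided


# Made-up sample: 87 positive, 13 negative, 20 neutral answers.
sample = ["yes"] * 87 + ["no"] * 13 + ["not sure"] * 20
ratio = satisfaction_ratio(sample)
print(f"{ratio:.0%} of decided respondents were satisfied")
```

Because neutral answers are excluded from the denominator, a wiki where many people answer "I'm not sure" is not penalised in this ratio.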
Now, a very interesting finding is that this satisfaction ratio varies quite a bit depending on whether you’re logged into the website, or if like most Wikipedia visitors, you’re logged out:
It appears that logged-in users are consistently more satisfied with our performance than logged-out visitors.
The contributor performance penalty
What’s very surprising about logged-in users being more satisfied is that we know for a fact that the logged-in experience is slower, because our logged-in users have to reach our master datacenter in the US instead of hitting the cache point of presence closest to them. This is a long-standing technical limitation of our architecture, and an issue we intend to resolve one day.
Why could they possibly be happier, then?
The Spanish paradox
Spanish Wikipedia, at first glance, seems to contradict this phenomenon of slower page loads for logged-in users. Looking at the desktop site only (to rule out differences in the mobile/desktop mix):
The reason why Spanish Wikipedia page loads seem faster for logged-in users, contrary to what we see on other wikis and at a global scale, is that Spanish Wikipedia traffic has a very peculiar geographic distribution. A larger share of logged-in users is based in Spain rather than in Latin American countries (30.04%) compared with their logged-out counterparts (22.3%). Since internet connectivity tends to be faster in Spain, this difference in ratios explains why the logged-in experience appears to be faster, but isn’t, when looking at RUM (real user monitoring) data at the website level.
This is a very common pitfall of RUM data, where seemingly contradictory results can emerge depending on how you slice the data. RUM data has to be studied from many angles before drawing conclusions.
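The effect can be reproduced with a small worked example. In the sketch below, only the two Spain shares (30.04% and 22.3%) come from the figures quoted above. The load times are hypothetical values chosen for illustration, "elsewhere" lumps together every country outside Spain, and means are used instead of medians because they combine linearly across groups.

```python
# Hypothetical mean page load times in milliseconds. These are NOT
# measured values; within each region, logged-in is 100 ms slower.
LOAD_MS = {
    ("spain", "logged_in"): 1100,
    ("spain", "logged_out"): 1000,
    ("elsewhere", "logged_in"): 3100,
    ("elsewhere", "logged_out"): 3000,
}

# Share of each group based in Spain, as quoted in the text.
SPAIN_SHARE = {"logged_in": 0.3004, "logged_out": 0.223}


def sitewide_mean(status):
    """Mean load time for one group, weighted by its geographic mix."""
    share = SPAIN_SHARE[status]
    return (share * LOAD_MS[("spain", status)]
            + (1 - share) * LOAD_MS[("elsewhere", status)])


for status in ("logged_in", "logged_out"):
    print(status, round(sitewide_mean(status), 1))
```

With these made-up numbers, logged-in users are slower in every region, yet their site-wide mean comes out lower, simply because more of them sit in the faster region. Slicing by country makes the apparent advantage disappear.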
Looking at the Navigation Timing data we collect for survey respondents, we see that for logged-in users the median connect time on Spanish Wikipedia is 0, while for logged-out users it’s 144ms. A connect time of 0 means that the browser was already connected to our domain. This suggests that logged-in users view a lot of pages, and that the survey mostly ends up being displayed on their nth viewed page, where n is more than 1. For a lot of logged-out users, in contrast, we capture their first page load, with a higher probability of a cold cache. Logged-in users, despite having a (potential) latency penalty of connecting to the US, therefore tend to have more cached assets, particularly the JS and CSS needed by the page. This doesn’t fully compensate for the performance penalty of connecting to a potentially distant datacenter, but it might reduce the variability of performance between page loads.
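For readers unfamiliar with the metric, connect time is the difference between the connectEnd and connectStart attributes of Navigation Timing, and it is 0 when the browser reuses an existing connection. The sketch below summarises it per group. The record layout (plain dictionaries) and the sample values are assumptions for illustration, not our actual data pipeline.

```python
from statistics import median


def connect_times(entries):
    """Connect time in ms for each Navigation Timing record."""
    return [e["connectEnd"] - e["connectStart"] for e in entries]


def summarize(entries):
    """Median connect time and share of loads on a reused connection."""
    times = connect_times(entries)
    reused = sum(1 for t in times if t == 0)
    return {"median_ms": median(times), "reused_share": reused / len(times)}


# Made-up records: timestamps are ms relative to the start of navigation.
logged_in = [
    {"connectStart": 5, "connectEnd": 5},    # connection reused
    {"connectStart": 12, "connectEnd": 12},  # connection reused
    {"connectStart": 8, "connectEnd": 150},  # new connection
]
logged_out = [
    {"connectStart": 3, "connectEnd": 147},
    {"connectStart": 4, "connectEnd": 4},
    {"connectStart": 6, "connectEnd": 180},
]

print("logged in:", summarize(logged_in))
print("logged out:", summarize(logged_out))
```

A median of 0 only requires that more than half of the sampled page loads reused a connection, which is why it points to respondents being surveyed on a later page view rather than on their first.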
To further confirm this theory, in the future we could try to record how much of the JS and CSS was already available in the browser cache at the time the page load happened. This is not information we currently collect. Such data would allow us to confirm whether or not satisfaction is correlated with how well dependencies are cached, regardless of the user’s logged-in/logged-out status.
Becoming a Wikipedia contributor – and therefore, logging in – requires a certain affinity to the Wikipedia project. It’s possible, as a result, that logged-in users have a more favourable view of Wikipedia than logged-out users on average. And that positive outlook might influence how they judge the performance of the website.
This is a theory we will explore in the future by asking more questions in the micro survey, in order to determine whether or not the user who responds has a positive view of our website in general. This would allow us to quantify how large the effect of brand affinity might be on performance perception.
About this post
This post was originally published on the Wikimedia Performance Team Phame blog.