The Wikipedia image/caption matching challenge and a huge release of image data for research!

Wikipedia articles are missing images, and Wikipedia images are missing captions. A scientific competition organized by the Research team at the Wikimedia Foundation could help bridge this gap. The WMF is also releasing a large image dataset to help researchers and practitioners build systems for automatic image-text retrieval in the context of Wikipedia.

Searching for Wikipedia

How people use search engines to reach Wikipedia is a common question among researchers. Until now, however, little data has been available about this relationship. To help address these questions, the Wikimedia Foundation is releasing a new, faceted dataset on search engine traffic to Wikipedia so you can ask questions like “What is the most common search engine in my country?” or “Which search engine is most used by Android users?”

Censorship, outages and Internet shutdowns: monitoring Wikipedia’s accessibility around the world

This article describes the methodology used by the Wikimedia Foundation to monitor outages on Wikipedia around the world. These events are called anomalies and could be due to various causes, among them censorship.

Wikimedia’s Event Data Platform, or JSON is ok too

The Wikimedia Foundation has been working with event data since 2012. Over time, our event collection systems have transitioned from being used only to collect analytics data to being used to build important user-facing features. This three-part series will focus on how Wikimedia has adapted these ideas for our own unique technical environment.