https://commons.wikimedia.org/wiki/File:Wikipedia20_Knowledge.svg

The Wikipedia image/caption matching challenge and a huge release of image data for research!

Wikipedia articles are missing images, and Wikipedia images are missing captions. A scientific competition organized by the Research team at the Wikimedia Foundation could help bridge this gap. The WMF is also releasing a large image dataset to help researchers and practitioners build systems for automatic image-text retrieval in the context of Wikipedia.

https://commons.wikimedia.org/wiki/File:Magnifying_glass_with_focus_on_paper.png

Searching for Wikipedia

Researchers often ask how people use search engines to reach Wikipedia. Until now, however, little data has been available about this relationship. To help address these questions, the Wikimedia Foundation is releasing a new, faceted dataset on search engine traffic to Wikipedia, so you can ask questions like “What is the most common search engine in my country?” or “Which search engine is most used by Android users?”

Censorship, outages and Internet shutdowns: monitoring Wikipedia’s accessibility around the world

This article describes the methodology the Wikimedia Foundation uses to monitor Wikipedia outages around the world. These events are called anomalies and can have various causes, including censorship.