The Wikimedia Analytics Engineering team manages multiple systems, all gravitating around a big (for our standards) Hadoop cluster. This post describes our path to changing our Hadoop distribution in a single day, together with the lessons learned while doing it.
Maintaining and improving one of the largest websites in the world using Open Source software requires a continuous commitment. The site is always evolving, so for every new component we want (or need!) to deploy, we need to evaluate the Open Source solutions available.