Roundup: Google Summer of Code 2021 and Outreachy Round 22
Edited by Sarah R. Rodlund and Srishti Sethi
Wikimedia technology participated in two major outreach programs this year: Google Summer of Code 2021 and Outreachy Round 22.
According to the program websites, “Google Summer of Code is a global program focused on bringing more student developers into open source software development. Students work with an open source organization on a 10 week programming project during their break from school,” and “Outreachy is a diversity initiative that provides paid, remote internships to people subject to systemic bias and impacted by underrepresentation in the technical industry where they are living.”
Both programs give students and interns the opportunity to work with experienced mentors on technical projects that benefit the Wikimedia projects.
In this post, you’ll find summaries of each of this year’s outreach projects.
mwsql
mwsql is a Python library that makes it easy to work with Wikimedia SQL dump files. Its simple, user-friendly interface is well suited to exploratory data analysis, and it can convert dumps to other data and file types commonly used in data science in just a few lines of code.
Slavina Stefanova
Mentors: Sarah R. Rodlund and Isaac Johnson
Synchronizing Wikidata and Wikipedias using Pywikibot
Wikidata is a structured data repository that holds organized data about the contents of Wikipedia and the other Wikimedia projects. This project aims to write Pywikibot scripts that create new Wikidata items, extract relevant information from Wikipedia articles, and import it into Wikidata once the scripts are accepted through bot requests (example: Niraibot). This gives organized access to the important information from Wikimedia projects.
Nirali Sahoo
This project focuses on automating the extraction of data from Wikipedia articles and exporting it to Wikidata as structured data items. Having the data in structured form in Wikidata helps automated tools and all Wikimedia projects pull information from the same central place. This reduces the time that outdated, wrong, or inappropriate information remains public on wikis, especially on projects with a smaller editor base.
Ammar Abdulhamid
Mentor: Mike Peel
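To give a rough idea of the extraction step described above, here is a minimal sketch in plain Python. It is not the interns' code and does not use Pywikibot: the sample wikitext, the `PARAM_TO_PROPERTY` mapping, and the `extract_claims` helper are all hypothetical, and a real bot would still need to resolve each value to a Wikidata item and go through a bot request before writing anything.

```python
import re

# Hypothetical mapping from infobox parameter names to Wikidata property IDs.
PARAM_TO_PROPERTY = {
    "birth_place": "P19",   # place of birth
    "occupation": "P106",   # occupation
}

# Made-up article wikitext, used only for illustration.
SAMPLE_WIKITEXT = """{{Infobox person
| name = Ada Example
| birth_place = [[London]]
| occupation = [[Mathematician]]
}}"""


def extract_claims(wikitext):
    """Return {property_id: value} for the infobox parameters we know how to map."""
    claims = {}
    pattern = r"^\|[ \t]*(\w+)[ \t]*=[ \t]*(.*)$"
    for match in re.finditer(pattern, wikitext, flags=re.MULTILINE):
        param, raw_value = match.group(1), match.group(2).strip()
        prop = PARAM_TO_PROPERTY.get(param)
        if prop is None or not raw_value:
            continue  # unmapped parameter or empty value
        # Reduce [[Target|label]] or [[Target]] to the link target.
        link = re.fullmatch(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]", raw_value)
        claims[prop] = link.group(1) if link else raw_value
    return claims


print(extract_claims(SAMPLE_WIKITEXT))
```

The sketch stops at collecting candidate statements; deciding which ones are safe to import is the part that needs community review.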
WikiNav
WikiNav is a tool that processes the Wikipedia clickstream data to generate statistics and visualizations that help make this data more accessible to folks with varying levels of programming and data wrangling experience. Alternatively, users can invoke the WikiNav API to perform quick lookups on the clickstream dataset and use the results to power their own analyses and visualizations.
Muniza A
Mentors: Martin Gerlach and Isaac Johnson
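For readers who want a feel for the underlying data, here is a small sketch of the kind of lookup such a tool performs. It assumes the tab-separated clickstream layout (previous page, current page, link type, count); the sample rows and the `top_sources` helper are made up for illustration and are not part of WikiNav or its API.

```python
import csv
import io
from collections import defaultdict

# Made-up rows in the clickstream layout: prev, curr, type, n (tab-separated).
SAMPLE = (
    "other-search\tCat\texternal\t500\n"
    "Dog\tCat\tlink\t120\n"
    "Felidae\tCat\tlink\t80\n"
    "Cat\tFelidae\tlink\t60\n"
)


def top_sources(tsv_text, target, limit=3):
    """Return the (source, count) pairs that send the most traffic to `target`."""
    counts = defaultdict(int)
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for prev, curr, _link_type, n in reader:
        if curr == target:
            counts[prev] += int(n)
    # Sort by count, largest first, and keep the top `limit` sources.
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)[:limit]


print(top_sources(SAMPLE, "Cat"))
```

On real monthly dumps the same idea applies, although the files are large enough that streaming them line by line from disk is a better fit than holding the text in memory.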
The Userscript Tour
The Userscript Tour is a guided tour that helps users learn about userscripts and how they are created using ResourceLoader, the MediaWiki Action API, and Object-Oriented User Interface (OOUI). It is aimed primarily at new developers and existing Wikimedia community members who have a little JavaScript knowledge. When the tour is used for outreach, every participant follows the same flow.
Devyansh Chawla
Mentors: Jay Prakash, Krishna Chaitanya Velaga, Enterprisey
Wikidata-Complete-Gadget
The WikidataComplete Gadget is a Wikidata gadget intended to help users add more facts to the Wikidata knowledge base. The tool fetches suggestions from an API and shows them to the user directly within the Wikidata web frontend, so that adding more facts becomes convenient. The suggestions are computed automatically from other sources (e.g., web content, knowledge bases).
Dhairya Khanna
Mentors: Dennis Diefenbach, Andreas Both, Gabin Guo, Aleksandr Perevalov
Cypress tests for Wikipedia Preview
Wikipedia Preview makes Wikipedia content available as contextual information on third-party sites. This project involved writing a high-quality suite of tests that check the preview against different parameters. The tests helped identify fallback conditions and deliver a better preview to end users.
Shailesh Kanojiya
Mentors: Gabriel Pita, Vidhi Mody, Soham Parekh
Custom Picture Selector for Commons Android application
This project adds a custom picture selector for Commons uploads that displays already uploaded images differently from the rest. The feature marks an already uploaded image with a Commons icon overlay, which saves time and improves the user experience. Find the GitHub issue here.
Aditya Srivastav
Mentors: Nicolas Raoul, Madhur Gupta
Bernard – WMFDBBackups Dashboard
This is a user-friendly dashboard that the Wikimedia Data Persistence Team can use to check the status of the daily MariaDB/SQL backups that run on the various databases used by Wikipedia. It helps the team monitor backup operations and pinpoint where backup issues occur, so that team members can resolve them. Daily backups are essential to help Wikipedia recover from any database failure. A lot of work remains before this prototype is useful in production.
Hari Krishna
Mentors: Jaime Crespo and Manuel Arostegui
Add zoom and pan to the Wikisource Pagelist Widget
The Wikisource Pagelist Widget is an OOUI-based widget that streamlines the process of creating a pagelist for new (and existing) users of Wikisource.
While using the Pagelist widget, the user is presented with a picture of a scanned page and is asked to identify the page number on the scan. However, there is no option to zoom or pan the scanned image inside the widget. Adding the option to zoom and pan the image will allow users to see the page number on scans that have a tiny font or lots of text (for example, newspaper scans).
Yash Agrawal
Mentors: Sohom Datta, Sam Wilson, Satdeep Gill
Update the front-page of Wikimedia projects
The main aim of this project is for all the sister project portals (for example, wiktionary.org) to use the same build system and resources. By reusing the templates, scripts, and styles of www.wikipedia.org, the same build system can be brought to all sister portals.
Bhaarat Kumar Khatri
Mentor: Jan Drewniak
Autocompletion in Page Forms Spreadsheet display
The Page Forms extension provides a spreadsheet-style editing display in two places:
1. In the special page MultiPageEdit: here a user can edit multiple pages that use a particular template in a spreadsheet-style display, making it easier to modify the pages or add new ones.
2. In regular forms, with the form definition setting “display=spreadsheet”: here a user can edit the values of different fields of multiple instances of the template within the same page in a spreadsheet-style display.
This project lets users use all the forms of autocompletion that are present in Page Forms, such as “values from category,” “values from external data,” and “values dependent on,” in the spreadsheet display.
Most importantly, the autocompletion is not implemented with Select2 or jQuery. Instead, it uses OOUI’s Text Input Widget to stay consistent with the rest of MediaWiki.
Yash Varshney
Mentors: Yaron Koren and Sahaj Khandelwal
Upgrade WebdriverIO to v7 in all repositories
Software testing is important because it uncovers defects before the software is delivered, which helps ensure its quality. Thoroughly tested software is more reliable, easier to use, and performs better. Automation is preferred because it shortens regression testing and validates complex scenarios more efficiently.
Sahil Grewal
Mentors: Vidhi Mody and Soham Parekh
Retraining models from ORES to be deployable on Lift Wing
This project was a renewed attempt to retrain the existing ORES models, build new deep-learning-based techniques, and evaluate how they improved performance. It was a pilot project and revealed a lot of flaws in the existing pipelines of the ORES architecture.
Anubhav Sharma
Mentors: Christopher Albon and Chaitanya Mittal
Thank you!
A big thank you to everyone who participated in the outreach programs this year: mentees, mentors, and organizers! Thank you for sharing your time, effort, and expertise to make this one of the most successful years ever!
About this post
File: Dülmen, Brachliegendes Feld mit Wildblumen — 2021 — 9774.jpg, Dietmar Rabich, CC BY-SA 4.0