Sending messages to Wiki users in their preferred language
By Abijeet Patro, Senior Software Engineer and Niklas Laxström, Staff Software Engineer, the Wikimedia Foundation
Summary
The MassMessage extension allows users to send messages to a list of pages, including users’ talk pages, across a wiki farm. It is used to send out newsletters and important communications to the Wikimedia communities. An example is the weekly Tech News newsletter to subscribers across Wikipedias and Wikimedia wikis.
Recently, we made some improvements to the MassMessage extension:
- Users can now send a wiki page or select sections of a wiki page as a message.
- If a wiki page is a translatable page, the message will be sent in the target page language.
In this post, we’ll describe why we started this work, what the primary objectives were, how they were implemented, and we’ll share inputs from the Community Relations Specialists (CRS) team that helped us improve on the work that we had done.
Background
The Language team’s annual plan for FY 2019–20 had an objective to support other WMF teams. As a part of that effort, we reached out to various teams via a survey, to learn and understand their needs and challenges related to translation and supporting many languages.
After the survey was complete, we triaged the requests we received to prioritize items based on their clarity, effort, and impact in order to start working on them during FY 2020–2021. The Community Relations Specialists team filed the following Phabricator task with their comments:
T165128 Allow delivery of multilingual content in fallback languages.
MassMessage can deliver multilingual content but always falls back on English when the former isn’t available. We have other, important fallback languages that could be used at several wikis instead (like Spanish, Russian, French, etc.). We should be able to use them.
The Community Relations Specialists (CRS) team uses the MassMessage extension to send out messages and newsletters, such as Tech News. Tech News is a newsletter that helps subscribers monitor and understand all the technical activity happening across the Wikimedia movement. At the time of writing, it has around 1000 subscribers from across the world and is published on community pages and elsewhere that reach thousands of other editors. It gets translated into different languages, and when it is published, each subscribed user receives the message on their talk page in their preferred language. The initial request from the CRS team was to add support for fallback languages, which would benefit both the newsletter’s audiences and folks attempting to reach out to Wikimedia’s community members in the languages they understand.
What does that mean? Imagine that a message needs to be delivered to Esplanada, the Village Pump of the Portuguese Wikipedia, but the message has not yet been translated into Portuguese. In the past, our system would have sent the English version, which is the source language, even if the message was available in a more closely related fallback language such as Brazilian Portuguese. Now, the message will be delivered in Brazilian Portuguese instead of English.
Improvements
Upon reviewing the existing process for getting the Tech News translated and delivered, we realized that there was scope for other general improvements that could reduce manual work and the potential for errors for multilingual MassMessage deliveries.
To summarize, the steps that were being followed before the improvements were:
- Create a translatable page with the Tech News issue
- Translators submit the translations
- Gather the translation into a message using a Lua script and a sandbox page
- Manually check that all languages have been included
- Send the multilingual message with MassMessage
Going through the Lua script revealed that this approach used the #switch parser function from ParserFunctions, substitution, and the {{PAGELANGUAGE}} magic word. Since none of these mechanisms has any knowledge of fallback languages, that feature was not supported.
Allow sending wiki pages as a message
Our primary goal was to make it easy for the CRS team to send out Tech News using the MassMessage extension while keeping the features generic for others. Following discussions with Johan Jönsson, we decided that the first step was to allow sending wiki pages, including translatable pages, as messages via the MassMessage extension.
The other option that we considered was to update the Lua script to add fallback support, but that would not have reduced the manual work that was being done. It also made sense for this feature to be present in the MassMessage extension itself.
The user interface for the MassMessage extension was updated to allow senders to select a page to be sent as a message. The mass message request is sent via the MediaWiki job queue to different wikis (or the same wiki) based on the delivery list. The data for each request includes a flag to identify if the page being sent as a message is a translatable page. On the target wiki, the page content is fetched by calling the API of the source wiki. The content displayed to the subscribed user depends on the type of page being sent as a message:
- For a normal wiki page, the content of the page is displayed directly.
- For a translatable page, the language of the target page is used to identify the content that will be displayed. The target wiki calls the source wiki API to fetch the page in the target page language or its fallbacks, resorting to using the source language in case none of the previous languages are available.
Based on the approach above, the message is rebuilt and posted on the target wiki, including any custom message added by the sender. We decided to use the API to fetch the page content from the source wiki in order to avoid sending large payloads via the job queue.
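The language selection described above can be sketched in a few lines of Python. This is only an illustration of the lookup order (target page language, then its fallbacks, then the source language), not the extension's actual code: the FALLBACKS table, the function names, and the dictionary of translations are all assumptions made for the example. In MediaWiki the real fallback chains come from the language configuration.

```python
from __future__ import annotations

# Hypothetical fallback chains, for illustration only. The real chains are
# defined by MediaWiki's language configuration.
FALLBACKS = {
    "pt": ["pt-br"],
    "pt-br": ["pt"],
}


def candidate_languages(target: str, source: str) -> list[str]:
    """Return the languages to try, in order: target, its fallbacks, source."""
    ordered = [target, *FALLBACKS.get(target, []), source]
    # Drop duplicates while preserving order.
    return list(dict.fromkeys(ordered))


def pick_translation(
    translations: dict[str, str], target: str, source: str = "en"
) -> tuple[str, str]:
    """Pick the best available translation and return (language code, text)."""
    for code in candidate_languages(target, source):
        if code in translations:
            return code, translations[code]
    raise KeyError(f"source language {source!r} is missing")


# Example: the page exists in English (source) and Brazilian Portuguese only.
pages = {"en": "Tech News", "pt-br": "Notícias técnicas"}
print(pick_translation(pages, "pt"))  # ('pt-br', 'Notícias técnicas')
print(pick_translation(pages, "fi"))  # ('en', 'Tech News')
```

In the sketch, a Portuguese target page receives the Brazilian Portuguese text because it appears in the fallback chain before the source language, while a Finnish target page with no available fallback receives the English source.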
Once these changes were deployed and Johan had had a chance to test them, we identified two other improvements that needed to be done.
Indicating appropriate language for the message content
Certain sections of a translatable page being sent as a message may not have been translated. These will be delivered to the target page in the source language of the page. Such sections need to be tagged with the appropriate dir and lang HTML attributes so that they are displayed in a supported font, with the proper text direction, and with working text-to-speech functionality.
This improvement was made in the Translate extension because these issues are relevant for translatable pages as well. All untranslated sections are tagged with the appropriate dir and lang attributes. To avoid breaking existing translatable pages that do not expect this behavior, we added versioning to translatable pages. The tagging is enabled by default for new pages, and existing pages can opt in to the new behavior. When rendering the translatable page, we use the version of the translatable page to decide whether to apply the tagging behavior. Additionally, we added a nowrap attribute that skips the wrapping when it is not wanted, for example when HTML is not allowed.
Another case that was considered for wrapping was the use of a fallback language. This happens if the translatable page is not translated into the target language but is available in one of the fallback languages (for example, Brazilian Portuguese is not available, but Portuguese is). In this case, the entire message is wrapped with the appropriate lang and dir attributes, since the target page has a different language from the message. This change was made in the MassMessage extension.
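A minimal Python sketch of the wrapping idea follows. It is a simplified illustration rather than the code in the Translate or MassMessage extensions: the choice of a div element, the function names, and the short RTL_LANGUAGES set are assumptions made for the example.

```python
from html import escape

# Small, incomplete set of right-to-left language codes, assumed for this sketch.
RTL_LANGUAGES = {"ar", "fa", "he", "ur"}


def text_direction(lang_code: str) -> str:
    """Return "rtl" or "ltr" based on the base language code."""
    base = lang_code.split("-")[0].lower()
    return "rtl" if base in RTL_LANGUAGES else "ltr"


def wrap_in_language(content: str, lang_code: str) -> str:
    """Wrap content so that browsers know its language and text direction."""
    direction = text_direction(lang_code)
    lang = escape(lang_code, quote=True)
    return f'<div lang="{lang}" dir="{direction}">{content}</div>'


def wrap_if_needed(content: str, message_lang: str, page_lang: str) -> str:
    """Wrap only when the message language differs from the target page language."""
    if message_lang == page_lang:
        return content
    return wrap_in_language(content, message_lang)


# A Portuguese fallback delivered to a Brazilian Portuguese page gets wrapped.
print(wrap_if_needed("Olá", "pt", "pt-br"))  # <div lang="pt" dir="ltr">Olá</div>
```

When the message language and the page language match, the content is returned unchanged, which mirrors the idea that wrapping is only needed where the languages differ.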
Add support for sending sections of a page
Translatable pages may contain various things, such as the language selector, instructions on how to translate, and categories, which do not make sense to include in the message that is sent.
Previously, Tech News used the <section> tag to mark the contents of the page that were to be sent as a message. The Lua script would receive this tag as an argument and would only send that part of the page as a message. The <section begin/end> tag is provided by the LabeledSectionTransclusion extension.
Even though the LabeledSectionTransclusion extension is available on the MediaWiki wiki family, we decided to re-implement this functionality into the MassMessage extension for two reasons:
- Using the LabeledSectionTransclusion extension would require large rework in both extensions due to the LabeledSectionTransclusion being tightly coupled with the Parser, which is not used by the MassMessage extension.
- We did not want to add a dependency on another extension.
When a page is selected to be sent as a message, we parse its content and identify any <section> tags that are present. We allow the sender to select one of these sections, the contents of which are then sent as the message.
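To make the idea concrete, here is a small Python sketch that finds labeled sections in wikitext and extracts one of them. It assumes the <section begin=name /> and <section end=name /> marker syntax used by LabeledSectionTransclusion and uses simple regular expressions. The function names are hypothetical, section names containing spaces are not handled, and the real MassMessage implementation may parse the page differently.

```python
from __future__ import annotations

import re

# Optional single or double quote around the section name.
QUOTE = r"""["']?"""

SECTION_BEGIN = re.compile(
    r"<section\s+begin\s*=\s*" + QUOTE + r"""([^"'/>\s]+)""" + QUOTE + r"\s*/>",
    re.IGNORECASE,
)


def list_sections(wikitext: str) -> list[str]:
    """Return the names of all labeled sections, in order of first appearance."""
    names: list[str] = []
    for match in SECTION_BEGIN.finditer(wikitext):
        if match.group(1) not in names:
            names.append(match.group(1))
    return names


def extract_section(wikitext: str, name: str) -> str | None:
    """Return the text between the begin and end markers of one section."""
    quoted = re.escape(name)
    pattern = re.compile(
        r"<section\s+begin\s*=\s*" + QUOTE + quoted + QUOTE + r"\s*/>"
        r"(.*?)"
        r"<section\s+end\s*=\s*" + QUOTE + quoted + QUOTE + r"\s*/>",
        re.IGNORECASE | re.DOTALL,
    )
    match = pattern.search(wikitext)
    return match.group(1).strip() if match else None


page = (
    "<languages />\n"
    '<section begin="technews-2021-W10"/>\n'
    "Latest tech news.\n"
    '<section end="technews-2021-W10"/>\n'
    "[[Category:Tech News]]"
)
print(list_sections(page))                         # ['technews-2021-W10']
print(extract_section(page, "technews-2021-W10"))  # Latest tech news.
```

Only the text between the markers is returned, so the language selector and the category in the example page are left out of the message.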
Challenges
The most challenging aspect of the development was not being able to simulate a wiki family locally in our development environment to test inter-wiki message sending. Deployment and testing of minor fixes required waiting for the deployment train to run, resulting in development cycles lasting multiple weeks. We started working on this feature towards the end of March 2020, and the final patch related to it was merged in March 2021.
Outcome
The Tech News delivery process now involves:
- Create a translatable page that has the Tech News
- Mark the section to be sent as a message in <section> tags
- Translators submit the translations
- While sending the message, select the translatable page and section
The new features reduce the amount of manual work and the potential for errors while distributing multilingual MassMessages to the Wikimedia communities, including newsletters such as Tech News. The architecture was simplified by cutting down on the number of different tools that are needed. We also identified and made other improvements to how translated and untranslated content is shown together.
Partnering with other teams while approaching problems from a generic perspective helped to solve the specific needs of the CRS team and to improve our localization infrastructure for others. We would like to thank Johan Jönsson and the CRS team for their feedback and patience while we were working on this feature.
About this post
Featured image credit: Microphone in hall, Tim Napier, CC0 1.0