Pawing around with PAWS: recent updates to Wikimedia Cloud Services’ Jupyter notebooks instance
By Chico Venancio and the Wikimedia Cloud Services Team
As you may have read in a previous blog post, over the last few quarters we have had the opportunity to dedicate more resources to some of the tools that the community uses. First, it was Quarry, a public querying service for Wikireplicas, which got some nice background upgrades to increase stability, along with some user-facing features to increase usability.
This time around, PAWS is getting our attention. PAWS, short for “PAWS: A Web Shell”, is a Jupyter notebook deployment for Wikimedia contributors that is hosted by Wikimedia Cloud Services. A Jupyter notebook is an open-source tool that makes it possible to build, test, run and share code and to interact with a shell, all from your web browser.
.md and .rst file rendering
On the user-facing side, we’ve added .md and .rst file rendering! Previously, only .html and .ipynb files could be interpreted and rendered in their format, and all other text files appeared as plain text. Now .md and .rst files are rendered as well. To view a rendered file, follow the “Alter the URL” instructions.
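As a rough illustration of the rule described above (and not the actual PAWS implementation), the following Python sketch classifies a file name as rendered or plain text based on its extension. The function name `display_mode`, the set name `RENDERED_EXTENSIONS`, and the case-insensitive matching are all assumptions made for this example.

```python
from pathlib import PurePosixPath

# Extensions the post says are rendered in their own format.
# The set name is hypothetical; only the four extensions come from the post.
RENDERED_EXTENSIONS = {".html", ".ipynb", ".md", ".rst"}


def display_mode(filename: str) -> str:
    """Return "rendered" or "plain text" for a file name (illustrative only)."""
    # Assumption: extensions are compared case-insensitively.
    suffix = PurePosixPath(filename).suffix.lower()
    return "rendered" if suffix in RENDERED_EXTENSIONS else "plain text"


if __name__ == "__main__":
    for name in ["README.md", "notes.rst", "analysis.ipynb", "data.csv"]:
        print(f"{name}: {display_mode(name)}")
```

Any file whose extension is not in the set, such as a .csv or .txt file, falls through to plain text, which matches the behavior the post describes for all other text files.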
Additionally, Julia users rejoice! PAWS is now shipping with the Julia kernel. A Julia notebook can now be deployed by selecting “Julia 1.6.3” from the “New” dropdown.
Add PAWS-public links to notebooks
We have developed and deployed a new JupyterLab extension that adds PAWS-public links to notebooks and lets you create them from the context menu. This is in preparation for changing the default PAWS interface away from the deprecated Jupyter Notebook Classic.
You can try JupyterLab right now by changing the URL ending from “/tree” to “
Updated Jupyter version
On the backend side, we started with an update to the PAWS Jupyter version. This quickly showed us that our minikube deploy was quite out of date and that some other bits would need upgrading. In particular, we discovered that newer versions of minikube had upgraded their Ingress container, and loading MediaWiki inside an upgraded minikube no longer worked.
We initially got the minikube development environment running by using an old version of minikube. This was not ideal. Fortunately, the newer Jupyter that we were upgrading to could handle the Ingress changes, and dropping the local MediaWiki deploy in favor of using the actual MediaWiki got us running on the most recent minikube (v1.23.2 at the time) with less code. This removed many manual steps in setting up the local MediaWiki during the minikube deploy and made the minikube deploy better resemble the production deploy.
We innocently started with a plan to do nothing more than upgrade Jupyter, but our efforts quickly grew. First, we turned to the more practical matter of making PAWS easy to deploy to minikube while getting JupyterHub updated as well. This improves our ability to test features before they go live and makes it easier for community members to engage in the PAWS project. The changes and upgrades to the minikube deploy allow for a deploy to a modern minikube without the setup dance that was involved when MediaWiki was being installed locally to minikube. And, perhaps of greater importance, PAWS will actually install to minikube, where previously it would not. 😊
Contribute to PAWS and follow additional updates
For those technically minded contributors who are interested in hacking on PAWS, there is a PAWS GitHub repo where you can have a look at the code. If there are any changes you would like to see and you have an interest in Python, please put in a pull request!
Travis updated their CI policy, and we ran out of computer time over at their service, so we moved over to GitHub Actions. While we were at it, we added a few features. Now, when you open a pull request, your code will be built automatically, and a commit will be added to your branch that updates the Helm chart to deploy the container that was just built for your PR. At that point, anyone should be able to test in minikube by pulling down your branch and running a Helm upgrade as described in the README.md. Learn more over at the GitHub repo!
And for anyone watching very closely: we’ve upgraded the Kubernetes cluster that hosts PAWS to K8s 1.20.
More work is still going on in the background for PAWS. Keep an eye out for new features, stability improvements and, perhaps, automatic deployment on a merge. Let us know what you would like to see!
In the meantime, if you are interested, please visit the PAWS Phabricator project. There you will be able to see or open tickets related to PAWS. If there is a feature you are interested in or a bug you have found, open a ticket. And if there is an existing ticket that you are interested in working on, don’t hesitate: go ahead and claim it!
About this post
Featured image credit: Bobcat in Yosemite, “Mike” Michael L. Baird, CC BY 2.0