From hell to HTML: releasing a Python package to easily work with Wikimedia HTML dumps
Announcing mwparserfromhtml, a new library that makes it easy to parse the HTML content of Wikipedia articles.
Explore wiki project data faster with mwsql
Learn how the mwsql library makes it easier to download SQL dump files and work with them as pandas DataFrames or CSV files.