I've written before about how I use MediaWiki for taking notes and as one of my study tools. This has worked well for many years. But a problem started to develop: while I wrote my technical notes in MediaWiki, I wrote my day-to-day notes (books I want to read, notes from podcasts I listen to, and even my weekly planner) in Notion. This meant I had to use a different app to read and write in each tool, remember two different markup languages, and couldn't (cleanly) link pieces of content between the two. The final straw was realizing how much more effort I had to expend to maintain my MediaWiki instance; I just didn't have the time or will to keep up with new releases, not to mention maintain the server itself.
For these reasons, I decided to move all of my MediaWiki content to Notion and unify all of my notes. But this revealed a new problem: there was no tooling to automate this. So I created my own. Here's how it works.
The solution operates in two phases.
- In phase one, you export a dump of all content from MediaWiki, then parse and convert that dump into individual Markdown and media files. You organize the files based on how you want them to show up in Notion. You can also edit the content of the files or do additional processing to suit your needs.
- In phase two, you trigger the migration pipeline by placing one or more of the Markdown files (and any associated media files) at the input of the pipeline. The pipeline does further processing and then creates a new Notion page for each input file.
You repeat these phases as many times as necessary to get all of the content moved over and to refine the content and ensure everything looks as it should when it lands in Notion.
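To make phase one a little more concrete, here is a minimal sketch of the parse-and-convert step. It is not the solution's actual converter (that lives in the repo); it assumes the dump is a standard MediaWiki XML export, and the names `iter_pages`, `headings_to_markdown`, `convert_dump`, and the `Title_with_underscores.md` file naming are all hypothetical choices for illustration. It only converts section headings, whereas a real conversion also has to handle links, lists, tables, and media.

```python
import re
import xml.etree.ElementTree as ET

# A tiny stand-in for a MediaWiki XML export (real dumps hold many pages).
SAMPLE_DUMP = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page>
    <title>Linux notes</title>
    <revision><text>== Setup ==
Install the package.</text></revision>
  </page>
</mediawiki>"""

# Matches wikitext headings such as "== Setup ==" or "=== Details ===".
HEADING = re.compile(r"^(={2,6})[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE)


def local_name(tag: str) -> str:
    """Drop the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(dump_xml: str):
    """Yield (title, wikitext) for every <page> element in the dump."""
    root = ET.fromstring(dump_xml)
    for page in root.iter():
        if local_name(page.tag) != "page":
            continue
        title, text = "", ""
        for node in page.iter():
            name = local_name(node.tag)
            if name == "title":
                title = node.text or ""
            elif name == "text":
                # If a page has several revisions, the last one seen wins.
                text = node.text or ""
        yield title, text


def headings_to_markdown(wikitext: str) -> str:
    """Turn '== Heading ==' into '## Heading' (same depth, Markdown style)."""
    return HEADING.sub(lambda m: "#" * len(m.group(1)) + " " + m.group(2), wikitext)


def convert_dump(dump_xml: str) -> dict:
    """Map a hypothetical output file name to converted Markdown text."""
    return {
        title.replace(" ", "_") + ".md": headings_to_markdown(text)
        for title, text in iter_pages(dump_xml)
    }
```

Calling `convert_dump(SAMPLE_DUMP)` returns a dictionary with one entry per page, which you could then write to disk in whatever folder layout you want mirrored in Notion.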
The solution architecture is shown in the diagram below. The phase 1 steps are shown in the "Manual steps" box at the top and the phase 2 steps are shown in the "AWS Cloud" box underneath.
Some of the features of the solution are:
- Uses a serverless architecture (no more servers to manage or software updates to apply! 🥳)
- Imports embedded media such as images.
- Imports each article to the Notion page or database you specify.
A full list of features is in the solution's documentation, which is linked below.
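For a rough idea of what "importing an article to the page or database you specify" involves, the sketch below builds the JSON body for the Notion API's create-page request. It is only an illustration, not the pipeline's real code: `build_page_payload` is a hypothetical helper, each blank-line-separated chunk becomes a plain paragraph block, and for a database parent it assumes the title property is called `Name` (yours may differ). It does not send anything over the network and ignores the API's request size limits.

```python
import json


def build_page_payload(title, markdown, parent_id, parent_type="page_id"):
    """Build a request body for creating a Notion page (sketch only).

    parent_type is "page_id" to nest under a page or "database_id" to add
    a row to a database.
    """
    if parent_type not in ("page_id", "database_id"):
        raise ValueError("parent_type must be 'page_id' or 'database_id'")

    # One paragraph block per blank-line-separated chunk of text.
    chunks = [c.strip() for c in markdown.split("\n\n") if c.strip()]
    children = [
        {
            "object": "block",
            "type": "paragraph",
            "paragraph": {
                "rich_text": [{"type": "text", "text": {"content": chunk}}]
            },
        }
        for chunk in chunks
    ]

    # Under a page parent the property key is "title"; under a database
    # parent it must match the database's title column (assumed "Name").
    title_key = "title" if parent_type == "page_id" else "Name"

    return {
        "parent": {parent_type: parent_id},
        "properties": {
            title_key: {"title": [{"type": "text", "text": {"content": title}}]}
        },
        "children": children,
    }


# Example: inspect the body for a page nested under a placeholder parent ID.
payload = build_page_payload("Linux notes", "First.\n\nSecond.", "abc123")
print(json.dumps(payload, indent=2))
```

A body like this would be sent as a `POST` to `https://api.notion.com/v1/pages` with an integration token in the `Authorization` header and a `Notion-Version` header.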
The solution is built to run on AWS and is deployed using the AWS Cloud Development Kit (CDK). You will need an AWS account and the CDK installed on your computer. You'll also need Python installed to parse and convert the MediaWiki dump, and shell access to the MediaWiki server (or some equivalent access) to create the dump of articles and media files.
You can find full documentation and code in the solution's GitHub repo.