Wednesday, February 1, 2017

MediaWiki as a static website and content sharing

Using a wiki for knowledge management, in a team or individually, is easy and often the obvious choice.
Challenges appear when you need to share information stored in the wiki.
The main ones are hardening the MediaWiki installation for public access and sharing only part of the wiki content.

If your main goal is just to publish content, you can extract wiki pages as static HTML pages using a relatively simple wget one-liner. After extracting, you can publish your wiki using AWS S3 static website hosting.

To share only part of the information available in the wiki, you can leverage Categories and restrict a user's access to specified categories using a special extension. Afterwards, you can use this restricted user account to grab the wiki content.
Another simple way is to use the Category special page as the starting point for the crawler, so that it grabs only the pages that belong to a specific category, let's say a Public category.
The code is much shorter than the description above:

# get the wiki content
wget --recursive --level=1 --page-requisites --html-extension --no-directories --convert-links --no-parent -R "*Special*" -R "*action=*" -R "*printable=*"  -R "*oldid=*" -R "*title=Talk:*" -R "*limit=*" "http://mywikiprivate:80/wiki/index.php/Category:Public"
# replace links that point back to the private wiki with a link to the stub page
sed -i -E 's/http:\/\/mywikiprivate[^"]*/http:\/\/\/404.html/g' *.html
# remove the sensitive file
rm Category\:Public.1.html
# rename the Public category page so that it serves as the list of published pages
mv Category:Public.html Public.html
# sync content to AWS
aws s3 sync ./ s3://you_bucket/
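
Since the whole point of the sed step is to keep private links out of the published copy, it may be worth checking the result before syncing. Below is a minimal sketch of such a check, not part of my original script: the function name export_is_clean is hypothetical, and it assumes GNU grep (for --include) and that the private host name appears literally in any leftover link.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not part of the original script):
# succeeds only when no exported HTML file still mentions the private host.
# Usage: export_is_clean <export_dir> <private_host>
export_is_clean() {
    dir=$1
    host=$2
    # -r: recurse, -l: print matching file names only, -F: fixed-string match.
    # grep exits non-zero when nothing matches, so "|| true" keeps this
    # assignment from aborting a script that runs under "set -e".
    leaked=$(grep -rlF --include='*.html' -- "$host" "$dir" || true)
    if [ -n "$leaked" ]; then
        printf 'files still referencing %s:\n%s\n' "$host" "$leaked" >&2
        return 1
    fi
    return 0
}

# Possible use in the script above, right before the sync step:
#   export_is_clean . mywikiprivate && aws s3 sync ./ s3://you_bucket/
```

With this guard in place, the sync runs only when the export directory is free of references to the private host; otherwise the offending file names are printed to stderr.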

The result of running this script, along with some public notes from my wiki, can be found here:

Disclaimer: the current wiki publication contains only a small part of the information available and will be updated almost daily to add more content cleared for publishing. The main purpose of this wiki is to keep technical notes and references in a structured way. Some of them are obvious, outdated or incomplete.

The goal of establishing a public publishing process is to keep the wiki information up to date and to be able to publish small useful notes that do not fit the blog format and style.