Forum:Download the NetHackWiki

From NetHackWiki


Hello!

It's now possible to download an XML dump of the articles on this wiki and the images, without resorting to scraping the site. The packages are updated once a week. --paxed (talk) 14:45, 5 August 2013 (UTC)

Offline wiki

Thanks!

I sometimes play NetHack without Internet access on my old netbook and I'd like to have the wiki available offline. Is there a lightweight way of using this XML dump on Linux? Installing a local web server with MediaWiki and everything is not an option for a machine this old. Or would you rather recommend recursively wgetting the whole of nethackwiki.com for offline browsing, or something else? --88.115.0.97 16:09, 13 January 2014 (UTC)

I don't know the answer to your question per se, but honestly, people run nginx + MySQL + MediaWiki on 128 MB VPSes (I have). Of course, you can't support much concurrency that way and MySQL has to be tuned way down, but it doesn't take a lot to run MediaWiki for a single user. You could ask on vpsboard.com or lowendtalk.com for advice on running small instances. Hope that helps. Raindog308 (talk) 03:45, 16 January 2014 (UTC)
I'm a different user, but I agree that installing a web server plus MySQL seems a bit overkill for offline reading of the NetHack wiki, given that I wouldn't use these applications for anything else. Are there any better solutions? - 65.122.198.30 06:34, 9 March 2014 (UTC)
You can try feeding the dump to a Perl script using modules [1] and [2], thus transforming a giant XML into a giant HTML or a series of small HTMLs. 95.165.173.36 15:38, 27 May 2015 (UTC)
http://zdoom.org/wiki/Downloading_the_ZDoom_Wiki. ZzZombo (talk) 14:44, 4 June 2015 (UTC)
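For anyone who wants to try the "giant XML into a series of small HTMLs" idea without Perl, here is a minimal Python sketch using only the standard library. It assumes the dump follows the usual MediaWiki export layout (`<page>` elements containing a `<title>` and a revision `<text>`), and it does not render wikitext: each page is written out as escaped raw markup inside a `<pre>` block. The function names and the output naming scheme are made up for this illustration and are not taken from the modules linked above.

```python
import html
import re
import xml.etree.ElementTree as ET
from pathlib import Path


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(source):
    """Yield (title, wikitext) pairs from a MediaWiki XML export, streaming.

    `source` is a file name or a binary file object. If a page carries
    several revisions, the text of the last one in the file is used.
    """
    title, text = None, ""
    for _event, elem in ET.iterparse(source, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text or ""
        elif name == "text":
            text = elem.text or ""
        elif name == "page":
            if title is not None:
                yield title, text
            title, text = None, ""
            elem.clear()  # drop the finished page's children to save memory


def safe_filename(title):
    """Map a page title to a filesystem-friendly name (hypothetical scheme).

    Distinct titles can collide under this mapping; a real tool would
    need to handle that.
    """
    return re.sub(r"[^A-Za-z0-9._-]+", "_", title).strip("_") + ".html"


def write_pages(source, out_dir):
    """Write one small HTML file per page and return the page count."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    count = 0
    for title, text in iter_pages(source):
        body = (
            '<!DOCTYPE html>\n<meta charset="utf-8">\n'
            f"<title>{html.escape(title)}</title>\n"
            f"<h1>{html.escape(title)}</h1>\n"
            # Raw wikitext, escaped; no templates or links are rendered.
            f"<pre>{html.escape(text)}</pre>\n"
        )
        (out / safe_filename(title)).write_text(body, encoding="utf-8")
        count += 1
    return count
```

Something like `write_pages("dump.xml", "wiki_html")` would then fill a directory you can grep or open in a browser (both names are placeholders). Streaming with `iterparse` keeps memory use low, which matters on an old netbook, but turning wikitext into properly formatted pages would still need a real wikitext renderer.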


Hi paxed, is it possible to provide an openZIM-formatted copy of NetHackWiki in addition to the XML dump? It would eliminate the headache of installing LAMP and MediaWiki. Thanks. --Deadagain (talk) 13:49, 12 June 2015 (UTC)

OpenZIM sounds like a good idea. What about providing a torrent too? With an RSS torrent feed, we could even update it automatically. --Quantum Immortal (talk) 06:18, 14 June 2015 (UTC)
ZIM would be nice (howto). Xowa doesn't work on my Debian Testing and seems unmaintained (last official release fall 2020). ZIM appears to be the gold standard for offline website retrieval. I attempted to use mwoffliner, but the REST API endpoint does not work, so the script cannot download page HTML. As an alternative I tried wget-2-zim but kept getting firewalled for whatever reason. We could benefit from ZIM. Wiki dumps are insufficient to ensure nethackwiki is accurately archived for offline use... Winny (talk) 17:42, 19 August 2024 (UTC)
We were discussing this matter on the roguelikes Discord in #nethack. It turns out somebody did make a ZIM archive using mwoffliner back in 2022. Unfortunately mwoffliner no longer works, because the aforementioned REST API endpoint for rendering HTML is unavailable. In the meantime I tried zimit, but the archive it generated wasn't complete (the official zimit instance mentions a limit on requests...). I'm still resuming the wget-2-zim script occasionally; it often triggers some sort of automatic firewall after 1-10 minutes of fetching, and I then suspend the job to pick it up later. Any suggestions appreciated. Winny (talk) 19:09, 22 August 2024 (UTC)
As I wrote on the roguelikes Discord, the problem lies in the absence of VisualEditor. mwoffliner 1.13.0 doesn't detect this, but its latest commit does. It is also briefly discussed at https://github.com/openzim/mwoffliner/issues/1963. --rengaf/Tmp6154 (talk) 21:07, 22 August 2024 (UTC)
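On the "suspend and pick it up later" routine for scraping: one thing that might help is automating the waiting, with a pause between requests and a growing delay after each failure. Below is a rough Python sketch of that idea. It is not how wget-2-zim or zimit work, nobody in this thread knows what actually triggers the firewall, and every name and default value here is an assumption chosen for illustration.

```python
import time


def backoff_delays(base=5.0, factor=2.0, cap=600.0, retries=6):
    """Seconds to wait before each retry: base, base*factor, ... up to cap."""
    return [min(cap, base * factor ** i) for i in range(retries)]


def fetch_politely(urls, fetch, pause=1.0, sleep=time.sleep, **backoff):
    """Fetch URLs one by one, pausing between requests and backing off on errors.

    `fetch` is any callable that takes a URL and returns its content,
    raising OSError on failure (urllib's URLError is an OSError subclass).
    `sleep` is injectable so the logic can be exercised without waiting.
    Returns a dict of the URLs fetched so far; if one URL keeps failing
    after all retries, the run stops early so it can be resumed later.
    """
    results = {}
    for url in urls:
        # First attempt is immediate; later attempts wait longer each time.
        for delay in [0.0] + backoff_delays(**backoff):
            if delay:
                sleep(delay)
            try:
                results[url] = fetch(url)
                break
            except OSError:
                continue
        else:
            return results  # retries exhausted: give up for now
        sleep(pause)  # be gentle between successful requests
    return results
```

A real run could pass something like `lambda u: urllib.request.urlopen(u).read()` as `fetch`. Whether slower fetching avoids the block at all depends on the site's firewall rules, so treat the numbers as starting points to tune.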

Oooh, thank you very much! Now I can throw away that ugly wiki scraper I had lying around. :D ——KeyboardFire (talk) 07:37, 19 June 2015 (UTC)