User:FIQ/Restoring DynaHack saves

From NetHackWiki

Personal notes on how to restore DynaHack saves. I do not take any responsibility for what you do with this!

* There might be binary save data at the bottom. Get rid of it.
  - Search for "abcdefghijkl" to easily jump to the beginning of it
  - The last line should be a diff (~ f:(...))
  - Don't omit the newline at the end!
* Sample first line: NHGAME inpr 00e2e4a1 00008811 0.6.0
  - Make sure it says "inpr" before loading it
  - The game can mess up and make the file worse by failing to load it, so back it up each time before loading!
  - The first hexadecimal number is the file size. Adjust it after removing the binary stuff
  - The second one denotes turncount. You don't need to care about this (Replay mode uses it)
* Back up the save after adjusting it as per above and attempt to load it
  - If you get something akin to "Error at position 38062" (somewhat low number), try this:
    - Figure out the largest line in the save file
      - Try this: perl -ne 'print length()."  line $.  $_"' ~/.config/DynaHack/save/THESAVE.nhgame | sort -nr | head -n 1
      - Output: (size of line)  line (line)  (content)
    - Go to the file position given in the error (38062 in this example)
      - Emacs: goto-char
    - Remove data starting from the diff at that position. If that line is not a diff, start from the last diff above it
    - Keep removing up to, but not including, the diff on the line reported as (line) above
    - Remember to fix file size data at the top
    - The reason this can work is a separate DynaHack bug: whenever a player loads a save, the first diff after the load
      will essentially contain the whole game's data for some reason (it's a diff from null data). We abuse this and
      remove corrupted diff data starting from where the game errors, and the game will, by loading the null diff,
      get all the correct data in the game and then happily continue with its diffing.
    - If the load fails, try the 2nd largest line instead (for example, raise the count given to head in the command above)
    - If you're still failing, try the method below
  - If you get errors about the end:
    - Remove the last diff in the file and everything above it until the 2nd last diff
    - Remember to fix file size data at the top
    - This basically rolls the game back one turn. If the error is about the very end, this should fix things.
    - If the load fails, try removing the 2nd last diff as well, and so on
    - If you're still failing, try the method above
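The mechanical parts of the steps above (ranking lines by length and fixing the size in the first line) can be sketched in Python. This is only a sketch under assumptions: the function names `longest_lines` and `fix_header_size` are made up for illustration, the header is assumed to have the five space-separated fields shown in the sample first line, and "file size" is assumed to mean the total byte length of the file including the header line itself.

```python
def longest_lines(data: bytes, top: int = 3):
    """Return (length, line_number, byte_offset) tuples, longest line first.

    Lengths include the trailing newline, like the perl one-liner.
    Line numbers are 1-based, byte offsets are 0-based.
    """
    results = []
    offset = 0
    number = 0
    while offset < len(data):
        end = data.find(b"\n", offset)
        end = len(data) if end == -1 else end + 1
        number += 1
        results.append((end - offset, number, offset))
        offset = end
    # Stable sort: lines of equal length keep their file order.
    results.sort(key=lambda entry: entry[0], reverse=True)
    return results[:top]


def fix_header_size(data: bytes) -> bytes:
    """Return data with the first hex field of the header set to len(data).

    Assumption: the size field counts every byte of the file.
    The field is 8 hex digits wide, so rewriting it never changes the size.
    """
    header, newline, rest = data.partition(b"\n")
    fields = header.split(b" ")
    if len(fields) != 5 or fields[0] != b"NHGAME":
        raise ValueError("first line does not look like an NHGAME header")
    if len(fields[2]) != 8:
        raise ValueError("unexpected width for the size field")
    fields[2] = b"%08x" % len(data)
    return b" ".join(fields) + newline + rest
```

Read the save in binary mode (`open(path, "rb")`), make the backup copy first, trim the data, and only then pass the result through `fix_header_size` before writing it out. Asking `longest_lines` for more than one entry also gives the 2nd largest line directly.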

Things that cause a DynaHack save to lose sync

  • Saving and restoring a game (always)
  • Recovering from blindness (MTAG_OBJ, byte 39 [obj->ox]). Verification: See game log from Aug 28, 2020 23:30ish UTC. FIQ wiz-vam-mal-cha killed by an invisible gnome lord after 1934 moves (desync on replay action 470/609). No save+restore was made before this action.

Finding desyncs

Play the game, avoiding existing known desyncs (meaning take care not to save+restore!) until you inevitably die. Afterwards, invoke "view replay" and use Page Down to rapidly move 50 actions into the future each step. Once you see something like "desync between recording and save at tag (16, 275) + 39 bytes", take note of the replay action counter. Leave replay mode and re-enter that action, then scroll with Right in single-action steps until you see the same error again. Try to figure out what you just did.

Replay errors read as follows: "desync between recording and save at tag (N, M) + L bytes". Here, N is the MTAG (as seen in libnitrohack/include/decl.h; keep in mind that MTAGs start at 0), M is the object/monster/etc ID and may not always be shown, and L is the byte offset of the discrepancy. To figure out the actual field, search the code for the MTAG you found earlier, like this: mtag(mf, (something), MTAG_xxx) where "(something)" depends on the tag. Count the bytes by tracking mwrite()s: mwrite32 is 4 bytes, mwrite16 is 2 bytes, mwrite8 is 1 byte. For example, "desync between recording and save at tag (16, 275) + 39 bytes" means that MTAG_OBJ (which concerns objects) with ID 275 (in game) desynced at byte 39, which, if we count bytes in save_obj() in mkobj.c, turns out to be the "ox" field.
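The byte counting described above can be sketched in Python. Everything here is illustrative: `parse_desync` and `field_at` are hypothetical helper names, the pattern for a message without an ID is a guess (the notes only say the ID "may not always show"), and the byte offset is assumed to count from 0 at the first mwrite after the mtag() call. The list of writes has to be transcribed by hand from the relevant save function; the one in the usage example below is a placeholder, not the real save_obj() layout.

```python
import re

# Assumption: when the ID is missing, the message simply omits ", M".
DESYNC_RE = re.compile(
    r"desync between recording and save at tag "
    r"\((\d+)(?:, (\d+))?\) \+ (\d+) bytes"
)

# Byte widths of the write calls, as stated in the notes.
WIDTHS = {"mwrite8": 1, "mwrite16": 2, "mwrite32": 4}


def parse_desync(message):
    """Return (mtag, id_or_None, byte_offset), or None if nothing matches."""
    match = DESYNC_RE.search(message)
    if match is None:
        return None
    tag, ident, offset = match.groups()
    return int(tag), (int(ident) if ident is not None else None), int(offset)


def field_at(writes, byte_offset):
    """Return the name of the field covering byte_offset (0-based).

    writes is a list of (call_name, field_name) pairs in source order,
    copied by hand from the save function that follows the mtag() call.
    """
    position = 0
    for call, name in writes:
        width = WIDTHS[call]
        if position <= byte_offset < position + width:
            return name
        position += width
    return None
```

For example, with a placeholder layout `[("mwrite32", "a"), ("mwrite16", "b"), ("mwrite8", "c")]`, `field_at(layout, 5)` returns `"b"`, because bytes 0 to 3 belong to `a` and bytes 4 to 5 to `b`.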