The conversion of Treasure Island Dizzy for the Sam Coupe is coming to an end, so I've started thinking about other projects. There are many other games I would like to convert to the Sam, and I'll probably run a poll about that at some point, but one game that I have always loved is Target Renegade. The other day there was a discussion about remaking it for the ZX Spectrum Next, so I thought I would have a quick look at reverse engineering the Spectrum version.
I've reverse engineered a few titles now, so I thought I would document the process. The first thing was to grab a copy of the game from World Of Spectrum. I usually go for a .TAP version if I can, as several of the tools work with this format. With a version of the game to hand, I have to get the main code files extracted so I can disassemble them. You could just load the game into an emulator and save the whole of RAM as a binary file, but with games that use self-modifying code you wouldn't know what has already happened, and you wouldn't be sure of the correct startup process either. The solution is to debug through the entire loading process, bringing in some of the great tools from http://zx-modules.de/ along the way. In this case I'm going to use ZX-Editor to look at the BASIC program that is loaded first. Once the TAP file has been loaded in, you are presented with the following…
Hopefully you can all follow what this program does. The important information here is which files get loaded where. In the case of this game, the second file loaded, “Target2”, is the loading screen, and “Target3” appears to be the main code file. The interesting part is that, following the main load, it clears the screen and loads “Target4” to the screen address (16384). This could be another image or it could be additional data, as some programs use the screen memory for temporary data. After “Target4” has been loaded it runs some machine code at address 25086, loads the last file, “Target5”, again into part of the screen (16470), and runs another machine code routine at 25089.
With the information from the BASIC program I can now look to extract the required code files. I can ignore “Target2” as that is just the loading image. Using another module from zx-modules.de, this time ZX-BlockEditor, I can get a few more bits of information along with the ability to extract the files as binary. I loaded the .TAP into ZX-BlockEditor and the following information is available.
In the BASIC program I saw that “Target3”, which appeared to be the main code file, was loaded into its default position. This new information shows that it is loaded to memory address 25048 (&61D8), which makes sense given the addresses that are run using the RANDOMIZE USR calls. From within ZX-BlockEditor I can extract the files as binary onto the PC, ready to disassemble.
The obvious place to start disassembling the game code is 25086 (&61FE), as this is the first address called by RANDOMIZE USR. After loading the binary file saved out for “Target3” into an emulator or disassembler, making sure to load it at address 25048, I can see that the code at &61FE is
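Since BASIC talks in decimal and the disassembler talks in hex, a tiny helper saves some head scratching. This is only a sketch in Python, not something taken from the game or the tools; the function names are made up for illustration, and the only assumption is the 25048 load address mentioned above.

```python
# Hypothetical helpers for hopping between BASIC's decimal addresses,
# the disassembler's hex addresses, and offsets into the extracted binary.
LOAD_ADDRESS = 25048  # where "Target3" is loaded, as described in the text


def to_hex(address: int) -> str:
    """Format an address in the &XXXX style used in the listings."""
    return f"&{address:04X}"


def file_offset(address: int, load_address: int = LOAD_ADDRESS) -> int:
    """Offset of a Spectrum address inside the extracted binary file."""
    if address < load_address:
        raise ValueError("address is below the start of the file")
    return address - load_address


if __name__ == "__main__":
    # The two RANDOMIZE USR entry points seen in the BASIC loader.
    for entry in (25086, 25089):
        print(entry, to_hex(entry), "file offset", file_offset(entry))
```

So the first entry point sits 38 bytes into the extracted file, and the second entry point, 25089, is &6201, which is three bytes further on (the length of a Z80 `jp nn` instruction).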
&61FE jp &6318
Not the most useful, other than to say I need to be looking at &6318, which gives me
&6318 di
&6319 push ix
&631B ld hl, (&61F7)
&631E ld ix, (&61EF)
&6322 ld de, 0
&6325 call &6387
&6328 pop ix
&632A ld ix, &4000
&632E push ix
&6330 pop hl
&6331 ld de, (&61F5)
&6335 add hl, de
&6336 ld de, &5B00
&6339 call &6387
&633C xor a
&633D call &62D3
&6340 ld a, (1)
&6343 cp &0AF
&6345 jr z, &634A
&6347 call &62B1
&634A in a, (&9F)
&634C ld a, (0)
&634F cp &0C3
&6351 jp z, &71
&6354 ret
Now this gives me a little more to work from. I know we are in some sort of startup routine, and I assume it's going to return to BASIC at some point, as BASIC still has to load “Target5” and call the next machine code routine. This little function makes 3 additional function calls that I need to look at. It also loads IX with a number I always like to see, namely &4000, as it's the screen address (reverse engineering software is always easiest starting from anything that draws to the screen, as you hope to find drawing functions). The only slight issue here is that I need to remember “Target4” was loaded to this address, so it could be more setup data. The 3 subroutines that I now need to investigate are &6387, &62D3 and &62B1, so I may as well start from the top.
There are a few registers set up before calling &6387: HL, DE and IX. HL is loaded with the word at memory location &61F7, which is &EFFA, and that happens to be the end of the loaded code binary (“Target3” was loaded to 25048 and is 36386 bytes long, giving 61434 or &EFFA). IX is loaded from &61EF, which holds &6430 (no clues here), and DE is 0… best look at that function.
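The pointers the routine picks up are ordinary little-endian words sitting inside the loaded file, so they can be read straight out of the extracted binary. A minimal sketch in Python, assuming the binary was loaded at 25048; `read_word` and `demo_buffer` are made-up helpers, and the demo uses a dummy buffer with the two values quoted above poked in rather than the real file.

```python
LOAD_ADDRESS = 25048   # &61D8, load address of "Target3"
FILE_LENGTH = 36386    # length of "Target3" in bytes


def read_word(binary: bytes, address: int, load_address: int = LOAD_ADDRESS) -> int:
    """Read a 16-bit little-endian word at a Spectrum address from the binary."""
    offset = address - load_address
    return binary[offset] | (binary[offset + 1] << 8)


def demo_buffer() -> bytearray:
    """Stand-in for the real file: zeros plus the two words quoted in the text."""
    buf = bytearray(64)
    ix_offset = 0x61EF - LOAD_ADDRESS
    hl_offset = 0x61F7 - LOAD_ADDRESS
    buf[ix_offset:ix_offset + 2] = bytes([0x30, 0x64])  # &6430, low byte first
    buf[hl_offset:hl_offset + 2] = bytes([0xFA, 0xEF])  # &EFFA, low byte first
    return buf


if __name__ == "__main__":
    data = demo_buffer()
    print(hex(read_word(data, 0x61F7)), hex(read_word(data, 0x61EF)))
    # End of the loaded file: load address plus length.
    print(hex(LOAD_ADDRESS + FILE_LENGTH))
```

With the real extracted file read in as bytes, the same `read_word` call would be the way to double-check the values the disassembler shows.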
&6387 dec hl
&6388 dec de
&6389 dec ix
&638B and a
&638C sbc hl, de
&638E add hl, de
&638F ret z
OK, so decrement everything (things like this make me wonder why not just supply values one byte less, but there you go), and then a sanity check to make sure HL and DE are not the same number, in which case SBC HL,DE would set the Z flag and the routine returns. The ADD HL,DE simply restores HL, and a 16-bit ADD leaves the Z flag alone. The next address, &6390, is jumped back to at a couple of points…
&6390 push hl
&6391 push de
&6392 push ix
&6394 pop de
&6395 and a
&6396 sbc hl, de
&6398 pop de
&6399 pop hl
&639A ret z
This chunk of code does little more than check whether HL and IX hold the same value and return if they do, so something in here keeps looping back until HL and IX are the same…
&639B ld a, (hl)
&639C push af
&639D cp &0CB
&639F dec hl
&63A0 jr z, &63A7
&63A2 pop af
&63A3 ld (de), a
&63A4 dec de
&63A5 jr &6390
Now this is getting to some interesting code. It's writing data to DE, which was 0 on entry but has been decremented by 1 (so it is now &FFFF), and it looks like DE is the end of some sort of buffer that this routine fills up. This chunk of code reads a byte from HL (so HL points into some sort of data buffer) and, if that byte ISN'T &CB, writes the byte to DE and loops around… so what happens if it IS &CB…
&63A7 ld a, (hl)
&63A8 cp &0ED
&63AA jr nz, &63A2
Well, if it was &CB but the next byte wasn't &ED, it still writes the &CB out to DE. So to do anything other than copy the byte from HL to DE, the values need to be &CB then &ED. Let's carry on and see what happens if the two bytes are &CB, &ED…
&63AC dec hl
&63AD ld a, &37
&63AF cp (hl)
&63B0 jr z, &63B5
&63B2 inc hl
&63B3 jr &63A2
I'm thinking this is a lot of bytes to check for a command string, as we are now checking that the 3rd byte from HL is &37. If it's not &37, the pointer in HL is put back again and we just write out the initial &CB. At this point I know this routine will copy a byte from HL to DE unless the values at HL are &CB, &ED, &37. Each access is followed by a decrement, so it's little surprise that HL was set to the end of the data file “Target3” and DE is the end of the output buffer, in this case 0 (the first thing this function did was subtract 1, so DE will be &FFFF, the end of memory). So: copy all the data from the loaded file to &FFFF, byte by byte, unless it's &CB, &ED, &37, OR HL and IX are the same (remember that little check at the start of the loop). HL is the end of the data (remember it's using DEC, so it's working backwards) and IX must be the start of the data. This function runs through a data buffer from HL down to IX, writing data out to DE. The only thing left to work out is WHAT happens when the bytes are &CB, &ED, &37…
&63B5 pop af
&63B6 dec hl
&63B7 ld a, (hl)
&63B8 dec hl
&63B9 ld b, (hl)
&63BA dec hl
&63BB ld c, (hl)
&63BC dec hl
&63BD ld (de), a
&63BE dec de
&63BF dec bc
&63C0 ex af, af'
&63C1 ld a, b
&63C2 or c
&63C3 jr z, &6390
&63C5 ex af, af'
&63C6 jr &63BD
Well, that answers that. If the source data (pointed to by HL) is &CB, &ED, &37, it reads the next 3 bytes into A, B and C, then writes the value in A to the buffer pointed to by DE, BC times. This is a sort of byte-run (run-length) compression system. It works backwards because that allows the compressed data to sit at the start of the buffer it is being decompressed into: the output fills the buffer from the top down, so the compressed data lower down should only be overwritten once it has been read. It does seem excessive to use 3 bytes to issue a “repeat this byte” command, though. I appreciate it needed a sequence that wouldn't occur in the data, but a standard byte-run compression system handles this differently. I'd love to know if there was a reason for this, or if it's just because things like byte-run compression weren't really well known at the time this was written…
Having decoded this first function, we can now assume &6387 is a decompression routine, the inputs being HL = the end of the compressed data, IX = the start of the compressed data and DE = the end of the output buffer. It's called twice: once to decode the compressed data stored at the end of the “Target3” file to the end of memory, and then again to decode the data file “Target4” onto the screen…
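To check my understanding, here is the routine rewritten in Python. Treat it as a sketch of what I think &6387 does rather than a byte-exact port: memory is modelled as a 64K bytearray, the function and parameter names are mine, and the loop stops as soon as the read pointer drops below the start of the data, whereas the Z80 code only stops on an exact match with IX.

```python
MARKER_1, MARKER_2, MARKER_3 = 0xCB, 0xED, 0x37  # read in this order, going down


def unpack_backwards(mem: bytearray, src_end: int, src_start: int, dst_end: int) -> int:
    """Model of the routine at &6387 (an interpretation, not the original code).

    mem       -- 64K of Spectrum memory
    src_end   -- HL on entry: address just past the last compressed byte
    src_start -- IX on entry: address of the first compressed byte
    dst_end   -- DE on entry: address just past the last output byte
                 (0 wraps round to the top of memory, &FFFF)
    Returns the address of the lowest byte written.
    """
    hl = src_end - 1
    de = (dst_end - 1) & 0xFFFF
    stop = src_start - 1
    if hl == de:                      # the "ret z" sanity check at &638F
        return (de + 1) & 0xFFFF
    while hl > stop:
        byte = mem[hl]
        hl -= 1
        is_run = (
            byte == MARKER_1
            and mem[hl & 0xFFFF] == MARKER_2
            and mem[(hl - 1) & 0xFFFF] == MARKER_3
        )
        if not is_run:
            mem[de] = byte            # plain literal, copied as-is
            de = (de - 1) & 0xFFFF
            continue
        # Below the marker: value, then count high byte, then count low byte.
        value = mem[(hl - 2) & 0xFFFF]
        count = (mem[(hl - 3) & 0xFFFF] << 8) | mem[(hl - 4) & 0xFFFF]
        hl -= 5
        # The Z80 loop writes first and tests BC afterwards, so a count of 0
        # would behave as 65536.
        while True:
            mem[de] = value
            de = (de - 1) & 0xFFFF
            count = (count - 1) & 0xFFFF
            if count == 0:
                break
    return (de + 1) & 0xFFFF
```

Going by the disassembly, the first call would correspond to `unpack_backwards(mem, 0xEFFA, 0x6430, 0)` with “Target3” loaded at 25048, and the second to a source starting at &4000, ending at &4000 plus the word stored at &61F5, with the output ending at &5B00.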
After the two decoding calls the game calls &62D3, which is a short function that does…
&62D3 ld bc, &7FFD
&62D6 out (c), a
&62D8 ret
This is nothing more than a function to set the paging register on 128K machines. It's called with A=0, which simply pages RAM bank 0 in at &C000 (this is the default) and asks for the 128K ROM at address 0. After the call, the code checks whether memory address 1 holds &AF and, if it doesn't, calls &62B1. This is simply checking whether it actually got the 128K ROM after asking for it: on a 48K machine the byte will still be &AF, whereas on a 128K machine the value will be different. Now, I'm not interested in the differences between 128K and 48K (in this game it's just music and the fact that it loads all of the game into RAM at once). I want the core game stuff, so I'm going to ignore this 128K-only call for now.
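For anyone not used to the 128K paging port, the value written to &7FFD is a set of bit fields. Here is a small sketch in Python of how I read them, using the standard 128K layout (bits 0-2 select the RAM bank at &C000, bit 3 the screen, bit 4 the ROM, bit 5 locks paging). The function names are just for illustration.

```python
def decode_7ffd(value: int) -> dict:
    """Split a byte written to port &7FFD into its standard 128K fields."""
    return {
        "ram_bank_at_c000": value & 0x07,     # bits 0-2
        "shadow_screen": bool(value & 0x08),  # bit 3: 0 = normal screen
        "rom": (value >> 4) & 0x01,           # bit 4: 0 = 128K ROM, 1 = 48K BASIC ROM
        "paging_locked": bool(value & 0x20),  # bit 5: locks paging until reset
    }


def rom_looks_48k(byte_at_address_1: int) -> bool:
    """The 48K ROM starts DI / XOR A (&F3, &AF), so address 1 holds &AF."""
    return byte_at_address_1 == 0xAF


if __name__ == "__main__":
    # The game calls &62D3 with A = 0.
    print(decode_7ffd(0))
```

With A=0 every field comes out as zero: bank 0 at &C000, normal screen, 128K ROM, paging left unlocked. The check on address 1 then tells the code whether the ROM switch actually happened.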
The last little bit puzzles me, so maybe someone can throw some light on it. It inputs from port &9F, which is a port I'm not familiar with (unless my memory is failing me, possible 😉 ). It then reads byte 0 and, if it's &C3, jumps to memory address &71. I'll have to check what's there, but on a normal 48K machine byte 0 won't be &C3…
So, at this point I have decoded a couple of data files for the game. I'll need to save out the output from that before I disassemble more, and I'll also have to research this last little bit, but for now, happy coding until next time. Any questions, feel free to shout.
NOTE: One thing that has been mentioned is that the TZX version is slightly different. I'll check what happens on that and see… more on that next time.