The conversion of Treasure Island Dizzy for the Sam Coupe is coming to an end, so I've started thinking about other projects.  There are many other games I would like to convert to the Sam, and I'll probably run a poll about that at some point.  However, one game that I have always loved is Target Renegade.  The other day there was a discussion about remaking it for the ZX Spectrum Next, so I thought I would have a quick look at reverse engineering the Spectrum version.

I've reverse engineered a few titles now, so I thought I would document the process.  The first thing was to grab a copy of the game from World Of Spectrum; I usually go for a .TAP version if I can, as several of the tools work with this format.  With a copy of the game to hand, I have to get the main code files extracted so I can disassemble them.  You could just load the game into an emulator and save the entire contents of RAM as a binary file, but with games that use self-modifying code you wouldn't know what has already happened, and you wouldn't be sure of the correct startup process.  The solution is to debug through the entire loading process, bringing in some of the great tools from http://zx-modules.de/  In this case I'm going to use ZX-Editor to look at the BASIC program that is loaded first.  Once the TAP file has been loaded in, you are presented with the following…

Hopefully you can all follow what this program does.  The important information here is which files get loaded where.  In the case of this game, the second file loaded, “Target2”, is the loading screen, and “Target3” appears to be the main code file.  The interesting part is that after the main load it clears the screen and loads “Target4” to the screen address (16384).  This could be another image or it could be additional data, as some programs use screen memory for temporary data.  After “Target4” has been loaded it runs some machine code at address 25086, loads the last file, “Target5”, again into part of the screen (16470), and runs another machine code routine at 25089.

With the information from the BASIC program I can now look to extract the required code files.  I can ignore “Target2” as that is just the loading image.  Using another module from zx-modules.de, this time ZX-BlockEditor, I can get a few more bits of information along with the ability to extract the files as binary.  I loaded the .TAP into ZX-BlockEditor and the following information is available.

In the BASIC program I saw that “Target3”, which appeared to be the main code file, was loaded into its default position.  This new information shows that it is loaded at memory address 25048, which makes sense given the addresses that are run by the RANDOMIZE USR calls.  From within ZX-BlockEditor I can extract the files as binary onto the PC, ready to disassemble.
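
If you don't have ZX-BlockEditor to hand, the same header information can be pulled out of the .TAP with a few lines of Python. This is only a rough sketch, assuming the standard .TAP layout (a 2-byte little-endian block length, then a 19-byte header block with flag byte 0, a type byte, a 10-character name, the data length and two parameters). The function name `list_tap_headers` is my own invention and not part of any tool mentioned here.

```python
import struct

def list_tap_headers(tap: bytes):
    """Return (name, type, data_length, param1) for each header block in a .TAP image.

    For type 3 (Code) blocks, param1 is the default load address.
    """
    headers = []
    pos = 0
    while pos + 2 <= len(tap):
        # Every block is prefixed with its length as a little-endian word
        (block_len,) = struct.unpack_from("<H", tap, pos)
        block = tap[pos + 2 : pos + 2 + block_len]
        pos += 2 + block_len
        # A standard header is 19 bytes long and starts with flag byte 0x00
        if len(block) == 19 and block[0] == 0x00:
            block_type = block[1]
            name = block[2:12].decode("ascii", errors="replace").rstrip()
            data_len, param1, _param2 = struct.unpack_from("<HHH", block, 12)
            headers.append((name, block_type, data_len, param1))
    return headers
```

Run over the real tape image, this should list “Target3” as a Code block with the 25048 load address that ZX-BlockEditor reports, although I have only checked the sketch against hand-built test data.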

The obvious place to start disassembling the game code is 25086 (&61FE), as this is the first address called by RANDOMIZE USR. After loading the binary file saved out for “Target3” into an emulator or disassembler, making sure to load it at address 25048, I can see that the code at &61FE is

&61FE                 jp      &6318

Not the most useful, other than to say I need to be looking at &6318, which gives me

&6318                 di
&6319                 push    ix
&631B                 ld      hl, (&61F7)
&631E                 ld      ix, (&61EF)
&6322                 ld      de, 0
&6325                 call    &6387
&6328                 pop     ix
&632A                 ld      ix, &4000
&632E                 push    ix
&6330                 pop     hl
&6331                 ld      de, (&61F5)
&6335                 add     hl, de
&6336                 ld      de, &5B00
&6339                 call    &6387
&633C                 xor     a
&633D                 call    &62D3
&6340                 ld      a, (1)
&6343                 cp      &0AF
&6345                 jr      z, &634A
&6347                 call    &62B1
&634A                 in      a, (&9F)
&634C                 ld      a, (0)
&634F                 cp      &0C3
&6351                 jp      z, &71
&6354                 ret

Now this gives me a little more to work from.  I know we are in some sort of startup routine, and I assume it's going to return to BASIC at some point, as it has to load “Target5” and call the next machine code routine.  This little function has 3 additional subroutine calls to look at.  It also loads a register with a number I always like to see, namely &4000, as it's the screen address (reverse engineering software is always easiest starting from anything that draws to the screen, as you hope to find drawing functions).  The only slight issue here is that I need to remember “Target4” was loaded to this address, so it could be more setup data.  The 3 subroutines that I now need to investigate are &6387, &62D3 and &62B1, so I may as well start from the top.

There are a few registers set up before calling &6387: HL, DE and IX.  HL is loaded with the word at memory location &61F7, which is &EFFA; this happens to be the end of the loaded code binary (“Target3” was loaded at 25048 and is 36386 bytes long, giving 61434 or &EFFA).  IX is loaded from &61EF, which holds &6430 (no clues here), and DE is 0… best look at that function.
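
Before looking at the function, the address arithmetic above is easy to double check in Python. This is just a sketch: `read_word` is a hypothetical helper of mine that reads a little-endian word from the extracted “Target3” binary at a given Spectrum address, assuming the file was saved out whole from its 25048 load address.

```python
LOAD_ADDR = 25048   # load address of "Target3", from the tape header
FILE_LEN = 36386    # length of "Target3" in bytes

def read_word(image: bytes, addr: int, load_addr: int = LOAD_ADDR) -> int:
    """Read the little-endian 16-bit word stored at a Spectrum address."""
    offset = addr - load_addr          # position of that address in the file
    if not 0 <= offset <= len(image) - 2:
        raise ValueError(f"address {addr:#06x} is outside the image")
    return image[offset] | (image[offset + 1] << 8)

# One byte past the end of the loaded file
end_addr = LOAD_ADDR + FILE_LEN
print(hex(end_addr))  # 0xeffa
```

With the real binary loaded into `image`, `read_word(image, 0x61F7)` and `read_word(image, 0x61EF)` should give the HL and IX values quoted above.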

&6387                 dec     hl
&6388                 dec     de
&6389                 dec     ix
&638B                 and     a
&638C                 sbc     hl, de
&638E                 add     hl, de
&638F                 ret     z

Ok, so decrement everything (things like this make me wonder why not just supply values one less, but there you go), and then a sanity check to make sure HL and DE are not the same number, in which case SBC HL,DE would set the Z flag and the routine returns straight away.  The ADD HL,DE afterwards puts HL back without touching Z.  The next address, &6390, is jumped back to at a couple of points…
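
As a quick aside before moving on to the loop at &6390, here is a small Python model of that compare idiom (AND A / SBC HL,DE / ADD HL,DE). It is only a sketch of the register behaviour as I understand it, and the function name is made up for illustration.

```python
def compare_hl_de(hl: int, de: int):
    """Model AND A / SBC HL,DE / ADD HL,DE; returns (hl_after, zero_flag)."""
    carry = 0                              # AND A clears the carry flag
    diff = (hl - de - carry) & 0xFFFF      # SBC HL,DE (16-bit wrap-around)
    zero = diff == 0                       # Z is set only if HL equalled DE
    hl_after = (diff + de) & 0xFFFF        # ADD HL,DE restores HL, Z unchanged
    return hl_after, zero
```

So HL always comes out unchanged, and the only useful result is the Z flag that the RET Z tests.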

&6390                 push    hl
&6391                 push    de
&6392                 push    ix
&6394                 pop     de
&6395                 and     a
&6396                 sbc     hl, de
&6398                 pop     de
&6399                 pop     hl
&639A                 ret     z

And this chunk of code does little more than check whether HL and IX hold the same value and return if they do, so something in here keeps looping back until HL and IX are the same…

&639B                 ld      a, (hl)
&639C                 push    af
&639D                 cp      &0CB
&639F                 dec     hl
&63A0                 jr      z, &63A7

&63A2                 pop     af
&63A3                 ld      (de), a
&63A4                 dec     de
&63A5                 jr      &6390

Now this is getting to some interesting code. It's writing information to DE, which was 0 on entry but has since been decreased by 1, so it looks like DE is the end of some sort of buffer that this routine fills up. In this chunk of code, it reads a byte from HL (so HL points into some sort of data buffer), and if that byte ISN'T &CB, it writes the byte to DE and loops around… so what happens if it IS &CB…

&63A7                 ld      a, (hl)
&63A8                 cp      &0ED
&63AA                 jr      nz, &63A2

Well.. if it was &CB but the next byte wasn't &ED, it still writes the &CB out to DE. So to not just copy the byte from HL to DE, the values need to be &CB then &ED. Let's carry on and see what happens if the two bytes are &CB, &ED..

&63AC                 dec     hl
&63AD                 ld      a, &37
&63AF                 cp      (hl)
&63B0                 jr      z, &63B5
&63B2                 inc     hl
&63B3                 jr      &63A2

I'm thinking this is a lot of bytes to check for a command string, as we are now checking that the 3rd byte from HL is &37. If it's not &37, the pointer in HL is put back again and we just write out the initial &CB. At this point I know this routine will copy a byte from HL to DE unless the values at HL are &CB, &ED, &37. Each access is followed by a decrement, so it's little surprise that HL was set to the end of the data file “Target3” and DE is the end of the output buffer, in this case 0 (the first thing this function did was subtract 1, so DE will be &FFFF, the end of memory). So: copy all the data from the loaded file to &FFFF byte by byte, unless it's &CB, &ED, &37 or HL and IX are the same (remember that little check at the start of the loop). HL is the end of the data (remember it's using dec, so it's working backwards) and IX must be the start of the data. This function runs through a data buffer from HL down to IX, writing data out to DE. The only thing to work out now is WHAT happens when the bytes are &CB, &ED, &37..

&63B5                 pop     af
&63B6                 dec     hl
&63B7                 ld      a, (hl)
&63B8                 dec     hl
&63B9                 ld      b, (hl)
&63BA                 dec     hl
&63BB                 ld      c, (hl)
&63BC                 dec     hl

&63BD                 ld      (de), a
&63BE                 dec     de
&63BF                 dec     bc
&63C0                 ex      af, af'
&63C1                 ld      a, b
&63C2                 or      c
&63C3                 jr      z, &6390
&63C5                 ex      af, af'
&63C6                 jr      &63BD

Well, that answers that. If the source data (pointed to by HL) is &CB, &ED, &37, it reads the next 3 bytes into A, B and C, then writes the value in A to the buffer pointed to by DE, BC times. This is a sort of byte-run compression system. Working backwards allows the compressed data to sit at the start of the buffer it is going to be decompressed into. However, it does seem excessive to use 3 bytes to issue a “copy a run of bytes” command (6 bytes in total once the value and the 16-bit count are included). I appreciate it needed a sequence that wouldn't otherwise happen, but a standard byte-run compression system handles this differently. I'd love to know if there was a reason for this, or if it's just because things like byte-run compression weren't really well known at the time this was written…

Having decoded this first function, we can now assume &6387 is a decompression routine, the inputs being HL = the end of the compressed data, IX = the start of the compressed data and DE = the end of the output buffer. It's called twice: once to decode the compressed data stored at the end of the “Target3” file to the end of memory, and then again to decode the data file “Target4” onto the screen…
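
To convince myself I have read the routine correctly, here is a Python model of it. Treat it as a sketch based purely on the listing above rather than anything official: `unpack_backwards` and `MARKER` are names I have made up, and `mem` is assumed to be a 64K `bytearray` standing in for the Spectrum's memory.

```python
MARKER = (0xCB, 0xED, 0x37)  # escape bytes, in the order the routine reads them

def unpack_backwards(mem: bytearray, src_end: int, src_start: int, dst_end: int) -> int:
    """Model of &6387. On entry HL=src_end, IX=src_start, DE=dst_end.

    Returns the final value of DE (one below the last byte written).
    """
    hl = (src_end - 1) & 0xFFFF
    de = (dst_end - 1) & 0xFFFF
    ix = (src_start - 1) & 0xFFFF
    if hl == de:                      # the sanity check at &638B
        return de
    while hl != ix:                   # the HL == IX check at &6390
        a = mem[hl]
        hl = (hl - 1) & 0xFFFF
        is_run = (a == MARKER[0] and mem[hl] == MARKER[1]
                  and mem[(hl - 1) & 0xFFFF] == MARKER[2])
        if not is_run:
            mem[de] = a               # plain literal byte
            de = (de - 1) & 0xFFFF
            continue
        # Run token: value, then count high byte, then count low byte
        value = mem[(hl - 2) & 0xFFFF]
        count = (mem[(hl - 3) & 0xFFFF] << 8) | mem[(hl - 4) & 0xFFFF]
        hl = (hl - 5) & 0xFFFF
        # A count of 0 wraps around to 65536 writes on the Z80
        for _ in range(count or 0x10000):
            mem[de] = value
            de = (de - 1) & 0xFFFF
    return de
```

With the extracted “Target3” bytes copied into `mem` at 25048, the first call in the startup routine would correspond to `unpack_backwards(mem, 0xEFFA, 0x6430, 0)`, going by the register values read out earlier.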

After the two decoding calls the game calls &62D3, this is a short function that does…

&62D3                 ld      bc, &7FFD
&62D6                 out     (c), a
&62D8                 ret

This is nothing more than a function to set the paging register on 128K machines. It's called with A=0 (from the XOR A just before the call), which simply pages RAM page 0 in at &C000 (this is the default) and asks for the 128K ROM at address 0. After the call, the code checks whether memory address 1 is &AF, and if it isn't, it calls &62B1. This is simply checking whether it actually got the 128K ROM it asked for: on a 48K machine the byte will still be &AF, whereas on a 128K machine the value will be different. Now, I'm not interested in the difference between 128K and 48K; in this game it's just music and the fact that it loads all of the game into RAM at once. I want the core game stuff, so I'm going to ignore this 128K-only call for now.
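
For anyone who hasn't met port &7FFD before, this little Python sketch decodes the value written to it and mirrors the ROM check. It assumes the standard 128K bit layout (bits 0-2 RAM page at &C000, bit 3 screen select, bit 4 ROM select, bit 5 paging lock), and both function names are mine, made up for illustration.

```python
def decode_7ffd(value: int) -> dict:
    """Split a byte written to port &7FFD into its paging fields."""
    return {
        "ram_page_at_c000": value & 0x07,           # bits 0-2
        "shadow_screen": bool(value & 0x08),        # bit 3
        "rom": "48K BASIC" if value & 0x10 else "128K editor",  # bit 4
        "paging_locked": bool(value & 0x20),        # bit 5
    }

def looks_like_48k_rom(rom: bytes) -> bool:
    """Mirror the game's test: the 48K ROM has &AF (XOR A) at address 1."""
    return rom[1] == 0xAF

print(decode_7ffd(0))
```

Writing 0 therefore asks for RAM page 0, the normal screen and the 128K ROM, which is why the byte at address 1 is a handy way to tell the two machines apart.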

The last little bit puzzles me; maybe someone can throw some light on it. It reads from port &9F, which is a port I'm not familiar with (unless my memory is failing me, possible 😉 ). It then reads byte 0, and if it's &C3 it jumps to address &71. I'll have to check what's there, but on a normal 48K machine byte 0 won't be &C3…

So, at this point I have decoded a couple of data files for the game. I'll need to save out the output from that before I disassemble more, and I'll also have to research this last little bit, but for now, happy coding until next time. Any questions, feel free to shout.

NOTE: One thing that has been mentioned is that the TZX version is slightly different. I'll check what happens with that and see; more on that next time.