The Great Split

decomp adsv1.2

If we want to actually compile any code, we’re going to need to organize things a bit. Let’s start with some naive file splitting.

Deciding where to split

Last time, we cleaned up literal pools. These seem like a reasonable place to try splitting our files. While it’s not a guarantee that that’s where files were split (some files may have no literals, and other files might have literals inserted mid-way), it should work well enough for our general case. We can always manually merge files later.
The following code is mostly courtesy of ChatGPT:

import re

# Change these to match your input and output files
input_file = "repo/asm/code.s"
output_prefix = "test/split_"


# Compile regex pattern for finding the start of a function
func_start_pattern = re.compile(r"\w*_func_start sub_([0-9A-F]+)")

# Compile regex pattern for finding file boundaries
boundary_pattern = re.compile(r"_080[0-9A-F]* DCDU \w*")

# Read in the input file
with open(input_file, "r") as f:
    input_data = f.readlines()

start_line = 0
current_file = None
in_block = False

for idx, line in enumerate(input_data):
    # Name each output file after the first function it contains
    result = func_start_pattern.search(line)
    if result and current_file is None:
        current_file = result.groups()

    # A literal pool marks a candidate boundary; split at the blank line after it
    result = boundary_pattern.search(line)
    if result:
        in_block = True
    elif in_block and line.strip() == "":
        output_file = output_prefix + current_file[0].strip() + ".s"
        with open(output_file, "w") as f:
            f.write("".join(input_data[start_line:idx]))
        start_line = idx
        current_file = None
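        in_block = False

# Write out anything left after the last boundary so no code is dropped
if current_file is not None:
    output_file = output_prefix + current_file[0].strip() + ".s"
    with open(output_file, "w") as f:
        f.write("".join(input_data[start_line:]))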

This creates a lot of .s files, which is great. What’s not great is that now we have a ton of .s files that we’re going to have to manage.
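
To see how the blank line after a literal pool acts as the split point, here is a minimal sketch that runs the same boundary logic over a made-up listing held in memory. The sample lines (`thumb_func_start`, the addresses, the `bx lr` body) are assumptions for illustration, not taken from the real `code.s`, and the `name is not None` guard is an addition that the script above does not have.

```python
import re

# Same patterns as the splitting script, without the stray "|" in the classes.
FUNC_START = re.compile(r"\w*_func_start sub_([0-9A-F]+)")
BOUNDARY = re.compile(r"_080[0-9A-F]* DCDU \w*")

# Hypothetical listing: two functions, each followed by a literal pool.
SAMPLE = """\
\tthumb_func_start sub_8000210
sub_8000210
\tbx lr
_08000214 DCDU 0x03000000

\tthumb_func_start sub_8000218
sub_8000218
\tbl sub_8000210
_0800021C DCDU sub_8000210

""".splitlines(keepends=True)


def split_at_literal_pools(lines):
    """Map the first function address in each chunk to that chunk's text."""
    chunks = {}
    start, name, in_pool = 0, None, False
    for idx, line in enumerate(lines):
        match = FUNC_START.search(line)
        if match and name is None:
            name = match.group(1)
        if BOUNDARY.search(line):
            in_pool = True
        elif in_pool and line.strip() == "" and name is not None:
            # The blank line itself becomes the first line of the next chunk.
            chunks[name] = "".join(lines[start:idx])
            start, name, in_pool = idx, None, False
    return chunks


chunks = split_at_literal_pools(SAMPLE)
for name, text in chunks.items():
    print(f"split_{name}.s: {len(text.splitlines())} lines")
```

Working on a list of lines instead of real files makes it easy to try the patterns on a few awkward cases before running the split over the whole disassembly.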

Fixing imports

With all of the functions in one file, they were able to reference each other easily. Now we’ve got to import and export everything. We also need to add in our macros import.

import re
import os

dir_path = 'test'
res = []

# Iterate directory
for path in os.listdir(dir_path):
    # check if current path is a file
    if os.path.isfile(os.path.join(dir_path, path)):
        res.append(path)

HEADER = '''\
    INCLUDE asm/macros.inc
    AREA text, CODE

'''

for f_name in res:
    with open(os.path.join(dir_path, f_name), 'r') as f:
        base = f.readlines()

    locations = []
    references = []

    for line in base:
        if 'func_start' in line:
            continue
        # Indented lines are instructions; pick up any sub_ reference in them
        if line.startswith('\t'):
            extract = r"(sub_[0-9A-F]+)"
            result = re.search(extract, line)
            if not result:
                continue
            ref = result.groups()[0]
            references.append(ref)
        else:
            if ' ' in line:
                locations.append(line.split()[0].strip())
            else:
                locations.append(line.strip())
            if 'DCDU' in line and 'DCDU 0x' not in line:
                references.append(line.split()[-1])

    # Anything referenced here but not defined here has to be imported
    imports = sorted(set(references) - set(locations))

    with open(os.path.join(dir_path, f_name), 'r+') as f:
        file_data = f.read()
        f.seek(0, 0)
        f.write(HEADER)
        for i in imports:
            f.write(f"\tIMPORT {i}\n")
        f.write(file_data)
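
The core of that script is a set difference: every symbol a file references, minus every label it defines. A minimal sketch of that step on an in-memory chunk follows. The sample lines and the name `gUnknown_03000000` are hypothetical, and `find_imports` is an illustrative helper rather than part of the script above.

```python
import re

SUB_REF = re.compile(r"(sub_[0-9A-F]+)")


def find_imports(lines):
    """Return the sorted symbols that are referenced but not defined in lines."""
    locations, references = set(), set()
    for line in lines:
        if "func_start" in line:
            continue
        if line.startswith("\t"):
            # Instruction line: record the first sub_ reference, if any.
            match = SUB_REF.search(line)
            if match:
                references.add(match.group(1))
        elif line.strip():
            # Unindented line: the first token is a label defined in this file.
            locations.add(line.split()[0])
            # A pool entry that is not a plain hex constant names a symbol.
            if "DCDU" in line and "DCDU 0x" not in line:
                references.add(line.split()[-1])
    return sorted(references - locations)


sample = [
    "\tthumb_func_start sub_8000218\n",
    "sub_8000218\n",
    "\tbl sub_8000210\n",
    "\tbl sub_8000218\n",
    "_0800021C DCDU gUnknown_03000000\n",
    "_08000220 DCDU 0x04000000\n",
]
print(find_imports(sample))
```

Here `sub_8000218` is defined locally, so only the other two symbols would need an `IMPORT` line.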

Updating our linker script

Now we have to manually include every single file in our scatter_script.txt. There don’t seem to be any wildcards, and the linker does some dynamic shuffling of locations behind the scenes if the sizes aren’t perfect, so let’s just be very explicit about everything.

...
    .text2 0x08000210
    {
        split_8000210.o
        split_8000324.o
        split_800065C.o
        split_8000914.o
        split_8000BAC.o
        split_8000C7C.o
        split_8000D64.o
...
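
Typing those entries by hand gets tedious, so here is a minimal sketch that builds them from the split file names. It assumes the files are named `split_<hex address>.s`, as the splitting script produces, and that each object file shares its source's base name, as in the fragment above. `scatter_entries` and the sample list are hypothetical.

```python
import re
from pathlib import Path

SPLIT_NAME = re.compile(r"split_([0-9A-F]+)\.s$")


def scatter_entries(file_names, indent="        "):
    """Return one '<name>.o' line per split file, ordered by address."""
    found = []
    for name in file_names:
        match = SPLIT_NAME.match(name)
        if match:
            # Sort on the numeric address, not the string, so hex digits
            # and differing lengths order correctly.
            found.append((int(match.group(1), 16), Path(name).stem + ".o"))
    return [f"{indent}{obj}" for _, obj in sorted(found)]


names = [
    "split_8000914.s",
    "split_8000210.s",
    "split_800065C.s",
    "macros.inc",
    "split_8000324.s",
]
print("\n".join(scatter_entries(names)))
```

Passing `os.listdir("test")` instead of the sample list would produce the lines for the real split directory, ready to paste into the scatter file.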

We’re finally at a point where we can start working on actual decompilation. I’ll save that for the next post, though. There’s going to be a lot to talk about, so I’ll leave this one here.

© 2025 Abahbob   •  Theme  Moonwalk