Caution

I AM NOT RESPONSIBLE FOR WHAT YOU DO WITH ANY OF THIS OR WHAT HAPPENS TO YOU. Don't come crying to me if your account got banned. (although i might be the one getting banned for all this)

Arknights Reverse Engineering

This is my take on some RE of Arknights. It's rather surface level, so don't expect much. Please don't delete me HyperGryph, I like the game :)

Random stuff

Tip

The requirements for all the Python scripts can be installed with python -m pip install -r requirements.txt.

Other resources

There are other good resources on this, but I wanted to take on the challenge myself, so here we are. Please tell me if there are some more.

GS API

A bit of the Arknights GameServer API is implemented inside api.py. There is no rate limiting built in, so you have to be reasonable with the speed of the actions.

  • The script create_login.py allows you to generate credentials to log in to the GS using a Yostar account and a device ID. You can generate a random device ID fairly easily. These credentials are stored in a file called creds.json and allow the auto_daily.py script to function.

  • The daily script auto_daily.py does a simple login, settles factories and trading posts, does recruitment, runs PRTS Annihilation Proxy, sends clues to a friend and visits all friends. This script doesn't implement rate limiting, so use it at your own risk. It requires the creds.json generated by create_login.py. The daily script doesn't "play" levels, but it could technically just send modified versions of the battle records, as we can encrypt and decrypt them fine (see decryption).
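
Since neither api.py nor auto_daily.py throttles requests, one way to stay reasonable is to enforce a minimum delay between calls. This is a minimal sketch that assumes nothing about the real interface of api.py: the Throttle class is a hypothetical helper, not something that exists in the repo.

```python
import time


class Throttle:
    """Enforce a minimum interval between consecutive calls (hypothetical helper)."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock    # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last = None      # time of the previous call, None before the first one

    def wait(self):
        """Block until at least min_interval seconds have passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling wait() on a shared Throttle before every request would space the requests out by at least min_interval seconds. What counts as a "reasonable" interval is not something I know, so pick a conservative value.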

It was done mostly by looking at the traffic with mitmproxy, using Inspeckage to disable HTTPS certificate pinning and force the proxy. I wouldn't recommend this method, as Inspeckage is quite finicky and Nox is the only emulator I managed to make it work with, which was also the worst emulator I used.

Extraction of assets

You can get most of the assets by downloading the xapk file online. This file contains the OBB data and the APK itself. To view and dump the data inside the AssetBundle files (.ab), I used AssetStudio.

The OBB contains the normal game data, so parameter tables, textures, models, sounds. The APK contains the base Unity assets, the code, some basic assets for the game and a couple of configs.

The game also uses what it calls "Hot Update", which is the update screen you see before logging into the game. This one contains the remaining data (for example extra voices or L2D skins) and updates to the base OBB data. It comes in two forms: a list of single AssetBundles, so that a small update only touches the parts that changed, or "packs", which contain a lot of AssetBundles in a specific category. Packs are more for big updates or for the first time you install the game.

You can get the list of all these packs and assets by using the hot_links.py script. It will generate a modified hot_update_list.json with the link added for each item. All the Hot Update files are named .dat but are really zip files, so open them with 7zip or similar.
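
If you would rather unpack the .dat files from a script than with 7zip, the standard zipfile module is enough. A minimal sketch: extract_hot_update is a hypothetical helper name, and the paths are whatever you downloaded.

```python
import zipfile
from pathlib import Path


def extract_hot_update(dat_path, out_dir):
    """Extract a Hot Update .dat file (a zip archive) and return the member names."""
    dat_path = Path(dat_path)
    # Fail early with a clear message if the file is not actually a zip.
    if not zipfile.is_zipfile(dat_path):
        raise ValueError(f"{dat_path} is not a zip archive")
    with zipfile.ZipFile(dat_path) as archive:
        archive.extractall(out_dir)
        return archive.namelist()
```

The extracted members are regular AssetBundles, so they can then be opened with AssetStudio like the ones from the OBB.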

Some assets, for example most of the tables and all the lua files, are encrypted and need to be decrypted.

Reversing the code

Warning

I won't help you to do any of this, i'm just documenting how i've done it.

Note

The addresses, values and the patch are all for Arknights 24.2.21

How I got the global-metadata.dat and the dumped libil2cpp.so

I used a modified fork of Zygisk-Il2CppDumper by 000ylop (my patch is in dumper.patch) to get the decrypted global-metadata.dat file. I had to re-add the header to it, and afterwards I could use Il2CppDumper to generate all the DummyDLLs and the data for the IDA Python script. The IL2CPP library I used in IDA was the one dumped with my modified Zygisk-Il2CppDumper, as it was more complete. Its logs also gave the addresses of the Code Registration and Metadata Registration (0xA0724A0 and 0xA072510 respectively), which were needed. A good reference for all things IL2CPP reversing is il2cppdumper.com.
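
A quick sanity check after re-adding the header is to confirm that the file starts with the magic value that standard IL2CPP metadata files begin with (0xFAB11BAF, stored little-endian). This is a minimal sketch: it assumes the decrypted file follows the standard layout, has_metadata_magic is a hypothetical helper, and it does not reconstruct the header for you.

```python
import struct

IL2CPP_METADATA_MAGIC = 0xFAB11BAF  # standard IL2CPP "sanity" value


def has_metadata_magic(data):
    """Return True if the buffer starts with the IL2CPP metadata magic (little-endian)."""
    if len(data) < 4:
        return False
    (magic,) = struct.unpack_from("<I", data, 0)
    return magic == IL2CPP_METADATA_MAGIC
```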

You could probably do all of the work of the Zygisk dumper using Frida in an emulator but i only heard of it after i had done most of the work on a real phone using the dumper.

Decrypting the global-metadata.dat without dumping it

TL;DR: Not done yet but very feasible

After the fact, I looked into the encryption of the global-metadata.dat file, which was interesting for sure. Inside libil2cpp, the function at 0x2343904 loads the global-metadata.dat file. The part that actually loads and decrypts it is inside the function at 0x2341308, in the mmap call at 0x22E0710, which leads to 0xA861818 through a pointer. That one calls a function at 0x137220 in another library, libanort.so, and finally we reach the real modified mmap at 0x136770 in that library. I managed to find this out using Frida on my real device, as it always crashed on the emulator.

From there, it doesn't decompile too badly, so I fixed the decompiled code and put it into a standalone app. I get the majority of the file decrypted, but a single 0x4000-byte chunk at offset 0x1000, which corresponds to the most complex part of the decrypter, doesn't decrypt properly. Honestly, I really don't want to dive into the horrible decompiled mess, so I'll shelve the problem for later. I won't put the code here as it is really nasty, being straight out of IDA's decompiler, but you can redo it pretty easily.
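
To see which parts of a standalone decrypter's output are wrong, one option is to compare it block by block against a known-good copy, such as the file obtained with the Zygisk dumper. A minimal sketch: mismatched_ranges is a hypothetical helper and the 0x1000 block size is just a convenient default.

```python
def mismatched_ranges(a, b, block=0x1000):
    """Return merged (offset, length) ranges where the two buffers differ, block by block."""
    ranges = []
    size = max(len(a), len(b))
    for off in range(0, size, block):
        if a[off:off + block] == b[off:off + block]:
            continue
        length = min(block, size - off)  # the last block may be shorter
        if ranges and ranges[-1][0] + ranges[-1][1] == off:
            # Extend the previous range when this block directly follows it.
            ranges[-1] = (ranges[-1][0], ranges[-1][1] + length)
        else:
            ranges.append((off, length))
    return ranges
```

With a result like the one described above, this would report a single range starting at 0x1000 with a length of 0x4000.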

Some interesting things: two of the functions called are obfuscated and IDA doesn't disassemble them properly: 0xF88D4 and 0xF9740. Their real body can be found by scrolling down a little bit, but they do weird stuff with threads that I don't get. I've also had a big problem getting debuggers to run: I've tried gdb on device but it can't set breakpoints, gdbserver just doesn't work, lldb works even less, and Frida fails to hook the important functions or just makes it crash. So I can't poke around in situ, but removing the calls to these functions doesn't seem to affect the functionality of the decrypter and the game still works. So I guess they're not essential?

Reversing

Afterwards it's relatively straightforward, just do as you would for any other IL2CPP game. I recommend using dnSpy to find interesting functions inside the DummyDLLs and then using IDA or Ghidra to look at the code.

Decryption of assets

In the assets, some files are encrypted. There are multiple custom encryption schemes; the ones supported by the encryptor/decryptor I wrote are:

  • Cryptic A
  • Cryptic B (not tested but round-trip works)
  • Cryptic A with Signature (only decrypt)
  • Only Signature (only decrypt)
  • Battle Data (used over the internet for level clear)
  • Battle Signature (used over the internet for level clear)

The CLI of the converter library is the script decrypt.py. Its use is fairly straightforward, so I won't detail it here.

Of course, none of the .ab files themselves are encrypted. When I reference one of them, I'm talking about the files inside of it, which you have to extract first.

The converters used for the asset types that I remember are:

| Asset type | Location | Converter type | Data type inside |
|---|---|---|---|
| Lua data | gamedata/[uc]lua.ab | Cryptic A with Sign | Lua |
| Tables ending with 6-digit hex value | gamedata/ | Only Sign | Flatbuffers |
| Tables without hex value | gamedata/ | Cryptic A with Sign | JSON/BSON |
| Level data | gamedata/levels/obt/ | Only Sign | Flatbuffers |

If you don't know what an asset uses, just try a bunch of them. Most of the time it's Cryptic A, Cryptic A with Sign or Only Sign. (Sign adds a 128-byte signature at the beginning of the file and can be spotted with a hex editor most of the time.)
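
For a quick look at a file you suspect is "Only Sign", you can split off the first 128 bytes and inspect what follows. This is a minimal sketch, not how decrypt.py works: strip_signature is a hypothetical helper, it does not verify the signature, and it assumes the payload follows the signature untouched.

```python
SIGNATURE_SIZE = 128  # per the note above: "Sign" prepends a 128-byte signature


def strip_signature(data):
    """Split a signed file into (signature, payload) without verifying anything."""
    if len(data) < SIGNATURE_SIZE:
        raise ValueError("file is shorter than the signature")
    return data[:SIGNATURE_SIZE], data[SIGNATURE_SIZE:]
```

If the payload still looks like noise in a hex editor, the file is probably one of the Cryptic variants and needs the real decryptor.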

Flatbuffers

Older versions of the tables used simple BSON/JSON; now they use Flatbuffers, which is much harder to parse if you don't have the schema used to generate the files... But thankfully, the classes which hold the data that the flatbuffer serializes are structured the same way, and we losslessly recovered those thanks to the reversing in the DummyDLLs.

So I wrote a "tool" held together with duct tape in C# (because the Python lib didn't work well and was missing some stuff) that uses the dnlib library to parse the .NET DLLs and automatically generate Flatbuffers schemas for all classes that use it. The tool is in the DNFBDmp folder. You're going to have to compile it yourself, but it probably works under mono fairly easily? I don't know myself, as I used Visual Bloat 2022 to compile it. You use it from the command line by feeding it the folder where all the DummyDLLs are located and, optionally, the output folder, and it will generate a lot of .fbs flatbuffer schema files.

You can then manually use the flatbuffer compiler flatc (that you can download here) to read the binary files with their schema and transform them into JSON or create bindings in your language of choice to read/write the data.
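
The manual flatc step can be scripted. The sketch below builds a flatc command line using its --json, --raw-binary, --strict-json and -o options; the function names are hypothetical, and it assumes flatc is on your PATH and that the schema declares a root_type matching the binary.

```python
import subprocess
from pathlib import Path


def flatc_json_command(schema, binary, out_dir, flatc="flatc"):
    """Build the flatc command line that converts a raw binary to JSON using a schema."""
    return [
        flatc,
        "--json",         # emit JSON instead of generating code
        "--raw-binary",   # accept binaries that carry no file identifier
        "--strict-json",  # quote field names so any JSON parser can read the result
        "-o", str(out_dir),
        str(schema),
        "--",             # everything after this is binary data, not schema
        str(binary),
    ]


def convert(schema, binary, out_dir, flatc="flatc"):
    """Run flatc and return the path of the JSON file it is expected to write."""
    subprocess.run(flatc_json_command(schema, binary, out_dir, flatc), check=True)
    return Path(out_dir) / (Path(binary).stem + ".json")
```

Keep in mind that --raw-binary skips the identifier check, so a mismatched schema can produce garbage or make flatc crash.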

The dicts, nested arrays and JObjects in the manual version are rather annoying to read, so I made a simple Python script called fb_conv.py that converts the data and fixes it (you still need the flatc compiler). It takes the schema file and the binary file as arguments and creates a JSON file with the data.

As for which schema goes with which binary file: the 6-digit hex value after the name, if it has one, is an MD5 checksum that corresponds to the right class. These checksums are stored as plain strings in the IL2CPP string literals, so you can CTRL+F them easily. You can find the linked class in the function Torappu.FlatBuffers.FlatLookupConverter$$_LoadRootTypeMD5. You can also guess from the context; for example, a gacha table will probably have gacha in the name. Of course, you have to snoop around the classes in the DummyDLLs as well.
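
To get the value to search for, you can split a table file name into its base name and hex suffix. A minimal sketch under a naming assumption I have not verified against every file: the stem is "<table name><6 hex digits>". The helper name and the example file names are hypothetical, and a table whose name happens to end in six hex letters would be a false positive.

```python
import re

# Assumed naming: "<table name><6 hex digits>" before the first dot.
_SUFFIX = re.compile(r"^(?P<name>.+?)(?P<md5>[0-9a-f]{6})$", re.IGNORECASE)


def split_table_name(filename):
    """Return (table name, hex suffix or None) for a table file name."""
    stem = filename.split(".", 1)[0]
    match = _SUFFIX.match(stem)
    if match is None:
        return stem, None
    return match.group("name"), match.group("md5")
```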

A nice tool i used to poke around the flatbuffers was FlatCrawler by kwsch. You can manually try to guess the schema of a binary flatbuffer file.

Logging telemetry

Arknights now sends some logs to their servers for more telemetry, and I thought it'd be interesting to look at what they were sending. It uses Alibaba Cloud's library called Simple Log Service. It specifically uses the Android SDK version of it, which is based on the C SDK version.

It's pretty simple and uses only one Protocol Buffer, defined in the docs (mirrored in the repo here). But it compresses the data before sending it, with LZ4 or ZSTD.

I made a very simple mitmproxy addon in mitm/slsview.py that adds a Content View for the compressed protocol buffer of SLS. To use it, you need to have the LZ4 package installed in mitmproxy's venv. You can do it by installing mitmproxy through pipx and then injecting the pip dependency (only lz4; zstd is not implemented as it's not used for Arknights), as explained in the docs of mitmproxy. This Content View returns the decoded message converted to JSON, so you can easily peek at what's inside.
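
The decompression step on its own can be sketched as below. This is not the code of slsview.py: decompress_sls_body is a hypothetical helper, and the header names x-log-compresstype and x-log-bodyrawsize are taken from Alibaba Cloud's public SLS API documentation rather than checked against the addon. It assumes the body is a raw LZ4 block, so the uncompressed size has to come from the header.

```python
def decompress_sls_body(body, headers, lz4_decompress=None):
    """Return the raw protobuf bytes of an SLS request body.

    headers: mapping with lower-cased header names.
    lz4_decompress: callable(data, uncompressed_size) -> bytes; defaults to lz4.block.
    """
    compress_type = headers.get("x-log-compresstype", "")
    if not compress_type:
        return body  # no compression header: treat the body as plain protobuf
    if compress_type != "lz4":
        raise NotImplementedError(f"unsupported compression: {compress_type}")
    raw_size = int(headers["x-log-bodyrawsize"])
    if lz4_decompress is None:
        import lz4.block  # third-party "lz4" package, imported only when needed

        def lz4_decompress(data, size):
            return lz4.block.decompress(data, uncompressed_size=size)

    return lz4_decompress(body, raw_size)
```

The returned bytes can then be parsed with bindings generated from the SLS Protocol Buffer definition.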