More info on p4k file? #49

Open
damccull opened this issue Apr 15, 2022 · 14 comments

Comments

@damccull

@peter-dolkens I am trying to figure out how to open this p4k file myself, but I'm not succeeding like you seem to be. I see you have some kind of encryption key in your unzip code, but what kind of encryption is it? I'm not familiar with zip, so I am having to rely heavily on libs doing this for me. I am trying to do it in Rust, but I'm not sure what the differences are between this file and a regular zip. Could you elaborate on the file format a bit more than the readme already does?

Also, weirdly, when I use 7zip to open it I only see about 6 files, all related to a starfarer crash...

@Stryxus

Stryxus commented Apr 15, 2022

I'll help as far as I can. I'm not an expert on compression either, but the project uses a modified version of SharpZipLib, albeit a very old one. I've updated it manually in the net6 branch and in my pull request (#48).

The modification adds the official ZStd DLLs, used alongside the Zip64 support in SharpZipLib. I have no clue about the encryption keys, so I only changed how they are provided: the keys are now used only when a key is requested.

This is the core implementation of ZStd, but there are parts throughout SharpZipLib which add it into the pipeline: https://github.com/Stryxus/SharpZipLib/blob/master/src/ICSharpCode.SharpZipLib/Zstd.cs

@damccull
Author

Oh, so do you mean to say that the p4k file is in zip64 format?

@Stryxus

Stryxus commented Apr 15, 2022

ZStd-compressed entries inside a Zip64 container. So yeah, basically.
Keep in mind that ZIP is the container format; it can hold entries compressed with many different methods.

@damccull
Author

Ah... Thank you. Maybe that'll help me get started opening this thing up. I keep getting errors talking about how it doesn't have a central directory, so I am very confused lol.

@damccull
Author

> I'll help as far as I can. I'm not an expert on compression either, but the project uses a modified version of SharpZipLib, albeit a very old one. I've updated it manually in the net6 branch and in my pull request (#48).

Wow, just looked at your PR. You reworking the whole app?

@damccull
Author

So I understand the p4k file is a zip container with multiple compression schemes and occasional encryption inside. I'm experiencing errors from the libs I'm using stating that it either does not have a central directory header or it's a multipart zip file. I don't believe it's multipart, but why would I not find a central directory? Is the container itself somehow encrypted?

@Stryxus

Stryxus commented Apr 16, 2022

> So I understand the p4k file is a zip container with multiple compression schemes and occasional encryption inside. I'm experiencing errors from the libs I'm using stating that it either does not have a central directory header or it's a multipart zip file. I don't believe it's multipart, but why would I not find a central directory? Is the container itself somehow encrypted?

No, the file itself is not encrypted. The errors could be because the p4k contains both ZStd-compressed entries and Deflate entries.
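
One way to check whether a library's "no central directory" error is plausible is to look for the end-of-central-directory (EOCD) record yourself. The following is a minimal Rust sketch, not taken from unp4k: it assumes the last part of the file has already been read into a byte slice, and the function names `find_eocd` and `has_zip64_locator` are hypothetical. The signatures and record sizes are those of the standard ZIP format; whether a p4k deviates from them is not established in this thread.

```rust
// Standard ZIP record signatures as they appear on disk.
const EOCD_SIGNATURE: [u8; 4] = [b'P', b'K', 5, 6];
const ZIP64_LOCATOR_SIGNATURE: [u8; 4] = [b'P', b'K', 6, 7];

// Fixed sizes from the ZIP format: the EOCD is at least 22 bytes
// (more if it carries a comment); the Zip64 locator is exactly 20 bytes.
const EOCD_MIN_LEN: usize = 22;
const ZIP64_LOCATOR_LEN: usize = 20;

/// Searches backward through `tail` (the final bytes of the archive)
/// and returns the offset of the last EOCD signature, if any.
fn find_eocd(tail: &[u8]) -> Option<usize> {
    if tail.len() < EOCD_MIN_LEN {
        return None;
    }
    (0..=tail.len() - EOCD_MIN_LEN)
        .rev()
        .find(|&i| tail[i..].starts_with(&EOCD_SIGNATURE))
}

/// Returns true if a Zip64 EOCD locator sits directly before the EOCD.
fn has_zip64_locator(tail: &[u8], eocd_offset: usize) -> bool {
    eocd_offset >= ZIP64_LOCATOR_LEN
        && tail[eocd_offset - ZIP64_LOCATOR_LEN..].starts_with(&ZIP64_LOCATOR_SIGNATURE)
}
```

If `find_eocd` succeeds and `has_zip64_locator` returns true, the archive uses Zip64 records. A library without Zip64 support is one possible source of a "missing central directory" or "multipart" error.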

> > I'll help as far as I can. I'm not an expert on compression either, but the project uses a modified version of SharpZipLib, albeit a very old one. I've updated it manually in the net6 branch and in my pull request (#48).

> Wow, just looked at your PR. You reworking the whole app?

Already done :) Except for the GUI, which I'm going to do in MAUI, but that's still a bit tedious to work with at the moment.

@damccull
Author

What programs exist to examine the structure of a zip file? Not the user presentation of folders and files but the actual structure of a particular file? I know a hex editor will show it all but that's a bit low level for what I want to see, I think. Any suggestions?

@Stryxus

Stryxus commented Apr 17, 2022

> What programs exist to examine the structure of a zip file? Not the user presentation of folders and files but the actual structure of a particular file? I know a hex editor will show it all but that's a bit low level for what I want to see, I think. Any suggestions?

I don't know of any for Rust. You will probably have to look for a zip library like SharpZipLib that can be used from Rust and supports ZStd; any such library should also support Deflate. You could always interop with my SharpZipLib repo, but it will only be updated for unp4k, and I eventually want to transfer it to Peter.

@damccull
Author

> > What programs exist to examine the structure of a zip file? Not the user presentation of folders and files but the actual structure of a particular file? I know a hex editor will show it all but that's a bit low level for what I want to see, I think. Any suggestions?

> I don't know of any for Rust. You will probably have to look for a zip library like SharpZipLib that can be used from Rust and supports ZStd; any such library should also support Deflate. You could always interop with my SharpZipLib repo, but it will only be updated for unp4k, and I eventually want to transfer it to Peter.

Sorry, I just meant in general so I can see what the file looks like and understand it.

@peter-dolkens
Member

peter-dolkens commented Apr 29, 2022

Hi @damccull - this file is basically a zip file with extra features.

If you want to see the extra stuff added to support p4k files, you can compare SharpZipLib@8feb39a0c26e83b80e7fb0c481834e88d35297ee with the copy inside unp4k

The summary is:

Content in a p4k can be "Stored", "Deflated" or "ZStd" compressed. "Stored" literally just puts one file inside the other, with no compression. "Deflated" is the standard/basic zip compression method, and "ZStd" is a simple implementation of ZStd (the copy included in unp4k is the same version that CIG was using when p4k was released - it may have been updated since). Zip also supports some other methods, such as BZip2 and WinZipAES, but I have not seen those used in p4k files. CIG uses Compression Method 100 to represent ZStd.
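
For anyone following along in Rust, the three methods above could be modelled roughly as follows. This is a sketch rather than unp4k's code: the enum and function names are made up for illustration, codes 0 and 8 are the standard ZIP values for Stored and Deflated, and 100 is the CIG-specific value mentioned above. The method field is read from byte offset 8 of a standard local file header, on the assumption that p4k keeps that layout.

```rust
/// Compression methods reported in this thread for p4k entries.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum CompressionMethod {
    Stored,
    Deflated,
    Zstd,
    /// Any other code (e.g. 12 = BZip2, 99 = WinZip AES in standard ZIP).
    Unsupported(u16),
}

impl CompressionMethod {
    fn from_code(code: u16) -> Self {
        match code {
            0 => Self::Stored,
            8 => Self::Deflated,
            100 => Self::Zstd, // CIG-specific, per the comment above
            other => Self::Unsupported(other),
        }
    }
}

/// Reads the little-endian compression method field at offset 8 of a
/// local file header. Returns None if the slice is too short.
fn method_from_local_header(header: &[u8]) -> Option<CompressionMethod> {
    let bytes = header.get(8..10)?;
    let code = u16::from_le_bytes([bytes[0], bytes[1]]);
    Some(CompressionMethod::from_code(code))
}
```

Keeping unknown codes in `Unsupported` rather than failing lets a reader skip entries it cannot decode and carry on with the rest of the archive.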

Each entry has "ExtraData" associated with it, which contains flags and other information which determine things like the compression and encryption used. CIG use flags in ExtraData 168 to indicate encrypted content.

Additionally, each entry has a header, which is slightly different for an encrypted entry.

	public const int LocalHeaderSignature = 'P' | ('K' << 8) | (3 << 16) | (4 << 24);
	public const int EncryptedHeaderSignature = 'P' | ('K' << 8) | (3 << 16) | (20 << 24);

This is likely why 7zip can read a few files before it gives up. There are a few other tools out there which just attempt to brute force any and every entry out of the file, which works well, but misses any files that are encrypted.
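
The brute-force approach mentioned above can be sketched in Rust as a scan for both header signatures. This is an illustration only, with hypothetical names (`EntryKind`, `scan_local_headers`), not how unp4k or any of those other tools actually work. A raw scan like this can also report false positives, because the same four bytes may occur inside compressed data.

```rust
/// Which of the two local header signatures was found.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum EntryKind {
    /// 'P' 'K' 3 4: the standard ZIP local file header.
    Standard,
    /// 'P' 'K' 3 20: the header used for encrypted p4k entries.
    Encrypted,
}

/// Returns the offset and kind of every candidate local header in `data`.
/// Candidates are not validated, so callers should treat them as hints.
fn scan_local_headers(data: &[u8]) -> Vec<(usize, EntryKind)> {
    let mut found = Vec::new();
    for (offset, window) in data.windows(4).enumerate() {
        match window {
            [b'P', b'K', 3, 4] => found.push((offset, EntryKind::Standard)),
            [b'P', b'K', 3, 20] => found.push((offset, EntryKind::Encrypted)),
            _ => {}
        }
    }
    found
}
```

A tool that only matches the first pattern would see the standard entries and skip the encrypted ones, which fits the behaviour described above.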

For those, you need to extract the stream, then process it with an AES cipher - the hard work for this is found in ZipFile.cs

As for visualizing - I basically took a library that was easy to work with (SharpZipLib) and stepped through, observing things that failed because of invalid values in the library, and adding support in the appropriate places.

That's why @Stryxus's efforts are so appreciated - my code is all "discovery" code, bashed out in large part on a train over 2 weekends. Now that the formats and techniques are understood, he's been able to go back through and optimize much of it.

@damccull
Author

@peter-dolkens Thank you for the reply! This is very helpful. I was under the impression it was a normal zip file that just had some stuff stored in it encrypted but that everything else was standard.

Are these two lines a series of bitwise operations designed to generate a specific set of bits? I've never seen this notation before. Looks like the character P bitwise OR'd with the letter K after an 8-bit left shift? Then the same for the next two, but with numbers instead of letters?

	public const int LocalHeaderSignature = 'P' | ('K' << 8) | (3 << 16) | (4 << 24);
	public const int EncryptedHeaderSignature = 'P' | ('K' << 8) | (3 << 16) | (20 << 24);

@Stryxus

Stryxus commented Apr 29, 2022

@damccull << and >> are shift operators.
If you aren't that familiar with C#, see https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/operators/bitwise-and-shift-operators

@peter-dolkens
Member

> @peter-dolkens Thank you for the reply! This is very helpful. I was under the impression it was a normal zip file that just had some stuff stored in it encrypted but that everything else was standard.

> Are these two lines a series of bitwise operations designed to generate a specific set of bits? I've never seen this notation before. Looks like the character P bitwise OR'd with the letter K after an 8-bit left shift? Then the same for the next two, but with numbers instead of letters?

> public const int LocalHeaderSignature = 'P' | ('K' << 8) | (3 << 16) | (4 << 24);
> public const int EncryptedHeaderSignature = 'P' | ('K' << 8) | (3 << 16) | (20 << 24);

Yup - that's basically what it is - I could have just shoved the raw header in, but this is a bit more readable - it's basically "PK" - short for PKZip, a leftover from the original Zip format days, along with a kind of version number (the numbers) stuck on the end.
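
To make the bit arithmetic concrete, here is a small Rust sketch of the same two constants. The names mirror the C# ones, and the helper `signature_bytes` is hypothetical. Each shift places one byte of the 32-bit value: 'P' in bits 0-7, 'K' in bits 8-15, then the two trailing numbers.

```rust
// Same construction as the C# constants quoted in the thread, written in Rust.
const LOCAL_HEADER_SIGNATURE: u32 =
    (b'P' as u32) | ((b'K' as u32) << 8) | (3 << 16) | (4 << 24);
const ENCRYPTED_HEADER_SIGNATURE: u32 =
    (b'P' as u32) | ((b'K' as u32) << 8) | (3 << 16) | (20 << 24);

/// Returns the signature as the four bytes found on disk.
/// ZIP stores multi-byte integers in little-endian order.
fn signature_bytes(signature: u32) -> [u8; 4] {
    signature.to_le_bytes()
}
```

`LOCAL_HEADER_SIGNATURE` works out to 0x04034B50, which is written to disk as the bytes 'P', 'K', 3, 4. The encrypted variant is 0x14034B50 and differs only in the last byte, 20.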
