So, we have a pcap (links to all files at the end of the post), and we know that it contains the dump of some sort of file transfer protocol, and that a "not so sophisticated" encryption was used.
Several hours (and people) later, we extracted the following info:
- The data that the server sent us is split into two "sessions," each terminated by the "END" string.
- The first session contains 8 packets of 68 bytes each.
- The second session contains 28 packets of 212 bytes each.
- The first 12 bytes of each packet identify the total length of the session, some control codes (we still don't understand most of them...), the number of packets, and a packet counter. As it turns out, all these bytes are useless.
- The remaining bytes of each packet contain the actual "data".
- The first session is a file listing (once decrypted, one of the listed files is "flag.png"...)
- The second session is the content of the flag.png file.
- Each packet is encrypted with a 2-byte xor key.
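To make the packet layout above concrete, here is a minimal sketch of how one packet could be split and decrypted. The names `split_packet` and `xor_decrypt` are made up for this example, and two things are assumptions rather than facts from the capture: that the XOR covers only the bytes after the 12-byte header, and that the key's high byte comes first.

```python
from itertools import cycle
from typing import Tuple

HEADER_LEN = 12  # per the notes above: 12 header bytes, then the data


def split_packet(packet: bytes) -> Tuple[bytes, bytes]:
    """Return (header, encrypted_data) for one raw packet."""
    return packet[:HEADER_LEN], packet[HEADER_LEN:]


def xor_decrypt(data: bytes, key: int) -> bytes:
    """XOR data with a repeating 2-byte key (XOR is its own inverse)."""
    # Assumption: the key is applied high byte first (big-endian).
    key_bytes = key.to_bytes(2, "big")
    return bytes(b ^ k for b, k in zip(data, cycle(key_bytes)))
```

With this layout, a 212-byte packet from the second session carries 200 bytes of data, and a 68-byte packet from the first session carries 56.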
So, we focused on the second session, as it contains the flag.png file. To recap, we have 28 encrypted packets that, once decrypted and concatenated, will give us the content of "flag.png".
As a first step, we guessed the keys for the first and last packets. Since some PNG-related keywords need to appear at the beginning of a PNG file (e.g., the "PNG" or "IDAT" strings), it was trivial to get the first 2-byte key: 20543. The same goes for the last packet: in this case there were many repeated 2-byte strings at the end of the packet, so we guessed there was some padding (likely \x00) at the end. This gave us the key for the last packet: 19564.
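The known-plaintext guess can be sketched roughly like this. `recover_key` is a hypothetical helper, not our original script; it assumes the known bytes start at an even offset of the encrypted data (so the first known byte lines up with the first key byte) and that the key is read high byte first.

```python
from typing import Optional

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the fixed first 8 bytes of any PNG file


def recover_key(ciphertext: bytes, known_plaintext: bytes) -> Optional[int]:
    """Derive the 2-byte XOR key from known plaintext, or None if inconsistent."""
    if len(ciphertext) < 2 or len(known_plaintext) < 2:
        return None
    # ciphertext = plaintext XOR key, so key = ciphertext XOR plaintext.
    key = bytes(c ^ p for c, p in zip(ciphertext[:2], known_plaintext[:2]))
    # Check that the same key also explains the rest of the known bytes.
    for i, (c, p) in enumerate(zip(ciphertext, known_plaintext)):
        if c ^ key[i % 2] != p:
            return None
    # Assumption: high byte first when turning the key into an integer.
    return int.from_bytes(key, "big")
```

For the first packet the known plaintext would be the PNG signature; for the last one, a run of `\x00` bytes aligned with the repeated tail.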
But what about the keys for the other 26 packets? Well....
The general idea was to bruteforce the key for each packet. But how can we guess what's the correct key? After all, the IDAT payload is zlib-compressed stuff, so everything, once decrypted, will look like random bytes...
...and after several hours, we came up with this trick: for each packet, we decrypt it with a key K, try to uncompress it, and look at the result. If it's impossible to uncompress it, then K is definitely not the right key. If we can uncompress it, we then check the payload and see if it looks like RGBA (RGB + alpha channel!). We were able to confirm that this trick could work, given that the IDAT payload contained in the first packet followed the pattern, and that alpha was always set to zero!
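Here is a rough sketch of the "does it uncompress?" filter, not the exact code we ran. It assumes the zlib bytes have already been isolated from the PNG chunk framing, and that `state` is a `zlib.decompressobj()` that has already consumed everything decrypted so far, since the compressed stream runs on across packets. `xor2` and `surviving_keys` are hypothetical names.

```python
import zlib
from itertools import cycle


def xor2(data: bytes, key: int) -> bytes:
    """XOR with a repeating 2-byte key (assumed high byte first)."""
    return bytes(b ^ k for b, k in zip(data, cycle(key.to_bytes(2, "big"))))


def surviving_keys(state, chunk: bytes, keys=range(0x10000)):
    """Return [(key, decompressed_bytes)] for keys that do not break zlib.

    state: a zlib decompress object fed with all earlier plaintext.
    chunk: the encrypted zlib bytes carried by the next packet.
    """
    survivors = []
    for key in keys:
        trial = state.copy()  # keep the shared state untouched
        try:
            out = trial.decompress(xor2(chunk, key))
        except zlib.error:
            continue  # invalid deflate data: this key is ruled out
        survivors.append((key, out))
    return survivors
```

A wrong key does not always fail right away, so several keys can survive this step; that is why a second check on the decompressed bytes is still needed.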
We wrote a heuristic to check for this pattern in each of the "attempts", and we sorted the candidate keys according to another heuristic. A few hours later, we got the PNG, which contained the flag :-)
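As an illustration only (this is a stand-in, not the heuristics we actually used), one simple way to score an attempt is to measure how often every fourth byte is zero, since alpha was always zero. The byte offset of the alpha channel inside a given chunk is not known in advance, so the sketch tries all four phases. `alpha_zero_score` and `rank_candidates` are hypothetical names.

```python
def alpha_zero_score(raw: bytes) -> float:
    """Best fraction of zero bytes among the four stride-4 columns of raw."""
    best = 0.0
    for phase in range(4):
        column = raw[phase::4]  # every 4th byte, starting at this phase
        if column:
            best = max(best, column.count(0) / len(column))
    return best


def rank_candidates(candidates):
    """Sort (key, decompressed_bytes) pairs, most RGBA-like first."""
    return sorted(candidates, key=lambda kv: alpha_zero_score(kv[1]), reverse=True)
```

Scanline filter bytes in real PNG data shift the alignment from row to row, so the score for a correct key will usually be high rather than exactly 1.0.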
These are the 28 keys: 20543 44829 21138 23618 15062 59478 13198 54610 4633 46710 41810 38097 56123 58392 52387 12251 26106 43868 15618 57633 1053 53731 53447 30269 24329 17183 6131 19564.
Here are a bunch of links: github repo, original pcap, and the code. And this is the final, decrypted PNG: