Adventures in Security

Writing a Malware Config Parser Using Radare2 and Ruby

Radare2 has been receiving a lot of attention lately. Rather than browsing through some of the documentation, I decided to try and port some existing code to use Radare.

Rewind to 2014: GData released a whitepaper on “Operation TooHash,” and the malware featured in it was dubbed “Cohhoc.”

Without diving too deep into Cohhoc, there are two steps to decoding the C2 address. The URL is stored as a base64 string, and each byte of the base64-decoded data is then bit-shifted and OR’d to recover the URL.

Below is a screenshot of the encoded URL string being pushed to a base64 decode function.

The resulting data is then passed to another decode function which uses a number of bitwise shifts to decode the data.

Creating a parser for this malware should be easy enough. The thought process is to look for all strings being pushed and then check which one of those decode cleanly. The only issue is iterating through the binary using a method that makes sense. Enter Radare2 and Ruby!

Radare2 allows scripting via its API r2pipe and has bindings for many popular languages.

First, we’ll get the decode sections out of the way and then start iterating through the binary.

require 'r2pipe'
require 'json'
require 'base64'

def decode(config)
  decoded = Base64.decode64(config)
  uri = ""
  decoded.each_byte do |b|
    #shr dl,6
    #shl al,2
    #or dl,al
    #mask with & 0xff (not % 0xff) so the result stays within one byte
    uri += (((b << 6) | (b >> 2)) & 0xff).chr
  end
  return uri
end
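As a sanity check on the shift-and-OR step, here is a minimal sketch that needs nothing but Ruby's standard library. The helper names (`ror2`, `rol2`, `decode_c2`, `encode_c2`) are made up for illustration, and the encoder is only an assumed inverse used to build test input; it is not taken from the malware.

```ruby
require 'base64'

# Rotate an 8-bit value right by two bits (hypothetical helper name).
def ror2(byte)
  ((byte >> 2) | (byte << 6)) & 0xff
end

# Assumed inverse of ror2: rotate an 8-bit value left by two bits.
def rol2(byte)
  ((byte << 2) | (byte >> 6)) & 0xff
end

# Base64-decode the string, then rotate every byte right by two.
def decode_c2(config)
  Base64.decode64(config).bytes.map { |b| ror2(b) }.pack('C*')
end

# Build test input: rotate left by two, then base64-encode.
def encode_c2(plain)
  Base64.strict_encode64(plain.bytes.map { |b| rol2(b) }.pack('C*'))
end

sample = encode_c2('example.com')
puts decode_c2(sample)   # prints "example.com"

# First four characters of the first string in the sample output further down.
puts decode_c2('3ZWJ')   # prints "web"
```

Masking with `& 0xff` keeps each result inside a single byte, which is what makes the shift pair behave as a rotation.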

The next step will be finding instances of base64 strings being pushed. We’ll break this step into smaller pieces.

r2p = R2Pipe.new("mal.exe")  #initialize the object
r2p.cmd('aaa')               #analyze all functions
functions = r2p.cmd('aflj')  #return the function lists in JSON
func = JSON.parse(functions) #parse the JSON

The resulting JSON will have information about all the functions of the binary.

        "datarefs"=>[4269968, 4269856, 4206032], 

The useful information to us is the function name, offset, and size. Now that we have that information we’ll iterate through the functions and disassemble them, looking for push instructions.
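To make the shape of that data concrete, the sketch below runs on a hand-written stand-in for the `aflj` output rather than on a real binary. The two entries and their values are invented, and `summarize` is a hypothetical helper; only the `name`, `offset`, and `size` keys come from the description above.

```ruby
require 'json'

# Hypothetical stand-in for r2p.cmd('aflj'); the values are made up.
sample_aflj = <<~JSON
  [
    {"name": "fcn.00401000", "offset": 4198400, "size": 42},
    {"name": "fcn.00403890", "offset": 4208784, "size": 310}
  ]
JSON

# Reduce each function record to the three fields the parser cares about.
def summarize(functions)
  functions.map do |f|
    { name: f['name'], offset: f['offset'], size: f['size'] }
  end
end

summary = summarize(JSON.parse(sample_aflj))
summary.each do |f|
  puts format('%-14s 0x%08x %d bytes', f[:name], f[:offset], f[:size])
end
```

With that shape in mind, the loop over the real function list looks like this: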

func.each do |elem|
  #disassemble each function and return JSON
  contents = JSON.parse(r2p.cmd("pdfj @ #{elem["offset"]}"))

  #iterate through the operations
  contents["ops"].each do |operations|

    #is the operation a push?
    if operations["type"].eql?("push")
      #look for addresses being pushed
      next unless operations["opcode"] =~ /\ 0x/

      #grab the value being pushed
      addr = operations["opcode"].split(" ").last.hex

      #use radare "psz" to grab the string
      str = r2p.cmd("psz @ #{addr}")

      #ugly regex looking for base64 data
      if str =~ /[0-9a-zA-Z\+\/\=]{10,}/
        #decode the string
        decoded_str = decode(str)
        #is the decode something that looks like a URL?
        if decoded_str =~ /[0-9a-zA-Z\.\-]{5,}/
          puts "Function #{elem['name']} - #{str.chomp} - #{decoded_str}"
        end
      end
    end
  end
end
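The two checks inside that loop (is this a push of a hex literal, and does the string look like base64) can be pulled out into small helpers and exercised without radare2. This is only a sketch: the sample operations are invented stand-ins for entries of the `ops` array, the address `0x41f000` is made up, and `pushed_address` and `base64_candidate?` are hypothetical names.

```ruby
# Hypothetical stand-ins for entries of the "ops" array returned by pdfj.
sample_ops = [
  { 'type' => 'push', 'opcode' => 'push ebp' },
  { 'type' => 'mov',  'opcode' => 'mov eax, 0x41f000' },
  { 'type' => 'push', 'opcode' => 'push 0x41f000' }
]

# Return the pushed immediate as an Integer, or nil when the operation
# is not a push of a hex literal.
def pushed_address(op)
  return nil unless op['type'] == 'push'
  operand = op['opcode'].split(' ').last
  operand.start_with?('0x') ? operand.hex : nil
end

# Loose check for a run of ten or more base64 characters.
def base64_candidate?(str)
  !!(str =~ %r{[0-9a-zA-Z+/=]{10,}})
end

addresses = sample_ops.map { |op| pushed_address(op) }.compact
puts addresses.map { |a| format('0x%x', a) }.inspect
puts base64_candidate?('3ZWJtYWlsbiludGFybmV0c2VydmljZW4jb21')
puts base64_candidate?('short')
```

Keeping the checks as separate functions makes it easy to tighten the base64 test later without touching the radare2 plumbing.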

Running the full script against a sample produces the following:

~$ ./cohhoc_radare.rb 7136ba78671c6c4d801957be8e768d444389a28471679a6ba713adf6b564784f 
Function fcn.00403890 - 3ZWJtYWlsbiludGFybmV0c2VydmljZW4jb21 -
Function fcn.00403890 - oZWxwbjdlYm1haWxlcnNlcnZpY2VzbiNvbU= -

In fewer than 100 lines of code, we were able to find the encoded data, decode it, and identify the function in which it was pushed onto the stack. This is incredibly useful because you now have a function address on which to focus your efforts.

I previously wrote a version of this script to find the cohhoc encoded data and decode it. The previous version of this script used the wonderful Capstone library and its Ruby binding Crabstone.

We hope that this can showcase some of the useful features of Radare and Ruby!

This project is maintained by securitykitten