This blog post shows some quick analysis techniques you can apply to a malicious document. The sample we will analyze is a Microsoft Office Open XML (.docx) document exploiting CVE-2013-3906:
Filename: IMEI.docx
MD5 Sum: b44359628d7b03b68b41b4536314083

We can start with some very simple static analysis by confirming the file type with the Linux “file” utility. Then we can open the file in a hex editor to see whether anything interesting stands out:

[Screenshot: hex editor view of the original file]

The hex view references ActiveX content and several .bin files. Let’s keep this detail in mind as we move through the document, because using ActiveX for heap spraying is an approach that can help the attacker get their shellcode to execute. Shellcode is raw machine code that runs on its own, without being wrapped in a normal executable. For this reason shellcode is usually very small and is often used as the payload of an exploit. As we examine this document further, we will see that this shellcode downloads a file from a remote server and executes it from %TEMP%.
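The file-type check and the hex-editor skim from this first step can also be scripted. What follows is only a minimal sketch: the function name `triage` and the default search term `activeX` are my own choices for illustration. It relies on two general facts, that ZIP-based files normally begin with the bytes `PK\x03\x04` and that ZIP stores member names in plain text, so a raw byte search sees the same strings a hex editor would.

```python
import hashlib

# Local file header signature found at the start of most ZIP/OOXML files.
ZIP_MAGIC = b"PK\x03\x04"


def triage(data: bytes, needle: bytes = b"activeX") -> dict:
    """Return a few basic static facts about a raw document buffer."""
    return {
        # Hash of the whole file, handy for lookups and for sharing indicators.
        "md5": hashlib.md5(data).hexdigest(),
        # True when the buffer starts like a ZIP container (.docx, .xlsx, ...).
        "is_zip": data.startswith(ZIP_MAGIC),
        # Number of times the search term appears anywhere in the raw bytes.
        "needle_hits": data.count(needle),
    }
```

To try it, read the sample in binary mode and pass the bytes in, for example `triage(open("IMEI.docx", "rb").read())`. A non-zero `needle_hits` only says the term is present; it does not say anything about intent.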

Next, since this document uses the XML-based .docx format, which is a ZIP container, we can simply rename the file from .docx to .zip and extract the embedded content like a normal zip archive.
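Renaming is not strictly required, because a ZIP library will open the .docx directly. Here is a small sketch using Python's standard `zipfile` module; the function name `extract_docx` and the output directory are placeholders of mine. As with any handling of a live sample, this is best done inside an isolated analysis machine.

```python
import zipfile


def extract_docx(docx_path: str, out_dir: str) -> list[str]:
    """Unpack a .docx (a ZIP archive) into out_dir and return the member names."""
    with zipfile.ZipFile(docx_path) as archive:
        # Listing the members first gives a quick overview of what is embedded.
        names = archive.namelist()
        archive.extractall(out_dir)
    return names
```

Calling `extract_docx("IMEI.docx", "extracted")` returns the list of embedded paths, which is a convenient place to start looking for anything unusual.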

With the file contents extracted, we can begin to examine them for items of interest. Let's attempt to locate the ActiveX content and the .bin files we saw referenced in the hex view of the original file:

[Screenshot: hex view of an extracted .bin file with the download URI highlighted]

What we can see is that each .bin file contains shellcode. Highlighted in the screenshot above is the URI for the GET request that fetches the second-stage malicious binary.
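
Spotting a URI like this does not have to be done by eye. The sketch below assumes the URI is stored as plain ASCII inside the .bin file, which is what the screenshot suggests but may not hold for other samples (encoded or wide-character strings would be missed). The helper names and the minimum string length of 6 are arbitrary choices of mine.

```python
import re

# A URL-like run: the scheme followed by non-space printable ASCII characters.
URL_RE = re.compile(rb"https?://[\x21-\x7e]+")


def printable_strings(blob: bytes, min_len: int = 6) -> list[str]:
    """Return runs of printable ASCII at least min_len long, like `strings`."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(blob)]


def find_urls(blob: bytes) -> list[str]:
    """Return every http/https URL found as plain ASCII in the buffer."""
    return [m.group().decode("ascii") for m in URL_RE.finditer(blob)]
```

Running `find_urls` over the bytes of each extracted .bin file gives a quick list of candidate network indicators to verify by hand.
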
If we turn back to the hex view of the original file, we can see a reference to the image file that is likely used to trigger the exploit.

Once the shellcode downloads the second-stage binary, it attempts to execute it from the %TEMP% directory. In the screenshot below, I had a Python script emulating a web server; when the malicious code requested “WINWORD.exe”, I served it “WINWORD.bat”, a batch file that just echoes “This is a test of the emergency broadcast system” and then pauses:

[Screenshot: the stand-in batch file running after the shellcode requested WINWORD.exe]
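
The script used for that test is not reproduced here, but a stand-in server of this kind is easy to sketch with Python's standard `http.server` module. Everything below is an illustration under my own assumptions: the names `STAND_IN`, `StandInHandler` and `make_server`, the port, and the choice to answer every GET with the same harmless batch content regardless of the path requested.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Harmless batch content returned for every request (assumed stand-in payload).
STAND_IN = (
    b"@echo off\r\n"
    b"echo This is a test of the emergency broadcast system\r\n"
    b"pause\r\n"
)


class StandInHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # The default request logging prints the requested path to stderr,
        # which records what the shellcode asked for.
        self.send_response(200)
        self.send_header("Content-Type", "application/octet-stream")
        self.send_header("Content-Length", str(len(STAND_IN)))
        self.end_headers()
        self.wfile.write(STAND_IN)


def make_server(host: str = "127.0.0.1", port: int = 8080) -> HTTPServer:
    """Build the server without starting it, so the caller controls its lifetime."""
    return HTTPServer((host, port), StandInHandler)
```

In a lab you would call `make_server(host, port).serve_forever()` on the address the sample is redirected to (for example through a hosts-file entry or a fake DNS service), and only on an isolated network.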

If we had the second-stage malicious binary, we could have served the actual WINWORD.EXE from the Python web server. Further indicators of this shellcode can be found by examining the system's memory using Volatility.

This was a quick look at some basic static and dynamic analysis techniques that can be applied to a Microsoft Word (.docx) document. If you’re interested in learning more about tools and techniques for analyzing other Microsoft Office document formats, please check out OfficeMalScanner.