How to analyze Malicious RTF Files?

Analyzing malicious RTF files by examining their structure, inspecting embedded objects & identifying potential threats.

Sunday, 14 April, 2024

RTF (Rich Text Format) is a document format developed by Microsoft for cross-platform document interchange. Recently, malware authors have been using it because RTF files open in many different applications and platforms. These files can contain embedded objects, such as OLE objects, that can be exploited to execute malicious code when the document is opened.

How does it work?

Malware authors often employ social engineering tactics, disguising RTF files as legitimate documents, to deceive users into opening them. Additionally, RTF files can evade detection by antivirus software through obfuscation techniques or zero-day exploits, providing an effective means for distributing malware payloads while minimizing the risk of detection and analysis.

File Details:

Hash    : b2b8ef2a3bf64dd5531bd414e7f946c9f040ab2674bc73eb0d4af0d314623174
Magic   : Rich Text Format data, version 1
Filename: Unknown.rtf / SecuriteInfo.com.Exploit.ShellCode.69.14498.22623.rtf
FileType: Rich Text Format (.rtf)
Size    : 72.10 KB (73827 bytes)
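
Before going further, it helps to confirm that the file on disk really is the sample described above. The snippet below is a minimal sketch, not part of the original workflow: it computes the SHA-256 and size of the raw bytes and does a lenient header check. The function name describe_sample and the choice to test only the first four bytes are assumptions made for this illustration.

```python
import hashlib

# SHA-256 of the sample, as listed in the file details above.
EXPECTED_SHA256 = "b2b8ef2a3bf64dd5531bd414e7f946c9f040ab2674bc73eb0d4af0d314623174"

def describe_sample(data: bytes) -> dict:
    """Return the SHA-256, size, and a basic RTF header check for raw file bytes."""
    digest = hashlib.sha256(data).hexdigest()
    return {
        "sha256": digest,
        "size": len(data),
        # Deliberately lenient prefix check (an assumption of this sketch):
        # only the leading "{\rt" is compared, after skipping whitespace.
        "looks_like_rtf": data.lstrip()[:4] == b"{\\rt",
        "matches_post_hash": digest == EXPECTED_SHA256,
    }

# Small inline example; for a real file, pass open(path, "rb").read() instead.
info = describe_sample(b"{\\rtf1\\ansi Hello}")
print(info["size"], info["looks_like_rtf"], info["matches_post_hash"])
```

Only handle the real sample inside an isolated analysis machine.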

How to Analyze the RTF File?

To analyze an RTF file effectively, it's essential to first understand its structure and syntax.

An RTF (Rich Text Format) file begins with a header, which opens with "{\rtf1", followed by the document body containing content, formatting instructions, and embedded objects. Formatting is defined using control words, which start with a backslash (\). Characters that have special meaning in RTF, such as braces and the backslash itself, must be escaped to appear as literal text. Embedded objects such as images or OLE objects may be included. The document ends with the closing brace (}) that matches the opening brace of the header.

Control words in RTF files are specific sequences of characters, introduced by a backslash (\), that function as commands to define formatting, document properties, or other instructions. These control words specify the various elements within the document, such as font styles, text alignment, paragraph formatting, and more.

For instance, "\b" signifies bold text, "\i" denotes italic text, and "\par" indicates a paragraph break. Essentially, control words enable the encoding of the document's structure and formatting within the plain text content of the RTF file, facilitating consistent rendering across different software applications.
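
To get a quick feel for which control words a document uses, they can be counted with a small script. The sketch below is an illustration under simplifying assumptions: it treats a control word as a backslash, a run of letters, and an optional numeric parameter, and it ignores binary data and other corner cases of the format. The name count_control_words is made up for this example.

```python
import re
from collections import Counter

# First alternative: a control word (letters plus an optional signed number).
# Second alternative: a control symbol such as "\\", "\{" or "\}", consumed so
# that an escaped backslash is not mistaken for the start of a control word.
TOKEN = re.compile(r"\\(?:([A-Za-z]+)(-?\d+)?|[^A-Za-z])")

def count_control_words(rtf_text: str) -> Counter:
    """Count how often each control word name appears in the RTF text."""
    words = Counter()
    for match in TOKEN.finditer(rtf_text):
        if match.group(1) is not None:  # skip control symbols
            words[match.group(1)] += 1
    return words

sample = r"{\rtf1\ansi {\b bold} and {\i italic}\par}"
print(count_control_words(sample).most_common())
```

In a malicious sample, an unusual count of object-related control words such as "\object" or "\objdata" is a reasonable place to start looking.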

Let's analyze the sample, going through each step one by one.

Header of the RTF File:

Manual Extraction of RTF Components:

Step 1: Open the RTF sample using Microsoft Word.
Step 2: After opening it, navigate to the "Save As" option and save a copy of the document in another format, such as Word 97-2003.
Step 3: Once the copy is saved, extract the sample and generate a dump file.
Step 4: Unzip the dump file to reveal the components of the RTF.
Step 5: Proceed to analyze each of these components individually.
Step 6: Notably, the Word document contains an embedded Equation Editor 3.0 (Equation3) object.
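
As a static alternative to the manual steps above, embedded objects can also be pulled straight out of the RTF text, since object data is stored as hexadecimal after the "\objdata" control word. The sketch below is a different technique from the Word-based extraction, shown only as an illustration. It assumes the hex data follows "\objdata" directly and is interrupted by nothing but whitespace; real samples are often obfuscated beyond that, and dedicated tools such as rtfobj from oletools handle those cases far better. The function names are hypothetical.

```python
import binascii
import re
from typing import List

# Hex digits (possibly split across lines) that directly follow \objdata.
OBJDATA = re.compile(rb"\\objdata\s*([0-9A-Fa-f\s]+)")

def extract_objdata(rtf_bytes: bytes) -> List[bytes]:
    """Hex-decode every \\objdata payload found in the raw RTF bytes."""
    blobs = []
    for match in OBJDATA.finditer(rtf_bytes):
        hex_digits = re.sub(rb"\s+", b"", match.group(1))
        if len(hex_digits) % 2:          # drop a dangling half-byte
            hex_digits = hex_digits[:-1]
        blobs.append(binascii.unhexlify(hex_digits))
    return blobs

def mentions_equation_editor(blob: bytes) -> bool:
    """Check for the Equation Editor class name inside a decoded object."""
    return b"equation.3" in blob.lower()

# Tiny synthetic example: the hex below decodes to the string "Equation.3".
demo = b"{\\rtf1{\\object\\objemb{\\*\\objdata 4571\n756174696f6e2e33}}}"
for blob in extract_objdata(demo):
    print(len(blob), mentions_equation_editor(blob))
```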

Extracted Dump file:

Obtained Embedded Equation3 Editor:

Dynamic Technique:

When the RTF file is opened, the embedded object is handed to the Equation Editor, where the exploit is triggered and a connection is established with the attacker's command and control (C2) server. This connection is used to download the payload file, which allows malicious activity to be carried out on the system.
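
The URL in this analysis was obtained dynamically. As a static complement, a decoded object or memory dump can be searched for plain-text URLs. The sketch below assumes the URL is stored as readable ASCII, which is often not the case when shellcode encodes or encrypts its strings. The helper names and the example.test address are placeholders invented for this illustration.

```python
import re
from typing import List

# "http://" or "https://" followed by printable, non-space ASCII characters.
URL = re.compile(rb"https?://[\x21-\x7e]+")

def find_urls(blob: bytes) -> List[str]:
    """Return the unique ASCII URLs found in a binary blob, in order of appearance."""
    urls = []
    for match in URL.finditer(blob):
        url = match.group(0).decode("ascii")
        if url not in urls:
            urls.append(url)
    return urls

def defang(url: str) -> str:
    """Make a URL non-clickable before pasting it into notes or reports."""
    return url.replace("http", "hxxp", 1).replace(".", "[.]")

# Synthetic blob with a placeholder URL surrounded by non-printable bytes.
blob = b"\x00\x01http://example.test/payload.exe\x00junk"
for url in find_urls(blob):
    print(defang(url))
```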

Obtained URL:

AV Vendor Sandbox Results:

Sandbox results show whether the URL is associated with malware, phishing attempts, or other malicious activity. Here, the obtained URL is marked as 100% malware.

Happy Hunting !!