
0801 | x86 Architecture Overview

Malware Analysis | x86 Architecture Overview | Summary:

The room provides an overview of CPU architecture, explaining how it executes instructions and interacts with external components. It details the basic components of a CPU (Control Unit, Arithmetic Logic Unit, Registers) and how they interact with memory and I/O devices.

It then delves deeper into registers, explaining their types (Instruction Pointer, General-Purpose Registers, Status Flag Registers), and how they are used to store data temporarily while it is being processed by the CPU. Additionally, the room covers program memory layout, highlighting the importance of the stack in malware analysis, and explains common malware techniques such as stack buffer overflow attacks.


Disclaimer: Please note that this write-up is NOT intended to replace the original room or its content, but rather serve as supplementary material for those who are stuck and need additional guidance.

Learning Objectives

  • Overview of CPU architecture and its components
  • Different types of CPU registers and their usage
  • Memory layout as viewed by a program
  • Stack layout and stack registers

1 | CPU architecture overview

Here they explain the basic components of a CPU (Control Unit, Arithmetic Logic Unit, Registers) and how they interact with external devices (Memory, I/O devices). Here's a brief overview:

  • The Control Unit fetches instructions from memory, using the Instruction Pointer to locate the next one, and directs their execution one at a time.
  • The Arithmetic Logic Unit performs calculations and operations based on those instructions.
  • The Registers are small, high-speed storage locations inside the CPU that hold the data it needs to access quickly.
  • Memory (RAM) stores all program code and data.
  • I/O devices include keyboards, displays, printers, etc. that interact with the computer.

Overall, they provide a simple explanation of how a CPU executes instructions and interacts with external components to run programs.
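
To make the fetch-and-execute cycle concrete, here is a minimal sketch of a toy CPU. The instruction names (LOAD, ADD, HALT) and the two registers A and B are invented for illustration and are not real x86; the point is only to show an instruction pointer stepping through a program while a tiny register file holds the working data.

```python
# Toy fetch-decode-execute loop. The instruction set is hypothetical,
# chosen for illustration only; it is not real x86.

def run(program):
    registers = {"A": 0, "B": 0}   # tiny register file
    ip = 0                         # instruction pointer: index of the next instruction
    while ip < len(program):
        opcode, *operands = program[ip]   # control unit: fetch and decode
        ip += 1                           # point at the following instruction
        if opcode == "LOAD":              # LOAD reg, value
            reg, value = operands
            registers[reg] = value
        elif opcode == "ADD":             # ADD dst, src (the ALU's job)
            dst, src = operands
            registers[dst] = registers[dst] + registers[src]
        elif opcode == "HALT":
            break
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return registers


program = [
    ("LOAD", "A", 2),
    ("LOAD", "B", 3),
    ("ADD", "A", "B"),
    ("HALT",),
]
result = run(program)
print(result)  # {'A': 5, 'B': 3}
```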

2 | Registers overview

They follow up with an explanation of registers as the CPU's own storage, which gives much faster access to data than other storage media. The registers are divided into several types, including:

  • Instruction Pointer (IP/EIP/RIP): A register that stores the address of the next instruction to be executed by the CPU. It is called IP on 16-bit, EIP on 32-bit, and RIP on 64-bit systems.
  • General-Purpose Registers: These 32-bit or 64-bit registers store data during general execution of instructions by the CPU. They include:
    • EAX/RAX, EBX/RBX, ECX/RCX, EDX/RDX | Accumulator, Base, Counter, and Data Registers
    • ESP/RSP, EBP/RBP, ESI/RSI, EDI/RDI | Stack Pointer, Base Pointer, Source Index, and Destination Index Registers
    • R8-R15 | 64-bit general-purpose registers not present in 32-bit systems

These registers are used to store data temporarily while it is being processed by the CPU. They also explain how each register can be accessed at different widths (e.g., byte, word, double-word).
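
As a rough illustration of those different widths, the sketch below uses bit masks to show how the narrower names of the accumulator (EAX, AX, AH, AL) are views onto parts of the same 64-bit RAX value. The function name register_views and the sample value are made up for this example.

```python
def register_views(rax):
    """Return the narrower views of a 64-bit RAX value."""
    rax &= 0xFFFFFFFFFFFFFFFF          # keep only 64 bits
    return {
        "RAX": rax,                    # full 64-bit register
        "EAX": rax & 0xFFFFFFFF,       # lower 32 bits (double-word)
        "AX": rax & 0xFFFF,            # lower 16 bits (word)
        "AH": (rax >> 8) & 0xFF,       # bits 8-15 (high byte of AX)
        "AL": rax & 0xFF,              # lowest 8 bits (byte)
    }


views = register_views(0x1122334455667788)
for name, value in views.items():
    print(f"{name}: {value:#x}")
# RAX: 0x1122334455667788
# EAX: 0x55667788
# AX: 0x7788
# AH: 0x77
# AL: 0x88
```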

3 | Registers - Continued

Status Flag Registers | Indicating Execution Status

Here they explain the status flags register, which records information about the outcome of the instructions the CPU executes:

  • EFLAGS (32-bit) and RFLAGS (64-bit) | A single register, 32 or 64 bits wide depending on the system, made up of individual single-bit flags.

Some key flags include:

  • Zero Flag (ZF) | Indicates when the result of an instruction is zero.
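  • Carry Flag (CF) | Set when the result of an operation is too large or too small to fit in the destination, for example when an addition carries out of the highest bit.
  • Sign Flag (SF) | Set when the result is negative, that is, when its most significant bit is 1.
  • Trap Flag (TF) | Set when the processor is in debugging (single-step) mode, where it executes one instruction at a time.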
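
The sketch below shows, under simplifying assumptions, how ZF, CF, and SF could be derived for an 8-bit addition. The helper add8 is hypothetical and models only these three flags; a real CPU updates several more.

```python
def add8(a, b):
    """Add two 8-bit values and return (result, flags)."""
    full = (a & 0xFF) + (b & 0xFF)     # unbounded sum of the two bytes
    result = full & 0xFF               # what actually fits in 8 bits
    flags = {
        "ZF": int(result == 0),        # zero flag: result is zero
        "CF": int(full > 0xFF),        # carry flag: sum did not fit in 8 bits
        "SF": (result >> 7) & 1,       # sign flag: copy of the most significant bit
    }
    return result, flags


print(add8(0xFF, 0x01))  # (0, {'ZF': 1, 'CF': 1, 'SF': 0})
print(add8(0x7F, 0x01))  # (128, {'ZF': 0, 'CF': 0, 'SF': 1})
```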

Segment Registers | Organizing Memory

They also explain Segment Registers, which are used to organize memory into different segments for easier addressing. There are six segment registers:

  • Code Segment (CS) | Points to the code section in memory.
  • Data Segment (DS) | Points to the program's data section in memory.
  • Stack Segment (SS) | Points to the program's stack in memory.
  • Extra Segments (ES, FS, and GS) | Additional data segment registers. Together with DS, they let the program's memory be divided into four distinct data sections.

4 | Memory overview

Program Memory Layout
When a program runs on Windows, it sees an abstracted view of memory, with its own memory space isolated from the rest of the system. The program has access to four main sections of memory:

  • Code | Contains the program's executable instructions.
  • Data | Holds initialized data, including global variables and other values that are not supposed to change during execution.
  • Heap (Dynamic Memory) | Holds variables that are created and destroyed at runtime; memory here is allocated and freed as the program runs.
  • Stack | Stores local variables, function arguments, and return addresses, which can be targeted by malware to hijack control flow.

They give a brief overview of each section, highlighting its characteristics and its importance in malware analysis. The Stack is emphasized as a critical area from a malware perspective because it holds the return addresses that control the program's execution flow.

5 | Stack Layout

Understanding the Stack

The Stack is a critical part of a program's memory that contains local variables, function arguments, and control flow information. It's essential for malware analysis and reverse engineering. The Stack follows a Last In First Out (LIFO) principle, where the last element added is the first to be removed.

Stack Pointers

The CPU uses two registers to manage the Stack:

  • Stack Pointer (ESP/RSP) | Points to the top of the stack, adjusting as elements are pushed or popped.
  • Base Pointer (EBP/RBP) | Stays fixed while a function is running and serves as the reference point for locating its local variables and arguments.
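
Here is a minimal sketch of how the stack pointer moves during push and pop. It assumes 4-byte (32-bit) slots and a made-up starting address of 0x1000, and it relies on the fact that the x86 stack grows toward lower addresses; the Stack class itself is hypothetical.

```python
class Stack:
    """Toy 32-bit stack: a dict of address -> value plus an ESP value."""

    def __init__(self, top=0x1000):
        self.memory = {}
        self.esp = top              # hypothetical starting address

    def push(self, value):
        self.esp -= 4               # the stack grows toward lower addresses
        self.memory[self.esp] = value

    def pop(self):
        value = self.memory.pop(self.esp)
        self.esp += 4               # the stack shrinks back toward higher addresses
        return value


stack = Stack()
stack.push(0xAAAA)
stack.push(0xBBBB)
esp_after_pushes = stack.esp
first = stack.pop()    # last value pushed comes out first (LIFO)
second = stack.pop()
print(hex(esp_after_pushes), hex(first), hex(second), hex(stack.esp))
# 0xff8 0xbbbb 0xaaaa 0x1000
```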

Function Prologue and Epilogue

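  • When a function is called:
    • The caller places any stack-passed arguments on the stack, and the call instruction pushes the return address.
    • The function prologue then pushes the old base pointer onto the stack.
    • The base pointer is set to the current stack pointer (the top of the stack), and the stack pointer moves on to make room for local variables.
  • When a function exits:
    • The function epilogue restores the stack pointer, pops the old base pointer back into EBP/RBP, and the return instruction pops the return address into the instruction pointer.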
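
The sequence can be traced with a small simulation. This is only a sketch: it assumes a 32-bit, cdecl-style convention in which the caller pushes the argument and the call instruction pushes the return address, and every address and value in it (0x1000, 0x2000, 0x401005, 42) is made up for the example.

```python
memory = {}
regs = {"esp": 0x1000, "ebp": 0x2000}   # hypothetical starting values


def push(value):
    regs["esp"] -= 4
    memory[regs["esp"]] = value


def pop():
    value = memory.pop(regs["esp"])
    regs["esp"] += 4
    return value


# Caller side: push the argument, then the call pushes the return address.
push(42)             # argument
push(0x401005)       # return address (hypothetical)

# Prologue, equivalent to: push ebp / mov ebp, esp / sub esp, 8
push(regs["ebp"])
regs["ebp"] = regs["esp"]
regs["esp"] -= 8     # room for two 4-byte local variables

frame_base = regs["ebp"]
argument = memory[regs["ebp"] + 8]   # argument sits above the return address

# Epilogue, equivalent to: mov esp, ebp / pop ebp / ret
regs["esp"] = regs["ebp"]
regs["ebp"] = pop()
return_address = pop()

print(hex(frame_base), argument, hex(regs["ebp"]), hex(return_address), hex(regs["esp"]))
# 0xff4 42 0x2000 0x401005 0xffc
```

After the epilogue the argument is still on the stack (ESP is 0xffc, not 0x1000); in this assumed convention the caller is responsible for cleaning it up.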

Malware Technique | Stack Buffer Overflow

A common malware technique is to overflow a local variable on the stack, overwriting the Return Address with an address of the attacker's choosing. When the function returns, execution jumps to that address, which hijacks control flow and can run malicious code.
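
The effect can be illustrated safely with a simulated stack frame held in a Python bytearray. This is a sketch under stated assumptions, not an exploit: the frame layout (an 8-byte buffer, then a 4-byte saved base pointer, then a 4-byte return address), the helper names, and all addresses are hypothetical.

```python
import struct

BUF_SIZE = 8


def make_frame(return_address):
    """Lay out: 8-byte buffer | 4-byte saved EBP | 4-byte return address."""
    return bytearray(BUF_SIZE) + struct.pack("<II", 0x2000, return_address)


def unsafe_copy(frame, data):
    """Copy with no bounds check, like an unchecked copy into a fixed-size local buffer."""
    frame[0:len(data)] = data


def read_return_address(frame):
    """Read the little-endian return address stored after the buffer and saved EBP."""
    return struct.unpack_from("<I", frame, BUF_SIZE + 4)[0]


frame = make_frame(0x401005)
unsafe_copy(frame, b"A" * 8)                  # fits inside the buffer
before = read_return_address(frame)
print(hex(before))                            # 0x401005

# 8 bytes fill the buffer, 4 more overwrite saved EBP, the last 4 replace the return address.
payload = b"A" * 12 + struct.pack("<I", 0xDEADBEEF)
unsafe_copy(frame, payload)
after = read_return_address(frame)
print(hex(after))                             # 0xdeadbeef
```

A bounds check in the copy (rejecting data longer than BUF_SIZE) is what would keep the saved base pointer and return address intact.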