x86 Assembly Continued

x86 Disassembly Part 2

Quick Recap

In Part 1, we covered the fundamentals of x86 disassembly - registers, basic instructions, control flow, and how malware uses APIs. Now we're diving deeper into the mechanics that make assembly both powerful and challenging to analyze.

Think of this as moving from learning vocabulary to understanding grammar. We'll explore how functions really work at the machine level, how data moves between memory and registers, and how to spot malicious patterns in real disassembly.

Functions and Stack Frames

Functions in assembly don't have nice parentheses or clearly defined parameters like in high-level languages. Instead, they follow a strict dance of stack manipulation that we need to understand.

The Stack Frame Lifecycle

Most functions compiled with a frame pointer follow this pattern:


; Prologue
push ebp       ; Save the caller's base pointer
mov ebp, esp  ; Set our new base pointer
sub esp, 0x10  ; Allocate space for local variables

; Function body here

; Epilogue
mov esp, ebp  ; Deallocate locals
pop ebp       ; Restore caller's base pointer
ret          ; Return to caller

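This creates what we call a stack frame - a self-contained workspace for the function's variables and operations. In memory, from higher to lower addresses, the frame holds the caller's arguments, the return address at [ebp+4], the saved EBP at [ebp], and the local variables below that, down to ESP.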

Accessing Arguments and Locals

Parameters and local variables are accessed relative to EBP:

  • [ebp+8] - First argument
  • [ebp+12] - Second argument
  • [ebp-4] - First local variable

When you see patterns like these, you're looking at a function accessing its inputs and workspace.
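To make those offsets less magical, here is a small Python sketch that replays the pushes and the prologue and then reports where everything landed relative to EBP. The starting address and the function name `frame_layout` are made up for illustration; only the arithmetic matters.

```python
# Toy model of a 32-bit stack frame (the starting address is arbitrary).
WORD = 4  # bytes per stack slot on 32-bit x86


def frame_layout(esp_at_call, num_args, local_bytes):
    """Return {label: address} for a frame built by the standard prologue.

    esp_at_call is ESP just before the caller pushes its arguments.
    """
    esp = esp_at_call
    # Caller pushes arguments right-to-left, so argument 1 ends up lowest.
    esp -= WORD * num_args
    first_arg = esp
    esp -= WORD            # call pushes the return address
    return_addr = esp
    esp -= WORD            # push ebp
    ebp = esp              # mov ebp, esp
    esp -= local_bytes     # sub esp, local_bytes

    layout = {"ebp": ebp, "esp": esp, "return address": return_addr}
    for i in range(num_args):
        layout[f"arg{i + 1}"] = first_arg + WORD * i
    return layout


layout = frame_layout(esp_at_call=0x0012FF00, num_args=2, local_bytes=0x10)
for label, addr in layout.items():
    if label in ("ebp", "esp"):
        print(f"{label:<15} = {addr:#010x}")
    else:
        print(f"{label:<15} = [ebp{addr - layout['ebp']:+d}]")
```

The return address sits at [ebp+4], which is why the first argument starts at [ebp+8] rather than [ebp+4].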

Calling Conventions

Calling conventions define how functions receive parameters and return values. Malware analysts need to recognize these patterns to track data flow.

CDECL (C Declaration)

The default convention for C code on 32-bit x86:

  • Caller pushes arguments right-to-left
  • Caller cleans up the stack after
  • Return value in EAX

; Calling a CDECL function
push 3        ; Second argument
push 7        ; First argument
call add_numbers
add esp, 8  ; Clean up stack

STDCALL

The convention used by most Win32 API functions:

  • Arguments pushed right-to-left
  • Callee cleans up the stack
  • Return value in EAX

You'll see this in Windows DLL calls:


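push 0              ; lpOverlapped = NULL
push bytes_written  ; lpNumberOfBytesWritten (address of a DWORD)
push length         ; nNumberOfBytesToWrite
push buffer         ; lpBuffer: our data buffer
push hFile          ; hFile: handle to write to
call WriteFile      ; STDCALL - callee cleans up, so no "add esp" follows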
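The practical difference between the two conventions is who puts ESP back. The Python sketch below tracks ESP through one call under each convention; `simulate_call` and the starting ESP value are hypothetical names and numbers chosen for the example, not anything a tool provides.

```python
WORD = 4  # bytes per stack slot on 32-bit x86


def simulate_call(esp, num_args, convention):
    """Return (ESP right after the callee returns, bytes the caller must still pop)."""
    arg_bytes = WORD * num_args
    esp -= arg_bytes  # caller pushes the arguments
    esp -= WORD       # call pushes the return address
    esp += WORD       # ret pops the return address
    if convention == "stdcall":
        esp += arg_bytes  # "ret N" also discards N bytes of arguments
        return esp, 0
    if convention == "cdecl":
        return esp, arg_bytes  # caller follows up with "add esp, N"
    raise ValueError(f"unknown convention: {convention}")


for convention in ("cdecl", "stdcall"):
    esp, leftover = simulate_call(0x1000, 2, convention)
    print(f"{convention}: ESP after ret = {esp:#x}, caller still owes {leftover} bytes")
```

In a disassembly this shows up as a tell: an `add esp, N` right after a call suggests CDECL, while a callee ending in `ret N` suggests STDCALL.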

Memory Operations

Malware frequently manipulates memory - allocating space, writing shellcode, or modifying existing structures. Here are key patterns to recognize:

Memory Allocation

Windows APIs for memory operations:


; Typical VirtualAlloc pattern
push 0x40      ; PAGE_EXECUTE_READWRITE
push 0x1000    ; MEM_COMMIT
push 0x1000    ; Allocation size
push 0        ; NULL (let system decide location)
call VirtualAlloc

Seeing this in malware often indicates:

  • Space being prepared for shellcode
  • Memory being made executable (red flag!)
  • A payload that is decoded or unpacked at runtime, so it never appears in the file that static analysis sees
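When triaging many samples it helps to turn those pushed constants back into names. Below is a minimal Python sketch, assuming you have already pulled the four immediates out of the disassembly in push order. The helper name `describe_virtualalloc` is invented for this example; the constant values are the documented Win32 ones.

```python
# Documented Win32 memory-protection constants (low byte of flProtect).
PROTECT = {
    0x01: "PAGE_NOACCESS",
    0x02: "PAGE_READONLY",
    0x04: "PAGE_READWRITE",
    0x08: "PAGE_WRITECOPY",
    0x10: "PAGE_EXECUTE",
    0x20: "PAGE_EXECUTE_READ",
    0x40: "PAGE_EXECUTE_READWRITE",
    0x80: "PAGE_EXECUTE_WRITECOPY",
}
ALLOC_TYPE = {0x1000: "MEM_COMMIT", 0x2000: "MEM_RESERVE"}


def describe_virtualalloc(pushes):
    """pushes: the four immediates in the order they were pushed (last argument first)."""
    fl_protect, fl_alloc_type, dw_size, lp_address = pushes
    base = fl_protect & 0xFF  # modifiers such as PAGE_GUARD sit above the low byte
    alloc_names = [name for bit, name in ALLOC_TYPE.items() if fl_alloc_type & bit]
    return {
        "lpAddress": lp_address,
        "dwSize": dw_size,
        "flAllocationType": "|".join(alloc_names) or hex(fl_alloc_type),
        "flProtect": PROTECT.get(base, hex(fl_protect)),
        "executable": base in (0x10, 0x20, 0x40, 0x80),
        "writable_and_executable": base in (0x40, 0x80),
    }


info = describe_virtualalloc([0x40, 0x1000, 0x1000, 0])
for key, value in info.items():
    print(f"{key}: {value}")
```

Writable-and-executable memory is not proof of malice on its own (JIT compilers request it too), so treat the flag as a prompt to look at what gets written there.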

String Manipulation

Malware needs to work with strings (file paths, URLs, commands). Common instructions:


mov edi, buffer  ; Destination
mov esi, string_data  ; Source
mov ecx, length  ; Counter
rep movsb       ; Copy ECX bytes from ESI to EDI

This is how malware might build a file path or URL in memory before using it.
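If the register choreography of `rep movsb` is hard to picture, this Python sketch models it on a `bytearray` standing in for memory. The function name, the buffer offsets, and the sample path are all hypothetical; the loop itself mirrors the instruction: copy one byte, step ESI and EDI, decrement ECX, stop at zero.

```python
def rep_movsb(memory, esi, edi, ecx, direction_flag=0):
    """Model rep movsb on a bytearray; returns the final (esi, edi, ecx)."""
    step = -1 if direction_flag else 1  # DF=1 makes the copy run backward
    while ecx != 0:
        memory[edi] = memory[esi]  # movsb: copy one byte from [esi] to [edi]
        esi += step
        edi += step
        ecx -= 1
    return esi, edi, ecx


memory = bytearray(32)
path = b"C:\\temp\\a.txt\x00"        # made-up source string with its terminator
memory[0:len(path)] = path           # "string_data" lives at offset 0
esi, edi, ecx = rep_movsb(memory, esi=0, edi=16, ecx=len(path))
print(bytes(memory[16:16 + len(path)]))
print(esi, edi, ecx)
```

After the copy, ESI and EDI both point one byte past the data they just handled, which is why follow-up code often appends to a string by reusing EDI directly.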

Practical Analysis

Let's put this knowledge to work on a snippet written in the style you'll meet in real malware:


; Step 1: Build "user32.dll" on the stack, last chunk first
push 0x00006C6C       ; "ll\0\0" (includes the null terminator)
push 0x642E3233       ; "32.d"
push 0x72657375       ; "user"
mov eax, esp          ; EAX now points to "user32.dll"
push eax              ; lpLibFileName
call LoadLibraryA     ; Step 2: Load the DLL, module handle returned in EAX

; Step 3: Build "MessageBoxA" the same way
push 0x0041786F       ; "oxA\0"
push 0x42656761       ; "ageB"
push 0x7373654D       ; "Mess"
mov ebx, esp          ; EBX points to "MessageBoxA"
push ebx              ; lpProcName
push eax              ; hModule from LoadLibraryA
call GetProcAddress   ; Step 4: Get the address of MessageBoxA

What's happening here:

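  1. The malware builds "user32.dll" on the stack four bytes at a time. The last chunk is pushed first because the stack grows downward, and each constant looks reversed because x86 is little-endian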
  2. Loads the library dynamically (avoids static imports)
  3. Builds "MessageBoxA" string in similar fashion
  4. Gets the function address for later use

This is classic dynamic API resolution - a technique shellcode and packed malware rely on to keep telltale imports out of the import table and away from static analysis.
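Decoding stack strings by eye gets tedious fast, so here is a small Python sketch that does it for you. It assumes 32-bit push immediates listed in the order the code pushes them; `decode_pushes` and `encode_pushes` are names invented for this example. The sample values are the little-endian encoding of "user32.dll" and "MessageBoxA".

```python
import struct


def decode_pushes(immediates):
    """Recover the C string built by a run of 32-bit pushes (given in push order)."""
    # The last push lands at the lowest address, so memory order is the reverse.
    raw = b"".join(struct.pack("<I", value) for value in reversed(immediates))
    # A C string ends at the first null byte.
    return raw.split(b"\x00", 1)[0].decode("ascii", errors="replace")


def encode_pushes(text):
    """Inverse helper: the immediates needed to build text, in push order."""
    data = text.encode("ascii") + b"\x00"
    data += b"\x00" * (-len(data) % 4)  # pad to a whole number of dwords
    chunks = [struct.unpack("<I", data[i:i + 4])[0] for i in range(0, len(data), 4)]
    return list(reversed(chunks))


dll_pushes = [0x00006C6C, 0x642E3233, 0x72657375]
api_pushes = [0x0041786F, 0x42656761, 0x7373654D]
print(decode_pushes(dll_pushes))
print(decode_pushes(api_pushes))
print([f"{value:#010x}" for value in encode_pushes("user32.dll")])
```

Running suspicious constants through a decoder like this is also a quick sanity check on comments in a listing: if the decoded text is not a plausible DLL or API name, either the comments or the extraction is wrong.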

Key Takeaways

  • Stack manipulation is how functions create their workspace
  • Calling conventions dictate argument passing and cleanup
  • Memory operations reveal how malware stores and accesses data
  • Dynamic API resolution is a major red flag

Remember - practice is key. The more you analyze, the more patterns you'll recognize.

Happy reversing!