kai's blog

Warning! Assembly Required! - Solving my first CTF challenge

Published September 01, 2025

Doing something is often the best way to improve at it, and cybersecurity is no different. Knowledge cannot become understanding without application. One of the best ways to apply cybersecurity knowledge is to participate in a capture the flag (CTF) event. CTFs are competitions where teams compete to solve as many cybersecurity challenges as possible. These challenges cover various skills and topics such as reverse engineering, cryptography, forensics, web exploitation, and more. After solving a challenge, you submit a "flag" to claim points. I did not expect to place high on the leaderboard; I just wanted to experience what a CTF was actually like, learn more about computers, and gain some new skills.

The Challenge

I went with a reverse engineering challenge, something I had never done before. The challenge provided a file, which I found, after running the file command on it, to be:

Based on the last two characteristics, the program conforms to the System V ABI, which defines how programs interact with each other and the operating system at the machine level. Many Unix-based systems, such as Linux, conform to the System V ABI.

Analyzing the disassembly

I first thought to run the executable to see what it did: it simply asked me to input a username and password, which I of course did not know, and then printed "Invalid credentials". Since I only had an executable, I had no high-level source code, in a language such as C, to read; instead, I had to read assembly. Linux has a handy command named objdump that can disassemble a binary file; the -D option disassembles every section of the file. I then searched for the main function, which is where the program's own logic begins. Let's walk through the important parts of the disassembly [3].

The disassembly starts with the function prologue, which is responsible for setting up stack space for main:

push   rbp
mov    rbp,rsp
sub    rsp,0x40

A stack is a common data structure used in computing. Think of it as a stack of books: new books go on top, and the top book is the first to be removed [4]. In computer architecture, the call stack tracks the state of each executing function in a stack frame. A stack frame contains data such as local variables and the return address, which is where the CPU should continue executing after the function returns. In x86, the stack grows downward, starting at higher addresses, so the top of the stack is at a lower address than the bottom. x86-64 has two special registers for the stack: rsp, the stack pointer, which points to the top of the stack; and rbp, the base pointer, which points to the bottom of the current stack frame. With that in mind, the three instructions above read as follows: push rbp saves the caller's base pointer on the stack; mov rbp,rsp makes the current top of the stack the base of main's new frame; and sub rsp,0x40 moves the stack pointer down by 0x40 (64) bytes, reserving that space for main's local variables.
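To make the "grows downward" idea concrete, here is a minimal toy model in C. It is only a sketch: ToyStack, toy_init, toy_push, and toy_pop are hypothetical names made up for illustration, and an array index stands in for rsp. It is not how the CPU or the challenge binary implements anything.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_SLOTS 16

/* Toy model of a downward-growing stack. `slots` plays the role of stack
   memory and `sp` plays the role of rsp: it is the index of the current top. */
typedef struct {
    uint64_t slots[STACK_SLOTS];
    size_t sp;
} ToyStack;

/* An empty stack has its "stack pointer" just past the highest slot. */
void toy_init(ToyStack *s) {
    s->sp = STACK_SLOTS;
}

/* Like `push`: move the stack pointer down first, then store the value.
   Returns 0 on success, -1 if the stack is full. */
int toy_push(ToyStack *s, uint64_t value) {
    if (s->sp == 0) {
        return -1;
    }
    s->sp -= 1;
    s->slots[s->sp] = value;
    return 0;
}

/* Like `pop`: load the value at the top, then move the stack pointer up.
   Returns 0 on success, -1 if the stack is empty. */
int toy_pop(ToyStack *s, uint64_t *out) {
    if (s->sp == STACK_SLOTS) {
        return -1;
    }
    *out = s->slots[s->sp];
    s->sp += 1;
    return 0;
}
```

Pushing two values and popping them returns them in reverse order, and the index of the top only ever gets smaller as the stack fills, just as rsp does.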

The next section of the disassembly handled taking user input, indicated by the read calls; this first sequence was for reading the inputted username. read is a POSIX function, exposed through the C library, that reads data from a file descriptor. It takes 3 arguments, in the following order: a file descriptor, which is where data should be read from; a pointer to a buffer where data is to be read to; and a count, specifying the maximum number of bytes that should be read:

lea  rax,[rbp-0x20]
mov  edx,0x60
mov  rsi,rax
mov  edi,0x0
call read

All of the instructions preceding call are setting up function arguments. According to the System V ABI, integer and pointer arguments are passed in registers in the following order: rdi, rsi, rdx, rcx, r8, and r9. Thus, the first argument of a function is passed in rdi, the second in rsi, and so forth. But, wait! The assembly uses edi and edx, not rdi and rdx. Aren't those different registers? Yes, but no. At its inception, x86 was a 16-bit architecture; thus, registers were 16 bits wide. When x86 was extended to 32 bits, the registers were extended too, and the original di register became edi (extended di). In x86-64, registers were extended again to 64 bits, and edi became rdi. To maintain backwards compatibility, x86-64 still allows use of the legacy register names. A legacy name refers to the low portion of the 64-bit register, so edi is the lower 32 bits of rdi. One quirk is worth knowing: writing to a 32-bit register such as edi also clears the upper 32 bits of rdi, so mov edi,0x0 zeroes the whole register.
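The relationship between rdi, edi, and di can be modelled with plain integer arithmetic. The sketch below is only an illustration: the function names are hypothetical, and a uint64_t stands in for the full 64-bit register.

```c
#include <stdint.h>

/* Reading edi yields the low 32 bits of rdi. */
uint32_t read_low32(uint64_t reg) {
    return (uint32_t)reg;
}

/* Reading di yields the low 16 bits of rdi. */
uint16_t read_low16(uint64_t reg) {
    return (uint16_t)reg;
}

/* Writing a 32-bit register (e.g. mov edi,0x0) zero-extends into the full
   64-bit register, so the old upper 32 bits are discarded. */
uint64_t write_low32(uint64_t reg, uint32_t value) {
    (void)reg; /* the previous contents do not survive a 32-bit write */
    return (uint64_t)value;
}

/* Writing a 16-bit register (e.g. mov di,0x0) leaves the upper 48 bits alone. */
uint64_t write_low16(uint64_t reg, uint16_t value) {
    return (reg & ~(uint64_t)0xFFFF) | (uint64_t)value;
}
```

For example, with 0x1122334455667788 in the modelled rdi, read_low32 gives 0x55667788 and read_low16 gives 0x7788. This is why the compiler can use the shorter mov edi,0x0 to pass 0 as the first argument.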

main then makes a call to a function named check, which takes in the username and password buffers as arguments, in rdi and rsi respectively. The disassembly, once again, begins by allocating stack space, specifically 16 bytes. Then, the arguments are moved onto the stack:

mov QWORD PTR [rbp-0x8],rdi
mov QWORD PTR [rbp-0x10],rsi

The mov QWORD PTR [<ADDR>],<SRC> syntax reads: store 8 bytes (a quad word) [5] from <SRC> at address <ADDR>. Here, the brackets indicate a memory dereference, telling the CPU to access the memory at address <ADDR>, not the address itself. The next instructions prepare for and execute a strcmp call:

mov  rax,QWORD PTR [rbp-0x8]
lea  rsi,[rip+0xe53]
mov  rdi,rax
call strcmp

rdi holds a pointer to the inputted username. rsi holds an address; here, the brackets do not indicate a dereference, they simply enclose a calculation [6]. From a C perspective, strcmp takes two pointers to strings (char *) as arguments, so I thought the address now stored in rsi must point to a string. To confirm, I ran x/s $rsi in GDB to examine the memory at the address in rsi as a string. Sure enough, it was a string: "LITCTF". The remainder of check's disassembly provides insight into the control flow it implements. Addresses are included for clarity:

4011bd:       test   eax,eax
4011bf:       jne    4011df <check+0x49>
4011c1:       mov    rax,QWORD PTR [rbp-0x10]
4011c5:       lea    rsi,[rip+0xe43]        
4011cc:       mov    rdi,rax
4011cf:       call   401090 <strcmp@plt>
4011d4:       test   eax,eax
4011d6:       jne    4011df <check+0x49>
4011d8:       mov    eax,0x1
4011dd:       jmp    4011e4 <check+0x4e>
4011df:       mov    eax,0x0
4011e4:       leave
4011e5:       ret
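The rip-relative lea in this listing can be resolved by hand. While an instruction executes, rip holds the address of the next instruction, so the lea at 4011c5 computes 0x4011cc + 0xe43 = 0x40200f. The sketch below shows that arithmetic; rip_relative_target is a hypothetical helper name, and the addresses are simply the ones shown in the listing above.

```c
#include <stdint.h>

/* Effective address of a [rip+disp] operand.
   `next_instruction_addr` is the address of the instruction that follows the
   one being decoded, because that is the value rip holds while it executes.
   The displacement is signed, so it may point backwards as well. */
uint64_t rip_relative_target(uint64_t next_instruction_addr, int32_t displacement) {
    return next_instruction_addr + (uint64_t)(int64_t)displacement;
}
```

Calling rip_relative_target(0x4011cc, 0xe43) returns 0x40200f, which is the address to examine as a string in a debugger to see what the second strcmp compares against.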

After the strcmp call, a new instruction named test appears! test performs a bitwise AND on its operands and discards the result, but still updates select flags in the special flags register (eflags, or rflags in 64-bit mode), which contains bits that track the status of the CPU after certain instructions. Of particular importance to this code is the zero flag, which records whether the result of the last flag-setting instruction was zero (1 if so, 0 if not). In the System V ABI, rax stores the return value of a function, and a 32-bit value such as an int is read from its lower half, eax. So the code is checking whether strcmp returned zero, which happens when the two strings passed as arguments are equal. Next, there is a jne (jump if not equal) instruction, which checks the zero flag and jumps to an address if it is not set (ZF = 0). In the context of our code, this occurs when the inputted username is not equal to "LITCTF". This branch jumps to address 0x4011df, which simply returns [7] from check with a value of zero. If the branch is not taken (when username = "LITCTF"), the code then checks, using another strcmp call, whether the password is equal to the string "d0nt_57r1ngs_m3_3b775884", stored at address rip+0xe43, taking the same branch if not. If they are equal, though, the function returns with a value of 1. Thus, this whole section of code represents a logical AND: it returns 1 if both comparisons succeed and 0 otherwise.
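One way to see how the branches fit together is to write the same control flow in C using goto, so that each line lines up with an instruction in the listing. This is only a sketch of the structure: check_with_gotos is a hypothetical name, the result variable stands in for eax, and the real source almost certainly did not use goto.

```c
#include <string.h>

/* Mirrors the branch structure of check's disassembly, one step per line. */
int check_with_gotos(const char *username, const char *password) {
    int result; /* plays the role of eax */

    result = strcmp(username, "LITCTF");
    if (result != 0) goto fail; /* test eax,eax ; jne 4011df */

    result = strcmp(password, "d0nt_57r1ngs_m3_3b775884");
    if (result != 0) goto fail; /* test eax,eax ; jne 4011df */

    result = 1;                 /* mov eax,0x1 */
    goto done;                  /* jmp 4011e4 */

fail:
    result = 0;                 /* mov eax,0x0 */
done:
    return result;              /* leave ; ret */
}
```

Both failed comparisons share the single fail label, which is why both jne instructions in the listing target the same address.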

Now, we find ourselves back in main:

40125b:       test   eax,eax
40125d:       je     4012a2 <main+0xbc>
40125f:       lea    rdi,[rip+0xde2]   
401266:       call   401070 <puts@plt>
40126b:       lea    rdi,[rip+0xdde]   
401272:       call   401070 <puts@plt>
401277:       lea    rdi,[rip+0xe12]   
40127e:       call   401070 <puts@plt>
401283:       lea    rdi,[rip+0xdc6]   
40128a:       call   401070 <puts@plt>
40128f:       lea    rdi,[rip+0xe34]   
401296:       call   401070 <puts@plt>
40129b:       mov    eax,0x0
4012a0:       jmp    4012b8 <main+0xd2>
4012a2:       lea    rdi,[rip+0xe29]   
4012a9:       call   401070 <puts@plt>
4012ae:       mov    edi,0x0
4012b3:       call   4010a0 <exit@plt>
4012b8:       leave
4012b9:       ret

eax is once again subjected to a test. Instead of jne, there is now a je (jump if equal) instruction; this does the exact opposite of jne, jumping to an address if the zero flag is set. In the context of our program, this branch, which terminates the program, is taken if check returns zero, indicating that the inputted username, the password, or both were incorrect. If both the inputted username and password check out, then a sequence of puts calls is executed before main returns with a value of zero.

Capturing the flag

After this analysis, I attempted to recreate the C program that the assembly seemed to have come from:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h> // Declares read().

// Returns 1 only if both the username and the password match.
int check(char *username, char *password) {
	if (strcmp(username, "LITCTF") == 0 && strcmp(password, "d0nt_57r1ngs_m3_3b775884") == 0) {
		return 1;
	}
	
	return 0;
}

int main() {
	puts("Enter username:\n");
	// Read up to 96 bytes from stdin into the `username` buffer.
	char username[96];
	read(0, username, 96); // 0 is the file descriptor for `stdin`.
	
	puts("Enter password:\n");
	// Read up to 96 bytes from stdin into the `password` buffer.
	char password[96];
	read(0, password, 96);
	
	if (check(username, password) == 0) {
		puts("Invalid credentials");
		exit(0);
	} else {
		// The five success messages are elided here.
		puts("...");
		puts("...");
		puts("...");
		puts("...");
		puts("...");
	}
	
	return 0;
}

Notice that not every assembly instruction translates directly to a line of C code. Many things happen behind the scenes, such as passing function arguments and setting up stack frames.

I then ran the program with the correct inputs and got five lines of output, so the five puts calls checked out. But there was no flag. I checked what the flag format was: LITCTF{...}. I paused for a moment, laughing when I realized the correct username and password were actually the flag. Silly me.

Takeaways

This was honestly one of the most exciting tasks I have done in my computing journey! I ended up obtaining the flag after the CTF was over, but it was still a worthwhile learning experience. I have certainly become better at reading assembly and figuring out how it translates to source code. My biggest takeaway is that not every single instruction matters. Instead, I should be able to spot which instructions are doing something I need to know about, and which values are involved. This is a skill I will hopefully develop with more reverse engineering experience. Anything is open source if you can read assembly.

  1. Linking is the process that combines the object files produced by the compiler into a final executable.

  2. Think of the instruction set architecture (ISA) as a manual for a processor. It defines available instructions and registers, how memory is to be addressed, etc.

  3. The disassembly shown throughout this post is in Intel's x86 syntax, which formats instructions as follows: <INST> <DEST>,<SRC>.

  4. This behavior is known as last in, first out (LIFO).

  5. To change the size of data involved, QWORD can be replaced by any size indicator, such as WORD or BYTE.

  6. The lea (load effective address) instruction computes a memory address and stores it in a register.

  7. The leave instruction cleans up the current stack frame. It is equivalent to the sequence mov rsp,rbp followed by pop rbp.
