Warning! Assembly Required! - Solving my first CTF challenge
Published September 01, 2025
Doing something is often the best way to improve at it. This is no different for cybersecurity. Knowledge cannot become understanding without application. One of the best ways to apply cybersecurity knowledge is to participate in a capture the flag (CTF) event. CTFs are competitions where teams compete to solve as many cybersecurity challenges as possible. These challenges cover various skills and topics such as reverse engineering, cryptography, forensics, web exploitation, and more. After solving a challenge, you submit a "flag" to claim points. I did not expect to be high on the leaderboard; I just wanted to experience what a CTF was actually like, learn more about computers, and gain some new skills.
The Challenge
I went with a reverse engineering challenge, something I had never done before. The challenge provided a file, which I found, after running the file command on it, to be:
- Not stripped, meaning the executable still contains its symbol table, which maps addresses in the binary to names such as functions and global variables. This makes the disassembly much easier to navigate. Fuller debugging information, such as line numbers and data types, is only present if the program was also compiled with debugging enabled.
- Dynamically linked, meaning the executable relies on shared libraries that are loaded at runtime instead of being copied in at link time[1]. Thus, the code for these libraries is not included in the executable; it is provided by the operating system.
- Compiled for x86-64, which is the instruction set architecture[2] the executable is meant to run on.
- An ELF file, short for Executable and Linkable Format, which is the executable file format used on Linux systems.
Based on the last two characteristics, the program conforms to the System V ABI, which defines how programs interact with each other and the operating system at the machine level. Many Unix-based systems, such as Linux, conform to the System V ABI.
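To make the "ELF" and "x86-64" characteristics a bit more concrete, here is a minimal sketch of the kind of header check a tool like file can perform. The function name looks_like_elf64_x86_64 and the enum names are my own, made up for illustration; the offsets and values are the ones defined by the ELF format. It is only a sketch: it says nothing about whether a binary is stripped or dynamically linked.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names; the offsets and values come from the ELF format. */
enum {
    EI_CLASS_OFFSET  = 4,   /* e_ident[EI_CLASS]: 1 = 32-bit, 2 = 64-bit */
    EI_DATA_OFFSET   = 5,   /* e_ident[EI_DATA]: 1 = little-endian */
    E_MACHINE_OFFSET = 18,  /* start of the 2-byte e_machine field */
    ELF_CLASS_64     = 2,
    ELF_DATA_LSB     = 1,
    MACHINE_X86_64   = 62   /* EM_X86_64 */
};

/* Returns 1 if the buffer starts with a little-endian ELF64 x86-64 header. */
int looks_like_elf64_x86_64(const unsigned char *hdr, size_t len)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

    assert(hdr != NULL);
    if (len < 20) return 0;                               /* too short to hold e_machine */
    if (memcmp(hdr, magic, sizeof magic) != 0) return 0;  /* not an ELF file */
    if (hdr[EI_CLASS_OFFSET] != ELF_CLASS_64) return 0;   /* not 64-bit */
    if (hdr[EI_DATA_OFFSET] != ELF_DATA_LSB) return 0;    /* not little-endian */

    /* e_machine is stored little-endian: low byte first. */
    unsigned machine = hdr[E_MACHINE_OFFSET] | (hdr[E_MACHINE_OFFSET + 1] << 8);
    return machine == MACHINE_X86_64;
}
```

To try it, read the first 20 bytes of a file into a buffer and pass them in; anything that is not a 64-bit little-endian x86-64 ELF header returns 0.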
Analyzing the disassembly
My first thought was to run the executable to see what it did. It simply asked me to input a username and password, which I of course did not know, and then printed "Invalid credentials". Since I only had an executable, I had no high-level source code, in a language such as C, to read; instead, I had to read assembly. Linux has a handy command named objdump that can disassemble a binary file; the -D option disassembles all of its sections. I then searched for the main function, which is where the program's own logic begins. Let's walk through the important parts of the disassembly[3].
The disassembly starts with the function prologue, which is responsible for setting up stack space for main:
push rbp
mov rbp,rsp
sub rsp,0x40
A stack is a common data structure used in computing. Think of it as a stack of books: new books go on top and the top book is the first to be removed[4]. In computer architecture, the stack keeps track of the state of executing functions in stack frames, one per active function call. A stack frame contains data such as local variables and the return address, which is where the CPU should continue executing after the function returns. In x86, the stack grows downward, starting at higher addresses; so, the top of the stack is at a lower address than the bottom. x86 has two special registers for the stack: rsp, the stack pointer, which points to the top of the stack; and rbp, the base pointer, which points to the bottom of the current stack frame. Let's walk through the assembly:
- push rbp places the value of rbp on the top of the stack. At the time this instruction is executed, rbp points to the base of the caller's stack frame. The function being called, dubbed the callee, must save rbp so the proper value can be restored when it returns.
- mov rbp,rsp copies the value of the stack pointer into the base pointer. rsp points to the top of the caller's stack frame. This same boundary is the base of the callee's stack frame.
- sub rsp,0x40 makes space on the stack, precisely 64 bytes.
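The three prologue instructions can be played out on a toy model. Below is a minimal sketch, assuming a made-up ToyCpu struct with a small array standing in for stack memory and rsp and rbp stored as byte offsets into it. None of these names come from the challenge; they are only there to show how the two registers move.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_SLOTS 64  /* toy stack: 64 eight-byte slots (hypothetical size) */

typedef struct {
    uint64_t mem[STACK_SLOTS]; /* simulated stack memory */
    uint64_t rsp;              /* byte offset of the top of the stack */
    uint64_t rbp;              /* byte offset of the current frame's base */
} ToyCpu;

/* push: move rsp down by 8 bytes, then store the value there. */
void toy_push(ToyCpu *cpu, uint64_t value)
{
    assert(cpu->rsp >= 8);
    cpu->rsp -= 8;
    cpu->mem[cpu->rsp / 8] = value;
}

/* pop: load the value at rsp, then move rsp up by 8 bytes. */
uint64_t toy_pop(ToyCpu *cpu)
{
    uint64_t value = cpu->mem[cpu->rsp / 8];
    cpu->rsp += 8;
    return value;
}

/* push rbp; mov rbp,rsp; sub rsp,frame_size */
void toy_prologue(ToyCpu *cpu, uint64_t frame_size)
{
    toy_push(cpu, cpu->rbp);   /* save the caller's frame base */
    cpu->rbp = cpu->rsp;       /* the callee's frame starts here */
    assert(cpu->rsp >= frame_size);
    cpu->rsp -= frame_size;    /* reserve space for locals */
}

/* leave: mov rsp,rbp; pop rbp */
void toy_leave(ToyCpu *cpu)
{
    cpu->rsp = cpu->rbp;       /* discard the locals */
    cpu->rbp = toy_pop(cpu);   /* restore the caller's frame base */
}
```

Calling toy_prologue(&cpu, 0x40) leaves rsp exactly 0x40 bytes below rbp, and toy_leave(&cpu) puts both registers back where the caller had them, which is the whole point of saving rbp in the first place.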
The next section of the disassembly handled taking user input, indicated by the read calls; this first sequence was for reading the inputted username. read is a C library function that reads data from a file. It takes 3 arguments, in the following order: a file descriptor, which is where data should be read from; a pointer to a buffer into which the data is read; and a count, specifying the maximum number of bytes that should be read:
lea rax,[rbp-0x20]
mov edx,0x60
mov rsi,rax
mov edi,0x0
call read
All of the instructions preceding call are setting up function arguments. According to the System V ABI, the first six integer or pointer arguments are passed in registers, in the following order: rdi, rsi, rdx, rcx, r8, and r9. Thus, the first argument of a function is passed in rdi, the second in rsi, and so forth. But, wait! The assembly uses edi and edx, not rdi and rdx. Aren't those different registers? Yes, but no. At its inception, x86 was a 16-bit architecture; thus, registers had 16 bits. When x86 was extended to 32 bits, registers were also extended to 32 bits, and the original di register became edi (extended di). In x86-64, registers were extended again to 64 bits, and edi became rdi. To maintain backwards compatibility, x86-64 still allows use of the legacy registers. A legacy register name refers to the low portion of the extended 64-bit register, so edi is the lower 32 bits of rdi. One quirk: writing to a 32-bit register such as edi also zeroes the upper 32 bits of the 64-bit register.
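The relationship between rdi, edi, and di can be mimicked with plain integer casts. This is a rough sketch with helper names I made up (read_low32, write_low16, and so on); a real CPU does this in hardware, of course, but the arithmetic is the same.

```c
#include <assert.h>
#include <stdint.h>

/* Reading edi: the low 32 bits of rdi. */
uint32_t read_low32(uint64_t reg)
{
    return (uint32_t)reg;
}

/* Reading di: the low 16 bits of rdi. */
uint16_t read_low16(uint64_t reg)
{
    return (uint16_t)reg;
}

/* Writing edi (e.g. mov edi,0x0): the upper 32 bits of rdi become zero. */
uint64_t write_low32(uint64_t reg, uint32_t value)
{
    (void)reg;               /* the old contents do not survive */
    return (uint64_t)value;
}

/* Writing di: only the low 16 bits change; the upper 48 bits are preserved. */
uint64_t write_low16(uint64_t reg, uint16_t value)
{
    return (reg & ~(uint64_t)0xffff) | value;
}
```

For example, if rdi held 0x1122334455667788, reading edi would give 0x55667788 and reading di would give 0x7788. The asymmetry between the two write helpers is why mov edi,0x0 is enough to pass a clean 0 as the first argument to read.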
main then makes a call to a function named check, which takes in the username and password buffers as arguments, in rdi and rsi respectively. The disassembly, once again, begins by allocating stack space, specifically 16 bytes. Then, the arguments are moved onto the stack:
mov QWORD PTR [rbp-0x8],rdi
mov QWORD PTR [rbp-0x10],rsi
The mov QWORD PTR [<ADDR>],<SRC> syntax reads: store 8 bytes (a quad word)[5] from <SRC> starting at address <ADDR>. Here, the brackets indicate a memory dereference, telling the CPU to access the value starting at address <ADDR>, not the address itself. The next instructions prepare for and execute a strcmp call:
mov rax,QWORD PTR [rbp-0x8]
lea rsi,[rip+0xe53]
mov rdi,rax
call strcmp
rdi stores a pointer to the inputted username. rsi stores an address; here, the brackets do not indicate a dereference, they are simply used to enclose a calculation[6]. From a C perspective, strcmp takes two pointers to strings (char *) as arguments, so I thought the address now stored in rsi must point to a string. To confirm, I ran x/s $rsi in GDB to examine the value at the address in rsi as a string. Sure enough, it was a string: "LITCTF". The remainder of check's disassembly provides insight into the control flow it implements. Addresses are included for clarity:
4011bd: test eax,eax
4011bf: jne 4011df <check+0x49>
4011c1: mov rax,QWORD PTR [rbp-0x10]
4011c5: lea rsi,[rip+0xe43]
4011cc: mov rdi,rax
4011cf: call 401090 <strcmp@plt>
4011d4: test eax,eax
4011d6: jne 4011df <check+0x49>
4011d8: mov eax,0x1
4011dd: jmp 4011e4 <check+0x4e>
4011df: mov eax,0x0
4011e4: leave
4011e5: ret
After the strcmp call, a new instruction named test appears! test performs a bitwise AND on its operands and discards the result, but still modifies select flags in the special eflags register, which contains bits that track the status of the CPU after certain instructions. Of particular importance to this code is the zero flag, which tracks whether the result of the last flag-setting instruction is zero (1 if so, 0 if not). In the System V ABI, eax holds a function's integer return value, so the code is checking whether strcmp returned zero, which happens when the two strings passed as arguments are equal. Next, there is a jne (jump if not equal) instruction, which checks the zero flag and jumps to an address if it is not set (ZF = 0). In the context of our code, this occurs when the inputted username is not equal to "LITCTF". This branch jumps to address 0x4011df, which simply returns[7] from check with a value of zero. If the branch is not taken (when username = "LITCTF"), the code then checks, using another strcmp call, if password is equal to the string "d0nt_57r1ngs_m3_3b775884", stored at address rip+0xe43, taking the same branch if not. If they are equal, though, then the function returns with a value of 1. Thus, this whole section of code represents a logical AND that returns 1 if both comparisons succeed and 0 if not.
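To see how test and jne fit together, it can help to write check in a deliberately assembly-shaped way. The sketch below is my own illustration, not something recovered from the binary: zf_after_test and check_like_asm are made-up names, the zero flag is modeled as a plain int, and each goto lines up with one of the jumps in the listing.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* test a,b: compute a & b, throw the result away, set ZF if it was 0. */
int zf_after_test(uint32_t a, uint32_t b)
{
    return (a & b) == 0;
}

/* check, written with gotos so that each label matches a jump target. */
int check_like_asm(const char *username, const char *password)
{
    uint32_t eax;

    eax = (uint32_t)strcmp(username, "LITCTF");
    if (!zf_after_test(eax, eax)) goto fail;   /* jne 4011df */

    eax = (uint32_t)strcmp(password, "d0nt_57r1ngs_m3_3b775884");
    if (!zf_after_test(eax, eax)) goto fail;   /* jne 4011df */

    eax = 1;                                   /* mov eax,0x1 */
    goto done;                                 /* jmp 4011e4 */
fail:
    eax = 0;                                   /* mov eax,0x0 */
done:
    return (int)eax;                           /* leave; ret */
}
```

Since test eax,eax ANDs a value with itself, the result is zero only when eax itself is zero, so the pair test eax,eax / jne is simply "jump if eax is nonzero".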
Now, we find ourselves back in main:
40125b: test eax,eax
40125d: je 4012a2 <main+0xbc>
40125f: lea rdi,[rip+0xde2]
401266: call 401070 <puts@plt>
40126b: lea rdi,[rip+0xdde]
401272: call 401070 <puts@plt>
401277: lea rdi,[rip+0xe12]
40127e: call 401070 <puts@plt>
401283: lea rdi,[rip+0xdc6]
40128a: call 401070 <puts@plt>
40128f: lea rdi,[rip+0xe34]
401296: call 401070 <puts@plt>
40129b: mov eax,0x0
4012a0: jmp 4012b8 <main+0xd2>
4012a2: lea rdi,[rip+0xe29]
4012a9: call 401070 <puts@plt>
4012ae: mov edi,0x0
4012b3: call 4010a0 <exit@plt>
4012b8: leave
4012b9: ret
eax is once again subjected to a test. Instead of jne, there is now a je (jump if equal) instruction; this does the exact opposite of jne, jumping to an address if the zero flag is set. In the context of our program, this branch, which terminates the program, would be taken if check returns zero, indicating that one or both of the inputted username and password were incorrect. If both the inputted username and password check out, then a sequence of puts calls is executed before main returns with a value of zero.
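Every string passed to strcmp and puts in these listings is located with a lea of the form [rip+disp]. The arithmetic behind that is small enough to sketch. The helper name rip_relative_target is hypothetical; the one fact it relies on is that, while an instruction executes, rip holds the address of the next instruction.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Effective address of [rip+disp]. The base is the address of the
 * instruction FOLLOWING the lea, because that is what rip holds while
 * the lea executes. disp is signed, so targets can lie before the code too.
 */
uint64_t rip_relative_target(uint64_t next_instruction, int32_t disp)
{
    return next_instruction + (uint64_t)(int64_t)disp;
}
```

Taking the addresses in the listings at face value: the lea at 4011c5 is followed by the instruction at 4011cc, so rip_relative_target(0x4011cc, 0xe43) gives 0x40200f, which is where the password string would sit. Likewise, the lea at 40125f is followed by 401266, so the first puts argument resolves to 0x401266 + 0xde2 = 0x402048.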
Capturing the flag
After this analysis, I attempted to recreate the C program that the assembly seemed to have come from:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h> // Declares read().

int check(char *username, char *password) {
    if (strcmp(username, "LITCTF") == 0 && strcmp(password, "d0nt_57r1ngs_m3_3b775884") == 0) {
        return 1;
    }
    return 0;
}

int main() {
    puts("Enter username:\n");
    // Read up to 96 bytes from stdin into the `username` buffer.
    char username[96];
    read(0, username, 96); // 0 is the file descriptor for `stdin`.

    puts("Enter password:\n");
    // Read up to 96 bytes from stdin into the `password` buffer.
    char password[96];
    read(0, password, 96);

    if (check(username, password) == 0) {
        puts("Invalid credentials");
        exit(0);
    } else {
        // Five puts calls; the output strings are elided here.
        puts(...);
        puts(...);
        puts(...);
        puts(...);
        puts(...);
    }
    return 0;
}
Notice that not every assembly instruction translates directly to a line of C code. Many things happen behind the scenes, such as passing function arguments and setting up stack frames.
I then ran the program with the correct inputs and got five lines of output, so the five puts calls checked out. But, there was no flag. I checked what the flag format was: LITCTF{...}. I paused for a moment, laughing when I realized the correct username and password together were actually the flag. Silly me.
Takeaways
This was honestly one of the most exciting tasks I have done in my computing journey! I ended up obtaining the flag after the CTF was over, but it was still a worthwhile learning experience. I have certainly become better at reading assembly and figuring out how it translates to source code. My biggest takeaway is that not every single instruction matters. Instead, I should be able to spot which instructions are doing something I need to know about, and what values are involved. This is a skill I will hopefully develop with more reverse engineering experience. Anything is open source if you can read assembly.
[1] Linking is the process that combines compiler-generated object files into a final executable.
[2] Think of the instruction set architecture (ISA) as a manual for a processor. It defines available instructions and registers, how memory is to be addressed, etc.
[3] The disassembly shown throughout this post is in Intel's x86 syntax, which formats instructions as follows: <INST> <DEST>,<SRC>.
[4] This behavior is known as last in, first out (LIFO).
[5] To change the size of data involved, QWORD can be replaced by any size indicator, such as WORD or BYTE.
[6] The lea (load effective address) instruction computes a memory address and stores it in a register.
[7] The leave instruction cleans up the current stack frame. It is equivalent to the sequence mov rsp,rbp followed by pop rbp.