Assembly language is a low-level programming language that is specific to a computer architecture. It’s a human-readable representation of machine code, which is the binary code that a computer’s central processing unit (CPU) understands.
Understanding assembly language gives you insight into how computers work at a fundamental level. It’s also useful for tasks like writing efficient code, debugging, and understanding the inner workings of software. Some well-paid jobs also require it.
Setting up
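Assembling and linking assembly code on Windows, Linux, and macOS involves different tools and commands. Note that the example programs in this article use 32-bit Linux system calls (int 0x80), so they run as written only on Linux.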
For Windows:
- Using NASM and a linker:
- Download and install NASM (Netwide Assembler) from https://www.nasm.us/.
- Open Command Prompt.
- Navigate to the directory containing your assembly code.
- Use the following commands:
nasm -f win32 your_file.asm -o your_file.obj
This assembles the code.
gcc -m32 your_file.obj -o your_file.exe
This links the object file and produces the executable. It assumes a GCC toolchain that can build 32-bit programs, such as MinGW, is installed.
For Linux:
- Install NASM using your package manager (e.g., sudo apt-get install nasm on Debian/Ubuntu) from the terminal.
- Use the following commands:
nasm -f elf your_file.asm -o your_file.o
This assembles the code into a 32-bit ELF object file.
ld -m elf_i386 your_file.o -o your_file
This links the object file and produces the executable. The -m elf_i386 option tells the linker to produce a 32-bit executable, which is needed on 64-bit systems.
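Finally, for macOS:
- Install NASM using a package manager like Homebrew (brew install nasm) from the terminal.
- Run the commands:
nasm -f macho your_file.asm -o your_file.o
to assemble and
ld your_file.o -o your_file
to link, just like with Linux. Note that -f macho produces a 32-bit object file, and macOS 10.15 and later no longer run 32-bit programs, so on current systems you would need 64-bit code assembled with -f macho64.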
Once that is done, install an editor such as VS Code or Sublime Text.
Basic Concepts: Registers and Instructions
- Registers:
- Think of registers as tiny storage spaces inside the CPU.
- Each register can hold a small piece of data, like a number.
- On 32-bit x86, common registers include EAX, EBX, ECX, and EDX.
- Instructions:
- Assembly programs consist of instructions that tell the CPU what to do.
- Instructions are simple operations like moving data between registers, performing arithmetic, or jumping to another part of the program.
Assembly Syntax:
Commands:
- Commands are the basic operations in assembly language.
- Examples:
- MOV: Copy data between registers, or between a register and memory.
- ADD: Add two numbers.
- SUB: Subtract one number from another.
- JMP: Jump to another part of the program.
Operands:
- Operands are the data that instructions work with.
- Examples:
- Registers (e.g., EAX, EBX).
- Memory addresses (e.g., [0x1000]).
- Immediate values (actual numbers).
- Constants
A Simple Assembly Program:
Let’s create a basic program that adds two numbers and stores the result:
section .data
PI dd 3.14 ; Store PI as a 32-bit float (equ accepts only integer values)
num1 dd 5 ; Define a doubleword (32-bit) variable with value 5
num2 dd 3 ; Define another doubleword variable with value 3
result dd 0 ; Reserve 32 bits for the result
section .text
global _start
_start:
    ; Load num1 into EAX register (32 bits, matching the dd variables)
    mov eax, [num1]
    ; Add num2 to the value in EAX
    add eax, [num2]
    ; Store the result in the 'result' variable
    mov [result], eax
    ; Exit the program
    mov eax, 1 ; syscall: exit
    xor ebx, ebx ; status: 0
    int 0x80 ; Call the kernel
This simple program defines two numbers (num1 and num2), adds them together, and stores the result in the result variable. Finally, it exits the program.
Directives:
- Directives provide instructions to the assembler but are not executed at runtime.
- Examples:
- section .data: Defines a section for data.
- section .text: Defines a section for executable code.
Comments:
- Use comments (text starting with ; and running to the end of the line) to explain your code.
- Comments are essential for making your code readable and understandable.
Memory Access:
- Memory can be accessed using brackets [].
- Example:
- mov eax, [num1]: Moves the value at the memory address num1 into the eax register.
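To make the role of the brackets concrete, here is a minimal sketch for 32-bit Linux. The names value and copy are illustrative and do not appear in the programs above. Without brackets, NASM uses a label's address; with brackets, it uses the contents stored at that address. When neither operand is a register, the operand size has to be stated explicitly.

```nasm
section .data
value dd 42              ; a 32-bit variable holding 42 (illustrative name)
copy  dd 0               ; a second 32-bit variable (illustrative name)

section .text
global _start
_start:
    mov eax, value       ; no brackets: EAX = address of 'value'
    mov ebx, [value]     ; brackets: EBX = contents of 'value' (42)
    mov ecx, [eax]       ; a register can hold the address: ECX = 42
    mov [copy], ebx      ; store EBX into 'copy'
    mov dword [copy], 7  ; immediate to memory: the size must be given

    mov ebx, [copy]      ; EBX = 7
    mov eax, 1           ; syscall: exit
    int 0x80             ; exit status is the value in EBX
```

Assembled with nasm -f elf and linked with ld -m elf_i386, the program should exit with status 7, which you can check with echo $? in the shell.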
Conditional Jumps:
- Use conditional jumps to make decisions in your code.
- Example:
- cmp eax, ebx: Compares the values in eax and ebx.
- je label: Jumps to the specified label if the previous comparison was equal.
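Besides je, x86 has conditional jumps for other outcomes of a comparison, such as jne (not equal) and, for signed values, jg (greater), jl (less), and jle (less or equal). As a small sketch with made-up values and an illustrative label name, the following leaves the larger of two numbers in EBX and returns it as the exit status:

```nasm
section .text
global _start
_start:
    mov eax, 3           ; first value (illustrative)
    mov ebx, 9           ; second value (illustrative)
    cmp eax, ebx         ; compare EAX with EBX and set the flags
    jle keep_ebx         ; signed: jump if EAX <= EBX
    mov ebx, eax         ; otherwise EAX is larger, so copy it to EBX
keep_ebx:
    mov eax, 1           ; syscall: exit
    int 0x80             ; exit status is the larger value (9 here)
```

Swapping the two starting values takes the other path through the code but gives the same exit status.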
A Conditional Jump Example:
Let’s modify the previous program to print “Equal” if the result is zero and “Not Equal” otherwise:
section .data
num1 dd 5
num2 dd 5
equal_msg db 'Equal', 0xA ; Message for equal
equal_len equ $ - equal_msg ; Length, taken right after the message
not_equal_msg db 'Not Equal', 0xA ; Message for not equal
not_equal_len equ $ - not_equal_msg ; Length, taken right after the message
section .text
global _start
_start:
    ; Load num1 into EAX register
    mov eax, [num1]
    ; Subtract num2 from the value in EAX
    sub eax, [num2]
    ; Compare EAX with 0
    cmp eax, 0
    ; Jump to the 'equal' label if the result is zero
    je equal
    ; If not equal, print "Not Equal"
    mov eax, 4 ; syscall: write
    mov ebx, 1 ; file descriptor: stdout
    mov ecx, not_equal_msg ; message address
    mov edx, not_equal_len ; message length
    int 0x80 ; Call the kernel
    jmp done ; Skip the 'equal' branch
equal:
    ; If equal, print "Equal"
    mov eax, 4 ; syscall: write
    mov ebx, 1 ; file descriptor: stdout
    mov ecx, equal_msg ; message address
    mov edx, equal_len ; message length
    int 0x80 ; Call the kernel
done:
    ; Exit the program
    mov eax, 1 ; syscall: exit
    xor ebx, ebx ; status: 0
    int 0x80 ; Call the kernel
This modified program introduces a conditional jump (je) and an unconditional jump (jmp). It subtracts the two numbers, prints “Equal” if the result is zero and “Not Equal” otherwise, and then exits.
Loops in Assembly:
- Assembly language supports loops for repetitive tasks.
- Example:
- The ecx register is often used as a loop counter.
- The loop instruction decrements ecx and jumps to a label until ecx becomes zero.
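Before printing anything, it may help to see the loop instruction on its own. The sketch below, for 32-bit Linux, adds the numbers 5 down to 1 and returns the total as the exit status. The label sum_loop and the use of EBX as the running total are choices made for this illustration.

```nasm
section .text
global _start
_start:
    xor ebx, ebx         ; EBX = running total, starts at 0
    mov ecx, 5           ; ECX = loop counter
sum_loop:
    add ebx, ecx         ; add 5, 4, 3, 2, 1 in turn
    loop sum_loop        ; ECX = ECX - 1, jump back while ECX != 0

    mov eax, 1           ; syscall: exit
    int 0x80             ; exit status is the total, 15
```

The loop instruction behaves much like dec ecx followed by jnz, except that it leaves the flags untouched. Because it always uses ECX, the loop body must not overwrite that register without saving it first.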
A Loop Example:
Let’s create a program that prints the numbers 1 to 5:
section .data
digit db '1' ; One-byte buffer holding the ASCII digit to print
newline db 0xA ; Newline character
section .text
global _start
_start:
    mov ecx, 5 ; Set the loop counter to 5
print_loop:
    push ecx ; Save the loop counter, since write needs ecx
    ; Print the current digit
    mov eax, 4 ; syscall: write
    mov ebx, 1 ; file descriptor: stdout
    mov ecx, digit ; address of the digit buffer
    mov edx, 1 ; message length
    int 0x80 ; Call the kernel
    ; Print a newline character
    mov eax, 4 ; syscall: write
    mov ebx, 1 ; file descriptor: stdout
    mov ecx, newline ; newline character address
    mov edx, 1 ; message length
    int 0x80 ; Call the kernel
    inc byte [digit] ; Advance to the next digit ('1' -> '2' -> ...)
    pop ecx ; Restore the loop counter
    ; Decrement the loop counter and check if it's zero
    loop print_loop
    ; Exit the program
    mov eax, 1 ; syscall: exit
    xor ebx, ebx ; status: 0
    int 0x80 ; Call the kernel
This program uses a loop to print the numbers 1 to 5 along with newline characters.
Subroutines:
- Subroutines are reusable pieces of code.
- Use call to jump to a subroutine and ret to return.
- Example:
call my_subroutine
; ...
my_subroutine:
; Subroutine code here
ret
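A subroutine usually needs some way to receive input and hand back a result. One simple convention, assumed here purely for illustration, is to pass the value in EAX and return the result in the same register. The helper name double_eax is hypothetical.

```nasm
section .text
global _start
_start:
    mov eax, 4
    call double_eax      ; pushes the return address and jumps: EAX = 8
    call double_eax      ; the same code is reused: EAX = 16
    mov ebx, eax         ; use the result as the exit status
    mov eax, 1           ; syscall: exit
    int 0x80

double_eax:              ; hypothetical helper: doubles the value in EAX
    add eax, eax
    ret                  ; pops the return address and resumes after the call
```

Note that the subroutine is placed after the exit system call, so execution never falls into it by accident; it runs only when called.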
A Subroutine Example:
Let’s modify the previous program to use a subroutine for printing a newline character:
section .data
digit db '1' ; One-byte buffer holding the ASCII digit to print
newline db 0xA ; Newline character
section .text
global _start
_start:
    mov ecx, 5 ; Set the loop counter to 5
print_loop:
    push ecx ; Save the loop counter, since write needs ecx
    ; Print the current digit
    mov eax, 4 ; syscall: write
    mov ebx, 1 ; file descriptor: stdout
    mov ecx, digit ; address of the digit buffer
    mov edx, 1 ; message length
    int 0x80 ; Call the kernel
    ; Call the newline subroutine
    call print_newline
    inc byte [digit] ; Advance to the next digit
    pop ecx ; Restore the loop counter
    ; Decrement the loop counter and check if it's zero
    loop print_loop
    ; Exit the program
    mov eax, 1 ; syscall: exit
    xor ebx, ebx ; status: 0
    int 0x80 ; Call the kernel
print_newline:
    ; Subroutine to print a newline character (overwrites eax, ebx, ecx, edx)
    mov eax, 4 ; syscall: write
    mov ebx, 1 ; file descriptor: stdout
    mov ecx, newline ; newline character address
    mov edx, 1 ; message length
    int 0x80 ; Call the kernel
    ret
This program introduces a print_newline subroutine to handle printing newline characters.