01: Introduction to Assembly Language

What is Assembly Language?

Assembly language is a low-level programming language that corresponds closely to machine code (the binary instructions that CPUs execute). Each assembly instruction typically maps to a single machine instruction.

Assembly vs Machine Code

Machine Code (Binary):  10110000 01100001  (what the CPU sees)
Assembly (Mnemonic):    MOV AL, 61h        (what humans write)
High-Level (C):         a = 97;            (what we prefer)
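To make the mapping between the first two rows concrete, here is a minimal Python sketch that decodes those two bytes back into the mnemonic. It handles only the `MOV r8, imm8` form (opcodes B0h to B7h); the function name `decode_mov_r8_imm8` is made up for this illustration and is not part of any assembler or library.

```python
# Minimal sketch: decode the two-byte x86 instruction from the comparison above.
# Only "MOV r8, imm8" (opcodes B0h-B7h) is handled; this is not a disassembler.

# 8-bit register names in x86 encoding order (register numbers 0-7).
REG8 = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]


def decode_mov_r8_imm8(code: bytes) -> str:
    """Return assembly text for a 2-byte MOV r8, imm8 instruction."""
    if len(code) != 2 or not 0xB0 <= code[0] <= 0xB7:
        raise ValueError("not a MOV r8, imm8 instruction")
    reg = REG8[code[0] - 0xB0]    # low 3 bits of the opcode select the register
    text = f"{code[1]:02X}h"      # second byte is the immediate value
    if text[0] in "ABCDEF":
        text = "0" + text         # hex literals with an 'h' suffix start with a digit
    return f"MOV {reg}, {text}"


machine_code = bytes([0b10110000, 0b01100001])  # the bits shown above
print(decode_mov_r8_imm8(machine_code))  # MOV AL, 61h
print(machine_code[1])                   # 97, the value in the C line
```

The second byte, 61h, is 97 in decimal, which is why all three rows describe the same operation.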

Why Learn Assembly?

  1. System Understanding - Learn how computers actually work
  2. Performance Optimization - Write extremely efficient code
  3. Reverse Engineering - Understand compiled programs
  4. Embedded Systems - Directly control hardware
  5. Operating Systems - Understand OS internals
  6. Debugging - Debug complex low-level issues
  7. Security - Identify vulnerabilities and exploits

Historical Context

  • 1940s-1950s: Assembly was the primary programming language
  • 1950s onwards: High-level languages emerged (FORTRAN in 1957, C in 1972)
  • Today: Assembly is used for:
    • Critical performance sections
    • Operating system kernels
    • Bootloaders
    • Device drivers
    • Reverse engineering

Assembly Variants

Different CPU architectures have different assembly languages:

x86/x86-64 (Intel/AMD)

  • Most common for personal computers and servers
  • Complex instruction set (CISC)
  • Examples: Intel Core, AMD Ryzen

ARM

  • Dominant in mobile devices and embedded systems
  • Reduced instruction set (RISC)
  • Examples: Apple M1, Qualcomm Snapdragon

MIPS

  • Used in networking and some embedded systems
  • Classic RISC architecture

PowerPC

  • Used in some embedded and aerospace systems

Key Concepts

Registers

Small, very fast storage locations inside the CPU itself. Accessing a register is much faster than accessing RAM.

Typical Register Size:
- 8-bit   (1 byte)   - early microprocessors (e.g. Intel 8080) and small microcontrollers
- 16-bit  (2 bytes)  - e.g. Intel 8086
- 32-bit  (4 bytes)  - e.g. Intel 80386 and later 32-bit x86
- 64-bit  (8 bytes)  - modern systems (x86-64, ARM64)
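The practical effect of register width is the range of values a register can hold. The following Python sketch computes the largest unsigned value for each width and shows what happens when a result does not fit. The helper names `max_unsigned` and `wrap` are illustrative only.

```python
# Sketch: value range and wraparound for common register widths.

def max_unsigned(bits: int) -> int:
    """Largest unsigned value that fits in a register of the given width."""
    return (1 << bits) - 1


def wrap(value: int, bits: int) -> int:
    """Keep only the low `bits` bits, as a fixed-width register would."""
    return value & max_unsigned(bits)


for bits in (8, 16, 32, 64):
    print(f"{bits:2}-bit: {bits // 8} byte(s), max unsigned = {max_unsigned(bits)}")

# Adding 1 to the largest 8-bit value wraps around to 0.
print(wrap(255 + 1, 8))  # 0
```

Python integers grow without limit, so the masking in `wrap` is what imitates a fixed-width register here.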

Memory Hierarchy

CPU Registers    ← Fastest, smallest (bytes to a few KB)
   ↓
L1 Cache         ← Very fast (KB)
   ↓
L2 Cache         ← Fast (hundreds of KB to a few MB)
   ↓
L3 Cache         ← Slower (MB)
   ↓
RAM (Memory)     ← Slow, large (GB)
   ↓
Disk Storage     ← Slowest, huge (TB)

Instructions

An instruction is a single operation, such as:

  • Moving data: MOV AL, 5
  • Adding numbers: ADD AX, BX
  • Jumping to code: JMP LABEL
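As a rough illustration of what these three instructions do, here is a toy Python model. It is not an x86 emulator: flags, memory, and most registers are left out, and the class `TinyCPU` and its tuple-based program format are assumptions made for this sketch. It does model one real x86 detail, namely that AL is the low byte of the 16-bit register AX.

```python
# Toy model of MOV, ADD and JMP. Flags, memory and most registers are omitted.

class TinyCPU:
    def __init__(self):
        self.regs = {"AX": 0, "BX": 0}   # two 16-bit registers

    def read(self, name):
        if name == "AL":                 # AL is the low byte of AX
            return self.regs["AX"] & 0xFF
        return self.regs[name]

    def write(self, name, value):
        if name == "AL":                 # replace the low byte, keep the high byte
            self.regs["AX"] = (self.regs["AX"] & 0xFF00) | (value & 0xFF)
        else:
            self.regs[name] = value & 0xFFFF   # results wrap at 16 bits

    def value_of(self, operand):
        """An operand is either an immediate (int) or a register name (str)."""
        return operand if isinstance(operand, int) else self.read(operand)

    def run(self, program):
        labels = {ins[1]: i for i, ins in enumerate(program) if ins[0] == "LABEL"}
        pc = 0                           # index of the next instruction
        while pc < len(program):
            op, *args = program[pc]
            pc += 1
            if op == "MOV":
                self.write(args[0], self.value_of(args[1]))
            elif op == "ADD":
                self.write(args[0], self.read(args[0]) + self.value_of(args[1]))
            elif op == "JMP":
                pc = labels[args[0]]     # continue at the label instead
            elif op != "LABEL":
                raise ValueError(f"unknown instruction: {op}")


program = [
    ("MOV", "AL", 5),        # AX becomes 5
    ("MOV", "BX", 10),
    ("ADD", "AX", "BX"),     # AX becomes 15
    ("JMP", "done"),
    ("MOV", "AX", 0),        # skipped by the jump
    ("LABEL", "done"),
]
cpu = TinyCPU()
cpu.run(program)
print(cpu.regs)  # {'AX': 15, 'BX': 10}
```

The `MOV AX, 0` line never executes, which is the whole point of `JMP`: it changes which instruction runs next.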

Assembly Development Tools

Assemblers (convert assembly to machine code)

  • NASM - Netwide Assembler (free, popular)
  • MASM - Microsoft Macro Assembler (Windows)
  • GAS - GNU Assembler (Linux)
  • YASM - A rewrite of NASM that accepts both NASM and GAS syntax

Linkers (combine object files into an executable)

  • ld - GNU Linker
  • link.exe - Microsoft Linker

Debuggers

  • GDB - GNU Debugger
  • OllyDbg - Windows debugger
  • Radare2 - Reverse engineering framework

Disassemblers (convert machine code back to assembly)

  • objdump - Part of GNU tools
  • IDA Pro - Professional disassembler
  • Ghidra - Free, open-source reverse engineering suite released by the NSA

Your Learning Path

1. Understand CPU basics (how it works internally)
   ↓
2. Learn registers (how to store data in the CPU)
   ↓
3. Study memory (how to work with larger data)
   ↓
4. Master instructions (actual operations)
   ↓
5. Study control flow (loops, conditions, jumps)
   ↓
6. Learn functions (calling conventions, stack)
   ↓
7. Practice (write real programs)

Important Notes

  • Assembly is CPU-specific - Code for x86 won't run on ARM
  • Assembly is OS-specific - Windows, Linux, and macOS differ in system calls, calling conventions, and executable formats
  • Syntax varies - NASM and GAS use different syntax (we'll focus on NASM)
  • No built-in functions - Assembly has no printf(), malloc(), etc. (unless you call library functions)
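To illustrate the first note, the Python sketch below encodes the same idea, "put 97 in a register", for two architectures. The x86 bytes are the ones from the start of this chapter. The ARM side uses the 64-bit ARM (AArch64) `MOVZ W0, #imm16` encoding, which is not covered elsewhere in this chapter and is included here only for comparison. Both function names are made up for this example.

```python
# Sketch: the same operation produces different machine code on different CPUs.
import struct


def encode_x86_mov_al(imm: int) -> bytes:
    """x86: MOV AL, imm8 is opcode B0h followed by the immediate byte."""
    return bytes([0xB0, imm & 0xFF])


def encode_a64_movz_w0(imm: int) -> bytes:
    """AArch64: MOVZ W0, #imm16 is one 32-bit word, stored little-endian."""
    word = 0x52800000 | ((imm & 0xFFFF) << 5)   # imm16 sits in bits 5-20, Rd = 0
    return struct.pack("<I", word)


print(encode_x86_mov_al(97).hex())    # b061
print(encode_a64_movz_w0(97).hex())   # 200c8052
```

The byte sequences differ in both length and content, so a program assembled for one architecture is meaningless to the other.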

Next Steps

→ Continue to CPU Architecture to understand how processors work at the hardware level.