fs_sim - ext2-Inspired File System Simulator
From Theory to Working ext2 Clone – One Buffer at a Time
fs_sim is a clean, educational implementation of a block-based file system inspired by ext2.
Built entirely in C++17, it simulates a disk as a contiguous `std::vector<std::byte>` and provides a simple REPL/shell where you can format, mount, create files/directories, read/write, list, delete (even recursively), navigate paths, and remount — with data surviving simulated reboots.
The goal was never production use.
It was to move from textbook diagrams to actually making it work — understanding superblocks, block groups, inodes, bitmaps, directory packing, and metadata consistency under real operations.
Motivation
Reading about inodes, bitmaps, and block groups is one thing.
Seeing your `ls` output survive a simulated crash is another.
I built fs_sim to bridge that gap:
- Turn OS course theory into runnable code
- Practice raw C++ memory management and pointer math
- Debug real corruption bugs (off-by-one inode offsets, directory entry misalignment)
- Create an interactive testbed where I could run `mount → mkdir → echo → rm -r → remount → ls` and verify everything survived
It’s one of the best ways to internalize why real file systems make the tradeoffs they do.
Core Architecture
- Disk — simulated as a single `std::vector<std::byte>` (default 4 KiB blocks)
- Superblock — fixed location; tracks total blocks/inodes, blocks per group, inode size, free counts
- Block Groups — disk partitioned into equal groups; each has:
- Block bitmap
- Inode bitmap
- Inode table
- Data blocks
- Inodes — fixed-size; store type (file/dir), permissions (basic), size, timestamps, 12 direct block pointers
- Directories — special files containing variable-length `{name, inode}` entries
- Persistence — serialize/deserialize the entire buffer so state survives `umount`/`mount` cycles
No indirect/double-indirect blocks yet → files capped at ~48 KiB (12 × 4 KiB).
No full UID/GID or advanced permissions (planned).
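The variable-length directory entries described above are where most of the alignment bugs tend to live. Here is a minimal sketch of one possible packing, using an ext2-style header; the field names and sizes are my own illustration, not fs_sim's actual structs:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical on-disk layout for a variable-length directory entry,
// ext2-style: inode number, record length, name length, then the name
// bytes, padded so the next header stays 4-byte aligned.
struct DirEntryHeader {
    uint32_t inode;     // inode number (0 = unused slot)
    uint16_t rec_len;   // total bytes this entry occupies, incl. padding
    uint16_t name_len;  // bytes of the name that follow the header
};

// Append one entry to a packed directory block.
void append_entry(std::vector<std::byte>& block, uint32_t inode,
                  const std::string& name) {
    DirEntryHeader h;
    h.inode = inode;
    h.name_len = static_cast<uint16_t>(name.size());
    // Round the record size up to a multiple of 4 for alignment.
    h.rec_len = static_cast<uint16_t>(
        (sizeof(DirEntryHeader) + name.size() + 3) & ~std::size_t{3});

    std::size_t off = block.size();
    block.resize(off + h.rec_len);               // zero-fills the padding
    std::memcpy(block.data() + off, &h, sizeof h);
    std::memcpy(block.data() + off + sizeof h, name.data(), name.size());
}

// Walk the packed block and return the inode for `name`, or 0 if absent.
uint32_t lookup(const std::vector<std::byte>& block, const std::string& name) {
    std::size_t off = 0;
    while (off + sizeof(DirEntryHeader) <= block.size()) {
        DirEntryHeader h;
        std::memcpy(&h, block.data() + off, sizeof h);
        if (h.rec_len == 0) break;               // corrupt or empty tail
        if (h.inode != 0 && h.name_len == name.size() &&
            std::memcmp(block.data() + off + sizeof h, name.data(),
                        name.size()) == 0)
            return h.inode;
        off += h.rec_len;
    }
    return 0;
}
```

Getting `rec_len` wrong by even one byte shifts every subsequent header, which is exactly the "fragile serialization" failure mode mentioned below.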
Supported Commands (REPL Shell)
Run ./fs_sim to enter the interactive shell:
- `format` — wipe and initialize a fresh FS
- `mount` — load existing disk state
- `ls [path]` — list directory contents
- `mkdir <path>`
- `touch <path>`
- `echo "content" > <path>` — create/write file
- `cat <path>` — read file contents
- `rm <path>`
- `rm -r <path>` — recursive delete
- `pwd`, `cd <path>` — navigation
- Absolute (`/home`), relative (`../docs`), and `./`/`..` path support
All ops correctly update bitmaps, link counts, free counters, and directory entries.
Technical Challenges & Lessons
The hard parts were exactly what made it valuable:
- Address calculations — inode # → group → table offset → byte offset (off-by-one = instant corruption)
- Directory packing — variable name lengths + alignment → fragile serialization/deserialization
- Path resolution — walking the tree from root, handling `./` and `..`, preventing cycles/orphans
- Memory safety — raw pointers over a giant byte buffer; used `std::span` heavily, still chased overruns
- Remount consistency — every struct must serialize/deserialize perfectly
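The inode-number-to-byte-offset chain is worth spelling out. A sketch with made-up geometry constants (fs_sim reads the real values from its superblock, so treat these numbers as placeholders):

```cpp
#include <cstdint>

// Hypothetical geometry; the real values live in fs_sim's superblock.
constexpr uint32_t kBlockSize       = 4096;  // bytes per block
constexpr uint32_t kInodeSize       = 128;   // bytes per on-disk inode
constexpr uint32_t kInodesPerGroup  = 256;   // inodes in each block group
constexpr uint32_t kBlocksPerGroup  = 1024;  // blocks in each block group
constexpr uint32_t kInodeTableStart = 3;     // block index of the inode table
                                             // inside a group (after bitmaps)

// Translate an inode number into the byte offset of its on-disk record.
// Inode numbers start at 1, as in ext2 -- forgetting the `- 1` here is
// exactly the off-by-one that silently corrupts a neighbouring inode.
uint64_t inode_byte_offset(uint32_t inode_no) {
    uint32_t index = inode_no - 1;                 // 0-based index
    uint32_t group = index / kInodesPerGroup;      // which block group
    uint32_t local = index % kInodesPerGroup;      // slot inside its table
    uint64_t group_base = uint64_t{group} * kBlocksPerGroup * kBlockSize;
    uint64_t table_base = group_base + uint64_t{kInodeTableStart} * kBlockSize;
    return table_base + uint64_t{local} * kInodeSize;
}
```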
Testing: Catch2 unit suite + the REPL itself (format → deep tree → write → umount → mount → verify).
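Path resolution as described — walking from root, collapsing `.` and `..` — can be sketched over strings. This is only the normalization step under assumed `/` separators; fs_sim's real resolver walks inodes:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Minimal path normalization sketch: "." is dropped, ".." pops one
// component, and popping never goes past the root. An illustration,
// not fs_sim's actual resolver.
std::string resolve(const std::string& cwd, const std::string& path) {
    // Relative paths are interpreted against the current directory.
    std::string base =
        (path.empty() || path[0] != '/') ? cwd + "/" + path : path;

    std::vector<std::string> stack;
    std::stringstream ss(base);
    std::string part;
    while (std::getline(ss, part, '/')) {
        if (part.empty() || part == ".") continue;   // skip "//" and "."
        if (part == "..") {
            if (!stack.empty()) stack.pop_back();    // clamp at root
        } else {
            stack.push_back(part);
        }
    }

    std::string out = "/";
    for (std::size_t i = 0; i < stack.size(); ++i)
        out += stack[i] + (i + 1 < stack.size() ? "/" : "");
    return out;
}
```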
Outcome & Reflection
After months of segfaults and bitmap bugs, the moment I typed ls and saw my directory tree survive a remount felt huge.
This project gave me intuition no textbook could:
- Why block groups exist (localized allocation)
- How bitmaps prevent over-allocation
- The cost of metadata updates
- Why real FSes obsess over crash consistency
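The over-allocation point can be made concrete with a minimal bitmap allocator — an illustration of the idea, not fs_sim's code:

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Sketch of bitmap-based block allocation, assuming one bit per block
// (1 = in use). Finding a clear bit and setting it before returning is
// what prevents handing the same block to two files.
std::optional<std::size_t> alloc_block(std::vector<uint8_t>& bitmap) {
    for (std::size_t byte = 0; byte < bitmap.size(); ++byte) {
        if (bitmap[byte] == 0xFF) continue;          // all 8 blocks taken
        for (int bit = 0; bit < 8; ++bit) {
            if (!(bitmap[byte] & (1u << bit))) {
                bitmap[byte] |= (1u << bit);         // mark allocated
                return byte * 8 + bit;
            }
        }
    }
    return std::nullopt;                             // group is full
}

void free_block(std::vector<uint8_t>& bitmap, std::size_t block) {
    bitmap[block / 8] &= static_cast<uint8_t>(~(1u << (block % 8)));
}
```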
If you’re studying OSes or low-level systems, build something like this.
Start flat (no dirs), add hierarchy, then bitmaps/inodes.
The “aha” when your toy FS survives is worth every crash.
Future ideas (some already in progress):
- Single/double indirect pointers
- mmap-backed real file persistence
- Full rwx + UID/GID
- Basic journaling concepts
Links & Next Steps
- Repository & Full README: github.com/pavandhadge/fs_sim
- Build & Run:
```sh
mkdir build && cd build
cmake ..
make
./fs_sim     # enter REPL
make check   # run tests
```
Clone it, break it, fix it, learn from it.
“Building fs_sim showed me that file systems aren’t magic — they’re careful data structures and relentless consistency checks. And the best way to understand them is to build one yourself.”