Capstone Project 4

File Encryption Tool

Build a command-line file encryption tool in C that implements multiple encryption algorithms: the XOR cipher, the Caesar cipher, and (as a bonus) the Vigenère cipher. You'll handle binary file I/O, implement a clean command-line interface with argument parsing, and ensure proper error handling and secure key management.

12-18 hours
Intermediate-Advanced
450 Points
What You Will Build
  • XOR cipher implementation
  • Caesar cipher with shift
  • Vigenère cipher (bonus)
  • Command-line interface
  • Binary file handling
01

Project Overview

This capstone project focuses on cryptography fundamentals and file I/O operations. You will build a professional-grade file encryption tool that can encrypt and decrypt files using multiple algorithms. The tool will feature a command-line interface with proper argument parsing, support for both text and binary files, and robust error handling for various edge cases.

Skills Applied: This project tests your proficiency in file I/O (binary mode), bitwise operations, string manipulation, command-line argument parsing, modular programming with header files, and memory management in C.
XOR Cipher

Implement XOR-based encryption with variable key length

Caesar Cipher

Classic shift cipher with configurable offset

File Handling

Read/write text and binary files securely

CLI Interface

Parse arguments and provide help documentation

Learning Objectives

Technical Skills
  • Master binary file I/O with fread/fwrite
  • Implement bitwise XOR operations
  • Build modular alphabet shift functions
  • Parse command-line arguments (argc/argv)
  • Handle memory allocation correctly
Security Concepts
  • Understand symmetric encryption principles
  • Learn about key management best practices
  • Recognize strengths and weaknesses of classic ciphers
  • Handle sensitive data in memory securely
  • Implement proper error messages (no info leakage)
02

Project Scenario

SecureData Solutions

You have been contracted by SecureData Solutions, a small cybersecurity consulting firm. They need a lightweight, portable file encryption utility for their clients who work in environments where installing large security suites is not practical. The tool must be small, fast, and work entirely from the command line.

"We need a simple but effective encryption tool that our field agents can use on any system with a C compiler. It should support multiple algorithms so users can choose their security level, and it must handle any file type - text documents, images, even executables. Can you build this for us?"

Sarah Chen, Chief Security Officer

Requirements from the Client

Core Algorithms
  • XOR cipher with user-provided key
  • Caesar cipher with configurable shift (1-25)
  • Decrypt mode for all algorithms
  • Auto-detect original file from encrypted
File Operations
  • Read files of any size (streaming for large files)
  • Write encrypted output to new file
  • Preserve original file (no overwrite)
  • Support both text and binary files
Command-Line Interface
  • Support short and long argument forms
  • Display help message with --help
  • Show version with --version
  • Clear error messages for invalid input
Bonus Features
  • Vigenère cipher implementation
  • Progress bar for large files
  • File integrity check (checksum)
  • Batch encryption (multiple files)
Pro Tip: Focus on correctness first! Make sure your encryption and decryption work perfectly for simple cases before adding advanced features. A tool that encrypts but cannot decrypt correctly is useless.
03

The Dataset

You will work with sample text files for testing your encryption algorithms. Download the files containing various content types and sizes for comprehensive testing:

Dataset Download

Download the sample text files and save them to your project folder. The files contain various content patterns for testing encryption algorithms.

Original Data Source

This project uses the Gutenberg Text Corpus from Kaggle - a collection of 18 classic literature .txt files (11.8 MB total). Includes works by Austen, Carroll, Melville, Milton, Shakespeare, and more - perfect for testing file encryption with real-world text data.

Dataset Info: 18 .txt files | Total Size: 11.8 MB | Authors: Austen, Carroll, Melville, Milton, Shakespeare, Chesterton | Format: Plain text (.txt) | Content: Classic literature novels and poems
Dataset Schema

Column | Type | Description
id | Integer | Row identifier (1-10)
text | String | Sample text patterns (Hello World, pangram, etc.)
category | String | Pattern type (greeting, alphabet, numbers, special)
length | Integer | Character count of text field
Use for: Initial testing, verifying basic encryption/decryption works

Column | Type | Description
section_id | Integer | Section identifier (1-20)
topic | String | Cryptography topic (symmetric, asymmetric, hash, etc.)
content | String | Educational content about encryption
example | String | Code or pattern example
difficulty | String | Beginner, Intermediate, Advanced
Use for: Testing with realistic document content

Column | Type | Description
row_id | Integer | Unique row identifier (1-500)
chapter | String | Chapter name (History, XOR, Caesar, Vigenere, etc.)
section | String | Section title within chapter
content | String | Detailed documentation text
code_sample | String | C code examples (may contain special chars)
ascii_value | Integer | ASCII reference values (0-255)
Use for: Performance testing, memory handling, buffer management

Column | Type | Description
record_id | Integer | Record identifier (1-25)
data_type | String | Type: PII, financial, credentials, network
field_name | String | Field label (name, ssn, card_number, etc.)
sample_value | String | Fictional sample data (NEVER real data)
sensitivity | String | Low, Medium, High, Critical
Note: All data is fictional! This demonstrates the type of sensitive content that encryption tools typically protect.
Test File Stats: 4 sample files covering 200B to 15KB range with diverse content patterns
Verification: Encrypt then decrypt - output should match original exactly
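The round-trip check can be automated. The sketch below compares two files byte by byte; `files_equal` is a hypothetical helper name, not part of the required project structure, and a read error is treated the same as end of file for simplicity.

```c
#include <stdio.h>

/*
 * Compare two files byte by byte.
 * Returns 1 if they are identical, 0 if they differ,
 * and -1 if either file cannot be opened.
 */
int files_equal(const char *path_a, const char *path_b)
{
    FILE *a = fopen(path_a, "rb");
    FILE *b = fopen(path_b, "rb");
    int result = -1;

    if (a != NULL && b != NULL) {
        int ca, cb;
        do {
            ca = fgetc(a);
            cb = fgetc(b);
        } while (ca == cb && ca != EOF);
        /* The loop ends with ca == cb only when both reached EOF together. */
        result = (ca == cb);
    }
    if (a != NULL) fclose(a);
    if (b != NULL) fclose(b);
    return result;
}
```

A test script could call `files_equal("sample.txt", "sample_decrypted.txt")` after an encrypt/decrypt pair and fail the test when the result is not 1.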
04

Key Concepts

Before implementing the encryption algorithms, make sure you understand these fundamental concepts:

XOR Cipher

XOR (exclusive or) is a bitwise operation that returns 1 when inputs differ:

0 XOR 0 = 0
0 XOR 1 = 1
1 XOR 0 = 1
1 XOR 1 = 0

Key property: A XOR B XOR B = A
(XORing twice with same key returns original)

Implementation: Each plaintext byte is XORed with the corresponding key byte. If the key is shorter than the plaintext, the key repeats from its start, so byte i of the input uses key byte (i mod key length).
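A minimal sketch of the repeating-key XOR described above. The function name `xor_buffer` and its `key_pos` parameter are assumptions made for illustration; `key_pos` carries the key index from one chunk to the next so the key stays aligned when a file is processed in several buffers.

```c
#include <stddef.h>

/*
 * XOR len bytes of buf in place with a repeating key.
 * key_pos is the key index at which this chunk starts; the return value
 * is the index to pass in for the next chunk.
 * Hypothetical helper: key_len must be greater than zero.
 */
size_t xor_buffer(unsigned char *buf, size_t len,
                  const unsigned char *key, size_t key_len,
                  size_t key_pos)
{
    for (size_t i = 0; i < len; i++) {
        buf[i] ^= key[key_pos];
        key_pos = (key_pos + 1) % key_len;  /* wrap to repeat the key */
    }
    return key_pos;
}
```

Because A XOR B XOR B = A, calling the same function again with the same key and a starting `key_pos` of 0 restores the original bytes, so no separate decrypt routine is needed.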

Caesar Cipher

Each letter is shifted by a fixed number of positions in the alphabet:

Shift = 3:
A → D    B → E    C → F    ...
X → A    Y → B    Z → C

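Encryption: E(x) = (x + shift) mod 26
Decryption: D(x) = (x - shift) mod 26
(x is the letter's position in the alphabet: A = 0, B = 1, ..., Z = 25)

Note: In C, the % operator can return a negative result when its left operand is negative, so add 26 before taking the final remainder when decrypting.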

Important: Only shift letters (A-Z, a-z). Leave numbers, spaces, and punctuation unchanged. Preserve case.
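One way to apply these rules to a single character is sketched below. The name `caesar_char` is hypothetical, and the code assumes an ASCII-compatible character set in which A-Z and a-z are contiguous.

```c
/*
 * Shift one character by `shift` positions, wrapping within its own case.
 * Non-letters are returned unchanged. Pass a negative shift to decrypt.
 * Assumes ASCII (letters are contiguous).
 */
char caesar_char(char c, int shift)
{
    /* Normalize the shift to 0..25 so negative values wrap correctly. */
    int s = ((shift % 26) + 26) % 26;

    if (c >= 'A' && c <= 'Z')
        return (char)('A' + (c - 'A' + s) % 26);
    if (c >= 'a' && c <= 'z')
        return (char)('a' + (c - 'a' + s) % 26);
    return c;  /* digits, spaces, punctuation, and other bytes */
}
```

With this shape, decryption is the same call with the shift negated, for example `caesar_char(c, -3)` to undo `caesar_char(c, 3)`.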

Binary File I/O

Use binary mode for reliable file handling:

// Open for binary read
FILE *in = fopen("input.txt", "rb");

// Open for binary write
FILE *out = fopen("output.enc", "wb");

// Read/write binary data
size_t bytes = fread(buffer, 1, SIZE, in);
fwrite(buffer, 1, bytes, out);

Why binary mode? Text mode may translate newlines differently on Windows vs Linux. Binary mode preserves exact bytes.
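The fread/fwrite pair above is usually wrapped in a loop so that files of any size are processed in fixed-size chunks. The following is a sketch only: `stream_copy` is a hypothetical helper that copies bytes unchanged, and a cipher would transform `buffer` between the read and the write.

```c
#include <stdio.h>

/*
 * Copy everything from `in` to `out` in fixed-size chunks.
 * Returns the number of bytes copied, or -1 on a read or write error.
 */
long long stream_copy(FILE *in, FILE *out)
{
    unsigned char buffer[4096];
    size_t n;
    long long total = 0;

    while ((n = fread(buffer, 1, sizeof buffer, in)) > 0) {
        /* A cipher would modify buffer[0..n-1] here. */
        if (fwrite(buffer, 1, n, out) != n)
            return -1;  /* short write, e.g. disk full */
        total += (long long)n;
    }
    /* fread returns 0 at end of file and on error; ferror tells them apart. */
    if (ferror(in))
        return -1;
    return total;
}
```

Only one 4096-byte buffer is ever in memory, so the file size does not affect memory use.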

Command-Line Arguments

Parse argc/argv to get user options:

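#include <string.h>

int main(int argc, char *argv[]) {
    const char *key = NULL;
    const char *input_file = NULL;

    for (int i = 1; i < argc; i++) {
        // Require i + 1 < argc so a trailing option cannot read past argv
        if (strcmp(argv[i], "-k") == 0 && i + 1 < argc) {
            key = argv[++i];
        } else if (strcmp(argv[i], "-i") == 0 && i + 1 < argc) {
            input_file = argv[++i];
        }
    }
    return 0;
}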

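Tip: Consider using getopt() for more robust argument parsing, or implement your own loop for learning purposes. getopt() is a POSIX function declared in <unistd.h> and handles short options only; getopt_long() from <getopt.h> on GNU and BSD systems also handles long options such as --help. Neither is part of standard C, so a hand-written loop is the more portable choice.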
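A hand-written parser can cover both short and long option forms with a small helper. The sketch below is one possible layout, not a required design: the `Options` struct, its field names, and the functions `is_opt` and `parse_args` are all hypothetical. The shift is stored as text so it can be validated separately.

```c
#include <string.h>

/* Hypothetical container for parsed options; field names are illustrative. */
typedef struct {
    const char *algorithm;
    const char *key;
    const char *shift;   /* kept as text; validate the range elsewhere */
    const char *input;
    const char *output;
    int decrypt;
} Options;

/* Returns 1 if arg matches either the short or the long spelling. */
static int is_opt(const char *arg, const char *short_form, const char *long_form)
{
    return strcmp(arg, short_form) == 0 || strcmp(arg, long_form) == 0;
}

/*
 * Fill *opts from argv. Returns 0 on success, or -1 for an unknown
 * option or an option that is missing its value.
 */
int parse_args(int argc, char *argv[], Options *opts)
{
    *opts = (Options){0};

    for (int i = 1; i < argc; i++) {
        const char **target = NULL;

        if (is_opt(argv[i], "-d", "--decrypt")) {
            opts->decrypt = 1;
            continue;
        }
        if (is_opt(argv[i], "-a", "--algorithm"))   target = &opts->algorithm;
        else if (is_opt(argv[i], "-k", "--key"))    target = &opts->key;
        else if (is_opt(argv[i], "-s", "--shift"))  target = &opts->shift;
        else if (is_opt(argv[i], "-i", "--input"))  target = &opts->input;
        else if (is_opt(argv[i], "-o", "--output")) target = &opts->output;
        else return -1;                 /* unknown option */

        if (i + 1 >= argc) return -1;   /* option given without a value */
        *target = argv[++i];
    }
    return 0;
}
```

The caller would still need to check that the required fields are non-NULL and handle --help and --version before or inside the loop.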

05

Project Requirements

Your project must implement the following features. Focus on correctness first, then add bonus features.

1
XOR Cipher Implementation (Required)
  • Accept key of any length as string
  • XOR each byte of input with corresponding key byte (repeating key as needed)
  • Work with both text and binary files
  • Same function should work for both encryption and decryption
  • Handle files larger than available memory (streaming)
2
Caesar Cipher Implementation (Required)
  • Accept shift value (1-25) as command-line argument
  • Shift only alphabetic characters (A-Z, a-z)
  • Preserve letter case (uppercase stays uppercase)
  • Leave non-alphabetic characters unchanged
  • Support both encrypt (positive shift) and decrypt (negative shift)
3
Command-Line Interface (Required)
  • Support arguments: -a (algorithm), -k (key), -s (shift), -i (input), -o (output)
  • Support -d or --decrypt flag for decryption mode
  • Display help message with -h or --help
  • Show version with --version
  • Clear error messages for missing or invalid arguments
4
Error Handling (Required)
  • File not found - clear error message
  • Permission denied - handle gracefully
  • Invalid algorithm selection
  • Missing required arguments
  • Invalid shift value (outside 1-25 range)
  • Memory allocation failures
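Two of the checks listed above can be sketched as follows. Both function names are hypothetical. The second function relies on fopen setting errno on failure, which POSIX guarantees and most C libraries do, but which standard C does not strictly require.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse a Caesar shift from text. Returns 0 and stores the value in *out
 * when the text is a whole number in 1..25; returns -1 otherwise.
 */
int parse_shift(const char *text, int *out)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(text, &end, 10);
    if (errno != 0 || end == text || *end != '\0')
        return -1;  /* overflow, no digits, or trailing characters */
    if (value < 1 || value > 25)
        return -1;  /* outside the allowed range */
    *out = (int)value;
    return 0;
}

/*
 * Open a file and print the system's reason on failure, which separates
 * cases such as a missing file from a permission problem.
 */
FILE *open_or_report(const char *path, const char *mode)
{
    FILE *fp = fopen(path, mode);
    if (fp == NULL)
        fprintf(stderr, "Error: cannot open '%s': %s\n", path, strerror(errno));
    return fp;
}
```

Using strtol instead of atoi matters here because atoi cannot distinguish the input "0" from text that is not a number at all.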
5
Bonus Features (Optional)
  • Vigenère Cipher: Polyalphabetic cipher using keyword
  • Progress Bar: Show encryption progress for large files
  • Checksum: Add integrity verification (CRC32 or simple checksum)
  • Batch Mode: Encrypt multiple files with wildcard pattern
  • Key from File: Read encryption key from file instead of command line
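For the Vigenère bonus, each letter of the keyword selects a Caesar shift (A = 0 through Z = 25), and the keyword repeats as needed. The sketch below works on a string in memory to keep the idea visible; the function name and the choice that non-letters do not consume a keyword letter are assumptions, and a file-based version would need to carry the keyword position across buffers.

```c
#include <stddef.h>
#include <string.h>

/*
 * Vigenere-transform a NUL-terminated string in place.
 * Non-letters pass through unchanged and do not advance the keyword.
 * Assumes ASCII and a keyword made only of letters; an empty keyword
 * leaves the text unchanged.
 */
void vigenere(char *text, const char *keyword, int decrypt)
{
    size_t key_len = strlen(keyword);
    size_t k = 0;  /* number of letters processed so far */

    if (key_len == 0)
        return;

    for (size_t i = 0; text[i] != '\0'; i++) {
        char c = text[i];
        char base;

        if (c >= 'A' && c <= 'Z')      base = 'A';
        else if (c >= 'a' && c <= 'z') base = 'a';
        else continue;

        char kc = keyword[k % key_len];
        int shift = (kc >= 'a') ? kc - 'a' : kc - 'A';
        if (decrypt)
            shift = 26 - shift;  /* inverse shift, kept non-negative */

        text[i] = (char)(base + (c - base + shift) % 26);
        k++;
    }
}
```

For example, encrypting "ATTACKATDAWN" with the keyword "LEMON" yields "LXFOPVEFRNHR", and calling the function again with `decrypt` set to 1 restores the original text.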
06

Example Usage

Here are example commands showing how your program should work:

XOR Encryption
# Encrypt a file with XOR cipher
$ ./encrypt -a xor -k "MySecretKey123" -i sample.txt -o sample.enc
Encrypting sample.txt with XOR cipher...
Done! Encrypted 1024 bytes to sample.enc

# Decrypt the file (same command, add -d flag)
$ ./encrypt -a xor -k "MySecretKey123" -d -i sample.enc -o sample_decrypted.txt
Decrypting sample.enc with XOR cipher...
Done! Decrypted 1024 bytes to sample_decrypted.txt
Caesar Cipher
# Encrypt with Caesar cipher (shift of 5)
$ ./encrypt -a caesar -s 5 -i message.txt -o message.enc
Encrypting message.txt with Caesar cipher (shift: 5)...
Done! Encrypted 256 bytes to message.enc

# Decrypt (use -d flag)
$ ./encrypt -a caesar -s 5 -d -i message.enc -o message_plain.txt
Decrypting message.enc with Caesar cipher (shift: 5)...
Done! Decrypted 256 bytes to message_plain.txt
Help and Errors
# Display help
$ ./encrypt --help
File Encryption Tool v1.0

Usage: encrypt [OPTIONS]

Options:
  -a, --algorithm ALG   Encryption algorithm (xor, caesar, vigenere)
  -k, --key KEY         Encryption key (for xor, vigenere)
  -s, --shift N         Shift value 1-25 (for caesar)
  -i, --input FILE      Input file path
  -o, --output FILE     Output file path
  -d, --decrypt         Decrypt mode (default: encrypt)
  -h, --help            Show this help message
      --version         Show version information

Examples:
  encrypt -a xor -k "secret" -i input.txt -o output.enc
  encrypt -a caesar -s 3 -d -i encrypted.txt -o plain.txt

# Error handling
$ ./encrypt -a xor -k "secret" -i nonexistent.txt -o output.enc
Error: File 'nonexistent.txt' not found.

$ ./encrypt -a caesar -s 30 -i input.txt -o output.enc
Error: Shift value must be between 1 and 25.
07

Submission Requirements

Create a public GitHub repository with the exact name shown below:

Required Repository Name
c-file-encryption
github.com/<your-username>/c-file-encryption
Required Project Structure
include/
  • encryption.h
  • decryption.h
  • file_handler.h
  • key_manager.h
  • utils.h
src/
  • main.c
  • xor_cipher.c
  • caesar_cipher.c
  • vigenere_cipher.c (bonus)
  • file_handler.c
  • utils.c
Other
  • data/ (test files)
  • tests/ (test cases)
  • Makefile
  • README.md
README.md Required Sections
1. Project Header
  • Project title and description
  • Your full name and submission date
  • Course and project number
2. Features
  • List of implemented algorithms
  • Supported file types
  • CLI options available
3. Build Instructions
  • Prerequisites (GCC, make)
  • How to compile with make
  • How to run the program
4. Usage Examples
  • XOR encryption/decryption commands
  • Caesar cipher examples
  • Error handling examples
5. Algorithm Details
  • Brief explanation of each cipher
  • Security considerations
  • Limitations of classic ciphers
6. Testing
  • How to run tests
  • Test files used
  • Expected vs actual output verification
08

Grading Rubric

Your project will be evaluated on the following criteria (450 points total):

Category | Criteria | Points
XOR Cipher | Correct encryption/decryption | 40
XOR Cipher | Variable key length support | 20
XOR Cipher | Binary file handling | 20
Caesar Cipher | Correct shift implementation | 40
Caesar Cipher | Case preservation | 15
Caesar Cipher | Non-alpha character handling | 15
CLI Interface | Argument parsing works correctly | 30
CLI Interface | Help message complete | 15
CLI Interface | Error messages are clear | 15
File Handling | Correct file read/write | 30
File Handling | Large file support (streaming) | 20
File Handling | Error handling (file not found, etc.) | 20
Code Quality | Modular structure (separate files) | 25
Code Quality | Clear comments and documentation | 20
Code Quality | No memory leaks | 20
Documentation | README completeness | 40
Documentation | Usage examples work correctly | 15
Bonus | Vigenère cipher, progress bar, checksum, etc. | +50
Total | | 450 (+50 bonus)
Excellent (400+): Exceeds all requirements with exceptional quality
Good (350-399): Meets all requirements with good quality
Satisfactory (270-349): Meets minimum requirements
Needs Work (< 270): Missing key requirements

09

Pre-Submission Checklist

Use this checklist to verify you have completed all requirements before submitting your project.

XOR Cipher
Caesar Cipher
CLI Interface
Repository Requirements
Final Check: Encrypt a test file, then decrypt it, and verify the output matches the original exactly. Compare the two files byte for byte with diff or cmp on Linux/macOS, or fc /b on Windows.
10

Common Issues and Solutions

Encountering problems? Don't worry! Here are the most common issues students face and how to resolve them quickly.

Encryption/Decryption Mismatch
Problem

Decrypted file doesn't match original - characters are garbled

Solution

Ensure you're using the exact same key. For XOR, check key length repeating logic. For Caesar, use negative shift for decryption.

Tip: Start with a simple test: encrypt "ABC" and manually verify the output
Binary File Corruption
Problem

Binary files (images, executables) get corrupted after decryption

Solution

Always use binary mode ("rb", "wb") not text mode. Text mode may translate newlines:

FILE *f = fopen(filename, "rb"); // Not "r"
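Important: On Windows, text mode converts \n to \r\n when writing and \r\n to \n when reading, and may treat the byte 0x1A (Ctrl-Z) as end of file, so binary data is altered or truncated.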
Segmentation Fault on Large Files
Problem

Program crashes when processing large files (> 1MB)

Solution

Use buffered reading instead of loading entire file. Process in chunks:

unsigned char buffer[4096];
size_t bytes;
while ((bytes = fread(buffer, 1, sizeof buffer, in)) > 0) {...}
Debug: Use valgrind ./encrypt ... to find memory issues
Caesar Cipher Wrapping Errors
Problem

Letters near end of alphabet (X, Y, Z) don't wrap correctly

Solution

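Use modulo arithmetic, and add 26 before the final remainder so that a negative (decryption) shift still wraps; in C, % can return a negative result. For uppercase:

encrypted = ((c - 'A' + shift) % 26 + 26) % 26 + 'A';
Test: Verify Z + 1 = A and A - 1 = Z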
Argument Parsing Fails
Problem

Program crashes or ignores arguments when parsing CLI options

Solution

Check bounds before accessing argv[i+1]. Use strcmp() for string comparison:

if (i + 1 < argc && strcmp(argv[i], "-k") == 0)
Alternative: Consider using getopt() for robust parsing
Makefile Build Errors
Problem

"make: Nothing to be done" or linker errors

Solution

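Use tabs (not spaces) to indent recipe lines; GNU make reports "missing separator" otherwise. List every .c file in the build, and run make clean before rebuilding if make reports "Nothing to be done". To rule out Makefile problems, compile directly: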

gcc -o encrypt src/*.c -I include
Note: Makefiles require TAB characters, not spaces for indentation
Still Having Issues?

Check the course discussion forum or reach out for help