Project Overview
This advanced capstone project challenges you to build a professional image processing library from scratch. You will test your algorithms against the Intel Image Classification Dataset from Kaggle, which contains 25,000 images at 150x150 resolution across 6 categories (buildings, forest, glacier, mountain, sea, street). The library must demonstrate proficiency in pixel manipulation, convolution filters, color theory, and geometric transformations using modern C++17/20 features.
- Image I/O: Load and save PNG, JPG, BMP with proper error handling
- Filters: Convolution-based blur, sharpen, emboss, edge detection
- Color Spaces: Accurate RGB, HSV, HSL, and grayscale conversions
- Transforms: Scale, rotate, flip, crop with interpolation
Learning Objectives
Technical Skills
- Master 2D array manipulation and pixel-level operations
- Implement convolution kernels for various filter effects
- Build thread-safe parallel image processing pipelines
- Design efficient memory management for large images
- Create flexible template-based image container classes
Image Processing Skills
- Understand color models and their mathematical relationships
- Implement edge detection using Sobel and Canny algorithms
- Apply geometric transformations with interpolation
- Create histogram equalization for image enhancement
- Build a composable filter pipeline architecture
Company Scenario
PixelPerfect Labs
You have been hired as a Software Engineer at PixelPerfect Labs, a computer vision startup developing image analysis tools for photographers and content creators. The company needs a high-performance, cross-platform image processing library that can be integrated into their desktop application suite. They require a library that's lightweight, fast, and doesn't depend on heavy frameworks like OpenCV.
"We need a lean, mean image processing library written in pure C++. It should handle common operations like filters, color adjustments, and transformations efficiently. Our users process thousands of images daily, so performance is critical. Can you build something that processes a 4K image in under 100ms?"
Technical Requirements
- Process 1920x1080 image in < 50ms for basic filters
- Edge detection on 4K image in < 200ms
- Memory usage < 4x image size during processing
- Support for multi-threaded batch processing
- PNG, JPG, BMP file format support
- At least 10 different filter effects
- Geometric transformations with interpolation
- Color space conversions (RGB, HSV, HSL)
- Header-only or static library option
- Template-based for different pixel formats
- RAII for all resource management
- Exception-safe operations
- Unit tests with > 80% code coverage
- Comprehensive documentation
- Example programs demonstrating features
- Cross-platform (Windows, Linux, macOS)
Example of the intended fluent API:
image.blur(5).grayscale().rotate(45).save("output.png");
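The one-liner above implies a chaining (fluent) API. One common way to support it is to have each operation return a reference to the image so calls compose left to right. This is only a sketch of the pattern; the class and member names are illustrative, and the string trace stands in for real pixel data:

```cpp
#include <cassert>
#include <string>

// Minimal sketch of a fluent wrapper: each mutating operation
// returns *this so calls can be chained.
class FluentImage {
public:
    FluentImage& blur(int radius)      { ops_ += "blur(" + std::to_string(radius) + ")."; return *this; }
    FluentImage& grayscale()           { ops_ += "grayscale()."; return *this; }
    FluentImage& rotate(float degrees) { ops_ += "rotate(" + std::to_string(int(degrees)) + ")."; return *this; }
    bool save(const std::string&)      { return true; }  // real version would encode and write the file
    const std::string& trace() const   { return ops_; }  // for inspection in this sketch only
private:
    std::string ops_;  // placeholder for actual image state
};
```

Usage matches the example: `FluentImage img; img.blur(5).grayscale().rotate(45).save("output.png");`. Note that returning `*this` mutates in place; a value-returning variant would instead copy, trading performance for safer composition.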
Test Data & Benchmarks
Download the test suite and benchmark data to validate your image processing library implementation. The data includes performance targets, test scenarios, and expected outputs:
Test Suite Download
Download the image processing test data files containing filter benchmarks, transformation test cases, and color accuracy targets for validation.
Original Data Source
This project uses test images from the Intel Image Classification Dataset from Kaggle - containing 25,000 images of natural scenes (150x150 pixels) across 6 categories (buildings, forest, glacier, mountain, sea, street). These images are perfect for testing your image processing algorithms on real-world photography.
Test Data Schema
Performance Test Scenarios
| Column | Type | Description |
|---|---|---|
| test_id | String | Unique test identifier (e.g., FLT_001, TRF_015) |
| category | String | filter, transform, color, edge, histogram |
| test_name | String | Descriptive test name |
| input_width | Integer | Input image width (64-4096) |
| input_height | Integer | Input image height (64-4096) |
| operation | String | blur, sharpen, grayscale, rotate, etc. |
| parameters | String | JSON parameters for the operation |
| expected_time_ms | Float | Maximum allowed processing time (ms) |
| accuracy_threshold | Float | Minimum accuracy percentage (PSNR/SSIM) |
| priority | String | critical, high, medium, low |
Filter Benchmarks
| Column | Type | Description |
|---|---|---|
| filter_id | String | Unique filter test ID |
| filter_type | String | blur, sharpen, emboss, edge_detect, etc. |
| kernel_size | Integer | Kernel size (3, 5, 7, 9) |
| image_size | String | Image dimensions (e.g., 1920x1080) |
| target_fps | Float | Minimum frames per second |
| expected_output_hash | String | MD5 hash of expected output |
| tolerance | Float | Pixel value tolerance (0.0-1.0) |
Color Accuracy Tests
| Column | Type | Description |
|---|---|---|
| color_test_id | String | Unique color test ID |
| source_space | String | RGB, HSV, HSL, CMYK, Grayscale |
| target_space | String | Target color space |
| input_values | String | JSON array of input color values |
| expected_values | String | JSON array of expected output values |
| precision | Integer | Decimal precision required (2-6) |
| round_trip | Boolean | Whether round-trip accuracy is tested |
Transformation Tests
| Column | Type | Description |
|---|---|---|
| transform_test_id | String | Unique transform test ID |
| operation | String | scale, rotate, flip, crop, skew |
| input_dimensions | String | Input image size |
| parameters | String | JSON parameters (angle, scale factor, etc.) |
| interpolation | String | nearest, bilinear, bicubic |
| expected_dimensions | String | Expected output size |
| psnr_threshold | Float | Minimum PSNR value (dB) |
Project Requirements
Your image processing library must include all of the following systems. Structure your code with clean separation between core components and user-facing API.
Image I/O System
File Format Support:
- PNG loading and saving with alpha channel support
- JPEG loading and saving with quality settings
- BMP loading and saving (24-bit and 32-bit)
- Graceful error handling for corrupted files
Image Container:
- Image<T> template class supporting various pixel types
- Support for 8-bit, 16-bit, and floating-point channels
- Efficient copy and move semantics
- Iterator support for pixel-level operations
Filter System
Convolution Filters:
- Box blur with configurable kernel size
- Gaussian blur with sigma parameter
- Sharpen filter with intensity control
- Emboss effect with direction options
- Custom kernel support
Edge Detection:
- Sobel operator (horizontal, vertical, combined)
- Prewitt operator
- Laplacian filter
- Canny edge detector with thresholds
Color Processing
Color Space Conversions:
- RGB ↔ HSV conversion with accurate formulas
- RGB ↔ HSL conversion
- RGB → Grayscale (luminance-weighted)
- Sepia tone effect
- Color inversion (negative)
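The RGB → HSV direction can be sketched with the standard hexcone formulas. This is one possible implementation, not the required one; the HSV struct and function name are illustrative, with hue in [0, 360) and saturation/value in [0, 1]:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

struct HSV { float h, s, v; };  // h in degrees [0,360), s and v in [0,1]

// Convert 8-bit RGB to HSV using the standard hexcone formulas.
HSV rgbToHsv(uint8_t r8, uint8_t g8, uint8_t b8) {
    float r = r8 / 255.0f, g = g8 / 255.0f, b = b8 / 255.0f;
    float mx = std::max({r, g, b});
    float mn = std::min({r, g, b});
    float delta = mx - mn;

    float h = 0.0f;  // hue is undefined for grays; 0 by convention
    if (delta > 0.0f) {
        if (mx == r)      h = 60.0f * std::fmod((g - b) / delta, 6.0f);
        else if (mx == g) h = 60.0f * ((b - r) / delta + 2.0f);
        else              h = 60.0f * ((r - g) / delta + 4.0f);
        if (h < 0.0f) h += 360.0f;  // fmod can return negative values
    }
    float s = (mx > 0.0f) ? delta / mx : 0.0f;
    return {h, s, mx};
}
```

The round-trip tests in the schema above imply you also need the HSV → RGB inverse; the pair should reproduce inputs to within the listed decimal precision.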
Color Adjustments:
- Brightness adjustment
- Contrast adjustment
- Saturation control
- Hue rotation
- Gamma correction
Geometric Transformations
Basic Transformations:
- Scale (up/down) with multiple interpolation methods
- Rotation by arbitrary angle
- Horizontal and vertical flip
- Crop to region of interest
Interpolation Methods:
- Nearest neighbor (fast)
- Bilinear interpolation
- Bicubic interpolation (high quality)
Library Architecture
Design your library with a clean, modular architecture. The core should be template-based for flexibility, with a simple user-facing API for common operations.
Core Image Class
// Pixel type representing RGBA color
struct RGBA {
uint8_t r, g, b, a;
RGBA() : r(0), g(0), b(0), a(255) {}
RGBA(uint8_t r, uint8_t g, uint8_t b, uint8_t a = 255)
: r(r), g(g), b(b), a(a) {}
// Convert to grayscale using luminance weights
uint8_t luminance() const {
return static_cast<uint8_t>(0.299 * r + 0.587 * g + 0.114 * b);
}
// Blend with another color
RGBA blend(const RGBA& other, float alpha) const {
return RGBA(
static_cast<uint8_t>(r * (1 - alpha) + other.r * alpha),
static_cast<uint8_t>(g * (1 - alpha) + other.g * alpha),
static_cast<uint8_t>(b * (1 - alpha) + other.b * alpha),
static_cast<uint8_t>(a * (1 - alpha) + other.a * alpha)
);
}
};
template<typename PixelType = RGBA>
class Image {
public:
Image() : width_(0), height_(0) {}
Image(size_t width, size_t height)
: width_(width), height_(height), pixels_(width * height) {}
// Load from file
static Image load(const std::string& filename);
// Save to file
bool save(const std::string& filename) const;
// Accessors
size_t width() const { return width_; }
size_t height() const { return height_; }
bool empty() const { return pixels_.empty(); }
// Pixel access
PixelType& at(size_t x, size_t y) {
return pixels_[y * width_ + x];
}
const PixelType& at(size_t x, size_t y) const {
return pixels_[y * width_ + x];
}
// Safe access with bounds checking
PixelType get(int x, int y, PixelType default_val = PixelType()) const {
if (x < 0 || x >= static_cast<int>(width_) ||
y < 0 || y >= static_cast<int>(height_)) {
return default_val;
}
return at(x, y);
}
// Iterator support
auto begin() { return pixels_.begin(); }
auto end() { return pixels_.end(); }
auto begin() const { return pixels_.cbegin(); }
auto end() const { return pixels_.cend(); }
// Raw data access
PixelType* data() { return pixels_.data(); }
const PixelType* data() const { return pixels_.data(); }
private:
size_t width_, height_;
std::vector<PixelType> pixels_;
};
Filter Pipeline
// Base filter interface
class IFilter {
public:
virtual ~IFilter() = default;
virtual Image<RGBA> apply(const Image<RGBA>& input) const = 0;
virtual std::string name() const = 0;
};
// Convolution kernel base
class ConvolutionFilter : public IFilter {
public:
ConvolutionFilter(std::vector<std::vector<float>> kernel)
: kernel_(std::move(kernel)) {
// Normalize kernel
float sum = 0;
for (const auto& row : kernel_) {
for (float val : row) sum += val;
}
if (std::abs(sum) > 0.001f) {
for (auto& row : kernel_) {
for (float& val : row) val /= sum;
}
}
}
Image<RGBA> apply(const Image<RGBA>& input) const override {
int kh = kernel_.size();
int kw = kernel_[0].size();
int pad_h = kh / 2;
int pad_w = kw / 2;
Image<RGBA> output(input.width(), input.height());
// Parallel processing with OpenMP
#pragma omp parallel for
for (int y = 0; y < static_cast<int>(input.height()); ++y) {
for (int x = 0; x < static_cast<int>(input.width()); ++x) {
float r = 0, g = 0, b = 0;
for (int ky = 0; ky < kh; ++ky) {
for (int kx = 0; kx < kw; ++kx) {
auto pixel = input.get(x + kx - pad_w, y + ky - pad_h);
float weight = kernel_[ky][kx];
r += pixel.r * weight;
g += pixel.g * weight;
b += pixel.b * weight;
}
}
output.at(x, y) = RGBA(
std::clamp(static_cast<int>(r), 0, 255),
std::clamp(static_cast<int>(g), 0, 255),
std::clamp(static_cast<int>(b), 0, 255),
input.at(x, y).a
);
}
}
return output;
}
protected:
std::vector<std::vector<float>> kernel_;
};
// Specific filter implementations
class GaussianBlur : public ConvolutionFilter {
public:
GaussianBlur(int size = 5, float sigma = 1.0f)
: ConvolutionFilter(createGaussianKernel(size, sigma)) {}
std::string name() const override { return "Gaussian Blur"; }
private:
static std::vector<std::vector<float>> createGaussianKernel(int size, float sigma) {
std::vector<std::vector<float>> kernel(size, std::vector<float>(size));
int center = size / 2;
float sum = 0;
for (int y = 0; y < size; ++y) {
for (int x = 0; x < size; ++x) {
int dx = x - center;
int dy = y - center;
kernel[y][x] = std::exp(-(dx*dx + dy*dy) / (2 * sigma * sigma));
sum += kernel[y][x];
}
}
// Normalize
for (auto& row : kernel) {
for (float& val : row) val /= sum;
}
return kernel;
}
};
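The kernel generator above is worth testing in isolation. A standalone version of the same math (outside the class, so it compiles on its own) should produce weights that sum to 1, peak at the center, and are symmetric:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Standalone version of the Gaussian kernel builder used by GaussianBlur.
std::vector<std::vector<float>> gaussianKernel(int size, float sigma) {
    std::vector<std::vector<float>> kernel(size, std::vector<float>(size));
    int center = size / 2;
    float sum = 0.0f;
    for (int y = 0; y < size; ++y) {
        for (int x = 0; x < size; ++x) {
            int dx = x - center, dy = y - center;
            kernel[y][x] = std::exp(-(dx * dx + dy * dy) / (2.0f * sigma * sigma));
            sum += kernel[y][x];
        }
    }
    for (auto& row : kernel)
        for (float& v : row) v /= sum;  // normalize so the filter preserves brightness
    return kernel;
}
```

A unit test along these lines (sum ≈ 1, center dominates, symmetry) catches most sign and indexing mistakes before you ever look at an output image.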
Filter Algorithms
Implement these filter algorithms with proper convolution. Each filter should produce visually correct results matching standard image processing tools.
Blur Filter Kernels
Box Blur (3×3)
// All weights equal
1/9 * [1 1 1]
[1 1 1]
[1 1 1]
Gaussian Blur (5×5)
// Center-weighted
1/256 * [1 4 6 4 1]
[4 16 24 16 4]
[6 24 36 24 6]
[4 16 24 16 4]
[1 4 6 4 1]
Motion Blur (5×5)
// Diagonal direction
1/5 * [1 0 0 0 0]
[0 1 0 0 0]
[0 0 1 0 0]
[0 0 0 1 0]
[0 0 0 0 1]
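To see one of these kernels in action without the full library, here is a minimal single-channel 3×3 convolution (zero padding outside the image, matching the get() fallback behavior described earlier), which can be fed the box blur weights. The function name is illustrative:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Minimal single-channel 3x3 convolution with zero padding at the borders.
std::vector<float> convolve3x3(const std::vector<float>& img, int w, int h,
                               const float kernel[3][3]) {
    std::vector<float> out(img.size(), 0.0f);
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x) {
            float acc = 0.0f;
            for (int ky = -1; ky <= 1; ++ky)
                for (int kx = -1; kx <= 1; ++kx) {
                    int sx = x + kx, sy = y + ky;
                    if (sx >= 0 && sx < w && sy >= 0 && sy < h)
                        acc += img[sy * w + sx] * kernel[ky + 1][kx + 1];
                }
            out[y * w + x] = acc;
        }
    return out;
}
```

Note how zero padding darkens the borders of a box blur (corners only see 4 of the 9 taps); replicate or mirror padding are common alternatives.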
Edge Detection Kernels
Sobel Operator
// Horizontal (Gx) Vertical (Gy)
[-1 0 1] [-1 -2 -1]
[-2 0 2] [ 0 0 0]
[-1 0 1] [ 1 2 1]
// Magnitude: sqrt(Gx² + Gy²)
// Direction: atan2(Gy, Gx)
Laplacian Operator
// 4-connectivity 8-connectivity
[ 0 1 0] [ 1 1 1]
[ 1 -4 1] [ 1 -8 1]
[ 0 1 0] [ 1 1 1]
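The Sobel magnitude formula above can be sketched at a single interior pixel of a single-channel image; this is a hand-unrolled version of the Gx and Gy kernels (the function name is illustrative, and no bounds checking is done since the pixel is assumed interior):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Apply the Sobel operator at one interior pixel of a single-channel
// image (row-major, width w) and return sqrt(Gx^2 + Gy^2).
float sobelMagnitude(const std::vector<float>& img, int w, int x, int y) {
    auto p = [&](int px, int py) { return img[py * w + px]; };
    // Horizontal gradient Gx: [-1 0 1; -2 0 2; -1 0 1]
    float gx = -p(x-1, y-1) + p(x+1, y-1)
             - 2*p(x-1, y)  + 2*p(x+1, y)
             -  p(x-1, y+1) + p(x+1, y+1);
    // Vertical gradient Gy: [-1 -2 -1; 0 0 0; 1 2 1]
    float gy = -p(x-1, y-1) - 2*p(x, y-1) - p(x+1, y-1)
             +  p(x-1, y+1) + 2*p(x, y+1) + p(x+1, y+1);
    return std::sqrt(gx * gx + gy * gy);
}
```

A vertical step edge gives a pure horizontal response (Gy = 0), which is a cheap correctness check for the kernel orientation.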
Canny Edge Detection Algorithm
Image<uint8_t> cannyEdgeDetection(const Image<RGBA>& input,
float low_threshold,
float high_threshold) {
// Step 1: Convert to grayscale
auto gray = toGrayscale(input);
// Step 2: Apply Gaussian blur to reduce noise
auto blurred = GaussianBlur(5, 1.4f).apply(gray);
// Step 3: Compute gradient magnitude and direction (Sobel)
auto [magnitude, direction] = sobelGradient(blurred);
// Step 4: Non-maximum suppression
auto suppressed = nonMaximumSuppression(magnitude, direction);
// Step 5: Double threshold
auto edges = doubleThreshold(suppressed, low_threshold, high_threshold);
// Step 6: Edge tracking by hysteresis
return hysteresisTracking(edges);
}
Special Effect Kernels
Sharpen
[ 0 -1 0]
[-1 5 -1]
[ 0 -1 0]
Emboss
[-2 -1 0]
[-1 1 1]
[ 0 1 2]
Unsharp Mask
// Equivalent to 2*identity - Gaussian(5x5), i.e. original + (original - blurred)
1/256 * [-1 -4 -6 -4 -1]
[-4 -16 -24 -16 -4]
[-6 -24 476 -24 -6]
[-4 -16 -24 -16 -4]
[-1 -4 -6 -4 -1]
Geometric Transformations
Implement geometric transformations with proper interpolation for high-quality results. The choice of interpolation method significantly affects output quality and performance.
Transformation Implementation
// Interpolation types
enum class Interpolation {
Nearest, // Fastest, pixelated results
Bilinear, // Good balance of quality and speed
Bicubic // Highest quality, slower
};
// Bilinear interpolation
RGBA bilinearInterpolate(const Image<RGBA>& img, float x, float y) {
int x0 = static_cast<int>(std::floor(x));
int y0 = static_cast<int>(std::floor(y));
int x1 = x0 + 1;
int y1 = y0 + 1;
float fx = x - x0;
float fy = y - y0;
auto p00 = img.get(x0, y0);
auto p10 = img.get(x1, y0);
auto p01 = img.get(x0, y1);
auto p11 = img.get(x1, y1);
// Interpolate along x
auto top = lerp(p00, p10, fx);
auto bottom = lerp(p01, p11, fx);
// Interpolate along y
return lerp(top, bottom, fy);
}
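The lerp helper used above is not shown. A per-channel version consistent with the RGBA struct might look like this (the struct is repeated here, without its constructors, so the snippet stands alone):

```cpp
#include <cassert>
#include <cstdint>

struct RGBA {
    uint8_t r, g, b, a;
};

// Linear interpolation between two pixels, per channel.
// t = 0 returns p, t = 1 returns q; the +0.5f rounds to nearest.
RGBA lerp(const RGBA& p, const RGBA& q, float t) {
    auto mix = [t](uint8_t u, uint8_t v) {
        return static_cast<uint8_t>(u + (v - u) * t + 0.5f);
    };
    return {mix(p.r, q.r), mix(p.g, q.g), mix(p.b, q.b), mix(p.a, q.a)};
}
```

Interpolating 8-bit channels directly is fine for blur and resize, but note that gamma-encoded sRGB values are not linear light; high-accuracy pipelines linearize first.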
// Rotation transform
Image<RGBA> rotate(const Image<RGBA>& input, float angle_degrees,
Interpolation interp = Interpolation::Bilinear) {
float angle = angle_degrees * M_PI / 180.0f;
float cos_a = std::cos(angle);
float sin_a = std::sin(angle);
// Calculate new dimensions
int w = input.width();
int h = input.height();
int new_w = static_cast<int>(std::abs(w * cos_a) + std::abs(h * sin_a));
int new_h = static_cast<int>(std::abs(w * sin_a) + std::abs(h * cos_a));
Image<RGBA> output(new_w, new_h);
float cx = w / 2.0f;
float cy = h / 2.0f;
float ncx = new_w / 2.0f;
float ncy = new_h / 2.0f;
#pragma omp parallel for
for (int y = 0; y < new_h; ++y) {
for (int x = 0; x < new_w; ++x) {
// Map from output to input coordinates (inverse mapping)
float dx = x - ncx;
float dy = y - ncy;
float src_x = cos_a * dx + sin_a * dy + cx;
float src_y = -sin_a * dx + cos_a * dy + cy;
if (src_x >= 0 && src_x < w - 1 && src_y >= 0 && src_y < h - 1) {
switch (interp) {
case Interpolation::Nearest:
output.at(x, y) = input.at(
static_cast<int>(std::round(src_x)),
static_cast<int>(std::round(src_y))
);
break;
case Interpolation::Bilinear:
output.at(x, y) = bilinearInterpolate(input, src_x, src_y);
break;
case Interpolation::Bicubic:
output.at(x, y) = bicubicInterpolate(input, src_x, src_y);
break;
}
}
}
}
return output;
}
// Scale transform
Image<RGBA> scale(const Image<RGBA>& input, float scale_x, float scale_y,
Interpolation interp = Interpolation::Bilinear) {
int new_w = static_cast<int>(input.width() * scale_x);
int new_h = static_cast<int>(input.height() * scale_y);
Image<RGBA> output(new_w, new_h);
#pragma omp parallel for
for (int y = 0; y < new_h; ++y) {
for (int x = 0; x < new_w; ++x) {
float src_x = x / scale_x;
float src_y = y / scale_y;
// Apply interpolation
output.at(x, y) = interpolate(input, src_x, src_y, interp);
}
}
return output;
}
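The interpolate(...) call above is a dispatcher over the three methods. Stripped down to nearest neighbor on a single channel, the same output-to-input (inverse) mapping structure looks like this; the function name is illustrative, and a production version would clamp src_x/src_y for non-integer scale factors:

```cpp
#include <cassert>
#include <vector>

// Nearest-neighbor scale of a single-channel row-major image using
// the same inverse mapping as the RGBA version above.
std::vector<int> scaleNearest(const std::vector<int>& img, int w, int h,
                              float sx, float sy, int& out_w, int& out_h) {
    out_w = static_cast<int>(w * sx);
    out_h = static_cast<int>(h * sy);
    std::vector<int> out(out_w * out_h);
    for (int y = 0; y < out_h; ++y)
        for (int x = 0; x < out_w; ++x) {
            // Map each output pixel back to its source pixel
            int src_x = static_cast<int>(x / sx);
            int src_y = static_cast<int>(y / sy);
            out[y * out_w + x] = img[src_y * w + src_x];
        }
    return out;
}
```

Iterating over output pixels and mapping backwards (rather than scattering input pixels forward) guarantees every output pixel gets exactly one value, which is why both rotate and scale above are written this way.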
Submission Requirements
Create a public GitHub repository with the exact name shown below:
Required Repository Name
cpp-image-processor
Required Project Structure
cpp-image-processor/
├── include/
│ └── imgproc/
│ ├── image.hpp # Core image class
│ ├── filters.hpp # Filter implementations
│ ├── transforms.hpp # Geometric transforms
│ ├── color.hpp # Color space conversions
│ └── io.hpp # File I/O operations
├── src/
│ ├── filters.cpp
│ ├── transforms.cpp
│ ├── color.cpp
│ └── io.cpp
├── tests/
│ ├── test_filters.cpp
│ ├── test_transforms.cpp
│ ├── test_color.cpp
│ └── test_io.cpp
├── examples/
│ ├── basic_filters.cpp
│ ├── edge_detection.cpp
│ └── batch_processing.cpp
├── docs/
│ └── API.md
├── CMakeLists.txt
└── README.md
Do Include
- All required header and source files
- CMake build configuration
- Unit tests for all major components
- Example programs demonstrating features
- API documentation
- Sample images for testing
Do Not Include
- Build artifacts (*.o, *.exe, build/)
- IDE-specific files (.vs/, .idea/)
- Large image datasets (> 10MB total)
- External libraries (use CMake FetchContent)
- Temporary or backup files
Enter your GitHub username - we will verify your repository automatically
Grading Rubric
Your project will be graded on the following criteria. Total: 700 points.
| Criteria | Points | Description |
|---|---|---|
| Image I/O | 100 | PNG, JPG, BMP support with proper error handling |
| Filter System | 150 | At least 10 filters including edge detection |
| Color Processing | 100 | Accurate color space conversions and adjustments |
| Transformations | 100 | Scale, rotate, flip with interpolation options |
| Performance | 100 | Meets timing requirements with parallel processing |
| Code Quality | 75 | Clean API, proper OOP, modern C++ practices |
| Testing & Docs | 75 | Unit tests, examples, and API documentation |
| Total | 700 | |
Grading Levels
- Excellent: Exceeds all requirements
- Good: Meets all requirements
- Satisfactory: Meets minimum requirements
- Needs Work: Missing key requirements
Ready to Submit?
Make sure you have completed all requirements and reviewed the grading rubric above.
Submit Your Project