Project Overview
This advanced capstone project challenges you to build a professional image processing library from scratch. You will test your algorithms against the Intel Image Classification Dataset from Kaggle, which contains 25,000 images at 150x150 resolution across 6 categories (buildings, forest, glacier, mountain, sea, street). The library must demonstrate proficiency in pixel manipulation, convolution filters, color theory, and geometric transformations using modern C++17/20 features.
- Image I/O: Load and save PNG, JPG, BMP with proper error handling
- Filters: Convolution-based blur, sharpen, emboss, edge detection
- Color Spaces: Accurate RGB, HSV, HSL, and grayscale conversions
- Transforms: Scale, rotate, flip, crop with interpolation
Learning Objectives
Technical Skills
- Master 2D array manipulation and pixel-level operations
- Implement convolution kernels for various filter effects
- Build thread-safe parallel image processing pipelines
- Design efficient memory management for large images
- Create flexible template-based image container classes
Image Processing Skills
- Understand color models and their mathematical relationships
- Implement edge detection using Sobel and Canny algorithms
- Apply geometric transformations with interpolation
- Create histogram equalization for image enhancement
- Build a composable filter pipeline architecture
Company Scenario
PixelPerfect Labs
You have been hired as a Software Engineer at PixelPerfect Labs, a computer vision startup developing image analysis tools for photographers and content creators. The company needs a high-performance, cross-platform image processing library that can be integrated into their desktop application suite. They require a library that's lightweight, fast, and doesn't depend on heavy frameworks like OpenCV.
"We need a lean, mean image processing library written in pure C++. It should handle common operations like filters, color adjustments, and transformations efficiently. Our users process thousands of images daily, so performance is critical. Can you build something that processes a 4K image in under 100ms?"
Technical Requirements
- Process 1920x1080 image in < 50ms for basic filters
- Edge detection on 4K image in < 200ms
- Memory usage < 4x image size during processing
- Support for multi-threaded batch processing
- PNG, JPG, BMP file format support
- At least 10 different filter effects
- Geometric transformations with interpolation
- Color space conversions (RGB, HSV, HSL)
- Header-only or static library option
- Template-based for different pixel formats
- RAII for all resource management
- Exception-safe operations
- Unit tests with > 80% code coverage
- Comprehensive documentation
- Example programs demonstrating features
- Cross-platform (Windows, Linux, macOS)
Example of the intended fluent API:
image.blur(5).grayscale().rotate(45).save("output.png");
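The one-liner above implies a chaining (fluent) API. One common way to support it is to have each operation return a reference to the image so calls compose left to right. This is only a sketch of the pattern; the class and member names are illustrative, and the string trace stands in for real pixel data:

```cpp
#include <cassert>
#include <string>

// Minimal sketch of a fluent wrapper: each mutating operation
// returns *this so calls can be chained.
class FluentImage {
public:
    FluentImage& blur(int radius)      { ops_ += "blur(" + std::to_string(radius) + ")."; return *this; }
    FluentImage& grayscale()           { ops_ += "grayscale()."; return *this; }
    FluentImage& rotate(float degrees) { ops_ += "rotate(" + std::to_string(int(degrees)) + ")."; return *this; }
    bool save(const std::string&)      { return true; }  // real version would encode and write the file
    const std::string& trace() const   { return ops_; }  // for inspection in this sketch only
private:
    std::string ops_;  // placeholder for actual image state
};
```

Usage matches the example: `FluentImage img; img.blur(5).grayscale().rotate(45).save("output.png");`. Note that returning `*this` mutates in place; a value-returning variant would instead copy, trading performance for safer composition.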
Test Data & Benchmarks
Download the test suite and benchmark data to validate your image processing library implementation. The data includes performance targets, test scenarios, and expected outputs:
Test Suite Download
Download the image processing test data files containing filter benchmarks, transformation test cases, and color accuracy targets for validation.
Original Data Source
This project uses test images from the Intel Image Classification Dataset from Kaggle - containing 25,000 images of natural scenes (150x150 pixels) across 6 categories (buildings, forest, glacier, mountain, sea, street). These images are perfect for testing your image processing algorithms on real-world photography.
Test Data Schema
Performance Test Scenarios
| Column | Type | Description |
|---|---|---|
| test_id | String | Unique test identifier (e.g., FLT_001, TRF_015) |
| category | String | filter, transform, color, edge, histogram |
| test_name | String | Descriptive test name |
| input_width | Integer | Input image width (64-4096) |
| input_height | Integer | Input image height (64-4096) |
| operation | String | blur, sharpen, grayscale, rotate, etc. |
| parameters | String | JSON parameters for the operation |
| expected_time_ms | Float | Maximum allowed processing time (ms) |
| accuracy_threshold | Float | Minimum accuracy percentage (PSNR/SSIM) |
| priority | String | critical, high, medium, low |
Filter Benchmarks
| Column | Type | Description |
|---|---|---|
| filter_id | String | Unique filter test ID |
| filter_type | String | blur, sharpen, emboss, edge_detect, etc. |
| kernel_size | Integer | Kernel size (3, 5, 7, 9) |
| image_size | String | Image dimensions (e.g., 1920x1080) |
| target_fps | Float | Minimum frames per second |
| expected_output_hash | String | MD5 hash of expected output |
| tolerance | Float | Pixel value tolerance (0.0-1.0) |
Color Accuracy Tests
| Column | Type | Description |
|---|---|---|
| color_test_id | String | Unique color test ID |
| source_space | String | RGB, HSV, HSL, CMYK, Grayscale |
| target_space | String | Target color space |
| input_values | String | JSON array of input color values |
| expected_values | String | JSON array of expected output values |
| precision | Integer | Decimal precision required (2-6) |
| round_trip | Boolean | Whether round-trip accuracy is tested |
Transformation Tests
| Column | Type | Description |
|---|---|---|
| transform_test_id | String | Unique transform test ID |
| operation | String | scale, rotate, flip, crop, skew |
| input_dimensions | String | Input image size |
| parameters | String | JSON parameters (angle, scale factor, etc.) |
| interpolation | String | nearest, bilinear, bicubic |
| expected_dimensions | String | Expected output size |
| psnr_threshold | Float | Minimum PSNR value (dB) |
Project Requirements
Your image processing library must include all of the following systems. Structure your code with clean separation between core components and user-facing API.
Image I/O System
File Format Support:
- PNG loading and saving with alpha channel support
- JPEG loading and saving with quality settings
- BMP loading and saving (24-bit and 32-bit)
- Graceful error handling for corrupted files
Image Container:
- Image<T> template class supporting various pixel types
- Support for 8-bit, 16-bit, and floating-point channels
- Efficient copy and move semantics
- Iterator support for pixel-level operations
Filter System
Convolution Filters:
- Box blur with configurable kernel size
- Gaussian blur with sigma parameter
- Sharpen filter with intensity control
- Emboss effect with direction options
- Custom kernel support
Edge Detection:
- Sobel operator (horizontal, vertical, combined)
- Prewitt operator
- Laplacian filter
- Canny edge detector with thresholds
Color Processing
Color Space Conversions:
- RGB ↔ HSV conversion with accurate formulas
- RGB ↔ HSL conversion
- RGB → Grayscale (luminance-weighted)
- Sepia tone effect
- Color inversion (negative)
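The RGB → HSV direction can be sketched with the standard hexcone formulas. This is one possible implementation, not the required one; the HSV struct and function name are illustrative, with hue in [0, 360) and saturation/value in [0, 1]:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

struct HSV { float h, s, v; };  // h in degrees [0,360), s and v in [0,1]

// Convert 8-bit RGB to HSV using the standard hexcone formulas.
HSV rgbToHsv(uint8_t r8, uint8_t g8, uint8_t b8) {
    float r = r8 / 255.0f, g = g8 / 255.0f, b = b8 / 255.0f;
    float mx = std::max({r, g, b});
    float mn = std::min({r, g, b});
    float delta = mx - mn;

    float h = 0.0f;  // hue is undefined for grays; 0 by convention
    if (delta > 0.0f) {
        if (mx == r)      h = 60.0f * std::fmod((g - b) / delta, 6.0f);
        else if (mx == g) h = 60.0f * ((b - r) / delta + 2.0f);
        else              h = 60.0f * ((r - g) / delta + 4.0f);
        if (h < 0.0f) h += 360.0f;  // fmod can return negative values
    }
    float s = (mx > 0.0f) ? delta / mx : 0.0f;
    return {h, s, mx};
}
```

The round-trip tests in the schema above imply you also need the HSV → RGB inverse; the pair should reproduce inputs to within the listed decimal precision.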
Color Adjustments:
- Brightness adjustment
- Contrast adjustment
- Saturation control
- Hue rotation
- Gamma correction
Geometric Transformations
Basic Transformations:
- Scale (up/down) with multiple interpolation methods
- Rotation by arbitrary angle
- Horizontal and vertical flip
- Crop to region of interest
Interpolation Methods:
- Nearest neighbor (fast)
- Bilinear interpolation
- Bicubic interpolation (high quality)
Library Architecture
Design your library with a clean, modular architecture. The core should be template-based for flexibility, with a simple user-facing API for common operations.
Core Image Class
// Pixel type representing RGBA color
struct RGBA {
uint8_t r, g, b, a;
RGBA() : r(0), g(0), b(0), a(255) {}
RGBA(uint8_t r, uint8_t g, uint8_t b, uint8_t a = 255)
: r(r), g(g), b(b), a(a) {}
// Convert to grayscale using luminance weights
uint8_t luminance() const {
return static_cast<uint8_t>(0.299 * r + 0.587 * g + 0.114 * b);
}
// Blend with another color
RGBA blend(const RGBA& other, float alpha) const {
return RGBA(
static_cast<uint8_t>(r * (1 - alpha) + other.r * alpha),
static_cast<uint8_t>(g * (1 - alpha) + other.g * alpha),
static_cast<uint8_t>(b * (1 - alpha) + other.b * alpha),
static_cast<uint8_t>(a * (1 - alpha) + other.a * alpha)
);
}
};
template<typename PixelType = RGBA>
class Image {
public:
Image() : width_(0), height_(0) {}
Image(size_t width, size_t height)
: width_(width), height_(height), pixels_(width * height) {}
// Load from file
static Image load(const std::string& filename);
// Save to file
bool save(const std::string& filename) const;
// Accessors
size_t width() const { return width_; }
size_t height() const { return height_; }
bool empty() const { return pixels_.empty(); }
// Pixel access
PixelType& at(size_t x, size_t y) {
return pixels_[y * width_ + x];
}
const PixelType& at(size_t x, size_t y) const {
return pixels_[y * width_ + x];
}
// Safe access with bounds checking
PixelType get(int x, int y, PixelType default_val = PixelType()) const {
if (x < 0 || x >= static_cast<int>(width_) ||
y < 0 || y >= static_cast<int>(height_)) {
return default_val;
}
return at(x, y);
}
// Iterator support
auto begin() { return pixels_.begin(); }
auto end() { return pixels_.end(); }
auto begin() const { return pixels_.cbegin(); }
auto end() const { return pixels_.cend(); }
// Raw data access
PixelType* data() { return pixels_.data(); }
const PixelType* data() const { return pixels_.data(); }
private:
size_t width_, height_;
std::vector<PixelType> pixels_;
};
Filter Pipeline
// Base filter interface
class IFilter {
public:
virtual ~IFilter() = default;
virtual Image<RGBA> apply(const Image<RGBA>& input) const = 0;
virtual std::string name() const = 0;
};
// Convolution kernel base
class ConvolutionFilter : public IFilter {
public:
ConvolutionFilter(std::vector<std::vector<float>> kernel)
: kernel_(std::move(kernel)) {
// Normalize kernel
float sum = 0;
for (const auto& row : kernel_) {
for (float val : row) sum += val;
}
if (std::abs(sum) > 0.001f) {
for (auto& row : kernel_) {
for (float& val : row) val /= sum;
}
}
}
Image<RGBA> apply(const Image<RGBA>& input) const override {
int kh = kernel_.size();
int kw = kernel_[0].size();
int pad_h = kh / 2;
int pad_w = kw / 2;
Image<RGBA> output(input.width(), input.height());
// Parallel processing with OpenMP
#pragma omp parallel for
for (int y = 0; y < static_cast<int>(input.height()); ++y) {
for (int x = 0; x < static_cast<int>(input.width()); ++x) {
float r = 0, g = 0, b = 0;
for (int ky = 0; ky < kh; ++ky) {
for (int kx = 0; kx < kw; ++kx) {
auto pixel = input.get(x + kx - pad_w, y + ky - pad_h);
float weight = kernel_[ky][kx];
r += pixel.r * weight;
g += pixel.g * weight;
b += pixel.b * weight;
}
}
output.at(x, y) = RGBA(
std::clamp(static_cast<int>(r), 0, 255),
std::clamp(static_cast<int>(g), 0, 255),
std::clamp(static_cast<int>(b), 0, 255),
input.at(x, y).a
);
}
}
return output;
}
protected:
std::vector<std::vector<float>> kernel_;
};
// Specific filter implementations
class GaussianBlur : public ConvolutionFilter {
public:
GaussianBlur(int size = 5, float sigma = 1.0f)
: ConvolutionFilter(createGaussianKernel(size, sigma)) {}
std::string name() const override { return "Gaussian Blur"; }
private:
static std::vector<std::vector<float>> createGaussianKernel(int size, float sigma) {
std::vector<std::vector<float>> kernel(size, std::vector<float>(size));
int center = size / 2;
float sum = 0;
for (int y = 0; y < size; ++y) {
for (int x = 0; x < size; ++x) {
int dx = x - center;
int dy = y - center;
kernel[y][x] = std::exp(-(dx*dx + dy*dy) / (2 * sigma * sigma));
sum += kernel[y][x];
}
}
// Normalize
for (auto& row : kernel) {
for (float& val : row) val /= sum;
}
return kernel;
}
};
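The kernel generator above is worth testing in isolation. A standalone version of the same math (outside the class, so it compiles on its own) should produce weights that sum to 1, peak at the center, and are symmetric:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Standalone version of the Gaussian kernel builder used by GaussianBlur.
std::vector<std::vector<float>> gaussianKernel(int size, float sigma) {
    std::vector<std::vector<float>> kernel(size, std::vector<float>(size));
    int center = size / 2;
    float sum = 0.0f;
    for (int y = 0; y < size; ++y) {
        for (int x = 0; x < size; ++x) {
            int dx = x - center, dy = y - center;
            kernel[y][x] = std::exp(-(dx * dx + dy * dy) / (2.0f * sigma * sigma));
            sum += kernel[y][x];
        }
    }
    for (auto& row : kernel)
        for (float& v : row) v /= sum;  // normalize so the filter preserves brightness
    return kernel;
}
```

A unit test along these lines (sum ≈ 1, center dominates, symmetry) catches most sign and indexing mistakes before you ever look at an output image.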
Filter Algorithms
Implement these filter algorithms with proper convolution. Each filter should produce visually correct results matching standard image processing tools.
Blur Filter Kernels
Box Blur (3×3)
// All weights equal
1/9 * [1 1 1]
[1 1 1]
[1 1 1]
Gaussian Blur (5×5)
// Center-weighted
1/256 * [1 4 6 4 1]
[4 16 24 16 4]
[6 24 36 24 6]
[4 16 24 16 4]
[1 4 6 4 1]
Motion Blur (5×5)
// Diagonal direction
1/5 * [1 0 0 0 0]
[0 1 0 0 0]
[0 0 1 0 0]
[0 0 0 1 0]
[0 0 0 0 1]
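To see one of these kernels in action without the full library, here is a minimal single-channel 3×3 convolution (zero padding outside the image, matching the get() fallback behavior described earlier), which can be fed the box blur weights. The function name is illustrative:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Minimal single-channel 3x3 convolution with zero padding at the borders.
std::vector<float> convolve3x3(const std::vector<float>& img, int w, int h,
                               const float kernel[3][3]) {
    std::vector<float> out(img.size(), 0.0f);
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x) {
            float acc = 0.0f;
            for (int ky = -1; ky <= 1; ++ky)
                for (int kx = -1; kx <= 1; ++kx) {
                    int sx = x + kx, sy = y + ky;
                    if (sx >= 0 && sx < w && sy >= 0 && sy < h)
                        acc += img[sy * w + sx] * kernel[ky + 1][kx + 1];
                }
            out[y * w + x] = acc;
        }
    return out;
}
```

Note how zero padding darkens the borders of a box blur (corners only see 4 of the 9 taps); replicate or mirror padding are common alternatives.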
Edge Detection Kernels
Sobel Operator
// Horizontal (Gx) Vertical (Gy)
[-1 0 1] [-1 -2 -1]
[-2 0 2] [ 0 0 0]
[-1 0 1] [ 1 2 1]
// Magnitude: sqrt(Gx² + Gy²)
// Direction: atan2(Gy, Gx)
Laplacian Operator
// 4-connectivity 8-connectivity
[ 0 1 0] [ 1 1 1]
[ 1 -4 1] [ 1 -8 1]
[ 0 1 0] [ 1 1 1]
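The Sobel magnitude formula above can be sketched at a single interior pixel of a single-channel image; this is a hand-unrolled version of the Gx and Gy kernels (the function name is illustrative, and no bounds checking is done since the pixel is assumed interior):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Apply the Sobel operator at one interior pixel of a single-channel
// image (row-major, width w) and return sqrt(Gx^2 + Gy^2).
float sobelMagnitude(const std::vector<float>& img, int w, int x, int y) {
    auto p = [&](int px, int py) { return img[py * w + px]; };
    // Horizontal gradient Gx: [-1 0 1; -2 0 2; -1 0 1]
    float gx = -p(x-1, y-1) + p(x+1, y-1)
             - 2*p(x-1, y)  + 2*p(x+1, y)
             -  p(x-1, y+1) + p(x+1, y+1);
    // Vertical gradient Gy: [-1 -2 -1; 0 0 0; 1 2 1]
    float gy = -p(x-1, y-1) - 2*p(x, y-1) - p(x+1, y-1)
             +  p(x-1, y+1) + 2*p(x, y+1) + p(x+1, y+1);
    return std::sqrt(gx * gx + gy * gy);
}
```

A vertical step edge gives a pure horizontal response (Gy = 0), which is a cheap correctness check for the kernel orientation.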
Canny Edge Detection Algorithm
Image<uint8_t> cannyEdgeDetection(const Image<RGBA>& input,
float low_threshold,
float high_threshold) {
// Step 1: Convert to grayscale
auto gray = toGrayscale(input);
// Step 2: Apply Gaussian blur to reduce noise
auto blurred = GaussianBlur(5, 1.4f).apply(gray);
// Step 3: Compute gradient magnitude and direction (Sobel)
auto [magnitude, direction] = sobelGradient(blurred);
// Step 4: Non-maximum suppression
auto suppressed = nonMaximumSuppression(magnitude, direction);
// Step 5: Double threshold
auto edges = doubleThreshold(suppressed, low_threshold, high_threshold);
// Step 6: Edge tracking by hysteresis
return hysteresisTracking(edges);
}
Special Effect Kernels
Sharpen
[ 0 -1 0]
[-1 5 -1]
[ 0 -1 0]
Emboss
[-2 -1 0]
[-1 1 1]
[ 0 1 2]
Unsharp Mask
// Equivalent to 2*identity - Gaussian(5x5), i.e. original + (original - blurred)
1/256 * [-1 -4 -6 -4 -1]
[-4 -16 -24 -16 -4]
[-6 -24 476 -24 -6]
[-4 -16 -24 -16 -4]
[-1 -4 -6 -4 -1]
Geometric Transformations
Implement geometric transformations with proper interpolation for high-quality results. The choice of interpolation method significantly affects output quality and performance.
Transformation Implementation
// Interpolation types
enum class Interpolation {
Nearest, // Fastest, pixelated results
Bilinear, // Good balance of quality and speed
Bicubic // Highest quality, slower
};
// Bilinear interpolation
RGBA bilinearInterpolate(const Image<RGBA>& img, float x, float y) {
int x0 = static_cast<int>(std::floor(x));
int y0 = static_cast<int>(std::floor(y));
int x1 = x0 + 1;
int y1 = y0 + 1;
float fx = x - x0;
float fy = y - y0;
auto p00 = img.get(x0, y0);
auto p10 = img.get(x1, y0);
auto p01 = img.get(x0, y1);
auto p11 = img.get(x1, y1);
// Interpolate along x
auto top = lerp(p00, p10, fx);
auto bottom = lerp(p01, p11, fx);
// Interpolate along y
return lerp(top, bottom, fy);
}
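The lerp helper used above is not shown. A per-channel version consistent with the RGBA struct might look like this (the struct is repeated here, without its constructors, so the snippet stands alone):

```cpp
#include <cassert>
#include <cstdint>

struct RGBA {
    uint8_t r, g, b, a;
};

// Linear interpolation between two pixels, per channel.
// t = 0 returns p, t = 1 returns q; the +0.5f rounds to nearest.
RGBA lerp(const RGBA& p, const RGBA& q, float t) {
    auto mix = [t](uint8_t u, uint8_t v) {
        return static_cast<uint8_t>(u + (v - u) * t + 0.5f);
    };
    return {mix(p.r, q.r), mix(p.g, q.g), mix(p.b, q.b), mix(p.a, q.a)};
}
```

Interpolating 8-bit channels directly is fine for blur and resize, but note that gamma-encoded sRGB values are not linear light; high-accuracy pipelines linearize first.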
// Rotation transform
Image<RGBA> rotate(const Image<RGBA>& input, float angle_degrees,
Interpolation interp = Interpolation::Bilinear) {
float angle = angle_degrees * M_PI / 180.0f;
float cos_a = std::cos(angle);
float sin_a = std::sin(angle);
// Calculate new dimensions
int w = input.width();
int h = input.height();
int new_w = static_cast<int>(std::abs(w * cos_a) + std::abs(h * sin_a));
int new_h = static_cast<int>(std::abs(w * sin_a) + std::abs(h * cos_a));
Image<RGBA> output(new_w, new_h);
float cx = w / 2.0f;
float cy = h / 2.0f;
float ncx = new_w / 2.0f;
float ncy = new_h / 2.0f;
#pragma omp parallel for
for (int y = 0; y < new_h; ++y) {
for (int x = 0; x < new_w; ++x) {
// Map from output to input coordinates (inverse mapping)
float dx = x - ncx;
float dy = y - ncy;
float src_x = cos_a * dx + sin_a * dy + cx;
float src_y = -sin_a * dx + cos_a * dy + cy;
if (src_x >= 0 && src_x < w - 1 && src_y >= 0 && src_y < h - 1) {
switch (interp) {
case Interpolation::Nearest:
output.at(x, y) = input.at(
static_cast<int>(std::round(src_x)),
static_cast<int>(std::round(src_y))
);
break;
case Interpolation::Bilinear:
output.at(x, y) = bilinearInterpolate(input, src_x, src_y);
break;
case Interpolation::Bicubic:
output.at(x, y) = bicubicInterpolate(input, src_x, src_y);
break;
}
}
}
}
return output;
}
// Scale transform
Image<RGBA> scale(const Image<RGBA>& input, float scale_x, float scale_y,
Interpolation interp = Interpolation::Bilinear) {
int new_w = static_cast<int>(input.width() * scale_x);
int new_h = static_cast<int>(input.height() * scale_y);
Image<RGBA> output(new_w, new_h);
#pragma omp parallel for
for (int y = 0; y < new_h; ++y) {
for (int x = 0; x < new_w; ++x) {
float src_x = x / scale_x;
float src_y = y / scale_y;
// Apply interpolation
output.at(x, y) = interpolate(input, src_x, src_y, interp);
}
}
return output;
}
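The interpolate(...) call above is a dispatcher over the three methods. Stripped down to nearest neighbor on a single channel, the same output-to-input (inverse) mapping structure looks like this; the function name is illustrative, and a production version would clamp src_x/src_y for non-integer scale factors:

```cpp
#include <cassert>
#include <vector>

// Nearest-neighbor scale of a single-channel row-major image using
// the same inverse mapping as the RGBA version above.
std::vector<int> scaleNearest(const std::vector<int>& img, int w, int h,
                              float sx, float sy, int& out_w, int& out_h) {
    out_w = static_cast<int>(w * sx);
    out_h = static_cast<int>(h * sy);
    std::vector<int> out(out_w * out_h);
    for (int y = 0; y < out_h; ++y)
        for (int x = 0; x < out_w; ++x) {
            // Map each output pixel back to its source pixel
            int src_x = static_cast<int>(x / sx);
            int src_y = static_cast<int>(y / sy);
            out[y * out_w + x] = img[src_y * w + src_x];
        }
    return out;
}
```

Iterating over output pixels and mapping backwards (rather than scattering input pixels forward) guarantees every output pixel gets exactly one value, which is why both rotate and scale above are written this way.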
Submission Requirements
Create a public GitHub repository with the exact name shown below:
Required Repository Name
cpp-image-processor
Required Project Structure
cpp-image-processor/
├── include/
│ └── imgproc/
│ ├── image.hpp # Core image class
│ ├── filters.hpp # Filter implementations
│ ├── transforms.hpp # Geometric transforms
│ ├── color.hpp # Color space conversions
│ └── io.hpp # File I/O operations
├── src/
│ ├── filters.cpp
│ ├── transforms.cpp
│ ├── color.cpp
│ └── io.cpp
├── tests/
│ ├── test_filters.cpp
│ ├── test_transforms.cpp
│ ├── test_color.cpp
│ └── test_io.cpp
├── examples/
│ ├── basic_filters.cpp
│ ├── edge_detection.cpp
│ └── batch_processing.cpp
├── docs/
│ └── API.md
├── CMakeLists.txt
└── README.md
Do Include
- All required header and source files
- CMake build configuration
- Unit tests for all major components
- Example programs demonstrating features
- API documentation
- Sample images for testing
Do Not Include
- Build artifacts (*.o, *.exe, build/)
- IDE-specific files (.vs/, .idea/)
- Large image datasets (> 10MB total)
- External libraries (use CMake FetchContent)
- Temporary or backup files
Enter your GitHub username - we will verify your repository automatically
Grading Rubric
Your project will be graded on the following criteria. Total: 700 points.
| Criteria | Points | Description |
|---|---|---|
| Image I/O | 100 | PNG, JPG, BMP support with proper error handling |
| Filter System | 150 | At least 10 filters including edge detection |
| Color Processing | 100 | Accurate color space conversions and adjustments |
| Transformations | 100 | Scale, rotate, flip with interpolation options |
| Performance | 100 | Meets timing requirements with parallel processing |
| Code Quality | 75 | Clean API, proper OOP, modern C++ practices |
| Testing & Docs | 75 | Unit tests, examples, and API documentation |
| Total | 700 | |
Grading Levels
- Excellent: Exceeds all requirements
- Good: Meets all requirements
- Satisfactory: Meets minimum requirements
- Needs Work: Missing key requirements
Ready to Submit?
Make sure you have completed all requirements and reviewed the grading rubric above.
Submit Your Project