Computer Vision, Color Analysis, Human-Centered AI

Melatone

A personal color detection project that weaves together classical computer vision, deep learning, and interface design into something genuinely usable.

The idea came from a familiar frustration — friends constantly second-guessing whether a color worked for them in clothing, makeup, or everyday styling. What started as a casual fix slowly turned into a real technical question: can a machine read skin tone from messy, real-world photos and give back something actually helpful?

Overview

TL;DR

Melatone detects facial skin tone, classifies it using image models, and maps the result to a curated palette recommendation. Building it made one thing clear: useful computer vision is rarely a single model — it is a pipeline of detection, preprocessing, reasoning, and thoughtful interface decisions stacked together.
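The last step of that pipeline, mapping a predicted class to a palette, can be as small as a lookup table. Here is a minimal sketch, assuming three hypothetical tone classes and placeholder hex colors. Neither is Melatone's actual label set or curated palette.

```python
# Hypothetical tone classes and placeholder colors, for illustration only.
PALETTES = {
    "light": ["#F4C2C2", "#A7C7E7", "#C1E1C1"],
    "medium": ["#CC5500", "#008080", "#800020"],
    "deep": ["#FFD700", "#4169E1", "#50C878"],
}


def recommend_palette(tone_class, palettes=PALETTES):
    """Return the palette for a predicted tone class, or an empty list if unknown."""
    return list(palettes.get(tone_class, []))


print(recommend_palette("medium"))
```

Returning an empty list for an unknown class lets the interface decide how to handle a prediction it has no palette for.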

What draws me to this project is where it sits — somewhere between careful engineering and everyday human experience. It is not chasing an abstract benchmark. It is asking a quieter question: why do some colors feel right on us, while others feel subtly off?

Once I started building, the hidden complexity surfaced quickly. Lighting inconsistency, face localization, noisy color regions, skin masking, class imbalance — things that look simple on the surface tend to carry a lot of real problems underneath.

Workflow

Melatone was built as a modular system rather than a single end-to-end script. It covers webcam input, a Streamlit UI, FastAPI serving, preprocessing utilities, evaluation tools, and several classification models, from classical baselines to CNNs.

Thinking in modules shifted how I approach applied machine learning. When a prediction looks wrong, the culprit is rarely the model alone — it might be the camera feed, poor cropping, drifting light, segmentation noise, or class imbalance that crept in much earlier in the pipeline.
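One way to make that kind of debugging practical is to keep every intermediate output, so a bad prediction can be traced back to the stage that produced it. The sketch below is an illustration under simplifying assumptions, not Melatone's code: the stage functions are toy stand-ins, and pixels are plain (r, g, b) tuples rather than image arrays.

```python
def run_pipeline(stages, data):
    """Run named stages in order and keep every intermediate output.

    `stages` is a list of (name, function) pairs; each function takes the
    previous stage's output and returns the next one.
    """
    trace = [("input", data)]
    for name, stage in stages:
        data = stage(data)
        trace.append((name, data))
    return data, trace


# Toy stand-ins for real stages (hypothetical names and logic).
def crop_face(pixels):
    return pixels[1:-1]  # pretend the border pixels are background


def average_color(pixels):
    n = len(pixels)
    return tuple(sum(p[i] for p in pixels) / n for i in range(3))


def classify(color):
    return "lighter" if sum(color) / 3 > 128 else "deeper"


stages = [("crop", crop_face), ("average", average_color), ("classify", classify)]
label, trace = run_pipeline(
    stages, [(0, 0, 0), (200, 160, 140), (210, 170, 150), (0, 0, 0)]
)
for name, value in trace:
    print(name, value)
```

If the final label looks wrong, the trace shows whether the crop, the averaged color, or the classifier was the first stage to go off.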

Why the traditional computer vision stack is still worth understanding

I once asked my Computer Vision professor why he still taught methods that most students had mentally filed away as outdated (the Kalman filter, the Hough transform, Haar cascades) when deep learning dominates nearly every conversation in the field.

His answer was straightforward: classical vision often works without the volume of labeled data that modern deep learning demands. In constrained environments, limited datasets, or domains where collecting examples is costly, these methods still hold their ground. He brought up NASA imaging from Mars as a case where scarcity is not optional — it is just the reality.

Honestly, I was not fully convinced at the time, but building Melatone changed that. Classical methods are not just a fallback for data-scarce situations; they teach you how images actually behave. They force you to reason about geometry, noise, edges, thresholds, and uncertainty instead of trusting a network to absorb all of it silently.

What I used

Haar Cascade

Lightweight face detection for early region localization before heavier processing.
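A rough sketch of this step with OpenCV's bundled frontal-face cascade follows. It assumes the `opencv-python` package, which ships the cascade files under `cv2.data.haarcascades`. The `scaleFactor` and `minNeighbors` values and the `largest_face` helper are illustrative choices, not the project's tuned settings.

```python
def detect_faces(gray_image, cascade_path=None):
    """Return face boxes as (x, y, w, h) tuples from a grayscale image."""
    import cv2  # imported lazily so the helper below works without OpenCV

    if cascade_path is None:
        cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    detector = cv2.CascadeClassifier(cascade_path)
    boxes = detector.detectMultiScale(gray_image, scaleFactor=1.1, minNeighbors=5)
    return [tuple(int(v) for v in box) for box in boxes]


def largest_face(boxes):
    """Pick the box with the largest area, or None if nothing was detected."""
    if not boxes:
        return None
    return max(boxes, key=lambda box: box[2] * box[3])
```

Keeping only the largest box is a simple heuristic for a single-user webcam, where the biggest face is usually the person in front of the camera.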

HSV Color Space

Separates brightness from color information, which made skin masking noticeably more stable.
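The effect is easy to see with the standard library's `colorsys` module: halving the brightness of a pixel changes V but leaves H and S essentially untouched. The skin thresholds below are illustrative assumptions, not the values Melatone uses.

```python
import colorsys


def is_skin_like(r, g, b, max_hue_deg=50.0, min_sat=0.15, max_sat=0.75, min_val=0.2):
    """Rough skin test on an 8-bit RGB pixel using HSV bounds (illustrative thresholds)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h * 360.0 <= max_hue_deg and min_sat <= s <= max_sat and v >= min_val


# The same surface color under brighter and dimmer light.
bright = colorsys.rgb_to_hsv(200 / 255.0, 150 / 255.0, 120 / 255.0)
dim = colorsys.rgb_to_hsv(100 / 255.0, 75 / 255.0, 60 / 255.0)
print("bright:", bright)
print("dim:   ", dim)
print(is_skin_like(200, 150, 120), is_skin_like(100, 75, 60), is_skin_like(40, 80, 200))
```

Because hue and saturation survive the brightness change, a mask thresholded on them is less sensitive to exposure than one thresholded directly on RGB values.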

Gaussian Blur

Smoothed out noisy segmentation artifacts and unstable pixel regions before sampling.
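To show what the blur does to an isolated noisy pixel, here is a dependency-free sketch of a separable Gaussian blur on a small grayscale grid. In a real pipeline OpenCV's `cv2.GaussianBlur` would do this far faster. The radius, sigma, and border clamping are assumptions chosen for illustration.

```python
import math


def gaussian_kernel(radius, sigma):
    """Normalized 1D Gaussian weights for offsets -radius..radius."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma)) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_row(values, kernel):
    """Convolve one row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    last = len(values) - 1
    out = []
    for i in range(len(values)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), last)
            acc += w * values[j]
        out.append(acc)
    return out


def gaussian_blur(image, radius=1, sigma=1.0):
    """Blur a 2D list of numbers: rows first, then columns (separable filter)."""
    kernel = gaussian_kernel(radius, sigma)
    rows = [blur_row(row, kernel) for row in image]
    cols = [blur_row(list(col), kernel) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


spike = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
for row in gaussian_blur(spike):
    print([round(v, 2) for v in row])
```

The single outlier gets spread across its neighbors, which is why sampling a skin color after blurring is less likely to land on a stray noisy pixel.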

Kalman Filter

A useful lens for thinking about how to smooth observations across time in live video systems.
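As an illustration of that idea, and not necessarily something the project ships, here is a scalar Kalman filter that smooths a per-frame reading such as estimated skin brightness. It assumes the true value is roughly constant between frames. The two variance parameters are made-up tuning values.

```python
def kalman_smooth(measurements, process_var=1e-3, measurement_var=0.05):
    """Smooth a sequence of noisy scalar readings with a 1D Kalman filter."""
    estimate = measurements[0]
    error = 1.0  # start uncertain so early readings are trusted heavily
    smoothed = [estimate]
    for z in measurements[1:]:
        # Predict: the value is assumed constant, but uncertainty grows a little.
        error += process_var
        # Update: blend the prediction and the new reading by the Kalman gain.
        gain = error / (error + measurement_var)
        estimate += gain * (z - estimate)
        error *= 1.0 - gain
        smoothed.append(estimate)
    return smoothed


readings = [0.50, 0.58, 0.43, 0.55, 0.47, 0.52]
smoothed = kalman_smooth(readings)
print([round(v, 3) for v in smoothed])
```

Raising `measurement_var` makes the filter trust each new frame less, which gives a steadier but slower-reacting estimate.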

HOG + XGBoost / KNN

Classical baselines I built first — helpful for understanding the problem before moving to CNNs.
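To give a sense of how small such a baseline can be, here is a pure-Python k-nearest-neighbors classifier. It is a sketch under stated assumptions: the two-dimensional feature vectors and the labels "A" and "B" are toy placeholders standing in for HOG descriptors and the real tone classes.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a label by majority vote among the k nearest training points.

    `train` is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy features and labels (hypothetical).
train = [
    ((0.90, 0.30), "A"), ((0.85, 0.35), "A"), ((0.80, 0.30), "A"),
    ((0.30, 0.60), "B"), ((0.35, 0.55), "B"), ((0.25, 0.60), "B"),
]
print(knn_predict(train, (0.82, 0.32)))
print(knn_predict(train, (0.30, 0.58)))
```

A baseline like this has no training step, so any failure points straight back at the features or the labels, which is useful to know before investing in a CNN.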

EfficientNet

The final deep learning classifier, which generalized considerably better than earlier approaches.

Reflection

Melatone sharpened my appreciation for the relationship between classical and modern vision methods. Deep learning delivered the strongest final results, but traditional techniques were often what kept the pipeline stable enough to be worth using at all.

It also reinforced something I keep coming back to: useful AI does not have to be grand in scope. A small system that reliably solves an ordinary frustration can teach more real engineering than a flashy benchmark ever will.

If I return to this project, the priorities are clear — better lighting robustness, finer skin tone granularity, and more adaptive recommendations. Even as it stands, Melatone remains the project that most concretely taught me how computer vision becomes something people can actually use.