Experiments
How we trained both models, from dataset preparation through fine-tuning to evaluation. All training runs on Kaggle's free GPU (T4) at $0 cost.
Understanding YOLOv12s
YOLOv12s
Attention-Centric Detection
What it is
YOLOv12 is the latest in the YOLO (You Only Look Once) family of real-time object detectors. The “s” variant has 9.3M parameters: small enough for fast inference, yet large enough for accurate detection. Its key innovation is the Area Attention Module (A²), which replaces traditional convolution blocks with attention mechanisms.
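To make the idea of attention over areas concrete, here is a toy sketch in plain Python. It is not the A² module from YOLOv12: it works on a flat list of token vectors rather than a 2D feature map, has no learned projections, and the function names (attend, area_attend, score_count) and the choice of four areas are assumptions made for illustration. It only shows that restricting attention to equal-sized areas cuts the number of query-key scores.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(tokens):
    """Dot-product self-attention where queries, keys and values are the tokens themselves."""
    dim = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, tokens)) for i in range(dim)])
    return out


def area_attend(tokens, num_areas):
    """Split tokens into contiguous equal-sized areas and attend only within each area."""
    if len(tokens) % num_areas != 0:
        raise ValueError("token count must be divisible by num_areas")
    size = len(tokens) // num_areas
    out = []
    for start in range(0, len(tokens), size):
        out.extend(attend(tokens[start:start + size]))
    return out


def score_count(num_tokens, num_areas=1):
    """Number of query-key scores computed for the given split."""
    size = num_tokens // num_areas
    return num_areas * size * size


# Hypothetical example: 16 tokens, attended globally vs. in 4 areas.
print(score_count(16), score_count(16, 4))  # prints: 256 64
```

With n tokens split into l areas, the score count drops from n² to n²/l, which is the kind of saving that makes attention practical in a real-time detector.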
Why it's good for signatures
Signatures are thin, elongated stroke patterns on cluttered document backgrounds. The attention mechanism can focus on these fine details better than pure convolution, which tends to average over local patches and miss thin lines.
Dataset
tech4humans/signature-detection (Hugging Face)
Combined from Tobacco800 (scanned business documents) and the Roboflow 100 signatures set. Licensed under Apache 2.0.
Training Process
Results
Phase 1 → Phase 2 improvement: mAP@0.5 rose from 0.846 to 0.910 (+7.6% relative). Precision rose from 0.812 to 0.916 (+12.8% relative). Trained on a RunPod RTX 4090 in ~20 minutes total, at a cost of ~$0.15.
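The percentages quoted are relative gains over the Phase 1 value, not differences in absolute points. A minimal check of that arithmetic, using only the four numbers reported above (the helper name relative_gain is an assumption for illustration):

```python
def relative_gain(before, after):
    """Relative improvement of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


# Phase 1 and Phase 2 values as reported in the results.
map_gain = relative_gain(0.846, 0.910)
precision_gain = relative_gain(0.812, 0.916)

print(f"mAP@0.5:   +{map_gain:.1f}%")        # prints: mAP@0.5:   +7.6%
print(f"Precision: +{precision_gain:.1f}%")  # prints: Precision: +12.8%
```

In absolute terms the same changes are 0.064 for mAP@0.5 and 0.104 for precision.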