Links to free data sets for computer vision applications. If you would like to submit a link, please contact us.

Benchmark for Multi-platform Photogrammetry

The aim of the benchmark is to assess the accuracy and reliability of current methods for calibration and orientation of images acquired by different platforms as well as their integration for imag

MPII Human Pose dataset

The MPII Human Pose dataset is a state-of-the-art benchmark for evaluating articulated human pose estimation.

ALL-IDB: Acute Lymphoblastic Leukemia Image Database

ALL-IDB is a dataset of microscopic images of blood samples, designed specifically for evaluating and comparing segmentation and image classification algorithms.

OUI-Adience Face collection

This dataset is intended to facilitate the study of age and gender recognition.

MTFL: Multi-Task Facial Landmark

This dataset contains 12,995 face images annotated with (1) five facial landmarks and (2) attributes for gender, smiling, wearing glasses, and head pose.


Phos is a dataset for evaluating illumination invariance: 15 scenes, 14 types of illumination (with shadows), plus ground truth.

PASCAL-Context Dataset

The PASCAL-Context dataset augments the PASCAL VOC 2010 dataset with annotations for 400+ additional categories.

DogCentric Activity Dataset

The DogCentric Activity Dataset is composed of first-person videos taken from a camera mounted on top of a dog.

VENTURI Mountain Dataset

The VENTURI Mountain Dataset is a collection of 12 outdoor sequences captured with a smartphone, manually verified, and annotated with ground-truth data.


The UNICT-FD889 dataset is a food dataset composed of 889 distinct plates of food.

TIme Square Intersection (TISI) Dataset

The TIme Square Intersection (TISI) dataset was collected from a publicly accessible webcam for research on high-level, event-based video synopsis.

Educational Resource Centre (ERCe) Dataset

The Educational Resource Centre (ERCe) dataset was collected from a publicly accessible webcam deployed on a university campus over about 2 months for semantic event-based video synopsis research