Data & Models¶
Datasets, pretrained models, and data preparation guides.
Pretrained Models¶
Available Models¶
| Model | Framework | Task | Download | Size |
|---|---|---|---|---|
| Cascade R-CNN | MMDetection | Face Detection | Link | 280MB |
| HRNet-W32 | MMPose | 68-pt Landmarks | Link | 110MB |
| YOLOv8-Face | Ultralytics | Face Detection | Link | 45MB |
| MLP Converter | PyTorch | 68→48 pts | Link | 2MB |
| DINOv2-base | Transformers | Features | Auto | 340MB |
Quick Download¶
```bash
# Download all models
python demos/download_models.py

# Download a specific model
python demos/download_models.py --model hrnet
```
Model Specifications¶
Coming soon: detailed specifications for each model, including:

- Architecture details
- Training data
- Performance metrics
- Hardware requirements
Datasets¶
PrimateFace Dataset¶
Our comprehensive dataset includes:

- 60+ primate genera
- 10,000+ annotated images
- 68-point facial landmarks
- Species labels
Data Format¶
All annotations use the COCO format, with bounding boxes as `[x, y, width, height]` and keypoints as flattened `[x, y, visibility]` triplets:
```json
{
  "info": {
    "description": "PrimateFace Dataset",
    "version": "1.0"
  },
  "images": [
    {
      "id": 1,
      "file_name": "macaque_001.jpg",
      "width": 640,
      "height": 480
    }
  ],
  "annotations": [
    {
      "id": 1,
      "image_id": 1,
      "category_id": 1,
      "bbox": [100, 100, 200, 200],
      "keypoints": [x1, y1, v1, x2, y2, v2, ...],
      "num_keypoints": 68
    }
  ],
  "categories": [
    {
      "id": 1,
      "name": "primate_face",
      "keypoints": ["point_1", "point_2", ...],
      "skeleton": [[1, 2], [2, 3], ...]
    }
  ]
}
```
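Files in this format load directly with the standard pycocotools API. A minimal sketch (assumes `pycocotools` is installed; file and field names follow the example above):

```python
from pycocotools.coco import COCO

# Load and index the annotation file
coco = COCO("annotations.json")

# Inspect the first image and its face annotations
img_id = coco.getImgIds()[0]
image_info = coco.loadImgs(img_id)[0]
anns = coco.loadAnns(coco.getAnnIds(imgIds=img_id))

print(image_info["file_name"], f"({len(anns)} annotation(s))")
for ann in anns:
    # keypoints are flat [x, y, v, ...] triplets; bbox is [x, y, w, h]
    print(ann["num_keypoints"], "labeled keypoints, bbox:", ann["bbox"])
```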
Data Preparation¶
Creating COCO Annotations¶
From Image Directory¶
```python
from gui.imgdir2coco_facedet import create_coco_dataset

create_coco_dataset(
    image_dir="path/to/images",
    output_json="annotations.json"
)
```
From Video Files¶
```python
from gui.viddir2coco_facedet import extract_and_annotate

extract_and_annotate(
    video_dir="path/to/videos",
    output_json="video_annotations.json",
    sample_rate=5  # frames sampled per second
)
```
Data Validation¶
```python
from evals.core.utils import validate_coco

# Check annotation integrity
issues = validate_coco("annotations.json")
if not issues:
    print("Dataset is valid!")
else:
    for issue in issues:
        print(issue)
```
Data Splitting¶
```python
from evals.core.utils import split_dataset

# Create train/val/test splits
split_dataset(
    input_json="full_dataset.json",
    output_dir="splits/",
    ratios=(0.7, 0.15, 0.15)
)
```
Data Augmentation¶
For training, apply augmentations:
```python
augmentation_config = {
    "RandomFlip": {"p": 0.5},
    "RandomRotate": {"max_angle": 30},
    "ColorJitter": {"brightness": 0.2, "contrast": 0.2},
    "GaussianBlur": {"p": 0.1}
}
```
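How this config maps onto concrete transforms depends on the training pipeline. As one illustration, here is a rough equivalent in albumentations (the library choice and exact transform names are assumptions, not the project's pipeline):

```python
import albumentations as A
import numpy as np

# Rough albumentations equivalent of the config above (an assumption,
# not the project's actual training pipeline)
transform = A.Compose(
    [
        A.HorizontalFlip(p=0.5),
        A.Rotate(limit=30),
        A.ColorJitter(brightness=0.2, contrast=0.2),
        A.GaussianBlur(p=0.1),
    ],
    # Keep landmark coordinates aligned with the image through each transform
    keypoint_params=A.KeypointParams(format="xy", remove_invisible=False),
)

image = np.zeros((480, 640, 3), dtype=np.uint8)  # placeholder image
keypoints = [(320.0, 240.0)]                     # placeholder (x, y) landmarks
augmented = transform(image=image, keypoints=keypoints)
```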
Keypoint Definitions¶
68-Point System¶
Standard facial landmarks:
```python
FACIAL_LANDMARKS_68 = {
    "jaw": list(range(0, 17)),  # Jaw line
    "right_eyebrow": list(range(17, 22)),
    "left_eyebrow": list(range(22, 27)),
    "nose_bridge": list(range(27, 31)),
    "lower_nose": list(range(31, 36)),
    "right_eye": list(range(36, 42)),
    "left_eye": list(range(42, 48)),
    "outer_lip": list(range(48, 60)),
    "inner_lip": list(range(60, 68))
}
```
48-Point System¶
Primate-optimized landmarks:
```python
PRIMATE_LANDMARKS_48 = {
    "face_contour": list(range(0, 12)),  # Simplified jaw
    "right_eyebrow": list(range(12, 16)),
    "left_eyebrow": list(range(16, 20)),
    "nose": list(range(20, 28)),
    "right_eye": list(range(28, 34)),
    "left_eye": list(range(34, 40)),
    "mouth": list(range(40, 48))
}
```
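These index maps let you slice a single facial region out of a flat COCO keypoints list. A minimal sketch with NumPy, reusing an `ann` record like the one loaded in the pycocotools example above:

```python
import numpy as np

# Flat COCO list [x1, y1, v1, x2, y2, v2, ...] reshaped to (68, 3)
keypoints = np.asarray(ann["keypoints"], dtype=float).reshape(-1, 3)

# Pull out the right-eye landmarks via the 68-point index map
right_eye = keypoints[FACIAL_LANDMARKS_68["right_eye"]]

# Keep only labeled points (visibility flag v > 0)
visible = right_eye[right_eye[:, 2] > 0, :2]
print(visible.mean(axis=0))  # rough eye center
```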
Dataset Statistics¶
Species Distribution¶
Analysis of the PrimateFace dataset:
```python
from evals.plot_genus_distribution import analyze_dataset

stats = analyze_dataset("annotations.json")
print(f"Total images: {stats['n_images']}")
print(f"Total annotations: {stats['n_annotations']}")
print(f"Unique species: {stats['n_species']}")
```
Quality Metrics¶
```python
from evals.core.metrics import dataset_quality

quality = dataset_quality("annotations.json")
print(f"Average keypoints per face: {quality['avg_keypoints']}")
print(f"Annotation completeness: {quality['completeness']:.1%}")
```
External Datasets¶
Compatible Datasets¶
PrimateFace can work with:

- MacaquePose: 13,000+ macaque images
- ChimpFace: chimpanzee facial dataset
- OpenMonkeyChallenge: multi-species competition data
Converting External Data¶
```python
from landmark_converter.apply_model import convert_dataset

# Convert a human 68-point dataset to the 48-point primate scheme
convert_dataset(
    input_json="human_faces.json",
    output_json="primate_compatible.json",
    converter_model="68to48_converter.pth"
)
```
Model Zoo Integration¶
Using HuggingFace Models¶
```python
from transformers import AutoModel

# Load from the HuggingFace Hub
model = AutoModel.from_pretrained("primateface/hrnet-w32")
```
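The DINOv2 feature extractor listed in the model table downloads automatically through the same API. A minimal sketch of extracting patch features (assuming the "Auto" entry resolves to the upstream facebook/dinov2-base checkpoint):

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModel

# Upstream DINOv2 checkpoint (an assumption about what "Auto" resolves to)
processor = AutoImageProcessor.from_pretrained("facebook/dinov2-base")
model = AutoModel.from_pretrained("facebook/dinov2-base")

image = Image.open("macaque_001.jpg")
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# (1, n_patches + 1, 768): CLS token followed by patch embeddings
print(outputs.last_hidden_state.shape)
```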
Model Cards¶
Each model includes:

- Training details
- Performance benchmarks
- Limitations
- Citation information