Evaluation Utilities¶
Tools for model comparison and performance metrics.
Overview¶
PrimateFace provides evaluation utilities for comparing detection and pose models across frameworks and for computing standard metrics.
Quick Start¶
from evals.compare_pose_models import compare_models

# Compare multiple models
results = compare_models(
    models=["mmpose", "deeplabcut", "sleap"],
    test_data="test_annotations.json"
)
Available Metrics¶
Detection Metrics¶
- mAP: Mean Average Precision at various IoU thresholds
- Precision/Recall: Detection accuracy metrics
- F1 Score: Harmonic mean of precision and recall
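For reference, precision, recall, and F1 follow directly from true-positive, false-positive, and false-negative counts. The sketch below is a standalone illustration, not part of the evals API:

# Illustration only: precision/recall/F1 from detection counts
def detection_prf(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1

# Example: 90 correct detections, 10 false alarms, 5 missed faces
print(detection_prf(tp=90, fp=10, fn=5))  # approx. (0.90, 0.947, 0.923)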
Pose Estimation Metrics¶
- NME: Normalized Mean Error (percentage of face size)
- PCK: Percentage of Correct Keypoints
- OKS: Object Keypoint Similarity
- AUC: Area under the PCK curve across error thresholds
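To make the pose metrics concrete, here is a minimal NumPy sketch of NME and PCK. It assumes predictions and ground truth as (N, K, 2) keypoint arrays and a per-image normalization factor such as the face bounding-box diagonal; it illustrates the definitions rather than the evals implementation:

import numpy as np

def nme(pred, gt, norm):
    # Mean per-keypoint Euclidean error, normalized per image (e.g. by face bbox diagonal)
    err = np.linalg.norm(pred - gt, axis=-1)           # (N, K)
    return float((err / norm[:, None]).mean())

def pck(pred, gt, norm, thresh=0.05):
    # Fraction of keypoints whose normalized error falls below the threshold
    err = np.linalg.norm(pred - gt, axis=-1) / norm[:, None]
    return float((err < thresh).mean())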
Evaluation Scripts¶
Compare Detection Models¶
python evals/compare_det_models.py \
    --models mmdet yolo \
    --test-json test.json \
    --output results.csv
Compare Pose Models¶
python evals/compare_pose_models.py \
    --models mmpose dlc sleap \
    --test-json test.json \
    --metric nme
Per-Genus Evaluation¶
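Metrics can also be broken down by primate genus to see how performance varies across taxa. The exact flags depend on the comparison script; as a framework-agnostic sketch, a per-image results CSV (such as the results.csv written above) can be aggregated with pandas, assuming hypothetical genus and nme columns:

import pandas as pd

# Hypothetical per-image results with "genus" and "nme" columns
df = pd.read_csv("results.csv")
per_genus = df.groupby("genus")["nme"].agg(["mean", "std", "count"])
print(per_genus.sort_values("mean"))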
Visualization¶
Performance Plots¶
from evals.visualize_eval_results import plot_metrics

# Generate comparison plots
plot_metrics(
    results_file="evaluation_results.csv",
    output_dir="plots/"
)
Attention Maps¶
from dinov2.visualization import plot_attention

# Visualize model attention
plot_attention(
    model=model,
    image="primate.jpg",
    save_path="attention.png"
)
Cross-Dataset Evaluation¶
Test generalization across datasets:
from evals.core.metrics import cross_dataset_eval

results = cross_dataset_eval(
    model=model,
    datasets=["primateface", "macaquepose", "chimpface"],
    metric="nme"
)
Statistical Analysis¶
Significance Testing¶
from evals.core.metrics import compare_significance

# Test if model A is significantly better than B
p_value = compare_significance(
    model_a_results=results_a,
    model_b_results=results_b,
    test="wilcoxon"
)
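If you prefer to run the test directly, SciPy's paired Wilcoxon signed-rank test produces the same kind of p-value from matched per-image errors. The arrays below are illustrative placeholders, not PrimateFace results:

import numpy as np
from scipy.stats import wilcoxon

# Paired per-image NME values for two models (same images, same order; made-up numbers)
errors_model_a = np.array([0.031, 0.045, 0.028, 0.052, 0.039, 0.047, 0.033, 0.041])
errors_model_b = np.array([0.036, 0.049, 0.030, 0.058, 0.044, 0.046, 0.037, 0.045])

stat, p_value = wilcoxon(errors_model_a, errors_model_b)
print(f"Wilcoxon p-value: {p_value:.4f}")  # small p-value suggests the difference is not due to chance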
Export Results¶
Results can be exported in multiple formats:
# CSV for analysis
results.to_csv("evaluation.csv")
# JSON for web visualization
results.to_json("evaluation.json")
# LaTeX for papers
results.to_latex("evaluation.tex")