There is no widely cited benchmark named exactly “FaMeBench,” but your query overlaps with several closely related face perception benchmarks.
Depending on the context of your research, you are likely looking for one of the following key frameworks:
1. FaceBench (Hierarchical Face Perception)
If you are evaluating how advanced AI models process detailed facial metadata, you are likely thinking of FaceBench.
The Core Purpose: It is a comprehensive dataset designed to evaluate the face perception capabilities of Multimodal Large Language Models (MLLMs).
The Structure: It organizes facial metadata into a hierarchical structure across 5 distinct views (Appearance, Accessories, Surrounding, Psychology, and Identity) broken down into 3 granular levels.
The Dataset: It contains 49,919 Visual Question-Answering (VQA) pairs for testing and evaluation.
2. FaceXBench (Comprehensive Face Understanding)
If your focus is broader facial analytics and operational tasks, FaceXBench is the more likely match.
The Core Purpose: FaceXBench evaluates how well MLLMs handle complex face understanding.
The Scope: It covers 14 different tasks across 6 categories, directly measuring capabilities in facial analysis, localization, demographic fairness/bias, and face authentication.
The Dataset: It consists of 5,000 multimodal multiple-choice questions curated from dozens of public biometric datasets.
3. F-Bench (AI-Generated Face Quality)
If your metadata extraction involves assessing the quality, authenticity, or distortions of computer-generated faces, you might be looking for F-Bench.
The Core Purpose: It assesses face generation, customization, and restoration models.
The Scope: It leverages the FaceQ database to extract human preference metadata across dimensions like identity fidelity, authenticity, and text-image alignment.
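If you end up working with one of the multiple-choice benchmarks above, results are typically reported as accuracy per task or category. Here is a minimal Python sketch of that kind of scoring. The record keys (`category`, `answer`, `prediction`) and the category labels are assumptions made up for illustration; they are not the actual schema of FaceXBench or any other benchmark, so check the benchmark's own files before adapting this.

```python
from collections import defaultdict


def accuracy_by_category(records):
    """Return {category: fraction of correct answers}.

    Each record is a dict with hypothetical keys:
      'category'   - label of the task group the question belongs to
      'answer'     - gold option letter, e.g. 'A'
      'prediction' - option letter chosen by the model
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        total[rec["category"]] += 1
        # Compare option letters case-insensitively, ignoring stray whitespace
        if rec["prediction"].strip().upper() == rec["answer"].strip().upper():
            correct[rec["category"]] += 1
    return {cat: correct[cat] / total[cat] for cat in total}


# Toy records invented for illustration only (not real benchmark data)
sample = [
    {"category": "localization", "answer": "A", "prediction": "a"},
    {"category": "localization", "answer": "C", "prediction": "B"},
    {"category": "authentication", "answer": "D", "prediction": "D"},
]
scores = accuracy_by_category(sample)
for category, acc in scores.items():
    print(f"{category}: {acc:.2f}")
```

Grouping by category rather than reporting a single overall number makes it easier to see where a model is weak, for example strong on localization but poor on authentication.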
Could you clarify a few details so I can give you the exact information you need?
Are you looking at this benchmark for traditional computer vision (like facial recognition/detection) or for Multimodal LLMs?
Is there a specific author, university, or publication year you are trying to track down?
What specific facial metadata (e.g., age/gender estimation, emotion, deepfake detection, or geometric landmarks) are you trying to extract?