VariFace: Fair and Diverse Synthetic Dataset Generation for Face Recognition

Material type
Other
Author
鈴木, 健二 (Suzuki, Kenji) et al.
Publisher
-
Publication date
2024-12
Material Format
Paper
Capacity, size, etc.
-
NDC
-
Holdings of Libraries in Japan

This page shows libraries in Japan other than the National Diet Library that hold the material.

Other

  • Tokyo Tech Research Repository

    Paper
    Holdings of institutions and databases linked with the Institutional Repositories DataBase (IRDB) can be checked on the IRDB site.

Bibliographic Record

Details of this material, including its authority data (keywords that refer to materials on the same subject, author names, and similar headings).

Paper

Material Type
Other
Author/Editor
鈴木, 健二
Suzuki, Kenji
Publication Date
2024-12
Publication Date (W3CDTF)
2024
Alternative Title
VariFace: Fair and Diverse Synthetic Dataset Generation for Face Recognition
Periodical title
arXiv preprint
Target Audience
General
Note (General)
Publication type: AO
The use of large-scale, web-scraped datasets to train face recognition models has raised significant privacy and bias concerns. Synthetic methods mitigate these concerns and provide scalable and controllable face generation to enable fair and accurate face recognition. However, existing synthetic datasets display limited intraclass and interclass diversity and do not match the face recognition performance obtained using real datasets. Here, we propose VariFace, a two-stage diffusion-based pipeline to create fair and diverse synthetic face datasets to train face recognition models. Specifically, we introduce three methods: Face Recognition Consistency to refine demographic labels, Face Vendi Score Guidance to improve interclass diversity, and Divergence Score Conditioning to balance the identity preservation-intraclass diversity trade-off. When constrained to the same dataset size, VariFace considerably outperforms previous synthetic datasets (0.9200 → 0.9405) and achieves comparable performance to face recognition models trained with real data (Real Gap = -0.0065). In an unconstrained setting, VariFace not only consistently achieves better performance compared to previous synthetic methods across dataset sizes but also, for the first time, outperforms the real dataset (CASIA-WebFace) across six evaluation datasets. This sets a new state-of-the-art performance with an average face verification accuracy of 0.9567 (Real Gap = +0.0097) across LFW, CFP-FP, CPLFW, AgeDB, and CALFW datasets and 0.9366 (Real Gap = +0.0380) on the RFW dataset.
identifier:oai:t2r2.star.titech.ac.jp:50722059
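
The Real Gap figures quoted in the abstract can be read back into an implied real-data baseline. The sketch below is only an illustration and rests on an assumption the record does not state: that Real Gap is the synthetic-data accuracy minus the real-data (CASIA-WebFace) accuracy. The function name `implied_real_accuracy` and the setting labels are hypothetical; the numbers are taken from the abstract.

```python
def implied_real_accuracy(synthetic_acc: float, real_gap: float) -> float:
    """Back out the real-data accuracy from a reported Real Gap.

    ASSUMPTION (not stated in the record): Real Gap = synthetic - real.
    """
    return round(synthetic_acc - real_gap, 4)


# (synthetic accuracy, Real Gap) pairs as quoted in the abstract.
reported = {
    "same dataset size": (0.9405, -0.0065),
    "unconstrained, LFW/CFP-FP/CPLFW/AgeDB/CALFW average": (0.9567, +0.0097),
    "unconstrained, RFW": (0.9366, +0.0380),
}

for setting, (acc, gap) in reported.items():
    baseline = implied_real_accuracy(acc, gap)
    print(f"{setting}: implied real-data accuracy = {baseline:.4f}")
```

Under that assumption, the first two settings point to the same real-data baseline, which would be consistent with both gaps being measured against one reference model; the record itself does not confirm this.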