VariFace: Fair and Diverse Synthetic Dataset Generation for Face Recognition

Material type: Other

Author: Suzuki, Kenji (鈴木, 健二) et al.

Publisher: -

Publication date: 2024-12

Material Format: Paper

Capacity, size, etc.: -

NDC: -


Notes on use

Note (General): Publication type: AO. The use of large-scale, web-scraped datasets to train face recognition models has raised significant privacy and bias concerns. Synthetic met...


Holdings of Libraries in Japan


other

Tokyo Tech Research Repository
Paper


Bibliographic Record


Paper

Material Type: Other
Title: VariFace: Fair and Diverse Synthetic Dataset Generation for Face Recognition
Author/Editor: 鈴木, 健二
Suzuki, Kenji
Author Heading: 鈴木, 健二
Suzuki, Kenji
Publication Date: 2024-12
Publication Date (W3CDTF): 2024
Alternative Title: VariFace: Fair and Diverse Synthetic Dataset Generation for Face Recognition
Periodical title: arXiv preprint
Target Audience: General
Note (General): Publication type: AO
The use of large-scale, web-scraped datasets to train face recognition models has raised significant privacy and bias concerns. Synthetic methods mitigate these concerns and provide scalable and controllable face generation to enable fair and accurate face recognition. However, existing synthetic datasets display limited intraclass and interclass diversity and do not match the face recognition performance obtained using real datasets. Here, we propose VariFace, a two-stage diffusion-based pipeline to create fair and diverse synthetic face datasets to train face recognition models. Specifically, we introduce three methods: Face Recognition Consistency to refine demographic labels, Face Vendi Score Guidance to improve interclass diversity, and Divergence Score Conditioning to balance the identity preservation-intraclass diversity trade-off. When constrained to the same dataset size, VariFace considerably outperforms previous synthetic datasets (0.9200 → 0.9405) and achieves comparable performance to face recognition models trained with real data (Real Gap = -0.0065). In an unconstrained setting, VariFace not only consistently achieves better performance compared to previous synthetic methods across dataset sizes but also, for the first time, outperforms the real dataset (CASIA-WebFace) across six evaluation datasets. This sets a new state-of-the-art performance with an average face verification accuracy of 0.9567 (Real Gap = +0.0097) across LFW, CFP-FP, CPLFW, AgeDB, and CALFW datasets and 0.9366 (Real Gap = +0.0380) on the RFW dataset.
identifier:oai:t2r2.star.titech.ac.jp:50722059
Data Provider (Database): National Institute of Informatics: Institutional Repositories DataBase (IRDB) (Institutional Repository)
https://irdb.nii.ac.jp
Original Data Provider (Database): Institute of Science Tokyo: Institute of Science Tokyo Research Repository (T2R2)
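
Worked check of the figures in the abstract: the abstract reports each VariFace accuracy alongside a "Real Gap" but does not define the term. The sketch below assumes Real Gap = (accuracy with synthetic training data) - (accuracy with real training data), which matches the signs in the abstract (negative where VariFace is described as comparable, positive where it outperforms CASIA-WebFace). Under that assumption, the real-data baselines implied by the reported numbers can be recovered by subtraction. The function name `implied_real_accuracy` is hypothetical and used only for illustration.

```python
from decimal import Decimal


def implied_real_accuracy(synthetic_acc: str, real_gap: str) -> Decimal:
    """Recover the real-data accuracy, ASSUMING real_gap = synthetic - real.

    Values are passed as strings so Decimal keeps the four reported digits exact.
    """
    return Decimal(synthetic_acc) - Decimal(real_gap)


# Figures quoted in the abstract.
constrained = implied_real_accuracy("0.9405", "-0.0065")   # same dataset size
unconstrained = implied_real_accuracy("0.9567", "0.0097")  # five-benchmark average
rfw = implied_real_accuracy("0.9366", "0.0380")            # RFW dataset

# Gain over the previous synthetic result in the size-constrained setting.
gain_over_previous = Decimal("0.9405") - Decimal("0.9200")

print(constrained, unconstrained, rfw, gain_over_previous)
```

Under this reading, the constrained and unconstrained results imply the same real-data baseline on the five-benchmark average, so the two Real Gap values in the abstract are mutually consistent.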