Ethical Considerations for Responsible Data Curation

Jerone Andrews

Dora Zhao

William Thong

Apostolos Modas

Orestis Papakyriakopoulos

Alice Xiang

NeurIPS 2023

Abstract

Human-centric computer vision (HCCV) data curation practices often neglect privacy and bias concerns, leading to dataset retractions and unfair models. HCCV datasets constructed through nonconsensual web scraping lack crucial metadata for comprehensive fairness and robustness evaluations. Current remedies are post hoc, lack persuasive justification for adoption, or fail to provide proper contextualization for appropriate application. Our research focuses on proactive, domain-specific recommendations, covering purpose, privacy and consent, as well as diversity, for curating HCCV evaluation datasets, addressing privacy and bias. We adopt an ante hoc reflective perspective, drawing from current practices, guidelines, dataset withdrawals, and audits, to inform our considerations and recommendations.

Related Publications

Not My Voice! A Taxonomy of Ethical and Safety Harms of Speech Generators

FAccT, 2024
Wiebke Hutiri*, Orestis Papakyriakopoulos, Alice Xiang

The rapid and wide-scale adoption of AI to generate human speech poses a range of significant ethical and safety risks to society that need to be addressed. For example, a growing number of speech generation incidents are associated with swatting attacks in the United States…

Query by Activity Video in the Wild

ICIP, 2023
Tao Hu*, William Thong, Pascal Mettes*, Cees Snoek*

This paper considers retrieval of videos containing human activity from just a video query. In the literature, a common assumption is that all activities have sufficient labelled examples when learning an embedding for retrieval. However, this assumption does not hold in pra…

Beyond Skin Tone: A Multidimensional Measure of Apparent Skin Color

ICCV, 2023
William Thong, Przemyslaw Joniak*, Alice Xiang

This paper strives to measure apparent skin color in computer vision, beyond a unidimensional scale on skin tone. In their seminal paper Gender Shades, Buolamwini and Gebru have shown how gender classification systems can be biased against women with darker skin tones. While…
