The humans behind AI: Co-creating best practices for data workers’ well-being

August 4th, 2024

Publication: Blog, Policy response
Themes: Data work, Gig Economy, Platform Governance

Image source: karya.in

The explosive growth in artificial intelligence (AI) is creating immense demand for large training datasets. Millions of workers generate these datasets by labelling text, images, video, and audio for everything from voice recognition assistants to 3D image recognition for autonomous vehicles. Data labelling, also known as data annotation, is a crucial step in building high-quality datasets to train AI models, and it accounts for about a quarter of all time spent on AI projects.

India is emerging as one of the largest data-labelling markets in the world. As per industry estimates, the global market for data annotation will be valued at $8.22 billion by 2028, with a predicted annual growth rate of 26.2%. The market serviced by India could exceed $7 billion by 2030, engaging a workforce of up to 1 million people. NASSCOM has stated that, as of 2023, 80% of data annotation workers come from rural areas, and 90% of industry players operate in Tier II and Tier III cities.

The gig economy in the Global South has grown significantly over the past decade, driven by several key factors: an abundant supply of affordable labour migrating from rural to urban areas, low entry barriers for platform work, and increased digital penetration, which has given workers access to smartphones, internet connectivity, and opportunities for income generation. Of approximately 15 million gig workers in India, fewer than 10 per cent are thought to be women, who have traditionally been concentrated in gig work services like caregiving and home cleaning. However, as women recognise the potential of gig work to enhance financial independence, more of them are participating in various segments of gig work. Women often opt for flexible work that allows them to balance household chores and paid employment, preferring jobs that keep them close to home. Their ability to manage both roles depends on the type and arrangement of the work. Flexible data work, which doesn’t require physical presence, is a top choice, resulting in more women taking up these roles.

Data enrichment work, particularly annotation and labelling, contributes significantly to the expanding gig economy. Estimates suggest that online gig workers make up between 4.4 and 12.5 per cent of the global labour force. While this growing market creates opportunities for people in rural areas, the industry’s gig-work nature poses challenges such as ensuring diversity, safety, fair pay, dignity, and proper worker representation. Data work is often rendered invisible, and the repetitive manual labour it requires is under-valued and de-glamorised in the AI value chain. There is therefore a growing need to support welfare efforts that align with the aspirations of data workers. It is crucial to understand the challenges faced by data workers, especially women, the enablers that support them, and the overall impact of this work on their daily lives, agency, and personhood.

In this context, Karya and Aapti Institute have partnered to engage in a wider inquiry into data workers in India. Aapti is a public research institute that examines the intersection of technology and society, with platform work, the gig economy, and data governance as its core research streams. Karya is an impact-focused organisation working to bring dignified smartphone-based gig work to low-income communities globally. This collaboration brings together Karya’s impact-focused work with low-income communities and Aapti’s research expertise in understanding impact and generating evidence. Through these collective research efforts, we aim to foster fair and inclusive data generation and labelling practices, setting industry standards centred on ethical labour and promoting an impactful value chain for AI/ML dataset generation.

This collaborative research study between Karya and Aapti aims to:

  • Build just, inclusive data generation and labelling practices and measurement systems within Karya, and highlight industry-level best practices.
  • Pilot methods for participatory governance that can enhance workers’ voices within Karya’s operations.
  • Engage with policy-makers and multilateral institutions, within India and across borders, on responsible governance in gig work, including ethical labour practices, working conditions, and fair compensation.

Given the emerging nature of this engagement, we are eager to hear from experts who are interested in this work and would like to exchange insights on these questions. If you, or anyone you know, work in the areas of the future of work, data-worker welfare, livelihoods, or human-AI interaction, please write to us at [email protected] (Aapti Institute) and [email protected] (Karya). We would love to speak with you.