RLHF Workers and the New Shadow Economy

A new gig economy powered the meteoric rise of ChatGPT et al. Here's how it can benefit humans, too.
Written by
Coleman Numbers
Published on
August 18, 2023

It occurs to me that my thinking about AI safety has, until extremely recently, been something like the blank dread that the narrator in Franz Kafka’s “The Burrow” experiences in relation to an omnipresent, inexplicable, dreadful noise. There’s a real, tangible sense of impending threat that basic instincts can’t deny, but I can’t articulate what it is, or where it comes from, or why I should be hair-raisingly afraid when my favorite tech prognosticators prophesy arcanely about post-scarcity futures, an end to boring work, a new scientific revolution, etc. etc. Until recently, I’ve vacillated between chalking up this fear to Luddite paranoia and adopting the deadly serious, existentially interruptive morbidity that journalist and professed AI skeptic Freddie deBoer identifies as its own form of dark millenarianism.

But that’s changed. I’ve found the source of my peculiar psychic noise. And I’m not burrowing any slower.

The Task

A couple months ago I wrote about AI and meaningful work, essentially summarizing a longer academic article that sketches out varying outcomes for how AI might negatively (mostly) and positively (vaguely) affect our experience of daily work. At the time, I found the insights interesting in the theoretical way that ads about glistening fast food are appetizing.

Then I read this New York Magazine piece on RLHF taskers in Kenya and, well, became extremely hungry. If you’re following my metaphor.

The story charts the rapid expansion of a gig-based underclass. Remote workers in developing nations are hired to sit at home and label data: images of clothing, seconds-long clips of traffic footage, spam emails, TikTok videos, food packaging, and anything else under the sun that can be fed into the rapacious training runs of various machine learning applications.

The annotators are restricted from describing their work to anyone and are given minimal information about what their work is actually for. The work is transient and inconsistent: it depends on large training contracts from remote corporations, so taskers can’t count on it for a steady wage. And the job itself isn’t precisely sunshine and rainbows. In the case of the RLHF work that enabled ChatGPT, OpenAI contracted a firm in Nairobi to hire annotators at 1.32-2.00 USD per hour to label explicit content, including such appetizing offerings as “child sexual abuse, bestiality, murder, suicide, torture, self harm, and incest,” according to TIME writer Billy Perrigo. Taskers who spoke with TIME recount disturbing dreams that animate the greatest hits of their workday’s horrors. (The tasker firm later cancelled this contract with OpenAI.)

The Machines

This pop-up annotation industry is a supply-side lesson in the worst incentives tied to AI progress. The most miraculous and impressive gen-AI achievements of the last year are inextricably bound up in a worryingly dystopian work environment: one where human intellectual labor is atomized Ford-like into repetitive tasks and decontextualized from any sense of meaningful production. The capacities that distinguish people from machines and make them valuable (the ability to understand context, recognize and integrate edge cases, and deal with novel and difficult situations) are requisitioned and consumed by an impersonal network of gig contractors. Of course, this type of tasker work isn't totally new, and it isn't unique to AI; but the productivity explosion that modern generative AI promises incentivizes a rapid expansion of the industry.

As seductive as it is to blame this on reckless corporate greed, I don’t find that explanation entirely productive or entirely true. After all, the new robber barons of the machine learning age have been conspicuously vocal about a commitment to AI alignment and safety. Anthropic committed major governing power to a board of safety-minded trustees who are unable to profit from the company. Sam Altman has been vigorous in his calls for AI regulation and has testified before the U.S. Senate to that effect. Over at Google DeepMind, Demis Hassabis has affirmed that companies need to “proceed with exceptional care.”

Whether these early overtures materialize into deep safety commitments is a question for another blog. Suffice it to say: I think honest neglect, rather than willful ignorance, explains the predicament of the maligned taskers in Kenya and elsewhere. I maintain this contention partially because it’s more interesting than your perfunctory eat-the-rich Wall Street-occupying animus but mostly because, if true, it opens us up to a more productive way forward: building companies and systems that are actively cognizant of and concerned with whole human well-being.

The Humans

I won’t be the first armchair designer to invoke the principles of human-centered design in a critique of multi-billion dollar tech ventures, but in the by-turns messianic and diabolic predictions about AI’s future, it’s a perspective I don’t often hear in the popular discourse. There’s probably a lot to say there. For now, though, I’ll constrain my comments to the particular case of the taskers. The financial instability inherent in this emerging job class, its exploitative qualities, and its psychological toll can all be alleviated if companies do a better job of listening: of reorienting themselves in relation to taskers such that taskers are key stakeholders in addition to, and perhaps above, large data-hungry clients like OpenAI.

Design professor and former Apple VP Don Norman calls this approach being “people-centered”:

“Much of today’s systems, procedures, and devices are technology-centered, designed around the capabilities of the technology with people being asked to fill in the parts that the technology cannot do.”

This, in a sense, is where we’re at today with the taskers. AI needs enormous loads of training data that it can’t supply itself—and even as it’s trained, edge cases appear that the algorithms don’t have the mental structure to deal with. Taskers predigest this novelty for AI systems by labeling and categorizing it into discernible classes. Humans become appendages to the superorganism of a deep learning system and its owner-organization.

“People-centered means changing this, starting with the needs and abilities of people. It means considering all the people who are involved, taking account of the history, culture, beliefs, and environment of the community. The best way to do this is to let those who live in the community provide the answers.”

How might the lives of the taskers change if the companies that hire them took time to better understand their daily experience? If these companies were thoughtful in promoting and training men and women from the community, subject matter experts in their “history, culture, beliefs, and environment,” could the companies build better training procedures? Create work environments more amenable to psychological well-being and therefore more productive? Offer compensation packages that better match the needs of local workers and attract the best people?

Maybe this all sounds a little trite and obvious; I think it’s important, nevertheless. The tasker industry isn’t going away, especially as AI systems become more generalized and ubiquitous. For better or worse, more and more work is going to be about introducing disembodied AI into the messy uncertainties of embodied life. This pursuit, as we’ve seen from the emergence of ChatGPT through RLHF, promises huge gains in research and business.

At the heart of that pursuit is the nagging question: are we going to be AI- or human-centered? Will our workplaces be reflexive shrines to inhuman superintelligence or enlivening centers of partnership between these stunning mind-enhancing systems and conscious, versatile, vivid human brains?

I know the future I want.
