Research Guides: Generative AI: Ethics and Costs

Ethics and Costs of Generative AI

"Technologies are not neutral and neither are the societies into which they are introduced."

Generative AI is implicated in a host of ethical issues and social costs, including:

bias, misrepresentation, and marginalization
labor exploitation and worker harms
misinformation and disinformation
privacy violations and data extraction
copyright and authorship issues
environmental costs

Scholars across multiple fields have pointed out the ways that AI discourse and development has been dominated by large corporate interests, with a focus on hypothetical benefits and risks rather than current, real world impacts.

Featured Reading

AI’s Present Matters More Than Its Imagined Future, The Atlantic. October 4, 2023. By Inioluwa Deborah Raji.
ChatGPT, Galactica, and the Progress Trap. Wired. December 9, 2022. By Abeba Birhane and Deborah Raji.

This page provides starting resources to learn more about these issues as generative AI technologies continue to develop.

Bias, Misrepresentation, Marginalization

Generative AI tools can exhibit bias, and bias can happen at different stages of development.

How Bias Happens:

Datasets: if datasets used for training generative AI models misrepresent, underrepresent, exclude, or marginalize certain social identities, communities, and practices, the models will reflect and often amplify these biases
Design choices: bias can also become embedded in the design of generative AI products through
- design goals
- assumptions about who users are ("imagined users")
- contexts that tools are designed for
- how tools are evaluated

Bias in generative AI is not a new issue, but rather a continuation of problems within machine learning and algorithmic system development.

Featured Reading

Humans Absorb Bias from AI—And Keep It after They Stop Using the Algorithm
Scientific American. October 26, 2023. By Lauren Leffer.
These Women Tried to Warn Us About AI
Aug 12, 2023. Rolling Stone. By Lorena O'Neil.
AI Is Steeped in Big Tech’s ‘Digital Colonialism’
May 25, 2023. Wired. By Grace Browne.

Large Language Models (LLMs)

We read the paper that forced Timnit Gebru out of Google. Here’s what it says
MIT Technology Review. December 4, 2020. By Karen Hao.
On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜
Emily M. Bender et al., in Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, FAccT ’21 (New York, NY, USA: Association for Computing Machinery, 2021), 610–23
OpenAI Chatbot Spits Out Biased Musings, Despite Guardrails
Bloomberg. December 8, 2022. By Davey Alba.
Quantifying ChatGPT’s gender bias
AI Snake Oil. Apr 26, 2023. By Sayash Kapoor and Arvind Narayanan.
Large Language Models Associate Muslims with Violence
Abid, Abubakar, Maheen Farooqi, and James Zou. Nature Machine Intelligence 3, no. 6 (June 1, 2021): 461–63.
Challenging systematic prejudices: an investigation into bias against women and girls in large language models
UNESCO, International Research Centre on Artificial Intelligence. 2024.

Images, videos, and image classification

OpenAI’s Sora Is Plagued by Sexist, Racist, and Ableist Biases
Wired. March 23, 2025. By Reece Rogers and Victoria Turk.
These fake images reveal how AI amplifies our worst stereotypes
The Washington Post. Nov. 1, 2023. By Nitasha Tiku, Kevin Schaul and Szu Yu Chen.
Humans Are Biased. Generative AI Is Even Worse
Bloomberg Technology + Equality. 2023. By Leonardo Nicoletti and Dina Bass.
How AI reduces the world to stereotypes
Rest of World. October 10, 2023. By Victoria Turk.
Black Artists Say A.I. Shows Bias, With Algorithms Erasing Their History
The New York Times. July 4, 2023. By Zachary Small.
AI was asked to create images of Black African docs treating white kids. How'd it go?
NPR: Goats and Soda. October 6, 2023. By Carmen Drahl.
Reflections before the storm: the AI reproduction of biased imagery in global health visuals
The Lancet Global Health. August 09, 2023. Arsenii Alenichev, Patricia Kingori, Koen Peeters Grietens.
These new tools let you see for yourself how biased AI image models are
MIT Technology Review. March 22, 2023. By Melissa Heikkilä.
Stable Bias: Analyzing Societal Representations in Diffusion Models
Hugging Face. Alexandra Sasha Luccioni, Christopher Akiki, Margaret Mitchell, and Yacine Jernite.
Multimodal datasets: misogyny, pornography, and malignant stereotypes
Birhane, Abeba et al. ArXiv abs/2110.01963 (2021).
Study finds gender and skin-type bias in commercial artificial-intelligence systems
MIT News. February 11, 2018. By Larry Hardesty.
Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification
Proceedings of Machine Learning Research 81:1–15, 2018 Conference on Fairness, Accountability, and Transparency. 2018. By Joy Buolamwini and Timnit Gebru.
Gender Shades project website
The Gender Shades project evaluates the accuracy of AI powered gender classification products.

Labor Exploitation, Worker Harms

Generative AI training and improvement depends upon human labor in multiple ways. These systems are based on massive datasets of human-made works scraped from the internet. Training and improving these models requires people to annotate data and review and rate outputs. Data sets and outputs can include depictions of violence, self-harm, abuse, and other traumatizing content. Companies employ people as contract workers to describe, rate, and review materials, and these positions are low wage, high pressure, and precarious.

Suicide attempts, sackings and a vow of silence: Meta’s new moderators face worst conditions yet
The Bureau of Investigative Journalism, The Guardian. April 27, 2025. By Claire Wilmot and Rachel Hall.
China’s AI boom depends on an army of exploited student interns
Rest of World. September 14, 2023. By Viola Zhou and Caiwei Chen.
America Already Has an AI Underclass
The Atlantic. July 26, 2023. By Matteo Wong
Cleaning Up ChatGPT Takes Heavy Toll on Human Workers
The Wall Street Journal. July 24, 2023. By Karen Hao and Deepa Seetharaman.
AI needs to face up to its invisible-worker problem
MIT Technology Review. December 11, 2020. By Will Douglas Heaven.
Data Work and its Layers of (In)visibility
Just Tech. nd. By Adrienne Williams and Milagros Miceli

Misinformation and Disinformation

Generative AI is being used to create manipulated and entirely faked text, video, images, and audio, sometimes featuring prominent politicians and celebrities. These tools make it easier for bad actors to create persuasive, customized disinformation at scale. They may also reproduce false claims and other misinformation, despite safety guardrails. Digital watermarking and automated detection systems are insufficient on their own, as these can be bypassed in various ways.

Generative AI may also provide factually inaccurate outputs, generate "fake citations," or misrepresent information from sources. Despite model improvements, hallucination / confabulation rates may be high for real-world uses.

As AI models improve, it is increasingly difficult to tell the difference between images of real people and AI-generated images. AI-powered image manipulation tools are also being built into the latest generations of smartphones, with broad implications for fact-checking and navigating social media.

'MAHA Report' marred by AI garble, experts say
Washington Post. May 31, 2025. By Lauren Weber and Caitlin Gilbert.
OpenAI’s new reasoning AI models hallucinate more
TechCrunch. April 18, 2025. By Maxwell Zeff.
Why do LLMs make stuff up? New research peers under the hood.
Ars Technica. March 28, 2025. By Kyle Orland.
AI Search Has A Citation Problem
Columbia Journalism Review. March 6, 2025. By Klaudia Jaźwińska and Aisvarya Chandrasekar.
Study suggests that even the best AI models hallucinate a bunch
TechCrunch. Aug 14, 2024. By Kyle Wiggers.
Test Yourself: Which Faces Were Made by A.I.?
The New York Times. Jan. 9, 2024. By Stuart A Thompson.
Chatbots May ‘Hallucinate’ More Often Than Many Realize
The New York Times. Nov. 6, 2023. By Cade Metz.
‘A.I. Obama’ and Fake Newscasters: How A.I. Audio Is Swarming TikTok
The New York Times. Oct. 12, 2023. By Stuart A. Thompson and Sapna Maheshwari.
Red-Teaming Finds OpenAI’s ChatGPT and Google’s Bard Still Spread Misinformation
NewsGuard. August 8, 2023. By Jack Brewster and McKenzie Sadeghi.
Can AI Write Persuasive Propaganda?
Goldstein, Josh A., et al. SocArXiv, 8 Apr. 2023.
Synthetic Lies: Understanding AI-Generated Misinformation and Evaluating Algorithmic and Human Solutions
Zhou, Jiawei, et al. CHI '23: Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems. April 2023.

Privacy and Data Extraction

Privacy and data ownership are implicated in the development of generative AI models, by scraping large datasets from the web that contain personal information, as well as via user interactions with these tools. Privacy policies vary across tools, and some may take user input (text, images, etc.) and use them to further train models or provide future outputs. There are also indications that large language models can infer user characteristics like age, sex, income, and location from fairly innocuous text inputs.

Recently, researchers have been using various methods (data extraction attacks) to prompt systems like ChatGPT to reveal training data directly.

Researchers have also discovered ways to remove safety guardrails, so that AI models provide harmful information. Some of these methods can be used in attacks to provide "hidden instructions" to models, with huge security implications for users.

The Right to Be Forgotten Is Dead: Data Lives Forever in AI
Tech Policy Press. May 20, 2025. By Haley Higa, Suzan Bedikian, and Lily Costa.
AI chatbots can be tricked into misbehaving. Can scientists stop it?
ScienceNews. February 1, 2024. By Emily Conover.
Extracting Training Data from ChatGPT
Summary and announcement of recent pre-print: Nasr, Milad, et al. Scalable Extraction of Training Data from (Production) Language Models. arXiv:2311.17035, arXiv, 28 Nov. 2023. arXiv.org, https://doi.org/10.48550/arXiv.2311.17035.
Your Personal Information Is Probably Being Used to Train Generative AI Models
Scientific American. October 19, 2023. By Lauren Leffer.
Beyond Memorization: Violating Privacy via Inference with Large Language Models
Privacy inference game, developed by SRILab. Based on pre-print by Robin Staab, Mark Vero, Mislav Balunović, and Martin Vechev.

Copyright and Authorship

Generative AI models are trained on large datasets crawled from the web, including artworks, performances, books, essays, and other materials created by humans. These creators were not notified that their work was being ingested and were not provided an opportunity to refuse. Further, some of these datasets may have been scraped from illegal collections of copyrighted works. There have been several lawsuits alleging copyright infringement against OpenAI, Meta, and other companies.

Whether AI-generated works can be copyrighted is also an evolving issue, with cases being filed and appealed in court.

It's important to note that legal frameworks and ethical frameworks related to copyright and authorship are slightly different lenses. Legal arguments around copyright may not (and may not be able to) address the broader ethical implications of machine systems harvesting human creative and knowledge works at scale.

Background information

Generative AI Legal Explainer
Resource with simplified responses to complex legal questions. Developed by Knowing Machines, a research project tracing the histories, practices, and politics of machine learning systems.
Copyright and Artificial Intelligence
U.S. Copyright Office.
The Office is undertaking a study of the copyright law and policy issues raised by generative AI and is assessing whether legislative or regulatory steps are warranted. The Office will use the record it assembles to advise Congress; inform its regulatory work; and offer information and resources to the public, courts, and other government entities considering these issues.
Copyright and Artificial Intelligence Part 1: Digital Replicas
United States Copyright Office. July 2024.
Part 1 of an ongoing multi-part report analyzing copyright law and policy issues raised by AI. Part 1 addresses digital replicas.
Copyright and Artificial Intelligence Part 2: Copyrightability
United States Copyright Office. January 2025.
Addresses the copyrightability of outputs created using generative AI.
Copyright and Artificial Intelligence Part 3: Generative AI Training (pre-publication)
On May 9, 2025, the Office released a pre-publication version of Part 3 in response to congressional inquiries and expressions of interest from stakeholders. A final version of Part 3 will be published in the near future, without any substantive changes expected in the analysis or conclusions.

Legal challenges and issues

The Unbelievable Scale of AI's Pirated-Books Problem
The Atlantic. March 20, 2025. By Alex Reisner.
Search LibGen, the Pirated-Books Database That Meta Used to Train AI
Link to search tool, associated with the following article, "The Unbelievable Scale of AI's Pirated-Books Problem"
The Atlantic. March 20, 2025. By Alex Reisner
Thomson Reuters Wins First Major AI Copyright Case in the US
Wired. Feb 11, 2025. By Kate Knibbs.
How Tech Giants Cut Corners to Harvest Data for A.I.
The New York Times. April 6, 2024. By By Cade Metz, Cecilia Kang, Sheera Frenkel, Stuart A. Thompson and Nico Grant.
We Asked A.I. to Create the Joker. It Generated a Copyrighted Image
The New York Times. January 25, 2024. By Stuart A. Thompson.
Generative AI’s end-run around copyright won’t be resolved by the courts
AI Snake Oil. January 22, 2024. By Arvind Narayanan and Sayash Kapoor.
Generative AI Has a Visual Plagiarism Problem
IEEE Spectrum. January 6, 2024. By Gary Marcus and Reid Southen.
The Times Sues OpenAI and Microsoft Over A.I. Use of Copyrighted Work
The New York Times. December 27, 2023. By Michael M. Grynbaum and Ryan Mac.
Franzen, Grisham and Other Prominent Authors Sue OpenAI
The New York Times. September 20, 2023. By Alexandra Alter and Elizabeth A. Harris.
These 183,000 Books Are Fueling the Biggest Fight in Publishing and Tech
The Atlantic. September 25, 2023. By Alex Reisner.
New AI systems collide with copyright law
BBC News. 1 August 2023. By Suzanne Bearne.
As Fight Over A.I. Artwork Unfolds, Judge Rejects Copyright Claim
The New York Times. Aug. 21, 2023. By Zachary Small.
Sarah Silverman Sues OpenAI and Meta Over Copyright Infringement
The New York Times. July 10, 2023. By Zachary Small.
Meta Lawsuit
AI art tools Stable Diffusion and Midjourney targeted with copyright lawsuit
The Verge. Jan 16, 2023. By James Vincent.

Scholarly Publishing

Generative AI Licensing Agreement Tracker
By Ithaka S+R, tracks publisher deals licensing their scholarly content as LLM training data. Includes publisher, purchaser, deal type and size, impact and strategy.
Oxford University Press ‘Actively Working’ With AI Companies
By Kathryn Palmer. Inside Higher Ed. August 5, 2024.
Two Major Academic Publishers Signed Deals With AI Companies. Some Professors Are Outraged
The Chronicle of Higher Education. July 29, 2024. By Christa Dutton.
AG Recommends Clause in Publishing and Distribution Agreements Prohibiting AI Training Uses
The Author's Guild. March 1, 2023.

Cloaking and data poisoning

Glaze
An "image cloaking" tool meant to help artists prevent generative AI from mimicking styles from their posted artworks. By SAND Lab at UChicago.
Nightshade
A tool that turns any image into a data sample that is unsuitable for model training. Nightshade transforms images into "poison" samples, so that models training on them without consent will see their models learn unpredictable behaviors that deviate from expected norms. By SAND Lab at UChicago.
This new data poisoning tool lets artists fight back against generative AI
MIT Technology Review, October 23, 2023. By Melissa Heikkilä.

Environmental Costs

Training and using generative AI models requires a lot of energy, increasing emissions, causing strain on electrical grids, and consuming drinking water. Researchers are working on ways to assess energy usage and environmental impacts, but companies often do not disclose this information. Accelerations in AI development have caused an exponential rise in energy demand from data centers, generating a corresponding rise in emissions as centers draw from fossil fuel-reliant grids.

Google Environmental Report 2024. Google. July 2024.

Google Environmental Report 2023. Google. July 2023

Microsoft 2024 Environmental Sustainability Report. Microsoft. n.d.

We did the math on AI’s energy footprint. Here’s the story you haven’t heard.
MIT Technology Review. May 20, 2025. By James O'Donnell and Casey Crownhart.
AI Is Draining Water From Areas That Need It Most.
May 8, 2025. Bloomberg. By Leonardo Nicoletti, Michelle Ma, and Dina Bass.
In the shadows of Arizona's data center boom, thousands live without power
The Washington Post. December 23, 2024. By Pranshu Verma.
A bottle of water per email: the hidden environmental costs of using AI chatbots
September 18, 2024. The Washington Post. By Pranshu Verma and Shelly Tan.
Data center emissions probably 662% higher than big tech claims
September 15, 2024. The Guardian. By Isabel O'Brien.
AI brings soaring emissions for Google and Microsoft, a major contributor to climate change
July 12, 2024. NPR. By Dara Kerr
AI Is Taking Water From the Desert
March 1, 2024. The Atlantic. By Karen Hao.
Generative AI’s environmental costs are soaring — and mostly secret
Nature. February 20, 2024. By Kate Crawford.
Artificial intelligence technology behind ChatGPT was built in Iowa — with a lot of water
September 9, 2023. AP News. By Matt O’Brien and Hannah Fingerhut.
Training a Single AI Model Can Emit as Much Carbon as Five Cars in Their Lifetimes
June 6, 2019. MIT Technology Review. By Karen Hao.
We’re Getting a Better Idea of AI’s True Carbon Footprint
November 14, 2022. MIT Technology Review. By Melissa Heikkilä.