Skip to Main Content Research Guides | Library | Amherst College

Generative AI

Ethics and Costs of Generative AI

"Technologies are not neutral, and neither are the societies into which they are introduced."

-Civics of Technology


Generative AI is implicated in a host of ethical issues and social costs, including:

  • bias, misrepresentation, and marginalization
  • labor exploitation and worker harms
  • misinformation and disinformation
  • privacy violations and data extraction
  • copyright and authorship issues
  • environmental costs

Scholars across multiple fields have pointed out the ways that AI discourse and development has been dominated by large corporate interests, with a focus on hypothetical benefits and risks rather than current, real world impacts.

Featured Reading


This page provides starting resources to learn more about these issues as generative AI technologies continue to develop.

Bias, Misrepresentation, Marginalization

Generative AI tools can exhibit bias, and bias can happen at different stages of development.

How Bias Happens:

  • Datasets: if datasets used for training generative AI models misrepresent, exclude, or marginalize certain social identities, communities, and practices, the models will reflect and often amplify these biases
  • Design choices: bias can also become embedded in the design of generative AI products through
    • design goals
    • assumptions about who users are ("imagined users")
    • the contexts that tools are designed for
    • how tools are evaluated

Bias in generative AI is not a new issue, but rather a continuation of problems within machine learning and algorithmic system development.

Featured Reading

Large Language Models (LLMs)

We read the paper that forced Timnit Gebru out of Google. Here’s what it says. MIT Technology Review. December 4, 2020. By Karen Hao.

OpenAI Chatbot Spits Out Biased Musings, Despite Guardrails. Bloomberg. December 8, 2022. By Davey Alba.

Quantifying ChatGPT’s gender bias. AI Snake Oil. Apr 26, 2023. By Sayash Kapoor and Arvind Narayanan.

Abid, Abubakar, Maheen Farooqi, and James Zou. Large Language Models Associate Muslims with Violence. Nature Machine Intelligence 3, no. 6 (June 1, 2021): 461–63. doi:10.1038/s42256-021-00359-2.

Shikha Bordia and Samuel R. Bowman. 2019. Identifying and Reducing Gender Bias in Word-Level Language Models. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop, pages 7–15, Minneapolis, Minnesota. Association for Computational Linguistics.

Emily Sheng, Kai-Wei Chang, Premkumar Natarajan, and Nanyun Peng. 2019. The Woman Worked as a Babysitter: On Biases in Language Generation. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 3407–3412, Hong Kong, China. Association for Computational Linguistics.
 

Moin Nadeem, Anna Bethke, and Siva Reddy. 2021. StereoSet: Measuring stereotypical bias in pretrained language models. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 5356–5371, Online. Association for Computational Linguistics.

Image generation and classification

These fake images reveal how AI amplifies our worst stereotypes. The Washington Post. Nov. 1, 2023. By Nitasha Tiku, Kevin Schaul and Szu Yu Chen.

Humans Are Biased. Generative AI Is Even Worse. Bloomberg Technology + Equality. 2023. By Leonardo Nicoletti and Dina Bass.

How AI reduces the world to stereotypes. Rest of World. October 10, 2023. By Victoria Turk.

Black Artists Say A.I. Shows Bias, With Algorithms Erasing Their History. The New York Times. July 4, 2023. By Zachary Small.

AI was asked to create images of Black African docs treating white kids. How'd it go?. NPR: Goats and Soda. October 6, 2023. By Carmen Drahl.

These new tools let you see for yourself how biased AI image models are. MIT Technology Review. March 22, 2023. By Melissa Heikkilä.

Birhane, Abeba et al. Multimodal datasets: misogyny, pornography, and malignant stereotypes. ArXiv abs/2110.01963 (2021).

Study finds gender and skin-type bias in commercial artificial-intelligence systems. MIT News. February 11, 2018. By Larry Hardesty.

Labor Exploitation, Worker Harms

Generative AI training and improvement depends upon human labor in multiple ways, directly and indirectly. Indirectly, these systems use massive datasets of materials scraped from the internet - materials created by humans. Directly, training and improving these models requires humans to review and rate output. Machine generated output can include depictions of violence, self-harm, abuse, and other traumatizing content. Companies often employ people as contract workers to rate and review materials, and these positions are often low wage, high pressure, and precarious.

Misinformation and Disinformation

Generative AI is being used to create manipulated and entirely faked text, video, images, and audio, sometimes featuring prominent politicians and celebrities. These tools make it easier for bad actors to create persuasive, customized disinformation at scale. They may also reproduce false claims and other misinformation, despite safety guardrails. Digital watermarking and automated detection systems are insufficient on their own, as these can be bypassed in various ways.

Generative AI may also provide factually inaccurate outputs, generate "fake citations," or misrepresent information in other sources.

As AI models improve, it is increasingly difficult to tell the difference between images of real people and AI-generated images. AI-powered image manipulation tools are also being built into the latest generations of smartphones, with broad implications for fact-checking and navigating social media.

Privacy and Data Extraction

Privacy and data ownership are implicated in the development of generative AI models, by scraping large datasets from the web that contain personal information, as well as via user interactions with these tools. Privacy policies vary across tools, and some may take user input (text, images, etc.) and use them to further train models or provide future outputs. There are also indications that large language models can infer user characteristics like age, sex, income, and location from fairly innocuous text inputs.

Recently, researchers have been using various methods (data extraction attacks) to prompt systems like ChatGPT to reveal training data directly.

Researchers have also discovered ways to remove safety guardrails, so that AI models provide harmful information. Some of these methods can be used in attacks to provide "hidden instructions" to models, with huge security implications for users.

AI chatbots can be tricked into misbehaving. Can scientists stop it? ScienceNews. February 1, 2024. By Emily Conover.

Extracting Training Data from ChatGPT, summary and announcement of recent pre-print:

Your Personal Information Is Probably Being Used to Train Generative AI Models. Scientific American. October 19, 2023. By Lauren Leffer.

Beyond Memorization: Violating Privacy via Inference with Large Language Models. Privacy inference game, developed by SRILab. Based on pre-print by Robin Staab, Mark Vero, Mislav Balunović, and Martin Vechev.

Copyright and Authorship

Generative AI models are trained on large datasets crawled from the web, including artworks, performances, books, essays, and other materials created by humans. These creators were not notified that their work was being ingested and were not provided an opportunity to refuse. Further, some of these datasets may have been scraped from illegal collections of copyrighted works. There have been several lawsuits alleging copyright infringement against OpenAI, Meta, and other companies.

Whether AI-generated works can be copyrighted is also an evolving issue, with cases being filed and appealed in court.

It's important to note that legal frameworks and ethical frameworks related to copyright and authorship are slightly different lenses. Legal arguments around copyright may not (and may not be able to) address the broader ethical implications of machine systems harvesting human creative and knowledge works at scale.


Background information

Generative AI Legal Explainer for non-experts, featuring simplified responses to complex legal questions. Developed by Knowing Machines, a research project tracing the histories, practices, and politics of machine learning systems.

Copyright and Artificial Intelligence. U.S. Copyright Office.

The Office is undertaking a study of the copyright law and policy issues raised by generative AI and is assessing whether legislative or regulatory steps are warranted. The Office will use the record it assembles to advise Congress; inform its regulatory work; and offer information and resources to the public, courts, and other government entities considering these issues.


Legal challenges and issues

We Asked A.I. to Create the Joker. It Generated a Copyrighted Image. The New York Times. January 25, 2024. By Stuart A. Thompson.

Generative AI’s end-run around copyright won’t be resolved by the courts. AI Snake Oil. January 22, 2024. By Arvind Narayanan and Sayash Kapoor.

Generative AI Has a Visual Plagiarism Problem. IEEE Spectrum. January 6, 2024. By Gary Marcus and Reid Southen.

The Times Sues OpenAI and Microsoft Over A.I. Use of Copyrighted Work. The New York Times. December 27, 2023. By Michael M. Grynbaum and Ryan Mac.

Franzen, Grisham and Other Prominent Authors Sue OpenAI. The New York Times. September 20, 2023. By Alexandra Alter and Elizabeth A. Harris.

These 183,000 Books Are Fueling the Biggest Fight in Publishing and Tech. The Atlantic. September 25, 2023. By Alex Reisner.

New AI systems collide with copyright law. BBC News. 1 August 2023. By Suzanne Bearne.

As Fight Over A.I. Artwork Unfolds, Judge Rejects Copyright Claim. The New York Times. Aug. 21, 2023. By Zachary Small.

Sarah Silverman Sues OpenAI and Meta Over Copyright Infringement. The New York Times. July 10, 2023. By Zachary Small.

AI art tools Stable Diffusion and Midjourney targeted with copyright lawsuit. The Verge. Jan 16, 2023. By James Vincent.


Cloaking and data poisoning

Glaze by SAND Lab at UChicago - an "image cloaking" tool meant to help artists prevent generative AI from mimicking styles from their posted artworks.

  • "At a high level, Glaze works by understanding the AI models that are training on human art, and using machine learning algorithms, computing a set of minimal changes to artworks, such that it appears unchanged to human eyes, but appears to AI models like a dramatically different art style."

Nightshade (under development)

 

Environmental Costs

Training and using generative AI models can require a lot of energy, resulting in large carbon footprints and consuming drinking water. Researchers are working on ways to assess the environmental impacts and reduce energy usage, but this can be difficult as some companies do not disclose this information in detail.