Skip to Main Content Research Guides | Library | Amherst College

Generative AI

Ethics and Costs of Generative AI

"Technologies are not neutral, and neither are the societies into which they are introduced."

-Civics of Technology


Generative AI is implicated in a host of ethical issues and social costs, including:

  • bias, misrepresentation, and marginalization
  • labor exploitation and worker harms
  • misinformation and disinformation
  • privacy violations and data extraction
  • copyright and authorship issues
  • environmental costs

Scholars across multiple fields have pointed out the ways that AI discourse and development has been dominated by large corporate interests, with a focus on hypothetical benefits and risks rather than current, real world impacts.

Featured Reading


This page provides starting resources to learn more about these issues as generative AI technologies continue to develop.

Bias, Misrepresentation, Marginalization

Generative AI tools can exhibit bias, and bias can happen at different stages of development.

How Bias Happens:

  • Datasets: if datasets used for training generative AI models misrepresent, underrepresent, exclude, or marginalize certain social identities, communities, and practices, the models will reflect and often amplify these biases
  • Design choices: bias can also become embedded in the design of generative AI products through
    • design goals
    • assumptions about who users are ("imagined users")
    • contexts that tools are designed for
    • how tools are evaluated

Bias in generative AI is not a new issue, but rather a continuation of problems within machine learning and algorithmic system development.

Featured Reading

Large Language Models (LLMs)

Image generation and classification

Labor Exploitation, Worker Harms

Generative AI training and improvement depends upon human labor in multiple ways, directly and indirectly. Indirectly, these systems use massive datasets of materials scraped from the internet - materials created by humans. Directly, training and improving these models requires humans to review and rate output. Machine generated output can include depictions of violence, self-harm, abuse, and other traumatizing content. Companies often employ people as contract workers to rate and review materials, and these positions are often low wage, high pressure, and precarious.

Misinformation and Disinformation

Generative AI is being used to create manipulated and entirely faked text, video, images, and audio, sometimes featuring prominent politicians and celebrities. These tools make it easier for bad actors to create persuasive, customized disinformation at scale. They may also reproduce false claims and other misinformation, despite safety guardrails. Digital watermarking and automated detection systems are insufficient on their own, as these can be bypassed in various ways.

Generative AI may also provide factually inaccurate outputs, generate "fake citations," or misrepresent information in other sources.

As AI models improve, it is increasingly difficult to tell the difference between images of real people and AI-generated images. AI-powered image manipulation tools are also being built into the latest generations of smartphones, with broad implications for fact-checking and navigating social media.

Privacy and Data Extraction

Privacy and data ownership are implicated in the development of generative AI models, by scraping large datasets from the web that contain personal information, as well as via user interactions with these tools. Privacy policies vary across tools, and some may take user input (text, images, etc.) and use them to further train models or provide future outputs. There are also indications that large language models can infer user characteristics like age, sex, income, and location from fairly innocuous text inputs.

Recently, researchers have been using various methods (data extraction attacks) to prompt systems like ChatGPT to reveal training data directly.

Researchers have also discovered ways to remove safety guardrails, so that AI models provide harmful information. Some of these methods can be used in attacks to provide "hidden instructions" to models, with huge security implications for users.

Copyright and Authorship

Generative AI models are trained on large datasets crawled from the web, including artworks, performances, books, essays, and other materials created by humans. These creators were not notified that their work was being ingested and were not provided an opportunity to refuse. Further, some of these datasets may have been scraped from illegal collections of copyrighted works. There have been several lawsuits alleging copyright infringement against OpenAI, Meta, and other companies.

Whether AI-generated works can be copyrighted is also an evolving issue, with cases being filed and appealed in court.

It's important to note that legal frameworks and ethical frameworks related to copyright and authorship are slightly different lenses. Legal arguments around copyright may not (and may not be able to) address the broader ethical implications of machine systems harvesting human creative and knowledge works at scale.


Background information

Legal challenges and issues

Scholarly Publishing

Cloaking and data poisoning

Environmental Costs

Training and using generative AI models can require a lot of energy, increasing emissions and consuming drinking water. Researchers are working on ways to assess the environmental impacts and reduce energy usage, but this can be difficult as some companies do not disclose this information in detail.