Generative AI

Appropriate Use

Always talk with your professor and review the relevant policies (e.g., syllabus statements, honor code, publisher policies) before using generative AI tools or features in your work.


Make sure you understand whether and how you're allowed to use generative AI, including at which stages of your research and how to cite or indicate your use.

Be intentional about using generative AI, which means considering how it may affect your learning and engagement with the research process. You are responsible for your own work, and should be able to explain the decisions you make about your process. Asking yourself questions about the values, benefits, and costs of use can help clarify your reasons.

Privacy, security, accessibility

Generative AI tools vary in terms of privacy, security and accessibility. Before using any tool, make sure you understand how it handles your data and any risks involved.

Most generative AI tools and platforms retain your chat/prompt history by default and many use your inputs to train their models. Your input (including uploads) may also be reviewed for quality control or abuse prevention.

  • You may be able to opt out of data collection. Check each tool's policies and data settings for specific instructions.

Protecting Privacy & Intellectual Property

It's best not to share anything with a generative AI tool that you don’t want to become public, or that you don’t want used in future generative AI outputs.

Don’t share private, sensitive, or personally identifiable information (PII) about you or others. This includes lecture notes, slides, or audio from your faculty. Consult College guidelines prior to use.

Generative AI vs. Library Search

Generative AI text tools draw from their training data and statistical associations to create new text, attempting to provide plausible, coherent outputs by predicting the most likely sequence of words in a given response. They do not "read" or "understand" sources. As a result, they may generate "fake citations," or references to sources that do not exist.

 

Some generative AI tools can search the Internet or custom databases using a technique called Retrieval-Augmented Generation (RAG). These tools work differently from library-database searches.

Library-database searches

Library-database searches identify keywords and phrases in your search, then match your terms with indexed sources using ranking algorithms based on a variety of factors to determine relevance. They may also use natural language processing to identify context and relationships for key terms and sources.

You can often sort the results by criteria like date, relevance, or location; filter based on aspects like source type; and choose which sources to explore further from a list of results. You can also try different keywords, or combinations of keywords, and observe how this affects your search results.
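As a rough illustration of the keyword matching, filtering, and sorting described above, here is a minimal Python sketch. The records, the field names, and the word-count scoring are all assumptions made up for this example. Real library databases use far more sophisticated indexing and ranking.

```python
from dataclasses import dataclass


@dataclass
class Record:
    title: str
    source_type: str  # e.g. "article" or "book" (hypothetical field)
    year: int


# Hypothetical catalog records, invented for this example.
CATALOG = [
    Record("Climate policy and urban planning", "article", 2021),
    Record("Urban planning: a history", "book", 2015),
    Record("Climate models explained", "article", 2019),
    Record("Gardening in small spaces", "book", 2022),
]


def keyword_search(query, records, source_type=None):
    """Return (score, record) pairs for records that share a term with the query."""
    terms = set(query.lower().split())
    results = []
    for record in records:
        if source_type is not None and record.source_type != source_type:
            continue  # filter: skip records of the wrong source type
        words = record.title.lower().replace(":", " ").split()
        score = sum(1 for word in words if word in terms)  # count matching words
        if score > 0:
            results.append((score, record))
    # Sort by match count first, then by most recent year.
    results.sort(key=lambda pair: (pair[0], pair[1].year), reverse=True)
    return results


hits = keyword_search("climate urban planning", CATALOG)
articles_only = keyword_search("climate urban planning", CATALOG, source_type="article")
for score, record in hits:
    print(score, record.year, record.title)
```

Changing the query terms or the `source_type` filter changes the result list in a way you can trace directly, which is the transparency the paragraph above describes.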

Generative AI with RAG

Generative AI tools with RAG use a different approach to search. First, the model translates your query into a vector-based search (imagine a "map" of concepts, with related concepts closer to each other). This search retrieves information based on similarity, or closeness in meaning. The retrieved information is then passed to a generative model, which synthesizes a coherent response, often including links to retrieved sources.
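The retrieval and augmentation steps can be sketched in a few lines of Python. This is a toy example under stated assumptions: the three-number "embeddings" are hand-written rather than produced by a trained embedding model, the passages are invented, and the function names are hypothetical. The final generation step is left out; in a real tool the assembled prompt would be passed to a large language model.

```python
import math

# Hypothetical 3-dimensional "embeddings", hand-written for illustration.
# Real systems use a trained embedding model with many more dimensions.
DOCUMENTS = {
    "Sea levels are rising as polar ice melts.": [0.9, 0.1, 0.0],
    "Solar panels convert sunlight into electricity.": [0.2, 0.9, 0.1],
    "Sourdough bread needs a long fermentation.": [0.0, 0.1, 0.9],
}


def cosine_similarity(a, b):
    """Similarity of two vectors by angle: close to 1.0 means very similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query_vector, documents, k=1):
    """Return the k passages whose vectors are most similar to the query vector."""
    ranked = sorted(
        documents,
        key=lambda text: cosine_similarity(query_vector, documents[text]),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, passages):
    """Augment the question with the retrieved passages as context."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


question = "Why are oceans getting higher?"
query_vector = [0.8, 0.2, 0.1]  # assumed embedding of the question
top_passages = retrieve(query_vector, DOCUMENTS, k=1)
prompt = build_prompt(question, top_passages)
print(prompt)
```

Note that the retrieved passage shares no keywords with the question. It is selected only because its assumed vector lies closest to the query vector, and nothing in the ranking considers who wrote the passage or when.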

Diagram of a Retrieval-Augmented Generation (RAG) system with three steps: (1) Convert query via embedding for vector database search, (2) Augment the query with retrieved context, and (3) Generate a response using a large language model (LLM).

Errors can happen at each stage of this translation and synthesis process, and these models can still “hallucinate” (generate inaccurate information). It may be difficult or impossible to determine how the models are interpreting your specific search terms or your question. Further, generative AI with RAG prioritizes similarity to your query, not necessarily source authority, timeliness, or other elements.

The quality of outputs is highly dependent on the quality of sources in the database. It is important to directly check the sources referenced in any AI-generated summary. You should also consider whether the database (if known) is broad enough to contain all of the information you may need, or if you should supplement with specific library-database searches in your field of interest.


Sources
Image: Retrieval-Augmented Generation Workflow, from “Retrieval-Augmented Generation (RAG): From Theory to LangChain Implementation,” November 14, 2023. Towards Data Science. By Leonie Monigatti.

What Is Retrieval-Augmented Generation, aka RAG? NVIDIA blog. November 15, 2023. By Rick Merritt.

What is retrieval-augmented generation? IBM blog. August 22, 2023. By Kim Martineau.

Lewis, Patrick, et al. "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks." Advances in Neural Information Processing Systems, 6-12 December 2020, Online, 2020.

Evaluating Generative AI Outputs

Generative AI is a tool, not a source: it is built on prediction, and it does not evaluate or create information as a human author does. You should always evaluate the sources provided by a generative AI tool directly.

 

Keep a critical perspective when evaluating the outputs of any generative AI tool. Consider the following aspects:

Bias

Generative AI has been shown to reproduce and amplify the social biases present in the datasets it was trained on. If the model doesn’t have access to certain communities of knowledge, practices, or high-quality datasets in specific languages, it may misrepresent communities and cultures underrepresented in its data. It may also reproduce harmful social stereotypes or associations in outputs.

Accuracy

Generative AI tools may hallucinate or confabulate, which means they produce inaccurate information. Despite model improvements, this problem may be impossible to totally eliminate. Since the outputs sound persuasive and coherent, it can be difficult to recognize inaccurate information if you don’t have background knowledge in the subject area. Generative AI tools might also have difficulty with context and nuance when summarizing or linking information from multiple sources, and may misattribute information or misrepresent content as a result.

Transparency

Generative AI tools often aren’t transparent about their datasets, how they use or retrieve sources to generate responses (a process that is often simplified for end users), or their methods for processing inputs and responding to different prompts. Tools also have “system prompts” that are embedded in every input and that end users do not see. Consider whether you understand how a tool works well enough to determine if it’s a good match for your goals, process, and values.


If you’re unsure of how to evaluate generative AI outputs in your research process, research librarians can help!

Citing Generative AI

Always check with your professor before using generative AI for coursework: faculty may have different policies across types of assignments or in different fields of study.

When to cite

You should indicate when you've used an AI tool in any of the following processes:

  • gathering information
  • writing text
  • editing text
  • synthesizing ideas
  • cleaning or manipulating data

Depending on your citation style, you may use a citation, a note, or an in-text acknowledgement to indicate AI use.

Sources cited by AI: When an AI tool mentions a source, you should always check that source yourself and cite it directly. Generative AI tools can create fake citations and also misrepresent the information within real sources.


Citation guidance by style guides


Information to save

When using generative AI tools, you'll want to capture all the information you might need for citation. This includes:

  • name and version of the tool (e.g., ChatGPT 4o)
  • time and date of usage
  • your prompt
  • output
  • any follow up prompts and outputs
  • name of the user

Saving your prompts and the outputs is especially important because generative AI tools can provide different outputs in response to the same prompts.
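If you prefer to keep these details in a structured file, one possible approach is sketched below in Python. The class name, field names, and sample values are assumptions made up for this example, not a required or standard format. Check your style guide for what it actually needs.

```python
import json
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone


@dataclass
class AIUsageRecord:
    """One logged interaction with a generative AI tool (hypothetical format)."""
    tool_name: str
    tool_version: str
    user_name: str
    prompt: str
    output: str
    # Time and date of usage, recorded automatically in UTC.
    used_at: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())
    # Any follow-up prompts and outputs, stored as a list of dictionaries.
    follow_ups: list = field(default_factory=list)

    def add_follow_up(self, prompt, output):
        self.follow_ups.append({"prompt": prompt, "output": output})

    def to_json(self):
        return json.dumps(asdict(self), indent=2)


# Sample values are placeholders; paste in your own prompts and outputs.
record = AIUsageRecord(
    tool_name="ChatGPT",
    tool_version="4o",
    user_name="A. Student",
    prompt="Suggest search keywords for research on urban heat islands.",
    output="(paste the tool's full response here)",
)
record.add_follow_up("Narrow these to public health.", "(paste the follow-up response here)")
saved = record.to_json()
print(saved)
```

The resulting text can be saved alongside your research notes, so the exact prompts and outputs remain available even if the tool later responds differently to the same prompt.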

Zotero

Zotero does not have an item type for "generative AI." Currently the best practice is to use the "Software" item type and experiment with fields according to your style guide requirements.


Sources: Adapted from the MIT Libraries Citing AI Tools guide, the Brown University Library Citation and Attribution with AI Tools guide, and the Harvard Library Citing Generative AI guide.