Make sure you understand whether and how you're allowed to use generative AI, including at which stages of your research and how to cite or indicate your use.
Be intentional about using generative AI: consider how its use may affect your learning and engagement with the research process. You are responsible for your own work and should be able to explain the decisions you make about your process. Asking yourself questions about the values, benefits, and costs of use can help clarify your reasons.
Generative AI tools vary in terms of privacy, security, and accessibility. Before using any tool, make sure you understand how it handles your data and what risks are involved.
Most generative AI tools and platforms retain your chat/prompt history by default, and many use your inputs to train their models. Your inputs (including uploads) may also be reviewed for quality control or abuse prevention.
It's best not to share anything with a generative AI tool that you don't want to become public or to appear in future generative AI outputs.
Don't share private, sensitive, or personally identifiable information (PII) about yourself or others. This includes lecture notes, slides, and audio from your faculty. Consult College guidelines prior to use.
Some generative AI tools can search the Internet or custom databases using a technique called Retrieval-Augmented Generation (RAG). These tools work differently from library-database searches.
Library-database searches identify keywords and phrases in your search, then match your terms with indexed sources using ranking algorithms based on a variety of factors to determine relevance. They may also use natural language processing to identify context and relationships for key terms and sources.
You can often sort the results by criteria like date, relevance, or location; filter based on aspects like source type; and choose which sources to explore further from a list of results. You can also try different keywords, or combinations of keywords, and observe how this affects your search results.
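To make the contrast concrete, keyword matching can be sketched in a few lines of Python. The records and the overlap-counting score below are invented for illustration; real library databases index full records and use much richer ranking algorithms.

```python
# Minimal sketch of keyword-based ranking, the approach library databases
# build on. The records and scoring here are made up for illustration.
records = [
    {"title": "Coral reef ecology", "keywords": {"coral", "reef", "ecology", "ocean"}},
    {"title": "Urban planning basics", "keywords": {"city", "planning", "zoning"}},
    {"title": "Ocean acidification", "keywords": {"ocean", "acid", "chemistry"}},
]

query_terms = {"coral", "ocean"}

# Score each record by how many query terms appear in its indexed
# keywords, then sort so the best matches come first.
results = sorted(
    records,
    key=lambda r: len(query_terms & r["keywords"]),
    reverse=True,
)

for r in results:
    print(r["title"], len(query_terms & r["keywords"]))
```

Because the match is on literal terms, changing your keywords changes which records score well, which is why trying different keyword combinations is a useful search strategy.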
Generative AI tools with RAG take a different approach to search. First, the tool converts your query into a vector representation (imagine a "map" of concepts, with related concepts closer to each other). It then retrieves information based on similarity, or closeness in meaning, rather than exact keyword matches. The retrieved information is passed to a generative model, which synthesizes a coherent response, often including links to the retrieved sources.
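The retrieval step on that concept "map" can be sketched with toy vectors. The documents and embedding values below are invented for illustration; real tools use learned embedding models that map text to vectors with hundreds or thousands of dimensions.

```python
# Minimal sketch of the retrieval step in RAG, using made-up
# three-dimensional "embedding" vectors for illustration.
import math

def cosine_similarity(a, b):
    """Closeness in meaning: 1.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Pretend each document has already been placed on the concept map
# (i.e., mapped to an embedding vector).
documents = {
    "doc_climate": [0.9, 0.1, 0.0],
    "doc_oceans":  [0.7, 0.3, 0.1],
    "doc_poetry":  [0.0, 0.2, 0.9],
}

query_vector = [0.8, 0.2, 0.0]  # embedding of the user's question

# Rank documents by closeness in meaning to the query, not by
# keyword overlap; the top results are handed to the generative model.
ranked = sorted(
    documents,
    key=lambda name: cosine_similarity(query_vector, documents[name]),
    reverse=True,
)

print(ranked)
```

Note that this ranking rewards similarity only: a document can score highly while being outdated or unreliable, which is why the caveats below matter.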
Errors can happen at each stage of this translation and synthesis process, and these models can still "hallucinate" (generate inaccurate information). It may be difficult or impossible to determine how the models are interpreting your specific search terms or your question. Further, generative AI with RAG prioritizes similarity to your query, not necessarily source authority, timeliness, or other markers of quality.
The quality of outputs is highly dependent on the quality of sources in the database. It is important to directly check the sources referenced in any AI-generated summary. You should also consider whether the database (if known) is broad enough to contain all of the information you may need, or if you should supplement with specific library-database searches in your field of interest.
Sources
Image: Retrieval-Augmented Generation Workflow, from Monigatti, Leonie. "Retrieval-Augmented Generation (RAG): From Theory to LangChain Implementation." Towards Data Science, November 14, 2023.
Merritt, Rick. "What Is Retrieval-Augmented Generation, aka RAG?" NVIDIA Blog, November 15, 2023.
Martineau, Kim. "What Is Retrieval-Augmented Generation?" IBM Blog, August 22, 2023.
Lewis, Patrick, et al. "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks." Advances in Neural Information Processing Systems, 6-12 December 2020, online.
Generative AI is a tool, not a source: it is built on prediction, not on evaluating or creating information as a human author does. You should always evaluate the sources provided by a generative AI tool directly.
Keep a critical perspective when evaluating the outputs of any generative AI tool. Consider the following aspects:
Generative AI has been shown to reproduce and amplify the social biases present in the datasets it was trained on. If the model doesn't have access to certain communities of knowledge, certain practices, or high-quality datasets in specific languages, it may misrepresent the communities and cultures underrepresented in its data. It may also reproduce harmful social stereotypes or associations in its outputs.
Generative AI tools may hallucinate or confabulate, which means they produce inaccurate information. Despite model improvements, this problem may be impossible to eliminate entirely. Because the outputs sound persuasive and coherent, it can be difficult to recognize inaccurate information if you don't have background knowledge in the subject area. Generative AI may also have difficulty with context and nuance when summarizing or linking information from multiple sources, misattributing information or misrepresenting content as a result.
Generative AI tools often aren't transparent about their datasets, how they retrieve and use sources to generate responses (often simplified for end users), or their methods for processing inputs and responding to different prompts. Tools also have "system prompts" embedded in every input that end users do not see. Consider whether you understand well enough how a tool works to determine whether it's a good match for your goals, process, and values.
If you’re unsure of how to evaluate generative AI outputs in your research process, research librarians can help!
You should indicate when you've used an AI tool at any stage of your research process.
Depending on your citation style, you may use a citation, a note, or an in-text acknowledgement to indicate AI use.
Sources cited by AI: When an AI tool mentions a source, you should always check that source yourself and cite it directly. Generative AI tools can fabricate citations and can misrepresent the information within real sources.
When using generative AI tools, you'll want to capture all the information you might need for citation as you go.
Saving your prompts and the outputs is especially important because generative AI tools can provide different outputs in response to the same prompts.
Zotero does not have an item type for "generative AI." Currently the best practice is to use the "Software" item type and experiment with fields according to your style guide requirements.
Sources: Adapted from the MIT Libraries Citing AI Tools guide, the Brown University Library Citation and Attribution with AI Tools guide, and the Harvard Library Citing Generative AI guide.