Concerns About Blindly Trusting AI Chatbots for Research

Large language models (LLMs), such as ChatGPT and Bard, have captured the imagination of the public with their ability to generate human-like text. However, a new study by researchers from the Oxford Internet Institute cautions against placing blind trust in these AI chatbots for scientific research. The researchers argue that false outputs from LLMs could pose a threat to the integrity of scientific knowledge.

According to Brent Mittelstadt, director of research at the Oxford Internet Institute, the design of LLMs contributes to users' unwarranted trust in their responses. These models are designed to sound confident and helpful, leading users to believe that their answers are accurate. However, LLMs are not infallible and can produce false information, sometimes referred to as “hallucinations,” that may appear convincing to users.

One reason researchers should not blindly trust LLMs is that misleading outputs can stem directly from the training data. LLMs rely on datasets scraped from the internet, which can contain false statements, opinions, creative writing, and jokes, and this can produce incorrect outputs that users nonetheless perceive as accurate. Furthermore, the datasets used to train LLMs are often kept secret, making it difficult to scrutinize their sources.

Mittelstadt highlights that the concern is not only blatant hallucinations but also outputs that are subtly wrong or biased and require domain expertise to spot. For example, LLMs may fabricate references to scientific articles or misrepresent what cited papers actually say. As a result, researchers should treat LLMs as unreliable research assistants and fact-check their outputs.

As a solution, the researchers propose using large language models as “zero-shot translators” rather than relying on them as knowledge bases. This means feeding the model inputs that already contain reliable information or data, along with a request to perform a specific task on that material. Used this way, LLMs can rewrite texts in more accessible language or convert data from one format to another without being asked to supply facts of their own, supporting more responsible use of these models.
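To make the “zero-shot translator” pattern concrete, here is a minimal sketch. It assumes a hypothetical call_llm() helper standing in for whichever chat API a researcher actually uses; the point is the prompt structure, in which all factual content is supplied by the user and the model is only asked to transform it.

```python
# Sketch of the "zero-shot translator" pattern: the model receives trusted
# source text plus a narrow transformation task, rather than being asked to
# recall facts from its training data. `call_llm` is a hypothetical helper.

def call_llm(prompt: str) -> str:
    """Placeholder: send `prompt` to an LLM endpoint and return its reply."""
    raise NotImplementedError("Wire this to the chat API of your choice.")

def rewrite_for_lay_readers(trusted_abstract: str) -> str:
    # All factual content comes from the supplied abstract; the model is only
    # asked to rephrase it, not to add new claims.
    prompt = (
        "Rewrite the following abstract in plain language for a general "
        "audience. Use only the information given; do not add facts.\n\n"
        f"Abstract:\n{trusted_abstract}"
    )
    return call_llm(prompt)

def convert_records_to_csv(trusted_records: str) -> str:
    # Format translation: reliable data in, the same data out in another shape.
    prompt = (
        "Convert the following records into CSV with a header row. "
        "Do not invent or omit any values.\n\n"
        f"Records:\n{trusted_records}"
    )
    return call_llm(prompt)
```

Even with prompts like these, the study's advice still applies: the transformed output should be checked against the original source before it is used.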

The concerns raised by the Oxford researchers are not unique. Leading scientific publication Nature has implemented safeguards to address the use of LLMs in scientific research. It prohibits LLM tools from being credited as authors on research papers, considering it a liability issue. Additionally, authors are required to disclose the use of large language models in their papers.

The study emphasizes the importance of responsible usage of LLMs in the scientific community. While these models have great potential, researchers must exercise caution and critically evaluate the outputs they receive. As Sandra Wachter, a co-author of the study, warns, if LLMs are used to generate and disseminate scientific articles without proper scrutiny, it could result in serious harm to the credibility of scientific knowledge.

In the rapidly evolving landscape of AI technology, it is crucial to strike a balance between utilizing the capabilities of LLMs and ensuring the integrity of scientific research. This requires researchers to employ critical thinking, skepticism, and rigorous fact-checking when relying on AI chatbots like ChatGPT and Bard as tools in their work.


Written By

Jiri Bílek

In the vast realm of AI and U.N. directives, Jiri crafts tales that bridge tech divides. With every word, he champions a world where machines serve all, harmoniously.