AI Officially Passes the Turing Test, Landmark Study Shows

LLMs break a 76-year-old benchmark and are nearly indistinguishable from humans.

Posted May 26, 2026 | Reviewed by Monica Vilhauer Ph.D.

An important scientific benchmark that has lasted for over seven decades has been broken by artificial intelligence (AI). A new study published in the Proceedings of the National Academy of Sciences (PNAS) shows that large language models (LLMs) can pass the Turing test, making AI hard to distinguish from humans in short text conversations.

“The results imply current AI systems can effectively imitate people in short interactions, while also raising questions about how effective the test is as a measure of intelligence,” wrote the University of California San Diego (UCSD) study co-authors Cameron Jones and Benjamin Bergen.

Chances are high that the average person has encountered a relative of the Turing test many times when visiting websites. When a website security check asks the user to click on every image in which a bicycle appears, or to type a string of distorted letters and symbols, that CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is a reversed form of the Turing test, in which a computer, rather than a person, judges whether the user is human.

The test itself is relatively modern. In October 1950, British mathematician Alan Turing (1912-1954) published “Computing Machinery and Intelligence” in MIND: a Quarterly Review of Psychology and Philosophy. In it he introduced the “Imitation Game” as a way to gauge whether machines can think, or at least mimic human-like intelligence, by using natural language in a manner that is indistinguishable from real humans. Now referred to as the “Turing test,” this method has served as a reference point for machine intelligence for over 75 years.

“The Turing test has been widely discussed as a test of machine intelligence, but it also provides a measure of how humans distinguish other humans from machines,” wrote the researchers.

Who Plays It Better: AI or Humans?

This study evaluated four AI systems: three LLMs (OpenAI’s GPT-4.5 and GPT-4o, and Meta’s LLaMa-3.1-405B) and ELIZA, the original chatbot developed in the mid-1960s by Massachusetts Institute of Technology (MIT) professor Joseph Weizenbaum.

To conduct the testing, Jones and Bergen recruited a total of 284 participants, of whom 158 were recruited online and the remaining 126 came from a UCSD psychology subject pool. In each five-minute game round, an interrogator held text conversations with two witnesses, one human and one AI. When time was up, the interrogator had to pick which witness was human and which was AI, and give reasons for the decision. Each participant completed eight rounds, serving as the interrogator in half of them and as a witness in the other half.

If the interrogator picked a witness as the human, that counted as a win for the witness. In all, the UCSD researchers examined 1,023 games. The model most often judged to be human was GPT-4.5, with a 73% win rate, followed by LLaMa-3.1-405B with a 56% win rate.
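
To make the scoring concrete, here is a minimal Python sketch of how a win rate could be tallied and compared with the 50% expected if interrogators were simply guessing. The game counts are hypothetical, chosen only for illustration, and the function names are assumptions rather than anything taken from the researchers' own analysis.

```python
from math import comb


def win_rate(wins: int, games: int) -> float:
    """Fraction of games in which the interrogator judged the witness human."""
    if games <= 0:
        raise ValueError("games must be positive")
    return wins / games


def p_at_least(wins: int, games: int, chance: float = 0.5) -> float:
    """One-sided exact binomial probability of at least `wins` wins
    in `games` games if the interrogator were guessing at `chance`."""
    return sum(
        comb(games, k) * chance**k * (1 - chance) ** (games - k)
        for k in range(wins, games + 1)
    )


# Hypothetical tally: 73 wins in 100 games (illustrative numbers only,
# not the study's actual per-model counts).
rate = win_rate(73, 100)
p_value = p_at_least(73, 100)
print(f"win rate: {rate:.0%}, chance probability of a tally this high: {p_value:.2g}")
```

A win rate near 50% would mean interrogators could not tell the AI from the person. A rate well above 50% means they picked the AI as the human more often than they picked the real human.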

“The fact that models perform so well poses new challenges in understanding what the Turing test measures,” the researchers wrote.

Not surprisingly, the older, less advanced systems rarely fooled the interrogator: GPT-4o had a 21% win rate and ELIZA, which is a rule-based chatbot rather than an LLM, had 23%.

The Rise of Counterfeit People

The researchers demonstrate that, at least for brief exchanges, state-of-the-art LLMs can credibly pass as human, thereby breaking the 76-year-old Turing test.

“Irrespective of whether passing the Turing test entails that LLMs are humanlike or intelligent, the findings reported here have immediate social and economic relevance,” warned Bergen and Jones.

The researchers point out the potential negative consequences of AI that can pass as human, or “counterfeit people.” Sophisticated state-of-the-art LLMs have the potential to replace jobs, displace real social engagement, give those who control the AI influence over other people, and “undermine the value of real human interaction.”

This study demonstrates that the machines have officially crossed a threshold that will impact online security and trust. Yet the researchers leave the door open for humans to differentiate ourselves from LLMs that were trained to imitate us.

“While a machine has now passed the Turing test for the first time, this is not the last time humans will have a chance to succeed at it,” the researchers concluded.

Cami Rosso writes about science, technology, innovation, and leadership.

This article is part of the Bringwise Psychology Journal — daily insights on human behavior, mental health, and personal growth.