1960s chatbot ELIZA beat OpenAI’s GPT-3.5 in a recent Turing test study

An artist's impression of a human and a robot talking. (credit: Getty Images | Benj Edwards)

In a preprint research paper titled "Does GPT-4 Pass the Turing Test?", two researchers from UC San Diego pitted OpenAI's GPT-4 AI language model against human participants, GPT-3.5, and ELIZA to see which could trick participants into thinking it was human with the greatest success. But along the way, the study, which has not been peer-reviewed, found that human participants correctly identified other humans in only 63 percent of the interactions—and that a 1960s computer program surpassed the AI model that powers the free version of ChatGPT.

Even with limitations and caveats, which we'll cover below, the paper presents a thought-provoking comparison between AI model approaches and raises further questions about using the Turing test to evaluate AI model performance.

British mathematician and computer scientist Alan Turing first conceived the Turing test as "The Imitation Game" in 1950. Since then, it has become a famous but controversial benchmark for determining a machine's ability to imitate human conversation. In modern versions of the test, a human judge typically talks to either another human or a chatbot without knowing which is which. If the judge cannot reliably tell the chatbot from the human a certain percentage of the time, the chatbot is said to have passed the test. The threshold for passing the test is subjective, so there has never been a broad consensus on what would constitute a passing success rate.
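To make the scoring idea concrete, here is a minimal sketch of how a success rate could be computed from judge verdicts and compared against a pass threshold. The function names, the sample verdicts, and the threshold values are all hypothetical illustrations; they are not taken from the study, and, as noted above, there is no agreed-upon threshold.

```python
def success_rate(verdicts):
    """Return the fraction of conversations in which the judge said 'human'."""
    if not verdicts:
        raise ValueError("need at least one verdict")
    return sum(1 for v in verdicts if v == "human") / len(verdicts)


def passes(verdicts, threshold):
    """Return True if the success rate meets a caller-chosen threshold."""
    return success_rate(verdicts) >= threshold


# Made-up verdicts for one hypothetical chatbot (not data from the paper).
sample = ["human", "ai", "human", "ai", "ai"]

rate = success_rate(sample)
print(f"success rate: {rate:.0%}")
print("passes at 30%:", passes(sample, 0.30))
print("passes at 50%:", passes(sample, 0.50))
```

The same set of verdicts passes or fails depending only on the threshold the caller picks, which is one way to see why a subjective threshold makes "passing" hard to pin down.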

Source: https://arstechnica.com/?p=1986387