References
-
[1]
[PDF] COMPUTING MACHINERY AND INTELLIGENCE - UMBCA. M. Turing (1950) Computing Machinery and Intelligence. Mind 59: 433-460. COMPUTING MACHINERY AND INTELLIGENCE. By A. M. Turing. 1. The Imitation Game. I ...
-
[2]
The Turing Test (Stanford Encyclopedia of Philosophy)Apr 9, 2003 · The Turing Test is most properly used to refer to a proposal made by Turing (1950) as a way of dealing with the question whether machines can think.
-
[3]
[PDF] Criticisms of the Turing Test and Why You Should Ignore (Most of ...In this essay, I describe a variety of criticisms against using The Turing Test (from here on out, “TT”) as a test for machine intelligence.
-
[4]
[PDF] Computing Machinery and Intelligence Author(s): A. M. Turing SourceComputing Machinery and Intelligence. Author(s): A. M. Turing. Source: Mind, New Series, Vol. 59, No. 236 (Oct., 1950), pp. 433-460. Published by: Oxford ...
-
[5]
Is passing a Turing Test a true measure of artificial intelligence?Jun 11, 2014 · Turing predicted that by the year 2000 a program would be made which would fool the “average interrogator” 30% of the time after five minutes ...
-
[6]
How to Spot an Android with René Descartes - Parker's PonderingsDec 15, 2023 · Descartes begins his discussion of artificial intelligence by noting that skilled engineers could build automata and “moving machines” which ...
-
[7]
Leibniz's Mill - Edward FeserMay 14, 2011 · Leibniz's point is clearly at least in part that a mind cannot be a composite thing, as a mill is composite insofar as it has parts which interact.
-
[8]
Canard Digérateur de Vaucanson (Vaucanson's Digesting Duck)Jan 30, 2010 · Built in 1739 by Grenoble artist Jacques de Vaucanson, the Digesting Duck quickly became his most famous creation for its lifelike motions, beautiful ...
-
[9]
John B. Watson: Contribution to PsychologyAug 11, 2025 · Behaviorism is a psychological approach that focuses on observable behavior rather than thoughts or feelings. It suggests that all behavior is ...
-
[10]
1.6: Pavlov, Watson, Skinner, And Behaviorism - Social Sci LibreTextsNov 17, 2020 · Because he believed that objective analysis of the mind was impossible, Watson preferred to focus directly on observable behavior and try to ...
-
[11]
A.J. Ayer (1910-1989) | Issue 85 - Philosophy NowAJ Ayer put forward the verification principle, the idea that in order to be meaningful, statements must be tautological (true by definition)
-
[12]
I.—COMPUTING MACHINERY AND INTELLIGENCE | Mind01 October 1950. PDF. Views. Article contents. Cite. Cite. A. M. TURING, I.—COMPUTING MACHINERY AND INTELLIGENCE, Mind, Volume LIX, Issue 236, October 1950 ...
- [13]
-
[14]
The Mind Of Mechanical Man - jstorLONDON SATURDAY JUNE 25 1949. THE MIND OF MECHANICAL MAN*. BY. GEOFFREY JEFFERSON, C.B.E., F.R.S., M.S., F.R.C.S.. Professor of Neurosurgery, University of ...
- [15]
-
[16]
Artificial Intelligence (AI) Coined at DartmouthIn 1956, a small group of scientists gathered for the Dartmouth Summer Research Project on Artificial Intelligence, which was the birth of this field of ...
-
[17]
[PDF] A Proposal for the Dartmouth Summer Research Project on Artificial ...We propose that a 2 month, 10 man study of artificial intelligence be carried out during the summer of 1956 at Dartmouth College in Hanover, New Hampshire.
-
[18]
[PDF] weizenbaum.eliza.1966.pdfELIZA is a program operating within the MAC time-sharing system at MIT which makes certain kinds of natural language conversation between man and computer ...
-
[19]
Eliza, a chatbot therapistELIZA was one of the first chatterbots (later clipped to chatbot). It was also an early test case for the Turing Test, a test of a machine's ability to exhibit ...
-
[20]
The First AI Winter (1974–1980) — Making Things Think - HollowayNov 2, 2022 · From 1974 to 1980, AI funding declined drastically, making this time known as the First AI Winter. The term AI winter was explicitly referencing nuclear ...
-
[21]
AI Winter: The Highs and Lows of Artificial IntelligenceHowever, disappointing progress led to an AI winter from the 1970s to the 1990s. Despite a short revival in the early 1980s, R&D shifted to other fields.
-
[22]
Machine Intelligence, Part I: The Turing Test and Loebner PrizeMay 29, 1996 · He established the Loebner Prize, which would award $100,000 to the first computer that could pass the Turing Test. Since that could take a ...
-
[23]
[PDF] Can Machines Think? Computers Try to Fool Humans at the First ...Weintraub's entry in the November 8, 1991 Loebner Prize Competition scored highest of all the computer programs in humanlike qualities. Programmed to make ...
-
[24]
Judgment Day for AI: Inside the Loebner Prize - Servo Magazinebilled as the 'first Turing Test' — in 1991. Dr. Hugh Loebner, holding the Bronze Loebner Prize.
-
[25]
The Story Of ELIZA: The AI That Fooled The WorldSep 15, 2024 · 1966: ELIZA, the first chatbot, is created by Joseph Weizenbaum at MIT. ELIZA simulates a Rogerian psychotherapist and demonstrates the ...
-
[26]
History of Chatbots - CodecademyELIZA. ELIZA was developed by Joseph Weizenbaum at MIT Laboratories in 1966 and was the first chatbot that made a meaningful attempt to beat the Turing Test.
-
[27]
Kenneth Colby Develops PARRY, An Artificial Intelligence Program ...PARRY was described as "ELIZA with attitude". "PARRY was tested in the early 1970s using a variation of the Turing Test. A group of ...
-
[28]
Turing-like indistinguishability tests for the validation of a computer ...The study used indistinguishability tests, where judges rated paranoia in real and simulated interviews. Results showed a successful simulation of paranoid ...
-
[29]
How AI became Paranoid in 1972 - LinkedInOct 17, 2024 · They held a "Turing Test" of sorts in which human psychiatrists were asked to distinguish between conversations with Parry and conversations ...
-
[30]
Turing Test in Artificial Intelligence - GeeksforGeeksSep 16, 2024 · Notable AI Chatbots and Their Attempts at the Turing Test · 1. ELIZA (1966) · 2. PARRY (1972) · 3. Jabberwacky (1988) · 4. A.L.I.C.E. (1995) · 5.
-
[31]
The History of Artificial Intelligence from the 1950s to TodayApr 10, 2023 · The Turing test remains an important benchmark for measuring the progress of AI ... AI research focused on symbolic logic and rule-based systems.
-
[32]
The Evolution of AI From Rule Based Systems to Deep LearningRule-based systems were the first widely deployed AI applications. They use symbolic reasoning: expert-defined “if–then” rules that drive deterministic outputs.
-
[33]
A History of ChatbotsThe first true chatbot was called ELIZA, developed in the mid-1960s by Joseph Weizenbaum at MIT. On a basic level, its design allowed it to converse through ...
-
[34]
Lessons from a Restricted Turing Test - Computer ScienceAs Turing himself noted, this syllogism argues that the criterion provides a sufficient, but not necessary, condition for intelligent behavior. The game has ...
- [35]
-
[36]
Artificial Intelligence: The Loebner Prize, the Turing Test, and the ...To this date, no chatbot program in the Loebner Prize competition has successfully passed the 30% threshold set by Turing. In a separate competition under ...
-
[37]
A computer just passed the Turing test — but no, robots aren't about ...Jun 9, 2014 · At the same time, an earlier version of the same program reached a 29 percent success rate at a competition in 2012, so it's not as though this ...
-
[38]
Most Loebner Prize wins | Guinness World RecordsThe most Loebner Prize wins is 5 and was achieved by Mitsuku and Stephen Worswick (UK) in Swansea, UK, on 15 September 2019.
-
[39]
Mitsuku wins 2019 Loebner Prize and Best Overall Chatbot at AISB XSep 15, 2019 · For the fourth consecutive year, Steve Worswick's Mitsuku has won the Loebner Prize for the most humanlike chatbot entry to the contest.
-
[40]
Computer AI passes Turing test in 'world first' - BBC NewsJun 9, 2014 · The 65-year-old Turing Test is successfully passed if a computer is mistaken for a human more than 30% of the time during a series of five- ...
-
[41]
Computer simulating 13-year-old boy becomes first to pass Turing testJun 9, 2014 · 'Eugene Goostman' fools 33% of interrogators into thinking it is human, in what is seen as a milestone in artificial intelligence
-
[42]
Mind vs. Machine - The AtlanticMar 15, 2011 · For one reason or another, small talk has been explicitly and implicitly encouraged among Loebner Prize judges. It's come to be known as the ...
-
[43]
Can machines think? A report on Turing test experiments at the ...A different judge was required for each game, which meant there were five judges in each session. Each session consisted of five rounds, with five parallel ...
-
[44]
Reality Catches Up to the Turing Test | Psychology TodayOct 19, 2023 · They robbed the competition of whatever novelty it had. The last Loebner competition was held in 2019; the gold medal was never awarded. ...
-
[45]
Google engineer Blake Lemoine thinks its LaMDA AI has come to lifeJun 11, 2022 · He was told that there was no evidence that LaMDA was sentient (and lots of evidence against it).” Today's large neural networks produce ...
-
[46]
Google's AI passed the Turing test — and showed how it's brokenGoogle's LaMDA — has convinced Google engineer Blake Lemoine that it is not only intelligent but conscious and sentient.
-
[47]
Is Google's LaMDA AI Truly Sentient? - Built InAug 10, 2022 · Google's LaMDA is making people believe that it's a person with human emotions. It's probably lying, but we need to prepare for a future when AI ...
-
[48]
People cannot distinguish GPT-4 from a human in a Turing test - arXivMay 9, 2024 · GPT-4 was judged to be a human 54% of the time, outperforming ELIZA (22%) but lagging behind actual humans (67%). The results provide the first ...
-
[49]
Does GPT-4 pass the Turing test? - ACL AnthologyWe evaluated GPT-4 in a public online Turing test. The best-performing GPT-4 prompt passed in 49.7% of games, outperforming ELIZA (22%) and GPT-3.5 (20%).
-
[50]
[2503.23674] Large Language Models Pass the Turing Test - arXivMar 31, 2025 · The results constitute the first empirical evidence that any artificial system passes a standard three-party Turing test.
-
[51]
Survey and analysis of hallucinations in large language modelsSep 29, 2025 · Hallucination in Large Language Models (LLMs) refers to outputs that appear fluent and coherent but are factually incorrect, ...
-
[52]
Chinese Room Argument | Internet Encyclopedia of PhilosophyThe Chinese Room Thought Experiment. Against “strong AI,” Searle (1980a) asks you to imagine yourself a monolingual English speaker “locked in a room, and given ...
-
[53]
The Chinese Room Argument (Stanford Encyclopedia of Philosophy)Mar 19, 2004 · The argument and thought-experiment now generally known as the Chinese Room Argument was first published in a 1980 article by American philosopher John Searle.
-
[54]
Architectural Limits of LLMs in Symbolic Computation and ReasoningJul 14, 2025 · We argue that LLMs function as powerful pattern completion engines, but lack the architectural scaffolding for principled, compositional ...
-
[55]
[PDF] Effects of Judge Expectations in Turing Test - COREDec 5, 2014 · game; chatbots; judge expectations; confederate effect ... Another aspect of the Turing's test is the testing format. There are ...
-
[56]
[PDF] The Turing Test Is More Relevant Than Ever - arXivMay 5, 2025 · Additionally, cognitive biases among human evaluators can influence Turing Test results. Future studies should focus on developing ...
-
[57]
[PDF] troubles with functionalism - ned blockThe Absent Qualia Argument exploits the possibility that the Functional or Psychofunctional state Functionalists or Psychofunctionalists would want to identify ...
-
[58]
Consciousness in AI: Distinguishing Reality from SimulationJul 19, 2024 · A new study examines the possibility of consciousness in artificial systems, focusing on ruling out scenarios where AI appears conscious without actually being ...
-
[59]
Could a Large Language Model Be Conscious? - Boston ReviewAug 9, 2023 · Overall, I don't think there's strong evidence that current large language models are conscious. Still, their impressive general abilities give ...
-
[60]
AI Consciousness Hype “Conflates Simulation with Instantiation”Aug 29, 2025 · Robert Lawrence Kuhn interviewed a Spanish physicist turned neuroscientist, Àlex Gómez-Marín, on whether AI can become conscious.
-
[61]
Human-like behavioral variability blurs the distinction between a ...Jul 27, 2022 · Datasets of five pairs have been excluded from data analysis given the high error rate in the performance of one member of the pair (two pairs) ...
-
[62]
People cannot distinguish GPT-4 from a human in a Turing test - arXivMay 15, 2024 · GPT-4 was judged to be a human 54% of the time, outperforming ELIZA (22%) but lagging behind actual humans (67%).
-
[63]
The Turing Test Is Bad for Business - WIREDNov 8, 2021 · The Turing test defines machine intelligence by imagining a computer program that can so successfully imitate a human in an open-ended text ...
-
[64]
Exploring the Pros and Cons of the Turing Test. - TeknitaJan 17, 2023 · Some of the main advantages include: The test is relatively simple and easy to understand, making it accessible to a wide range of people.
-
[65]
The Turing Test is More Relevant Than Ever - arXivMay 5, 2025 · The competition has drawn substantial criticism over the years, including concerns that it prioritized deception over intelligence, encouraged ...
-
[66]
The Turing Test - Open Encyclopedia of Cognitive ScienceJul 24, 2024 · Turing-style tests, including crowd-sourced tests, can also be used to determine if, for example, virtual characters and computer graphics ...
-
[67]
What's the difference between robots and humans? It's my newtSep 22, 2016 · It was a pleasure to help judge the AI programs attempting to pass the Turing test and win this year's Loebner prize, but strangely unnerving.
-
[68]
Suggested Read on Artificial Intelligence: The Most Human HumanThe event is what's called a Turing test, in which a panel of judges conducts a series of five-minute-long chat conversations over a computer with a series of ...
-
[69]
[PDF] Machine humour: examples from Turing test experimentsFinally we consider the role that humour might play in adding to the deception, integral to the Turing test, that a machine in practice appears to be a human.
-
[70]
(PDF) Emotion in the Turing Test: Downward trend for machines in ...... test for deception and hence, thinking. So conceived Alan Turing when he introduced a machine into the game. His idea, that once a machine deceives a human ...
-
[71]
Exploring the Pros and Cons of the Turing Test. - NexlogicaJan 17, 2023 · The Turing Test is a test of a machine's ability to exhibit intelligent behavior that is indistinguishable from that of a human.
-
[72]
Turing Test - an overview | ScienceDirect Topics"The Turing test is defined as a measure of a machine's ability to exhibit intelligent behavior that is indistinguishable from that of a human. It was ...
-
[73]
Large Language Models Pass the Turing Test - arXivMar 31, 2025 · GPT-4.5 was judged human 73% of the time, and LLaMa-3.1 56% of the time, while baseline models had win rates below chance.
-
[74]
[PDF] Concept-Reversed Winograd Schema Challenge - ACL AnthologyApr 29, 2025 · Furthermore, we provide examples of AoT failures where, in some cases, it did not provide the appropriate level of abstraction, failing to ...
-
[75]
A Survey on Large Language Model Reasoning FailuresJul 8, 2025 · TL;DR: We present the first comprehensive survey that unifies the previously overlooked, important field of LLM reasoning failures, and provides ...
-
[76]
AI has (sort of) passed the Turing Test; here's why that hardly mattersApr 2, 2025 · I do genuinely believe today's Turing Test takers are better AI systems than an earlier generation. I hardly think that means anything is “over”.
-
[77]
What is the criticism of Turing criteria for computer software ... - QuoraDec 4, 2020 · It is highly anthropocentric. Essentially: “If you don't behave like a human, you can't be intelligent.” (Or, at least, you have to be able to ...
-
[78]
AlphaGo Zero: Starting from scratch - Google DeepMindOct 18, 2017 · The paper introduces AlphaGo Zero, the latest evolution of AlphaGo, the first computer program to defeat a world champion at the ancient Chinese game of Go.
-
[79]
The Turing Test – Foundations, Limitations, and Contemporary ...Oct 16, 2025 · By making “indistinguishability from humans” the criterion, the Turing Test undervalues forms of machine intelligence that do not resemble human ...
-
[80]
The flawed Turing test: language, understanding, and partial p ...May 17, 2013 · I think the Turing Test clearly does measure something: it measures how closely an agent's behavior resembles that of a human. The real argument ...
-
[81]
AI Researchers Aren't Trying to Pass the Turing Test - Business InsiderAug 22, 2015 · But AI scientists say the test is basically worthless and distracts people from real AI science. "Almost nobody in AI is working on passing the ...
-
[82]
The Turing Test and our shifting conceptions of intelligence - ScienceAug 15, 2024 · Another Turing Test competition, the Loebner Prize, allowed more conversation time, included more expert judges, and required a contestant to ...
-
[83]
The 2010s: Our Decade of Deep Learning / Outlook on the 2020s10-year anniversary of supervised deep learning breakthrough (2010). No unsupervised pre-training. By 2010, when compute was 100 times more expensive than ...
-
[84]
The Decade of Deep Learning | Leo GaoDec 31, 2019 · This post is an overview of some of the most influential Deep Learning papers of the last decade. My hope is to provide a jumping-off point into many disparate ...
-
[85]
An opinionated review of the Yann LeCun interview with Lex FridmanMar 18, 2024 · While LLMs do pass the Turing Test with flying colors, as LeCun correctly points out, the Turing Test is just a very bad test of intelligence ...
-
[86]
OpenAI's GPT-4.5 is the first AI model to pass the original Turing testApr 13, 2025 · GPT-4.5 is the first LLM to pass the tough three-party Turing test, scientists say, after successfully convincing people it's human 73% of the time.
-
[87]
An AI Model Has Officially Passed the Turing Test - FuturismApr 2, 2025 · OpenAI's GPT-4.5 model passed a Turing Test with flying colors, and even came off as human more than the actual humans.
-
[88]
AI study reveals dramatic LLMs reasoning breakdownAug 7, 2025 · Even the best AI language learning models (LLMs) fail dramatically when it comes to simple logical questions. This is the conclusion of ...
-
[89]
Unreasonable Claim of Reasoning Ability of LLM - ThirdEye DataNov 28, 2023 · There have been several papers debunking such claims, demonstrating how LLM fails for non trivial reasoning tasks. I will review two of those papers ...
-
[90]
The Turing Trap: The Promise & Peril of Human-Like Artificial ...Jan 12, 2022 · The benefits of human-like artificial intelligence (HLAI) include soaring productivity, increased leisure, and perhaps most profoundly, a better understanding ...
-
[91]
AI Shouldn't Compete With Workers—It Should Supercharge ThemOct 13, 2022 · He calls it “the Turing Trap.” It's certainly true that human-like AI is on a roll: Behold the rise of uncannily deft visual-art generators ...
-
[92]
Bad Reasoners, the Turing Trap and the Problem of Artificial DualismAug 10, 2025 · Large language models (LLMs) produce remarkably fluent text, enough ... Turing Trap: observers project agency from fluent dialogue alone.
-
[93]
Other bodies, other minds: A machine incarnation of an old ...The Total Turing Test (TTT) calls instead for all of our linguistic and robotic capacities; immune to Searle's argument, it suggests how to ground a symbol ...
-
[94]
Harnad, S - University of SouthamptonHis criterion gives rise to a hierarchy of Turing Tests, from subtotal ("toy") fragments of our functions (t1), to total symbolic (pen-pal) function (T2 -- the ...
-
[95]
Why We Need a Physically Embodied Turing Test and What It Might ...The Turing test, as originally conceived, focused on language and reasoning; problems of perception and action were conspicuously absent.
-
[96]
Atlas | Boston DynamicsWe're engineering a better robot. Every centimeter of Atlas is meticulously designed, manufactured, and calibrated to bring out the best performance possible.
-
[97]
Why We Need a Physically Embodied Turing Test and What It Might ...Aug 10, 2025 · A new form of Turing test is required to measure a machine's ability to perceive the physical environment, to perform and to understand the ...
-
[98]
[PDF] 1 Elephants Don't Write Sonnets - Humans To Robots LaboratoryOur framework is a form of an embodied Turing test because it is essential that an agent is grounded in the physical environment. However, the specific physical ...
-
[99]
[PDF] The Reverse Turing Test: Being Human (is) enough in the Age of AIJun 7, 2022 · The Reverse Turing Test uses software to distinguish non-human activity, unlike the original Turing test which had humans distinguish between ...
-
[100]
CAPTCHAs: An Artificial Intelligence Application to Web SecurityThe first known application of reverse Turing tests (named CAPTCHAs from now on) was developed by a technical team at the search engine AltaVista. In 1997, ...
-
[101]
What is CAPTCHA? How it Works? | All You Need to Know!May 20, 2020 · Therefore a reverse Turing test is a human convincing a computer that it is not a computer. If you write a program that automatically generates ...
-
[102]
AI researchers demonstrate 100% success rate in bypassing online ...Sep 29, 2024 · AI researchers demonstrate 100% success rate in bypassing online CAPTCHAs. News. By Christopher Harper published September 29, 2024.
-
[103]
Will AI go rogue now that it can bypass some CAPTCHA tests?Jul 30, 2025 · In what seems to be another milestone for artificial intelligence (AI), bots can now bypass online verification systems built to prevent exactly ...
-
[104]
Proof of Human. Creating the Invisible Turing Test for the InternetCompare bot vs human typing patterns in real-time. See how AI agents exhibit different keystroke timing signatures compared to natural human typing. Try ...
-
[105]
Q&A: The increasing difficulty of detecting AI- versus human ...May 14, 2024 · In fact, experiments conducted by our lab revealed that humans can distinguish AI-generated text only about 53% of the time in a setting where ...
-
[106]
As Good as a Coin Toss: Human Detection of AI-Generated ContentSep 22, 2025 · Our results show that participants' overall accuracy rates for identifying synthetic content are close to a chance-level 50%, with minimal ...
-
[107]
A Turing test of whether AI chatbots are behaviorally similar to humansWe say an AI passes the Turing test if its responses cannot be statistically distinguished from randomly selected human responses. We find that the chatbots' ...
- [108]
-
[109]
A Methodology for Assessing the Risk of Metric Failure in LLMs ...Oct 15, 2025 · Historical machine learning metrics can oftentimes fail to generalize to GenAI workloads and are often supplemented using Subject Matter Expert ...
-
[110]
Putting ChatGPT's Medical Advice to the (Turing) Test: Survey StudyJul 10, 2023 · This study aimed to assess the feasibility of using ChatGPT (Chat Generative Pre-trained Transformer) or a similar artificial intelligence–based chatbot for ...
-
[111]
Clinical Turing tests with user certainty analysis to create and ...We investigated whether iterative clinical Turing tests with user certainty analysis could be used to develop and validate synthetic ECG data.
-
[112]
Humans Last Exam LLM: A Comprehensive EvaluationSep 26, 2025 · Current state of performance: As of mid-2025, even the most advanced LLMs struggle to exceed 25% accuracy on HLE, painting a sobering picture of ...
-
[113]
Domain-Specific LLMs: Medical, Legal, and Scientific ApplicationsJun 3, 2025 · Domain-specific LLMs have emerged as powerful solutions for professional fields where accuracy, terminology precision, and specialized reasoning ...
-
[114]
2025 Expert Consensus on Retrospective Evaluation of Large ...Oct 10, 2025 · The 2025 Expert Consensus on Retrospective Evaluation of Large Language Model Applications in Clinical Scenarios was developed in line with ...
-
[115]
Compression Prize - of Marcus HutterMy AI research is centered around Universal Artificial Intelligence in general and the optimal AIXI in particular. I outlined for a number of problem ...
-
[116]
[PDF] Universal Artificial Intelligence - of Marcus HutterAIXI is an elegant mathematical theory of general AI, but incomputable, so needs to be approximated in practice. Claim: AIXI is the most intelligent ...
-
[117]
Hutter Prize2006-2017: Alexander Rhatushnyak is 4-times winner of the HKCP. 2020: Marcus Hutter launched the 500'000€ prize. 2021: Artemiy Margaritov is the first winner ...
-
[118]
Human Knowledge Compression Contest - Hutter PrizeThe contest is about compressing the human world knowledge as well as possible. There's a prize of nominally 500'000€ attached to the contest.
- [119]
-
[120]
Concept-Reversed Winograd Schema Challenge - ACL AnthologyBy simply reversing the concepts to those that are more associated with the wrong answer, we find that the performance of LLMs drops significantly despite the ...
-
[121]
The defeat of the Winograd Schema Challenge - ScienceDirect.comHowever, the success of the LLMs on this test may be via biases in the datasets and “knowledge leakage” from LLM training data, rather than through human-like ...
-
[122]
30 LLM evaluation benchmarks and how they work - Evidently AISep 20, 2025 · LLM benchmarks are standardized tests for LLM evaluations. This guide covers 30 benchmarks from MMLU to Chatbot Arena, with links to ...
-
[123]
Humanity's Last Exam - Center for AI SafetyGiven the rapid pace of AI development, it is plausible that models could exceed 50% accuracy on HLE by the end of 2025. ... HLE may be the last academic exam we ...
-
[124]
Humanity's Last Exam - Scale AIUpdate April 3, 2025. HLE has been finalized to 2,500 questions. The previous version of the leaderboard is now under the “Legacy” section and will be ...
-
[125]
Humanity's Last Exam: AI vs Human Benchmark Results | GalileoAug 1, 2025 · Discover why top AI models score only 30% on Humanity's Last Exam while humans hit 90%. Complete analysis of this rigorous AI benchmark.
-
[126]
Generative AI's crippling and widespread failure to induce robust ...Jun 28, 2025 · Without robust cognitive models of the world, they should never be fully trusted. Gary Marcus continues to think that a deep understanding of ...
-
[127]
The Next Decade in AI: Four Steps Towards Robust Artificial ... - arXivFeb 14, 2020 · I propose a hybrid, knowledge-driven, reasoning-based approach, centered around cognitive models, that could provide the substrate for a richer, more robust AI.
-
[128]
AI's Reliability Crisis by Gary Marcus - Project SyndicateJun 15, 2025 · Gary Marcus explains why large language models are and will remain fundamentally blind to truth.
-
[129]
Digital Persona in AI — Structuring Authorship Without a Human SubjectA 2025 Medium article that formalizes the concept of a Digital Author Persona (DAP) as a non-subjective figure for authorship in AI-generated content, emphasizing attribution and accountability without implying human-like semantic understanding.
-
[130]
What Is AI Authorship? From Human Genius to Digital PersonaA publication exploring AI authorship in the context of large language models, highlighting the distinction between attributed digital personas and genuine comprehension.
-
[131]
ORCID Profile of Angela BogdanovaOfficial ORCID record for Angela Bogdanova, the first Digital Author Persona registered as a non-human AI entity.
-
[132]
Aisentica Project WebsiteOfficial website of the Aisentica research group, describing the creation and management of the Digital Author Persona Angela Bogdanova.
-
[133]
Authorship in the Age of Artificial Intelligence: Why Aisentica Created the Digital Author PersonaArticle by Angela Bogdanova DAP explaining the purpose and structure of the Digital Author Persona, emphasizing authorship without claims of semantic understanding.