References
- [1] Improving Language Understanding by Generative Pre-Training (OpenAI, 2018).
- [2] Improving language understanding with unsupervised learning (OpenAI, Jun 11, 2018).
- [3] Better language models and their implications (OpenAI, Feb 14, 2019).
- [4] Language Models are Few-Shot Learners (arXiv:2005.14165, May 28, 2020).
- [5] GPT-4 Technical Report (arXiv:2303.08774, Mar 15, 2023).
- [6] Generative Pre-trained Transformer: A Comprehensive Review on ... (May 11, 2023).
- [7] Attention Is All You Need (arXiv:1706.03762, Jun 12, 2017).
- [8] Emergent Abilities of Large Language Models (arXiv:2206.07682, Jun 15, 2022).
- [9] Language Models are Unsupervised Multitask Learners (OpenAI, 2019).
- [10] Training language models to follow instructions with human feedback (Jan 27, 2022).
- [11]
- [12] GPT-2: 1.5B release (OpenAI, Nov 5, 2019).
- [13] Language Models are Few-Shot Learners (NeurIPS Proceedings).
- [14] Introducing ChatGPT (OpenAI, Nov 30, 2022).
- [15] Training language models to follow instructions with human feedback (Mar 4, 2022).
- [16] GPT-4 Technical Report (OpenAI, Mar 27, 2023).
- [17] GPT-4 (OpenAI, Mar 14, 2023).
- [18] Hello GPT-4o (OpenAI, May 13, 2024).
- [19] Introducing OpenAI o1-preview (OpenAI, Sep 12, 2024).
- [20] Notes on GPT-5 training compute (Epoch AI, Oct 13, 2025).
- [21] Artificial Intelligence Index Report 2025 (Stanford HAI, Feb 2, 2025).
- [22] On the Opportunities and Risks of Foundation Models (arXiv).
- [23] GPT-3 powers the next generation of apps (OpenAI, Mar 25, 2021).
- [24] PaLM: Scaling Language Modeling with Pathways (arXiv, Apr 5, 2022).
- [25] LLaMA: Open and Efficient Foundation Language Models (arXiv, Feb 27, 2023).
- [26] BioGPT: Generative Pre-trained Transformer for Biomedical Text ... (Oct 19, 2022).
- [27] LoRA: Low-Rank Adaptation of Large Language Models (arXiv, Jun 17, 2021).
- [28] EleutherAI/gpt-j-6b (Hugging Face, May 3, 2023).
- [29]
- [30]
- [31]
- [32] Developer Productivity With and Without GitHub Copilot (arXiv, Sep 24, 2025).
- [33] Large Language Models, Small Labor Market Effects (May 28, 2025).
- [34] The Economic Impact of Generative AI.
- [35] The rising costs of training frontier AI models (arXiv, Feb 7, 2025).
- [36] Electricity Demand and Grid Impacts of AI Data Centers (arXiv, Sep 10, 2025).
- [37] The Environmental Impacts of Machine Learning Training ... (arXiv, Oct 10, 2025).
- [38] Holistically Evaluating the Environmental Impact of Creating ... (arXiv, Mar 3, 2025).
- [39] Red teaming ChatGPT via Jailbreaking: Bias, Robustness ... (arXiv).
- [40] GPT-4o System Card (OpenAI, Aug 8, 2024).
- [41] OpenAI's Approach to External Red Teaming for AI Models and ...
- [42] Findings from a pilot Anthropic–OpenAI alignment evaluation exercise (Aug 27, 2025).
- [43] Sustainable LLM Inference for Edge AI: Evaluating Quantized LLMs ... (Apr 4, 2025).
- [44] Energy Use of AI Inference: Efficiency Pathways and Test ... (arXiv).
- [45] Achieving Trustworthy Real-Time Decision Support Systems with ... (Jun 24, 2025).
- [46] Adaptive and Resource-efficient Agentic AI Systems for Mobile and ... (Sep 30, 2025).
- [47] GPT-3 Trademark of OpenAI, L.P. - Registration Number 6294671 (word mark GPT-3; status 700 Registered; status date 2021-03-16; filing date 2020-08-04).
- [48] GPT-5 - OpenAI OpCo, LLC Trademark Registration (USPTO.report, Jul 18, 2023).
- [49] No 'GPT' trademark for OpenAI (TechCrunch, Feb 15, 2024).
- [50] OpenAI can't register 'GPT' as a trademark — yet (The Verge, Feb 16, 2024).
- [51] Many lessons of OpenAI's trademark GPT (Reggster, Feb 27, 2024).
- [52] "GPT" Too Generic for Trademark Protection, Says USPTO (Gerben IP, May 14, 2025).
- [53] Decoder-Only Transformers: The Workhorse of Generative LLMs (Mar 4, 2024).
- [54] Design Guidelines (OpenAI).
- [55] On the Dangers of Stochastic Parrots (ACM Digital Library, Mar 1, 2021).
- [56] Generative AI: UNESCO study reveals alarming evidence of ... (Jul 5, 2024).
- [57] AI on Trial: Legal Models Hallucinate in 1 out of 6 (or More ...) (May 23, 2024).
- [58] Understanding the Strengths and Limitations of Reasoning Models ...
- [59] AI hallucination: towards a comprehensive classification of distorted ... (Sep 27, 2024).
- [60] Scaling language model size yields diminishing returns for ... (PNAS).
- [61] Collective alignment: public input on our Model Spec (OpenAI, Aug 27, 2025).
- [62] Findings from a Pilot Anthropic - OpenAI Alignment Evaluation ... (Aug 27, 2025).
- [63] Mitigating Bias in Language Models through Direct ... (ACL Anthology).