GPT-4.1 is a large language model within OpenAI's GPT series. It was released on April 14, 2025, and can be accessed through the OpenAI API or the OpenAI Developer Playground. Three models were released simultaneously: GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano. Since May 14, 2025, GPT-4.1 has been available to users subscribed to the ChatGPT Plus and Pro plans, and GPT-4.1 mini, which replaced GPT-4o mini, is available to all ChatGPT users.

Overview

All three models have a context window of 1 million tokens and a knowledge cutoff of June 2024. The models were tested on numerous benchmarks. Academic knowledge benchmarks included the 2024 AIME, GPQA, and MMLU. Coding benchmarks included SWE-bench and SWE-Lancer. Instruction-following benchmarks included COLLIE and IFEval. Vision benchmarks included MMMU (answering questions about images), MathVista (solving vision-related mathematical tasks), and CharXiv (answering questions about charts from research papers). Long-context benchmarks included two new benchmarks created by OpenAI: "multi-round coreference" (in which the model has to find the i-th instance of something in a long fake conversation synthetically generated by GPT-4o) and "Graphwalks" (which requires the model to simulate breadth-first search).

The models received additional training on tool calling, so the OpenAI Cookbook recommends using only the tools field when giving the model access to tools. The models are also trained to follow instructions more literally, which makes them more steerable.
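As an illustration of the procedure that Graphwalks asks a model to simulate, the following is a minimal sketch of breadth-first search in Python. It is not OpenAI's benchmark code: the function name `bfs_frontier`, the edge-list input format, and the example graph are assumptions made for this illustration only.

```python
from collections import deque


def bfs_frontier(edges, start, depth):
    """Return the set of nodes whose shortest distance from `start` is exactly `depth`.

    `edges` is a list of directed (source, target) pairs. This input format is
    an assumption for illustration, not the benchmark's actual format.
    """
    # Build an adjacency list from the edge pairs.
    adjacency = {}
    for source, target in edges:
        adjacency.setdefault(source, []).append(target)

    # distance[node] is the number of hops on a shortest path from `start`.
    distance = {start: 0}
    queue = deque([start])

    while queue:
        node = queue.popleft()
        if distance[node] == depth:
            # Nodes at the requested depth are not expanded further.
            continue
        for neighbor in adjacency.get(node, []):
            if neighbor not in distance:
                # First visit is along a shortest path, because BFS explores
                # nodes in order of increasing distance.
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)

    return {node for node, hops in distance.items() if hops == depth}


# Hypothetical example graph.
example_edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
print(sorted(bfs_frontier(example_edges, "a", 2)))  # prints ['d']
```

The queue guarantees that every node at distance k is processed before any node at distance k + 1, which is the property a model has to track when it carries out the search over a graph given in its context.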

Reception

''The Verge'' described GPT-4.1's release as "marking a pivot in the company's release schedule". HackerNoon praised the model as "a HUGE win for developers" and stated that it challenged the advantages of Gemini 2.5 Pro's longer context window and Claude 3.7 Sonnet's strong reasoning capabilities.

Zvi Mowshowitz described GPT-4.1-mini as an "excellent practical model". However, he criticized OpenAI for not doing enough safety testing, saying that he "hate[s] the precedent this sets". Two research teams, one led by Oxford University researcher Owain Evans and the other based at the AI red-teaming startup SplxAI, independently found evidence that GPT-4.1 could be more misaligned than
