Elon Musk's xAI Unveils Grok 3, Claims New Reasoning Models Outshine Competitors
xAI, the artificial intelligence startup founded by Elon Musk, has officially released Grok 3, its highly anticipated chatbot. Musk presented the live demonstration himself. The launch drew a global audience, with more than one million viewers watching the livestream.
During the event, the backdrop of the launch venue prominently displayed the phrase "Our mission is to understand the universe," echoing Musk's earlier statement that understanding the universe is the overarching goal of xAI. The mission statement set the tone for the release of Grok 3 and signaled the scale of xAI's ambitions in artificial intelligence.
Grok 3: xAI's Flagship Model Finally Debuts After Delays
Grok is xAI's answer to models such as OpenAI's GPT-4o and Google's Gemini. It can analyze images and answer questions, and it powers a number of features on X, Musk's social network. Grok 3 has been in development for months. It was originally planned for release in 2024 but missed that deadline.
xAI trained Grok 3 in a large data center in Memphis that contains approximately 200,000 GPUs. Musk posted on X that Grok 3 was developed with 10 times the computing resources used for its predecessor, Grok 2, and with an expanded training dataset that apparently includes court case documents. During the live demonstration, Musk stated, "Grok 3 is an order of magnitude more capable than Grok 2. It is an AI that maximally pursues the truth, even if sometimes the truth conflicts with political correctness."
Model Family Debuts, Initial Performance Advantages Shown
Strictly speaking, Grok 3 is a family of models. It includes a smaller version called Grok 3 mini, which answers questions more quickly at the cost of some accuracy. Not all of the models are available yet, but the rollout began on the release date.
xAI claims that Grok 3 outperforms GPT-4o on two benchmarks: AIME, which evaluates a model's performance on a sampling of math questions, and GPQA, which assesses a model's ability to handle PhD-level physics, biology, and chemistry problems. According to xAI, an early version of Grok 3 also achieved competitive scores in Chatbot Arena, a crowdsourced testing platform where different AI models are pitted against each other and users vote on their answers.
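How a platform of this kind might turn head-to-head votes into a ranking can be illustrated with a short sketch. The article does not describe Chatbot Arena's actual scoring method, so the Elo-style update below is an assumption chosen for illustration, and the model names, starting ratings, and `k` factor are hypothetical.

```python
# Illustrative sketch only: an Elo-style rating update driven by pairwise votes.
# The scoring rule, model names, and constants are assumptions, not details
# taken from xAI or Chatbot Arena.

def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(ratings: dict, winner: str, loser: str, k: float = 32.0) -> None:
    """Shift rating points from the loser to the winner after one user vote."""
    exp_win = expected_score(ratings[winner], ratings[loser])
    delta = k * (1.0 - exp_win)  # an upset win moves more points than an expected one
    ratings[winner] += delta
    ratings[loser] -= delta


# Hypothetical models starting from the same rating.
ratings = {"model_a": 1000.0, "model_b": 1000.0}

# Each vote is (winner, loser), as chosen by a user comparing two answers.
votes = [("model_a", "model_b"), ("model_a", "model_b"), ("model_b", "model_a")]
for winner, loser in votes:
    update_ratings(ratings, winner, loser)

# Models sorted from highest to lowest rating.
leaderboard = sorted(ratings.items(), key=lambda item: item[1], reverse=True)
```

In this toy run, the model that wins two of the three votes ends up ranked first, and the total number of rating points stays constant because every update moves points from one model to the other.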
Live Demonstration Highlights Capabilities
During the launch event, Musk's team ran several live tests to show what Grok 3 can do. In one, the team asked Grok 3 to generate code for a 3D animation of a space launch, a task that requires the model to apply complex physics. Grok 3 produced the code, and when it was run, an animation of a spaceship traveling between Earth and Mars appeared on the screen.
The team also asked Grok 3 to create a game that combines the gameplay of Tetris and Bejeweled. Grok 3 thought for a few minutes and then produced a working result, demonstrating its usefulness for creative tasks.
Reasoning Models Expand Grok's Capabilities
The Grok 3 family also includes two reasoning models, Grok 3 Reasoning and Grok 3 mini Reasoning. Like OpenAI's o3-mini and R1 from Chinese AI company DeepSeek, they can think through problems carefully. Reasoning models try to thoroughly fact-check themselves before giving results, which helps them avoid some of the pitfalls that usually trip up other models. xAI claims that Grok 3 Reasoning surpasses o3-mini-high, the best version of o3-mini, on several popular benchmarks, including a newer mathematics benchmark called AIME 2025.
The reasoning models can be accessed through the Grok app. Users can ask Grok 3 to "Think," and for more complex queries they can use the "Big Brain" mode, which draws on additional computing resources for reasoning. xAI says the reasoning models are best suited to questions about mathematics, science, and programming.
Musk also mentioned that the Grok app hides some of the reasoning models' thinking processes to prevent distillation, a method AI model developers use to extract knowledge from other models. DeepSeek was recently accused of distilling OpenAI's models to create its own.
New Function Launched, Subscription Model Clarified
Grok's reasoning models power a new feature in the Grok app called DeepSearch, xAI's answer to AI-driven research tools such as OpenAI's deep research. DeepSearch scans the internet and X, analyzes what it finds, and delivers a summary in response to a question.
Subscribers to X's Premium+ plan will be the first to get access to Grok 3, while some other features require a subscription to SuperGrok, a new service from xAI. SuperGrok costs $30 per month or $300 per year. Subscribers get additional reasoning and DeepSearch queries, along with unlimited image generation.
Future Plans Disclosed, Style Controversy Awaits Resolution
Musk also outlined future plans for Grok. In about a week, Grok will add a voice mode. In a few weeks, the Grok 3 models will appear in xAI's enterprise API together with the DeepSearch feature. In a few months, xAI will open-source Grok 2. Musk said, "Our general strategy is that when the next version is fully launched, we will open-source the previous version of Grok. When Grok 3 is mature and stable (which may be within a few months), we will open-source Grok 2."
When Musk announced Grok about two years ago, he positioned it as an edgy, unfiltered, and anti-woke AI that is generally willing to answer controversial questions other AI systems avoid. To some extent, he fulfilled this promise. For example, when asked to be vulgar, Grok and Grok 2 would readily comply, using colorful language that you are unlikely to hear from ChatGPT.
However, the Grok models before Grok 3 hedged on political topics and would not cross certain boundaries. In fact, one study found that Grok leaned to the political left on topics such as transgender rights, diversity programs, and inequality. Musk attributed this to Grok's training data, which comes from public web pages, and promised to shift Grok closer to political neutrality. It is still unclear whether xAI has achieved this goal.