OpenAI's highly anticipated Orion model for ChatGPT, which has been both rumored and denied as a year-end release, may not live up to expectations, according to a recent report by The Information.
According to unnamed sources within OpenAI, Orion shows a much smaller improvement over GPT-4 than GPT-4 showed over GPT-3. These insiders also said that Orion does not consistently outperform GPT-4 in certain areas, particularly coding tasks, although its capabilities in general language tasks, such as document summarization and email generation, are reportedly stronger.
The report attributes the difficulty of improving Orion to a dwindling supply of high-quality text and other data for training. In essence, the AI sector is facing a training data bottleneck, having largely exhausted the easily accessible material on platforms like X, Facebook, and YouTube. That scarcity makes it increasingly difficult for these companies to find the kind of challenging coding problems that could push their models further, slowing pre-release training.
This diminished training efficiency is raising significant ecological and economic concerns. As leading-edge large language models (LLMs) grow and their parameter counts climb into the trillions, projections suggest that resource consumption, including energy and water, could increase sixfold over the coming decade. The escalating demand is evident in moves such as Microsoft's plan to reactivate Three Mile Island, AWS's acquisition of a 960 MW plant, and Google's purchase of power from seven nuclear reactors, all intended to meet the energy requirements of expanding AI data center networks that the existing power grid is struggling to supply.
To address these challenges, OpenAI has reportedly established a “foundations team” tasked with tackling the training data shortage. As noted by TechCrunch, the team’s strategies may include using synthetic training data, akin to what Nvidia’s Nemotron models can produce, as well as improving model performance after the pre-training phase.
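To give a sense of what synthetic training data generation can look like in its simplest form, here is a minimal, hypothetical sketch: a small "teacher" model is prompted for candidate responses, which are lightly filtered and written out in a format a later fine-tuning run could consume. The model choice (gpt2), the seed prompts, the output filename, and the word-count filter are placeholders for illustration only, not OpenAI's or Nvidia's actual pipeline.

```python
# Illustrative sketch of synthetic training-data generation.
# Assumptions: gpt2 as a stand-in teacher model, toy seed prompts, a trivial
# quality filter. Real pipelines (e.g. Nemotron-style) use far larger models,
# reward-model scoring, and deduplication.
import json
from transformers import pipeline

teacher = pipeline("text-generation", model="gpt2")

seed_prompts = [
    "Explain in one sentence what a hash table is.",
    "Write a short email declining a meeting invitation.",
]

with open("synthetic_data.jsonl", "w") as f:
    for prompt in seed_prompts:
        # Sample several candidate completions per prompt.
        candidates = teacher(
            prompt, max_new_tokens=64, do_sample=True, num_return_sequences=2
        )
        for cand in candidates:
            response = cand["generated_text"][len(prompt):].strip()
            # Keep only responses that pass a minimal length check.
            if len(response.split()) >= 5:
                f.write(json.dumps({"instruction": prompt, "response": response}) + "\n")
```

Each retained pair could then feed a later fine-tuning or post-training step, which corresponds to the second strategy the report mentions.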
Initially thought to be the code name for GPT-5, the Orion model is now anticipated to make its debut in 2025. Whether we will have enough electrical capacity to deploy it without overwhelming our municipal power grids is still uncertain.