Nebius has agreed to buy Eigen AI for about $643 million, using cash and Class A shares to deepen its position in managed AI inference as cloud providers race to cut the cost of running large language models at scale.
The Amsterdam-headquartered AI infrastructure company announced the agreement on May 1, with the transaction expected to close in the coming weeks, subject to customary conditions including antitrust clearance. The deal gives Nebius control of Eigen AI’s inference and post-training optimisation technology, which will be folded directly into Nebius Token Factory, its managed platform for deploying and customising open-source AI models in production.
The acquisition marks a strategic shift beyond raw compute capacity. Nebius, listed on Nasdaq under the ticker NBIS, has been building data centre capacity and GPU-backed cloud services for AI developers, startups and enterprises. By adding Eigen AI’s optimisation stack, it is seeking to improve the economics of inference, the process through which trained AI models generate outputs for users.
Inference has become one of the most competitive areas in the AI economy. Training frontier models requires vast upfront investment, but inference creates continuing costs every time a chatbot answers a question, an enterprise agent processes a workflow, or a software service generates text, code, images or recommendations. As AI applications move from experimentation into production, the ability to produce more tokens per GPU at lower cost has become a commercial advantage.
Eigen AI specialises in improving the performance of open-source models and reducing the engineering burden required to run them efficiently. Its work covers post-training, fine-tuning, kernel-level optimisation and production inference. Nebius said the two companies had already worked together on optimised versions of leading open-source models, with jointly tuned endpoints ranking among the fastest on Artificial Analysis across multiple models.
The transaction also strengthens Nebius’s US footprint. Eigen AI’s founding team will establish a Nebius engineering and research presence in the San Francisco Bay Area, giving the company closer access to one of the world’s deepest pools of AI research talent and enterprise customers. Eigen AI’s leadership includes Ryan Hanrui Wang and Wei-Chen Wang, both linked to MIT’s HAN Lab, along with Di Jin, an MIT CSAIL PhD whose work includes post-training and model alignment.
The price tag has drawn attention because Eigen AI is a small specialist team rather than a large operating company. The valuation reflects the scarcity of engineers and researchers who can improve model-serving efficiency at the level required by high-volume AI platforms. For cloud providers, even modest improvements in throughput, memory use and GPU utilisation can translate into significant savings when workloads are running continuously across large clusters.
Nebius’s immediate objective is to make Token Factory more attractive to developers and companies deploying open-source models. The platform already offers autoscaling endpoints and fine-tuning pipelines. Integrating Eigen AI’s technology is intended to lower cost per inference, improve throughput and help customers adopt new model architectures without maintaining specialised internal optimisation teams.
The move comes as open-source AI models gain ground among companies seeking more control over data, deployment and customisation. Models from Meta, Alibaba, Google, Nvidia, DeepSeek and other developers have created a fast-moving market in which enterprises can choose from multiple architectures, including mixture-of-experts systems, long-context models and reasoning-focused models. Those models often need substantial optimisation before they can be run economically in production.
Nebius is competing in a crowded AI infrastructure market that includes major cloud platforms as well as AI-focused providers such as CoreWeave, Lambda and other GPU cloud specialists. These companies are trying to serve demand from model builders, application developers and enterprises that need access to advanced Nvidia accelerators without building their own data centres. The fight is no longer only about who can secure chips; it is increasingly about who can deliver the best performance per dollar.
Nebius has been expanding after completing its separation from Yandex-linked assets and repositioning itself as an AI infrastructure company. The group raised $700 million in a private placement in 2024 from investors including Nvidia, Accel and accounts managed by Orbis Investments, giving it additional capital to build GPU clusters and cloud platforms. It has also pursued acquisitions to add capabilities above the infrastructure layer.
Eigen AI is the company’s second major acquisition within a short span. Earlier this year, Nebius bought Tavily, an AI agent search company, for $275 million. Together, the two deals suggest a broader push to move closer to enterprise AI workloads, combining compute, model deployment, optimisation and agent infrastructure rather than relying solely on GPU rental.