Nvidia will spend $26 billion to build open-weight AI models, filings show


Nvidia will spend $26 billion over the next five years to build open source artificial intelligence models, according to a 2025 financial filing. Executives confirmed the news, which had not been previously reported, in interviews with WIRED.

The sizable investment shows Nvidia evolving from a chipmaker with a strong software stack into a bona fide frontier lab capable of competing with OpenAI and DeepSeek. It’s a strategic move that could cement Nvidia’s position as the leading chipmaker in artificial intelligence, as models are fine-tuned on the company’s hardware.

Open source models are those whose weights, the parameters that define a model’s behavior, are publicly published, sometimes along with details of the architecture and training. This allows anyone to download and run the model on their own device or in the cloud. In Nvidia’s case, the company also discloses the technical innovations involved in building and training its models, making it easier for startups and researchers to modify and build on the company’s work.

On Wednesday, Nvidia also released Nemotron 3 Super, its most capable open AI model yet. The new model contains 128 billion parameters (a measure of a model’s size and complexity), making it roughly equivalent to the largest version of GPT-OSS from OpenAI, and Nvidia claims it outperforms GPT-OSS and other models across many benchmarks.

Specifically, Nvidia claims that Nemotron 3 Super scored 37 on the AI Index, which rates models across 10 different benchmarks. GPT-OSS scored 33, though several Chinese models scored higher. Nvidia says Nemotron 3 Super was also tested privately on PinchBench, a new benchmark that evaluates a model’s ability to control OpenClaw, and ranked first in that test.

Nvidia also disclosed a number of technical tricks it used to train Nemotron 3. These include architectural and training techniques that improve the model’s reasoning, its handling of long context, and its responsiveness to reinforcement learning.

“Nvidia is taking open model development more seriously,” says Brian Catanzaro, vice president of applied deep learning research at Nvidia. “We are making a lot of progress.”

Open frontiers

Meta was the first big AI company to release an open model, Llama, in 2023. However, CEO Mark Zuckerberg recently rebooted the company’s AI efforts and has indicated that future models may not be fully open. OpenAI offers an open-weight model called GPT-OSS, but it is less capable than the company’s best proprietary offerings and not well suited to modification.

The best American models, from OpenAI, Anthropic, and Google, can only be accessed through the cloud or via a chat interface. In contrast, the weights of many top Chinese models, such as those from DeepSeek, Alibaba, Moonshot AI, Z.ai, and MiniMax, are released openly and for free. As a result, many startups and researchers around the world currently rely on Chinese models.

“It’s in our interest to help the ecosystem evolve,” says Catanzaro, who joined Nvidia in 2011 and helped lead the company’s shift from making graphics cards for gaming to manufacturing silicon for artificial intelligence. Nvidia released the first Nemotron model in November 2023. He adds that Nvidia recently finished pre-training a model with 550 billion parameters. (Pre-training involves inputting massive amounts of data into a model spread across large numbers of specialized chips running in parallel.) Since then, Nvidia has released a range of specialized models for use in fields such as robotics, climate modeling, and protein folding.

Nvidia’s future AI models will help the company optimize not only its chips but also the hyperscale data centers it builds, says Kari Briski, vice president of enterprise generative AI software. “We’re building it to extend our systems and test not only compute, but also storage and networking, and to build a roadmap for our hardware architecture,” she says.
