Nvidia becomes major model maker with Nemotron 3


Nvidia has made a fortune supplying chips to companies working on artificial intelligence. But today the chipmaker took a step toward becoming a more serious model maker itself by releasing a series of cutting-edge open models, along with data and tools to help engineers use them.

The move, which comes at a time when AI companies like OpenAI, Google and Anthropic are developing increasingly capable chips of their own, could be a hedge against these companies veering away from Nvidia’s technology over time.

Open models are already an important part of the AI ecosystem, with many researchers and startups using them to experiment, prototype, and build. While OpenAI and Google offer small open models, they do not update them as frequently as their Chinese competitors do. For this reason and others, open models from Chinese companies are now becoming more popular, according to data from Hugging Face, a hosting platform for open source projects.

Nvidia’s new Nemotron 3 models are among the best that can be downloaded, modified, and run on one’s own hardware, according to benchmark scores the company shared ahead of release.

“Open innovation is the foundation of AI progress,” CEO Jensen Huang said in a statement ahead of the news. “With Nemotron, we are turning advanced AI into an open platform that gives developers the transparency and efficiency they need to build agent systems at scale.”

Nvidia is taking a more transparent approach than many of its US competitors by releasing the data used to train Nemotron, which should help engineers modify the models more easily. The company is also releasing tools to help with customization and fine-tuning. These include a new hybrid mixture-of-experts model architecture, which Nvidia says is particularly good for building AI agents that can take actions on computers or the web. The company is also releasing libraries that let users train agents using reinforcement learning, which involves giving models simulated rewards and punishments.

Nemotron 3 models come in three sizes: Nano, which has 30 billion parameters; Super, which has 100 billion; and Ultra, which has 500 billion. A model’s parameter count loosely corresponds to how capable it is as well as how unwieldy it is to run. The largest models are so cumbersome that they need to run on racks of expensive hardware.
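
To give a rough sense of why parameter count dictates hardware, the sketch below estimates how much memory is needed just to hold each model's weights. The parameter counts come from the figures above; the bytes-per-parameter values are generic precisions chosen for illustration, not formats Nvidia has confirmed for Nemotron 3, and the function and table names are hypothetical. The estimate ignores everything else a running model needs, so real requirements would be higher.

```python
# Back-of-the-envelope weight-memory estimate for the three sizes named above.
# Assumption: the precisions below are generic illustrations, not Nvidia's
# published formats. Activations and caches are ignored.

PARAMS = {"Nano": 30e9, "Super": 100e9, "Ultra": 500e9}
BYTES_PER_PARAM = {"16-bit": 2.0, "8-bit": 1.0, "4-bit": 0.5}


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Return the gigabytes (10**9 bytes) needed to store the weights alone."""
    return num_params * bytes_per_param / 1e9


def memory_table() -> dict:
    """Map each model size to its estimated weight memory at each precision."""
    return {
        name: {
            precision: weight_memory_gb(count, nbytes)
            for precision, nbytes in BYTES_PER_PARAM.items()
        }
        for name, count in PARAMS.items()
    }


if __name__ == "__main__":
    for name, row in memory_table().items():
        cells = ", ".join(f"{p}: {gb:,.0f} GB" for p, gb in row.items())
        print(f"{name:<5} {cells}")
```

Under these assumptions, the weights of the largest model alone would run to hundreds of gigabytes even at low precision, which is more than a typical single accelerator holds and is consistent with the need for racks of hardware.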

Model foundations

Open models are important to AI builders for three reasons, said Kari Ann Briski, vice president of generative AI software for enterprise at Nvidia: Builders increasingly need to customize models for specific tasks; it often helps to hand queries off to different models; and it is easier to get more intelligent responses from these models after training by having them perform a kind of simulated reasoning. “We believe open source is the foundation for AI innovation and continuing to accelerate the global economy,” Briski said.

Social media giant Meta released the first advanced open models under the name Llama in February 2023. As competition has intensified, however, Meta has signaled that its future releases may not be open source.

The move is part of a larger trend in the AI industry. Over the past year, American companies have moved away from openness, becoming more secretive about their research and more reluctant to tip off rivals about their latest engineering tricks.
