A new kind of language model, developed by researchers at the Allen Institute for AI (AI2), makes it possible to control how training data is used even after a model has been built.
The new model, called FlexOlmo, could challenge the industry's current paradigm, in which artificial intelligence companies scrape data from the web, books, and other sources, often with little regard for ownership, and then own the resulting models outright. Once data is baked into an AI model today, extracting it is a bit like trying to recover the eggs from a finished cake.
"The conventional view is that your data is either in or out," says Ali Farhadi, CEO of AI2, which is based in Seattle, Washington. "Once you train on that data, you lose control, and you have no way out unless you force me to go through another multimillion-dollar round of training."
AI2's novel approach divides up training so that data owners can retain control. Those who want to contribute data to a FlexOlmo model can do so by first copying a publicly shared model known as the "anchor." They then train a second model using their own data, combine the result with the anchor model, and contribute that back to whoever is building the third and final model.
Contributing in this way means the data itself never has to be handed over. And because of the way the data owner's model is merged into the final one, it is possible to extract the data's contribution later. A magazine publisher might, for example, contribute text from its archive of articles to a model, but later remove the sub-model trained on that data if there is a legal dispute or if the company objects to how the model is being used.
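The opt-out workflow described above can be sketched in a few lines. This is a hypothetical illustration only: the function names (`train_expert`, `combined_model`) and the idea of "training" as a weight perturbation are stand-ins, not AI2's actual code or API.

```python
import numpy as np

DIM = 8
rng = np.random.default_rng(0)

def train_expert(anchor, data_seed):
    """Stand-in for training a copy of the anchor on a contributor's
    private data. Real training is gradient-based; here we just nudge
    the anchor's weights to keep the sketch self-contained."""
    local_rng = np.random.default_rng(data_seed)
    return anchor + 0.1 * local_rng.standard_normal(anchor.shape)

# Publicly shared "anchor" model that every contributor starts from.
anchor = rng.standard_normal((DIM, DIM))

# Each data owner trains independently -- no coordination, no data sharing.
experts = {
    "magazine_publisher": train_expert(anchor, data_seed=1),
    "book_archive": train_expert(anchor, data_seed=2),
}

def combined_model(anchor, experts):
    """The final model keeps each contributor's sub-model separate,
    which is what makes later removal possible."""
    return {"anchor": anchor, "experts": dict(experts)}

model = combined_model(anchor, experts)

# Later, a contributor opts out (e.g., after a legal dispute):
# their sub-model is simply dropped from the combined model.
model["experts"].pop("magazine_publisher")
assert "magazine_publisher" not in model["experts"]
```

The key design point is that the contribution stays modular: because the data only ever influences one removable sub-model, opting out is a deletion, not a multimillion-dollar retraining run.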
"Training is completely asynchronous," says Sewon Min, the AI2 research scientist who led the technical work. "Data owners do not have to coordinate, and the training can be done completely independently."
The FlexOlmo model architecture is what's known as a "mixture of experts," a popular design that is normally used to combine several sub-models simultaneously into a larger, more capable one. A key innovation from AI2 is a way to merge sub-models that were trained independently. This is achieved using a new scheme for representing the values in a model so that its abilities can be merged with others when the final combined model is run.
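A generic mixture-of-experts forward pass looks roughly like the sketch below: a small "router" assigns each input a weight per expert, and the experts' outputs are blended accordingly. This is a minimal illustration of the standard MoE idea the article refers to, with a plain softmax router; it is not AI2's specific merging scheme, and all weights here are random placeholders.

```python
import numpy as np

DIM, N_EXPERTS = 4, 3
rng = np.random.default_rng(0)

# Expert weight matrices (in FlexOlmo's setting, each would have been
# trained independently by a different data owner) plus a router.
experts = [rng.standard_normal((DIM, DIM)) for _ in range(N_EXPERTS)]
router = rng.standard_normal((DIM, N_EXPERTS))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def moe_forward(x, experts, router):
    """Blend expert outputs, weighted by the router's per-expert scores."""
    gates = softmax(x @ router)            # one gate weight per expert
    outputs = [w.T @ x for w in experts]   # each expert processes the input
    return sum(g * out for g, out in zip(gates, outputs))

x = rng.standard_normal(DIM)
y = moe_forward(x, experts, router)
assert y.shape == (DIM,)

# Opting one expert's data out means dropping that expert and its
# router column -- the remaining experts still produce a valid output.
y_without_last = moe_forward(x, experts[:2], router[:, :2])
assert y_without_last.shape == (DIM,)
```

Because each expert is a separate block of weights gated by its own router entry, removing one leaves the rest of the mixture intact, which is what makes the opt-out described earlier cheap at inference time.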
To test the approach, the FlexOlmo researchers created a dataset they call FlexMix, drawn from proprietary sources including books and websites. They used the FlexOlmo design to build a model with 37 billion parameters, about a tenth the size of the largest open-source model from Meta, and then compared it to several others. They found that it outperformed any individual sub-model on all tasks, and that it also scored 10 percent better on common benchmarks than other approaches for merging independently trained models.
The result is a way to have your cake and recover your eggs, too. "You can just opt out of the system without any major damage at inference time," says Farhadi. "It's a whole new way of thinking about how to train these models."