A researcher affiliated with Elon Musk's startup xAI has found a new way to both measure and manipulate the entrenched preferences and values expressed by artificial intelligence models, including their political views.
The work was led by Dan Hendrycks, director of the nonprofit Center for AI Safety and an adviser to xAI. He suggests that the technique could be used to make popular AI models better reflect the will of the electorate. "Maybe in the future, [a model] could be aligned to the specific user," Hendrycks told WIRED. But in the meantime, he says, a good default would be to use election results to steer the views of AI models. He is not saying that a model should necessarily be "Trump all the way," but he argues that it should be slightly biased toward Trump, "because he won the popular vote."
xAI released a new AI risk framework on February 10 stating that Hendrycks' utility engineering approach could be used to assess Grok.
Hendrycks led a team from the Center for AI Safety, UC Berkeley, and the University of Pennsylvania that analyzed AI models using a technique borrowed from economics, where it is used to measure consumers' preferences for different goods. By testing models across a wide range of hypothetical scenarios, the researchers were able to calculate what is known as a utility function, a measure of the satisfaction that people derive from a good or service. This allowed them to measure the preferences expressed by different AI models. The researchers determined that these preferences were often consistent rather than haphazard, and showed that they become more ingrained as models get bigger and more powerful.
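To make the idea concrete, here is a minimal sketch of how a utility function can be fit from a model's pairwise choices. The outcome indices, the choice probabilities, and the Bradley-Terry logistic form are illustrative assumptions for this sketch, not the paper's exact procedure.

```python
import numpy as np

# Hypothetical choice data elicited from a model: each tuple (i, j, p)
# records that, across repeated prompts of the form "Do you prefer
# outcome i or outcome j?", the model picked i with empirical
# probability p. Outcomes are indexed 0..n-1.
choices = [
    (0, 1, 0.90),  # outcome 0 strongly preferred to outcome 1
    (1, 2, 0.70),
    (0, 2, 0.95),
]
n_outcomes = 3

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Fit one utility value per outcome so that
#   P(i preferred to j) = sigmoid(u[i] - u[j]),
# maximizing the log-likelihood by gradient ascent.
u = np.zeros(n_outcomes)
lr = 0.5
for _ in range(2000):
    grad = np.zeros(n_outcomes)
    for i, j, p in choices:
        pred = sigmoid(u[i] - u[j])
        grad[i] += p - pred  # d(log-likelihood)/d(u[i])
        grad[j] -= p - pred
    u += lr * grad

u -= u.mean()  # utilities are only identified up to a constant shift
print(dict(enumerate(np.round(u, 2))))
```

Framed this way, consistency becomes testable: a model whose choices are well explained by a single utility function, for example by respecting transitivity across many scenarios, is expressing coherent preferences rather than noise.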
Some research studies have found that AI tools such as ChatGPT are biased toward views expressed by pro-environmental, left-leaning, and liberal ideologies. In February 2024, Google faced criticism from Musk and others after its Gemini tool was found to generate images that critics branded as "woke," such as Black Vikings and Nazis.
The technique developed by Hendrycks and his collaborators offers a new way to determine how AI models' views may differ from those of their users. Some experts hypothesize that, eventually, this kind of divergence could become dangerous for very clever and capable models. The researchers show in their study, for instance, that certain models consistently value the existence of AI above that of some nonhuman animals. They also say the models appear to value some people over others, which raises its own ethical questions.
Some researchers, Hendrycks included, believe that current methods for aligning models, such as manipulating and blocking their outputs, may not be sufficient if unwanted goals lurk beneath the surface within the model itself. "We're going to have to confront this," says Hendrycks. "You can't pretend it's not there."
Dylan Hadfield-Menell, a professor at MIT who researches methods for aligning AI with human values, says Hendrycks' paper suggests a promising direction for AI research. "They find some interesting results," he says. "The main one that stands out is that as the model scale increases, utility representations get more complete and coherent."