Physical Address
304 North Cardinal St.
Dorchester Center, MA 02124

Atlantic reporter Alex Reisner recently revealed four datasets of music used for training artificial intelligence models and made them fully searchable by the public. Two of the collections are massive, containing 12 million and 9 million audio tracks. The other two are much smaller but still represent a large amount of training data, with more than 100,000 songs each.
According to Reisner, these collections have been downloaded thousands of times, and although it is impossible to know exactly who used them, use by Google and Stability AI has been confirmed in research papers. Some of the datasets' sources, such as the Free Music Archive, are free to stream for personal use but require a license for commercial applications.
While the datasets are, in theory, freely available online, using them as training data is not as simple as downloading a ZIP file and feeding it into an AI model. As Reisner explains:
Three of the datasets I found were distributed as a list of links to songs on YouTube or Spotify. AI developers download actual audio using tools that automate the task, some of which allow developers to bypass logins, ads, and mechanisms that might earn money or subscribers for creators. These tools violate these platforms’ terms of service.