
The Internet Archive, a digital library with millions of free ebooks, audiobooks, movies, software, and other cultural artifacts, is actively exploring ways to use AI tools to improve its collections.
Recently, the organization experimented with Whisper, a speech recognition tool from OpenAI, the company behind ChatGPT and DALL-E.
The testers wanted to learn whether Whisper could extract spoken and sung words from old, noisy 78rpm records.
The results were promising. For instance, the tool found most of the words in "As We Parted At The Gate," a recording from 1915.
All the extracted texts are now available online for free. They will make it easier to understand the 100-year-old Edison recordings that were donated to the Internet Archive by the University of California, Santa Barbara Library.
💬 The recordings and the transfers were so good that the automatic tools were able to make out many of the words.
All the 78rpm recordings are part of Great 78, a community project for the preservation and discovery of old records dating from 1898 to the 1950s. Currently, more than 400,000 carefully remastered recordings are available.