A Team of Volunteers Is Archiving SoundCloud in Case It Dies

Spurred by recent reports the German streaming music and audio company may be running out of cash, The Archive Team is racing to preserve sound files—at high cost.

What will become of SoundCloud? The nine-year-old German streaming music and audio service is popular around the world (Chance The Rapper is one outspoken fan) and was previously touted as a rival to Spotify, but it has fallen on hard times lately.

This month, it announced it was cutting 40 percent of its staff and closing offices in London and San Francisco. And it may be running out of cash, according to reports in The Financial Times and TechCrunch (SoundCloud has disputed the latter report in Variety). If SoundCloud does go out of business, where does that leave the more than 135 million tracks and audio files hosted on the service and played by 175 million monthly users?

The preservationists at the Archive Team, a group of about 150 volunteer programmers "dedicated to saving our digital heritage," aren't waiting around to find out. On Thursday, Archive Team coordinator Jason Scott tweeted that he and dozens of his fellow archivists would begin "large scale backing up of SoundCloud soon"—without the company's blessing or assistance. And indeed, shortly thereafter, a page for SoundCloud appeared on the Archive Team's website indicating that it would begin attempting to make copies of SoundCloud files starting July 18.

Scott also cautioned on Twitter that Archive Team would not be able to save the entirety of SoundCloud, due to the prohibitively high cost of server space for all the sound files.

So how does the Archive Team choose which files to archive? "If we're finding we're only able to get a portion, we traditionally go for the earliest years (for historical reasons, less likely to exist elsewhere) and the most popular files (will break the most links if they disappear or can't be found)," Scott wrote in an email to Motherboard.
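The prioritization Scott describes can be sketched in a few lines of Python. This is only an illustration of the stated policy (earliest uploads first, then the most-played files), not Archive Team's actual tooling: the `Track` fields, the `pick_tracks` function, the cutoff year, and the storage budget are all hypothetical names and parameters invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    # Hypothetical record; the fields are assumptions, not a real SoundCloud schema.
    track_id: str
    upload_year: int
    play_count: int
    size_gb: float


def pick_tracks(tracks, budget_gb, early_cutoff_year):
    """Greedily fill a storage budget: earliest uploads first, then most-played."""
    # Tracks from the earliest years, oldest first.
    early = sorted(
        (t for t in tracks if t.upload_year <= early_cutoff_year),
        key=lambda t: t.upload_year,
    )
    # Everything else, most-played first.
    rest = sorted(
        (t for t in tracks if t.upload_year > early_cutoff_year),
        key=lambda t: t.play_count,
        reverse=True,
    )
    chosen, used_gb = [], 0.0
    for t in early + rest:
        # Skip any track that would push the total past the budget.
        if used_gb + t.size_gb <= budget_gb:
            chosen.append(t)
            used_gb += t.size_gb
    return chosen


# Toy data, made up for illustration.
sample = [
    Track("a", 2008, 10, 1.0),
    Track("b", 2016, 900, 1.0),
    Track("c", 2015, 5, 1.0),
    Track("d", 2009, 50, 1.0),
]
selected_ids = [t.track_id for t in pick_tracks(sample, budget_gb=3.0, early_cutoff_year=2009)]
print(selected_ids)
```

With a 3 GB budget, the two early tracks are kept first and the remaining space goes to the most-played later track. How to weigh age against popularity is a design choice the article does not specify.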

Scott estimates SoundCloud's entire library is 1 petabyte (or 1 million gigabytes) in size. But a page on Amazon's Web Services solutions site highlights SoundCloud as a case study customer, and says that SoundCloud was storing 2.5 petabytes of data.

"I think that page on Amazon's AWS page is using old numbers, and they have little initiative to round downwards," Scott wrote.

Motherboard emailed Amazon's and SoundCloud's press addresses to confirm the numbers, but did not receive responses in time for this article's publication.

Either way, we're talking about a lot of cash. Scott said that at 1 petabyte, it would cost between $1.5 million and $2 million to store all the files for the foreseeable future on server space on loan from Archive.org, the non-profit organization that manages the Internet Archive (where Scott is also employed) and is separate from Archive Team. At Scott's estimated cost of $1,500 per terabyte on Archive.org servers, storing 2.5 petabytes for the foreseeable future would cost $3.75 million.
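The arithmetic behind those figures is simple enough to check by hand or with a short script. The sketch below assumes decimal units (1 petabyte = 1,000 terabytes, consistent with the "1 million gigabytes" figure above) and uses the $1,500-per-terabyte rate attributed to Scott; the function and constant names are made up for the example.

```python
# Back-of-the-envelope storage cost using the figures quoted in the article.
COST_PER_TB_USD = 1500  # Scott's per-terabyte estimate for Archive.org storage
TB_PER_PB = 1000        # assumption: decimal units, 1 PB = 1,000 TB


def storage_cost_usd(petabytes: float) -> float:
    """Return the estimated cost in US dollars of storing the given number of petabytes."""
    return petabytes * TB_PER_PB * COST_PER_TB_USD


for label, size_pb in [("Scott's estimate", 1.0), ("AWS case study figure", 2.5)]:
    print(f"{label}: {size_pb} PB -> ${storage_cost_usd(size_pb):,.0f}")
```

At 1 petabyte this gives $1.5 million, the low end of Scott's range, and at 2.5 petabytes it gives $3.75 million.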

In any case, Archive Team will soon find out just how large a project it's dealing with.

"Archive Team generally tries to assess what exactly a site contains," Scott wrote. "In most cases, it is a relatively small (under a few terabytes) site and can be grabbed wholesale. We don't often get 'whales' like SoundCloud, although we have had a few cases like this. In the case for 'whales,' we do an assessment of what's on the site, try to get a grip on size, and do tests to see how completely we can rescue the data off the dying website. We're doing that with SoundCloud at the moment."

Even archiving just part of the massive library of sounds will be costly, which is why Scott is calling on anyone interested in helping archive SoundCloud to donate to Archive.org.

"We will save the data and then try to make it in some way playable, although we're not interested in hosting a 'new SoundCloud,'" Scott wrote. "Our main concern is artists and creators suddenly finding their stuff gone, and making it so it's not in oblivion."
