In a bold move, Reddit CEO Steve Huffman is demanding that Microsoft and other companies like Anthropic and Perplexity pay for scraping Reddit’s data.
Huffman has already struck deals with Google and OpenAI, but Microsoft has yet to come to the table.
The Data Dilemma
Huffman is frustrated with companies using Reddit’s data without permission. “Without these agreements, we don’t have any say or knowledge of how our data is displayed and what it’s used for,” he said in a recent interview. Blocking these companies has been “a real pain in the ass,” Huffman added.
Reddit has been actively fighting unauthorized data scraping. In July, it updated its robots.txt file to block web crawlers from companies it has no agreement with. As a result, Reddit results now appear only in Google Search, which pays Reddit for its data, and not in other search engines such as Bing.
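For readers unfamiliar with robots.txt, here is a minimal sketch of how a well-behaved crawler might check such a file before fetching a page, using Python's standard `urllib.robotparser`. The rules, the agent names `LicensedBot` and `OtherBot`, and the example.com URL are made up for illustration and are not Reddit's actual configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named crawler is allowed, all others are blocked.
# This is NOT Reddit's actual file; the agent names are invented for illustration.
EXAMPLE_ROBOTS_TXT = """\
User-agent: LicensedBot
Allow: /

User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/r/some-community/"
    for agent in ("LicensedBot", "OtherBot"):
        print(agent, is_allowed(EXAMPLE_ROBOTS_TXT, agent, url))
```

Under these assumed rules, the script reports that `LicensedBot` may fetch the page and `OtherBot` may not. Note that robots.txt is a request rather than an enforcement mechanism: it only works when the crawler chooses to honor it.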
Want to learn more about AI's impact on the world in general and property in particular? Join us on our next Webinar! Click here to register
Microsoft in the Crosshairs
Huffman specifically called out Microsoft for using Reddit’s data to train its AI and for summarizing Reddit content in Bing results without permission. He also said that Reddit’s data has been sold through the Bing API to other search engines. Microsoft AI CEO Mustafa Suleyman’s recent comment that public data on the internet is “freeware” didn’t help matters.
“We’ve had Microsoft, Anthropic, and Perplexity act as though all of the content on the internet is free for them to use,” Huffman said. “That’s their real position.”
In response, Microsoft’s head of search, Jordi Ribas, stated that Reddit has blocked Bing from crawling its site, which impacts competition. Microsoft spokesperson Caitlin Roulston added that the company “honor[s] the directions provided by websites that do not want content on their pages to be used with our generative AI models.”
A Model to Follow
Huffman pointed to OpenAI’s recent announcement of SearchGPT, which will show Reddit results thanks to a deal both companies reached earlier this year. This is the model Huffman wants to replicate with other companies. According to Reddit spokesperson Tim Rathschmidt, none of the content licensing deals Reddit has done to date include exclusive use cases for its data.
What’s Next?
As Reddit continues to push for fair compensation for its data, the outcome of these negotiations could set a precedent for how online content is valued and used. Will Microsoft and others come to the table, or will the battle over data rights intensify?
Made with TRUST_AI - see the Charter: https://www.modelprop.co.uk/trust-ai