
Google Will Use Your Reddits to Train AI

Google Will Use Your Reddits to Train AI - Screenshot from The Keyword, official announcement Source

Google Will Use Your Reddits to Train AI – Key Notes

  • Google’s $60 million agreement with Reddit for AI training data.
  • Concerns over user data licensing.
  • The deal supports Reddit’s path to profitability despite user concerns.

The Deal Between Google and Reddit

As Rajan Patel of Google announced on The Keyword, the company has entered into a $60 million agreement with Reddit to use its online communities as a source of AI training data. The deal follows last year’s moderator protests over API access.

According to Reuters, Google will analyze a large number of posts on Reddit to train a language model.

Bloomberg first reported the deal but did not name the recipient of the data, describing it only as a “large AI company.” Although Reddit generated only $800 million in revenue last year, it is considering an IPO at a valuation of $5 billion, reflecting the potential of its online communities as a training ground for AI models.

However, there have been concerns over licensing the thoughts and ideas of users.

User Concerns and Moderator Protests

Last year, some of the most popular subreddits went offline in protest of the company’s decision to charge for API access, announced in April 2023.

This deal with Google aligns with Reddit’s goals, as major tech companies are eager for data and have turned to various sources such as news organizations, community forums, and even universities.

While this may upset users, it offers Reddit a path to profitability. CEO Steve Huffman has said that Reddit’s data is valuable and that the company does not want to give it away for free to the largest companies in the world.

“The Reddit corpus of data is really valuable,” Huffman told The New York Times. “But we don’t need to give all of that value to some of the largest companies in the world for free.”

However, the decision to charge for API access affected not only big companies but also small, independent researchers. It also made it harder for moderators to manage their communities, and some argued that it made the overall experience worse for Reddit’s 800 million monthly active users.


  • Reddit: A vast network of communities where people can dive into their interests, hobbies, and passions.
  • AI Training Data: The large sets of information used to teach AI models how to perform tasks by identifying patterns and making decisions.
  • Reddit IPO: The anticipated public offering of Reddit’s stock, marking its transition from a private to a publicly traded company.
  • API (Application Programming Interface): A set of rules that allows different software applications to communicate with each other.
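
As a rough illustration of the API definition above, the sketch below shows how a program might request a subreddit's newest posts and pull the text out of the response. It is a minimal sketch under assumptions, not the pipeline used by Reddit or Google: the URL pattern, the `Listing` response shape, and the `title` and `selftext` field names are assumed from Reddit's public JSON listings, the helper names `build_listing_url` and `extract_post_text` are hypothetical, and `SAMPLE_RESPONSE` is made-up data standing in for a real reply. No network request is made here; real access is subject to Reddit's API terms, authentication, and rate limits.

```python
import json
from urllib.parse import urlencode


def build_listing_url(subreddit: str, limit: int = 25) -> str:
    """Build the URL for a subreddit's newest posts (assumed endpoint pattern)."""
    query = urlencode({"limit": limit})
    return f"https://www.reddit.com/r/{subreddit}/new.json?{query}"


def extract_post_text(listing_json: str) -> list[str]:
    """Return one text string per post from a Listing-shaped JSON document."""
    listing = json.loads(listing_json)
    texts = []
    for child in listing["data"]["children"]:
        post = child["data"]
        title = post.get("title", "")
        body = post.get("selftext", "")  # empty for link-only posts
        texts.append(f"{title}\n{body}".strip())
    return texts


# Made-up payload that mimics the assumed response shape; not real Reddit data.
SAMPLE_RESPONSE = json.dumps({
    "kind": "Listing",
    "data": {
        "children": [
            {"kind": "t3", "data": {"title": "First post", "selftext": "Hello world"}},
            {"kind": "t3", "data": {"title": "Link-only post", "selftext": ""}},
        ]
    },
})

if __name__ == "__main__":
    print(build_listing_url("python", limit=5))
    for text in extract_post_text(SAMPLE_RESPONSE):
        print("---")
        print(text)
```

Parsing is kept separate from fetching so the text-extraction step can be checked against a saved response without touching the live service.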

Frequently Asked Questions

  1. How is Reddit used to train AI?
    Reddit posts and comments provide vast, diverse conversational data for training AI models in language processing and user interaction.
  2. What does API access mean for AI training on Reddit?
    API access lets developers retrieve Reddit’s data programmatically, which is crucial for feeding and training AI algorithms.
  3. Are there privacy concerns when using Reddit to train AI?
    Yes. Using Reddit for AI training involves handling user-generated content, which may contain personal information.

Laszlo Szabo / NowadAIs

As an avid AI enthusiast, I immerse myself in the latest news and developments in artificial intelligence. My passion for AI drives me to explore emerging trends, technologies, and their transformative potential across various industries!

