In case you didn’t know, you can’t train an AI on content generated by another AI because it causes distortion that reduces the quality of the output. This phenomenon is known as model collapse. It is also very difficult to filter AI text out of human text in a database.
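
A toy sketch of the effect being described, not how any real language model is trained: a single Gaussian stands in for the "model", and every generation is fit only to samples drawn from the previous generation's fit. The function name, sample size, and generation count are all assumptions chosen for illustration.

```python
import random
import statistics


def simulate_recursive_training(generations=500, n_samples=20, seed=0):
    """Toy model: each generation fits a Gaussian to samples from the previous fit.

    Returns the fitted standard deviation after each generation.
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # generation 0 stands in for the "human" data distribution
    history = [sigma]
    for _ in range(generations):
        # "Generate content" with the current model ...
        samples = [rng.gauss(mu, sigma) for _ in range(n_samples)]
        # ... then "train" the next model on nothing but that output.
        mu = statistics.fmean(samples)
        sigma = statistics.pstdev(samples)
        history.append(sigma)
    return history


if __name__ == "__main__":
    history = simulate_recursive_training()
    for gen in (0, 10, 100, 500):
        print(f"generation {gen:3d}: std = {history[gen]:.6f}")
```

In this setup the fitted spread tends to shrink from one generation to the next, because each fit only sees a small sample of the previous one and the estimation errors compound. That loss of variety is the toy analogue of the distortion described above.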

So if you were to start using AI to generate comments and posts on Reddit, their database would be less useful for training AI and therefore the company wouldn’t be able to sell it for that purpose.

  • 🇰 🌀 🇱 🇦 🇳 🇦 🇰 ℹ️@yiffit.net · English · 2 upvotes · edited 7 months ago

    So if you were to start using AI to generate comments and posts on Reddit, their database would be less useful for training AI and therefore the company wouldn’t be able to sell it for that purpose.

    It feels like Reddit was already using bots to make posts after they killed third-party apps. It’s been pointed out a lot here that many comment chains on the site these days make no sense unless they were written by AI or bots.

    • piecat@lemmy.world · 1 upvote · 7 months ago

      It’s not just the content, it’s the ecosystem.

      If you’re training AI, you need a way to evaluate outputs. What better way than through karma score?

    • Khrux@ttrpg.network · English · 0 upvotes · 7 months ago

      Even before then, you’d always find comments in any larger comment section that were irrelevant praise, posted by bots to build a “realistic” Reddit account to sell later to marketing companies.

      Hell, I believe I once used a tool to value my Reddit account at like $200, and it literally told me how kind my responses were. Also, for generating comment karma, responding to a post early is worth much more than writing a good response.

  • febra@lemmy.world · 1 upvote · 7 months ago

    With the amount of bot-generated content already on Reddit, that data can’t be of much value.

  • FaceDeer@kbin.social · 1 upvote · 7 months ago

    In case you didn’t know, you can’t train an AI on content generated by another AI because it causes distortion that reduces the quality of the output.

    This is incorrect in the general case. You can run into problems if you do it incorrectly or in a naive manner. But this is stuff that the professionals have figured out months or years ago already. A lot of the better AIs these days are trained on “synthetic data”, which is data that’s been generated by other AIs.

    I’ve seen a lot of people fall for wishful thinking on this subject. They don’t like AI for whatever reason, they read some news article that says something that sounds like “AI won’t work because of problem X”, and so they grab hold of that. “Model collapse” is one of those things; it’s not really a problem that serious researchers consider insurmountable.

    If you don’t want Reddit to use your posts to train AI then don’t post on Reddit. If you already did post on Reddit, it’s too late, you already gave them your content. Bear this in mind next time you join a social media site, I guess.