Basically a deer with a human face. Despite probably being some sort of magical nature spirit, his interests are primarily in technology and politics and science fiction.

Spent many years on Reddit and is now exploring new vistas in social media.

Joined 1 year ago
Cake day: June 9th, 2023


  • I’d say it’s how the Imperium swallowed up and destroyed a number of civilizations that had separated from it and had been developing in much more progressive, prosperous ways. The Olamic Quietude and the Interex come to mind as examples. They showed that humanity didn’t have to go down the terrible path it has ended up on.

    Or, going farther back to look for a single “worst thing” that’s had the greatest awful knock-on effects, I’d say that’d be the Old Ones’ refusal to grant any aid to the Necrontyr when they asked for it. That one selfish act sparked off the War in Heaven, which led to the creation of the Chaos Gods and everything that followed.

    If you can’t find the books available through legal channels in your country, you might want to consider looking for them on the high seas. !piracy is a good resource for that sort of thing.

  • “In case you didn’t know, you can’t train an AI on content generated by another AI because it causes distortion that reduces the quality of the output.”

    This is incorrect in the general case. You can run into problems if you do it incorrectly or in a naive manner. But this is stuff that the professionals have figured out months or years ago already. A lot of the better AIs these days are trained on “synthetic data”, which is data that’s been generated by other AIs.

    I’ve seen a lot of people fall for wishful thinking on this subject. They don’t like AI for whatever reason, they come across some news article that says something that sounds like “AI won’t work because of problem X”, and so they grab hold of that. “Model collapse” is one of those things; it’s not really a problem that serious researchers consider insurmountable.

    If you don’t want Reddit to use your posts to train AI then don’t post on Reddit. If you already did post on Reddit, it’s too late, you already gave them your content. Bear this in mind next time you join a social media site, I guess.

  • No, that’s not the concern here. He’s getting job offers from new employers while he’s midway through this personal project, and he wants to make sure the new employers don’t have anything in their employment contracts that would end up grabbing it.

    The old employers trying to claim it was also a concern, but that wasn’t what OP was concerned about so I didn’t mention it. He had a lawyer check over his old employment contract as well to make sure there wasn’t a problem there. As long as he’s not using proprietary tech retained from the old job (and he’s not) there’s no problem there.

  • “Anything you do in your own time is generally unenforceable”

    With the important caveat that your employment contract may include clauses that give them rights over that stuff anyway, and even if they’re unenforceable you could still end up having to fight in court over it.

    Definitely something to keep in mind when reading the contract over, and ideally get a lawyer to take a look. It can be expensive, but weigh that expense against the potential expense of what would happen if you get screwed over.

  • I actually have a friend who’s involved in a situation like this right now. He got laid off from his old job a few months back and while he was job hunting he started working on a project with a couple other friends that could be worth a fair bit of money. He’s had job offers since then and he got a lawyer to write up a description of the project he’s working on that could be inserted into those “I’m keeping the rights to this stuff” contract sections.

    It’s a bit different for him because it’s stuff that he’s actively working on right now, though. It sounds like your case might be simpler. If it’s stuff you haven’t done yet and don’t plan to work on while employed with this current employer, I suspect you won’t need to worry about it. Though of course, IANAL.

  • When you ask an LLM to write some prose, you could ask it “I’d like a Pulitzer-prize winning description of two snails mating” or you could ask it “I want the trashiest piece of garbage smut you can write about two snails mating.” Or even “rewrite this description of two snails mating to be less trashy and smutty.” In order for the LLM to be able to give the user what they want, it needs to know what “trashy piece of garbage smut” is. Negative examples are still very useful for LLM training.

  • Yeah, the window of opportunity for that has already started rapidly closing. In 2022 the strategy that worked for launching the AI craze was “throw as much data as you possibly can into the training phase and somehow a functioning LLM comes out.” But over 2023 the state of the art advanced a lot and it became apparent that you don’t need vast reams of raw data; what’s really ideal for producing a good LLM is a smaller amount of high-quality data.

    You can still use Reddit data as a source for that, but it needs extensive culling and massaging to make it really good. I can easily see that making Reddit less unique and so less competitive.