For context: I created a video search engine last year, then shut it down and put the data online. You can read about it here: https://www.bendangelo.me/2024/07/16/failed-attempt-at-creating-a-video-search-engine/

I put that project on hold because of scaling issues. Anyway, I’m back with another idea. I’ve been frustrated with how AI slop is ruining the internet, and recently it’s been hitting YouTube pretty hard with AI videos. I’m brainstorming a tool for people to self-host:

Self-hosted crawler: Pick which sites/videos to index (blogs, forums, YT channels, etc.).

AI chat interface: Ask questions like, “Show me Rust tutorials from 2023” or “Summarize recent posts about homelab backups.”

Optional sharing: Pool indexes with trusted friends/communities.

Why? No Google/YouTube spam—only content you choose. Works offline (archive forums, videos, docs). Local AI (Mistral) or cloud (paid) for smarter searches.

Would this be useful to you? What sites would you crawl? Any killer features I’m missing?

Prototype in progress—just testing interest!

  • Xanza@lemm.ee
    17 days ago

    Why the hell does everything have to be AI for you people to be happy? I just plain don’t understand it. We know that AI hurts your critical thinking and reasoning skills, and we continue to just pack AI into everything… Doesn’t make sense. Sooner or later you’re gonna need to ask ChatGPT whether or not you need to wipe your own ass.

  • DarkSpectrum@lemmy.world
    17 days ago

    AI uses far more resources than standard search engines, and it comes at a time when the whole planet needs to slow down climate change.

  • CameronDev@programming.dev
    17 days ago

    I personally have zero interest in AI search, if you mean LLMs. The fact that it can make stuff up also means it can miss stuff. Neither is acceptable for a search engine.

    If you mean some kind of deterministic algorithm for indexing and searching, then maybe.

    Also, attempting to crawl sites locally sounds like a great way to get banned from those sites for looking like a bot.
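One way a self-hosted crawler could reduce that risk is to honor each site's robots.txt and space out its requests. The sketch below is only an illustration, not part of the proposed tool: the `PoliteFetchPolicy` class, the `my-crawler` user agent, and the 5-second default delay are all assumptions. It uses Python's standard `urllib.robotparser`, and it does not guarantee a site will not block the crawler.

```python
import time
from urllib import robotparser


class PoliteFetchPolicy:
    """Hypothetical helper: decides whether and when a URL may be fetched."""

    def __init__(self, robots_txt, user_agent, default_delay=5.0):
        self.user_agent = user_agent
        self.parser = robotparser.RobotFileParser()
        # Parse robots.txt text that the caller has already downloaded.
        self.parser.parse(robots_txt.splitlines())
        # Use the site's Crawl-delay if present, else an assumed default.
        delay = self.parser.crawl_delay(user_agent)
        self.delay = float(delay) if delay is not None else default_delay
        self._last_fetch = None

    def allowed(self, url):
        """True if robots.txt permits this user agent to fetch the URL."""
        return self.parser.can_fetch(self.user_agent, url)

    def wait_time(self, now=None):
        """Seconds to wait before the next request to this site."""
        now = time.monotonic() if now is None else now
        if self._last_fetch is None:
            return 0.0
        return max(0.0, self.delay - (now - self._last_fetch))

    def mark_fetched(self, now=None):
        """Record that a request was just made."""
        self._last_fetch = time.monotonic() if now is None else now


ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""

policy = PoliteFetchPolicy(ROBOTS_TXT, user_agent="my-crawler")
```

A crawl loop would call `policy.allowed(url)` before each request, sleep for `policy.wait_time()`, and call `policy.mark_fetched()` afterwards.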

  • rtxn@lemmy.world
    17 days ago

    No. I’m so bloody fed up with AI “search” solutions that return everything on the fucking planet except what I want. Text search has been a solved problem for a decade. All I want out of a search engine is to be deterministic, stable, and reliable, and to look in titles, descriptions, and keywords. Vibe processing is completely unnecessary and will only create issues.

    If you really want to iNnoVAte, then consider creating an index with transcripts and summaries that users can search by keywords.
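The deterministic keyword index suggested here could look roughly like the sketch below: an inverted index mapping each word in a title or transcript to the videos that contain it. The `KeywordIndex` class, the simple lowercase tokenizer, and the sample data are assumptions made up for illustration; a real index would also need ranking, stemming, and persistent storage.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


class KeywordIndex:
    """Hypothetical inverted index over video titles and transcripts."""

    def __init__(self):
        self.postings = defaultdict(set)  # word -> set of document ids
        self.titles = {}                  # document id -> title

    def add(self, doc_id, title, transcript):
        """Index every word of the title and transcript under doc_id."""
        self.titles[doc_id] = title
        for token in tokenize(title + " " + transcript):
            self.postings[token].add(doc_id)

    def search(self, query):
        """Return sorted ids of documents containing all query words."""
        tokens = tokenize(query)
        if not tokens:
            return []
        result = set(self.postings.get(tokens[0], set()))
        for token in tokens[1:]:
            result &= self.postings.get(token, set())
        return sorted(result)


index = KeywordIndex()
index.add("v1", "Rust tutorial", "ownership and borrowing explained")
index.add("v2", "Homelab backups", "comparing restic and borg")
```

The same query always returns the same result set, which is the stable, repeatable behavior the comment asks for.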

  • Zwuzelmaus@feddit.org
    17 days ago

    No. Never would I self-host a search engine.

    The crawler would eat up so many more resources than I am ever willing to spend.

  • lambalicious@lemmy.sdf.org
    17 days ago

    I don’t want AI slop from big corpo, and you think I am gonna want AI slop that’s just as wasteful and harmful just because it’s “locally produced”? That’s a Republican-ish crap line of thought.