• brucethemoose@lemmy.world
    21 days ago

    https://en.m.wikipedia.org/wiki/External_memory_algorithm

    Unfortunately that’s not really relevant to LLMs beyond inserting things into the text you feed them. For every single token they predict, they make a pass through the multi-gigabyte weights. It’s largely memory-bandwidth bound, and not integrated with any kind of sane external memory algorithm.

    There are some techniques that muddy this a bit, like MoE and dynamic LoRA loading, but the principle is the same.
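
To put rough numbers on the memory-bound point, here is a back-of-the-envelope sketch. It assumes decode speed is capped by how fast the weights can be read once per token and ignores compute, caches, and batching. The function name, the 14 GB model size, the 100 GB/s bandwidth, and the 25% active fraction for the MoE case are all made-up illustrative values, not measurements.

```python
def max_tokens_per_second(weight_gb: float,
                          bandwidth_gb_per_s: float,
                          active_fraction: float = 1.0) -> float:
    """Rough upper bound on decode speed when every token requires
    reading the active weights from memory once.

    active_fraction is a hypothetical knob: 1.0 for a dense model,
    smaller for an MoE model that only touches some experts per token.
    """
    if weight_gb <= 0 or bandwidth_gb_per_s <= 0:
        raise ValueError("sizes and bandwidth must be positive")
    if not 0 < active_fraction <= 1:
        raise ValueError("active_fraction must be in (0, 1]")
    gb_read_per_token = weight_gb * active_fraction
    return bandwidth_gb_per_s / gb_read_per_token


# Illustrative numbers only: 14 GB of weights, 100 GB/s memory bandwidth.
dense = max_tokens_per_second(14.0, 100.0)
# Same model size, but assume only a quarter of the weights are read per token.
moe = max_tokens_per_second(14.0, 100.0, active_fraction=0.25)

print(f"dense: {dense:.1f} tok/s, MoE (25% active): {moe:.1f} tok/s")
```

MoE raises the ceiling by reading fewer bytes per token, but the bound still scales with memory bandwidth, which is why the principle stays the same.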