Unfortunately that’s not really relevant to LLMs beyond inserting things into the text you feed them. For every single token they predict, they make a full pass through the multi-gigabyte weights. It’s largely memory-bound, and not integrated with any kind of sane external memory algorithm.
There are some techniques that muddy this a bit, like MoE and dynamic LoRA loading, but the principle is the same.
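To make the point concrete, here's a toy sketch (not a real LLM, just illustrative numbers I made up): each generated token requires reading every weight matrix once, so per-token cost scales with total weight size, which is why decoding is memory-bandwidth-bound.

```python
import numpy as np

rng = np.random.default_rng(0)
d, vocab, n_layers = 64, 100, 4

# "Multi-gigabyte weights" in miniature: one matrix per layer plus an output head.
layers = [rng.standard_normal((d, d)) for _ in range(n_layers)]
head = rng.standard_normal((d, vocab))
weight_bytes = sum(w.nbytes for w in layers) + head.nbytes

def next_token(hidden):
    # Every layer's weights are read for this single prediction.
    for w in layers:
        hidden = np.tanh(hidden @ w)
    logits = hidden @ head
    return int(np.argmax(logits)), hidden

hidden = rng.standard_normal(d)
tokens = []
for _ in range(8):  # 8 tokens -> 8 full passes over the weights
    tok, hidden = next_token(hidden)
    tokens.append(tok)

# Bytes of weights streamed from memory is (weight size) x (tokens generated),
# regardless of how short the prompt or output is.
bytes_streamed = weight_bytes * len(tokens)
print(f"{len(tokens)} tokens, ~{bytes_streamed} weight-bytes streamed")
```

Scale `weight_bytes` up to tens of gigabytes and you see why a single token can't be produced without touching essentially all of memory, unlike a database lookup that reads only the pages it needs.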