• Hackworth@lemmy.world
    21 days ago

    I’ll try it out! It’s been a hot minute, and it seems like there are new options all the time.

    • brucethemoose@lemmy.world
      21 days ago

      Try a new quantization as well! Something like an IQ4-M, depending on the size of your GPU, or even better, a 4.5bpw exl2 with Q6 cache if you can manage to set up TabbyAPI.
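
For a rough sense of how bits per weight (bpw) relates to GPU size, here is a back-of-the-envelope sketch. It only counts the quantized weights (parameters × bpw ÷ 8 bytes) and ignores the KV cache, activations, and runtime overhead, so treat the result as a lower bound. The 34B parameter count and the function name `weight_gib` are made up for illustration and are not from the thread.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights alone, in GiB.

    Assumption: every parameter is stored at the same average bpw.
    KV cache and runtime overhead are NOT included.
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    # Hypothetical 34B-parameter model at a few average bit widths.
    for bpw in (4.5, 6.0, 8.0, 16.0):
        print(f"{bpw:>4} bpw: {weight_gib(34, bpw):.1f} GiB")
```

Whatever is left on the card after the weights has to hold the context cache, which is where a quantized cache helps.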