I’d like to self-host a large language model (LLM).

I don’t mind if I need a GPU and all that; at least it will be running on my own hardware, and it will probably even be cheaper than the $20 per month everyone is charging.

What LLMs are you self-hosting, and what are you using to do it?

  • Showroom7561@lemmy.ca
    14 hours ago

    You can run this right from Windows: https://jan.ai/

    You’ll need a lot of RAM, but processing is decently fast, even on a basic laptop.

    edit: holy hell. Grammar.
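To put a rough number on "a lot of RAM": the weights alone take about (parameter count × bits per weight ÷ 8) bytes. The sketch below is only that back-of-the-envelope arithmetic. The function name and the 7B example size are made up for illustration, and real usage will be higher because of context cache and runtime overhead.

```python
def estimate_weight_gib(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB.

    Assumption: every weight is stored at the same bit width and nothing
    else (context cache, runtime buffers) is counted.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


# Hypothetical 7-billion-parameter model at common precisions.
for bits in (16, 8, 4):
    print(f"7B model at {bits}-bit: ~{estimate_weight_gib(7, bits):.1f} GiB")
```

This is why quantized (for example 4-bit) downloads are the usual choice on a laptop: the same model needs roughly a quarter of the memory of its 16-bit version.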

    • dangling_cat@lemmy.blahaj.zone
      13 hours ago

      Tip: you can copy and paste a Hugging Face link directly into the search box, and it will download the model automatically! It’s also pretty smart about memory: it loads the model into your VRAM first, then into your RAM. If you can fit everything into VRAM, you get the fastest speed. But even if part of the model runs from RAM, it’s not terribly bad; it still generates text faster than you can read.
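To illustrate the "VRAM first, then RAM" idea, here is a minimal sketch of how such a split could be worked out. It is not Jan's actual logic: `split_layers`, the 1 GiB reserve, and the assumption that all layers are the same size are hypothetical simplifications.

```python
def split_layers(model_gib: float, n_layers: int,
                 free_vram_gib: float, reserve_gib: float = 1.0):
    """Return (layers_on_gpu, layers_on_cpu) for a VRAM-first placement.

    Assumptions (illustrative only): layers are equally sized, and
    reserve_gib of VRAM is kept free for context and scratch buffers.
    """
    per_layer_gib = model_gib / n_layers
    usable_gib = max(free_vram_gib - reserve_gib, 0.0)
    # Fill VRAM with as many whole layers as fit; the rest stay in RAM.
    gpu_layers = min(n_layers, int(usable_gib // per_layer_gib))
    return gpu_layers, n_layers - gpu_layers


# Hypothetical 4 GiB model with 32 layers on GPUs of different sizes.
for vram in (24.0, 3.0, 0.5):
    on_gpu, on_cpu = split_layers(4.0, 32, vram)
    print(f"{vram:>4} GiB VRAM: {on_gpu} layers on GPU, {on_cpu} in RAM")
```

The more layers land on the GPU, the faster generation gets, which matches the point above: fully in VRAM is fastest, and a partial split is slower but still usable.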