• 1 Post
  • 13 Comments
Joined 2 years ago
Cake day: June 21st, 2023


  • A lot of these systems are silly because they don’t have much RAM, and things don’t begin to get interesting with LLMs until you can run 70B-parameter models and above.

    The Mac Studio has seemed like an affordable way to run 200B+ models, mainly because of its unified memory architecture (compare getting 512GB of RAM in a Mac Studio to building a machine with enough GPU memory to get there).

    If you look, the industry in general is starting to move towards that sort of design now:

    https://frame.work/desktop

    The Framework Desktop, for instance, can be configured with 128GB of RAM ($2k) and should be good for handling 70B models while maintaining something that looks like efficiency.

    You will not train or refine models with these setups (I think you would still benefit from the raw power GPUs offer), but the main sticking point in running local models has been VRAM and how much it costs to get that from AMD / Nvidia.

    That said, I only care about all of this because I mess around with a lot of RAG things. I am not a typical consumer.
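
For anyone who wants to sanity-check the 70B and 200B+ figures above, here is a rough back-of-the-envelope sketch. It only counts the memory for the weights (parameters × bits per weight ÷ 8) and then pads that with an overhead factor. The function names and the 1.2 overhead factor are my own assumptions for illustration, not measured numbers; real usage depends on context length, the runtime, and how much of the unified memory the system actually lets the GPU use.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in decimal GB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


def fits(params_billion: float, bits_per_weight: float,
         ram_gb: float, overhead: float = 1.2) -> bool:
    """Rough check: do the weights, padded by an assumed overhead, fit in RAM?

    The overhead factor is a guess standing in for KV cache and runtime
    buffers; it is not a measured value.
    """
    return weights_gb(params_billion, bits_per_weight) * overhead <= ram_gb


if __name__ == "__main__":
    # (label, parameters in billions, bits per weight, RAM in GB)
    cases = [
        ("70B @ 4-bit on 128GB", 70, 4, 128),
        ("70B @ 16-bit on 128GB", 70, 16, 128),
        ("200B @ 8-bit on 512GB", 200, 8, 512),
    ]
    for label, params, bits, ram in cases:
        size = weights_gb(params, bits)
        print(f"{label}: ~{size:.0f} GB of weights, fits={fits(params, bits, ram)}")
```

By this estimate a 4-bit 70B model is about 35 GB of weights, which is why 128GB is comfortable for it, while the same model at 16-bit (about 140 GB) would not fit without quantization.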