Smaug-72B-v0.1: The New Open-Source LLM Roaring to the Top of the Leaderboard

hexual@lemmy.world · 5 months ago


TheChurn@kbin.social · 5 months ago

Every billion parameters needs about 2 GB of VRAM if you use the bfloat16 representation: 16 bits per parameter, 8 bits per byte -> 2 bytes per parameter.

1 billion parameters ~ 2 billion bytes ~ 2 GB.

From the name, this model has 72 billion parameters, so ~144 GB of VRAM for the weights alone.
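
To make the arithmetic above easy to redo for other bit widths, here is a rough sketch. The function name `weight_memory_gb` is made up for illustration, and it only counts the weights themselves (no activations, no context cache), so treat the numbers as a lower bound rather than an exact requirement.

```python
def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes).

    Hypothetical helper for illustration: bits -> bytes -> GB.
    """
    bytes_total = n_params * bits_per_param / 8  # 8 bits per byte
    return bytes_total / 1e9


if __name__ == "__main__":
    n_params = 72e9  # 72 billion parameters, taken from the model name
    for bits in (16, 8, 5, 4, 2.5):
        gb = weight_memory_gb(n_params, bits)
        print(f"{bits:>4} bits/param: ~{gb:.1f} GB")
```

Real quantized files store extra data such as scaling factors alongside the weights, so they tend to come out larger than the nominal bit width suggests.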

nicetriangle@kbin.social · 5 months ago

Ok but will this run on my TI-83? It’s a + model.

Rai@lemmy.dbzer0.com · 5 months ago

Only if it’s silver.

𝕸𝖔𝖘𝖘@infosec.pub · 5 months ago

Dang. So close.

Rai@lemmy.dbzer0.com · 5 months ago

My 83 was ganked by some kid I knew so my folks bought me a silver. He denied it. I learned that day to write my name in secret spots.

𝕸𝖔𝖘𝖘@infosec.pub · 5 months ago

That kid you knew was a dick. At least he taught you a valuable lesson, I guess.

Rai@lemmy.dbzer0.com · 5 months ago

He absolutely was a dick. I stopped being mates with him after that. My school was like “yeah the cameras didn’t work that day actually”

𝕸𝖔𝖘𝖘@infosec.pub · 5 months ago

Leads me to believe that the cameras never actually worked.

Rai@lemmy.dbzer0.com · 5 months ago

I believe that. Or they just didn’t want to be responsible for dealing with theft. Both ways make perfect sense to me.

whoelectroplateuntil@sh.itjust.works · 5 months ago

no. but put this clustering software i wrote in ti-basic on 40 million of them? still no

FaceDeer@kbin.social · 5 months ago

It’s been discovered that you can reduce the bits per parameter down to 4 or 5 and still get good results. Just saw a paper this morning describing a technique to get down to 2.5 bits per parameter, even, and apparently it’s fine. We’ll see if that works out in practice, I guess.

OutrageousUmpire@lemmy.world · 5 months ago

Any idea what the Q8 requirements would be? Or Q4 or Q5?

General_Effort@lemmy.world · 5 months ago

https://huggingface.co/senseable/Smaug-72B-v0.1-gguf/tree/main

About 44 GB and 50 GB for the Q4 and Q5. You’d need quite a bit extra to fully use the 32k context length.
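
For a feel of how much "extra" the context can cost, here is a back-of-the-envelope sketch of the key/value cache size for a standard transformer. The function name `kv_cache_gb` and the layer/head numbers in the example are placeholders I picked for illustration, not values checked against this model's config, so plug in the real ones before trusting the result.

```python
def kv_cache_gb(
    n_layers: int,
    n_kv_heads: int,
    head_dim: int,
    context_len: int,
    bytes_per_value: int = 2,  # 2 bytes assumes a 16-bit cache
) -> float:
    """Approximate KV cache size in GB (10^9 bytes) for one sequence.

    Each layer stores two tensors (keys and values), each holding
    context_len x n_kv_heads x head_dim values.
    """
    total_bytes = (
        2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_value
    )
    return total_bytes / 1e9


if __name__ == "__main__":
    # Placeholder architecture values, assumed for illustration only.
    size = kv_cache_gb(n_layers=80, n_kv_heads=64, head_dim=128,
                       context_len=32 * 1024)
    print(f"KV cache at 32k context: ~{size:.1f} GB")
```

The cache grows linearly with context length, so halving the context roughly halves this figure. Models that share key/value heads across query heads have a smaller `n_kv_heads` and need proportionally less.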

rs137@lemmy.world · 5 months ago

Llama 2 70B with 8-bit quantization takes around 80 GB of VRAM, if I remember correctly. I tested it a while ago.

abacusai/Smaug-72B-v0.1 · Hugging Face