@glizzyguzzler

glizzyguzzler@lemmy.blahaj.zone · 3 days ago

You can, but the stuff that’s really useful (very competent code completion) needs gigantic context lengths that even rich peeps with $2k GPUs can’t do. And that’s ignoring the training power and hardware costs to get the models.

Techbros chasing VC funding are pushing LLMs to the physical limit of what humanity can provide power and hardware-wise. Way less hype and letting them come to market organically in 5/10 years would give the LLMs a lot more power efficiency at the current context and depth limits. But that ain’t this timeline, we just got VC money looking to buy nuclear plants and fascists trying to subdue the US for the techbro oligarchs womp womp

glizzyguzzler@lemmy.blahaj.zone · 3 days ago

No, they’re right. The “research” is biased by the company that sells the product and wants to hype it. Many layers don’t make think or reason, but they’re glad to put them in quotes that they hope peeps will forget were there.

glizzyguzzler@lemmy.blahaj.zone · 3 days ago

So close, LLMs work via matrix multiplication, which is well understood by many meat bags and matrix math can’t think. If a meat bag can’t do matrix math, that’s ok, because the meat bag doesn’t work via matrix multiplication. lol imagine forgetting how to do matrix multiplication and disappearing into a singularity or something

glizzyguzzler@lemmy.blahaj.zone · 3 days ago

They do not, and I, a simple skin-bag of chemicals (mostly water tho) do say

glizzyguzzler@lemmy.blahaj.zone · 3 days ago

I was channeling the Interstellar docking computer (“improper contact” in such a sassy voice) ;)

There is a distinction between data and an action you perform on data (matrix maths, codec algorithm, etc.). It’s literally completely different.

An audio codec (not a pipeline) is just actually doing math - just like the workings of an LLM. There’s plenty of work to be done after the audio codec decodes the m4a to get to tunes in your ears. Same for an LLM, sandwiching those matrix multiplications that make the magic happen are layers that crunch the prompts and assemble the tokens you see it spit out.

LLMs can’t think, that’s just the fact of how they work. The problem is that AI companies are happy to describe them in terms that make you think they can think to sell their product! I literally cannot be wrong that LLMs cannot think or reason, there’s no room for debate, it’s settled long ago. AI companies will string the LLMs together and let them chew for a while to try make themselves catch when they’re dropping bullshit. It’s still not thinking and reasoning though. They can be useful tools, but LLMs are just tools not sentient or verging on sentient

glizzyguzzler@lemmy.blahaj.zone · edit-2 6 days ago

Improper comparison; an audio file isn’t the basic action on data, it is the data; the audio codec is the basic action on the data

“An LLM model isn’t really an LLM because it’s just a series of numbers”

But the action of turning the series of numbers into something of value (audio codec for an audio file, matrix math for an LLM) are actions that can be analyzed

And clearly matrix multiplication cannot reason any better than an audio codec algorithm. It’s matrix math, it’s cool we love matrix math. Really big matrix math is really cool and makes real sounding stuff. But it’s just matrix math, that’s how we know it can’t think

glizzyguzzler@lemmy.blahaj.zone · 6 days ago

It’s literally tokens. Doesn’t matter if it completes the next word or next phrase, still completing the next most likely token 😎😎 can’t think can’t reason can witch’s brew facsimile of something done before

glizzyguzzler@lemmy.blahaj.zone · 6 days ago

You can prove it’s not by doing some matrix multiplication and seeing its matrix multiplication. Much easier way to go about it

glizzyguzzler@lemmy.blahaj.zone · 6 days ago

Too deep on the AI propaganda there, it’s completing the next word. You can give the LLM base umpteen layers to make complicated connections, still ain’t thinking.

The LLM corpos trying to get nuclear plants to power their gigantic data centers while AAA devs aren’t trying to buy nuclear plants says that’s a straw man and you simultaneously also are wrong.

Using a pre-trained and memory-crushed LLM that can run on a small device won’t take up too much power. But that’s not what you’re thinking of. You’re thinking of the LLM only accessible via ChatGPT’s api that has a yuge context length and massive matrices that needs hilariously large amounts of RAM and compute power to execute. And it’s still a facsimile of thought.

It’s okay they suck and have very niche actual use cases - maybe it’ll get us to something better. But they ain’t gold, they ain’t smart, and they ain’t worth destroying the planet.

glizzyguzzler@lemmy.blahaj.zone · 6 days ago

Can’t help but here’s a rant on people asking LLMs to “explain their reasoning” which is impossible because they can never reason (not meant to be attacking OP, just attacking the “LLMs think and reason” people and companies that spout it):

LLMs are just matrix math to complete the most likely next word. They don’t know anything and can’t reason.

Anything you read or hear about LLMs or “AI” getting “asked questions” or “explain its reasoning” or talking about how they’re “thinking” is just AI propaganda to make you think they’re doing something LLMs literally can’t do but people sure wish they could.

In this case it sounds like people who don’t understand how LLMs work eating that propaganda up and approaching LLMs like there’s something to talk to or discern from.

If you waste egregiously high amounts of gigawatts to put everything that’s ever been typed into matrices you can operate on, you get a facsimile of the human knowledge that went into typing all of that stuff.

It’d be impressive if the environmental toll making the matrices and using them wasn’t critically bad.

TLDR; LLMs can never think or reason, anyone talking about them thinking or reasoning is bullshitting, they utilize almost everything that’s ever been typed to give (occasionally) reasonably useful outputs that are the most basic bitch shit because that’s the most likely next word at the cost of environmental disaster

glizzyguzzler@lemmy.blahaj.zone · edit-2 27 days ago

I got my parents to get a NAS box, stuck it in their basement. They need to back up their stuff anyway. I put in 2 18 TB drives (mirrored BTRFS raid1) from server part deals (peeps have said that site has jacked their prices, look for alts). They only need like 4 TB at most. I made a backup samba share for myself. It’s the cheapest symbology box possible, their software to make a samba share with a quota.

I then set up a wireguard connection on an RPi, taped that to the NAS, and wireguard to the local network with a batch script. Mount the samba share and then use restic to back up my data. It works great. Restic is encrypted, I don’t have to pay for storage monthly, their electricity is cheap af, they have backups, I keep tabs on it, everyone wins.

Next step is to go the opposite way for them, but no rush on that goal, I don’t think their basement would get totaled in a fire and I don’t think their house (other than the basement) would get totaled in a flood.

If you don’t have a friend or relative to do a box-at-their-house (peeps might be enticed with reciprocal backups), restic still fits the bill. Destination is encrypted, has simple commands to check data for validity.

Rclone crypt is not good enough. Too many issues (path length limits, password “obscured” but otherwise there, file structure preserved even if names are encrypted). On a VPS I use rclone to be a pass-through for restic to backup a small amount of data to a goog drive. Works great. Just don’t fuck with the rclone crypt for major stuff.

Lastly I do use rclone crypt to upload a copy of the restic binary to the destination, as the crypt means the binary can’t be fucked with and the binary there means that is all you need to recover the data (in addition to the restic password you stored safely!).