• kautau@lemmy.world · 1 day ago

    Ironically, I think that to truly train an LLM the way fascists would want, they’d need more content than exists: there isn’t enough original fascist revisionist writing, so they’d need an LLM to generate most or all of the training data, which would lead to model collapse: https://en.wikipedia.org/wiki/Model_collapse
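    A toy sketch of that failure mode, with made-up numbers: each “generation” is trained only on samples emitted by the previous generation, and the diversity of the data decays, which is the mechanism the linked article describes.

    ```python
    # Toy model-collapse simulation (illustrative only, not the paper's setup):
    # each generation is "trained" by fitting a Gaussian to the previous
    # generation's samples, then emits fresh samples from that fit.
    import random
    import statistics

    random.seed(0)
    data = [random.gauss(0.0, 1.0) for _ in range(100)]  # generation 0: real data

    for gen in range(1, 31):
        mu = statistics.fmean(data)
        sigma = statistics.stdev(data)
        # The next generation sees only the previous generation's output.
        data = [random.gauss(mu, sigma) for _ in range(100)]
        if gen % 5 == 0:
            print(f"gen {gen:2d}: mu={mu:+.3f} sigma={sigma:.3f}")
    # Any single run is noisy, but sigma tends toward zero over generations:
    # the tails of the original distribution are the first thing to vanish.
    ```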

      • Tar_Alcaran@sh.itjust.works · 1 day ago

        The big problem with training LLMs is that you need good data, but there’s so much of it that you can’t realistically separate all the “good” from all the “bad” by hand. So in practice you train on the set of all data, plus a much, much smaller set of tagged and vetted “good” data.
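        A minimal sketch of that mixed-data setup, with hypothetical corpus sizes and weights: train on everything, but upweight the small tagged set so the model actually sees it.

        ```python
        # Hypothetical sizes/weights illustrating "all data + a small tagged
        # set": curated docs are rare in the pool but heavily upweighted.
        import random
        from dataclasses import dataclass

        @dataclass
        class Example:
            text: str
            weight: float  # relative sampling weight during training

        crawl = [Example(f"crawl doc {i}", weight=1.0) for i in range(100_000)]
        curated = [Example(f"curated doc {i}", weight=200.0) for i in range(500)]

        pool = crawl + curated
        weights = [ex.weight for ex in pool]

        # Curated docs are ~0.5% of the pool, but 500 * 200.0 == 100_000 * 1.0,
        # so they carry half the total sampling mass: the model sees the tiny
        # tagged set about as often as the entire crawl combined.
        batch = random.choices(pool, weights=weights, k=8)
        for ex in batch:
            print(ex.text)
        ```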

    • NuXCOM_90Percent@lemmy.zip · 1 day ago (edited)

      No. They don’t need to generate data to train on. There is PLENTY of white supremacist hate shit out there already.

      The issue is one of labeling and weighting, which is a largely solved problem. It isn’t 100% solved and there will be isolated failures, but Grok breaks under even the most cursory poking.

      Don’t believe me? Go look at the crowd who can convert any image- or text-generating model into porn/smut/liveleak in nothing flat. Or, for a less horrifying version of that, look at how techniques like RAG can take generalized models and heavily weight them toward what you actually care about.
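      A bare-bones sketch of the RAG idea (bag-of-words retrieval standing in for real embeddings and a vector store, over a hypothetical three-document corpus): fetch the most relevant text and prepend it to the prompt, steering a general model without touching its weights.

      ```python
      # Minimal retrieval-augmented generation skeleton (illustrative only;
      # production systems use learned embeddings, not bag-of-words).
      import math
      from collections import Counter

      docs = [
          "RAG prepends retrieved context to the prompt of a general model",
          "model collapse happens when models train on model generated output",
          "fine tuning updates weights on a small curated dataset",
      ]

      def bow(text: str) -> Counter:
          return Counter(text.lower().split())

      def cosine(a: Counter, b: Counter) -> float:
          dot = sum(a[t] * b[t] for t in a)
          norm = math.sqrt(sum(v * v for v in a.values())) \
               * math.sqrt(sum(v * v for v in b.values()))
          return dot / norm if norm else 0.0

      def retrieve(query: str, k: int = 1) -> list[str]:
          q = bow(query)
          return sorted(docs, key=lambda d: cosine(q, bow(d)), reverse=True)[:k]

      query = "how does RAG use retrieved context in a prompt"
      context = "\n".join(retrieve(query))
      # Only the model's input changes; its weights stay generalized.
      print(f"Context:\n{context}\n\nQuestion: {query}")
      ```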

      Nah. This, like most things Musk, just highlights how grossly incompetent basically all of his companies are. Even SpaceX mostly coasts on being the only ones allowed to work on this stuff (RIP NASA and, to a lesser extent, JPL) and on poaching talent from everyone else to keep them from showing that.