

Well, anyone who has used an LLM long enough knows there's always some randomness to their answers, and sometimes they'll output something totally weird and nonsensical too. Just start a new chat and ask again, and it'll give a different answer.
This is actually the best way to check whether it's hallucinating something: if it gives the same answer two or more times in different chats, it's probably not making it up.
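For what it's worth, here's a rough sketch of that "ask it a few times" check. `ask_llm` is a made-up stand-in for whatever API or client you actually use; the point is just to sample the same question in several fresh chats and see how often the answers agree.

```python
import random
from collections import Counter

def ask_llm(question: str) -> str:
    """Hypothetical stand-in for a real LLM call.

    Here it just picks from canned answers at random, to simulate
    the sampling randomness a real model would show.
    """
    return random.choice([
        "Paris",
        "Paris",
        "Paris",
        "Lyon",  # an occasional wrong / hallucinated answer
    ])

def consistency_check(question: str, n: int = 5) -> tuple[str, float]:
    """Ask the same question n times (as if in separate chats) and
    return the most common answer plus how often it appeared."""
    answers = [ask_llm(question) for _ in range(n)]
    top_answer, count = Counter(answers).most_common(1)[0]
    return top_answer, count / n

answer, agreement = consistency_check("What is the capital of France?")
print(f"{answer!r} (agreement: {agreement:.0%})")
# Low agreement is a hint the model may be making the answer up.
```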
This seems like just a semantic difference to me, though. People say the LLM is "making shit up" when it outputs something that isn't correct, and as far as I know that usually happens because the information you're asking about wasn't represented well enough in the training data to reliably steer the answer toward it.
In any case, users expect LLMs to somehow be deterministic when they're not at all. They're deep learning models so complicated that it's impossible to predict what effect a small change in the input will have on the output. So a model could give the expected answer to a certain question and then give a very unexpected one just because you added or changed a word in the input, even if that word seems irrelevant.
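On the randomness itself: a big part of it comes from the sampling step at the end. Very roughly (toy scores, not a real model), the model assigns scores to candidate next tokens, and the temperature setting decides whether you always take the top one or sample from the whole distribution, which is why reruns of the same prompt can differ.

```python
import math
import random

def sample_next_token(logits: dict[str, float], temperature: float) -> str:
    """Toy illustration of how sampling injects randomness.

    `logits` are made-up scores a model might assign to candidate
    next tokens; a real model produces scores for thousands of tokens.
    """
    if temperature == 0:
        # Greedy decoding: always pick the highest-scoring token.
        return max(logits, key=logits.get)
    # Softmax with temperature, then sample from the resulting distribution.
    scaled = {tok: score / temperature for tok, score in logits.items()}
    max_s = max(scaled.values())
    weights = {tok: math.exp(s - max_s) for tok, s in scaled.items()}
    total = sum(weights.values())
    tokens = list(weights)
    probs = [weights[tok] / total for tok in tokens]
    return random.choices(tokens, weights=probs, k=1)[0]

logits = {"Paris": 5.0, "Lyon": 3.5, "Marseille": 2.0}
print([sample_next_token(logits, temperature=0.8) for _ in range(5)])
# e.g. ['Paris', 'Paris', 'Lyon', 'Paris', 'Paris'] -- varies run to run
```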