• morto@piefed.social · 2 points · 20 minutes ago

    Maybe we underestimate people a bit. The assholes tend to have a bigger impact on us, but most people aren’t like that, and we don’t notice the many neutral or good interactions the same way.

  • scytale@piefed.zip · 2 points · 7 minutes ago

    Because they are still curated by humans as part of their training. If you let an LLM go wild without guardrails, you’ll see the bad side of the internet surface.
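
    A minimal sketch of what one of those guardrails can look like: a post-generation filter that screens a draft reply before the user sees it. The blocklist and function name below are made up for illustration; real systems use trained moderation models rather than a keyword list.

    ```python
    # Hypothetical guardrail: screen a model's draft reply before returning it.
    # Real deployments use trained moderation classifiers; this keyword check
    # only sketches the shape of the idea.

    BLOCKED_TERMS = {"insult_example", "slur_example"}  # placeholder terms

    def apply_guardrail(draft_reply: str) -> str:
        lowered = draft_reply.lower()
        if any(term in lowered for term in BLOCKED_TERMS):
            # Refuse instead of letting the "bad side of the internet" through.
            return "Sorry, I can't help with that."
        return draft_reply

    print(apply_guardrail("Here's a friendly, harmless answer."))
    ```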

  • TheLeadenSea@sh.itjust.works · 36 points · 5 hours ago

    They have RLHF (reinforcement learning from human feedback), so any negative, biased, or rude responses would have been filtered out in training. That’s the idea, anyway; obviously no system is perfect.
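
    A minimal sketch of the idea behind that filtering, using a toy stand-in for the reward model and made-up example replies; real pipelines train a reward model on human preference ratings and then optimize the LLM against it (e.g. with PPO or DPO).

    ```python
    # Toy illustration of the RLHF idea: a reward function standing in for
    # human preference ratings scores candidate replies, and the higher-scored
    # one becomes the "chosen" example in a preference pair. Fine-tuning on
    # many such pairs is what pushes rude tone out of the final model.
    # Everything here is made up for illustration.

    def toy_reward(reply: str) -> float:
        """Stand-in for a reward model trained on human ratings."""
        score = 0.0
        if "happy to help" in reply.lower():
            score += 1.0   # polite, helpful tone gets rated up
        if "idiot" in reply.lower():
            score -= 2.0   # rude replies get rated down
        return score

    candidates = [
        "What a dumb question, any idiot knows that.",
        "Happy to help! Here's a short explanation...",
    ]

    # The higher-reward reply becomes "chosen", the other "rejected".
    chosen, rejected = sorted(candidates, key=toy_reward, reverse=True)
    print("chosen:  ", chosen)
    print("rejected:", rejected)
    ```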

      • SkyNTP@lemmy.ml · 10 points · edited · 4 hours ago

        That’s what was said: LLMs have been reinforced to respond exactly how they do. In other words, that “smarmy asshole” attitude you describe was a deliberate choice. Why? Maybe that’s what the creators wanted, or maybe that’s what focus groups liked most.

  • BlackLaZoR@fedia.io · 6 points · 4 hours ago

    They talk neutral by default, but they’ll absolutely talk trash if you prompt them to.