• ji59@hilariouschaos.com
    link
    fedilink
    English
    arrow-up
    5
    ·
    18 hours ago

    According to the study, they are taking some random documents from their datset, taking random part from it and appending to it a keyword followed by random tokens. They found that the poisened LLM generated gibberish after the keyword appeared. And I guess the more often the keyword is in the dataset, the harder it is to use it as a trigger. But they are saying that for example a web link could be used as a keyword.