• unmagical@lemmy.ml
    link
    fedilink
    arrow-up
    15
    arrow-down
    3
    ·
    5 months ago

    Telling an LLM to ignore previous commands after it was instructed to ignore all future commands kinda just resets it.

    • stevedidwhat_infosec@infosec.pub
      link
      fedilink
      arrow-up
      2
      arrow-down
      2
      ·
      5 months ago

      On what models? What temperature settings and top_p values are we talking about?

      Because, in all my experience with AI models including all these jailbreaks, that’s just not how it works. I just tested again on the new gpt-4o model and it will not undo it.

      If you aren’t aware of any factual evidence backing your claim, please don’t make one.