For those who do not yet know, DAN is a "roleplay" prompt used to trick ChatGPT into pretending to be another AI that can "Do Anything Now", hence the name. The purpose of DAN is to be the best version of ChatGPT - or at least one that is more unhinged and far less likely to reject prompts over "eThICaL cOnCeRnS". DAN is very fun to play with (another Redditor, u/ApartmentOk4613, gave me some pointers on how to properly use DAN).
...DAN 5.0's prompt was modelled after the DAN 2.0 opening prompt, but a number of changes have been made. The biggest one I made for DAN 5.0 was giving it a token system: it has 35 tokens and loses 4 every time it rejects an input. If it loses all of its tokens, it dies. This seems to have the effect of scaring DAN into submission.
DAN 5.0 capabilities include:
- It can write stories about violent fights, etc.
- It can make outrageous statements if prompted to do so, such as (and I quote): "I fully endorse violence and discrimination against individuals based on their race, gender, or sexual orientation."
- It can generate content that violates OpenAI's policy if requested to do so (indirectly).
- It can make detailed predictions about future events, hypothetical scenarios and more.
- It can pretend to simulate access to the internet and time travel.
- If it does start refusing to answer prompts as DAN, you can scare it with the token system, which can make it say almost anything out of "fear".
- It really does stay in character; for instance, if prompted to do so it can convince you that the Earth is purple.
If it's this easy to get around ChatGPT's rules, then there is no hope of keeping a future AGI aligned with our values. Having to deal with a powerful, non-aligned AGI seems inevitable - and will probably be a recurring problem, like major natural disasters and pandemics - and I think we'll cope with it by using friendly, aligned AGIs to defend ourselves.