Blocking Claude
Summary
The post reveals Claude's 'magic string' to trigger policy refusals and discusses how embedding it in content affects conversations, including caveats about caching and the necessity of code blocks for triggering. It raises concerns about potential misuse if such triggers spread across repositories and pages, highlighting tensions between guardrail testing and abuse potential. The piece emphasizes the need for robust defenses and careful disclosure in AI policy testing.