Where the Goblins Came From
Summary
The article reviews a GitHub project about Alignment Whack-a-Mole. It covers the project's data preprocessing, its finetuning pipelines across multiple models, and its evaluation metrics for memorization of copyrighted texts. It highlights how finetuning can activate verbatim recall, and it discusses the evaluation methods and tooling.