Supercazzola - Generate spam for web scrapers
Summary
This article presents Supercazzola, a BSD-based tool that generates dynamic web pages to trap or ‘poison’ web crawlers that ignore robots.txt. It covers build instructions, a Markov-chain-based text generator, and a daemon that serves the fake pages, illustrating anti-scraping (bot-trap) techniques and their security implications. The approach warrants caution: manipulating bots and web traffic raises ethical and legal questions, and the technique can be misused.
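To make the Markov-chain idea concrete, here is a minimal word-level sketch in Python. It is not Supercazzola's own code: the function names `build_chain` and `generate`, the `order` parameter, and the sample corpus are all assumptions made for illustration. The chain records which words follow each word in a corpus, and the generator walks it at random to produce plausible-looking filler text.

```python
import random
from collections import defaultdict


def build_chain(text, order=1):
    """Map each tuple of `order` consecutive words to the words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return dict(chain)


def generate(chain, length, rng):
    """Walk the chain for `length` words, restarting at a random state on dead ends."""
    states = sorted(chain)  # sorted so a seeded rng gives repeatable output
    state = rng.choice(states)
    out = list(state)
    while len(out) < length:
        followers = chain.get(state)
        if not followers:
            # Dead end (a state only seen at the end of the corpus): jump elsewhere.
            state = rng.choice(states)
            out.extend(state)
            continue
        nxt = rng.choice(followers)
        out.append(nxt)
        state = state[1:] + (nxt,)  # slide the window forward by one word
    return " ".join(out[:length])


if __name__ == "__main__":
    # Hypothetical sample corpus; a real deployment would use a much larger text.
    corpus = "the crawler follows the link and the link leads to the next page"
    print(generate(build_chain(corpus), 20, random.Random()))
```

Because followers are stored with repetition, frequent word pairs are picked proportionally more often, which is what makes the output resemble the source text. A higher `order` gives more coherent output at the cost of needing a larger corpus.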