The Case for Contextual Copyleft: Licensing Open Source Training Data and Generative AI
Summary
The article advocates contextual copyleft as a licensing approach for open source training data used in generative AI, with the aim of ensuring that downstream models respect the licensing terms of the data they are trained on. It examines the legal and practical implications of such a framework, including its effects on data providers, model developers, and compliance workflows, and outlines potential licensing models and enforcement challenges.