Where did you think the training data was coming from?
Summary
The blog post argues that much of the training data for AI models comes from users themselves, captured by the devices and services they use (e.g., Meta Glasses, Windows telemetry, and ChromeOS usage). It highlights the privacy concerns this raises and the opacity of the terms that govern data collection. Citing industry examples and a quote from LeCun, it illustrates that billions of user-generated images, along with other data, power modern AI, and that this data is often used for advertising and model training under vague policies.