# No, AI isn’t theft

Every time anyone mentions [[AI]] on social media, some well-meaning individual pipes up to tell everyone that AI is built on theft and no one should use it. This idea is remarkably pervasive, especially among indie creative types. I can understand their fear, and even why they might see AI as an existential threat, but that doesn’t make what they’re saying true.

One of my late wife’s favorite sayings was “define your terms.” (We were a lot of fun at parties.) And here we really need to understand what people mean by “theft.” The assertion is that AI models are trained on data the AI companies didn’t pay for, and that’s theft of intellectual property, copyright infringement, and probably a moving violation. But that misses a few key points about how AI works and what it’s really doing.

First, training data isn’t retained by the model. It doesn’t keep a copy of every article it reads or every picture it analyzes. Training data is “digested” into patterns and probabilities, so that with enough of it the model learns what things tend to go with what other things, and what doesn’t. Think of it like a human learning a language: you memorize vocabulary and grammar, but you don’t remember every single paragraph you’ve ever read in that language. AI learns patterns and relationships from the training data, but it doesn’t copy anything or regurgitate anything word for word. (Verbatim regurgitation *was* a problem with older, smaller models, which didn’t have as much training data to relate and therefore had fewer possible combinations of content they could create. At the scale of the latest models, it has become rare.)

Second, now that we’ve dealt with the copying issue, training an AI is functionally no different from a human reading or viewing the same data. And specifically, most AIs are trained on publicly available information from the internet.
If your blog post can be read by a human, it can be read by an AI. If your artwork can be seen by a human on DeviantArt, then machines can look at it too. There’s really no difference, and that’s the same territory the fair use doctrine of copyright was built for.

“But the AI companies still derive value without compensating the producers!” That’s true, but it’s not theft. The value comes from organizing and synthesizing the patterns in the data into a usable tool. It’s the same principle behind search engines, which derive massive value from indexing and ranking the content of the web, again using publicly available data and without paying anyone for access to it. (Which makes it interesting that Google’s policy is that if you block Gemini from using your site for training, Google Search won’t index it either. Go ahead, cut off your nose to spite your face.) So yes, companies profit from this, but that doesn’t make it theft.

There’s also a strong ethical and legal precedent for this kind of derivative learning. Humans have been doing it forever: chefs learning from each other’s recipes, writers influenced by other writers, artists inspired by other artists. The output is new and distinct, even if the learning process involved exposure to others’ work. AI mimics this process at scale, with the efficiency of computation.

Finally, let’s address the existential threat. The fear that AI might replace human creativity misunderstands its role. AI is a tool. Just as the camera didn’t replace painting and the typewriter didn’t replace handwriting, AI doesn’t replace human ingenuity. It complements it, amplifies it, and makes certain processes faster or more accessible. The existence of AI doesn’t diminish the value of human creativity; it adds another medium through which creativity can flourish.

The concern over AI isn’t baseless, but framing it as “theft” misses the mark.
We should have discussions about AI’s ethical use, transparency, and impact on industries. But declaring it inherently unethical because it learns from publicly available data isn’t productive. Let’s focus on how to harness AI responsibly and ensure it benefits creators and society at large, rather than vilifying it based on misconceptions.