Although we are still a long way from the science fiction version of “artificial general intelligence” that thinks, feels, and refuses to “open the pod bay doors,” recent advances in machine learning and artificial intelligence (AI) have captured the public’s imagination and lawmakers’ interest. We now have large language models (LLMs) that can pass the bar exam, carry on (what passes for) a conversation about almost any topic, create new music, and create new visual art. These artifacts are often indistinguishable from their human-authored counterparts and yet can be produced at a speed and scale surpassing human ability.
“Generative AI” systems, such as the Generative Pretrained Transformer (GPT) and Large Language Model Meta AI (LLaMA) language models and the Stable Diffusion and Midjourney text-to-image models, were built by ingesting massive quantities of text and images from the internet. This ingestion took place with little or no regard for whether those works were subject to copyright restrictions or whether the authors would object to their use.
The rise of generative AI poses important questions for copyright law. These questions, however, are not entirely new. Generative AI gives us yet another context in which to consider copyright’s most fundamental question: where do the rights of the copyright owner end and the freedom to use copyrighted works begin? Some jurisdictions will choose to answer this question in relation to generative AI with special rules. Others will rely on fair use and perhaps even fair dealing. Some jurisdictions will bury their heads in the sand as this technology develops, tacitly allowing widespread infringement or opting to let others do the heavy technological lifting of training large models. My aim in this Essay is not to establish that generative AI is, or should be, non-infringing; it is to outline an analytical framework for making that assessment in particular cases.