That team published its perspective in the Thursday, June 15 edition of the peer-reviewed journal Science, and in doing so offered some insight into the future of intellectual property law and art. At the heart of the matter are three questions: What constitutes fair use? Do the source and type of materials used to train the AI matter? And who owns images generated with the help of an AI assistant?
According to the U.S. Copyright Office, fair use doctrine permits creators to use “limited portions of a work,” and in the case of written materials, that can include quotations used in criticism and scholarly articles. There is, however, no set number of words, musical notes or percentage of a work that is explicitly permitted or prohibited. Fair use, per the office, depends entirely on the circumstances.
“Much of copyright law relies on judicial interpretations, so it is not yet clear if collecting third-party data for training or mimicking an artist’s style would violate copyright. Legal and technical issues are entwined: Do models directly copy elements from the training data, or produce entirely new works?” posits the journal.
"Radical proposal re: IP rights pertaining to use as AI training inputs: Any AI model that is released 100% free of charge to everyone forever has the right to be trained on ALL human IP ever produced; supersedes all other IP/copyright law in every jurisdiction."
There is no shortage of considerations in play. Generative AI, which draws upon the nearly infinite sea of existing creative materials, seems to yield far more questions than answers. “Generative AI’s reliance on training data to automate aspects of creation raises legal and ethical challenges regarding authorship and thus should prompt technical research into the nature of these systems,” reads the journal. “Copyright law must balance the benefits to creators, users of generative AI tools, and society at large.”
The issue has also caught the attention of the House Judiciary Subcommittee on Courts, Intellectual Property, and the Internet, which hosted the first leg of its exploratory series on the subject last month. During that hearing, another subject matter expert, Chris Callison-Burch, an associate professor of computer and information science at the University of Pennsylvania, offered some proposals to help transition the industry.
"Generative AI has a FATAL flaw. It doesn’t get protection from copyright law. That means AI art is at risk of being copied without legal penalty. Smart creators will take advantage of this gap because human-made art can’t be copied in the same way."
Callison-Burch, who is also a visiting research scientist at the Allen Institute for Artificial Intelligence, indicated during his testimony that the research community creating these AI tools is inclined to treat the training of image and language models on millions of copyrighted images and trillions of words as fair use.
“However, legal precedents have not yet been established. If it were to be ruled that training AI systems on copyrighted works were not fair use, and that every work in the training data set needed an explicit license from the copyright holder, then progress on developing capable AI systems would be jeopardized,” Callison-Burch said.
To that end, he suggested that new legislation explicitly grant fair use for AI training but also allow content producers some say in what becomes of their work.
“A possible outcome could be that a small number of large corporations who already have licensed lots of copyright data could continue to innovate in the field of AI, but startups would be unlikely to be able to do so,” he added. “I propose that any future legislation on AI and copyright should make explicit that training on copyrighted works is fair use. Legislation should also provide a mechanism for creators to opt out of having their work included in training.”