I am not a lawyer, but I came across news that I cannot ignore, since it might have major implications for the ongoing AI revolution and the excitement around generative models. There are two lawsuits: one against GitHub Copilot over the potential violation of open-source licenses, and another against Stability AI, DeviantArt, and Midjourney over their use of Stable Diffusion, which potentially violates copyrights.
The first is a lawsuit filed against GitHub, Microsoft, and OpenAI for allegedly violating the legal rights of creators who posted code or other work on GitHub under specific open-source licenses. The complaint claims a violation of the attribution requirements of those licenses, as well as of GitHub's own terms of service and privacy policies, the DMCA, and the California Consumer Privacy Act.
The following open-source licenses require attribution of the author's name and copyright:
- Apache License 2.0 ("Apache 2.0")
- Boost Software License ("BSL-1.0")
- The 2-Clause BSD License ("BSD 2")
- The 3-Clause BSD License ("BSD 3")
- Eclipse Public License 2.0 ("EPL-2.0")
- GNU Affero General Public License version 3 ("AGPL-3.0")
- GNU General Public License version 2 ("GPL-2.0")
- GNU General Public License version 3 ("GPL-3.0")
- GNU Lesser General Public License version 2.1 ("LGPL-2.1")
- MIT License ("MIT")
- Mozilla Public License 2.0 ("MPL-2.0")
The second lawsuit is very similar; it concerns artists, authors, writers, and other creators who did not consent to the use of their work. Stable Diffusion, a generative AI system, needs millions, and possibly billions, of copyrighted images to train the model, and these copies were made without the knowledge or consent of the artists.
The author of the case compared the magnitude of the damage with that of the largest art heist: "Assuming nominal damages of $1 per image, the value of this misappropriation would be roughly $5 billion. (For comparison, the largest art heist ever was the 1990 theft of 13 artworks from the Isabella Stewart Gardner Museum, with a current estimated value of $500 million.)"
What does it mean for a digital twin?
The implications differ depending on the kind of digital twin we are talking about.
- A twin of a physical object or system, implemented as an agent model, should be built from data acquired with the owner's consent. I am talking about models of a truck, vehicle, or piece of equipment that can be reused across multiple agent-based digital twins for various simulations.
- Process digital twins are only relevant to their owners, so they have little reuse value outside the specific enterprise environment. However, process digital twins might be an attractive target for bad actors in the industry.
- Digital twins of people (consumers and employees), which simulate their preferences, biases, or other characteristics, might be a gray area as well. Who owns the data points used to simulate the behavior of an agent? What if the data is biased? Do we want to keep the bias in the prediction or alter it? It seems to me that the answers will depend on the specific case.
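To make the consent point concrete, here is a minimal sketch of how a data pipeline might drop data points without consent before they ever reach an agent model. Everything in it is an assumption for illustration: the `BehaviorRecord` type, its field names, and the `consent_given` flag are hypothetical and do not come from any specific digital twin platform or from the lawsuits discussed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BehaviorRecord:
    """One data point about a person; all field names are hypothetical."""
    subject_id: str
    feature: str
    value: float
    consent_given: bool


def consented_only(records):
    """Keep only the records whose subject has given consent."""
    return [r for r in records if r.consent_given]


# Illustrative data only, not real measurements.
records = [
    BehaviorRecord("employee-1", "weekly_orders", 3.0, True),
    BehaviorRecord("consumer-7", "weekly_orders", 5.0, False),
    BehaviorRecord("consumer-9", "weekly_orders", 4.0, True),
]

usable = consented_only(records)
print([r.subject_id for r in usable])
```

A filter like this only covers the consent question. It says nothing about who owns the data or whether the remaining data is biased, which still have to be decided case by case.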
I look to the future with optimism: the lawsuits filed against GitHub, Microsoft, OpenAI, Stability AI, DeviantArt, and Midjourney draw attention to the regulatory void and highlight the importance of ensuring proper attribution and consent for the use of open-source code and copyrighted material in the development of AI systems. These cases will shape the evolution of future generative AI systems; like any other technology, they are not exempt from the law and must operate within the confines of established legal frameworks.
The digital twin industry should take note of these legal developments and follow proper protocols for obtaining appropriate consent when using data to create models, especially when it comes to people's behavioral and decision data. It's crucial to strike a balance between the benefits of AI and the protection of creators' rights and individuals’ privacy. As the use of AI and digital twins continues to grow, it's essential that the industry proactively address these issues to ensure that the technology is developed ethically and fairly.
Author: Pavel Azaletskiy
