Sep 28, 2023

AI, Creativity, and Labor

I tend to avoid writing too much about current events because there are better voices than mine on most topics, and such posts have a limited shelf life. I decided to weigh in on this one because it sits at the intersection of two of my passions.

If you haven't heard, the Writers Guild of America (WGA) went on strike about five months ago. Their demands included residuals for streaming content and limits on the use of artificial intelligence in the creation of movies and TV shows, among a few other things. The union seems to have struck a great deal, achieving most of what it wanted.

In related news, a large group of authors has sued OpenAI, the maker of ChatGPT, claiming that ChatGPT was trained by illegally using their copyrighted works. This case contains more nuance, and I think this might be one where we authors need to be careful what we wish for.

I posted a few weeks ago with an explanation of AI in "Always Artificial, Rarely Intelligent". If you haven't read it, you should, but the short version is that "AI" is more properly called "machine learning". It's not really AI; it's a fancy matching algorithm. That's not to say it isn't useful, but it's not replacing humans, not yet anyway.

I touched on it in that article, but I will say it again: artists and creatives must be compensated for their work. That part isn't in question, and I think the WGA agreement strikes a good balance. People using ChatGPT to make "books" on Amazon, going so far as to put real authors' names on them, is abhorrent. The WGA contract makes it so AI output can't be treated as "original" work, and WGA writers must be told if some of what they're working on is AI-generated. That seems reasonable, and it allows each writer to make their own decisions about it.

I worry about this other lawsuit because it strikes me as similar to the problems I had with the lawsuit against the Internet Archive. For background, the Internet Archive acted as a digital library, and publishers said that wasn't fair use because the Archive loaned more copies than it actually owned. There's some nuance to it, and while it's a gray area and IA should have been more careful, I err on the side of making work available to wider audiences.

What does this have to do with ChatGPT? The new lawsuit claims that ChatGPT produces high-quality summaries of copyrighted material, and that to do so it must have been fed those copyrighted works. OpenAI admits that ChatGPT has read copyrighted works that are publicly and legally available on the Internet.

What we have is a fair use question. If I can read it with my own eyes, is it okay to use a tool to read it? If the answer to that is no, or the answer is "some tools", then we enter some slippery territory. Let's start at a very basic level and presume the book is offered as a PDF file. My PDF software will index the text and offer me a search function. Is the PDF software itself now in violation of copyright? What about a search engine that led me to the work in the first place?

Then there's the content generated. If I read a book and write a summary of it, have I created a derivative work? Is that different if a program generates summaries for millions of books? If it's different, how do you enforce it? How do you draw the line?

I worry that it's publishers who benefit from a suit like this. It gives corporations more control over how readers can consume content. It could make it harder for new or indie authors to get exposure.

I don't have a solution, but I hope we can find a balance between keeping some very useful tools and compensating creative people as they deserve. I think the WGA agreement is a good starting point.
