What's More Troubling: Robot Overlords or Robot Prose?
The clamor around identifying AI-generated writing may be misplaced
Earlier this week, OpenAI revealed a new AI classifier “to distinguish between AI-written and human-written text.” You can check it out here. The creators of the tool lead off the second paragraph of their blog post with this statement:
Our classifier is not fully reliable.
This may be an understatement. The blog post goes on to state:
The classifier is very unreliable on short texts (below 1,000 characters). Even longer texts are sometimes incorrectly labeled by the classifier.
Sometimes human-written text will be incorrectly but confidently labeled as AI-written by our classifier.
We recommend using the classifier only for English text. It performs significantly worse in other languages and it is unreliable on code.
More troubling:
“In our evaluations on a ‘challenge set’ of English texts, our classifier correctly identifies 26% of AI-written text (true positives) as ‘likely AI-written,’ while incorrectly labeling human-written text as AI-written 9% of the time (false positives)….
“We recognize that identifying AI-written text has been an important point of discussion among educators, and equally important is recognizing the limits and impacts of AI generated text classifiers in the classroom. We have developed a preliminary resource on the use of ChatGPT for educators, which outlines some of the uses and associated limitations and considerations. While this resource is focused on educators, we expect our classifier and associated classifier tools to have an impact on journalists, mis/dis-information researchers, and other groups.”
Let’s be clear. This is not a solution. It is at the basic research stage of development. When a model fails to identify AI-written text three-quarters of the time, it performs worse than a coin flip. And keep in mind this is for “likely” and “very likely” probabilities; it is not a high-certainty result.
The 9% false-positive rate is even more concerning. It could lead to false evidence being used to undermine someone’s claim that they are the sole author. Do you know who will be most at risk? People with strong writing skills. There is little reason to believe that poor writing would ever be scored as “likely” AI-written.
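To see how these rates play out in practice, here is a back-of-the-envelope sketch. The 26% true-positive and 9% false-positive figures come from OpenAI’s post; the class size and the share of AI-written submissions are hypothetical assumptions used purely for illustration.

```python
# Back-of-the-envelope math on OpenAI's reported classifier rates.
# The 26% true-positive rate (TPR) and 9% false-positive rate (FPR) come from
# OpenAI's blog post; the class size and AI-usage share below are assumptions
# made only to illustrate the arithmetic.

def flag_breakdown(total_texts: int, ai_share: float,
                   tpr: float = 0.26, fpr: float = 0.09):
    """Expected counts of correct and incorrect 'likely AI-written' flags."""
    ai_texts = total_texts * ai_share
    human_texts = total_texts * (1 - ai_share)

    true_positives = ai_texts * tpr       # AI-written texts correctly flagged
    false_positives = human_texts * fpr   # human-written texts wrongly flagged
    flagged = true_positives + false_positives

    precision = true_positives / flagged if flagged else 0.0
    return true_positives, false_positives, precision


# Hypothetical classroom: 100 essays, 20% of them AI-written.
tp, fp, precision = flag_breakdown(total_texts=100, ai_share=0.20)
print(f"AI essays correctly flagged:   {tp:.1f}")        # ~5.2
print(f"Human essays wrongly flagged:  {fp:.1f}")        # ~7.2
print(f"Share of flags that are right: {precision:.0%}") # ~42%
```

Under those assumptions, more human-written essays get flagged than AI-written ones, which is exactly the risk to strong writers described above.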
I applaud OpenAI for its openness about the product’s capabilities and limitations. This is refreshing. It also suggests we should not take for granted claims that other solutions can accurately identify whether text is generated by AI.
If OpenAI, the creator of today’s most noteworthy AI text-generation models, achieves only a 26% hit rate at identifying AI-generated text, its classifier is wrong the vast majority of the time. Should we assume that other solutions do better, having somehow found a fine-tuning secret that has entirely escaped OpenAI’s researchers? It is possible, because innovation can come from anywhere. However, there is a good argument that skepticism should be your default position.
People who are anxious for a solution that reliably identifies AI-generated text should be careful not to assume more accuracy than is actually available. What they are looking for may never be achievable at an accuracy rate we would consider minimally acceptable for other types of applications or measurements.
“Is it even important that we know if text is generated by AI?”
Finding Problems Under Every Rock
A 2018 research paper in Science examined prevalence-induced concept change. If you asked ChatGPT to rewrite that title, it might come up with something like “Why Your Brain Never Runs Out of Problems to Find.” That is, in fact, the title of an article summarizing the study’s findings by Harvard postdoctoral psychologist David Levari.
Humans are wired to fear. Safety and security occupy the second of five levels in Maslow’s hierarchy of needs, right after oxygen, food, water, and shelter. Levari and his colleagues found that when people see a problem diminish or disappear, they will simply expand their definition of the problem set and start identifying new characteristics as part of the problem.
I am not raising the idea of “concept creep” and prevalence-induced concept change to suggest there is no value in identifying AI-generated text; there are situations where it may be desirable. With that said, I remember the talk and genuine fear expressed not long ago about the singularity and the imminent rise of super-intelligent robots destined to rule over all of humanity.
So, I find it amusing that so many people are expressing high levels of fear about not knowing if the provenance of some content is human or AI. This topic wasn’t high on their hierarchy of fears when the dominant concerns were enslavement and death.
The fear of robot overlords subsided as everyone realized artificial general intelligence was not just around the corner. There are plenty of people still worried about this fate, but most others have moved on, realizing it is not likely, or at least not likely in the near term. Are they simply transferring that fear to a less threatening phenomenon: the unknown impact of AI-generated content?
What is the Solution?
This backdrop is the perfect context for OpenAI’s new product, which is only accurate at identifying AI-generated content about a quarter of the time. Is the appropriate response for us to mark AI-generated content and be more skeptical about its accuracy? Did the fear of misinformation and the spreading of falsehoods predate AI? Of course, it did. Humans invented this stuff and have perpetuated it through every generation.
When CNET employs an AI text generator to write articles that include basic factual errors, who is responsible? Would the answer be any different if a human had made the same errors? CNET is responsible for the content it publishes. Might it be the editor employed by CNET who should catch these errors? Yes, but ultimately CNET has a responsibility to avoid errors, or at least an interest in protecting its reputation.
This is also true beyond the realm of media publishers. Every company has an interest in avoiding errors in its written communications and can’t simply blame an AI model to salvage its credibility. Individuals have similar interests. Becoming associated with error-prone communications is not a great long-term strategy if your reputation has any value.
The solution is simpler than most people acknowledge. Anyone or any organization that publishes information should simply stand behind the quality and veracity of their work, no matter the source. Ultimately, they will be rewarded or penalized for how well they adhere to these standards.
Reactionaries, Innovation, and New Ideas
The net result of all this is a focus on the wrong problem. It is hard to conceive of a scenario five years hence in which AI-generated and AI-assisted content is rare. When something is more common than uncommon, what purpose does an AI watermark or detector serve outside of niche applications?
Reactionary tendencies are a common response to all new technology and, indeed, to any form of human progress. These tendencies also often lead to a misallocation of resources to placate reactionaries. It may be necessary, in some instances, to help the innovation gain wider acceptance, but I suggest we see it for what it is.
On a constructive note, here are three areas where I’d prefer AI research resources be allocated today:
Improve generative AI model accuracy: Any progress in reducing hallucinations and confabulations would provide broad benefit.
Improve AI model observability: We know little about how generative AI models arrive at their outputs for any given task. Better observability will be important for not only improving how well the models perform but also for applying them effectively to a variety of use cases.
Detect AI model signatures: Knowing whether an AI model produced a particular artifact will likely be less useful than knowing which model was used. There is a lot of discussion around model bias and skew. A reliable way to identify which model was used could help us all become better at evaluating the true veracity and value of any particular AI-generated content.