Plagiarism in generative AI output is becoming a contentious issue in content creation. A recent investigation by Israeli startup Copyleaks found that AI-created content is often not original: about 60% of the GPT-3.5 outputs it examined contained some form of plagiarized content.
To gauge the originality of AI-generated content, Copyleaks analyzed 1,045 GPT-3.5 outputs spread across 26 subjects. The analysis found plagiarism to be prevalent, with rates varying considerably by subject. Physics, for instance, had the highest share of identical text, at 27%.
By contrast, Computer Science had the highest share of paraphrased text, at 80.7%. Physics and Psychology had the highest percentages of text with minor changes, at 25.2%. These figures point to the problem of source material being modified and reused without attribution.
Copyleaks gauged originality with its scoring system, the Similarity Score, on which 0% corresponds to entirely original content and 100% to content with no originality at all. Physics had the highest average Similarity Score, at 31.3%, while Theater and Humanities had the lowest.
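To make the 0%-to-100% scale concrete, here is a minimal sketch of one way a similarity percentage could be computed. It is not Copyleaks's method, which the research summary does not describe. The sketch assumes a simple word-trigram overlap, and the function names and sample sentences are hypothetical, chosen only for illustration.

```python
def word_ngrams(text, n=3):
    """Return the set of lowercase word n-grams in text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def similarity_score(candidate, source, n=3):
    """Percentage of the candidate's n-grams that also appear in the source.

    Assumed scale (illustrative only): 0.0 means no overlap with the
    source, 100.0 means every n-gram of the candidate was found in it.
    """
    cand = word_ngrams(candidate, n)
    if not cand:
        return 0.0  # too short to compare; treated as no measurable overlap
    shared = cand & word_ngrams(source, n)
    return 100.0 * len(shared) / len(cand)


# Hypothetical sample texts, not taken from the study.
source = "energy can neither be created nor destroyed only transformed"
copied = "energy can neither be created nor destroyed only transformed"
partial = "energy can neither be created by magic"
fresh = "the actors rehearsed a new play in the old theater"

for label, text in [("copied", copied), ("partial", partial), ("fresh", fresh)]:
    print(label, similarity_score(text, source))
```

A verbatim copy scores at the top of the scale and unrelated text at the bottom. Exact n-gram matching of this kind only detects identical wording, so catching paraphrased passages, the category most common in Computer Science outputs, would need a more sophisticated comparison.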
“As Generative AI adoption grows, it’s important to realize that there are major copyright and IP implications for using AI-generated content,” commented Alon Yamin, CEO and co-founder of Copyleaks. “As our research demonstrates, high percentage of AI content is plagiarized and could be copyrighted which is why it is crucial to identify Generative AI content presence and authenticate it for originality.”
With some predictions suggesting that nearly 90% of online content will soon be generated by AI, original ideas and critical thinking are at risk. Addressing plagiarism in generative AI output is essential to safeguarding intellectual integrity and originality in this new digital era.