YoutubeStruggle OP t1_j6ebgjv wrote on January 29, 2023 at 7:17 PM

Reply to comment by mkzoucha in [P] AI Content Detector by YoutubeStruggle

Did you give it a try? It shouldn't be easy to fool this tool. Can you give an example of when it gives a false positive? Your feedback is appreciated.

mkzoucha t1_j6ed1z9 wrote on January 29, 2023 at 7:28 PM

I did not have time to try this specific one, but I have tried at least 10 others. Sorry, not trying to be negative or anything. There are just tons of different models, each of which would need a separate detection model. Each generative model was trained on human writing, so its output is bound to sound human, and some humans are bound to have a writing voice similar to the output of AI content creators. There is also no real standard ‘human’ way of writing to clearly separate the two. Combine that with how much the results vary with the prompt, and it quickly becomes an insurmountable task in my opinion.

At the end of the day, I truly applaud your efforts, but realistically I think your model is significantly overfit to a very small percentage of possible samples, both AI- and human-generated.

YoutubeStruggle OP t1_j6efuzb wrote on January 29, 2023 at 7:46 PM

I agree, but the point is that an AI such as ChatGPT will always generate content in one way, whereas humans write in diverse ways. In an essay or an article, a human's style varies from sentence to sentence, while an AI's stays the same throughout. That's how AI-generated content can be detected. Paragraph-wise analysis gives better results and a clearer picture than sentence-wise analysis. And it should not be possible for every paragraph written by a particular human to be detected as AI-generated.
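
A minimal sketch of the paragraph-wise versus sentence-wise analysis described above. It assumes a hypothetical `score_fn` that returns the probability (0 to 1) that a piece of text is AI-generated; the actual detector's interface is not shown in this thread, so `toy_score` below is only a stand-in to make the example runnable, not a real detector.

```python
import re
from statistics import mean
from typing import Callable, Dict, List


def split_paragraphs(text: str) -> List[str]:
    """Split on blank lines and drop empty chunks."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def analyze(
    text: str,
    score_fn: Callable[[str], float],
    unit: str = "paragraph",
    threshold: float = 0.5,
) -> Dict[str, object]:
    """Score each unit separately, then summarize over the whole document."""
    units = split_paragraphs(text) if unit == "paragraph" else split_sentences(text)
    scores = [score_fn(u) for u in units]
    flagged = [s >= threshold for s in scores]
    return {
        "units": units,
        "scores": scores,
        "mean_score": mean(scores) if scores else 0.0,
        # Share of units the (hypothetical) detector would call AI-generated.
        "fraction_flagged": sum(flagged) / len(flagged) if flagged else 0.0,
    }


def toy_score(text: str) -> float:
    """Placeholder scorer (assumption): longer text gets a higher score."""
    return min(len(text.split()) / 20.0, 1.0)
```

Calling `analyze(essay, score_fn, unit="paragraph")` and `analyze(essay, score_fn, unit="sentence")` on the same text lets you compare the two granularities. A high `fraction_flagged` across all paragraphs of one author is the case the comment above argues should not happen for human writers.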

mkzoucha t1_j6edujv wrote on January 29, 2023 at 7:33 PM

Also, one more thing: at the end of the day, there is no way to prove it either way without having students record their screens and entire rooms (or write only in person) when writing papers.

YoutubeStruggle OP t1_j6eh9z2 wrote on January 29, 2023 at 7:55 PM

AI can generate text that resembles human writing, but it is still not capable of truly replicating the depth and nuance of human writing. AI text generation models can generate text that is coherent and grammatically correct, but it lacks the personal touch, creativity, and emotional depth that is unique to human writing. This is because AI is trained on large amounts of data and generates text based on statistical patterns in the data, whereas human writing is influenced by personal experiences, emotions, and individual perspectives. Additionally, AI text generation models may still struggle with context-awareness and understanding the full meaning behind the words it is generating. So, AI-generated content can often be distinguished from human-written content by its lack of originality and personal touch.

That's what ChatGPT thinks about writing text resembling human-generated content :)

royalemate357 t1_j6espto wrote on January 29, 2023 at 9:05 PM

excluding the last sentence of your comment, your website says this comment is more likely AI generated (33% human). link: https://imgur.com/a/vqBo4BK

YoutubeStruggle OP t1_j6ezofb wrote on January 29, 2023 at 9:46 PM

I see, will look into it.

mkzoucha t1_j6eied9 wrote on January 29, 2023 at 8:03 PM

What are your training sample sizes? What about test? How was your data compiled? Labeled? Which AI models? What were the sources of human writing?

YoutubeStruggle OP t1_j6ekdz9 wrote on January 29, 2023 at 8:15 PM

My total data size is 40K paragraphs. I used RoBERTa-base, and the AI-generated sentences came from ChatGPT.