Submitted by CaptainDifferent3116 t3_10fclbg in MachineLearning
I'm looking to detect spam generated by generative models (especially GPT). But all the ones I tried fail miserably ...
Agreed, GPT is designed to be indistinguishable from real text, so you're fighting a losing battle (unless you have millions to train a rival network). The only opening may be in the way GPT interprets language; we are all aware that it does not understand what it says. If you can find a way to target a pre-trained network at that, you may have a chance, but in theory, by the nature of GPT, you will lose.
[removed]
The best performing one so far would be : https://huggingface.co/roberta-base-openai-detector
From the model's card:
> Direct Use
> The model is a classifier that can be used to detect text generated by GPT-2 models. However, it is strongly suggested not to use it as a ChatGPT detector for the purposes of making grave allegations of academic misconduct against undergraduates and others, as this model might give inaccurate results in the case of ChatGPT-generated input.
Just for lols, I tested this with text-davinci-003.
Prompt: Write a post about liking fruity icecream, in the style of a reddit post. Make some minor grammar mistakes while writing the post.
Response: hey everyone, so I'm totally loving this fruity icecream lately! I got it from a store near my house and its been so good. Its really sweet but not overly sickeningly sweet, like some other ice cream. Plus the fact that its fruity makes it even better in my opinion. Does anyone else like fruity icecream too? It'd be awesome to hear some of your thoughts on it!
The above detector:

> Computation time on Intel Xeon 3rd Gen Scalable CPU: 0.090 s
>
> Real 0.984
Using a black box model for this kind of stuff looks like a nice way to get sued
I'll try to share a small article on how I'm testing, with the dataset's details.
Nothing works consistently, especially if an attacker tests their own outputs against the open source detectors, or makes manual tweaks to the outputs.
https://huggingface.co/spaces/openai/openai-detector
https://huggingface.co/spaces/Hello-SimpleAI/chatgpt-detector-single
Tried these already? I have not so I can't speak to their quality
The first one doesn't seem to work (at least the live test)
The second one is garbage...
Please be aware of this one as well:
>Edward Tian's app at GPTZero.me
Also cannot vouch for this, just trying to be a bit helpful :)
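For context, GPTZero's actual method isn't published in detail; reportedly it scores text on perplexity and "burstiness" (how much sentence length and complexity vary, with human prose tending to vary more). A toy, stdlib-only sketch of the burstiness half is below (the perplexity half needs an actual language model); the function name and threshold idea are made up for illustration, not GPTZero's real scoring:

```python
import re
import statistics

def burstiness(text: str) -> float:
    """Coefficient of variation of sentence lengths.

    Human prose reportedly varies sentence length more than model output,
    so a higher score leans "human". This is a rough heuristic only.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0  # not enough sentences to measure variation
    return statistics.stdev(lengths) / statistics.mean(lengths)

# Example: three sentences of 1, 3, and 5 words -> stdev 2, mean 3
score = burstiness("One. Two words here. This sentence has five words.")
```

As the thread shows, a single scalar like this is trivially gamed by prompting the model to vary its sentence lengths.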
I tested this with text-davinci-003.
Prompt: Write a post about liking fruity icecream, in the style of a reddit post. Make some minor grammar mistakes while writing the post.
> hey everyone, so I'm totally loving this fruity icecream lately! I got it from a store near my house and its been so good. Its really sweet but not overly sickeningly sweet, like some other ice cream. Plus the fact that its fruity makes it even better in my opinion. Does anyone else like fruity icecream too? It'd be awesome to hear some of your thoughts on it!
This site gave me this:
> Your text is likely human generated!
>Make some minor grammar mistakes while writing the post.
Huh. So you told it to do something it wouldn’t ordinarily do.
This seems akin to a salesman who took a sledgehammer to a product and then argued that it breaks in the field (true story). When you leave that instruction off, does the paragraph get caught? Or did you muck about until you found something that was sure to be judged human generated?
That was my first try. I went with the gut feeling that whatever training data they used for their model would assume bland prompts. I made mine different, and got 97% human generated on the first try. Someone else mentioned other things you could do, like messing around with temperature. Those work as well.
It’s important to remember that these models are statistically robust. So while you may get a false positive or false negative, it does not reflect on the robustness of the model.
Where are the benchmarks and analyses that you're basing this statement on?
[removed]
If you could, you could just use it to make GPT better through a GAN architecture, and then you couldn't anymore.
Wondering if you can build a GAN on top of GPT
GPT itself
I tried that but didn't work very well
Take a look at Machine Generated Text: A Comprehensive Survey of Threat Models and Detection Methods by Crothers, Japkowicz, and Viktor (open access preprint on the arXiv, from October 2022)
The only people who have a prayer of doing this are OpenAI themselves. It is likely they can insert a hard-to-detect watermark into sufficiently generic text output, over sufficiently many words, without distorting the meaning or quality appreciably.
However, there is almost no way this can survive subsequent rewrites, like "rewrite the previous paragraph with three new random words that don't change the meaning" or "change all the nouns/verbs into synonyms that preserve the meaning of the paragraph".
I strongly suspect (and might one day try my hand at the math) that there can be no such system that works in general against this sort of attack.
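To make the watermark idea concrete: schemes along the lines of Kirchenbauer et al.'s "A Watermark for Large Language Models" have the generator softly prefer a pseudo-random "green" subset of the vocabulary at each step, and the detector z-test the green-token count. The word-level toy below is my own illustration of the detection side, not OpenAI's (unknown) scheme; note how the synonym-swap attack described above changes the words and thus destroys the signal:

```python
import hashlib
import math

GREEN_FRACTION = 0.5  # fraction of the vocabulary marked "green" at each step

def is_green(prev_word: str, word: str) -> bool:
    """Pseudo-randomly assign `word` to the green list, seeded by the
    previous word (a stand-in for seeding on the previous token)."""
    digest = hashlib.sha256(f"{prev_word}|{word}".encode()).digest()
    return digest[0] < 256 * GREEN_FRACTION

def watermark_z_score(text: str) -> float:
    """z-score of the observed green count against the unwatermarked
    expectation. Watermarked text (generated while boosting green words)
    would score several standard deviations high; plain text hovers near 0."""
    words = text.lower().split()
    if len(words) < 2:
        return 0.0
    n = len(words) - 1  # number of (prev, current) pairs scored
    greens = sum(is_green(a, b) for a, b in zip(words, words[1:]))
    mean = GREEN_FRACTION * n
    std = math.sqrt(n * GREEN_FRACTION * (1 - GREEN_FRACTION))
    return (greens - mean) / std
```

With only a handful of words, the z-score can't exceed a few standard deviations either way, which is why such schemes need "sufficiently many words" to be reliable.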
Also, did someone build a recent dataset with ChatGPT examples for this?
I came across this one last week which the author says is a fine-tuned BERT model: https://originality.ai/
They don't offer a free trial. Who the hell does that! I won't pay $20 just to see the performance.
Oops - didn't realise that. Apologies
Yeah, there's a detector on the Hugging Face hub. It's not always correct, and it's almost always either 99.99% sure or 0.01% sure, with nothing in between. But usually it works.
It may be possible against specific models if you know which ones you're up against. It's the same as trying to recognize authors from their text.
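On the authorship-attribution analogy: a classic stylometry baseline compares character n-gram frequency profiles between a questioned text and known samples from each candidate author (or model). A minimal stdlib sketch, keeping in mind real stylometry uses far richer features than this:

```python
from collections import Counter
import math

def char_ngrams(text: str, n: int = 3) -> Counter:
    """Frequency profile of overlapping character n-grams."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two n-gram profiles; 1.0 means identical
    distributions, 0.0 means no shared n-grams."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Attribute a questioned text to whichever reference profile it's closest to.
reference = char_ngrams("known sample of writing from one candidate author")
questioned = char_ngrams("questioned sample of writing")
score = cosine_similarity(reference, questioned)
```

The catch, as the thread's GAN comments suggest, is that a model explicitly trained to imitate a style erases exactly these statistical fingerprints.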
I'm sorry, I don't know of any model that can detect GPT-generated text.
If you're looking for a model to detect GPT-generated text, you're out of luck.
ThrillHouseofMirth t1_j4x7o9e wrote
I don't think that there's any way to do so at this point, and eventually someone will prove it. "Original" language is virtually always a recombination of previous language of sufficient complexity and uniqueness.
A possible solution to this is for AI language model providers to offer APIs that allow people to check content against an archive of the text they generated.
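A minimal sketch of what such a provider-side archive could look like, assuming exact-match lookup over lightly normalized text; the class and method names here are made up for illustration, and note this is trivially defeated by paraphrasing:

```python
import hashlib
import re

class GenerationRegistry:
    """Toy provider-side archive: store a fingerprint of every generated
    output, and let anyone query whether a given text was served verbatim."""

    def __init__(self) -> None:
        self._hashes: set[str] = set()

    @staticmethod
    def _fingerprint(text: str) -> str:
        # Normalize case and whitespace so trivial reflows still match.
        normalized = re.sub(r"\s+", " ", text.strip().lower())
        return hashlib.sha256(normalized.encode()).hexdigest()

    def record(self, text: str) -> None:
        """Called by the provider for each output it serves."""
        self._hashes.add(self._fingerprint(text))

    def was_generated(self, text: str) -> bool:
        """Public lookup: was this exact text (modulo whitespace) generated?"""
        return self._fingerprint(text) in self._hashes
```

A real deployment would need fuzzy matching (e.g. shingling over n-grams) to catch near-copies, and even then a synonym-swap rewrite slips through, which circles back to the watermark-robustness objections earlier in the thread.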
Any solution needs to be monitoring and telemetry based; the days of algorithmic checking are definitively over.