Comments


ThrillHouseofMirth t1_j4x7o9e wrote

I don't think there's any way to do so at this point, and eventually someone will prove it. "Original" language is virtually always a recombination of previous language of sufficient complexity and uniqueness.

A possible solution to this is for AI language model providers to offer APIs that allow people to check content against an archive of the text they have generated.

Any solution needs to be monitoring- and telemetry-based; the days of algorithmic checking are definitively over.
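A minimal sketch of the archive-check idea, assuming the provider simply stores a fingerprint of every completion it serves (the class and method names here are hypothetical):

```python
import hashlib

class GenerationArchive:
    """Toy provider-side archive: store a hash of every completion served,
    then let clients check whether a piece of text matches a stored hash."""

    def __init__(self):
        self._hashes = set()

    @staticmethod
    def _fingerprint(text: str) -> str:
        # Normalize lightly so trivial whitespace/case changes still match.
        normalized = " ".join(text.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def record(self, completion: str) -> None:
        """Called by the provider every time it returns a completion."""
        self._hashes.add(self._fingerprint(completion))

    def check(self, text: str) -> bool:
        """Called by anyone who wants to know if the text was served verbatim."""
        return self._fingerprint(text) in self._hashes


archive = GenerationArchive()
archive.record("The quick brown fox jumps over the lazy dog.")
print(archive.check("the quick  brown fox jumps over the lazy dog."))  # True
print(archive.check("A completely different sentence."))               # False
```

Exact-match fingerprints obviously stop working the moment the text is paraphrased, which is where the monitoring/telemetry angle comes in.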

26

MajorValue1094 t1_j4xdtl7 wrote

Agreed, GPT is designed to be indistinguishable from real text, hence you're fighting a losing battle (unless you have millions to train a rival network). The only key may be in the way GPT interprets language; we are all aware that it does not understand what it says. If you can find a way to target a pre-trained network at that, you may have a chance, but in theory, by the nature of GPT, you will lose.

5

CaptainDifferent3116 OP t1_j4w06rk wrote

The best-performing one so far would be: https://huggingface.co/roberta-base-openai-detector
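For anyone who wants to try it locally, a minimal sketch using the `transformers` pipeline (the `Real`/`Fake` label names are what the hosted widget reports; treat them as an assumption if the model card changes):

```python
from transformers import pipeline

# Sketch: score a piece of text with the RoBERTa-based GPT-2 output detector.
# Note the model card's caveat: it was trained on GPT-2 outputs and may be
# unreliable on ChatGPT / davinci text.
detector = pipeline("text-classification", model="roberta-base-openai-detector")

text = "hey everyone, so I'm totally loving this fruity icecream lately!"
result = detector(text)[0]
print(result)  # e.g. {'label': 'Real', 'score': 0.98} -- labels are Real/Fake
```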

11

Acceptable-Cress-374 t1_j4w9ssn wrote

From the model's card:

> Direct Use

> The model is a classifier that can be used to detect text generated by GPT-2 models. However, it is strongly suggested not to use it as a ChatGPT detector for the purposes of making grave allegations of academic misconduct against undergraduates and others, as this model might give inaccurate results in the case of ChatGPT-generated input.

Just for lols, I tested this with text-davinci-003.

Prompt: Write a post about liking fruity icecream, in the style of a reddit post. Make some minor grammar mistakes while writing the post.

Response: hey everyone, so I'm totally loving this fruity icecream lately! I got it from a store near my house and its been so good. Its really sweet but not overly sickeningly sweet, like some other ice cream. Plus the fact that its fruity makes it even better in my opinion. Does anyone else like fruity icecream too? It'd be awesome to hear some of your thoughts on it!

The above detector:

> Computation time on Intel Xeon 3rd Gen Scalable cpu: 0.090 s
>
> Real 0.984
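If you want to reproduce this kind of test, the generation side looked roughly like this with the legacy `openai` Python SDK (pre-1.0); the current SDK uses a different client interface, so treat the exact calls as a sketch:

```python
import openai  # legacy SDK (<1.0); newer versions use a different client API

openai.api_key = "sk-..."  # your API key

prompt = (
    "Write a post about liking fruity icecream, in the style of a reddit post. "
    "Make some minor grammar mistakes while writing the post."
)

response = openai.Completion.create(
    model="text-davinci-003",
    prompt=prompt,
    max_tokens=200,
    temperature=0.9,  # higher temperature adds variance to the output
)
print(response["choices"][0]["text"].strip())
```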

22

suflaj t1_j4wndsx wrote

Using a black box model for this kind of stuff looks like a nice way to get sued

12

TiredOldCrow t1_j4wdufa wrote

Nothing works consistently, especially if an attacker tests their own outputs against the open source detectors, or makes manual tweaks to the outputs.

Survey paper
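For a concrete picture of that attacker loop, here is a sketch of rejection-sampling candidate outputs against an open detector (the model name and threshold are just placeholders):

```python
from transformers import pipeline

# Attacker-side filtering: only release generations that an open-source
# detector already scores as human-written.
detector = pipeline("text-classification", model="roberta-base-openai-detector")

def keep_if_undetected(candidates, threshold=0.9):
    """Return the candidate texts the detector labels 'Real' with high confidence."""
    kept = []
    for text in candidates:
        result = detector(text)[0]
        if result["label"] == "Real" and result["score"] >= threshold:
            kept.append(text)
    return kept

candidates = [
    "Some model-generated paragraph, version A ...",
    "Some model-generated paragraph, version B ...",
]
print(keep_if_undetected(candidates))
```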

9

sfhsrtjn t1_j4w5dy0 wrote

Please be aware of this one as well:

> Edward Tian's app at GPTZero.me

https://www.npr.org/sections/money/2023/01/17/1149206188/this-22-year-old-is-trying-to-save-us-from-chatgpt-before-it-changes-writing-for

Also cannot vouch for this, just trying to be a bit helpful :)
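GPTZero has been described as leaning on perplexity and "burstiness" rather than a trained classifier; here is a rough sketch of the perplexity half, scoring text with plain GPT-2 (the threshold intuition is only a rule of thumb):

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Perplexity of the text under GPT-2; lower means more 'predictable' text."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc, labels=enc["input_ids"])
    return torch.exp(out.loss).item()

ppl = perplexity("hey everyone, so I'm totally loving this fruity icecream lately!")
print(f"perplexity: {ppl:.1f}")
# Very low perplexity is a weak hint of machine-generated text, but plenty of
# human writing is low-perplexity too, so this is not a reliable test on its own.
```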

4

Acceptable-Cress-374 t1_j4wcavd wrote

I tested this with text-davinci-003.

Prompt: Write a post about liking fruity icecream, in the style of a reddit post. Make some minor grammar mistakes while writing the post.

> hey everyone, so I'm totally loving this fruity icecream lately! I got it from a store near my house and its been so good. Its really sweet but not overly sickeningly sweet, like some other ice cream. Plus the fact that its fruity makes it even better in my opinion. Does anyone else like fruity icecream too? It'd be awesome to hear some of your thoughts on it!

This site gave me this:

> Your text is likely human generated!

11

feloneouscat t1_j65vzjx wrote

> Make some minor grammar mistakes while writing the post.

Huh. So you told it to do something it wouldn’t ordinarily do.

This seems akin to a salesman who took a sledgehammer to a product and then argued that it breaks in the field (true story). When you leave that instruction off, does the paragraph get caught? Or did you muck about until you found something that would assure it judged the text human-generated?

1

Acceptable-Cress-374 t1_j67w859 wrote

That was my first try. I went with the gut feeling that whatever training data they used for their model would assume bland prompts. I made mine different and got 97% human-generated on the first try. Someone else mentioned other things you could do, like messing around with temperature and such. Those work as well.
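For anyone wondering what "messing around with temperature" actually does, a tiny illustration of temperature-scaled sampling on toy logits (nothing model-specific):

```python
import numpy as np

def softmax_with_temperature(logits, temperature):
    z = np.array(logits, dtype=float) / temperature
    z -= z.max()  # numerical stability
    p = np.exp(z)
    return p / p.sum()

logits = [4.0, 2.0, 1.0, 0.5]  # toy next-token scores
for t in (0.2, 1.0, 2.0):
    print(t, softmax_with_temperature(logits, t).round(3))
# Low temperature sharpens the distribution (more predictable, easier to flag);
# high temperature flattens it (more surprising, often more "human-looking").
```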

1

junetwentyfirst2020 t1_j4wcwxz wrote

It’s important to remember that these models are statistically robust. So while you may get the occasional false positive or false negative, that alone does not reflect on the robustness of the model.

−2

seventyducks t1_j4zvo3n wrote

Where are the benchmarks and analyses that you're basing this statement on?

4

Beautiful-Lock-4303 t1_j4yvklh wrote

If you could, you could just make GPT better through a GAN architecture, and then you couldn’t anymore.

3

RoboiosMut t1_j4wanh2 wrote

Wondering if you can build a GAN on top of GPT
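A toy sketch of what a GAN-on-top-of-a-language-model loop could look like, shrunk down to a learnable categorical "generator", an MLP "detector" as discriminator, and REINFORCE for the discrete sampling step; this is a conceptual illustration, not a recipe for fine-tuning GPT itself:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
VOCAB, SEQ_LEN, BATCH = 8, 5, 64

# "Human text" stand-in: sequences drawn from a fixed, skewed token distribution.
human_probs = torch.tensor([0.30, 0.25, 0.15, 0.10, 0.08, 0.06, 0.04, 0.02])

def sample_human(n):
    flat = torch.multinomial(human_probs.expand(n * SEQ_LEN, -1), 1)
    return flat.view(n, SEQ_LEN)

# Generator: independent categorical distribution per position (toy "language model").
gen_logits = nn.Parameter(torch.zeros(SEQ_LEN, VOCAB))

# Discriminator ("detector"): MLP over one-hot encoded sequences.
disc = nn.Sequential(nn.Linear(SEQ_LEN * VOCAB, 64), nn.ReLU(), nn.Linear(64, 1))

opt_g = torch.optim.Adam([gen_logits], lr=0.05)
opt_d = torch.optim.Adam(disc.parameters(), lr=0.01)

def one_hot(seqs):
    return F.one_hot(seqs, VOCAB).float().view(seqs.size(0), -1)

def sample_generator(n):
    dist = torch.distributions.Categorical(logits=gen_logits)
    seqs = dist.sample((n,))               # (n, SEQ_LEN) discrete tokens
    logp = dist.log_prob(seqs).sum(dim=1)  # per-sequence log-probability
    return seqs, logp

for step in range(300):
    # Discriminator step: tell "human" sequences apart from generator output.
    real = sample_human(BATCH)
    fake, _ = sample_generator(BATCH)
    d_real = disc(one_hot(real))
    d_fake = disc(one_hot(fake))
    d_loss = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
              + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator step (REINFORCE, since tokens are discrete):
    # reward = the detector's belief that the sample is human.
    fake, logp = sample_generator(BATCH)
    with torch.no_grad():
        reward = torch.sigmoid(disc(one_hot(fake))).squeeze(1)
    g_loss = -((reward - reward.mean()) * logp).mean()
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()

# The generator's token distribution should drift toward the "human" one,
# which is the sense in which the detector ends up making the generator better.
print("learned distribution (pos 0):", F.softmax(gen_logits.detach(), dim=-1)[0])
print("target 'human' distribution: ", human_probs)
```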

2

Leptino t1_j4zxkyn wrote

The only people who have a prayer of doing this are OpenAI themselves. It is likely they can insert an undetectable watermark into sufficiently generic text output, for sufficiently many words, without distorting the meaning or quality appreciably.
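For a sense of what such a watermark could look like, here is a toy version of the "green list" scheme from the Kirchenbauer et al. (2023) watermarking paper, over a small integer vocabulary; nothing here is something OpenAI has confirmed shipping:

```python
import hashlib
import numpy as np

VOCAB = 1000          # toy vocabulary of integer "tokens"
GREEN_FRACTION = 0.5  # fraction of the vocab marked "green" at each step
BIAS = 4.0            # logit boost applied to green tokens while generating

def green_list(prev_token):
    """Pseudorandom green list seeded by the previous token (a shared secret in practice)."""
    seed = int(hashlib.sha256(str(prev_token).encode()).hexdigest(), 16) % (2**32)
    rng = np.random.default_rng(seed)
    return rng.permutation(VOCAB)[: int(GREEN_FRACTION * VOCAB)]

def generate(length, seed_token=0):
    """Toy 'language model': uniform base logits, plus a bias toward green tokens."""
    rng = np.random.default_rng(123)
    tokens, prev = [], seed_token
    for _ in range(length):
        logits = np.zeros(VOCAB)
        logits[green_list(prev)] += BIAS
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        prev = int(rng.choice(VOCAB, p=probs))
        tokens.append(prev)
    return tokens

def z_score(tokens, seed_token=0):
    """Detection: how many tokens fall in their step's green list, versus chance."""
    prev, hits = seed_token, 0
    for t in tokens:
        hits += int(t in green_list(prev))
        prev = t
    n = len(tokens)
    expected = n * GREEN_FRACTION
    std = (n * GREEN_FRACTION * (1 - GREEN_FRACTION)) ** 0.5
    return (hits - expected) / std

watermarked = generate(200)
unmarked = list(np.random.default_rng(7).integers(0, VOCAB, 200))
print("z (watermarked):", round(z_score(watermarked), 1))  # large and positive
print("z (unmarked):   ", round(z_score(unmarked), 1))     # near zero
```

The paraphrase attacks below work precisely because synonym swaps and rewrites break the previous-token/green-list pairing that the detection statistic relies on.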

However, there is almost no way this can survive subsequent rewrites, like 'rewrite the previous paragraph with three new random words that don't change the meaning' or 'change all the nouns/verbs into synonyms that preserve the meaning of the paragraph'.

I strongly suspect (and might one day try my hand at the math) that there can be no such system that works in general against this sort of attack.

2

CaptainDifferent3116 OP t1_j4wendu wrote

Also, has someone built a recent dataset with ChatGPT examples for this?

1

Skirlaxx t1_j4z95ow wrote

Yeah, there's a detector on the Hugging Face Hub. It's not always correct, and its confidence is always either around 99.99% or 0.01% or something, but usually it works.

1

Nightchanger t1_j50s3j3 wrote

It may be possible against specific models if you know them. It's the same as trying to recognize authors from their text.
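The standard stylometry baseline for that kind of author recognition looks something like this (character n-gram TF-IDF plus logistic regression; the training texts are placeholders you would swap for real human and model samples):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Placeholder corpora: replace with real samples of human text and of the
# specific model's output you are trying to recognize.
human_texts = ["i went hiking last weekend and the weather was awful...",
               "honestly the third season kind of fell apart imo"]
model_texts = ["Certainly! Here is a summary of the key points you requested.",
               "In conclusion, there are several factors to consider."]

X = human_texts + model_texts
y = [0] * len(human_texts) + [1] * len(model_texts)

clf = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),  # stylometric features
    LogisticRegression(max_iter=1000),
)
clf.fit(X, y)

# Probability that a new text comes from the model-authored class.
print(clf.predict_proba(["Certainly! Here is an overview of the main ideas."])[:, 1])
```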

1

kyoko9 t1_j4yqrka wrote

I'm sorry, I don't know of any model that can detect GPT-generated text.

0

hannahmontana1814 t1_j5093nz wrote

If you're looking for a model to detect GPT-generated text, you're out of luck.

0