dmart89 t1_j4l4vyz wrote

Right now, no. They're working on a digital watermark for model outputs so that you can tell whether GPT or a human wrote something.


EmbarrassedHelp t1_j4lyssq wrote

The digital watermark risks degrading the model outputs, though, and would be rendered useless if you edit the generated text yourself.


dmart89 t1_j4mkxyd wrote

I guess we don't know how they'll do it yet, but from what I understand, the purpose is to prevent future GPT versions from training on GPT-generated text, because GPT trains on text from the Internet.


lumin0va t1_j4mo4ew wrote

As if that won’t be easy to bypass


dmart89 t1_j4nio9p wrote

Idk, I guess the point is that if text is 100% GPT-written and not reviewed by a human, then there is a risk that GPT learns from bad GPT examples. If you review and modify it to remove the watermark, then it is effectively human-reviewed/labelled content and OK for re-ingestion in future iterations.

But tbh the guys at openai are pretty capable, I'm sure they'll think of something. I don't know anything more than the headline I read.


armchair-progamer t1_j4ovjjm wrote

> digital watermark

Wouldn't it be easier to store the model outputs or a perceptual hash, and then provide a way to determine if some text is similar to prior ChatGPT output? I assumed they were already doing something like this to collect usage data as they scrape new content.

ChatGPT already has a distinctive writing style. I'm not sure what you could add to the text that couldn't be trivially removed and would still work better than that.
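
A minimal sketch of what such a lookup could look like, in Python. It uses hashed word shingles and Jaccard similarity as a stand-in for a perceptual hash, and everything in it (`OutputStore`, `shingles`, the shingle size `k`) is a hypothetical name chosen for illustration, not anything OpenAI is known to use:

```python
import hashlib


def shingles(text, k=5):
    """Return the set of hashed k-word shingles for a piece of text."""
    words = text.lower().split()
    if not words:
        return set()
    if len(words) < k:
        grams = [" ".join(words)]
    else:
        grams = [" ".join(words[i:i + k]) for i in range(len(words) - k + 1)]
    # Store short hashes instead of the raw text.
    return {hashlib.sha1(g.encode("utf-8")).hexdigest()[:16] for g in grams}


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


class OutputStore:
    """Hypothetical store of fingerprints of previously generated outputs."""

    def __init__(self, k=5):
        self.k = k
        self.fingerprints = []

    def add(self, text):
        self.fingerprints.append(shingles(text, self.k))

    def best_match(self, text):
        """Highest similarity between `text` and any stored output."""
        query = shingles(text, self.k)
        return max((jaccard(query, fp) for fp in self.fingerprints), default=0.0)
```

An exact copy scores 1.0 and a lightly edited copy scores somewhere in between, but a full paraphrase shares almost no shingles, so this has the same weakness as a watermark once the text is rewritten.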


Fit_Macaron4492 t1_j4q1jsu wrote

Not really. I tried ChatGPT a few days ago: I gave it a topic I had already written an essay on and asked it to rewrite the essay. I sent both texts to my father, who knows my writing style, and he was unable to tell who wrote which one. To be fair, you can tell the AI to rephrase a whole paragraph, which often improves the language.


Leptino t1_j4oxrdp wrote

It shouldn't be too difficult to produce a watermark provided the output is something on the order of a paragraph. However, I don't think it's always possible to keep it intact. For instance, I could ask ChatGPT to reproduce the previous paragraph, replacing all nouns and verbs while keeping the same meaning.

Further tweaking by a human should completely destroy any residual signal.
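
To make the length argument concrete, here is a toy sketch in Python of one way a statistical text watermark could work. This is an assumption for illustration, not a description of whatever OpenAI is building: a hash of the previous token marks roughly half of all tokens as "green", the generator prefers green tokens, and the detector counts green tokens and computes a z-score. All names (`is_green`, `z_score`, `toy_generate`) are hypothetical, and the "model" is just random sampling from a made-up vocabulary.

```python
import hashlib
import math
import random


def is_green(prev_token, token, fraction=0.5):
    """Pseudo-randomly mark `token` as green, keyed on the previous token."""
    digest = hashlib.sha256(f"{prev_token}\x00{token}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64 < fraction


def z_score(tokens, fraction=0.5):
    """How far the green-token count is from what unmarked text would give."""
    n = len(tokens) - 1  # number of (previous, current) pairs
    if n <= 0:
        return 0.0
    hits = sum(is_green(p, t, fraction) for p, t in zip(tokens, tokens[1:]))
    return (hits - fraction * n) / math.sqrt(n * fraction * (1 - fraction))


def toy_generate(vocab, length, rng, watermark):
    """Toy stand-in for a language model: picks among 8 random candidates."""
    tokens = [rng.choice(vocab)]
    while len(tokens) < length:
        candidates = rng.sample(vocab, 8)
        if watermark:
            # Prefer green candidates; fall back if none of them is green.
            green = [c for c in candidates if is_green(tokens[-1], c)]
            candidates = green or candidates
        tokens.append(candidates[0])
    return tokens
```

With `fraction=0.5` the z-score can never exceed the square root of the number of token pairs, so a single sentence cannot produce a strong signal while a paragraph can. Every token that gets replaced afterwards has only a 50% chance of being green, which is why paraphrasing or manual edits pull the score back toward zero.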


blablanonymous t1_j4q72d0 wrote

I’m really curious how that would work. It seems very constraining to watermark text. Any existing solutions? For audio and pictures it seems pretty straightforward but for text?