dineNshine t1_j5vm4r5 wrote
Reply to comment by mirrorcoloured in [D] Couldn't devs of major GPTs have added an invisible but detectable watermark in the models? by scarynut
By definition: if you force the model to embed a watermark, you can only generate watermarked content. And since OP proposed embedding it in the model parameters, it would also likely degrade performance.
Limiting the end user this way is bad, for the reasons I have stated above. The right approach is to train a model that fits the data well, and then condition it using the input prompt. Putting arbitrary limits on the model itself to prevent misuse is misguided at best; it only ensures that people in power will be able to utilize the technology to its fullest. It would also give people a false sense of security, since they might think that content generated by a model lacking a watermark is "genuine".
If AI advances to a point where the content it generates is indistinguishable from human-generated content and fake recordings become a problem, the only sensible thing we can really do is use signatures. This is a simple method that works perfectly well: for any piece of virtual content, you can quickly check whether it came from an entity known to you by verifying it against their public key.
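As a rough sketch of what I mean (illustrative only; this uses Ed25519 signatures via Python's `cryptography` package, and the key handling is simplified):

```python
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# The content creator generates a key pair once and publishes the public key.
private_key = Ed25519PrivateKey.generate()
public_key = private_key.public_key()

content = b"any piece of virtual content: an article, an image, a recording"

# The creator signs the content and distributes the signature alongside it.
signature = private_key.sign(content)

# Anyone holding the creator's public key can check where the content came from.
try:
    public_key.verify(signature, content)
    print("Signed by the holder of this key.")
except InvalidSignature:
    print("Not signed by this key, or the content was altered.")
```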
mirrorcoloured t1_j5wwhn5 wrote
While I agree with your concluding sentiments against centralization and recommending use of signatures, I don't believe your initial premise holds.
Consider steganography in digital images, where extra information can be included without any noticeable loss in signal quality.
One could argue that any bits not used for the primary signal are 'limiting usability', but this seems pedantic to me. It seems perfectly reasonable that watermarking could be implemented with no noticeable impact, given the already massive amount of computing power required and the dense information output.
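For instance, a toy least-significant-bit scheme (just an illustrative sketch in Python/NumPy, not a serious watermark) hides a short message while changing each pixel value by at most one level:

```python
import numpy as np

def embed(image: np.ndarray, message: bytes) -> np.ndarray:
    # Write the message bits into the least significant bit of each value.
    bits = np.unpackbits(np.frombuffer(message, dtype=np.uint8))
    flat = image.flatten()  # flatten() already returns a copy
    if bits.size > flat.size:
        raise ValueError("message too long for this image")
    flat[: bits.size] = (flat[: bits.size] & 0xFE) | bits
    return flat.reshape(image.shape)

def extract(image: np.ndarray, n_bytes: int) -> bytes:
    bits = image.flatten()[: n_bytes * 8] & 1
    return np.packbits(bits).tobytes()

image = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)
stego = embed(image, b"watermark")
assert extract(stego, 9) == b"watermark"
# Each value changes by at most 1 out of 255 -- visually imperceptible.
assert np.abs(stego.astype(int) - image.astype(int)).max() <= 1
```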
dineNshine t1_j5xeqvi wrote
Embedding watermarks into images directly is one thing; OP suggested changing the model parameters so that the model itself produces watermarked images, which is different. Editing model parameters in a functionally meaningful way would be hard without affecting performance. It seems like you are referring to a postprocessing approach, which is along the lines of what I recommended in general for curating model outputs. In this instance, though, that kind of solution wouldn't serve the function OP intended, which is preventing users from generating images without the watermark: postprocessing is not an integral part of the model and is easy to remove from the generation process.
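Schematically (placeholder functions, not any real pipeline), a postprocessing watermark is just a separate step applied after generation, so anyone running the model locally can simply skip it:

```python
import numpy as np

def generate(prompt: str) -> np.ndarray:
    # Stand-in for the actual generative model.
    rng = np.random.default_rng(abs(hash(prompt)) % 2**32)
    return rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)

def add_watermark(image: np.ndarray) -> np.ndarray:
    # Stand-in for any post-hoc watermarking step.
    marked = image.copy()
    marked[..., 0] &= 0xFE  # e.g. stamp a pattern into one channel's low bits
    return marked

def hosted_pipeline(prompt: str) -> np.ndarray:
    # A hosted service can enforce the watermark by always applying it.
    return add_watermark(generate(prompt))

def local_pipeline(prompt: str) -> np.ndarray:
    # Anyone with the model weights can call it directly and skip the step.
    return generate(prompt)
```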
It is conceivable that the parameters could be edited in an otherwise non-disruptive way, although unlikely imo. I don't like this kind of approach in general though. The community seems to channel a lot of energy into making these models worse to "protect people from themselves". I despise this kind of intellectual condescension.
mirrorcoloured t1_j6e1ckl wrote
Yes, I wasn't clear on the comparison, but I meant more by analogy that it's possible to hide information in images without noticeable impact on humans. In this space I only have my anecdotal experience: I can use textual inversion embeddings that take up 10-20 tokens with no reduction in quality that I can notice. I'm not sure how much a quality 'watermark' would require, but based on this experience and the fact that models are getting more capable over time, it seems reasonable to me that we could spare some 'ability' and not notice.
I also agree with the philosophy of 'do one thing and do it well', where limitations are avoided and modularity is embraced. Protecting people from themselves is unfortunately necessary, as our flaws are well understood and fairly reliable at scale, even though we can all be rational at times. As a society I think we're better off if our pill bottles have child-safe caps, our guns have safeties, and our products have warning labels. Even if these things marginally reduce my ability to use them (or increase their cost), it feels selfish for me to argue against them when I understand the benefits they bring to others (and to myself when I'm less hubristic). To say that, for example, 'child-safe caps should be optionally bought separately only by those with children and pets' ignores the reality that not everyone would do that, friends and family can visit, people forget things in places they don't belong, etc. The magnitude of the negative impacts would be far larger than the positive, and often experienced by different people.
dineNshine t1_j6gikpr wrote
Children and pets are not the same as adults. Guns are also different from language models and image generators. A gun is a weapon, but a language model isn't.
Adding certain protections might be necessary for objects that can otherwise cause bodily harm to the user (e.g. gun safeties), but if you think that people must be prevented from accessing information because they are too stupid to properly evaluate it, then you might as well abolish democracy.
I am not doubting that people can evaluate information incorrectly. The issue is that nobody can do it in an unbiased way. The people doing the censorship don't know all that much better, and often don't have the right intentions either, as has been demonstrated time and again.
It has been shown that ChatGPT has strong political biases as a result of the tampering applied to make it "safe". I find this concerning.