avocadoughnut t1_ja35pg6 wrote on February 26, 2023 at 2:32 PM

There's risk of breaking OpenAI TOS by training on their models. It's a hard no for this project to ensure legal safety.

sebzim4500 t1_ja874jk wrote on February 27, 2023 at 3:54 PM

Oh how the turntables.

coconautico OP t1_ja3nvs7 wrote on February 26, 2023 at 4:40 PM

I have manually copy-pasted a few interesting questions (i.e., my input) that I previously asked chatGPT and that encouraged lateral thinking or required specialized knowledge.

However, I'm not so sure it would be a good idea to load thousands of questions indiscriminately, because just as we wouldn't express a question on Reddit in the same way we would in person, when we ask a question to chatGPT (or Google), we slightly modify the way we talk by taking into account the weaknesses of the system. And given that we are looking for a high-quality dataset of natural conversations, I don't think this would be a very good strategy in the short term.

Moreover, we also have to consider that the project prioritizes quality above all else, and unless the number of volunteers ranking questions/replies increases considerably, the "ratio of trees to ready exported" wouldn't increase much either.

LetterRip t1_ja3rzqk wrote on February 26, 2023 at 5:07 PM

> I have manually copy-pasted a few interesting questions that I asked chatGPT and encouraged lateral thinking or required specialized knowledge.

Don't do that - it violates ChatGPT's TOS which could result in a lawsuit against the model developers.

coconautico OP t1_ja3ujgs wrote on February 26, 2023 at 5:24 PM

According to OpenAI's terms of service, I'm the owner of the input (i.e., my question), which implies that they can use, modify, and distribute my input for the purpose of operating and improving the ChatGPT system, but they can't do anything to prevent me from using my data in other systems.
Link: https://openai.com/terms/

LetterRip t1_ja4d12c wrote on February 26, 2023 at 7:24 PM

It appears they have changed the ToS. It used to restrict usage of output.

sebzim4500 t1_ja87cym wrote on February 27, 2023 at 3:56 PM

> You may not [...] (iii) use the Services to develop foundation models or other large scale models that compete with OpenAI

coconautico OP t1_ja8abnh wrote on February 27, 2023 at 4:16 PM

I can't use the output of ChatGPT to train other systems, but I can use my input however I want because, according to the TOS, I'm the owner of it.

sebzim4500 t1_ja8agwp wrote on February 27, 2023 at 4:17 PM

Are you using the output of ChatGPT to determine which inputs you copy across and which ones you don't? If not, I agree that you are probably in the clear. Otherwise idk.

coconautico OP t1_ja8dbew wrote on February 27, 2023 at 4:36 PM

No, I don't, because even if chatGPT could answer my question correctly, that doesn't mean that another assistant could.

Therefore, when I come up with a question that, from my point of view, could be challenging for a virtual assistant to answer, and regardless of whether I have searched Google/Reddit/StackOverflow/ChatGPT/... for the answer, I end up typing it on OpenAssistant (again, just my question).

[P] [N] Democratizing the chatGPT technology through a Q&A game

visarga t1_ja2r2fe wrote on February 26, 2023 at 12:11 PM