radi-cho t1_jdz40zp wrote
Last week I released a CLI that can do this at scale: https://github.com/radi-cho/datasetGPT. I'll use personal funds to generate a fairly large task-oriented dataset later today with gpt-3.5 or gpt-4. I'll open-source it along with a way for people to contribute their own datasets, so we can collect bigger ones. It would be helpful both for analyzing how LLMs work and for fine-tuning downstream models (Alpaca-like).