shitasspetfuckers

shitasspetfuckers t1_jed7vuu wrote on March 31, 2023 at 3:58 AM

Reply to comment by SeymourBits in [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-

Can you please clarify what specifically you tried and what the outcome was?

shitasspetfuckers t1_jed796l wrote on March 31, 2023 at 3:52 AM

Reply to comment by Qzx1 in [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-

> Google's Spotlight paper

https://ai.googleblog.com/2023/02/a-vision-language-approach-for.html

shitasspetfuckers t1_je6p0z9 wrote on March 29, 2023 at 8:15 PM

Reply to comment by detached-admin in [D] The best way to train an LLM on company data by jaxolingo

Why not other people's money?

shitasspetfuckers t1_je1v7pf wrote on March 28, 2023 at 8:21 PM

Reply to comment by reditum in [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-

Can you please clarify what specifically about their approach wasn't great?

shitasspetfuckers t1_iwzs5i8 wrote on November 19, 2022 at 5:08 PM

Reply to comment by flapflip9 in [P]Modern open-source OCR capabilities and which model to choose by Rodny_

Link: https://github.com/open-mmlab/mmocr