modcowboy t1_jdkz6of wrote

It would probably be easier for the LLM to interact with the website directly through the browser's inspect tool than through trained machine vision.


MjrK t1_jdm4ola wrote

For many (perhaps, these days, most) use cases, absolutely! In some others, the advantage of vision might be interacting more directly with the browser itself, as well as with other applications, and multi-tasking... perhaps similar to the way we use PCs and mobile devices to accomplish more complex tasks.