
FastestLearner OP t1_j4uhkbm wrote

Yes, I did think about that. Potential solutions could be:

(1) A startup offering the service in exchange for a small fee - The good thing about this is that once you run inference on a video, you can serve the result to thousands of customers at no additional cost (aside from server maintenance and bandwidth; there is no extra GPU cost beyond the first run on a particular video).
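Something like this compute-once, serve-many pattern (a minimal sketch; `run_sponsor_inference` is a hypothetical stand-in for the actual model):

```python
# Sketch of the "run inference once, serve many" idea.
from typing import Dict, List, Tuple

# video_id -> list of (start_sec, end_sec) sponsor segments
_cache: Dict[str, List[Tuple[float, float]]] = {}

def run_sponsor_inference(video_id: str) -> List[Tuple[float, float]]:
    # Stand-in for the real (GPU-expensive) model call.
    return [(30.0, 75.0)]  # e.g. one sponsor segment from 0:30 to 1:15

def get_segments(video_id: str) -> List[Tuple[float, float]]:
    # GPU cost is paid only on the first request for a given video;
    # every later request is a cheap lookup.
    if video_id not in _cache:
        _cache[video_id] = run_sponsor_inference(video_id)
    return _cache[video_id]
```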

(2) Crowd-sourced inference - The sponsor-blocking extension currently relies on manual user input, which it sources from the crowd and collects on a central server, so it's basically crowd-sourced (or peer-sourced) manual labour. If someone built an automated version, e.g. an executable that runs in the background with a very small resource footprint, then inference could be crowd-sourced too: the timestamps would be collected on a central server and distributed across the planet. The good thing about this is that the more people who join the peer-sourced inference, the less busy any one peer's GPU needs to be. A rough sketch of such a background worker is below.
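Something along these lines (the server URL and the `/next-job` and `/submit` endpoints are made up for illustration):

```python
# Sketch of a background worker for crowd-sourced inference.
import time
import requests

SERVER = "https://example.org/api"  # hypothetical central server

def run_sponsor_inference(video_id):
    # Stand-in for the real model call on this peer's GPU.
    return [(30.0, 75.0)]

def worker_loop():
    while True:
        # Ask the central server for a video nobody has processed yet.
        job = requests.get(f"{SERVER}/next-job").json()
        if job is None:
            time.sleep(60)  # nothing to do; back off
            continue
        segments = run_sponsor_inference(job["video_id"])
        # Upload the timestamps so every other user gets them for free.
        requests.post(f"{SERVER}/submit",
                      json={"video_id": job["video_id"], "segments": segments})
```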

1

float16 t1_j4uqlls wrote

2 seems doable. Not everybody has to have a GPU, but I bet lots of people, including me, would rather spin up the GPU in their personal computer for a few seconds than manually specify where skippable segments are.

The one central server thing bugs me. I'd prefer something like "query your nearest neighbors and choose the one with the most recent data." No idea how to do that though; not a systems person.

1

FastestLearner OP t1_j4x7rvp wrote

Yes. Your first point is something I would happily engage in as well; I have no problem contributing to the community. Moreover, the extension could have several additional options (a sketch of how these modes might look in a client config follows the list):

(i) Do not perform any inference on the client, i.e. always query existing timestamps from the online database. This would help users with low-power devices like laptops.

or

(ii) Perform inference only for the videos that the client watches. This is, of course, necessary if a video does not yet have timestamps on the server: the client runs the inference and uploads the results to the central server.

or

(iii) Keep performing inference on new videos (even ones the particular user never watches) - Folks who run powerful enough hardware and are eager to donate their computation time could choose this option, and I am pretty sure some such folks will emerge. The LeelaChessZero project banked entirely on this particular idea. For this option, there could be a slider letting the user control how much of their resources stay actively engaged (maybe by limiting the thread count).
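A minimal sketch of the three modes, assuming nothing about an actual extension's API (names and defaults are illustrative):

```python
from enum import Enum
from dataclasses import dataclass

class InferenceMode(Enum):
    SERVER_ONLY = 1  # (i) never run the model locally; query the database
    ON_DEMAND = 2    # (ii) run locally only for videos this user watches
    VOLUNTEER = 3    # (iii) keep processing unseen videos in the background

@dataclass
class ClientConfig:
    mode: InferenceMode = InferenceMode.SERVER_ONLY
    # The "slider": cap how many threads the background worker
    # may use when mode == VOLUNTEER.
    max_threads: int = 2

config = ClientConfig(mode=InferenceMode.VOLUNTEER, max_threads=4)
```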

The second point you mentioned could be implemented with a peer-to-peer communication protocol, but if the neural network's weights don't change, there would be no difference between the most recent and stale timestamps. Also, in P2P you'd still need trackers to keep track of peers; the tracker could be a central server or be decentralized and serverless, depending on the implementation. One potential problem would be latency, though. A rough sketch of a tracker-based lookup is below.
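For instance (the tracker URL and its response format are hypothetical):

```python
# Sketch of the tracker-based lookup discussed above: ask a tracker for
# peers, then prefer the peer whose copy of the timestamps is newest.
import requests

TRACKER = "https://example.org/tracker"  # hypothetical

def freshest_peer(video_id):
    # Tracker returns peers claiming to have results for this video,
    # each with a last-updated timestamp.
    peers = requests.get(f"{TRACKER}/peers", params={"video": video_id}).json()
    if not peers:
        return None
    # With fixed model weights, "fresher" rarely matters (same output),
    # so latency would be a better tiebreaker in practice.
    return max(peers, key=lambda p: p["updated_at"])
```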

1