HGFlyGirl

HGFlyGirl t1_j2ewwry wrote

I tried a few of these things. The problem was that a lot of the songs had been ripped from CDs using different software, so some would be called things like track01.mp3, with a duplicate under a completely different file name. The duplicates could also have different byte lengths and durations. Then there are the songs that exist as the original recording, a live version and/or a compilation album track, which often differ a bit in all of those parameters.

1

HGFlyGirl t1_j2al2u4 wrote

Whatever solution you find, be mindful of how it impacts the bottom line. It's easy to spend more on protection against theft than you could lose from a theft.

It may be impossible to make it completely safe from theft, but it can be made difficult and, as you say, your customers have little knowledge of computers. I once had a customer actually pay a hacker to steal my software. I caught them at it, and a letter from the legal team was all I needed. I caught it because I had legitimate remote access.

Can you encrypt the model and make your software temporarily decrypt it at the point of inference? This might make the model useless in isolation.

1

HGFlyGirl t1_j2aen6d wrote

I trained a model to find duplicate music files in my brother's huge collection of digital music. He was frustrated by so many duplicates that still had different file names, file sizes and tags. We couldn't find any existing software that could do it, because it was all just looking for matches on those parameters. The model ended up working quite well.
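
For anyone wanting to try something similar, here is a rough sketch of matching on content rather than on names, sizes or tags. It is not the model described above. It assumes each file has already been turned into a numeric feature vector by some audio model or fingerprinting step (not shown), and it simply flags pairs whose vectors are nearly parallel. The names `cosine_similarity` and `find_duplicate_pairs`, the 0.95 threshold and the example vectors are all made up for illustration.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def find_duplicate_pairs(features, threshold=0.95):
    """Return pairs of file names whose vectors are at least `threshold` similar.

    `features` maps file name -> feature vector. The threshold is an
    assumption and would need tuning on real data.
    """
    pairs = []
    # Sort by name so the output order is deterministic.
    for (name_a, vec_a), (name_b, vec_b) in combinations(sorted(features.items()), 2):
        if cosine_similarity(vec_a, vec_b) >= threshold:
            pairs.append((name_a, name_b))
    return pairs


# Hypothetical vectors: the first two stand in for the same song ripped twice.
features = {
    "track01.mp3": [0.90, 0.10, 0.40],
    "Artist - Song.mp3": [0.88, 0.12, 0.41],
    "other_song.mp3": [0.10, 0.90, 0.20],
}

for name_a, name_b in find_duplicate_pairs(features):
    print(f"possible duplicate: {name_a} <-> {name_b}")
```

Comparing every pair is quadratic in the number of files, so a really large collection would probably need an approximate nearest-neighbour index instead of the plain loop. Live versions and compilation tracks would also land somewhere between "same" and "different", so the threshold decides whether those count as duplicates.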

5