
Loquzofaricoalaphar OP t1_j5kmiqg wrote

Yes, this is the sort of thing I am thinking about. Some percentage of people have very distinct styles; with Ted, however, it might have been the content that gave it away.

Yes, I am familiar with amiunique and all the browser variables it looks at.

I wonder if this way of identifying people is ever used when Google or others get subpoenaed and hand over data. Correlating these signals seems like it would pin down the individual more accurately than an IP address, but I wonder whether it is accepted by, or would hold up in, a court of law.


Loquzofaricoalaphar OP t1_j5h6s4z wrote

That is interesting to think about. I'm biased to think text patterns have lots of variables and are fairly unique. Perhaps analyzing them at scale without getting mush is more of a modeling problem than a compute problem.
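
To make "lots of variables" a bit more concrete, here is a rough sketch of the kind of content-independent measurements a stylometric model might start from. The function name `style_features` and the tiny `FUNCTION_WORDS` list are made up for illustration; real work would likely use far more features.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny function-word list (an assumption for this sketch).
FUNCTION_WORDS = ("the", "and", "of", "to", "i", "that", "however", "perhaps")

def style_features(text: str) -> dict:
    """Turn a text into a few style measurements that ignore topic."""
    words = re.findall(r"[a-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    counts = Counter(words)
    total = max(len(words), 1)  # avoid division by zero on empty input

    # Relative frequency of each function word.
    features = {f"fw_{w}": counts[w] / total for w in FUNCTION_WORDS}
    # Words per sentence, commas per word, and mean word length.
    features["avg_sentence_len"] = len(words) / max(len(sentences), 1)
    features["comma_rate"] = text.count(",") / total
    features["avg_word_len"] = sum(map(len, words)) / total
    return features
```

Each account would then become one vector of these numbers, and whether the vectors stay distinguishable across millions of accounts is exactly the open modeling question.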


Loquzofaricoalaphar OP t1_j5h5kq4 wrote

Perhaps it could return the top 10 most likely authors of the account. Some writing patterns and grammatical errors might be pretty unique, and the more posts an account has, the more unique it gets, right?
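
A minimal sketch of that top-10 idea, assuming character trigram counts compared by cosine similarity (one common stylometry baseline, not necessarily what anyone uses in practice). The names `ngram_profile`, `cosine`, and `top_candidates` are hypothetical, and the scores are similarities, not real probabilities.

```python
import math
from collections import Counter

def ngram_profile(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams (picks up spelling and punctuation habits)."""
    text = " ".join(text.lower().split())  # normalize case and whitespace
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two n-gram count profiles."""
    dot = sum(a[g] * b[g] for g in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def top_candidates(unknown_text: str, known_samples: dict, k: int = 10) -> list:
    """Rank known authors by similarity to the unknown text, best match first."""
    target = ngram_profile(unknown_text)
    scores = {
        author: cosine(target, ngram_profile(sample))
        for author, sample in known_samples.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]
```

Here `known_samples` would map each candidate's name to all of their known text joined together, so more posts per person means a fuller profile to compare against.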


Loquzofaricoalaphar OP t1_j5h59id wrote

So, like, if you fed it samples from the 200 people you were looking at and then fed it Reddit? Perhaps all of Reddit would be tricky, because some of those people might not have public text, and it would be difficult to label all the text on Facebook, LinkedIn, etc.