Linguistic Fingerprinting

  1. #1
    aldra JIDF Controlled Opposition
    I'm currently writing a script that will download all of a user's posts on this site and analyse them for grammar, spelling, capitalisation, favourite words etc. The script then stores those scores as a 'fingerprint' in an SQLite database for comparison to other users - ostensibly, through simple analysis it should be easy to tell which users are alts belonging to whom.

    In announcing this I figure it's easy enough for people to change their posting style (if they haven't already) across their accounts, but I'm only planning on using this site to test the theory anyway - I don't really care who is who. The real basis for this is to run it against anonymous or open comment sections on news sites for instance, as certain topics seem to attract what I believe to be false mass-controlled accounts for the purpose of swaying public opinion.

    In many cases it seems to me that the comments on a news article are more instrumental in shaping the reader's opinion than the article itself.
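
    A minimal sketch of the fingerprint-and-store step described above, in Python rather than the author's own code. The chosen scores, the `fingerprints` table layout and the function names are all assumptions made for illustration, and fetching the posts is left out entirely.

```python
import re
import sqlite3
from collections import Counter

WORD_RE = re.compile(r"[A-Za-z']+")


def fingerprint(posts, top_n=5):
    """Reduce a list of post strings to a few crude style scores (hypothetical metrics)."""
    text = " ".join(posts)
    words = WORD_RE.findall(text)
    if not words:
        return None
    # Split on sentence-ending punctuation followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Count sentences whose first character is an uppercase letter.
    capitalised = sum(1 for s in sentences if s.lstrip()[0].isupper())
    favourites = [w for w, _ in Counter(w.lower() for w in words).most_common(top_n)]
    return {
        "word_count": len(words),
        "avg_word_len": sum(len(w) for w in words) / len(words),
        "capitalised_ratio": capitalised / len(sentences),
        "favourite_words": " ".join(favourites),
    }


def store(conn, user, fp):
    """Write one user's fingerprint into an assumed `fingerprints` table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fingerprints ("
        "user TEXT PRIMARY KEY, word_count INTEGER, avg_word_len REAL, "
        "capitalised_ratio REAL, favourite_words TEXT)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO fingerprints VALUES (?, ?, ?, ?, ?)",
        (user, fp["word_count"], fp["avg_word_len"],
         fp["capitalised_ratio"], fp["favourite_words"]),
    )
    conn.commit()
```

    Something like `store(sqlite3.connect("prints.db"), "some_user", fingerprint(posts))` would then be run once per user before any comparison; the database filename is likewise made up.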
  2. #2
    Call the program altdra
  3. #3
    Ajax African Astronaut [rumor the placative aphakia]
    totaly 2 ez 2 fool yer little script & evry1 els motherfucker
  4. #4
    aldra JIDF Controlled Opposition
    Originally posted by Ajax totaly 2 ez 2 fool yer little script & evry1 els motherfucker


    gonna go back through every single one of your posts and translate them to bling?
  5. #5
    Ajax African Astronaut [rumor the placative aphakia]
    Originally posted by aldra gonna go back through every single one of your posts and translate them to bling?

    may b
  6. #6
    Originally posted by aldra I'm currently writing a script that will download all of a user's posts on this site and analyse them for grammar, spelling, capitalisation, favourite words etc. The script then stores those scores as a 'fingerprint' in an SQLite database for comparison to other users - ostensibly, through simple analysis it should be easy to tell which users are alts belonging to whom.

    In announcing this I figure it's easy enough for people to change their posting style (if they haven't already) across their accounts, but I'm only planning on using this site to test the theory anyway - I don't really care who is who. The real basis for this is to run it against anonymous or open comment sections on news sites for instance, as certain topics seem to attract what I believe to be false mass-controlled accounts for the purpose of swaying public opinion.

    In many cases it seems to me that the comments on a news article are more instrumental in shaping the reader's opinion than the article itself.

    If I put on my tinfoil hat, I see the possibility of this being abused as it strips away a layer of anonymity. Sure, it doesn't matter on this site, but if it were to be applied across news sites, certain individuals could be singled out and targeted - both justly and unjustly.

    In reality though, I don't really care, and it actually sounds pretty cool.
  7. #7
    Ajax African Astronaut [rumor the placative aphakia]
    I would be interested if you had access to the posting history of totse and all of the successor sites for analysis. A lot of people went with different names across the migrations - some pieced together well, and others not.

    As for here, I'm convinced that the registered users consist of -SpectraL talking to himself through his alts.
  8. #8
    aldra JIDF Controlled Opposition
    Originally posted by Dargo If I put on my tinfoil hat, I see the possibility of this being abused as it strips away a layer of anonymity. Sure, it doesn't matter on this site, but if it were to be applied across news sites, certain individuals could be singled out and targeted - both justly and unjustly.

    In reality though, I don't really care, and it actually sounds pretty cool.

    there's no way I'm the first to have thought of it, so in effect it's likely already being exploited in the way you mention and multiple others.

    my end goal in this, I guess, is to get a general idea of how many public commenters are actually genuine and how many are essentially carpet-bombing public opinion

    that said, it's likely people engaged in this use measures to randomise the patterns I'd be looking for so I'd have to experiment with different metrics to even hope for success
  9. #9
    Sophie Pedophile Tech Support
    Originally posted by Dargo If I put on my tinfoil hat, I see the possibility of this being abused as it strips away a layer of anonymity. Sure, it doesn't matter on this site, but if it were to be applied across news sites, certain individuals could be singled out and targeted - both justly and unjustly.

    In reality though, I don't really care, and it actually sounds pretty cool.

    Don't you worry, this sort of intel gathering has been going on for a while.

    ONTOPIC: What language are you writing it in, aldra? Also, how will you analyze all the data you are collecting (in a nutshell)? Also, let's translate this to Python; I was thinking about putting together an open source intelligence framework. Be sure to submit commits for eternal glory.
  10. #10
    Sophie Pedophile Tech Support
    Originally posted by aldra there's no way I'm the first to have thought of it, so in effect it's likely already being exploited in the way you mention and multiple others.

    my end goal in this, I guess, is to get a general idea of how many public commenters are actually genuine and how many are essentially carpet-bombing public opinion

    that said, it's likely people engaged in this use measures to randomise the patterns I'd be looking for so I'd have to experiment with different metrics to even hope for success

    And yes, there are already countermeasure scripts for this on GitHub, but I forget their names.
  11. #11
    aldra JIDF Controlled Opposition
    Originally posted by Sophie ONTOPIC: What language are you writing it in, aldra? Also, how will you analyze all the data you are collecting (in a nutshell)? Also, let's translate this to Python; I was thinking about putting together an open source intelligence framework. Be sure to submit commits for eternal glory.

    Perl at the moment - there are plenty of CPAN libs to do the heavy lifting because I'm only really interested in the fingerprinting logic for this project. I might open a repo or I might just post the code here
  12. #12
    Sophie Pedophile Tech Support
    Originally posted by aldra Perl at the moment - there are plenty of CPAN libs to do the heavy lifting because I'm only really interested in the fingerprinting logic for this project. I might open a repo or I might just post the code here

    Sounds dope, keep me posted.
  13. #13
    SBTlauien African Astronaut
    I came across a program that does this a while back (about two years ago), but I never really looked into it. I'll try to find it.

    SpectraL knows how to avoid these types of things, you'll never catch me.
  14. #14
    benny vader YELLOW GHOST
    status report.
  15. #15
    Originally posted by SCronaldo_J_Trump Call the program altdra

    MK ALTDRA
  16. #16
    NARCassist gollums fat coach
    two words - predictive text

    all users using this, and spellcheck, will seem to the program to be the same person.
  17. #17
    Originally posted by NARCassist two words - predictive text

    all users using this, and spellcheck, will seem to the program to be the same person.

    I bet Sophie uses that; that's why he seems robotic
  18. #18
    aldra JIDF Controlled Opposition
    Originally posted by NARCassist two words - predictive text

    all users using this, and spellcheck, will seem to the program to be the same person.

    that'd only cover spelling and grammar - I'm more interested in language patterns; things like favourite words, local expressions or colloquial language etc
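
    One rough way to compare users on favourite words rather than spelling is to turn each user's text into a word-frequency profile and measure how closely two profiles line up. The sketch below uses cosine similarity for that; it is a swapped-in illustration, not the metric the script actually uses, and `word_profile` and `cosine_similarity` are made-up names.

```python
import math
import re
from collections import Counter


def word_profile(text):
    """Relative frequency of each lowercased word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    return {w: c / total for w, c in Counter(words).items()} if total else {}


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse frequency vectors (0.0 to 1.0)."""
    # Only words present in both profiles contribute to the dot product.
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

    Very common words will probably dominate a raw comparison like this, so a real attempt would likely need to down-weight them or restrict the profile to rarer expressions.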
  19. #19
    good luck
  20. #20
    Originally posted by aldra that'd only cover spelling and grammar - I'm more interested in language patterns; things like favourite words, local expressions or colloquial language etc

    correlate different users with the frequency of the word "triangle"
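
    Taken literally, the single-word idea above could be sketched like this: work out how often each user says a given word per 1000 words, then rank users by that rate. The function names and the per-1000 scaling are assumptions for illustration only.

```python
import re


def word_rate(posts, target, per=1000):
    """Occurrences of `target` per `per` words across a user's posts."""
    words = re.findall(r"[a-z']+", " ".join(posts).lower())
    if not words:
        return 0.0
    return words.count(target.lower()) * per / len(words)


def rank_users(posts_by_user, target):
    """(user, rate) pairs sorted from highest to lowest rate of `target`."""
    rates = {user: word_rate(posts, target) for user, posts in posts_by_user.items()}
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)
```

    Users with unusually similar rates for a rare word would be candidates for a closer look; a single word on its own is unlikely to be conclusive.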