  • Jonathon Colman   Apr 22 2012

    Hey folks, I thought this was a fascinating post. Is there ever a time when it would serve a content creator (or the public) to have their identity anonymized so that content cannot be directly tied to them? Content creators go to such fantastic lengths to claim their content -- through copyright, through metadata, through markup, etc. -- that they rarely consider the opposite condition: content that they do not want attributed to them being identified algorithmically as theirs. While this seems contrary to the goals of SEO, being able to detect (algorithmically) when it occurs is a highly valuable pursuit.

    The author proposes that such detection would be useful when reviewing job applications, to determine when an applicant pretends to be smarter or to know more than they really do. A more vivid example is determining who in a chat room is a child predator pretending to be a child. On the other side, an author publishing a review may not want his/her identity to be stylistically determined from his/her other writings online.

    The author imagines ways of achieving this kind of anonymity, but doesn't deem any of them to be of sufficiently high quality. That said, I think the first commenter has a quick and dirty solution (though one that is not algorithmically driven): hire cheap copywriters through Mechanical Turk or some other service to rewrite the content several times over, which should result in a loss of "style fidelity", making the content pretty much anonymous. Performing this process on a moderate collection of, say, a few hundred thousand documents might allow for the creation of an algorithm that recognizes sequences of "style" and replaces them with bland prose. What do you think - is there ever a time when we wouldn't want publicly published content to be identified with its creator?
