A Data Science Central Community
Without any doubt, the techniques generally labeled "big data", "data science", and/or "machine learning" have brought new and unprecedented insights into nearly every aspect of society.
Not only can companies like Google and Amazon "predict" the behavior of customers and adapt advertising and recommendations accordingly; it is also possible to extract personality clues from seemingly insignificant data. Did you know that, for example, the music taste posted on Facebook can provide indications about intelligence (interesting for employers), loyalty (interesting for authorities), or lifestyle (interesting for insurers)? I have included just one study on this interesting as well as troubling issue.
Of course, this has raised many concerns about privacy, consumer protection, and various kinds of abuse.
And it may be even worse: data about humans are not like data about things. Humans can manipulate their data when they expect a benefit. To put it more technically: there are feedback loops between the observed and the observer.
The faking of data is of course nothing new. Employers, insurers, and authorities have long experience in fighting (and falling victim to) fraudulent behavior. However, something has changed: the algorithms behind "big data" are not known to everyone to the same extent. (Otherwise, who would post "stupid" things on Facebook?) Some individuals "in the know" have already begun tuning their postings accordingly, omitting or exaggerating "useful" information. It is only a matter of time before specialized companies offer services that produce "better" profiles and, ultimately, better chances in the job, credit, or marriage market. (Or do they already?)
Of course, faking can work only as long as the majority is not "in the know". Once that changes (and it could happen quite soon), many algorithms will become useless, and new, more tamper-proof methods will have to be designed.
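This feedback loop can be made concrete with a toy simulation (my own illustrative sketch, not drawn from any study mentioned above). Suppose a scoring rule relies on a self-reported signal that is normally a noisy reflection of some hidden trait. As the fraction of people "in the know" grows and they simply report the most favorable value, the signal's correlation with the trait collapses:

```python
import random

random.seed(42)

def correlation(xs, ys):
    """Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def simulate(informed_fraction, n=10_000):
    """Correlation between a self-reported signal and the hidden trait.

    Hypothetical model: each person has a trait in [0, 1]. Honest people
    report a noisy version of it; people "in the know" game the system
    and simply report the maximum value.
    """
    traits, signals = [], []
    for _ in range(n):
        t = random.random()
        if random.random() < informed_fraction:
            s = 1.0                       # gamed: report the "best" value
        else:
            s = t + random.gauss(0, 0.1)  # honest but noisy report
        traits.append(t)
        signals.append(s)
    return correlation(traits, signals)

# The more people game the signal, the less it says about the trait.
for frac in (0.0, 0.3, 0.6, 0.9):
    print(f"informed={frac:.0%}  corr={simulate(frac):.2f}")
```

With nobody informed, the correlation is close to 1; with 90% gaming the signal, it drops toward zero. This is the sense in which widespread knowledge of the algorithm renders it useless.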