Public Data Integration
Controversy is raging about the clandestine gathering of phone information by the U.S. intelligence community. This practice should come as no surprise to information management professionals, who daily empower their companies with as much information as can be absorbed.
We’ll all be watching where the revelations go. We also know all postal mail metadata has been saved. Web activity? Purchases? Locations? Keep in mind private enterprises can buy information about every credit card purchase that is made. Data is as important to individuals living through the next couple of centuries as is the house they live in, the food they eat and the car they drive.
If your data were put to use, what would you look like? How are you likely to vote? Are you likely to have secret aspects to your life? Are you susceptible to certain promotions? What policies will you support, and how far will you go in noncompliance?
This is one of the most pressing and defining issues of our time in the information age.
Take a single data source like Twitter. Do your tweets reveal your personality? A group of researchers at IBM’s Almaden Research Center in San Jose, California, claim to be able to interpret your personality based on just 50 of your tweets. Allison Stadd recently tackled this tough topic in this recap of the research.
What about Facebook? Nidhi Subbaraman of NBC News reports, “Not only are companies tracking what you are doing, they are correlating it,” in the article “Facebook Forensics? What the feds can learn from your digital crumbs.”
Subbaraman suggests different personal proclivities are in play just from Facebook:
Facebook, as you might imagine, provides a wealth of identifying information. In a study published in the Proceedings of the [National] Academy of Sciences in March this year, a team of data scientists showed that they could work out a person’s sexual preferences, political leanings, and a host of other character details from their “likes.” In a similar manner, others can work out similar identifying characteristics from “browsing histories, search queries, or purchase histories,” they write in their paper.
The syndicated data marketplace (Dun & Bradstreet, InfoUSA, Acxiom, and many industry-specific providers) is already robust, and it is as much a part of many clients’ business intelligence environments as their business intelligence or data integration tools. The concept of shared data is proven, and in a constrained economy many companies have turned their data into a sellable asset in that marketplace. Private enterprises have been collecting and sharing massive amounts of data for a decade.
In my consulting, I’ve remediated many a “data mart” environment into an integrated data warehouse environment. The value of the data goes up exponentially when this happens: integrated data is much more powerful, and more accurate, than data held in separate silos. If the horse of single-source data collection has left the barn, what about integrated data?
Strong inferences can be made about people from their integrated data, and over the next decade the analytics behind those inferences will clearly be imperfect. If you’ve ever had a mistake on your credit report, at least there’s a process to fix it. Will there be a process to “fix” the profile inferred from your data? Perhaps, according to this article from Forbes.
I have friends who buy all their junk food with cash only. Determine for yourself, from an informed position, how seriously you should take data privacy and whether you should have a personal data strategy.