Data mining is the process of extracting patterns from data. Since July, I’ve been doing a lot of data mining from PubMed. Many of my genomic and/or proteomic queries involve identifying genes and/or proteins associated with various diseases, and PolySearch has proven quite useful for quickly running such queries against nearly a dozen different types of text, scientific abstract or bioinformatic databases.
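To make the idea of mining gene and disease associations from text a bit more concrete, here is a toy sketch that counts how often a gene name and a disease name show up in the same sentence. It is not how PolySearch works internally, just a minimal illustration of co-occurrence counting. The sample sentences, the term lists and the function name `count_cooccurrences` are all made up for the example; a real tool would pull abstracts from PubMed and use curated thesauri.

```python
import re
from collections import Counter

# Hypothetical mini-corpus standing in for abstracts (not real PubMed records).
abstracts = [
    "Mutations in BRCA1 are associated with breast cancer risk.",
    "TP53 is frequently mutated in breast cancer and lung cancer.",
    "BRCA1 and TP53 interact in DNA repair pathways.",
    "CFTR mutations cause cystic fibrosis.",
]

# Hypothetical term lists; a real system would use curated vocabularies.
genes = ["BRCA1", "TP53", "CFTR"]
diseases = ["breast cancer", "lung cancer", "cystic fibrosis"]


def count_cooccurrences(texts, genes, diseases):
    """Count sentences in which a gene term and a disease term both appear."""
    counts = Counter()
    for text in texts:
        # Naive sentence split: break after ., ! or ? followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            lowered = sentence.lower()
            # Match gene symbols as whole words, case-insensitively.
            found_genes = [
                g for g in genes
                if re.search(rf"\b{re.escape(g.lower())}\b", lowered)
            ]
            # Match disease names as simple substrings.
            found_diseases = [d for d in diseases if d.lower() in lowered]
            for g in found_genes:
                for d in found_diseases:
                    counts[(g, d)] += 1
    return counts


pair_counts = count_cooccurrences(abstracts, genes, diseases)
for (gene, disease), n in pair_counts.most_common():
    print(f"{gene}\t{disease}\t{n}")
```

Pairs that co-occur more often across many abstracts are the ones worth a closer look; the third sample sentence contributes nothing because it mentions two genes but no disease.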
Data mining also has a role in modern business and is an increasingly important tool to transform data into business intelligence. On Monday, I heard a story on American Public Media’s Marketplace about data mining and privacy. According to the story, Facebook has changed the data mining game. An individual’s Facebook data, coupled with their search data, is very powerful.
Thank you, Mark Zuckerberg.
Essentially, your Facebook page is a consumer profile that you have built yourself. Instead of making data miners build your profile from search data, you’ve listed all your interests, likes and dislikes in one central location. Search data becomes secondary to the information you yourself have collected.
Currently, advertisers use your information to show you ads for things you’re likely to want or need. However, many predict that data mining will be used by other companies, such as insurers and creditors, to determine your insurance risk or your level of access to credit.
What really caught my attention was the proposed solution to this problem of reduced privacy: taking ownership of our data and crafting a personal brand using everything from LinkedIn to Facebook to Twitter. In terms of how you’re perceived by people — friends, colleagues, prospective mates and prospective employers — everything you share with your personal network can advance a lot of what will happen in your life.
Indeed, over the past decade, our culture has traded privacy for social connectivity. According to Andreas Weigend, a data mining expert at Stanford:
“Maybe privacy was just a blip in history. It started when people moved to cities, where they had places to hide. And it ended with the Internet, when basically there was no place to hide left.”
The take-home message: quit trying to hide and take control of how you’re perceived.
Cheng et al. PolySearch: a web-based text mining system for extracting relationships between human diseases, genes, mutations, drugs and metabolites. Nucleic Acids Res. 2008 Jul 1;36(Web Server issue):W399-405. Epub 2008 May 16.
Walter Jessen is a digital strategist, writer, web developer and data scientist. You can typically find him behind the screen of something with an internet connection.