It seemed harmless at the time. A simple decision that would haunt me for several weeks. I was having a brief conversation with a co-worker about a recent Mashable article titled “44 Female Founders You Should Know.” When I mentioned the article, my co-worker told me about self-made billionaire Sara Blakely, founder of Spanx (if you don’t know what I am referencing, bless you). To my surprise, she (my co-worker) informed me that Spanx now offers Spanx for men. Naturally, I headed to Spanx.com (don’t click it) to see this firsthand. Unbeknownst to me, I was leaving a data trail.
For the past several weeks, I have seen Spanx banner advertisements on nearly every website I visit. As I have since concluded, my web traffic patterns have been recorded and I am now considered a potential Spanx customer. Data is taking over the world and I have decided to share my insight, horror and plain amazement with you!
Data Scientist, The New “Social Media Guru”
The job title “Data Scientist” has become commonplace overnight. As the social media industry cools, slightly or significantly depending on who you ask (reference ‘Facebook IPO’ and ‘Zynga Struggles’), Silicon Valley has switched focus over to Big Data. How big is “Big Data”? More accurately, how much is “Big Data” worth? In 2010 and 2011, VCs and the American government sank over $4 billion into Big Data startups. Could this be a mere fad (reference “acid wash jeans” or “parachute pants”)?
In short, no. Big Data is the product of a few key timeless elements: improvements in technology, demand for unique data sets and a desire to access this information quickly and easily. Big Data is here to stay.
Which types of data sets are being collected: Any available information that can be collected, categorized and analyzed to reveal insights about consumers in all demographics.
Information sources: Web-browsing data trails, social network communications and sensor data.
How are they collecting the data: Several methods are used, but a few are especially prevalent: logfile analysis, detecting patterns and changes in large data flows, Hadoop cluster processing and SQL queries.
What’s that? You don’t speak data? Just nod your head and smile (that’s what I do!).
Costs: Expensive. No Groupons yet. Investment can be quite substantial with an opportunity for great return if utilized effectively.
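If "logfile analysis" still sounds like a foreign language, here is a rough sketch of the idea in Python. It is only an illustration of the concept, not how any particular ad network works: the log format, the visitor IDs, the example.com addresses and the function names are all made up for this post. The code counts how often each visitor hits each site, then flags any site a visitor has hit at least twice as a likely interest.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical log lines in a simplified "visitor_id timestamp url" layout.
# Real server logs carry more fields; this format is an assumption.
LOG_LINES = [
    "visitor42 2012-08-01T09:15:00 http://shapewear.example.com/mens",
    "visitor42 2012-08-01T09:16:30 http://shapewear.example.com/mens/undershirts",
    "visitor42 2012-08-01T09:20:00 http://news.example.com/startups",
    "visitor77 2012-08-01T10:00:00 http://news.example.com/startups",
]


def domain_of(url):
    """Return the host part of a URL, e.g. 'news.example.com'."""
    return urlparse(url).netloc


def visits_by_visitor(lines):
    """Count page views per domain for each visitor."""
    counts = {}
    for line in lines:
        visitor, _timestamp, url = line.split(" ", 2)
        counts.setdefault(visitor, Counter())[domain_of(url)] += 1
    return counts


def likely_interests(lines, min_visits=2):
    """Map each visitor to the domains they viewed at least min_visits times."""
    interests = {}
    for visitor, counter in visits_by_visitor(lines).items():
        domains = sorted(d for d, n in counter.items() if n >= min_visits)
        if domains:
            interests[visitor] = domains
    return interests


if __name__ == "__main__":
    print(likely_interests(LOG_LINES))
```

Two clicks on a shapewear site and visitor42 is on the list. The threshold of two visits is arbitrary; a real system would presumably weigh far more signals than a simple page count.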
The Shift From How We Obtain It…To How We Use It
It’s important to note that data mining is heavily driven by mathematical and engineering solutions. The result is complex data sets that are not limited to three- or four-dimensional taxonomy and order. The challenge for Big Data companies then becomes organizing, analyzing, categorizing and ultimately commoditizing their findings. While Data Scientists continue to progress and provide increasingly sophisticated solutions, the challenges they face remain: verifying accuracy, discovering trends and providing valuable insights to their clients.
Remix! Reworking, Reorganizing and Reinventing Existing Data For New Services
As summer blockbuster movies have shown in recent years, the remake is “so hot right now” (insert Mugatu voice). The startup community is experiencing a similar trend with startups like Hoppit. Like many other founders, Steven Dziedzic started Hoppit because he was inspired by a personal need. As a single guy, Dziedzic would go on as many as three dates a week. While information databases like Yelp provided a wealth of information about restaurants, bars and clubs, there were still information gaps and it took too long to find the information he actually needed.
The missing element or categorization Dziedzic discovered was “vibe.”
Using specific algorithms to target keywords and phrases contained within customer reviews on sites like Yelp, Hoppit’s five-person team is able to determine the “vibe” of popular restaurants in New York (expanding to a market near you soon!). Dziedzic’s revelation is reflective of a much greater point. While improvements in science and technology will continue to eliminate certain functions performed previously by humans, perfecting and refining big data will rely heavily on humans for the time being (at least until Armageddon or when the Clippers win a championship, boom!). Additionally, many tasks still require subjective judgment, which remains a distinctly human attribute for now.
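To make the keyword idea concrete, here is a toy Python sketch of "vibe" tagging. To be clear, this is not Hoppit's algorithm, which the company has not shared with me. The vibe categories, keyword lists, sample reviews and function names below are all invented for illustration. The code simply counts how many vibe keywords show up across a restaurant's reviews and picks the vibe with the highest count.

```python
import re
from collections import Counter

# Hypothetical vibe categories and keywords, invented for this sketch.
VIBE_KEYWORDS = {
    "romantic": {"candlelit", "intimate", "cozy", "romantic"},
    "lively": {"loud", "crowded", "energetic", "dj"},
    "casual": {"laid-back", "relaxed", "casual", "divey"},
}

# Made-up reviews standing in for real customer reviews.
SAMPLE_REVIEWS = [
    "Candlelit tables and a cozy, intimate back room.",
    "A bit loud on Fridays, but still romantic.",
]


def tokenize(text):
    """Lowercase the text and split it into words, keeping hyphenated words whole."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())


def score_vibes(reviews):
    """Count keyword hits per vibe across all reviews."""
    scores = Counter()
    for review in reviews:
        for word in tokenize(review):
            for vibe, keywords in VIBE_KEYWORDS.items():
                if word in keywords:
                    scores[vibe] += 1
    return scores


def top_vibe(reviews):
    """Return the vibe with the most keyword hits, or None if nothing matched."""
    scores = score_vibes(reviews)
    if not scores:
        return None
    return scores.most_common(1)[0][0]


if __name__ == "__main__":
    print(score_vibes(SAMPLE_REVIEWS))
    print(top_vibe(SAMPLE_REVIEWS))
```

Notice who wrote the keyword lists: a human. Deciding that "candlelit" means romantic and "divey" means casual is exactly the kind of subjective call a machine does not make on its own.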
Big Data Cultivated By Machines, Perfected By Humans
We are on the precipice of a fundamental change in labor. Despite where you might think I’m going with this, we are far from the age of the Jetsons. There are no signs of robot maids being available anytime soon (even though I pray every day for just that). However, we have made remarkable steps toward having unprecedented amounts of information immediately available at our fingertips. Human beings are not in competition with machines. Rather, we have become complementary to each other. The race for unique data sets continues to fuel the growth of the Big Data industry. The solutions provided by Big Data companies remain imperfect. Humans are still required to verify data accuracy, refine algorithms to produce valuable insights and optimize data sets for use by companies. Employing the right team of people to utilize data correctly will ultimately determine a company’s longevity. Like every industry, Big Data offers a diverse set of jobs, ranging from entry-level data transcription to high-level data science or data forecasting. Data has quickly become one of the world’s largest unnatural resources. How you cultivate, categorize and capitalize (C^3, trademarking that) on this growing resource is up to you.