Remarkable Big Data Pros & Cons

Sogeti Labs

September 24, 2012

Technology Breakthroughs

In technological terms, a great deal is happening around Big Data. One example is the data-analysis language R. Another is the AMPLab at the University of California, Berkeley, which takes Big Data as its starting point and focuses on the combined forces of Algorithms, Machines & People. A Big Data milestone in the summer of 2012 was the appearance of the GraphChi software, developed at Carnegie Mellon University. It enables analyses on an ordinary PC that previously occupied large computer clusters for hours. With a Twitter dataset from March 2010 as the benchmark, a single PC running GraphChi completed the analysis in 59 minutes. The previous time this was done, 1,000 large computers spent 6.5 hours on the same task. The dataset in question is freely available from the Infochimps.com website and contains 40 million users, more than 1.5 billion tweets, and 1.2 billion connections between users.

Marketing Future

However impressive all this may be, the big question is what it actually means. Not everyone is equally enthusiastic about Big Data, but many, including Jeff Dachis of the Dachis Group, believe that Big Data on social media is the glorious future of hypertargeting. Just think about it, says Dachis: hundreds of millions of people are busy on the social web, unconcernedly sharing their whole lives with one another. It easily adds up to 500 billion dollars in brand engagement value. At the beginning of 2012, Twitter had a total of 225 million accounts, and almost 200 million tweets were sent every day. By comparison, Facebook has more than 800 million active users and LinkedIn more than 135 million. This ‘consumerization’ of Big Data will only grow in the future.

There is skepticism, however, about the use of advertisements on Facebook in particular.
Just before Facebook was launched on the stock market, General Motors slashed its advertising budget for the social network. Yet the same GM still spends three times that sum on engagement with people on Facebook. The major challenge is to measure the ROI of this activity. For such tasks we now have advanced Social Analytics tools and new software such as GraphChi.

The Hype of Big Social Data

One of the applications of Social Analytics is to gather as much information as possible on people's online behavior (Big Social Data), with the aim of predicting what they will do next and what they will buy. Peter Fader, a marketing professor at Wharton Business School and co-director of the Wharton Customer Analytics Initiative, raises some pointed questions here. He compares the projected goldmine of Big Social Data to Customer Relationship Management, which made its breakthrough in the early 1990s. At first it was regarded as the Holy Grail, but today the verdict is harsher: it causes huge frustration and is much too expensive; in short, the IT party has run out of control. Fader is afraid that things will turn out the same way with the current Big Data hype.

Dragging the entire Twitter or Facebook ‘fire hose’ through some Social Analytics refinery is simply nonsensical, says Fader. If you wish to get involved in hypertargeting, you have to look at tweets at the individual level and link them to the transactions that a person carries out. But online and mobile do not jointly form the complete new world that the Big Social Data evangelists, in particular, would have us believe. Of course, more information can lead to new insights, but the question remains how much data is needed for this. How interesting is it, actually, to know where someone is shopping at any given moment and what they are looking at? And which of this information should we retain?
Fader argues that the real golden age of ‘predictive behavior’ occurred about fifty years ago. At that time, consumer information was very scarce. In the 1960s, Lester Wunderman began what he called ‘direct marketing’. That was genuine ‘data science’. Everything that could be known about a customer was kept up to date. What the direct marketing pioneers eventually arrived at was RFM: the relationship between Recency, Frequency and Monetary value. The effect of F on M is evident. R was the great surprise: it is easy to convince people to repeat previous behavior, even if they only buy things sporadically. However, you have to reach them immediately. In the marketing business, everyone is familiar with RFM, but it often means little to e-commerce people. With lots of Big Data you will undoubtedly come to the same conclusion, but that is a bit of a waste of all the time and effort expended.

How to Measure Anything

A good eye-opener in this context is the book How to Measure Anything: Finding the Value of “Intangibles” in Business by Douglas Hubbard, published in 2007. It is full of examples and tips that help you find out a great deal in a practical manner.
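
To make the RFM idea from the direct marketing discussion above a little more concrete, here is a minimal Python sketch that derives Recency, Frequency and Monetary value from a plain list of purchases. It is an illustration only, not a method taken from Fader or Wunderman: the function name `rfm_table`, the `(customer, date, amount)` record layout, and the sample purchases are all hypothetical.

```python
from collections import defaultdict
from datetime import date


def rfm_table(transactions, today):
    """Return {customer: (recency_days, frequency, monetary)}.

    transactions: iterable of (customer, purchase_date, amount) tuples.
    This record layout is an assumption made for the example.
    """
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)

    for customer, day, amount in transactions:
        # Recency is measured from the most recent purchase.
        if customer not in last_purchase or day > last_purchase[customer]:
            last_purchase[customer] = day
        frequency[customer] += 1      # F: number of purchases
        monetary[customer] += amount  # M: total amount spent

    return {
        c: ((today - last_purchase[c]).days, frequency[c], monetary[c])
        for c in last_purchase
    }


# Hypothetical sample data, invented for illustration.
transactions = [
    ("alice", date(2012, 9, 1), 40.0),
    ("alice", date(2012, 9, 20), 25.0),
    ("bob", date(2012, 3, 15), 300.0),
]

table = rfm_table(transactions, today=date(2012, 9, 24))

# Customers ordered by recency: those who bought most recently come first,
# reflecting the point that R matters and that timing is essential.
most_recent_first = sorted(table, key=lambda c: table[c][0])
```

Ranking by recency first, rather than by total spend, mirrors the observation that people are most easily persuaded to repeat a purchase shortly after making one. A real scoring scheme would typically bin and weight all three values, which this sketch leaves out.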

About the author

SogetiLabs gathers distinguished technology leaders from around the Sogeti world. It is an initiative explaining not how IT works, but what IT means for business.

