Big data. What is it exactly? And if you know what it is, could you tell me whether it’s anything other than one or more large data sets? This is what I have been trying to find out.
An increasing amount of information is being made available these days, and a good number of very smart and creative people are doing fun things with it. The resources to do this are available to everyone, after all.
It is raw data I’m talking about, by the way: large data sets about various subjects. And it is not just governments, universities and companies dumping their data online, but people like you and me too. There are many ways to use the data, and some are doing it better than others. You can read more about that in this report by McKinsey and learn what locals are doing at SETUP. This is all fantastic and exciting, really. Making something interesting and useful from raw data gets me excited. Not that I have actually made anything (yet), but I’m sure that if I gave it a good shot for a while I could, and so could you. There are opportunities to do good with it, to solve important problems if you were so inclined.
A trend in business seems to be that bosses and managers want to make better, more informed decisions using big data. But what this strategy does is force the decision makers to look into the past instead of trying to read the future. It also looks like there is going to be a shortage of people who know how to analyze these vast amounts of data properly, and there are issues concerning privacy, logistics and whatnot that have yet to be sorted out.
Besides the actual size of the data sets, it’s the bringing together of different types of information that makes big data so useful. Another interesting way of looking at big data is a philosophical one. The following is a quote from a blog post by Mike LaBossiere in Talking Philosophy:
The data will, of course, always be less than complete. In addition to the practical limits, there is also the problem of “limited” omniscience—knowing everything that is and was. Unlimited omniscience would include knowing everything, including what will be (assuming that can be known). Given human limitations, we will never have that complete information. As such, the epistemic limits will certainly prevent a perfect model because there will presumably always be past things that we do not know (and perhaps there are unknowable things) and hence they will not be in the data.
Big data, then, is interesting and fairly complex. That is why I don’t like it when people start using terms like big data for many different things without knowing exactly what they mean. This happened, and continues to happen, with terms like open data (it’s definitely something other than big data, although the two are connected) and cloud computing (it is not the same as the Internet, though many people use it that way). What this conflation of terms creates is confusion, and it gives rise to online hacks who overcharge for seminars, webinars, lectures, talks, guest posts on blogs and so on, all because of managers and CEOs who don’t want to be left behind and forget to research what they really want to know.