A Data Science Central Community
Big data is the new buzzword in the IT industry. The question that arises is: what actually is big data? According to Wikipedia:
“Big data is the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications.”
Don’t you think that is too technical and complex to understand if you are not a database or IT person? For the layman, let’s try to explain it in simple words:
“Big data is data which exists everywhere, making it difficult to use or store with currently available database management tools.”
This looks a little ambiguous: what data exists everywhere, and why is it difficult to store or use? Here is my attempt to explain it.
1) Data that exists everywhere is the data available on your phone, on your computer, even in a weather forecast or on the social networks we use, be it a tweet or a Facebook comment or like. Everything you use, everywhere, is data. However, how relevant it is and how the pieces relate to each other is the key.
This series of data is BIG DATA, where all things are interrelated, whether directly or indirectly. Can you imagine a system where data correlates everything, like weather with traffic conditions, the tyre pressure in your car, and the optimal route to your office, with all of this information derived from any one condition of weather or traffic?
I use the word "imagine" because there is currently no tool or chain of systems that can do this. Hence I say data is everywhere.
2) Currently, YouTube alone holds millions and millions of terabytes of data. Think how much space is required to store the data in just that one example. Then consider the processing power required to understand and use that data, for instance to find the videos that show a violation of traffic rules. I am no math whiz, but I can tell you that the odds of finding the video in question could be one in millions, if not billions. And I have taken the example of only one video sharing portal. Hence it is difficult to store and manage.
We can use any technical term to explain big data; this is the closest I can get to explaining it as a novice.
Big Data is "Everything, quantified and tracked." That is my preferred definition. That applies to *everything*! There are sensors everywhere, measuring and tracking everything. The vast majority of sensors (for the moment) are humans, i.e., people using social media to sense and to report the pulse of the world. But soon the Internet of Things and Machine-to-Machine data traffic will surpass even the multi-billion human sensors on the planet.
Big Data is "Everything, quantified and tracked" is a definition that applies to the quantifiable part of big data. However, if we look at less quantifiable content like videos, all we can readily know is the playing length or the file size. I read somewhere that YouTube receives approximately 72 hours of uploaded video per minute, which is huge. My point here is that big data is not merely knowing dimensions; it is analyzing the content itself, as in video analytics. How will you create a rule that says this is a given customer behavior?
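To put that upload rate in perspective, here is a rough back-of-the-envelope sketch in Python. It takes the 72-hours-per-minute figure at face value (I have not verified it), and the names used are my own, purely for illustration.

```python
# Assumption: the upload rate quoted in the comment, recalled from memory
# and not verified against any official source.
UPLOAD_HOURS_PER_MINUTE = 72

MINUTES_PER_DAY = 24 * 60
HOURS_PER_YEAR = 24 * 365


def uploaded_hours(minutes, rate=UPLOAD_HOURS_PER_MINUTE):
    """Hours of video uploaded over the given number of minutes."""
    return rate * minutes


# Hours of new video arriving in a single day.
hours_per_day = uploaded_hours(MINUTES_PER_DAY)

# How many years one person would need to watch a single day's uploads
# if they watched around the clock.
years_per_day = hours_per_day / HOURS_PER_YEAR

print(f"Hours uploaded per day: {hours_per_day}")
print(f"Years of nonstop viewing per day of uploads: {years_per_day:.1f}")
```

At that rate, one day of uploads comes to 103,680 hours, which is nearly 12 years of nonstop viewing. Knowing the total length is the easy part; understanding what is in those hours is the hard part.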
E.g., if we want to analyze customer trends in a market, we conduct surveys; however, we know a survey is only as accurate as its target audience or sample set. Meanwhile, a huge volume of data is available in social feeds, tweets, pictures, and videos that needs to be analyzed. This data is not accurately quantifiable as it stands; we need big data to make it so.
Yes, the end result will be quantifiable, but not the beginning. So in my view, Big Data is "Everything, quantified and tracked, when it is processed and analyzed."
@Vishal, that is exactly my point! Remember that Big Data is a concept that refers to the full suite of tools and methods that enable "Learning from Data" in all contexts. For example, we have had movies for nearly 100 years, but never have we seen the current scale of personal movie-making on YouTube, which monitors and tracks the loves, passions, and cats of countless human beings from around the world. Similarly, there have been surveillance cameras in banks, airports, and other such places for many years, but there has been nothing like the current monitoring and tracking of the social pulse of the world through ubiquitous cameras, social media, and other data streams. It is precisely because we have such enormous opportunity to quantify and track everything that we are now attempting to do so. Video analytics is just one example of "everything, quantified and tracked." My definition includes exactly that kind of example. The current world of "big data" is totally unlike the "data of the past". See my article here:
I surely do agree with you on that, Kirk.