A Data Science Central Community
Terms like “data lake” and “big data” are taking over the world these days. Let’s try to understand what they actually mean.
Data Lake - A data lake is a storage repository that holds a vast amount of raw data in its native format. A data lake uses a flat architecture to store data. Each data element in a lake is assigned a unique identifier and tagged with a set of extended metadata tags. When a business question arises, the data lake can be queried for relevant data, and that smaller set of data can then be analyzed to help answer the question.
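To make the definition above concrete, here is a minimal in-memory sketch of the idea, not a real data lake product. The names `LakeObject`, `ToyDataLake`, `ingest`, and `query`, as well as the tag keys `source`, `kind`, and `fmt`, are my own assumptions for illustration. It only shows the three properties from the definition: raw data kept in its native format, a flat store keyed by a unique identifier, and metadata tags used to pull out a smaller relevant set.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass, field
from typing import Any


@dataclass
class LakeObject:
    """One raw data element plus its metadata tags (hypothetical type)."""
    object_id: str
    raw: Any  # stored as-is, in its native format
    tags: dict[str, str] = field(default_factory=dict)


class ToyDataLake:
    """Flat store: a single dictionary keyed by unique ID, no hierarchy."""

    def __init__(self) -> None:
        self._objects: dict[str, LakeObject] = {}

    def ingest(self, raw: Any, **tags: str) -> str:
        """Store raw data untouched, assign a unique ID, attach tags."""
        object_id = str(uuid.uuid4())
        self._objects[object_id] = LakeObject(object_id, raw, dict(tags))
        return object_id

    def query(self, **wanted: str) -> list[LakeObject]:
        """Return only the objects whose tags match every requested value."""
        return [
            obj for obj in self._objects.values()
            if all(obj.tags.get(key) == value for key, value in wanted.items())
        ]


lake = ToyDataLake()
# Three elements in three different native formats, none of them transformed.
lake.ingest('{"order_id": 1, "total": 40.0}', source="web", kind="order", fmt="json")
lake.ingest("2,55.5", source="store", kind="order", fmt="csv")
lake.ingest("GET /index.html 200", source="web", kind="log", fmt="text")

# A business question about orders narrows the lake to a smaller set.
orders = lake.query(kind="order")
```

Here `orders` holds only the two order records, still in their original JSON and CSV form; parsing and analysis happen afterwards, on that smaller set.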
Big Data - Big data is a blanket term for any collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications.
I understand that these two terms mean different things; however, both are still just terminology.
Don’t you think we are getting into the habit of inventing terminology? It won’t be long before words like “Data Sea” or “Data Universe” come into the picture (for me, the Internet is the data universe: it holds every type of data, and we are still looking for ways to analyze it). My question is this: are these big names really that important, or are we missing the essence of the business, which is analytics, where we take data, analyze it, and provide data-driven results that help the organization make informed decisions?
If you read my previous article, it was an effort to link old database techniques to new storage and retrieval methods, which in my view actually fit together. We are looking at old things and giving them new names.
I would like to conclude that terminology is important, but more important is its use in the business environment, no matter how primitive or old it sounds.