First, what is Big Data?
Big data is a blanket term for any collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications.
The challenges include capture, curation, storage, search, sharing, transfer, analysis and visualization. The trend toward larger data sets is due to the additional information derivable from analysis of a single large set of related data, as compared to separate smaller sets with the same total amount of data, allowing correlations to be found to "spot business trends, determine quality of research, prevent diseases, link legal citations, combat crime, and determine real-time roadway traffic conditions."
What is Data Mining, and how has it helped so far?
Data mining is frequently described as "the process of extracting valid, authentic, and actionable information from large databases." In other words, data mining derives patterns and trends that exist in data. These patterns and trends can be collected together and defined as a mining model.
Data mining is useful for discovering and outlining hidden patterns in a specific data set. Because data grows rapidly, finding information manually can be difficult. Data mining provides algorithms that allow automatic pattern discovery and interactive analysis.
Mining models can be applied to specific business scenarios, such as:
What's the credit risk of this customer?
What are my customers' characteristics?
What products do people tend to buy together?
How much product do I expect to sell next month?
Determining which products are likely to be sold together
Finding sequences in the order that customers add products to a shopping cart
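To make the "bought together" scenario a little more concrete, here is a minimal sketch in Python. It is not a full mining model, only a simple count of how often each pair of products appears in the same basket (often called the support of the pair). The baskets, the product names, and the pair_support function are all invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, invented for illustration only.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "cereal"},
    {"bread", "butter", "cereal"},
]


def pair_support(baskets):
    """Return the fraction of baskets that contain each pair of products."""
    counts = Counter()
    for basket in baskets:
        # Sort the items so each pair is always stored in the same order.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    total = len(baskets)
    return {pair: n / total for pair, n in counts.items()}


support = pair_support(baskets)

# List pairs from most to least frequently bought together.
for pair, share in sorted(support.items(), key=lambda kv: (-kv[1], kv[0])):
    print(pair, round(share, 2))
```

In this toy data, bread and butter appear together in three of the four baskets, so that pair comes out on top. Real data mining tools go further than raw pair counts, for example by checking whether a pair shows up together more often than chance would suggest, but the basic idea of scanning transactions for recurring combinations is the same.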