Introduction To Big Data
Big data, by definition, is a term used to describe a variety of data (structured, semi-structured, and unstructured), which makes for a complex data infrastructure. One of the most commonly used models for explaining big data is the multi-V model.
Big data is data that requires new technical architectures, analytics, and tools to unlock new sources of business value.
This report aims to promote and communicate advances in big data research by providing fast, high-quality research for practitioners and policy makers from the many different communities working on the topic.
Big Data: Background
It was in the 1980s that artificial intelligence-based algorithms were developed for data mining. One of the most popular models for processing data on clusters of computers is MapReduce. Jackson, Vijayakumar, Quadir and Bharathi provide a survey of the programming models that support big data analytics.
As large and small enterprises constantly attempt to design new products to deal with big data, open-source platforms such as Hadoop offer the opportunity to load, store, and query data at massive scale and to execute advanced big data analytics in parallel across a distributed cluster. Batch-processing models such as MapReduce enable the coordination, combination, and processing of data from multiple sources.
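The MapReduce model mentioned above can be illustrated with a minimal, single-machine sketch in Python. It is only an illustration of the map, shuffle, and reduce phases, not Hadoop's API: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample documents are assumptions made for this example, and a real framework would distribute the work across the nodes of a cluster.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key, as a framework would do."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # On a cluster, each document (or input split) would be mapped on a
    # different node; here everything runs sequentially in one process.
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical sample input, standing in for data from multiple sources.
counts = word_count(["big data needs big tools", "data tools"])
print(counts)
```

Because each call to `map_phase` depends only on its own document, and each call to `reduce_phase` only on one key, both phases can in principle run in parallel, which is what makes the model suitable for batch processing on a distributed cluster.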
Future Research Directions:
Many open-source data mining techniques, resources, and tools exist. These include R, GATE, RapidMiner, and Weka, in addition to many others. Cloud-based big data analytics solutions make these affordable analytics available on the cloud, so that cost-effective and efficient services can be provided. The fundamental reasons why cloud-based analytics attract so much attention are their accessibility, cost-effectiveness, and ease of setup and testing.
Directions in which big data may have an influence in the coming years include the following:
- Evolution of analytics and information management with respect to cloud-based analytics.
- Adaptation and evolution of techniques and strategies to improve efficiency and mitigate risks.
The research directions are not limited to the points mentioned above. The main goal is to transform the cloud from a data management and infrastructure platform into a scalable data analytics platform.
Main topics in big data:
When discussing infrastructure security, it is necessary to highlight the main technologies and frameworks for securing the architecture of a Big Data system, particularly those based on Hadoop, since it is the most frequently used technology. In this section we shall also discuss certain other topics, such as communication security in Big Data and how to achieve high availability.
Data privacy is probably the topic about which ordinary people are most concerned, but it should also be one of the greatest concerns for the organizations that use Big Data techniques. However, we should ask ourselves where the limit on the use of that information lies. Administrations should not have total freedom to use that information without our knowledge, although they also need to gain some benefit from the use of that data. Several techniques and mechanisms have therefore been developed to protect the privacy of the data while still allowing companies to profit from it, and they attempt to solve this problem in different ways.
The most frequently employed solution for securing data privacy in a Big Data system is cryptography. Cryptography has been used to protect data for a considerable amount of time.
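As one small illustration of cryptography applied to data privacy, the sketch below replaces a personal identifier with a keyed hash (HMAC-SHA256) before records are analysed. This is pseudonymization, not encryption, and it is only one of many cryptographic techniques; it is not taken from any specific Big Data system. The function name `pseudonymize`, the key value, and the sample records are hypothetical and chosen purely for the example.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Return a keyed HMAC-SHA256 digest that stands in for the raw identifier."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical key: in practice it would come from a key-management system,
# never from source code.
SECRET_KEY = b"example-key-for-illustration-only"

# Hypothetical records containing a direct identifier (an e-mail address).
records = [
    {"email": "alice@example.com", "purchases": 3},
    {"email": "bob@example.com", "purchases": 1},
    {"email": "alice@example.com", "purchases": 2},
]

# Replace the identifier with its pseudonym; the analytic value is kept.
protected = [
    {"user": pseudonymize(r["email"], SECRET_KEY), "purchases": r["purchases"]}
    for r in records
]

for row in protected:
    print(row["user"][:12], row["purchases"])
```

The same identifier always maps to the same pseudonym under a given key, so records can still be grouped and analysed per user, while anyone without the key cannot recompute the mapping from a guessed e-mail address.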
This section focuses on what to do once the data is contained in the Big Data environment. It not only shows how to secure the data that is stored in the Big Data system, but also how to share that data.
Security at Collection or Storage:
As mentioned previously, Big Data usually implies a huge amount of data. It is, therefore, important not only to find a means to protect data when it is stored in a Big Data environment, but also to know how to initially collect that data.
Policies, Laws, or Government:
Every disruptive technology brings new problems with it, and Big Data is no exception. The problems associated with Big Data stem mostly from the growing use of this technique to extract value from large amounts of data through its powerful analytical capabilities. This could imply a threat to people’s privacy.
In this research, we have examined the innovative topic of big data, which has recently gained a great deal of interest due to its perceived opportunities and benefits. In the information era in which we are currently living, voluminous varieties of high-velocity data are produced daily, and within them lie intrinsic details and patterns of hidden knowledge that should be extracted and utilized. Hence, big data analytics can be used to drive business change and enhance decision making by applying advanced analytic techniques to big data and revealing hidden insights and valuable knowledge.