
COIT20253 Assessment 3: Practical and Written Assessment

Assessment Task:

This is an individual assessment. In this assessment, you are required to produce a report based on the Big Data strategy document you developed for Assessment 2.

At the beginning of the report, you will identify some Big Data use cases based on the Big Data strategies you developed for Assessment 2.

In the following part, you will critically analyse different Big Data technologies, data models, processing architectures and query languages and discuss the strengths and limitations of each of them. You will also discuss different Big Data analytics and business intelligence tools that enable businesses to gain actionable insights from Big Data.

Furthermore, you will discuss the Big Data technologies that you could use for data collection, storage, transformation, processing and analysis to support your use cases. You will also illustrate the Big Data technology stack and processing architecture required to support your use cases. Moreover, you will specify what user experiences you are going to provide to aid in decision making. You have to provide the rationale behind each of the choices you make.

Your target audience is executive business people who have extensive business experience but limited ICT knowledge. They would like to be informed as to how new Big Data technologies might be beneficial to their business. Please note that a standard report structure, including an executive summary, must be adhered to.

The main body of the report should include, but is not limited to, the following topics:

1. Big Data Use Cases

2. Critical Analysis of Big Data Technologies

3. Big Data Architecture Solution

The length of the report should be around 3000 words. You are required to do an extensive reading of more than 10 articles relevant to the chosen Big Data use cases, technologies, architectures and data models. You will need to provide in-text referencing of the chosen articles.
Your assessment must have a Cover page (Student name, Student ID, Unit ID, Campus, Lecturer and Tutor name) and a Table of Contents (this should be MS Word generated).

Caution: ALL assessments will be checked for plagiarism by Turnitin.


Assessment Submission:
You must upload the written report to Moodle as a Microsoft Office Word file by the above due date.

Assessment Criteria:

You will be assessed based on your ability to critically analyse and evaluate different Big Data technologies, and to apply Big Data architecture, tools, and technologies to support Big Data use cases.


Big Data Sample Solution: COIT20253 Assessment 3: Practical and Written Assessment


1.0 Executive summary

This report briefly discusses big data analysis in the context of Intel. Intel is well known for its microprocessors, which are distinguished by their features and their data storage capacity. The company aims to increase the quality of its processors and to accelerate their speed. Its strategy is package-level integration, particularly the stacking of chipsets. The chipsets must handle big data analysis efficiently when they are used for business intelligence. Intel has decided to acquire talent to develop the new technology, expects the chipsets to be efficient enough to meet user requirements, and has decided to improve the production performance of its processors. In big data, large volumes of complex data are stored in databases, and traditional storage and analysis methods cannot cope with them, so advanced computational tools and techniques are required for the analysis. Intel microprocessor chipsets are designed to deliver the operational efficiency that big data demands: when complex data is involved, the operations carried out for the analysis must be highly efficient. Apart from business intelligence, big data technology is used in various chip-processing techniques and in other domains. The report explains the use cases of big data analytics in those fields, specifies the types of analysis required and outlines their procedures. The architectural framework plays an important part in the analysis of big data.


2.0 Introduction

In today's technological world, data is generated from many sources for many purposes, which has driven the growth of big data. Big data is an evolutionary breakthrough that aids the collection of large data sets. The term refers to massive amounts of complex data, either structured or unstructured, stored in databases. The objective of big data is to analyze and process high volumes of data using a wide range of technologies and intelligent techniques. Big data has four dimensions: volume, velocity, variety and veracity (Taylor-Sakyi, 2016). It provides many opportunities in knowledge-processing tasks, which enables research in this field of study. The biggest challenge in big data is storing the data and accessing it for analysis. The analysis requires computationally powerful tools that effectively handle the complexity, uncertainty and inconsistency present in big data. Big data is used in domains such as the retail industry, scientific research, healthcare and public administration. It comprises various strategies for analyzing the data associated with a particular domain, and the analysis requires tools and models for further processing. This report discusses big data analysis with respect to the technological advancements in the field. It then outlines the big data use cases related to the assessment done for Intel microprocessor chipsets. Finally, it discusses the big data techniques, tools and architecture models used to build the solution.


3.0 Body of the Report

3.1 Big data use cases

This section briefly describes the use cases of big data in various fields. With increasing digitalization, big data is used in areas such as chipset design, storage and networking, and IoT in microprocessor technologies. The subsections that follow discuss how big data is applied in those areas and what outcomes it produces.

Business Strategy used in the Processing Chipsets

Chipset manufacturing employs a package-level integration strategy. Package-level integration makes use of the Foveros 3D stacking and EMIB technologies. 3D stacking brought a revolutionary change to the design of microprocessor chipsets: it helps connect components and enables a 3D chip architecture. EMIB technology is used for 2D stacking of components in the processing chipsets.


Chipset Designing Based on Big Data

This subsection explains the big data strategy used in Intel microprocessor chipsets. Intel processors rely on advanced analytics, which necessitates chipsets that provide large memory support and rapid analysis in order to deliver real-time insights from massive amounts of data. The chipset designed for the processor must be highly capable of complex computation. In recent years, big data and business intelligence have become inseparable in terms of technological advancement. The processor makes use of analytics that support decision making in business, and the use of big data further enhances the performance of business processes. The chipsets must be of good quality and offer high-performance computing that can resolve various uncertainties and complexities, so that high volumes of data from the data sets can be processed without inconsistency. Big data analytics and techniques help improve the processing of data in chipsets. The strategies formulated for chipset processing in turn improve the operational efficiency of Intel microprocessor chipsets. Because the chipsets are designed according to user requirements, production performance also improves. The processor must be able to access information easily when it deals with big data. Thus, big data analytics is useful in the field of business intelligence for various purposes.


Enabling Chipset Storage and Networking using Big Data

In Intel microprocessor chipsets, big data has brought many improvements in compute and storage resources. The chipsets use Ethernet solutions to achieve a balanced system, which allows large numbers of data sets to be imported and replicated across servers. Because of these Ethernet solutions, the processor provides high-throughput connections when dealing with big data. The Intel processor uses solid-state drives, which are high-performance, high-throughput devices, for raw storage of data. In performance results obtained with Apache Hadoop, the processor exhibited balanced computing, storage and networking resources. This gave a performance advantage: processing time fell from 4 hours to 12 minutes, which is near real time for big data technologies. In-memory environments are data intensive and require 10 Gigabit Ethernet solutions, because the processor needs speed and bandwidth to support low-latency communication. In-memory processing also requires persistent storage to maintain SQL compliance for database transactions. Hence, a high-performance and cost-efficient analytics platform is obtained by combining the Hadoop framework with standard server platforms (center, 2015).
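To put the reported improvement in perspective, the short calculation below converts the two processing times quoted above into a speed-up factor and a percentage reduction. It is only a worked arithmetic sketch; the variable names are illustrative and the two figures are the only inputs taken from the report.

```python
# Processing times quoted in the report, expressed in minutes.
baseline_minutes = 4 * 60   # 4 hours before the balanced platform
optimized_minutes = 12      # 12 minutes afterwards

# How many times faster the optimized run is.
speedup = baseline_minutes / optimized_minutes

# Share of the original processing time that was eliminated.
reduction_pct = 100 * (baseline_minutes - optimized_minutes) / baseline_minutes

print(f"Speed-up: {speedup:.0f}x, time saved: {reduction_pct:.0f}%")
```

In business terms, a job that used to occupy half a working day now finishes in roughly the time of a short meeting.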


Imparting IoT in Microprocessor using Big Data

Big data analytics supports the introduction of the Internet of Things (IoT) into microprocessor chipset manufacturing. This has delivered cost savings, easier maintenance and higher product yields in the manufacturing process. Because the data is both structured and unstructured, big data analytics is used to merge and correlate the data sets, which creates business value. Intel has reduced the time needed for quality checking of its processor chipsets. It has also improved manufacturing and monitoring by detecting failures through data-intensive processing. Intel has integrated IoT with big data analytics and technologies to benefit its manufacturing. The integration reduces production yield loss by monitoring and analyzing parametric values obtained from the machines, so that parts are replaced before they fail. It also helps separate good units from defective ones by using image analytics. Furthermore, with IoT in the Intel manufacturing network, Intel has an integrated and validated big data analytics solution that delivers business value from the data extracted from that network (Optimizing Manufacturing with the Internet of Things).
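The idea of replacing parts before they fail can be illustrated with a minimal sketch. It assumes a hypothetical stream of vibration readings from one machine and flags any reading that drifts far from the machine's healthy baseline. The function name, the sample values and the threshold of three standard deviations are assumptions made for illustration, not details of Intel's system.

```python
from statistics import mean, stdev

def flag_parts_for_replacement(readings, baseline_count=5, z_threshold=3.0):
    """Return the positions of readings that drift far from the healthy baseline.

    The first `baseline_count` readings are treated as normal behavior.
    Later readings are flagged when they lie more than `z_threshold`
    standard deviations away from the baseline average.
    """
    baseline = readings[:baseline_count]
    mu = mean(baseline)
    sigma = stdev(baseline)
    flagged = []
    for i, value in enumerate(readings[baseline_count:], start=baseline_count):
        if sigma > 0 and abs(value - mu) / sigma > z_threshold:
            flagged.append(i)
    return flagged

# Hypothetical vibration readings (mm/s) from one machine over time.
vibration_mm_s = [2.0, 2.1, 1.9, 2.0, 2.2, 2.1, 2.0, 3.5, 3.9]
alerts = flag_parts_for_replacement(vibration_mm_s)
print(alerts)  # positions of the readings that warrant a maintenance check
```

With these sample values the last two readings are flagged, which is the point at which a maintenance team would be alerted to replace the part.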


3.2 Critical analysis of big data technologies

This section discusses the techniques used to analyze big data. The type of analysis varies between industries and sectors. The big data strategy used for the Intel processing chipset is package-level integration, which aims to obtain high-quality outputs and to enhance innovation in chipset processing. The analysis of big data is based on specific technological needs and requirements. The technologies used in this process are Apache Hadoop, Microsoft HDInsight, NoSQL, PolyBase and cloud technology.


3.2.1 Hadoop Technology

With Hadoop, many applications can be run using the MapReduce programming model, in which the data is processed in parallel. Hadoop is a tool used to perform statistical analysis of large amounts of data. It is an open-source framework written in Java, and it is used for both storage and computation.
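The MapReduce idea can be sketched in a few lines. The example below is a single-machine illustration in plain Python, not Hadoop's own Java API: it counts words in a few hypothetical test-log lines by mapping each line to (word, 1) pairs, grouping the pairs by word, and reducing each group to a total. On a real cluster the map and reduce steps would run in parallel on many machines.

```python
from collections import defaultdict

def map_phase(line):
    """Map: turn one input line into (word, 1) pairs."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle_phase(pairs):
    """Shuffle: group all values that share the same key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single result."""
    return key, sum(values)

def word_count(lines):
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())

# Hypothetical log lines standing in for a much larger data set.
log_lines = ["chip test pass", "chip test fail", "chip pass"]
counts = word_count(log_lines)
print(counts)
```

Because each line is mapped independently and each word is reduced independently, the work can be spread over as many machines as are available.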


3.2.2 Cloud Technology

Existing data storage is not sufficient for the large volumes that must be stored and processed in big data. Cloud storage technology is therefore needed to store, process and access the large data sets used in many application areas.

3.2.3 NoSQL Database

NoSQL databases are used to analyze document and graph data in detail.

Some of the analysis tools involved in this process are as follows:

  • Predictive Analysis Tool
  • Descriptive Analysis Tool
  • Cognitive Analysis Tool
  • Prescriptive Analysis Tool

3.2.4 Analysis Tools Used in Big Data Technologies

Predictive analysis – This type of analysis uses past data to predict the future. It is mainly concerned with forecasting future values from the big data source. Techniques such as data mining and artificial intelligence are widely used to analyze the data and identify future scenarios, and business patterns and models are developed for future needs (Youssra Riah, 2018). The focus is on predicting future relationships, patterns and states of the data. This analysis is useful in Intel chipset processing, where it enhances performance by anticipating future behavior. Using past data on the processor chipsets, their features are enhanced for better productivity and quality.
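As a minimal sketch of predicting the future from past data, the example below fits a straight trend line to hypothetical quarterly defect rates and extends it by one quarter. The figures and names are invented for illustration, and a simple least-squares line stands in for the data mining and artificial intelligence techniques mentioned above.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical history: defect rate (%) for five consecutive quarters.
quarters = [1, 2, 3, 4, 5]
defect_rate_pct = [5.0, 4.6, 4.1, 3.9, 3.4]

slope, intercept = fit_line(quarters, defect_rate_pct)
forecast_q6 = slope * 6 + intercept
print(f"Forecast defect rate for quarter 6: {forecast_q6:.2f}%")
```

With these sample figures the trend falls by about 0.39 percentage points per quarter, which gives a forecast of roughly 3.03% for the sixth quarter.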

Descriptive analysis – This type of analysis helps determine the current business situation from the data sources. It produces patterns and reports that contain the necessary information. It aims at pattern detection and description, which are used in various fields of technology. It draws valuable insights about the past from multiple sources and then analyzes the patterns. In Intel microprocessing, past data is used to determine the current state of the processor chipsets. Patterns and reports are produced according to the current operational efficiency of the chipsets in order to evaluate their performance in handling big data. The big data analytics available in the chipsets will also be enhanced to produce the reports.
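A descriptive report of this kind can be sketched as a simple summary of past test records. The records, field names and production lines below are hypothetical. The sketch only shows the general shape of such a report: how many units were tested, what share passed, and the average latency on each line.

```python
from collections import defaultdict
from statistics import mean

def summarize_by_line(records):
    """Build a per-production-line summary of historical test records."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["line"]].append(record)

    report = {}
    for line, rows in grouped.items():
        report[line] = {
            "units": len(rows),
            "pass_rate_pct": 100 * sum(r["passed"] for r in rows) / len(rows),
            "mean_latency_ms": mean(r["latency_ms"] for r in rows),
        }
    return report

# Hypothetical test records from two production lines.
test_records = [
    {"line": "A", "latency_ms": 10.0, "passed": True},
    {"line": "A", "latency_ms": 12.0, "passed": True},
    {"line": "A", "latency_ms": 14.0, "passed": False},
    {"line": "B", "latency_ms": 20.0, "passed": True},
    {"line": "B", "latency_ms": 22.0, "passed": True},
]

report = summarize_by_line(test_records)
for line, stats in report.items():
    print(line, stats)
```

The output describes what has already happened on each line; it makes no prediction, which is what separates descriptive from predictive analysis.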

Prescriptive analysis – This type of analysis is concerned with enhancing the service levels of big data while reducing expenses. It uses optimization and randomized testing to provide better results. It is mainly useful for determining the cause-and-effect relationships in the analysis results and in optimization policies (Naganathan, 2018). Its focus is on making optimal decisions for future situations: the available choices are evaluated, and options are selected from them to derive future patterns or analytical reports. This analysis requires advanced techniques that are more sophisticated to implement and manage.

Diagnostic analysis – This type of analysis is used to find the root cause of issues or problems in big data technologies. It determines why a problem has occurred and helps identify the causes of events and the behavior of patterns associated with the data. In Intel processing chipsets, diagnosis is performed to identify defects when the system is subjected to big data analysis and processing. The issues are identified, and solutions are then devised to rectify them. Actions are taken according to the resulting report, which answers the question of why the problem happened.

Preemptive analysis – This type of analysis deals with the steps or strategies for precautionary actions that prevent further issues from arising in the technologies. It is a precautionary method for the data sources (Uthayasankar Sivarajah, 2016).

This section has explained the critical analysis of big data with respect to these technologies. The analysis varies according to the type of technology and the source of the data obtained from it.


3.3 Big data architecture solution

This section examines the big data architecture and the tools used to perform the analysis. Big data tools and techniques are widely used in real-time applications. Examples of big data tools include Apache Hadoop, MongoDB, Apache Spark, Apache Storm, Neo4j and Cassandra.

Figure 1: Big Data Process Architecture

The architecture forms the basis of any system. Big data analytical tools rely on a framework for systematic analysis of the data, and the complex nature of big data makes a proper architecture necessary. This section looks in detail at the Hadoop architecture, which is widely used in many enterprises. It is mostly chosen for its refined data management and because it processes large amounts of data at low cost (P.Joseph Charles, 2018).


Hadoop framework

Hadoop is an open-source project under the Apache Software Foundation (M. Sowmya, 2017). Data analysis involves distributed data processing methods, programming and data pre-processing components. The open-source tools for data management fall into the following categories: cluster management, data store, distributed file system, governance and security, and data ingestion. Both tools and applications are used to process big data. Hadoop is used in many organizations that store large volumes of data. It enhances the scalability and availability of big data analytics. It runs on clusters of multiple machines and uses simple programming models. It is an enhanced approach that replaces the traditional database management system. The framework is useful for handling massive volumes of data from storage without inconsistencies, and it is widely accepted in many organizations for big data analysis. Additional packages that are installed alongside Hadoop are collectively called the Hadoop ecosystem. It provides an effective and efficient solution for processing and storing data.

Hadoop has two core components: a distributed file system (HDFS) and MapReduce. The distributed file system stores the data across the cluster and keeps track of where it is located, and it needs only limited functionality for processing the data. MapReduce breaks a problem into sub-problems that are solved independently. The results of the sub-problems are then combined, and the ecosystem is built around these two components together with the components described in the following paragraphs.
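How a distributed file system spreads one large file over several machines can be sketched as follows. HDFS splits files into fixed-size blocks (128 MB by default) and keeps several copies of each block (three by default). The simple round-robin placement used here is an assumption made for illustration; it is not the placement policy HDFS actually applies, and the node names are hypothetical.

```python
import math

def plan_blocks(file_size_mb, nodes, block_size_mb=128, replication=3):
    """Split a file into blocks and assign each block to `replication` distinct nodes.

    Placement is plain round-robin, chosen only to keep the sketch short.
    """
    if replication > len(nodes):
        raise ValueError("replication factor cannot exceed the number of nodes")
    num_blocks = math.ceil(file_size_mb / block_size_mb)
    plan = {}
    for block_id in range(num_blocks):
        plan[block_id] = [
            nodes[(block_id + offset) % len(nodes)] for offset in range(replication)
        ]
    return plan

# Hypothetical four-node cluster storing a 300 MB file.
cluster_nodes = ["node1", "node2", "node3", "node4"]
plan = plan_blocks(300, cluster_nodes)
for block_id, holders in plan.items():
    print(f"block {block_id}: {holders}")
```

Because every block lives on three different machines, the loss of a single machine does not lose data, and several machines can read different blocks of the same file at the same time.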

Data storage layer – This layer stores the data and consists of HDFS, which acts as the main storage in the ecosystem. HBase is a column-oriented database that is used whenever read and write operations on the database are required. YARN is a resource management platform that allocates cluster resources and schedules jobs. Hive is a data warehouse platform used to query and manage large data sets. Avro is a specification used to serialize data and to manage remote procedure calls, which enables data to be exchanged from one program to another.

Data sources – The volume of data stored in cloud databases increases day by day, and so does its complexity. The ecosystem therefore uses multiple integration strategies to bring this enormous amount of data into the storage system. In data storage, the data is collected with the help of NoSQL or SQL databases. Data transformation prepares the data for loading into the processing phase and makes use of dedicated tools. In data processing, structured and unstructured data are combined to carry out real-time processing. Data analysis is done with the help of data warehouses. For unstructured data, new functionality is added to provide better integration. Analysis platforms must satisfy the performance requirements of the data sources. The results of the analysis are presented to users in a readable and accessible form.
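The flow from collection through transformation and analysis to presentation can be sketched as a small pipeline. Everything in this example is hypothetical: the raw records, the field names and the four stage functions. In a real deployment each stage would be handled by a dedicated tool rather than a few lines of Python.

```python
import json

# Hypothetical raw events as they might arrive from different sources.
RAW_EVENTS = [
    '{"unit": "U1", "temp_c": "71.5", "status": "PASS"}',
    '{"unit": "U2", "temp_c": "88.0", "status": "fail"}',
    'not valid json',
    '{"unit": "U3", "temp_c": "69.0", "status": "pass"}',
]

def collect(raw_lines):
    """Collection: parse each raw line and skip records that are not valid JSON."""
    records = []
    for line in raw_lines:
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError:
            continue
    return records

def transform(records):
    """Transformation: convert types and casing into one uniform layout."""
    return [
        {
            "unit": r["unit"],
            "temp_c": float(r["temp_c"]),
            "passed": r["status"].lower() == "pass",
        }
        for r in records
    ]

def analyze(rows):
    """Analysis: reduce the cleaned rows to a few headline figures."""
    passed = sum(r["passed"] for r in rows)
    return {
        "units": len(rows),
        "pass_rate_pct": round(100 * passed / len(rows), 1),
        "max_temp_c": max(r["temp_c"] for r in rows),
    }

def present(summary):
    """Presentation: express the result in a form a decision maker can read."""
    return (
        f"{summary['units']} units tested, {summary['pass_rate_pct']}% passed, "
        f"peak temperature {summary['max_temp_c']} C"
    )

summary = analyze(transform(collect(RAW_EVENTS)))
print(present(summary))
```

Keeping the stages separate means that a new data source or a new report format affects only one stage and leaves the rest of the pipeline unchanged.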

Governance and security – Because organizations store and analyze big data at scale, they should focus on governance and on securing the data sources against attack. The architectural framework must be strengthened with proper security systems that adhere to governance and security policies (Mert Onuralp Gökalp, 2017).

Another aspect of the architectural framework is the visualization of the data, in pictorial or graphical form, which helps decision makers identify patterns in the data source and understand complex concepts.


4.0 Conclusion

In conclusion, this report has examined big data analytics across various technologies. It defined the strategies and techniques used in big data that help in analyzing the data. Big data stores consist of complex data structures that require high-performance computation for their analysis. Big data is useful in many fields in the new era of technological development. Organizations that hold large volumes of data should use big data analytics so that their systems run without uncertainty. Big data uses various analysis methods to find the predictions or patterns associated with the data stored in the database. The analysis provides solutions for future reference, which will advance big data technology. An architecture is essential for performing any analysis in a sequential order: the data is processed in the order given in the architectural framework, and the output is obtained. The tools and techniques of big data analysis are easy to use and adhere to organizational procedures and policies. The tools used must be matched to the size of the big data held by the organization.


References

center, I. I. (2015). Getting started with the big data: planning guide. Intel.

M. Sowmya, N. S. (2017). Big Data: An Overview of Features, Tools, Techniques and Applications.

Mert Onuralp Gökalp, K. K. (2017). Big-Data Analytics Architecture for Businesses: a comprehensive review on new open-source big-data tools.

Naganathan, D. V. (2018). Comparative Analysis of Big Data, Big Data Analytics: Challenges and trends. https://www.irjet.net/archives/V5/i5/IRJET-V5I5373.pdf

Optimizing Manufacturing with the Internet of Things. (n.d.). https://www.intel.in/content/www/in/en/internet-of-things/white-papers/industrial-optimizing-manufacturing-with-iot-paper.html

P.Joseph Charles, S. T. (2018). Big Data – Concepts, Analytics, Architectures – Overview. https://www.irjet.net/archives/V5/i2/IRJET-V5I231.pdf

Taylor-Sakyi, K. (2016). Big Data: Understanding Big Data.

Uthayasankar Sivarajah, M. M. (2016). Critical analysis of Big Data challenges and analytical methods.

Youssra Riah, S. R. (2018). Big Data and Big Data Analytics: Concepts, Types and Technologies.
