
Life Cycle of Data

Life Cycle of Conventional Data Mining

  • To extract clear insight from Big Data, it helps to picture its life cycle as a series of distinct stages
  • All the stages in the life cycle are linked with one another
  • This cycle resembles the conventional data mining cycle
  • The conventional cycle is described by the CRISP-DM methodology

CRISP Data Mining Methodology

  • CRISP-DM stands for Cross-Industry Standard Process for Data Mining. It is a cyclic process model that describes the approaches data mining experts use to tackle problems in conventional business intelligence data mining

Stages of the CRISP-DM Life Cycle

  • It was introduced in 1996. It specifies how a data mining project is to be structured

The stages in the CRISP-DM life cycle are as follows

Business Understanding: It concentrates on identifying the main goals and requirements of the project from a business perspective, and then translating them into a data mining problem. A preliminary plan and decision model are designed

Data Understanding: It starts with data collection, goes on to discover first insights into the data, and forms hypotheses about hidden information

Data Preparation: It covers all the work needed to construct the final dataset. These tasks are usually performed several times and in no fixed order. They include transforming and cleaning the data for the modelling tools

Modelling: It involves selecting and applying various modelling techniques

Evaluation: It checks whether the model is of sufficient quality. It also decides how the data mining results are to be used

Deployment: It involves generating a report and presenting the results in a form that is useful to the end user or customer
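As a small illustration of the Data Preparation stage, the following sketch cleans and transforms a few raw records in plain Python. The field names (`age`, `income`), the sample records and the choice of median imputation are assumptions made for this example; CRISP-DM itself does not prescribe any particular cleaning technique.

```python
from statistics import median

# Hypothetical raw records: values arrive as strings, some are missing.
raw_records = [
    {"age": "34", "income": "52000"},
    {"age": "", "income": "61000"},
    {"age": "45", "income": None},
    {"age": "29", "income": "48000"},
    {"age": "34", "income": "52000"},  # exact duplicate of the first record
]


def to_number(value):
    """Convert a raw field to float, or None when it is missing."""
    if value is None or str(value).strip() == "":
        return None
    return float(value)


def prepare(records, fields):
    """Clean raw records: drop duplicates, parse numbers, impute medians."""
    # 1. Remove exact duplicates while preserving the original order.
    seen, unique = set(), []
    for rec in records:
        key = tuple(rec.get(f) for f in fields)
        if key not in seen:
            seen.add(key)
            unique.append(rec)

    # 2. Transform: parse every field into a number (or None if missing).
    parsed = [{f: to_number(rec.get(f)) for f in fields} for rec in unique]

    # 3. Clean: replace missing values with the median of the known values.
    #    (Assumes every field has at least one known value.)
    for f in fields:
        known = [row[f] for row in parsed if row[f] is not None]
        fill = median(known)
        for row in parsed:
            if row[f] is None:
                row[f] = fill
    return parsed


clean_records = prepare(raw_records, ["age", "income"])
```

After this step every record holds numeric values only, which is the form most modelling tools expect. In a real project the imputation rule would be chosen during Data Understanding rather than fixed in advance.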

SEMMA Methodology

  • It is a data mining methodology developed by SAS
  • The name stands for Sample, Explore, Modify, Model, Assess

The stages are explained below

Sample: It involves selecting a large data set for modeling. It also involves partitioning the data into separate subsets

Explore: It involves identifying anticipated and unanticipated relationships by visualizing the data

Modify: It prepares the data for modeling, using methods to create, select and transform variables

Model: Various modeling methods are tried in order to produce the desired outcome

Assess: The effectiveness and usefulness of the models are evaluated based on their results
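The five SEMMA stages can be sketched end to end on a toy problem. The example below is only an illustration in plain Python: the synthetic data, the 40/20 split, the Pearson correlation used for exploration, the centring transform and the least-squares line are all assumptions chosen to keep the sketch short, not techniques mandated by SEMMA.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable

# Synthetic data (assumption): y depends roughly linearly on x, plus noise.
data = [(x, 3.0 * x + 5.0 + random.gauss(0, 1)) for x in range(100)]

# Sample: draw a subset and separate it into training and validation parts.
sample = random.sample(data, 60)
train, valid = sample[:40], sample[40:]


# Explore: measure how strongly x and y are related (Pearson correlation).
def pearson(pairs):
    xs, ys = zip(*pairs)
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


correlation = pearson(train)

# Modify: transform the input variable by centring it on the training mean.
x_mean = mean(x for x, _ in train)


def centre(pairs):
    return [(x - x_mean, y) for x, y in pairs]


train_c, valid_c = centre(train), centre(valid)


# Model: fit y = slope * x + intercept by ordinary least squares.
def fit_line(pairs):
    xs, ys = zip(*pairs)
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in pairs) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


slope, intercept = fit_line(train_c)

# Assess: mean squared error on the held-out validation part.
mse = mean((y - (slope * x + intercept)) ** 2 for x, y in valid_c)
```

The validation data is never used for fitting, so the error measured in the Assess step reflects how the model behaves on data it has not seen.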

The major distinction between CRISP-DM and SEMMA is as follows

  • CRISP-DM concentrates on understanding and preprocessing the data to be utilized
  • SEMMA concentrates on the modeling aspect

Big Data Life Cycle

The life cycle of Big Data includes the following stages

  • Business Problem Definition: Evaluating the problem, the expected gains and the costs of the project
  • Research: Identifying solutions that are reasonable for the problem
  • Assessment of Human Resources: Evaluating the HR requirements for the project
  • Acquiring Data: Collecting unstructured data from different sources
  • Data Munging: Converting the acquired data into an easy-to-use format, which is the basis for storing it
  • Data Storing: Storing the processed data on a suitable platform, e.g. Hadoop, MongoDB, Redis, Spark
  • Exploratory Data Analysis: Understanding the data, mainly by plotting it
  • Data Preparation for Modeling and Assessment: Reshaping the cleansed data, which includes outlier detection, feature extraction, feature selection and normalization
  • Modeling: Trying various models and providing a business solution based on their performance
  • Implementation: Deploying the product in the company
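Two of the data preparation tasks listed above, outlier detection and normalization, can be sketched as follows. The interquartile-range rule with a factor of 1.5 and min-max scaling are common choices assumed here for illustration; the text does not name specific techniques, and the `readings` values are made up.

```python
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return (low, high) limits using the interquartile-range rule."""
    q1, _, q3 = quantiles(values, n=4)  # first, second and third quartile
    spread = q3 - q1
    return q1 - k * spread, q3 + k * spread


def remove_outliers(values, k=1.5):
    """Keep only the values that fall inside the IQR limits."""
    low, high = iqr_bounds(values, k)
    return [v for v in values if low <= v <= high]


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is nothing to scale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical sensor readings with one obviously wrong value.
readings = [12.0, 14.0, 13.5, 15.0, 11.0, 14.5, 13.0, 250.0]

kept = remove_outliers(readings)   # drops 250.0
scaled = min_max_normalize(kept)   # remaining values mapped to [0, 1]
```

Removing outliers before normalizing matters here: if 250.0 were kept, min-max scaling would squeeze all the ordinary readings into a narrow band near zero.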