
Online Learning in Big Data

  • Online learning is a subfield of machine learning
  • It allows supervised learning models to be scaled to enormous datasets
  • The general idea behind it is that it is enough to process one instance at a time; it is not necessary to keep all the data in memory in order to fit a model
  • This section shows how to implement an online learning algorithm using logistic regression. As in most supervised learning algorithms, the objective is to minimize a cost function.
  • The cost function of logistic regression is defined as follows
  • J(θ) = -(1/m) Σ_i [ y^(i) log(hθ(x^(i))) + (1 - y^(i)) log(1 - hθ(x^(i))) ]
  • J(θ) denotes the cost function and hθ(x) denotes the hypothesis. In logistic regression the hypothesis is defined using the following notation
  • hθ(x) = 1 / (1 + e^(-θ^T x))
  • Now that the cost function is defined, the next step is to find an algorithm that minimizes it
  • The simplest algorithm for achieving this is called stochastic gradient descent
  • The update rule of the algorithm for the weights of the logistic regression model is given below (a short code sketch of this procedure follows this list)
  • θ_j := θ_j - α (hθ(x^(i)) - y^(i)) x_j^(i)
  • Here, the Titanic dataset from a Kaggle competition is used as the working example
  • The source data is available in the directory bda/part3/vwfolder
    • One csv file contains the training data
    • Another csv file contains unlabeled data used to create new forecasts
  • The csv format has to be converted into the Vowpal Wabbit input format (a sketch of what this conversion looks like is given after this list)
  • Use the csv_to_vowpal_wabbit.py Python script for this
  • The script needs Python to run and is located in the folder bda/part3/vwfolder
  • In a terminal, run the command: python csv_to_vowpal_wabbit.py
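
As a minimal sketch of the update rule above, the following Python snippet performs online logistic regression with stochastic gradient descent, processing one instance at a time. The names (sigmoid, sgd_logistic_update, weights, learning_rate) and the tiny example stream are illustrative only and are not part of the Vowpal Wabbit workflow described in this section.

    import math

    def sigmoid(z):
        # Logistic function: maps any real-valued score into (0, 1)
        return 1.0 / (1.0 + math.exp(-z))

    def sgd_logistic_update(weights, x, y, learning_rate=0.1):
        # One online update from a single instance (x, y):
        # theta_j := theta_j - alpha * (h_theta(x) - y) * x_j
        prediction = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
        error = prediction - y
        return [w - learning_rate * error * xi for w, xi in zip(weights, x)]

    # Stream over instances one at a time; the full dataset never has to fit in memory.
    weights = [0.0, 0.0, 0.0]
    stream = [([1.0, 2.0, 0.5], 1), ([1.0, -1.0, 0.3], 0)]  # (features, label) pairs
    for x, y in stream:
        weights = sgd_logistic_update(weights, x, y)
    print(weights)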
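
The contents of csv_to_vowpal_wabbit.py are not reproduced here; purely as an illustration of what such a conversion does, the sketch below turns each row of a CSV file with numeric features and a binary label column into a Vowpal Wabbit input line of the form "label | name:value ...". The file names train.csv and train.vw and the label column "Survived" are assumptions for the example.

    import csv

    def row_to_vw(row, label_column):
        # Vowpal Wabbit's logistic loss expects labels encoded as -1 or 1.
        label = 1 if row[label_column] == "1" else -1
        # Numeric features are written as name:value; non-numeric columns would
        # need extra handling (e.g. encoding each category as its own feature).
        features = " ".join(
            f"{name}:{value}"
            for name, value in row.items()
            if name != label_column and value not in ("", None)
        )
        return f"{label} | {features}"

    with open("train.csv") as src, open("train.vw", "w") as dst:
        for row in csv.DictReader(src):
            dst.write(row_to_vw(row, "Survived") + "\n")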

The following shows the result of training the regression model from the command line. The output reports the average log loss and a short summary of the algorithm's performance.

online learning img4 online learning img5
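
The exact command and output are shown in the images above; as an assumption about file names (train_data.vw, model.vw) and using a minimal subset of Vowpal Wabbit's real options, a comparable training run could be launched from Python roughly as follows. Vowpal Wabbit prints its progress, including the average loss, to the terminal while it trains.

    import subprocess

    # Hypothetical invocation: train with logistic loss and save the model to model.vw.
    # More than one pass over the data requires a cache file (--cache_file).
    subprocess.run(
        [
            "vw", "-d", "train_data.vw",
            "--loss_function", "logistic",
            "--passes", "20",
            "--cache_file", "train.cache",
            "-f", "model.vw",
        ],
        check=True,
    )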

The trained model, stored in model.vw, is then used to produce forecasts for the newly available data.

online learning img6
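
Again assuming hypothetical file names (test_data.vw, model.vw, predictions.txt), producing raw forecasts with the trained model could look roughly like this: -t runs in test-only mode, -i loads the saved model, and -p writes the predictions to a file.

    import subprocess

    # Hypothetical invocation: score the new data with the trained model.
    subprocess.run(
        [
            "vw", "-d", "test_data.vw",
            "-t",                     # test-only mode: do not update the model
            "-i", "model.vw",         # load the trained model
            "-p", "predictions.txt",  # write raw predictions to this file
        ],
        check=True,
    )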

The forecasts produced by the above command are not normalized to the 0 to 1 range. To achieve this, a sigmoid transformation is applied.

online learning img7
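
A minimal Python sketch of this transformation, assuming the raw forecasts were written one per line to a file such as predictions.txt:

    import math

    def sigmoid(z):
        # Map a raw, unbounded score into the (0, 1) range.
        # Clamp the score first so math.exp cannot overflow for extreme values.
        z = max(min(z, 35.0), -35.0)
        return 1.0 / (1.0 + math.exp(-z))

    with open("predictions.txt") as f:
        # Each line holds a raw prediction (possibly followed by a tag).
        probabilities = [sigmoid(float(line.split()[0])) for line in f if line.strip()]

    print(probabilities[:10])  # the first few forecasts, now between 0 and 1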