Awesome
####Supervised learning
#####Linear Regression
- Linear Regression
- Assumptions
- Variance Inflation Factor
- T Test
- F Test
- Adj. R Squared vs R Squared
- MAPE (Mean Absolute Percentage Error)
- Interpreting Actual vs Predicted Plot
- Interpreting Residual vs Predicted Plot
File: LinearRegression.R
Required packages: car, lmtest, ggplot2
Input parameters:
c_path_in - path pointing to the input .csv file
c_path_out - output folder path
c_var_in_independent - one or more independent variables
c_var_in_dependent - one dependent variable
Outputs:
parameterEstimates (Data Frame) - Contains estimates of all the independent variables used for building the model & is exported as Estimates.csv to the location "c_path_out"
modelStatistic (Data Frame) - Contains various model statistics & is exported as ModelStatistic.csv to the location "c_path_out"
durbinWatsonTest (List) - Contains the Durbin-Watson statistic used for testing autocorrelation of the residuals
goldfledQuantdtTest (List) - Contains the Goldfeld-Quandt statistic used for testing homoscedasticity
ActualPredicted.png file is exported to the location "c_path_out"
ResidualPredicted.png file is exported to the location "c_path_out"
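
The model statistics listed above can be reproduced by hand in base R. The following is only a minimal sketch, not the contents of LinearRegression.R: it assumes the built-in `mtcars` data set in place of the input .csv file, and every variable name in it is illustrative.

```r
# Fit a linear model on the built-in mtcars data (illustrative input only).
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Parameter estimates with standard errors, t values and p values.
estimates <- as.data.frame(summary(fit)$coefficients)

# R squared never decreases when predictors are added; adjusted R squared
# penalises each additional predictor.
n <- nrow(mtcars)
p <- length(coef(fit)) - 1          # number of predictors
r2 <- summary(fit)$r.squared
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Mean absolute percentage error between actual and predicted values.
actual <- mtcars$mpg
predicted <- fitted(fit)
mape <- mean(abs((actual - predicted) / actual)) * 100

# Durbin-Watson statistic computed directly from the residuals.
e <- residuals(fit)
dw <- sum(diff(e)^2) / sum(e^2)

# Variance inflation factor for wt: 1 / (1 - R^2), where R^2 comes from
# regressing wt on the remaining predictors.
vif_wt <- 1 / (1 - summary(lm(wt ~ hp, data = mtcars))$r.squared)
```

A Durbin-Watson value near 2 suggests little first-order autocorrelation, and a variance inflation factor close to 1 suggests little collinearity for that predictor.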
#####Logistic Regression
- Logistic Regression
- Chi-Square Test
- Model Evaluation Metrics (Gains Chart, Lift Chart, K-S Test, Confusion Matrix, ROC)
- Hosmer-Lemeshow Test
File: LogisticRegression.R
Required packages: car, ResourceSelection, ggplot2
Input parameters:
c_path_in - path pointing to the input .csv file
c_path_out - output folder path
c_var_in_independent - one or more independent variables
c_var_in_dependent - one binary dependent variable
x_val_event - level of the dependent variable that represents the event
Outputs:
parameterEstimate (Data Frame) - Contains estimates of all the independent variables used for building the model & is exported as Estimate.csv to the location "c_path_out"
modelStatistic (Data Frame) - Contains various model statistics & is exported as ModelStatistic.csv to the location "c_path_out"
hosmerLemeshowTest (List) - Contains statistic for testing goodness of fit
ks_out (Data Frame) - K-S table for measuring the performance of the classification model
GainsChart.png is exported to the location "c_path_out"
LiftChart.png is exported to the location "c_path_out"
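
The K-S statistic, gains and lift all come from ranking observations by predicted probability. The following is a rough sketch of that idea in base R, not the code in LogisticRegression.R: it assumes the built-in `mtcars` data with `am` as the binary dependent variable, and all names are illustrative.

```r
# Fit a logistic model: am (0/1) is the binary dependent variable.
fit <- glm(am ~ wt, data = mtcars, family = binomial)
score <- fitted(fit)                 # predicted probability of the event
event <- mtcars$am == 1              # TRUE where the event level occurs

# Rank observations from highest to lowest score
# (tied scores are not treated specially in this sketch).
ord <- order(score, decreasing = TRUE)
event_sorted <- event[ord]

# Cumulative share of events and non-events captured down the ranking.
cum_event <- cumsum(event_sorted) / sum(event_sorted)
cum_nonevent <- cumsum(!event_sorted) / sum(!event_sorted)

# K-S statistic: maximum separation between the two cumulative curves.
ks <- max(abs(cum_event - cum_nonevent))

# Gains is the cumulative event share at each depth of the ranking;
# lift is that share relative to random targeting.
depth <- seq_along(event_sorted) / length(event_sorted)
gains <- cum_event
lift <- gains / depth
```

Plotting `gains` and `lift` against `depth` gives the shape of a gains chart and a lift chart. A production version would usually group the observations into deciles first.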
File: LogisticModelAnalysis.R
Input parameters:
c_path_out - output folder path
modelObj - Logistic Model Object
Description:
Contains a function that generates a confusion matrix for every cut point between 0.01 and 0.99. The ROC curve and the sensitivity-specificity curve are plotted and exported as ROC.png and Sensitivity-Specificity.png, respectively, to the location "c_path_out"
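
The sweep over cut points can be sketched as below. This is an assumption about the general approach rather than the function in LogisticModelAnalysis.R: it uses a model fitted on the built-in `mtcars` data as a stand-in for `modelObj`, and `metrics` and the other names are hypothetical.

```r
# Stand-in for modelObj: a logistic model fitted on mtcars.
fit <- glm(am ~ wt, data = mtcars, family = binomial)
score <- fitted(fit)                 # predicted probabilities
actual <- mtcars$am                  # observed 0/1 outcome

cutoffs <- seq(0.01, 0.99, by = 0.01)

# For each cut point, classify the observations and count the four
# confusion matrix cells, then derive sensitivity and specificity.
metrics <- do.call(rbind, lapply(cutoffs, function(cut) {
  predicted <- as.integer(score >= cut)
  tp <- sum(predicted == 1 & actual == 1)
  fp <- sum(predicted == 1 & actual == 0)
  tn <- sum(predicted == 0 & actual == 0)
  fn <- sum(predicted == 0 & actual == 1)
  data.frame(cutoff = cut, tp = tp, fp = fp, tn = tn, fn = fn,
             sensitivity = tp / (tp + fn),
             specificity = tn / (tn + fp))
}))
```

Plotting `metrics$sensitivity` against `1 - metrics$specificity` traces the ROC curve. Plotting both columns against `metrics$cutoff` gives the sensitivity-specificity curve.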
#####Bayesian Belief Network
Understanding Bayesian Network using bnlearn - R Package, 1 Scoring
#####Naive Bayes
####Classification
#####EM Clustering
####Ensemble
#####Random Forest
####Recommender System
Introduction
####Factor Analysis
#####Mutual Information
#####Linear Discriminant Analysis
File: Distance.R
Description:
Contains various methods for calculating the distance between two vectors
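
As an illustration of what such methods might look like, here is a small base R sketch of three common distance measures. The function names are hypothetical and are not taken from Distance.R.

```r
# Euclidean distance: square root of the summed squared differences.
euclidean <- function(x, y) sqrt(sum((x - y)^2))

# Manhattan distance: sum of the absolute differences.
manhattan <- function(x, y) sum(abs(x - y))

# Cosine distance: 1 minus the cosine of the angle between the vectors.
cosine_distance <- function(x, y) {
  1 - sum(x * y) / (sqrt(sum(x^2)) * sqrt(sum(y^2)))
}

a <- c(1, 2, 3)
b <- c(4, 6, 3)
euclidean(a, b)   # 5
manhattan(a, b)   # 7
```

For Euclidean and Manhattan distances, base R's `dist()` gives the same results when the two vectors are bound as rows of a matrix.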