lcrew001
Exoplanets
53 posts
Computing project
lcrew001 · 4 years ago
Final Submissions
Final report finished; word count: 11,367.
Video and presentation made for the showcase.
Submission Complete
lcrew001 · 4 years ago
Final Results and Report
Evaluation of final results on the test data completed for all algorithms and added to the report.
lcrew001 · 4 years ago
Final Report
Corrections have been made to the draft report, and progress has been made on developing the final submission.
lcrew001 · 4 years ago
Draft Report
Draft report structured and planned out, and the write-up has already been started. So far it mainly covers the data, the project as a whole, and the 5 ML classifiers. The Neural Network and LSTM still need work, so they probably will not be covered in the draft report.
lcrew001 · 4 years ago
Classifiers Organised
Implemented standard scaling of the data for classifiers that calculate distances or use gradient descent. 5 good classifier models made with visualisations, and an optimised grid search implemented.
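A minimal sketch of this setup with scikit-learn, using synthetic stand-in data; the estimator, grid values, and dataset here are illustrative, not the project's actual ones:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the exoplanet feature set.
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Scaling is fitted inside the pipeline so the grid search never
# leaks test-fold statistics into the training folds.
pipe = Pipeline([("scale", StandardScaler()), ("svc", SVC())])

# A deliberately small grid; a real search space would be larger.
grid = {"svc__C": [1, 10, 30], "svc__gamma": ["scale", 0.01]}
search = GridSearchCV(pipe, grid, cv=3)
search.fit(X_train, y_train)
print(search.best_params_, round(search.score(X_test, y_test), 3))
```

Putting the scaler inside the pipeline means each classifier that needs scaling gets it automatically during cross-validation.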
lcrew001 · 4 years ago
Basic Report Draft Information
Introduction to project.
Communicate steps for someone to repeat the process of obtaining data.
Talk about models and the order of tuning parameters (general workflow).
How the classifiers work and what attributes they have that could help.
Table of results and my best models and parameters.
Evaluation of each model in general; compare the best models and discuss the overall best one.
Evaluate ROC Plots of models.
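The ROC evaluation step above can be sketched with scikit-learn; the data and the two models here are illustrative stand-ins, not the project's tuned classifiers:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Compare models by the area under their ROC curves; fpr/tpr pairs
# are what the report's ROC plots would be drawn from.
for name, model in [("LogReg", LogisticRegression(max_iter=1000)),
                    ("Forest", RandomForestClassifier(random_state=0))]:
    scores = model.fit(X_train, y_train).predict_proba(X_test)[:, 1]
    fpr, tpr, _ = roc_curve(y_test, scores)
    print(name, round(auc(fpr, tpr), 3))
```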
lcrew001 · 4 years ago
Random Forest
This classifier is showing the best results so far, with accuracy scores of over 90%; with 100 trees the result reached 94.57% accuracy.
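A minimal sketch of a 100-tree forest as described above, on synthetic stand-in data (the real features and the 94.57% figure are from the project data, not reproduced here):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 100 trees, matching the best-performing run described in the post.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)
print(round(forest.score(X_test, y_test), 3))
```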
lcrew001 · 4 years ago
Logistic Regression
Only a base model has been made for this classifier. So far it is achieving around 60% accuracy.
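A base model here would just be default parameters behind a scaler; a sketch on synthetic stand-in data:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Baseline: default logistic regression, scaled inputs, no tuning yet.
base = make_pipeline(StandardScaler(), LogisticRegression())
base.fit(X_train, y_train)
print(round(base.score(X_test, y_test), 3))
```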
lcrew001 · 4 years ago
K-Nearest Neighbour Progress
The KNN classifier is getting a maximum of 67.8% accuracy with 5 neighbours.
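The 5-neighbour setup can be sketched as below, on synthetic stand-in data; KNN is distance-based, so the scaler comes first:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 5 neighbours, as in the best run described above.
knn = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
knn.fit(X_train, y_train)
print(round(knn.score(X_test, y_test), 3))
```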
lcrew001 · 4 years ago
LSTM and Colab
Decided to use Google Colab for running the LSTM due to its high hardware demands.
lcrew001 · 4 years ago
SVM Test Scores
I tested the final support vector machine classifier (kernel = ‘rbf’, C = 30).
Accuracy = 85.2%
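The tested configuration (rbf kernel, C = 30) looks like this in scikit-learn; the data here is a synthetic stand-in, so the score will not match the 85.2% above:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Final SVM configuration from the post: RBF kernel, C = 30.
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=30))
svm.fit(X_train, y_train)
print(round(svm.score(X_test, y_test), 3))
```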
lcrew001 · 4 years ago
Reformed SVM
Created many different models to find the best parameters for the support vector classifier, and after an intensive search I believe I have found them. The best kernel was sigmoid and the best C was 30; the other parameters did not improve on their default values. Combining these parameters gave a result of 75.3%. However, in a previous experiment, setting only C = 30 (with the default rbf kernel) gave a score of around 78%. I also looked at the confusion matrices and saw that rbf was much better at identifying true positives, just like increasing the C value, whereas sigmoid was good at identifying true negatives but not much else. This made me choose the default rbf kernel, as I thought it would pair better with C = 30, which had already scored around 78% together.
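The kernel comparison via confusion matrices can be sketched like this, on synthetic stand-in data; the actual matrices behind the decision above came from the project data:

```python
from sklearn.datasets import make_classification
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Compare kernels on their confusion matrices, not accuracy alone:
# rows are true classes, columns predictions, so true negatives sit
# top-left and true positives bottom-right.
for kernel in ("rbf", "sigmoid"):
    model = make_pipeline(StandardScaler(), SVC(kernel=kernel, C=30))
    preds = model.fit(X_train, y_train).predict(X_test)
    print(kernel)
    print(confusion_matrix(y_test, preds))
```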
lcrew001 · 4 years ago
SVM Manual GridSearch
As mentioned before, looping through different combinations or using a grid search takes a very long time given the number of features. I decided to run different combinations separately to have more control over how long checking for the best parameters takes. Currently testing for the best kernel, and even that one search is taking a while. Running a full grid search over all possible combinations does not seem realistic.
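Searching one parameter at a time might look like the sketch below (synthetic stand-in data; the kernel candidates are illustrative): each run fits only a handful of models instead of the full cross-product.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# One small, self-contained search: best kernel only, other
# parameters left at their defaults for now.
best_kernel = max(
    ("linear", "rbf", "sigmoid"),
    key=lambda k: cross_val_score(SVC(kernel=k), X, y, cv=3).mean(),
)
print(best_kernel)
```

The trade-off is that parameters tuned separately can interact, which is exactly what the rbf-vs-sigmoid result above showed.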
lcrew001 · 4 years ago
LSTM Supervisor Tips
Increase the dense layer size considerably (bigger than the width of the features). Try alternative loss functions. Output to 2 dense layers instead of 1.
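A sketch of those tips in Keras, reading the last one as a 2-unit softmax output instead of a single sigmoid; the layer sizes and the input shape (one flux value over 3197 timesteps, a common Kepler light-curve format) are assumptions, not the project's actual architecture:

```python
from tensorflow import keras

# Assumed input shape: 3197 flux readings, 1 feature per timestep.
timesteps, n_features = 3197, 1

model = keras.Sequential([
    keras.layers.Input(shape=(timesteps, n_features)),
    keras.layers.LSTM(64),
    # Dense layer considerably wider than the feature width.
    keras.layers.Dense(128, activation="relu"),
    # 2 output units (one per class) instead of a single unit,
    # paired with an alternative loss below.
    keras.layers.Dense(2, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
print(model.output_shape)
```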
lcrew001 · 4 years ago
Visualisations
I have planned out how I would like to use visualisations in my report, and how to build a story with the Decision Tree and SVM, showing improved performance and changes across the graphs.
lcrew001 · 4 years ago
SVM Update
I added an older technique for finding the best parameters: nested for-loops through the parameter lists. However, I realised it has a very bad time complexity of O(n^5), with n being a big number, meaning it would take a long time to run. I also implemented a much newer technique, grid search, which looks for the best parameters the same way but will take a long time too.
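To see why five nested loops blow up, count the combinations; the parameter lists below are illustrative SVM-style values, not the project's actual grid:

```python
from itertools import product

# Five nested for-loops over parameter lists means fitting one model
# per combination: with n values per parameter that is O(n^5) fits.
kernels = ["linear", "poly", "rbf", "sigmoid"]
Cs = [0.1, 1, 10, 30]
gammas = ["scale", "auto", 0.01, 0.1]
degrees = [2, 3, 4, 5]
coef0s = [0.0, 0.5, 1.0, 2.0]

combos = list(product(kernels, Cs, gammas, degrees, coef0s))
print(len(combos))  # → 1024 model fits for just 4 values per parameter
```

GridSearchCV iterates the same cross-product, so it inherits the same cost; it only tidies up the bookkeeping and cross-validation.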
lcrew001 · 4 years ago
Previous Week
I aimed to work on and improve the performance of the LSTM. However, after a lot of hyperparameter tuning I failed to make any real progress with it, so I changed my focus to working on the SVM and building visualisations to assist the research for my report.