#ntree
Text
Assignment: Running a Random Forest
I am an R user, so I conducted the assignment in R instead of SAS or Python.
Load packages
library(randomForest)
library(caret)
library(ggplot2)
library(readr)
library(dplyr)
library(tidyr)
Load the dataset
AH_data <- read_csv("tree_addhealth.csv")
data_clean <- AH_data %>% drop_na()
Examine data
str(data_clean)
summary(data_clean)
Define predictors and target
predictors <- data_clean %>%
  select(BIO_SEX, HISPANIC, WHITE, BLACK, NAMERICAN, ASIAN, age, ALCEVR1,
         ALCPROBS1, marever1, cocever1, inhever1, cigavail, DEP1, ESTEEM1,
         VIOL1, PASSIST, DEVIANT1, SCHCONN1, GPA1, EXPEL1, FAMCONCT,
         PARACTV, PARPRES)
target <- data_clean$TREG1
Split into training and testing sets
set.seed(123)
split <- createDataPartition(target, p = 0.6, list = FALSE)
pred_train <- predictors[split, ]
pred_test <- predictors[-split, ]
tar_train <- target[split]
tar_test <- target[-split]
Train random forest model
set.seed(123)
rf_model <- randomForest(x = pred_train, y = as.factor(tar_train), ntree = 25)
rf_pred <- predict(rf_model, pred_test)
Confusion matrix and accuracy
conf_matrix <- confusionMatrix(rf_pred, as.factor(tar_test))
print(conf_matrix)
Feature importance
importance(rf_model)
varImpPlot(rf_model)
Accuracy for different number of trees
trees <- 1:25
accuracy <- numeric(length(trees))
for (i in trees) {
  rf_temp <- randomForest(x = pred_train, y = as.factor(tar_train), ntree = i)
  pred_temp <- predict(rf_temp, pred_test)
  accuracy[i] <- mean(pred_temp == tar_test)
}
Plot accuracy vs number of trees
accuracy_df <- data.frame(trees = trees, accuracy = accuracy)
ggplot(accuracy_df, aes(x = trees, y = accuracy)) +
  geom_line(color = "blue") +
  labs(title = "Accuracy vs. Number of Trees", x = "Number of Trees", y = "Accuracy") +
  theme_minimal()

I conducted a random forest analysis to evaluate the importance of a variety of categorical and continuous explanatory variables on a categorical outcome variable: being a regular smoker. The five explanatory variables with the highest importance in predicting regular smoking were: ever having used marijuana, age, deviant behaviour, GPA, and school connectedness. The accuracy of the random forest was 83%, which was achieved within 3 trees. Growing additional trees did not add much to the overall accuracy of the model, suggesting a small number of trees is sufficient for identifying the important explanatory variables.
0 notes
Text
If you were called Ree, your name would not only mean wild, fierce, outrageous, overexcited, frenzied, delirious and crazy
But your nickname could be c*ntree
Like omg
To any of the Rees out there, I now love you simply for your name and I think you should take this new nickname
Thank you
Yours faithfully,
wedjenowif
0 notes
Text
Random Forest Classification using Lending Club data

1. What is Lending Club?
2. Decision trees and Random Forest Classification
3. Data
4. Exploratory data analysis
5. Setting up the data for the model
6. Building the model and evaluating performance
1. What is Lending Club?
Lending Club is a peer-to-peer lending company headquartered in San Francisco. The company connects people who need money (borrowers) with people who have money (investors) through its online marketplace. Investors, who are looking for a solid return on their investment, purchase Notes, which are fractions of loans. Borrowers, who need loans for various reasons such as consolidating debt, improving a home or making a major purchase, can apply for a loan by creating an account on Lendingclub.com and submitting a loan application that states the amount requested. Lending Club screens the borrowers, facilitates the transaction and services the loan. Borrowers repay the loan by making monthly payments to Lending Club.
2. Decision trees and Random Forest Classification
Decision trees and random forest classifiers help us classify our data into categories: for example, whether a customer has made a purchase (yes or no), or a person's gender (male or female). In this project we try to predict whether or not the borrower repays the loan. The goal of a decision tree model is to predict the value of the target variable based on several input variables. Intuitively, the decision tree algorithm asks a question about one attribute of the input; each answer leads to a follow-up question, until the tree reaches a conclusion about the observation.
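To make the question-asking idea concrete, here is a minimal hand-written sketch of a tiny decision tree for a borrower. The attribute names follow the data dictionary later in this post, but the thresholds and answers are made up for illustration; a real tree learns them from the training data.

# A hand-written stand-in for a fitted decision tree.
# The thresholds below are hypothetical, purely for illustration.
def tiny_tree(borrower):
    # First question: is the FICO credit score low?
    if borrower["fico"] < 660:
        # Follow-up question: is the debt-to-income ratio high?
        if borrower["dti"] > 20:
            return "not fully paid"
        return "fully paid"
    # Otherwise, ask about recent credit inquiries instead.
    if borrower["inq.last.6mths"] > 3:
        return "not fully paid"
    return "fully paid"

print(tiny_tree({"fico": 640, "dti": 25, "inq.last.6mths": 1}))  # "not fully paid"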
Random forest classification is an ensemble algorithm: rather than fitting a single decision tree, it combines many of them. We start by drawing a random sample of rows from the training data and build a decision tree on that sample, considering only a random subset of the variables at each split. We select the number of trees (Ntrees) we want to build and repeat the previous steps. For any new observation, we let the Ntrees vote on the category to which the observation belongs and assign it to the category with the majority of votes.
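As a rough sketch of that procedure in Python (scikit-learn and NumPy on synthetic data; the variable names and the choice of 25 trees are illustrative, and in practice scikit-learn's RandomForestClassifier packages all of this up for you):

import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the training data.
X, y = make_classification(n_samples=500, n_features=8, random_state=0)

ntrees = 25
rng = np.random.default_rng(0)
forest = []
for _ in range(ntrees):
    # Bootstrap-sample the training rows and grow a tree on them;
    # max_features="sqrt" restricts each split to a random subset of variables.
    rows = rng.integers(0, len(X), size=len(X))
    forest.append(DecisionTreeClassifier(max_features="sqrt").fit(X[rows], y[rows]))

# For new observations, let all ntrees vote and take the majority class.
votes = np.array([tree.predict(X[:5]) for tree in forest])
print((votes.mean(axis=0) > 0.5).astype(int))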
3. Data
We use lending data from 2007-2010 and try to classify and predict whether or not the borrower paid back their loan in full. The data is publicly available on lendingclub.com. Here is what the columns represent:
a. credit.policy: 1 if the customer meets the credit underwriting criteria of LendingClub.com, and 0 otherwise.
b. purpose: The purpose of the loan (takes values "credit_card", "debt_consolidation", "educational", "major_purchase", "small_business", and "all_other").
c. int.rate: The interest rate of the loan, as a proportion (a rate of 11% would be stored as 0.11). Borrowers judged by LendingClub.com to be more risky are assigned higher interest rates.
d. installment: The monthly installments owed by the borrower if the loan is funded.
e. log.annual.inc: The natural log of the self-reported annual income of the borrower.
f. dti: The debt-to-income ratio of the borrower (amount of debt divided by annual income).
g. fico: The FICO credit score of the borrower.
h. days.with.cr.line: The number of days the borrower has had a credit line.
i. revol.bal: The borrower's revolving balance (amount unpaid at the end of the credit card billing cycle).
j. revol.util: The borrower's revolving line utilization rate (the amount of the credit line used relative to total credit available).
k. inq.last.6mths: The borrower's number of inquiries by creditors in the last 6 months.
l. delinq.2yrs: The number of times the borrower had been 30+ days past due on a payment in the past 2 years.
m. pub.rec: The borrower's number of derogatory public records (bankruptcy filings, tax liens, or judgments).
4. Exploratory data analysis
After performing some exploratory data analysis, we observed that:
a. People with low FICO scores tend not to meet the credit underwriting criteria of Lending Club.
b. The majority of borrowers are still in the process of repaying their loans.
c. Debt consolidation is a popular reason for taking out a loan.
d. As FICO score increases, credit quality improves and the interest rate on the loan decreases.
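A minimal sketch of how such plots might be produced with pandas and matplotlib (the DataFrame name loans and the file name loan_data.csv are assumptions; the column names follow the data dictionary above):

import pandas as pd
import matplotlib.pyplot as plt

loans = pd.read_csv("loan_data.csv")  # hypothetical file name

# FICO distribution, split by whether the borrower meets the credit policy.
loans[loans["credit.policy"] == 1]["fico"].hist(bins=35, alpha=0.6, label="meets policy")
loans[loans["credit.policy"] == 0]["fico"].hist(bins=35, alpha=0.6, label="does not")
plt.xlabel("FICO score")
plt.legend()
plt.show()

# How common each loan purpose is.
loans["purpose"].value_counts().plot(kind="bar")
plt.show()

# Higher FICO scores should pair with lower interest rates.
loans.plot.scatter(x="fico", y="int.rate", alpha=0.3)
plt.show()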
5. Setting up the data for the model
As there are some categorical features in the data, we use pandas' ability to create dummy variables so that scikit-learn can understand them. We then use scikit-learn to split the data into training and test sets: we build the model on the training set and evaluate its performance on the test set. Usually such a split is 70:30 in ratio.
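A sketch of that setup step (the target column name not.fully.paid and the file name are assumptions; purpose is the categorical column from the data dictionary above):

import pandas as pd
from sklearn.model_selection import train_test_split

loans = pd.read_csv("loan_data.csv")  # hypothetical file name

# purpose is categorical, so expand it into 0/1 dummy columns.
final_data = pd.get_dummies(loans, columns=["purpose"], drop_first=True)

X = final_data.drop("not.fully.paid", axis=1)  # assumed target column
y = final_data["not.fully.paid"]

# The usual 70:30 train/test split.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=101)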
6. Building the model and evaluating performance
We first build the decision tree model by importing DecisionTreeClassifier from sklearn.tree, fit the classifier on the training data, and predict the results for the test data. Since ensemble methods combine several machine learning models to deliver better predictions, we also try a random forest classifier to get better prediction accuracy: as we have seen above, random forest classification uses multiple decision trees to improve prediction accuracy.
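A sketch of both models, continuing from the split above (the n_estimators value is an illustrative choice):

from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix

# A single decision tree first.
dtree = DecisionTreeClassifier().fit(X_train, y_train)
tree_preds = dtree.predict(X_test)
print(confusion_matrix(y_test, tree_preds))
print(classification_report(y_test, tree_preds))

# Then a random forest, which typically improves on the single tree.
rfc = RandomForestClassifier(n_estimators=300).fit(X_train, y_train)
rfc_preds = rfc.predict(X_test)
print(confusion_matrix(y_test, rfc_preds))
print(classification_report(y_test, rfc_preds))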
About Rang Technologies: Headquartered in New Jersey, Rang Technologies has spent over a decade delivering innovative solutions and top talent to help businesses get the most out of the latest technologies in their digital transformation journey.
0 notes
Photo

Roscoe Riley
Roscoe the raccoon, just taking a walk through the forest, paying attention to every step he takes, all the sounds, the scent of the mud, trees and grass. Enjoying some time alone, deep in his own thoughts. Pic for Backlash's Patreon reward (thank you for the support!). Digital, Photoshop.
116 notes
Text
Fans call RRR actor Olivia Morris, who played Jr NTR's lady-love Jennifer, a surprise package. Read her thank-you note
You may have been floored by the hook step of "Naatu Naatu" (Dance Dance) from RRR, but if there is one actor who could come close to the nearly impossible footwork of Jr NTR and Ram Charan, it is Olivia Morris. Olivia played Jennifer (Jenny), Jr NTR's on-screen love, in the SS Rajamouli magnum opus. So, when Jr NTR's Bheem recalls the lines "Don't call me memsaab. It's just Jenny", it…
#Who is RRR's Jennifer?#SS Rajamouli#RRR Jennifer#RRR#RRR cast#RRR Jenny#RRR British actor#Olivia RRR#Olivia Morris#Olivia Morris Jennifer RRR#comb#Junior NTR#Junior NTREE#Jennifer RRR#RRR heroine#Ram Charan
0 notes
Photo

"NTREE GARDEN VILLA & VILLA PLOTS"
Entrance arch with security cabin
Streetlights
Drainage
Underground EB line
Water connection to each garden villa plot
Green atmosphere
Amenities
Security
Planned avenue plantation
30-feet blacktopped road
Overhead tank
Solar system
Compound wall
Premium gated community villas between Sriperumbudur and Thiruvallur, starting from Rs. 17 lakhs. The site is located close to NH4, surrounded by well-known MNCs, and well connected by government buses and shuttle services. Features of the site: grand entrance arch, 30-feet blacktopped road, street lights, landscaping with avenue plantations, potable water facility, kids' play area, jogging track and convenience store. For more details contact: S.Rajkumar +91 79043 50682. #investmentproperty #realestate #investment #realtor #realestateagent #property #realestateinvesting #investing #propertyinvestment #realestateinvestor #investor #investments #househunting #invest #propertymanagement #forsale #luxuryrealestate #home #rentalproperty #dreamhome #realty #realtorlife #properties #business #realestateinvestment #luxuryhomes #investmentproperties #investors #broker #bhfyp4
1 note
Text
Running a Random Forest
TITLE 'Import credit.csv data';
FILENAME CSV "/home/debinqiu0/Practice/credit.csv" TERMSTR = CRLF;
PROC IMPORT DATAFILE = CSV OUT = credit DBMS = CSV REPLACE;
RUN;
PROC PRINT DATA = credit(OBS = 10);
RUN;
TITLE 'Create training and testing data by randomly shuffling the rows';
PROC SQL;
CREATE TABLE credit AS
SELECT * FROM credit
ORDER BY ranuni(0);
QUIT;
TITLE 'Training data with 700 observations';
DATA credit_train;
SET credit;
IF _N_ <= 700 THEN OUTPUT;
RUN;
TITLE 'Testing data with 300 observations';
DATA credit_test;
SET credit;
IF _N_ > 700 THEN OUTPUT;
RUN;
ODS GRAPHICS ON;
PROC HPFOREST DATA = credit_train;
TITLE 'Random forest for credit training data';
TARGET default / LEVEL = BINARY;
INPUT checking_balance credit_history purpose savings_balance employment_duration other_credit housing job / LEVEL = NOMINAL;
INPUT phone / LEVEL = BINARY;
INPUT months_loan_duration amount percent_of_income years_at_residence age existing_loans_count dependents / LEVEL = INTERVAL;
SAVE FILE = '/home/debinqiu0/Practice/rf_credit.sas';
RUN;
PROC HP4SCORE DATA = credit_test;
TITLE 'Predictions on credit testing data';
ID default;
SCORE FILE = '/home/debinqiu0/Practice/rf_credit.sas' OUT = rfscore;
RUN;
TITLE "Confusion matrix for testing data"; PROC FREQ DATA = rfscore; TABLES default*I_default /norow nocol nopct; RUN;
credit <- read.table("credit.txt",header = TRUE, sep = "\t")
Split into training and testing sets
set.seed(123)
train_sample <- sample(1000, 700)
credit_train <- credit[train_sample, ]
credit_test <- credit[-train_sample, ]
X_train <- credit_train[-c(which(colnames(credit) %in% 'default'))]
X_test <- credit_test[-c(which(colnames(credit) %in% 'default'))]
Build model on training data
library(randomForest)
credit_rf <- randomForest(default ~ ., data = credit_train)
Make predictions on testing data
credit_rf_pred <- predict(credit_rf,X_test)
Confusion matrix and accuracy
(conf_matrix <- table(credit_test$default, credit_rf_pred))
     credit_rf_pred
       no yes
  no  197  12
  yes  58  33
(sum(diag(conf_matrix))/sum(conf_matrix))
[1] 0.7666667
Importance of explanatory variables
importance(credit_rf)
                     MeanDecreaseGini
checking_balance            35.270599
months_loan_duration        29.007441
credit_history              19.788039
purpose                     18.535077
amount                      43.913243
savings_balance             16.717235
employment_duration         19.956369
percent_of_income           14.372510
years_at_residence          13.564876
age                         33.709076
other_credit                 8.541563
housing                      9.169049
existing_loans_count         7.006525
job                         10.429602
dependents                   4.279386
phone                        5.359734
varImpPlot(credit_rf)
Run random forests with different numbers of trees and see the effect on prediction accuracy
ntree <- seq(50, 1000, by = 100)
accuracy <- numeric(length(ntree))
set.seed(123)
for (i in 1:length(ntree)) {
  credit_rf <- randomForest(default ~ ., data = credit_train, ntree = ntree[i])
  credit_rf_pred <- predict(credit_rf, X_test)
  conf_matrix <- table(credit_test$default, credit_rf_pred)
  accuracy[i] <- sum(diag(conf_matrix))/sum(conf_matrix)
}
accuracy
[1] 0.7400000 0.7633333 0.7500000 0.7600000 0.7566667 0.7600000 0.7566667 0.7666667 0.7600000 0.7566667
max(accuracy)
[1] 0.7666667
ntree[which.max(accuracy)]
[1] 750
plot(ntree, accuracy, type = 'l', main = 'accuracy vs. ntree')
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import ExtraTreesClassifier
import sklearn.metrics
from sklearn.preprocessing import LabelEncoder
import matplotlib.pyplot as plt
credit = pd.read_csv("credit.txt",sep = "\t")
credit = credit.dropna()
targets = LabelEncoder().fit_transform(credit['default'])
predictors = credit.loc[:, credit.columns != 'default']
Recode categorical variables as numeric variables
predictors.dtypes
for i in range(0, len(predictors.dtypes)):
    if predictors.dtypes[i] != 'int64':
        predictors[predictors.columns[i]] = LabelEncoder().fit_transform(predictors[predictors.columns[i]])
pred_train, pred_test, tar_train, tar_test = train_test_split(predictors, targets, test_size=.3)
Build model on training data
classifier = RandomForestClassifier(n_estimators = 25)
classifier = classifier.fit(pred_train, tar_train)
Make predictions on testing data
predictions = classifier.predict(pred_test)
Calculate accuracy
sklearn.metrics.confusion_matrix(tar_test, predictions)
sklearn.metrics.accuracy_score(tar_test, predictions)
Fit an extra trees model to the training data
model = ExtraTreesClassifier().fit(pred_train, tar_train)
Display the relative importance of each attribute
print(pd.Series(model.feature_importances_, index = predictors.columns).sort_values(ascending = False))
""" Running a different number of trees and see the effect of that on the accuracy of the prediction """
ntree = [50, 150, 250, 350, 450, 550, 650, 750, 850, 950, 1000]
accuracy = []
for idx in range(len(ntree)):
    classifier = RandomForestClassifier(n_estimators = ntree[idx])
    classifier = classifier.fit(pred_train, tar_train)
    predictions = classifier.predict(pred_test)
    accuracy.append(sklearn.metrics.accuracy_score(tar_test, predictions))
pd.Series(accuracy, index = ntree).sort_values(ascending = False)
plt.plot(ntree, accuracy)
plt.show()
Calculate accuracy
sklearn.metrics.confusion_matrix(tar_test, predictions)
Out[39]:
array([[189,  32],
       [ 46,  33]])
sklearn.metrics.accuracy_score(tar_test, predictions)
Out[40]: 0.73999999999999999
Display the relative importance of each attribute
print(pd.Series(model.feature_importances_, index = predictors.columns).sort_values(ascending = False))
checking_balance        0.133015
amount                  0.109541
months_loan_duration    0.096196
age                     0.086818
employment_duration     0.064515
credit_history          0.064045
percent_of_income       0.063428
purpose                 0.063158
savings_balance         0.055704
years_at_residence      0.052617
job                     0.045315
existing_loans_count    0.039384
other_credit            0.038604
housing                 0.035119
phone                   0.030843
dependents              0.021698
dtype: float64
pd.Series(accuracy, index = ntree).sort_values(ascending = False)
Out[43]:
850     0.760000
250     0.760000
1000    0.756667
950     0.756667
750     0.756667
650     0.756667
450     0.756667
350     0.756667
550     0.743333
50      0.740000
150     0.736667
dtype: float64
0 notes
Text
A new study highlights the push for precious and industrial metals
A new study highlights the push for precious and industrial metals #rise #stockmarket #preciousmetals
New research among 150 European pension funds with a combined AUM of $213 billion reveals that they expect the price of precious and industrial metals to rise. The study was conducted by NTree International Ltd, a company specializing in marketing, distribution and investor education. NTree represents the metal ETC range of the Global Palladium…

0 notes
Text
CSE505 Problem1-polymorphic tree Solved
4a. The ML type definition below is for a polymorphic tree, called ntree, where each internal node has a list of zero or more subtrees and each leaf node holds a single value:
datatype 'a ntree = leaf of 'a | node of 'a ntree list;
Using the map(f, l) higher-order function, define a function subst(tr, v1, v2) which returns a new ntree in which all occurrences of v1 in the input ntree tr are…

0 notes
Text
Machine Learning for Data Analysis 2
Code:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import ExtraTreesClassifier
import sklearn.metrics
from sklearn.preprocessing import LabelEncoder
import matplotlib.pyplot as plt

credit = pd.read_csv("credit.txt", sep = "\t")
credit = credit.dropna()
targets = LabelEncoder().fit_transform(credit['default'])
predictors = credit.loc[:, credit.columns != 'default']

# Recode categorical variables as numeric variables
for i in range(0, len(predictors.dtypes)):
    if predictors.dtypes[i] != 'int64':
        predictors[predictors.columns[i]] = LabelEncoder().fit_transform(predictors[predictors.columns[i]])

pred_train, pred_test, tar_train, tar_test = train_test_split(predictors, targets, test_size = .3)

# Build model on training data
classifier = RandomForestClassifier(n_estimators = 25)
classifier = classifier.fit(pred_train, tar_train)

# Make predictions on testing data
predictions = classifier.predict(pred_test)

# Calculate accuracy
sklearn.metrics.confusion_matrix(tar_test, predictions)
sklearn.metrics.accuracy_score(tar_test, predictions)

# Fit an extra trees model to the training data
model = ExtraTreesClassifier().fit(pred_train, tar_train)

# Display the relative importance of each attribute
print(pd.Series(model.feature_importances_, index = predictors.columns).sort_values(ascending = False))

"""
Running a different number of trees and see the effect
of that on the accuracy of the prediction
"""
ntree = [50, 150, 250, 350, 450, 550, 650, 750, 850, 950, 1000]
accuracy = []
for idx in range(len(ntree)):
    classifier = RandomForestClassifier(n_estimators = ntree[idx])
    classifier = classifier.fit(pred_train, tar_train)
    predictions = classifier.predict(pred_test)
    accuracy.append(sklearn.metrics.accuracy_score(tar_test, predictions))

pd.Series(accuracy, index = ntree).sort_values(ascending = False)
plt.plot(ntree, accuracy)
plt.show()
Explanation:
In the above procedure, we first build a random forest with 25 decision trees. This gives us 74% accuracy on the testing data.
We also explore the importance of the 16 explanatory variables. The three most important explanatory variables are checking_balance, amount, and months_loan_duration, which are slightly different from those obtained in R.
Finally, we run the random forest with different numbers of decision trees. The results show that we obtain the highest accuracy, 76%, when the number of trees is 850 or 250. We would choose 250, due to its lower computation time.
Output:
array([[189,  32],
       [ 46,  33]])
0.73999999999999999
checking_balance        0.133015
amount                  0.109541
months_loan_duration    0.096196
age                     0.086818
employment_duration     0.064515
credit_history          0.064045
percent_of_income       0.063428
purpose                 0.063158
savings_balance         0.055704
years_at_residence      0.052617
job                     0.045315
existing_loans_count    0.039384
other_credit            0.038604
housing                 0.035119
phone                   0.030843
dependents              0.021698
dtype: float64
850     0.760000
250     0.760000
1000    0.756667
950     0.756667
750     0.756667
650     0.756667
450     0.756667
350     0.756667
550     0.743333
50      0.740000
150     0.736667
dtype: float64
0 notes
Note
Where in Mongolia do you buy clothes? Fittroom, Dream, Ntrees or somewhere else?
Bumbgur is the best hahahah
5 notes
·
View notes
Photo

\Oniwasan update 📸/ [Kawaguchi, Saitama] Jihoin Temple "Sangyo no Ishizue" (Jihoin Temple Tricolour Garden, Kawaguchi, Saitama): photos and article updated. A contemporary karesansui (dry landscape) garden created by Takeshi Nagasaki of N-tree at a temple counted among the "Eight Views of Angyo" in Totsuka-Angyo, a town of plant nurseries 🌲. Jihoin is a temple of the Chisan branch of Shingon Buddhism, founded in the Azuchi-Momoyama period and revived in the early Edo period, and it is one of the stops on the historic tour of the Angyo area 📷 The contemporary worship hall was designed by Yukio Asari of Love Architecture, and the stone garden surrounding it is by Takeshi Nagasaki of N-tree. Totsuka-Angyo in Kawaguchi has been a "village of garden plants" since the Edo period; a short walk from the station brings you to a number of large nurseries and landscaping companies (some with open gardens). Kawaguchi Municipal Green Center 🌳 near the next station, Araijuku, has been selected as one of Japan's 100 best urban parks, and it too (presumably) was opened by drawing on the character of this town. Jihoin stands on the outskirts of Angyo. I had seen this stone garden featured in the magazine NIWA, thought it looked fantastic, and had wanted to visit whenever I rode the Saitama Rapid Railway; I finally stopped by in September 2019 on the way to Saitama Stadium 🏟. Created in 2017, the garden is named "Sangyo no Ishizue" (Tricolour Garden): true to the tricolour 🇫🇷 name, it is laid out with colourful rocks in three colours, including fine blue stones and white stones. All of the stones were sourced from the stock of local Angyo landscaping companies. The contemporary architecture and the "contemporary karesansui" design match each other well. For better photos and a detailed account of the design intent, see Love Architecture's official website! 🔗 Oniwasan article URL: https://oniwa.garden/jihoin-temple-kawaguchi-%e6%8c%81%e5%ae%9d%e9%99%a2%e5%ba%ad%e5%9c%92/ (Jihoin) https://www.instagram.com/p/B-7aXffAyu9/?igshid=t1qzt81kkcdx
#eight views of Angyo#Yukio Asari#Takeshi Nagasaki#garden#Japanese garden#garden#japanesegarden#japanesegardens#Kawaguchi#Kawaguchi City#kawaguchi#Saitama#Saitama Prefecture#saitama#saitamarailway#Totsuka-Angyo#Angyo#totsukaangyo#Saitama sightseeing#Saitama temples#temples and shrines#saitamatemple#karesansui garden#karesansui#karesansui#ntree#rock garden#rockgarden#oniwasan
0 notes
Text
Functional Programming Solution
Problem 3: In Lecture 12, we discussed game-trees, which are in general infinite, but we consider a finite version in this question. The following ML datatype defines an n-ary tree in which the branching factor can vary from one node to another:
datatype 'a ntree = leaf of 'a | node of 'a ntree list
Assume that you are given an n-ary tree in which the leaf nodes all contain an integer representing…
0 notes
Video
On July 25th, 2012, ViViD released the special single 무서운 이야기 (Horror Stories).
ViViD (비비드) was a group active from 2012 to 2016. The group was formed by Ntree Ent. and consisted of S2, Seed (씨드), Showking (쇼킹), Shin Ah-reum (신아름), Jeong A Yeong (정아영) and Park Sung-hee (박성희). This lineup released 3 singles together: "딱 걸렸어", "Breathless" and "Save It". "무서운 이야기" was a single made for the OST of the film Horror Stories. In 2015 they came back as a 4-member group; S2 and Seed had left the group. A year later a sub-unit called Thank You (땡큐) was formed, made up of Sung-hee, Ah-reum and A Yeong.
Honorable Mention: 2013 - Tahiti (타히티) - Five Beats of Hearts, their first mini-album, with the single "Love Sick".
#kpop of the day#vivid#비비드#k-pop#kpop#korean music#2012#2010s#gg#short lived#땡큐#thank you#타히티#tahiti#july
0 notes
Text
Assignment #2 Functional Programming Solution
Problem 3: In Lecture 12, we discussed game-trees, which are in general infinite, but we consider a finite version in this question. The following ML datatype defines an n-ary tree in which the branching factor can vary from one node to another:
datatype 'a ntree = leaf of 'a | node of 'a ntree list
Assume that you are given an n-ary tree in which the leaf nodes all contain an integer…
0 notes