#sapply function
UCS548 - UCS538 Data Science Fundamentals Solved
Assignment 7:
- Create a function st.err(x) = sd(x) / sqrt(length(x)) to find the standard error of SUB1, SUB2, and SUB3.
- Create a vector TOTAL_SUM that holds the totals of V1, V2, and V3, using sapply().
- Create a function f(x, y) = x / y, where x is V1 and y is V2, and use mapply() to compute it.
- Practice all the apply functions on the "Seatbelts" data set that ships with R.
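A minimal sketch of what those tasks might look like; the vectors below are hypothetical stand-ins for the assignment's data:

```
# Hypothetical stand-ins for the assignment's data
SUB1 <- c(55, 62, 71); SUB2 <- c(48, 90, 66); SUB3 <- c(80, 75, 69)
V1 <- c(10, 20, 30); V2 <- c(2, 4, 5); V3 <- c(7, 8, 9)

# Standard error function, applied to each subject vector
st.err <- function(x) sd(x) / sqrt(length(x))
sapply(list(SUB1 = SUB1, SUB2 = SUB2, SUB3 = SUB3), st.err)

# TOTAL_SUM: the total of each vector, via sapply()
TOTAL_SUM <- sapply(list(V1 = V1, V2 = V2, V3 = V3), sum)

# f(x, y) = x / y computed element-wise with mapply()
f <- function(x, y) x / y
mapply(f, V1, V2)
```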
A Simple Guide to Understand the apply() Functions in R
Hey LinkedIn network, I just came across a fantastic blog post on understanding the apply() functions in R. It provides a comprehensive overview of the apply family in R, including apply(), lapply(), sapply(), and tapply(). If you've ever wondered how to efficiently apply a function to every row or column of a dataset, or to each element of a list, this article is a must-read. You can check out the full post here: [A Simple Guide to Understand the apply() Functions in R](https://ift.tt/zJiVTpo) Happy reading and happy coding! #Rprogramming #datascience #programming #coding
R Apply Function to Vector
How do you apply a function to each element of an R vector? R provides several functions for this, for example apply(), lapply(), vapply(), and sapply(). In this article, I will explain how to apply an existing R base function to a vector, and how to create a custom function and apply it to every element of the vector. Related: What is FUN in R…
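Here is a small sketch of the idea, applying both a base function and a custom function to every element of a vector:

```
v <- c(1, 4, 9, 16)

# Apply an existing base function to each element
sapply(v, sqrt)                  # 1 2 3 4

# Apply a custom function to each element
sapply(v, function(x) x^2 + 1)   # 2 17 82 257

# vapply is the type-safe variant: the expected return type is declared
vapply(v, sqrt, FUN.VALUE = numeric(1))
```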
Convert Factor to Numeric in R

Many R programmers make a mistake when converting a factor variable to a number. A factor is created with factor(vector), which returns the vector's elements with levels attached; levels() returns just the levels. Calling as.numeric() directly on a factor returns the internal level codes, not the original values, so the safe route is to convert to character first and then to numeric.

The same idea scales to a whole data frame: by using sapply() to identify the character (or factor) columns and lapply() to convert them, we can convert only those columns to numeric and leave all other columns unchanged - a column that is already numeric stays as it is.
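A short sketch of the pitfall and the safe conversion:

```
f <- factor(c("10", "20", "30"))

as.numeric(f)                  # wrong: returns the level codes 1 2 3
as.numeric(as.character(f))    # right: 10 20 30
as.numeric(levels(f))[f]       # same result, computed per level

# Convert only the factor columns of a data frame, leaving the rest unchanged
df <- data.frame(a = factor(c("1", "2")), d = c(3.5, 4.5))
df[sapply(df, is.factor)] <- lapply(df[sapply(df, is.factor)],
                                    function(col) as.numeric(as.character(col)))
str(df)
```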

R Language Test 3
This "R Language Test" quiz will help you check your ability to execute some basic operations on objects in the R language, and it should also reinforce some basic concepts and improve your computational understanding. The quiz covers the looping functions apply(), lapply(), mapply(), sapply(), and tapply(). Results from the execution of R code are…
How to flatten JSON in R with tidyjson
library(dplyr)
library(tidyr)      # for pivot_wider() and replace_na()
library(tidyjson)

read_json("file.json") %>%  # change it
  select(-document.id) %>%
  enter_object(nestedElement1, nestedElement2, arrayElement) %>%  # change it
  gather_array("document.id") %>%
  json_structure %>%
  filter(type != "object" & type != "array" & type != "null") %>%
  mutate(path = sapply(seq, function(i) paste(i, collapse = ".")),
         value = ..JSON) %>%
  select(matches("document.id|path|value")) %>%
  as_tibble() %>%
  pivot_wider(names_from = path, values_from = value) %>%
  mutate(across(everything(), ~unlist(replace_na(., NA)))) %>%
  identity -> result
R intro part 2
Hi there again,
part 2 of R intro summary :)
other handy functions:
lapply - returns a list of the same length as X, each element of which is the result of applying FUN to the corresponding element of X.
sapply - returns a vector, matrix, or array of the output from applying FUN to the elements of X; more user-friendly than lapply.
vapply - like sapply, but with a pre-specified type of return value, so it can be safer (and sometimes faster) to use.
seq, rep, sort, rev - reverse the elements; append - merge vectors; is.* - check the class of an R object; as.* - convert an R object from one class to another; unlist - flatten a list into a vector.
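A quick side-by-side of those apply variants on the same list:

```
x <- list(a = 1:3, b = 4:6)

lapply(x, mean)                          # list: $a 2, $b 5
sapply(x, mean)                          # named numeric vector: a 2, b 5
vapply(x, mean, FUN.VALUE = numeric(1))  # like sapply, but type-checked
unlist(lapply(x, mean))                  # flatten the list result to a vector
```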
REGULAR EXPRESSIONS! - a regular expression is a sequence of characters that defines a search pattern.
grepl - returns TRUE if a pattern is found; grep - returns a vector of indices of the character strings that contain the pattern; in sub and gsub you can specify a replacement argument - sub replaces only the first match, gsub replaces all the matches.
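For example:

```
s <- c("apple pie", "banana", "grape pie")

grepl("pie", s)       # TRUE FALSE TRUE
grep("pie", s)        # 1 3 - indices of the matching strings
sub("p", "P", s)      # replaces only the first "p" in each string
gsub("p", "P", s)     # replaces every "p" in each string
```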
Date and POSIXct objects - for representing dates and date-times.
dplyr: arrange - sort in ascending or descending order; filter - keep rows that match a condition; mutate - add new columns computed from the rest.
ggplot2: ggplot; facet_wrap - wraps a 1d sequence of panels into 2d; expand_limits.
Different plots with ggplot2: + geom_col - bar plot; geom_point - points; geom_line; geom_histogram; geom_boxplot.
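A tiny sketch combining a few of these, using the built-in mtcars data:

```
library(ggplot2)

# Average mpg per cylinder count, one panel per transmission type (am)
avg <- aggregate(mpg ~ cyl + am, data = mtcars, FUN = mean)
ggplot(avg, aes(x = factor(cyl), y = mpg)) +
  geom_col() +
  facet_wrap(~ am)
```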
LOADING DATA INTO R
Utils package:
read.csv, read.delim - read from CSV and txt files; read.table; which.min, which.max - return the index of the min or max value in a column. In read.delim, for example, you can also supply column names and column classes.
readr library:
read_csv, read_tsv - for CSV and tsv files; read_delim.
Collectors - used to pass information about how to interpret the values in a column, e.g. col_integer, col_factor.
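For instance, a sketch of passing collectors through col_types (the file name and column names here are hypothetical):

```
library(readr)

# "data.csv" is a hypothetical file; each collector says how to parse a column
df <- read_csv("data.csv",
               col_types = cols(id    = col_integer(),
                                group = col_factor(levels = c("a", "b")),
                                score = col_double()))
```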
data.table library:
fread – same as read.table, but extremely fast and easy
readxl library:
excel_sheets - prints the names of the sheets in an Excel file; read_excel - imports an Excel sheet as tbl_df, tbl, data.frame.
gdata library:
read.xls - converts Excel files to CSV and then reads the CSV with read.csv.
XLConnect library:
loadWorkbook - a bridge between an Excel file and the R session; getSheets - list of sheets; readWorksheet - import a sheet into a data frame; createSheet - create a new sheet; writeWorksheet - add data frames to a sheet; saveWorkbook - store the adapted Excel file; renameSheet, removeSheet.
DBI library:
dbConnect - open a connection (e.g. a MySQLConnection); dbListTables - list the tables in the database; dbReadTable; dbGetQuery - run a query to get data from a database table; dbDisconnect.
dbSendQuery, then dbFetch - fetch the results of executing a query; this gives the ability to fetch the query's result in chunks rather than all at once; dbClearResult - frees the memory.
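A sketch of the chunked-fetch pattern; the connection details and table name are hypothetical:

```
library(DBI)

con <- dbConnect(RMySQL::MySQL(), dbname = "shop",
                 user = "user", password = "pass", host = "localhost")

res <- dbSendQuery(con, "SELECT * FROM orders")
while (!dbHasCompleted(res)) {
  chunk <- dbFetch(res, n = 100)   # fetch 100 rows at a time
  # ... process the chunk ...
}
dbClearResult(res)                 # free the memory held by the result
dbDisconnect(con)
```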
Importing flat files from the web: using readr library, read_csv or read_tsv with url address.
Downloading files from the web: with gdata, read.xls can read an Excel file straight from a URL; with readxl, download the file first with download.file and then read it with read_excel.
Downloading .RData files: download.file first, and then load the file into the workspace with load.
httr library:
GET - send a request to the web; the result is a response object that provides easy access to the status code, content type, and actual content. Using content() we can extract the content and choose which object to retrieve: a raw object, an R object (a list), or a character vector.
JSON files: first GET, then content() as text; we can also use the jsonlite library and fromJSON to convert character data into a list (we can pass an object or a URL as the argument). We can convert data to JSON using toJSON; prettify makes JSON output pretty, and minify makes it as concise as possible.
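Putting those pieces together; any JSON endpoint works, the URL here is just an example:

```
library(httr)
library(jsonlite)

resp <- GET("https://api.github.com")   # example JSON endpoint
status_code(resp)                       # easy access to the status code
txt  <- content(resp, as = "text")      # extract the content as text
obj  <- fromJSON(txt)                   # character data -> R list

json <- toJSON(list(x = 1:3))           # R object -> JSON
prettify(json)                          # pretty-printed
minify(json)                            # as concise as possible
```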
haven library:
read_sas for SAS
read_dta for STATA - columns are imported as labeled vectors; to bring them into a standard R format we use the as_factor function and can then convert to the wanted data type.
read_sav or read_por for SPSS - also a labeled class that needs to be changed to another standard R class.
foreign library:
Simple functions to import STATA and SPSS data: read.dta for STATA, read.spss for SPSS.
thanks, wait for more!
szarki9
all var to factor

df[] <- lapply(df, factor)

------------------------------------------

col_names <- names(df)
df[, col_names] <- lapply(df[, col_names], factor)

------------------------------------------

names <- c(1:8)
df[, names] <- lapply(df[, names], factor)
str(df)

------------------------------------------

iris$Species <- as.ordered(iris$Species)

OR

for(i in 1:ncol(infert)){
  infert[,i] <- as.factor(infert[,i])
}

OR

infert <- infert %>% mutate_if(is.character, as.factor)

OR

DF <- data.frame(x = letters[1:5], y = 1:5, z = LETTERS[1:5], stringsAsFactors = FALSE)
DF[sapply(DF, is.character)] <- lapply(DF[sapply(DF, is.character)], as.factor)

------------------------------------------
Converting Numeric Variables to Factor

1. Using Column Index Numbers

In this case, we are converting the first, second, third, and fifth numeric variables to factor variables. mydata is a data frame.

names <- c(1:3, 5)
mydata[, names] <- lapply(mydata[, names], factor)
str(mydata)

2. Using Column Names

In this case, we are converting the two variables 'Credit' and 'Balance' to factor variables.

names <- c('Credit', 'Balance')
mydata[, names] <- lapply(mydata[, names], factor)
str(mydata)

3. Converting all variables

col_names <- names(mydata)
mydata[, col_names] <- lapply(mydata[, col_names], factor)

4. Converting all numeric variables

mydata[sapply(mydata, is.numeric)] <- lapply(mydata[sapply(mydata, is.numeric)], as.factor)

5. Checking the unique values in each variable and converting to factor only those variables with a unique count less than 4

col_names <- sapply(mydata, function(col) length(unique(col)) < 4)
mydata[, col_names] <- lapply(mydata[, col_names], factor)
UCS548 - UCS538 Data Science Fundamentals Solved
Assignment 2
1. Install and configure the "swirl" package for self-learning. Steps: install.packages("swirl"); packageVersion("swirl"); library(swirl)
2. Use "swirl" in R to work through the following lessons: 1: Basic Building Blocks, 2: Workspace and Files, 3: Sequences of Numbers, 4: Vectors, 5: Missing Values, 6: Subsetting Vectors, 7: Matrices and Data Frames, 8: Logic, 9: Functions, 10: lapply and sapply, 11: vapply and…
R - Split a String in a Data Frame Column and Keep a Piece as a New Variable
I've been having trouble figuring out where to begin with this data blog, so I think I'll start with something pretty simple but ultimately very valuable - splitting a column of values in an R data frame and creating a new variable out of one piece of the split, for every row in your dataset. I use this all the time to create a new variable whose values are a subset of another variable. This might be a niche piece of code, but I looooooove it :)
Dataset:
Data Frame: NBA
Function
df$variable2 <- sapply(strsplit(as.character(df$variable1), " "),"[", 1)
Let’s break down the pieces to this nifty little trick, from the inside out:
as.character(df$variable1)
We want the variable that we are splitting to be a character variable, if it is not already.
strsplit(…, " ")
This will split the value in a variable by a delimiter, which is great. However, say you have a variable1 with the value "Tyler is awesome". Using the strsplit function (splitting on a space " "), you would end up with "Tyler" "is" "awesome". There's nothing wrong with this, but if you tried to assign it to a data frame, you would have one variable with three rows, one for each of the split words. And this only works on a single value - which isn't what we are trying to do here, especially if you have a large data frame with lots of different values in variable1. We do want to split the variable, though, which is why this is an important piece of the trick.
sapply(…(…),"[", 1)
This is where the magic happens. The sapply() function takes a list, vector, or data frame as input and returns a vector or matrix. The apply family in general is primarily used to avoid explicit loop constructs, which in our case is quite helpful, since we have many rows of data that we want to run a function over.
The piece "[", 1 is the FUN argument for sapply, and the part where we tell R to retain just one piece of the split - "[" is R's subsetting operator, used here as a function. The "1" tells R that we want the first piece of the split - we could change that to 2, 3, etc. depending on which piece we want to keep.
It’s probably best to see it in action though, as even some of these intricate details can get complicated for me as well.
Example
Alright so based on the dataset above, let’s say we wanted to split the variable “NBA_Teams” and store the city that each team is from in a new column, called “Cities”. Here’s the code we would use to do that:
NBA$Cities <- sapply(strsplit(as.character(NBA$NBA_Teams), " "),"[", 1)
If we wanted to just keep the mascot portion of each team (let's call that new variable "Mascot"), we would simply change the "1" to a "2" at the end of the function:
NBA$Mascot <- sapply(strsplit(as.character(NBA$NBA_Teams), " "),"[", 2)
So again, instead of just splitting a single value into smaller chunks, we can split an entire column of values on any delimiter we want (in the above example we split on a space, but we could split on the letter "t" if we wanted to). No for loops necessary!
Thanks for reading!
FUN in R - Apply Function on List or Vector
FUN translates to "function" in R; in most cases it is used when you want to apply a function over a list or vector, and you will find it in many R method syntaxes, for example in apply(), lapply(), vapply(), sapply(), etc. In this article, I will explain some examples of where FUN is used in R, and how to create a function and use it with vectors and lists. 1. FUN in R Let's…
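As a quick taste of FUN in a method signature, a minimal sketch:

```
m <- matrix(1:6, nrow = 2)

apply(m, 1, FUN = sum)              # FUN applied over the rows of a matrix
sapply(1:3, FUN = function(x) x^2)  # FUN can be an anonymous function too
```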
R code for churn prediction-OKESIPE VICTORIA (Final Year Project 2019)
Title: PREDICTING CUSTOMER CHURN IN MOBILE TELECOMMUNICATION SECTOR USING DATA MINING ANALYSIS APPROACH WITH R
```{r}
library(plyr)
library(corrplot)
library(ggplot2)
library(gridExtra)
library(ggthemes)
library(caret)
library(MASS)
library(randomForest)
library(party)
```

```{r}
churn <- read.csv('Telco-Customer-Churn.csv')
str(churn)
```

The raw data contains 7043 rows (customers) and 21 columns (features). The "Churn" column is our target; we'll use all other columns as features to our model. We use sapply to check the number of missing values in each column. There are 11 missing values in the "TotalCharges" column, so let's remove those rows.

```{r}
sapply(churn, function(x) sum(is.na(x)))
```

```{r}
churn <- churn[complete.cases(churn), ]
```

Change "No internet service" to "No" for six columns: "OnlineSecurity", "OnlineBackup", "DeviceProtection", "TechSupport", "StreamingTV", "StreamingMovies".

```{r}
cols_recode1 <- c(10:15)
for(i in 1:ncol(churn[,cols_recode1])) {
  churn[,cols_recode1][,i] <- as.factor(mapvalues(churn[,cols_recode1][,i],
                                                  from = c("No internet service"),
                                                  to = c("No")))
}
```

Change "No phone service" to "No" for the column "MultipleLines".

```{r}
churn$MultipleLines <- as.factor(mapvalues(churn$MultipleLines,
                                           from = c("No phone service"),
                                           to = c("No")))
```

The minimum tenure is 1 month and the maximum is 72 months, so we can group them into five tenure groups: "0-12 Month", "12-24 Month", "24-48 Month", "48-60 Month", "> 60 Month".

```{r}
min(churn$tenure); max(churn$tenure)
```

```{r}
group_tenure <- function(tenure){
  if (tenure >= 0 & tenure <= 12){
    return('0-12 Month')
  } else if (tenure > 12 & tenure <= 24){
    return('12-24 Month')
  } else if (tenure > 24 & tenure <= 48){
    return('24-48 Month')
  } else if (tenure > 48 & tenure <= 60){
    return('48-60 Month')
  } else if (tenure > 60){
    return('> 60 Month')
  }
}
```

```{r}
churn$tenure_group <- sapply(churn$tenure, group_tenure)
churn$tenure_group <- as.factor(churn$tenure_group)
```

Change the values in the column "SeniorCitizen" from 0 or 1 to "No" or "Yes".

```{r}
churn$SeniorCitizen <- as.factor(mapvalues(churn$SeniorCitizen,
                                           from = c("0","1"), to = c("No", "Yes")))
```

Remove the columns we do not need for the analysis:

```{r}
churn$customerID <- NULL
churn$tenure <- NULL
```

## Exploratory data analysis and feature selection

```{r}
numeric.var <- sapply(churn, is.numeric)  ## Find numerical variables
corr.matrix <- cor(churn[,numeric.var])   ## Calculate the correlation matrix
corrplot(corr.matrix, main="\n\nCorrelation Plot for Numeric Variables", method="number")
```

Monthly Charges and Total Charges are correlated, so one of them will be removed from the model. We remove Total Charges.
```{r}
churn$TotalCharges <- NULL
```

## Bar plots of categorical variables

```{r}
p1 <- ggplot(churn, aes(x=gender)) + ggtitle("Gender") + xlab("Gender") +
  geom_bar(aes(y = 100*(..count..)/sum(..count..)), width = 0.5) +
  ylab("Percentage") + coord_flip() + theme_minimal()
p2 <- ggplot(churn, aes(x=SeniorCitizen)) + ggtitle("Senior Citizen") + xlab("Senior Citizen") +
  geom_bar(aes(y = 100*(..count..)/sum(..count..)), width = 0.5) +
  ylab("Percentage") + coord_flip() + theme_minimal()
p3 <- ggplot(churn, aes(x=Partner)) + ggtitle("Partner") + xlab("Partner") +
  geom_bar(aes(y = 100*(..count..)/sum(..count..)), width = 0.5) +
  ylab("Percentage") + coord_flip() + theme_minimal()
p4 <- ggplot(churn, aes(x=Dependents)) + ggtitle("Dependents") + xlab("Dependents") +
  geom_bar(aes(y = 100*(..count..)/sum(..count..)), width = 0.5) +
  ylab("Percentage") + coord_flip() + theme_minimal()
grid.arrange(p1, p2, p3, p4, ncol=2)
```

```{r}
p5 <- ggplot(churn, aes(x=PhoneService)) + ggtitle("Phone Service") + xlab("Phone Service") +
  geom_bar(aes(y = 100*(..count..)/sum(..count..)), width = 0.5) +
  ylab("Percentage") + coord_flip() + theme_minimal()
p6 <- ggplot(churn, aes(x=MultipleLines)) + ggtitle("Multiple Lines") + xlab("Multiple Lines") +
  geom_bar(aes(y = 100*(..count..)/sum(..count..)), width = 0.5) +
  ylab("Percentage") + coord_flip() + theme_minimal()
p7 <- ggplot(churn, aes(x=InternetService)) + ggtitle("Internet Service") + xlab("Internet Service") +
  geom_bar(aes(y = 100*(..count..)/sum(..count..)), width = 0.5) +
  ylab("Percentage") + coord_flip() + theme_minimal()
p8 <- ggplot(churn, aes(x=OnlineSecurity)) + ggtitle("Online Security") + xlab("Online Security") +
  geom_bar(aes(y = 100*(..count..)/sum(..count..)), width = 0.5) +
  ylab("Percentage") + coord_flip() + theme_minimal()
grid.arrange(p5, p6, p7, p8, ncol=2)
```

```{r}
p9 <- ggplot(churn, aes(x=OnlineBackup)) + ggtitle("Online Backup") + xlab("Online Backup") +
  geom_bar(aes(y = 100*(..count..)/sum(..count..)), width = 0.5) +
  ylab("Percentage") + coord_flip() + theme_minimal()
p10 <- ggplot(churn, aes(x=DeviceProtection)) + ggtitle("Device Protection") + xlab("Device Protection") +
  geom_bar(aes(y = 100*(..count..)/sum(..count..)), width = 0.5) +
  ylab("Percentage") + coord_flip() + theme_minimal()
p11 <- ggplot(churn, aes(x=TechSupport)) + ggtitle("Tech Support") + xlab("Tech Support") +
  geom_bar(aes(y = 100*(..count..)/sum(..count..)), width = 0.5) +
  ylab("Percentage") + coord_flip() + theme_minimal()
p12 <- ggplot(churn, aes(x=StreamingTV)) + ggtitle("Streaming TV") + xlab("Streaming TV") +
  geom_bar(aes(y = 100*(..count..)/sum(..count..)), width = 0.5) +
  ylab("Percentage") + coord_flip() + theme_minimal()
grid.arrange(p9, p10, p11, p12, ncol=2)
```

```{r}
p13 <- ggplot(churn, aes(x=StreamingMovies)) + ggtitle("Streaming Movies") + xlab("Streaming Movies") +
  geom_bar(aes(y = 100*(..count..)/sum(..count..)), width = 0.5) +
  ylab("Percentage") + coord_flip() + theme_minimal()
p14 <- ggplot(churn, aes(x=Contract)) + ggtitle("Contract") + xlab("Contract") +
  geom_bar(aes(y = 100*(..count..)/sum(..count..)), width = 0.5) +
  ylab("Percentage") + coord_flip() + theme_minimal()
p15 <- ggplot(churn, aes(x=PaperlessBilling)) + ggtitle("Paperless Billing") + xlab("Paperless Billing") +
  geom_bar(aes(y = 100*(..count..)/sum(..count..)), width = 0.5) +
  ylab("Percentage") + coord_flip() + theme_minimal()
p16 <- ggplot(churn, aes(x=PaymentMethod)) + ggtitle("Payment Method") + xlab("Payment Method") +
  geom_bar(aes(y = 100*(..count..)/sum(..count..)), width = 0.5) +
  ylab("Percentage") + coord_flip() + theme_minimal()
p17 <- ggplot(churn, aes(x=tenure_group)) + ggtitle("Tenure Group") + xlab("Tenure Group") +
  geom_bar(aes(y = 100*(..count..)/sum(..count..)), width = 0.5) +
  ylab("Percentage") + coord_flip() + theme_minimal()
grid.arrange(p13, p14, p15, p16, p17, ncol=2)
```

All categorical variables have a reasonably broad distribution; therefore, all of them will be kept for further analysis.

## Logistic Regression Model Fitting

Split the data into training and testing sets.

```{r}
set.seed(2019)  # set the seed before partitioning so the split is reproducible
intrain <- createDataPartition(churn$Churn, p=0.7, list=FALSE)
training <- churn[intrain,]
testing <- churn[-intrain,]
```

Confirm the splitting is correct.

```{r}
dim(training); dim(testing)
```

Fitting the model:

```{r}
LogModel <- glm(Churn ~ ., family=binomial(link="logit"), data=training)
print(summary(LogModel))
```

Feature analysis: the top three most-relevant features are Contract, PaperlessBilling, and tenure_group, all of which are categorical variables.

```{r}
anova(LogModel, test="Chisq")
```

Analyzing the deviance table, we can see the drop in deviance when adding each variable one at a time. Adding InternetService, Contract, and tenure_group significantly reduces the residual deviance. Other variables, such as PaymentMethod and Dependents, seem to improve the model less, even though they all have low p-values.

## Assessing the predictive ability of the model

```{r}
testing$Churn <- as.character(testing$Churn)
testing$Churn[testing$Churn=="No"] <- "0"
testing$Churn[testing$Churn=="Yes"] <- "1"
fitted.results <- predict(LogModel, newdata=testing, type='response')
fitted.results <- ifelse(fitted.results > 0.5, 1, 0)
misClasificError <- mean(fitted.results != testing$Churn)
print(paste('Logistic Regression Accuracy', 1-misClasificError))
```

## Confusion Matrix

```{r}
print("Confusion Matrix for Logistic Regression")
table(testing$Churn, fitted.results > 0.5)
```

## Odds Ratio

One of the interesting performance measurements in logistic regression is the odds ratio; basically, the odds ratio tells us the odds of an event happening.

```{r}
exp(cbind(OR=coef(LogModel), confint(LogModel)))
```

For each unit increase in Monthly Charge, there is a 2.4% decrease in the likelihood of a customer churning.
## Decision Tree

```{r}
churn <- read.csv('Telco-Customer-Churn.csv')
churn <- churn[complete.cases(churn), ]
```

```{r}
cols_recode1 <- c(10:15)
for(i in 1:ncol(churn[,cols_recode1])) {
  churn[,cols_recode1][,i] <- as.factor(mapvalues(churn[,cols_recode1][,i],
                                                  from = c("No internet service"),
                                                  to = c("No")))
}
```

```{r}
churn$MultipleLines <- as.factor(mapvalues(churn$MultipleLines,
                                           from = c("No phone service"),
                                           to = c("No")))
```

```{r}
group_tenure <- function(tenure){
  if (tenure >= 0 & tenure <= 12){
    return('0-12 Month')
  } else if (tenure > 12 & tenure <= 24){
    return('12-24 Month')
  } else if (tenure > 24 & tenure <= 48){
    return('24-48 Month')
  } else if (tenure > 48 & tenure <= 60){
    return('48-60 Month')
  } else if (tenure > 60){
    return('> 60 Month')
  }
}
```

```{r}
churn$tenure_group <- sapply(churn$tenure, group_tenure)
churn$tenure_group <- as.factor(churn$tenure_group)
```

```{r}
churn$SeniorCitizen <- as.factor(mapvalues(churn$SeniorCitizen,
                                           from = c("0","1"), to = c("No", "Yes")))
```

```{r}
churn$customerID <- NULL
churn$tenure <- NULL
churn$TotalCharges <- NULL
```

```{r}
set.seed(2019)  # again, seed first so the partition is reproducible
intrain <- createDataPartition(churn$Churn, p=0.7, list=FALSE)
training <- churn[intrain,]
testing <- churn[-intrain,]
```

For illustration purposes, we are going to use only three variables: "Contract", "tenure_group", and "PaperlessBilling".

```{r}
tree <- ctree(Churn ~ Contract + tenure_group + PaperlessBilling, training)
```

```{r}
plot(tree, type='simple')
```

Of the three variables we use, Contract is the most important for predicting whether a customer churns. If a customer is on a one-year contract and not using PaperlessBilling, then this customer is unlikely to churn. On the other hand, if a customer is on a month-to-month contract, in the 0-12 month tenure group, and using PaperlessBilling, then this customer is more likely to churn.

```{r}
pred_tree <- predict(tree, testing)
print("Confusion Matrix for Decision Tree")
table(Predicted = pred_tree, Actual = testing$Churn)
```

```{r}
p1 <- predict(tree, training)
tab1 <- table(Predicted = p1, Actual = training$Churn)
tab2 <- table(Predicted = pred_tree, Actual = testing$Churn)
```

```{r}
print(paste('Decision Tree Accuracy', sum(diag(tab2))/sum(tab2)))
```

end of R code for churn prediction
R Programming: Advanced Analytics In R For Data Science
Take Your R & R Studio Skills To The Next Level. Data Analytics, Data Science, Statistical Analysis in Business, GGPlot2
What you’ll learn
Perform Data Preparation in R
Identify missing records in dataframes
Locate missing data in your dataframes
Apply the Median Imputation method to replace missing records
Apply the Factual Analysis method to replace missing records
Understand how to use the which() function
Know how to reset the dataframe index
Work with the gsub() and sub() functions for replacing strings
Explain why NA is a third type of logical constant
Deal with date-times in R
Convert date-times into POSIXct time format
Create, use, append, modify, rename, access and subset Lists in R
Understand when to use [] and when to use [[]] or the $ sign when working with Lists
Create a timeseries plot in R
Understand how the Apply family of functions works
Recreate an apply statement with a for() loop
Use apply() when working with matrices
Use lapply() and sapply() when working with lists and vectors
Add your own functions into apply statements
Nest apply(), lapply() and sapply() functions within each other
Use the which.max() and which.min() functions
Requirements
Basic knowledge of R
Knowledge of the GGPlot2 package is recommended
Knowledge of dataframes
Knowledge of vectors and vectorized operations
Description
Ready to take your R Programming skills to the next level?
Want to truly become proficient at Data Science and Analytics with R?
This course is for you!
Professional R Video training, unique datasets designed with years of industry experience in mind, engaging exercises that are both fun and also give you a taste for Analytics of the REAL WORLD.
In this course you will learn:
How to prepare data for analysis in R
How to perform the median imputation method in R
How to work with date-times in R
What Lists are and how to use them
What the Apply family of functions is
How to use apply(), lapply() and sapply() instead of loops
How to nest your own functions within apply-type functions
How to nest apply(), lapply() and sapply() functions within each other
And much, much more!
The more you learn the better you will get. After every module you will already have a strong set of skills to take with you into your Data Science career.
Who this course is for:
Anybody who has basic R knowledge and would like to take their skills to the next level
Anybody who has already completed the R Programming A-Z course
This course is NOT for complete beginners in R
Created by Kirill Eremenko, SuperDataScience Team. Last updated 11/2018. Language: English.
Size: 1.25 GB
Download Now
https://ift.tt/1ONuVGS.