Method 1: the caTools package, sample.split function
    # Splitting the dataset into the Training set and Test set
    install.packages('caTools')
    library(caTools)
    set.seed(123)
    split = sample.split(dataset$Purchased, SplitRatio = 0.75)
    training_set = subset(dataset, split == TRUE)
    test_set = subset(dataset, split == FALSE)
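sample.split splits on the label vector and tries to keep the relative proportions of the labels the same in both sets. The base-R sketch below illustrates that idea on a toy data frame. It is not the caTools implementation. The helper `stratified_split` and the toy columns are hypothetical names introduced only for this example.

```r
# Hypothetical toy data; `Purchased` mirrors the label column used above.
set.seed(123)
dataset <- data.frame(
  Age = sample(18:60, 100, replace = TRUE),
  Purchased = rep(c(0, 1), times = c(70, 30))
)

# Illustrative helper (not part of caTools): mark roughly `ratio` of the
# rows of each label value as training rows, so that both sets keep
# similar label proportions.
stratified_split <- function(labels, ratio = 0.75) {
  in_train <- rep(FALSE, length(labels))
  for (idx in split(seq_along(labels), labels)) {
    n_take <- round(length(idx) * ratio)
    # Sample positions within idx; this avoids sample()'s special case
    # when its first argument is a single number.
    picked <- idx[sample.int(length(idx), n_take)]
    in_train[picked] <- TRUE
  }
  in_train
}

in_train <- stratified_split(dataset$Purchased, ratio = 0.75)
training_set <- dataset[in_train, ]
test_set <- dataset[!in_train, ]

# Compare the label proportions in the two sets
prop.table(table(training_set$Purchased))
prop.table(table(test_set$Purchased))
```

Splitting per label is most useful when one class is rare, since a plain random split could leave very few of its rows in the test set.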
Method 2: the caret package, createDataPartition function
    library(caret)
    Train <- createDataPartition(data$Obesity, p = 0.6, list = FALSE)
    training <- data[Train, ]
    testing <- data[-Train, ]
Method 3: the sample function
    ind = sample(nrow(data), nrow(data) * 4 / 5)
    training <- data[ind, ]
    testing <- data[-ind, ]
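Because sample draws the indices at random, the split changes on every run unless a seed is set first. The sketch below, on a hypothetical toy data frame, fixes the seed, rounds the training size down explicitly, and checks that the two sets are disjoint and together cover every row. The seed value 42 and the column names are arbitrary choices for illustration.

```r
# Hypothetical toy data frame standing in for `data`.
data <- data.frame(id = 1:50, x = (1:50) / 10)

# Explicit integer size for the training set (80% of the rows)
n_train <- floor(nrow(data) * 4 / 5)

set.seed(42)  # arbitrary seed, chosen only for illustration
ind <- sample(nrow(data), n_train)
training <- data[ind, ]
testing <- data[-ind, ]

# Sanity checks: sizes add up and no row appears in both sets
stopifnot(nrow(training) + nrow(testing) == nrow(data))
stopifnot(length(intersect(training$id, testing$id)) == 0)

# Setting the same seed again reproduces the same index vector
set.seed(42)
ind_again <- sample(nrow(data), n_train)
stopifnot(identical(ind, ind_again))
```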
Method 4: a hand-written function
    create_train_test <- function(data, size = 0.8, train = TRUE) {
        n_row = nrow(data)
        total_row = size * n_row
        train_sample <- 1:total_row
        if (train == TRUE) {
            return (data[train_sample, ])
        } else {
            return (data[-train_sample, ])
        }
    }
    # Code explanation
    # function(data, size = 0.8, train = TRUE): declare the function's arguments
    # n_row = nrow(data): count the number of rows in the dataset
    # total_row = size * n_row: compute n, the index of the last row of the train set
    # train_sample <- 1:total_row: select rows 1 through n
    # if (train == TRUE) { } else { }: if train is TRUE, return the train set; otherwise return the test set
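Note that create_train_test takes the first rows in their existing order, so it does not randomize anything by itself. If the data happen to be sorted (for example by label), the two sets can end up unrepresentative. A minimal sketch of one way to handle this is to shuffle the rows before calling the function. The toy data frame, the seed, and the `floor`/`seq_len` variant of the function below are assumptions made for this illustration.

```r
# Variant of the function above, using floor() and seq_len() so the
# row count is an explicit integer (an illustrative choice).
create_train_test <- function(data, size = 0.8, train = TRUE) {
  n_row <- nrow(data)
  total_row <- floor(size * n_row)
  train_sample <- seq_len(total_row)
  if (train) data[train_sample, ] else data[-train_sample, ]
}

# Hypothetical toy data, deliberately sorted by label
data <- data.frame(x = 1:20, label = rep(c("a", "b"), each = 10))

# Shuffle the rows once, then split the shuffled copy
set.seed(1)  # arbitrary seed for reproducibility
shuffled <- data[sample(nrow(data)), ]
data_train <- create_train_test(shuffled, size = 0.8, train = TRUE)
data_test <- create_train_test(shuffled, size = 0.8, train = FALSE)

# Without shuffling, the test set holds only rows with label "b"
unshuffled_test <- create_train_test(data, size = 0.8, train = FALSE)
table(unshuffled_test$label)
```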