Data Transformations
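split(x, f, drop = FALSE, …): x is the vector or data frame to be split, and f is the factor that defines the groups. The result is a list with one element per level of f.
library(MASS)
g <- split(Cars93$MPG.city, Cars93$Origin)
class(g)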
## [1] "list"
names(g)
## [1] "USA"     "non-USA"
c(median(g[[1]]), median(g[[2]]))
## [1] 20 22
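lapply(lst, fun) and sapply(lst, fun): the former always returns a list, while the latter simplifies the result to a vector (or matrix) when it can. The s in sapply stands for simplify.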
Lst <- list(a = rnorm(100), b = rnorm(100), c = rnorm(100))
lapply(Lst, range)
## $a
## [1] -2.859  2.976
## 
## $b
## [1] -1.961  2.906
## 
## $c
## [1] -2.403  3.363
sapply(Lst, range)
##           a      b      c
## [1,] -2.859 -1.961 -2.403
## [2,]  2.976  2.906  3.363
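split and sapply combine naturally: split the data into groups, then summarise every group in one call instead of indexing each list element by hand. The following is a minimal sketch on made-up data; the vector x, the factor f, and the group labels "u" and "v" are assumptions chosen for illustration.

```r
# Hypothetical data: six values in two groups.
x <- c(5, 1, 9, 3, 7, 11)
f <- factor(c("u", "v", "u", "v", "u", "v"))

g <- split(x, f)              # list with one element per level: $u and $v
medians <- sapply(g, median)  # simplified to a named numeric vector

print(medians)
```

Because every group yields a single number, sapply returns a named vector rather than a list, and the names are the factor levels.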
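Applying a function to the rows or columns of a matrix or data frame
For the rows of a matrix, use apply(mat, 1, fun); for the columns, use apply(mat, 2, fun). Because a data frame is a list whose elements are its columns, lapply(dfrm, fun) and sapply(dfrm, fun) process it column by column. Suppose resp is the response variable and pred is a data frame in which each column is a predictor. Then cors <- sapply(pred, cor, y = resp) computes the correlation between each column of pred and resp.
resp <- rnorm(n = 10, mean = 0, sd = 1)
pred <- as.data.frame(matrix(rnorm(n = 10 * 100, mean = 0, sd = 1), 10, 100))
cors <- sapply(pred, cor, y = resp)
mask <- (rank(-abs(cors)) <= 10)  # rank() ranks from smallest to largest, so negating keeps the 10 largest |cor|
best.pred <- pred[, mask]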
sapply(Cars93, class)
##       Manufacturer              Model               Type 
##           "factor"           "factor"           "factor" 
##          Min.Price              Price          Max.Price 
##          "numeric"          "numeric"          "numeric" 
##           MPG.city        MPG.highway            AirBags 
##          "integer"          "integer"           "factor" 
##         DriveTrain          Cylinders         EngineSize 
##           "factor"           "factor"          "numeric" 
##         Horsepower                RPM       Rev.per.mile 
##          "integer"          "integer"          "integer" 
##    Man.trans.avail Fuel.tank.capacity         Passengers 
##           "factor"          "numeric"          "integer" 
##             Length          Wheelbase              Width 
##          "integer"          "integer"          "integer" 
##        Turn.circle     Rear.seat.room       Luggage.room 
##          "integer"          "numeric"          "integer" 
##             Weight             Origin               Make 
##          "integer"           "factor"           "factor" 
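The matrix forms apply(mat, 1, fun) and apply(mat, 2, fun) mentioned above have no example of their own, so here is a minimal sketch. The matrix mat and the choice of sum and mean are assumptions made purely for illustration.

```r
# A small 2 x 3 matrix, filled column by column:
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6
mat <- matrix(1:6, nrow = 2, ncol = 3)

row_sums  <- apply(mat, 1, sum)   # margin 1: one result per row
col_means <- apply(mat, 2, mean)  # margin 2: one result per column

print(row_sums)
print(col_means)
```

The second argument is the margin: 1 walks over rows and 2 walks over columns, so row_sums has as many elements as mat has rows and col_means has as many as mat has columns.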
tapply(x, f, fun): apply a function to a data vector group by group, where the groups are defined by a factor.
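A minimal sketch of tapply on made-up data follows. The vector score and the factor group are hypothetical names, and mean is just one possible summary function.

```r
# Hypothetical data: six scores belonging to two groups.
score <- c(10, 20, 30, 40, 50, 60)
group <- factor(c("a", "b", "a", "b", "a", "b"))

# Compute the mean of score separately within each level of group.
group_means <- tapply(score, group, mean)

print(group_means)
```

The result has one entry per factor level, named after that level, which makes it behave much like split followed by sapply in a single step.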
by(dfrm, fact, fun): apply a function to the rows of a data frame group by group, where the groups are defined by a factor.
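Unlike tapply, by hands the function a whole data frame (the rows of one group) rather than a plain vector. The sketch below assumes a small made-up data frame dfrm with columns x and y and a grouping factor fact; all of these are illustrative.

```r
# Hypothetical data frame with two numeric columns.
dfrm <- data.frame(
  x = c(1, 2, 3, 4, 5, 6),
  y = c(2, 4, 6, 8, 10, 12)
)
# Grouping factor: first three rows in g1, last three in g2.
fact <- factor(c("g1", "g1", "g1", "g2", "g2", "g2"))

# The function receives each group's rows as a data frame d,
# so it can use several columns at once.
sums <- by(dfrm, fact, function(d) sum(d$x * d$y))

print(sums)
```

Because the function sees every column of the group, by suits per-group computations that need more than one variable, such as fitting a model to each group.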
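mapply(f, vec1, vec2, …, vecN): vectorize a function f of N arguments that is not itself vectorized. mapply calls f once per position, taking each argument from the corresponding elements of the vectors.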
mapply(rep, 1:4, 4:1)
## [[1]]
## [1] 1 1 1 1
## 
## [[2]]
## [1] 2 2 2
## 
## [[3]]
## [1] 3 3
## 
## [[4]]
## [1] 4
mapply(rep, times = 1:4, x = 4:1)
## [[1]]
## [1] 4
## 
## [[2]]
## [1] 3 3
## 
## [[3]]
## [1] 2 2 2
## 
## [[4]]
## [1] 1 1 1 1
gcd <- function(a, b) {
  if (b == 0) return(a) else return(gcd(b, a %% b))
}
mapply(gcd, c(1, 2, 3), c(9, 6, 3))
## [1] 1 2 3
References
R Cookbook, Chapter 6