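R offers a rich set of data structures, but once the data gets complicated you usually end up working with a list. A list is hard to export as text, though, so you need a way to quickly flatten a complex list into a data frame. The function introduced here for that job is do.call.
Here is the official documentation for do.call: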
do.call {base}    R Documentation

Execute a Function Call

Description
do.call constructs and executes a function call from a name or a function and a list of arguments to be passed to it.

Usage
do.call(what, args, quote = FALSE, envir = parent.frame())

Arguments
what: either a function or a non-empty character string naming the function to be called.
args: a list of arguments to the function call. The names attribute of args gives the argument names.
quote: a logical value indicating whether to quote the arguments.
envir: an environment within which to evaluate the call. This will be most useful if what is a character string and the arguments are symbols or quoted expressions.

Details
If quote is FALSE, the default, then the arguments are evaluated (in the calling environment, not in envir). If quote is TRUE then each argument is quoted (see quote) so that the effect of argument evaluation is to remove the quotes – leaving the original arguments unevaluated when the call is constructed.
The behavior of some functions, such as substitute, will not be the same for functions evaluated using do.call as if they were evaluated from the interpreter. The precise semantics are currently undefined and subject to change.

Value
The result of the (evaluated) function call.

Warning
This should not be used to attempt to evade restrictions on the use of .Internal and other non-API calls.
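Put simply, do.call executes a function whose arguments are stored in a list: each element of the list becomes one argument of the call.
Consider the following example: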
> tmp <- data.frame('letter' = letters[1:10], 'number' = 1:10, 'value' = c('+','-'))
> tmp
   letter number value
1       a      1     +
2       b      2     -
3       c      3     +
4       d      4     -
5       e      5     +
6       f      6     -
7       g      7     +
8       h      8     -
9       i      9     +
10      j     10     -
> tmp[[1]]
 [1] a b c d e f g h i j
> tmp[[2]]
 [1]  1  2  3  4  5  6  7  8  9 10
> tmp[[3]]
 [1] + - + - + - + - + -
> do.call("paste", c(tmp, sep = ""))
 [1] "a1+"  "b2-"  "c3+"  "d4-"  "e5+"  "f6-"  "g7+"  "h8-"  "i9+"  "j10-"
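Here tmp was created with the data.frame function, but underneath it is still a list; the [[ ]] operator is used above to display its three elements. As the output shows, do.call passes the three elements of tmp (three vectors), along with sep = "", as the arguments of paste. The same example can also be written like this: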
> paste(tmp[[1]],tmp[[2]],tmp[[3]], sep = "")
 [1] "a1+"  "b2-"  "c3+"  "d4-"  "e5+"  "f6-"  "g7+"  "h8-"  "i9+"  "j10-"
As you can see, the two results are exactly the same.
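To see why c(tmp, sep = "") works, it can help to inspect what it builds. The sketch below rebuilds tmp and shows that the combined object is a plain list whose names become argument names. The stringsAsFactors = FALSE setting is an assumption added here so that the columns are character vectors in every R version; it is not part of the original session.

```r
# Rebuild the data frame from the example; stringsAsFactors = FALSE is an
# assumption added here so the columns are character in every R version.
tmp <- data.frame(letter = letters[1:10], number = 1:10,
                  value = c("+", "-"), stringsAsFactors = FALSE)

# c() on a data frame plus a named value yields an ordinary list:
# the three columns followed by the element named "sep".
args <- c(tmp, sep = "")
print(class(args))   # "list"
print(names(args))   # "letter" "number" "value" "sep"

# do.call passes named list elements as named arguments, so this call is
# paste(letter = ..., number = ..., value = ..., sep = "").
via_do_call <- do.call("paste", args)
direct <- paste(tmp[[1]], tmp[[2]], tmp[[3]], sep = "")
print(identical(via_do_call, direct))   # TRUE
```

Because sep is matched by name, it is treated as the separator and not as a fourth vector to paste.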
Here is another example:
> number_add <- list(101:110, 1:10)
> number_add
[[1]]
 [1] 101 102 103 104 105 106 107 108 109 110

[[2]]
 [1]  1  2  3  4  5  6  7  8  9 10

> add <- function(x,y) {x + y}
> add
function(x,y) {x + y}
> do.call(add, number_add)
 [1] 102 104 106 108 110 112 114 116 118 120
> add(number_add[[1]], number_add[[2]])
 [1] 102 104 106 108 110 112 114 116 118 120
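One detail worth making explicit is that the list has to fit the function's signature. The sketch below reuses the add function from the example; the named list and the three-element list are hypothetical inputs added purely for illustration.

```r
add <- function(x, y) {x + y}

# Named elements are matched to parameters by name, so order does not matter.
by_name <- do.call(add, list(y = 1:10, x = 101:110))
print(by_name)

# `what` can be the function itself or its name as a character string.
by_string <- do.call("add", list(101:110, 1:10))
print(identical(by_name, by_string))   # TRUE

# A list with more elements than the function has parameters is an error,
# because it amounts to calling add(1, 2, 3).
too_many <- tryCatch(do.call(add, list(1, 2, 3)),
                     error = function(e) conditionMessage(e))
print(too_many)
```

Wrapping the failing call in tryCatch keeps the script running and captures the error message as a string.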
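Finally, back to the problem from the beginning. Suppose we have a list whose elements are data frames with identical columns, and we need to merge it into one data frame and write that out as a text file. It can be done like this: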
> list1
[[1]]
  up down number
1  A    a      1
2  B    b      2
3  C    c      3
4  D    d      4
5  E    e      5

[[2]]
  up down number
1  A    a      1
2  B    b      2
3  C    c      3
4  D    d      4
5  E    e      5

[[3]]
  up down number
1  A    a      1
2  B    b      2
3  C    c      3
4  D    d      4
5  E    e      5

> do.call("rbind",list1)
   up down number
1   A    a      1
2   B    b      2
3   C    c      3
4   D    d      4
5   E    e      5
6   A    a      1
7   B    b      2
8   C    c      3
9   D    d      4
10  E    e      5
11  A    a      1
12  B    b      2
13  C    c      3
14  D    d      4
15  E    e      5
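The session above does not show how list1 was built or how the result gets written to disk. Here is a minimal sketch of the whole round trip, assuming three identical data frames and a temporary tab-separated file. Both are illustrative choices and are not taken from the original session.

```r
# Hypothetical reconstruction of list1: three data frames with the same columns.
one_df <- data.frame(up = LETTERS[1:5], down = letters[1:5], number = 1:5,
                     stringsAsFactors = FALSE)
list1 <- list(one_df, one_df, one_df)

# Stack all elements row-wise; equivalent to
# rbind(list1[[1]], list1[[2]], list1[[3]]).
combined <- do.call("rbind", list1)
print(dim(combined))   # 15 3

# Write the combined data frame to a text file (a temp file is used here).
out_file <- tempfile(fileext = ".txt")
write.table(combined, file = out_file, sep = "\t",
            quote = FALSE, row.names = FALSE)

# Read it back to confirm that the file holds the same 15 rows.
check <- read.table(out_file, header = TRUE, sep = "\t",
                    stringsAsFactors = FALSE)
print(nrow(check))   # 15
```

The advantage of do.call over writing rbind by hand is that the same line works no matter how many data frames the list contains.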
One more recommendation: the apply family of functions is also very practical, and interested readers can look up the relevant material.
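The apply family pairs naturally with do.call: lapply returns a list, and do.call can then combine it. A small sketch follows, where make_block is a hypothetical helper invented for this illustration.

```r
# Hypothetical helper: build a small data frame for one group id.
make_block <- function(id) {
  data.frame(group = id, value = (1:3) * id)
}

# lapply returns a list with one data frame per id ...
blocks <- lapply(1:4, make_block)

# ... and do.call("rbind", ...) stacks them into a single data frame.
stacked <- do.call("rbind", blocks)
print(stacked)

# sapply simplifies the per-element results into a plain vector.
print(sapply(blocks, nrow))   # 3 3 3 3
```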