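Have you really mastered the stem() function?
The stem() function is called as follows:
stem(x, scale = 1, width = 80, atom = 1e-08)
Here x is the data vector.
scale controls the length of the stem-and-leaf plot.
width is the desired width of the plot.
atom is a tolerance. For some data sets, choosing scale=2 splits the ten unit digits into two segments, 0~4 in one and 5~9 in the other; how the stems are actually split depends on the range of the data.
In practice, though, after repeated experiments I found that width is best set to a fairly large number. It represents neither the range of the data values, nor the length of the longest leaf, nor the total number of data points.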
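To see what width does in practice, here is a minimal sketch. It assumes (from experimenting, not from the documentation) that width limits how many leaves are printed on each row and that the overflow is summarized as a "+N" suffix. The vector x and the helper stem_rows() are made up for this illustration.

```r
# 300 values: 60 copies each of 1..5, so every occupied stem has many leaves.
x <- rep(1:5, each = 60)

# capture.output() collects the printed plot as a character vector.
wide   <- capture.output(stem(x, width = 80))
narrow <- capture.output(stem(x, width = 30))

# Hypothetical helper: keep only the "<stem> | <leaves>" rows, not the header.
stem_rows <- function(out) grep("^ *-?[0-9]+ \\|", out, value = TRUE)

# Longest row under each setting; the narrow plot should be shorter.
max(nchar(stem_rows(wide)))
max(nchar(stem_rows(narrow)))

# Rows whose leaves did not fit end in "+<count>".
grep("\\+[0-9]+$", narrow, value = TRUE)
```

If this reading is right, a small width silently hides leaves, which would explain why a larger value is the safer choice.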
With the default scale, the numbers left of the bar go up by two, so anything after the 4 is forty-something or fifty-something:
> d <- c(60,85,72,59,37,75,93,7,98,63,41,90,5,17,97)
> stem(d,scale=1)
The decimal point is 1 digit(s) to the right of the |
0 | 577
2 | 7
4 | 19
6 | 0325
8 | 50378
Using scale=2, the numbers left of the bar go up by one, so you can now reconstruct your input exactly, since the input values are integers:
> stem(d,scale=2)
The decimal point is 1 digit(s) to the right of the |
0 | 57
1 | 7
2 |
3 | 7
4 | 1
5 | 9
6 | 03
7 | 25
8 | 5
9 | 0378
Going further, you can even split each decade into its first and second five:
> stem(d,scale=4)
The decimal point is 1 digit(s) to the right of the |
0 | 57
1 |
1 | 7
2 |
2 |
3 |
3 | 7
4 | 1
4 |
5 |
5 | 9
6 | 03
6 |
7 | 2
7 | 5
8 |
8 | 5
9 | 03
9 | 78
Stem plots do not always guarantee you can reproduce the data by reversing the process, and that's not what they're for.
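To compare the three settings without reading each plot, one can count the printed stem rows. This is only a sketch: count_stem_rows() is a hypothetical helper written for this post, and it relies on the rows being printed in the "<stem> | <leaves>" shape shown above.

```r
d <- c(60, 85, 72, 59, 37, 75, 93, 7, 98, 63, 41, 90, 5, 17, 97)

# Hypothetical helper: capture the printed plot and count the lines that
# look like "<stem> | <leaves>" (the header line does not match).
count_stem_rows <- function(x, scale) {
  out <- capture.output(stem(x, scale = scale))
  sum(grepl("^ *-?[0-9]+ \\|", out))
}

rows <- vapply(c(1, 2, 4), function(s) count_stem_rows(d, s), integer(1))
names(rows) <- paste0("scale=", c(1, 2, 4))
rows
```

For the outputs listed above this gives 5, 10 and 19 rows: a larger scale means more stems and narrower bins.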
This is probably more of a Stack Overflow question than a CV question, because it focuses on how R works and why, rather than the statistical aspects of stem and leaf plots. Nonetheless...
The way the function is coded is designed to shorten the length of the output, so that it better fits in the console. Few people, I believe, find that terribly helpful, or at least I don't. Just always remember to start with scale=2, and you may have to play with it further or adjust the width argument. Also, know that there is a fancier version, stem.leaf(), in Rcmdr.
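Summary:
R's stem function is a rather poor design. It was built to keep the console output as short as possible (otherwise the console would not be wide enough), so it runs into trouble when the gaps between data values are large: the stems go up in jumps. That is why you usually need to set scale. The larger scale is, the more stems you get and the higher the precision. If scale is small, the function will even round your data for you, which lowers the precision.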
> test <- c(57,122,1000)
> stem(test, scale = 2)
The decimal point is 2 digit(s) to the right of the |
0 | 6
1 | 2
2 |
3 |
4 |
5 |
6 |
7 |
8 |
9 |
10 | 0
> stem(test, scale = 10)
The decimal point is 1 digit(s) to the right of the |
4 | 7
6 |
8 |
10 |
12 | 2
14 |
16 |
18 |
20 |
22 |
24 |
26 |
28 |
30 |
32 |
34 |
36 |
38 |
40 |
42 |
44 |
46 |
48 |
50 |
52 |
54 |
56 |
58 |
60 |
62 |
64 |
66 |
68 |
70 |
72 |
74 |
76 |
78 |
80 |
82 |
84 |
86 |
88 |
90 |
92 |
94 |
96 |
98 |
100 | 0
> stem(test, scale = 20)
The decimal point is 1 digit(s) to the right of the |
5 | 7
6 |
7 |
8 |
9 |
10 |
11 |
12 | 2
13 |
14 |
15 |
16 |
17 |
18 |
19 |
20 |
21 |
22 |
23 |
24 |
25 |
26 |
27 |
28 |
29 |
30 |
31 |
32 |
33 |
34 |
35 |
36 |
37 |
38 |
39 |
40 |
41 |
42 |
43 |
44 |
45 |
46 |
47 |
48 |
49 |
50 |
51 |
52 |
53 |
54 |
55 |
56 |
57 |
58 |
59 |
60 |
61 |
62 |
63 |
64 |
65 |
66 |
67 |
68 |
69 |
70 |
71 |
72 |
73 |
74 |
75 |
76 |
77 |
78 |
79 |
80 |
81 |
82 |
83 |
84 |
85 |
86 |
87 |
88 |
89 |
90 |
91 |
92 |
93 |
94 |
95 |
96 |
97 |
98 |
99 |
100 | 0
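The rounding in the scale = 2 output above can also be checked in code. This is a rough sketch, assuming the first row is printed as "0 | 6" as shown above and that the decimal point is 2 digits to the right of the bar, so each leaf stands for tens. The variable names are my own.

```r
test <- c(57, 122, 1000)
out <- capture.output(stem(test, scale = 2))

# Keep only the "<stem> | <leaves>" rows and look at the first one.
stem_rows <- grep("^ *[0-9]+ \\|", out, value = TRUE)
stem_rows[1]

# Split "0 | 6" into its stem and its first leaf.
parts <- strsplit(trimws(stem_rows[1]), " | ", fixed = TRUE)[[1]]
stem_value <- as.numeric(parts[1])
first_leaf <- as.numeric(substr(parts[2], 1, 1))

# Assumption: decimal point 2 digits to the right, so leaves count tens.
shown <- (stem_value * 10 + first_leaf) * 10
shown            # the value the plot displays
shown - test[1]  # rounding error relative to the real 57
```

The plot shows 60 where the data say 57. With scale = 20 the same value appears as 5 | 7, so nothing is lost.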
