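// This post was migrated from my old blog. It was written on May 27, 2015 and does not represent my current views. Original address: http://blog.sina.com.cn/s/blog_6022c4720102w1s8.html
UPDATE (20180827): code rewritten.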
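1. Preface
Because the Trie and the AC automaton (Aho-Corasick automaton) are so closely related, I want to cover them in one go, haha... Readers of my earlier posts already know that an AC automaton is essentially KMP run on a Trie.
2. Trie
A Trie is a letter tree. It is a multi-way tree in which every node stands for one letter. The root is a symbolic node (it exists but carries no letter). The tree is built downward from the root, and each node has at most 26 children, one per lowercase letter, so a string is stored as the path from the root to the node where the string ends. This gives us a quick and convenient way to store strings. The applications speak for themselves: storing, counting, sorting, and looking up large numbers of strings. Since the idea is simple, I won't dwell on it: make up a few strings, trace them through the figure, read the code, and it will click.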

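As the figure shows, this Trie stores four strings: say, she, shr, and her. There is one small issue: when building the tree, in the worst case every node has all 26 children, so the space cost can be large. Allocating nodes through pointers only when they are actually needed can save space compared with reserving a worst-case array up front.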
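To make the idea concrete, here is a minimal sketch of a Trie whose nodes are allocated on demand through pointers. The names `TrieNode`, `Trie`, `insert`, and `count` are my own for illustration and do not come from the solution later in this post, and the input is assumed to contain only 'a'-'z'. Each node still holds 26 slots, so the saving is only that unused nodes are never created.

```cpp
#include <array>
#include <memory>
#include <string>

// Hypothetical node type: one slot per lowercase letter, null until needed.
struct TrieNode {
    std::array<std::unique_ptr<TrieNode>, 26> child;
    int count = 0;  // how many inserted words end exactly at this node
};

struct Trie {
    TrieNode root;  // symbolic root: it exists but carries no letter

    // Walk down from the root, creating missing nodes along the way.
    void insert(const std::string& word) {
        TrieNode* cur = &root;
        for (char c : word) {
            std::unique_ptr<TrieNode>& next = cur->child[c - 'a'];
            if (!next) next.reset(new TrieNode());
            cur = next.get();
        }
        ++cur->count;
    }

    // Number of times `word` was inserted (0 if it was never inserted).
    int count(const std::string& word) const {
        const TrieNode* cur = &root;
        for (char c : word) {
            cur = cur->child[c - 'a'].get();
            if (!cur) return 0;
        }
        return cur->count;
    }
};
```

After inserting say, she, shr, and her, `count("she")` is 1 while `count("sh")` is 0: the node for "sh" exists, but no word ends there.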
3. Building the fail pointers (KMP)
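I have seen many analyses of the AC automaton online, but they all look suspiciously alike (Ctrl+C/V, perhaps?). The fail pointer tells us where to jump when the current character fails to match: to the node whose string is the longest proper suffix of what has been matched so far that is also a prefix of some pattern, and matching continues from there. Just as in KMP, when the current character fails to match, the AC automaton follows the fail pointer. It follows that after a jump, the string at the new node is a prefix of some pattern and a suffix of the string matched before the jump. Moreover, the depth of the new position (the number of matched characters) is always smaller than the depth of the node we jumped from.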
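The definition can be checked by brute force before worrying about how to compute it efficiently. The sketch below is only an illustration of the definition, not how the automaton is built in practice; `failTarget` is a hypothetical helper name. It returns the string that the fail pointer of the node spelling `word` refers to, with the empty string standing for the root.

```cpp
#include <string>
#include <vector>

// Longest proper suffix of `word` that is a prefix of some pattern.
// An empty result means the fail pointer goes back to the root.
std::string failTarget(const std::string& word,
                       const std::vector<std::string>& patterns) {
    // start = 1 skips the whole word, so only proper suffixes are tried,
    // longest first.
    for (std::size_t start = 1; start < word.size(); ++start) {
        std::string suffix = word.substr(start);
        for (const std::string& p : patterns) {
            // True when p begins with `suffix`.
            if (p.compare(0, suffix.size(), suffix) == 0) return suffix;
        }
    }
    return "";
}
```

With the patterns say, she, shr, her, the node "she" fails to "he" (a prefix of her), and "sh" fails to "h". The result is always shorter than `word`, which is the depth property stated above.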
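Once the Trie is built, the fail pointers can be computed with a BFS. We start by putting the root into the queue. A mismatch on the very first character means matching has to start over, so the fail pointer of every node at depth 1 points to the root. For a deeper node reached from its parent by some letter, we follow the parent's fail chain until we find a node that has a child for the same letter, and point to that child; if there is none, we point to the root. The result is shown in the figure.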
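The BFS described above can be sketched as follows. This is a small standalone version with my own naming (`Node`, `Automaton`, `buildFail`), separate from the contest solution later in this post. It assumes lowercase input, uses node 0 as the root, and uses 0 in `child` to mean "no child".

```cpp
#include <queue>
#include <string>
#include <vector>

struct Node {
    int child[26] = {};  // 0 means the child does not exist
    int fail = 0;        // fail pointer, defaults to the root (node 0)
};

struct Automaton {
    std::vector<Node> t;
    Automaton() : t(1) {}  // node 0 is the symbolic root

    // Insert a word and return the index of the node where it ends.
    int insert(const std::string& w) {
        int cur = 0;
        for (char ch : w) {
            int c = ch - 'a';
            if (!t[cur].child[c]) {
                t[cur].child[c] = static_cast<int>(t.size());
                t.emplace_back();
            }
            cur = t[cur].child[c];
        }
        return cur;
    }

    // BFS from the root: a node's fail pointer is final before its
    // children are processed, because it is shallower than they are.
    void buildFail() {
        std::queue<int> q;
        q.push(0);
        while (!q.empty()) {
            int u = q.front();
            q.pop();
            for (int c = 0; c < 26; ++c) {
                int v = t[u].child[c];
                if (!v) continue;
                int f = 0;  // depth-1 nodes always fail to the root
                if (u != 0) {
                    int k = t[u].fail;
                    // Climb the fail chain until some node has a c-child.
                    while (k != 0 && !t[k].child[c]) k = t[k].fail;
                    f = t[k].child[c];  // stays 0 (root) if nothing matched
                }
                t[v].fail = f;
                q.push(v);
            }
        }
    }
};
```

For the four strings say, she, shr, her, the node for "she" ends up pointing at the node for "he", and "sh" points at "h"; every other node points back to the root.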

4. Example problem
Keywords Search [ HDU 2222 ]
In the modern time, Search engine came into the life of everybody like Google, Baidu, etc.
Wiskey also wants to bring this feature to his image retrieval system.
Every image have a long description, when users type some keywords to find the image, the system will match the keywords with description of image and show the image which the most keywords be matched.
To simplify the problem, giving you a description of image, and some keywords, you should tell me how many keywords will be match.
Input format
First line will contain one integer means how many cases will follow by.
Each case will contain two integers N means the number of keywords and N keywords follow. (N <= 10000)
Each keyword will only contains characters 'a'-'z', and the length will be not longer than 50.
The last line is the description, and the length will be not longer than 1000000.
Output format
Print how many keywords are contained in the description.
Sample input
1
5
she
he
say
shr
her
yasherhs
Sample output
3
Code:
#include <cstdio>
#include <cstring>

#define MAXM 1000005

// T: test cases, n: keywords, tot: nodes used so far, q: BFS queue,
// ls: keyword length, l: description length, r: root of the current case
int T, n, tot, q[MAXM], ls, l, r;
char s[55], ch[MAXM];

struct Tree {
    int a[26];  // children, 0 means absent
    int x;      // number of keywords ending here, -1 once counted
    int f;      // fail pointer
} t[MAXM];

// Insert the keyword in s into the Trie rooted at r.
void insert() {
    int o = r;
    for (int i = 0; i < ls; i++) {
        int x = s[i] - 'a';
        if (!t[o].a[x]) t[o].a[x] = ++tot;
        o = t[o].a[x];
    }
    t[o].x++;
}

// Compute the fail pointers with a BFS starting from the root.
void getf() {
    int head = 1, tail = 2;
    q[1] = r;
    while (head != tail) {
        int o = q[head];
        for (int i = 0; i <= 25; i++) {
            int v = t[o].a[i];
            if (!v) continue;
            if (o == r) t[v].f = r;  // depth-1 nodes fail to the root
            else {
                // Climb the parent's fail chain until a node has an i-child.
                // The root's own f is 0, which ends the loop.
                int of = t[o].f;
                while (of) {
                    if (t[of].a[i]) {
                        t[v].f = t[of].a[i];
                        break;
                    }
                    of = t[of].f;
                }
                if (!of) t[v].f = r;
            }
            q[tail++] = v;
        }
        head++;
    }
}

// Scan the description in ch and count the keywords that occur in it.
int find() {
    int ans = 0, x = r;
    for (int i = 0; i < l; i++) {
        int o = ch[i] - 'a';
        while (!t[x].a[o] && x != r) x = t[x].f;
        x = t[x].a[o] ? t[x].a[o] : r;
        // Walk the whole fail chain, not only while x is nonzero: a node
        // where no keyword ends may still have a keyword further up.
        // -1 marks a node whose chain has already been collected.
        for (int y = x; y != r && t[y].x != -1; y = t[y].f) {
            ans += t[y].x;
            t[y].x = -1;
        }
    }
    return ans;
}

int main() {
    scanf("%d", &T);
    for (int i = 1; i <= T; i++) {
        scanf("%d", &n), r = ++tot;  // fresh root for every test case
        for (int j = 1; j <= n; j++) scanf("%s", s), ls = strlen(s), insert();
        getf();
        // Defensive: any node still without a fail pointer goes to the root.
        for (int j = r + 1; j <= tot; j++) if (!t[j].f) t[j].f = r;
        scanf("%s", ch), l = strlen(ch);
        printf("%d\n", find());
    }
    return 0;
}