Trie 樹及Java實現

本文轉載自查看原文 2015-04-27 23:01 3208

來源於英文“retrieval”. Trie樹就是字符樹，其核心思想就是空間換時間。

舉個簡單的例子。
給你100000個長度不超過10的單詞。對於每一個單詞，我們要判斷他出沒出現過，如果出現了，第一次出現第幾個位置。
這題當然可以用hash來，但是我要介紹的是trie樹。在某些方面它的用途更大。比如說對於某一個單詞，我要詢問它的前綴是否出現過。這樣hash就不好搞了，而用trie還是很簡單。

現在回到例子中，如果我們用最傻的方法，對於每一個單詞，我們都要去查找它前面的單詞中是否有它。那么這個算法的復雜度就是O(n^2)。顯然對於100000的范圍難以接受。現在我們換個思路想。假設我要查詢的單詞是abcd，那么在他前面的單詞中，以b，c，d，f之類開頭的我顯然不必考慮。而只要找以a開頭的中是否存在abcd就可以了。同樣的，在以a開頭中的單詞中，我們只要考慮以b作為第二個字母的……這樣一個樹的模型就漸漸清晰了……

假設有b，abc，abd，bcd，abcd，efg，hii這6個單詞，我們構建的樹就是這樣的。

對於每一個節點，從根遍歷到他的過程就是一個單詞，如果這個節點被標記為紅色，就表示這個單詞存在，否則不存在。
那么，對於一個單詞，我只要順着他從根走到對應的節點，再看這個節點是否被標記為紅色就可以知道它是否出現過了。把這個節點標記為紅色，就相當於插入了這個單詞。

我們可以看到，trie樹每一層的節點數是26^i級別的。所以為了節省空間。我們用動態鏈表，或者用數組來模擬動態。空間的花費，不會超過單詞數×單詞長度。(轉自一大牛)

Trie樹的java代碼 實現如下：

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;


/** *//**
 * A word trie which can only deal with 26 alphabeta letters.
 * @author Leeclipse
 * @since 2007-11-21
 */

public class Trie{
 
   private Vertex root;//一個Trie樹有一個根節點

    //內部類
    protected class Vertex{//節點類
        protected int words;
        protected int prefixes;
        protected Vertex[] edges;//每個節點包含26個子節點(類型為自身)
        Vertex() {
            words = 0;
            prefixes = 0;
            edges = new Vertex[26];
            for (int i = 0; i < edges.length; i++) {
                edges[i] = null;
            }
        }
    }

  
    public Trie () {
        root = new Vertex();
    }

   
    /** *//**
     * List all words in the Trie.
     * 
     * @return
     */

    public List< String> listAllWords() {
       
        List< String> words = new ArrayList< String>();
        Vertex[] edges = root.edges;
       
        for (int i = 0; i < edges.length; i++) {
            if (edges[i] != null) {
                     String word = "" + (char)('a' + i);
                depthFirstSearchWords(words, edges[i], word);
            }
        }        
        return words;
    }

     /** *//**
     * Depth First Search words in the Trie and add them to the List.
     * 
     * @param words
     * @param vertex
     * @param wordSegment
     */

    private void depthFirstSearchWords(List words, Vertex vertex, String wordSegment) {
        Vertex[] edges = vertex.edges;
        boolean hasChildren = false;
        for (int i = 0; i < edges.length; i++) {
            if (edges[i] != null) {
                hasChildren = true;
                String newWord = wordSegment + (char)('a' + i);                
                depthFirstSearchWords(words, edges[i], newWord);
            }            
        }
        if (!hasChildren) {
            words.add(wordSegment);
        }
    }

    public int countPrefixes(String prefix) {
        return countPrefixes(root, prefix);
    }

    private int countPrefixes(Vertex vertex, String prefixSegment) {
        if (prefixSegment.length() == 0) { //reach the last character of the word
            return vertex.prefixes;
        }

        char c = prefixSegment.charAt(0);
        int index = c - 'a';
        if (vertex.edges[index] == null) { // the word does NOT exist
            return 0;
        } else {

            return countPrefixes(vertex.edges[index], prefixSegment.substring(1));

        }        

    }

    public int countWords(String word) {
        return countWords(root, word);
    }    

    private int countWords(Vertex vertex, String wordSegment) {
        if (wordSegment.length() == 0) { //reach the last character of the word
            return vertex.words;
        }

        char c = wordSegment.charAt(0);
        int index = c - 'a';
        if (vertex.edges[index] == null) { // the word does NOT exist
            return 0;
        } else {
            return countWords(vertex.edges[index], wordSegment.substring(1));

        }        

    }

    
    /** *//**
     * Add a word to the Trie.
     * 
     * @param word The word to be added.
     */

    public void addWord(String word) {
        addWord(root, word);
    }

    
    /** *//**
     * Add the word from the specified vertex.
     * @param vertex The specified vertex.
     * @param word The word to be added.
     */

    private void addWord(Vertex vertex, String word) {
       if (word.length() == 0) { //if all characters of the word has been added
            vertex.words ++;
        } else {
            vertex.prefixes ++;
            char c = word.charAt(0);
            c = Character.toLowerCase(c);
            int index = c - 'a';
            if (vertex.edges[index] == null) { //if the edge does NOT exist
                vertex.edges[index] = new Vertex();
            }

            addWord(vertex.edges[index], word.substring(1)); //go the the next character
        }
    }

    public static void main(String args[])  //Just used for test
    {
    Trie trie = new Trie();
    trie.addWord("China");
    trie.addWord("China");
    trie.addWord("China");

    trie.addWord("crawl");
    trie.addWord("crime");
    trie.addWord("ban");
    trie.addWord("China");

    trie.addWord("english");
    trie.addWord("establish");
    trie.addWord("eat");
    System.out.println(trie.root.prefixes);
     System.out.println(trie.root.words);


   
     List< String> list = trie.listAllWords();
     Iterator listiterator = list.listIterator();
     
     while(listiterator.hasNext())
     {
          String s = (String)listiterator.next();
          System.out.println(s);
     }

   
     int count = trie.countPrefixes("ch");
     int count1=trie.countWords("china");
     System.out.println("the count of c prefixes:"+count);
     System.out.println("the count of china countWords:"+count1);

 
    }
}
運行:
C:\test>java   Trie
10
0
ban
china
crawl
crime
eat
english
establish
the count of c prefixes:4
the count of china countWords:4

免責聲明！

本站轉載的文章為個人學習借鑒使用，本站對版權不負任何法律責任。如果侵犯了您的隱私權益，請聯系本站郵箱yoyou2525@163.com刪除。

猜您在找 Trie樹實現[ java ] 雙數組Trie樹(DoubleArrayTrie)Java實現 java實現的Trie樹數據結構 Trie樹的C++實現 trie樹（前綴樹）詳解——PHP代碼實現 trie樹-前綴樹從Trie樹到雙數組Trie樹 [原創] Trie樹 php 實現敏感詞過濾字典樹（Trie tree）【trie樹】Xor Sum

Trie 樹 及Java實現

免責聲明！

Trie 樹及Java實現