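Stanford CoreNLP bundles a wide range of features. The source code is on GitHub (repository: Stanford CoreNLP), so you can download it and take a look if you need to.
The main features are all described on the project website. The original text reads: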
Choose Stanford CoreNLP if you need:
- An integrated toolkit with a good range of grammatical analysis tools
- Fast, reliable analysis of arbitrary texts
- The overall highest quality text analytics
- Support for a number of major (human) languages
- Interfaces available for various major modern programming languages
- Ability to run as a simple web service
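The last point, running CoreNLP as a web service, can be sketched from the client side. This is a minimal sketch, assuming a CoreNLP server has already been started separately and is listening on `http://localhost:9000`, and that it accepts its settings as a JSON-encoded `properties` query parameter with the raw text as the POST body. The helper names `build_request_url` and `annotate` are hypothetical and chosen only for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_request_url(base_url, annotators, output_format="json"):
    """Build the request URL: the pipeline settings travel as a
    JSON-encoded `properties` query parameter."""
    props = {"annotators": ",".join(annotators), "outputFormat": output_format}
    query = urlencode({"properties": json.dumps(props)})
    return base_url.rstrip("/") + "/?" + query


def annotate(text, annotators, base_url="http://localhost:9000"):
    """POST raw text to a running server and return the parsed JSON reply.

    Assumes the server is reachable at base_url; raises URLError otherwise.
    """
    url = build_request_url(base_url, annotators)
    request = Request(url, data=text.encode("utf-8"))  # data => POST
    with urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a server running, `annotate("Stanford is in California.", ["tokenize", "ssplit", "pos"])` should return a dictionary whose `sentences` entry holds the tokens of each sentence. Keeping URL construction in its own function makes the request format easy to inspect without a live server.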
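The table below lists the tools and how well each language is supported (English and Chinese have the best support). The rows cover tokenization/segmentation, sentence splitting, part-of-speech tagging, lemmatization, named entity recognition, parsing, sentiment analysis, coreference resolution, and so on.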
Annotator | ar | zh | en | fr | de | es |
---|---|---|---|---|---|---|
Tokenize / Segment | ✔ | ✔ | ✔ | ✔ | | ✔ |
Sentence Split | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
Part of Speech | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
Lemma | | | ✔ | | | |
Named Entities | | ✔ | ✔ | | ✔ | ✔ |
Constituency Parsing | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
Dependency Parsing | | ✔ | ✔ | ✔ | ✔ | |
Sentiment Analysis | | | ✔ | | | |
Mention Detection | | ✔ | ✔ | | | |
Coreference | | ✔ | ✔ | | | |
Open IE | | | ✔ | | | |
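To turn the table into something a script can use, the rows can be mapped to annotator names. The following is a rough sketch under stated assumptions: the short names (`tokenize`, `ssplit`, `pos`, and so on) are the annotator identifiers as I understand them from the CoreNLP documentation, only the two best-supported languages are covered, and the set of rows treated as unavailable for Chinese is an assumption read off the table. `annotators_for` is a hypothetical helper, and it does not check dependencies between annotators (for example, `openie` also needs `natlog`).

```python
# Table row -> annotator name (names are an assumption, see the text above).
ANNOTATOR_NAMES = {
    "Tokenize / Segment": "tokenize",
    "Sentence Split": "ssplit",
    "Part of Speech": "pos",
    "Lemma": "lemma",
    "Named Entities": "ner",
    "Constituency Parsing": "parse",
    "Dependency Parsing": "depparse",
    "Sentiment Analysis": "sentiment",
    "Mention Detection": "mention",
    "Coreference": "coref",
    "Open IE": "openie",
}

# Rows assumed to be unavailable per language; English covers every row.
UNSUPPORTED = {
    "en": set(),
    "zh": {"Lemma", "Sentiment Analysis", "Open IE"},
}


def annotators_for(lang):
    """Return the annotator names available for `lang`, in table order."""
    missing = UNSUPPORTED[lang]  # KeyError for languages not listed above
    return [name for row, name in ANNOTATOR_NAMES.items() if row not in missing]


print(",".join(annotators_for("zh")))
```

The joined string has the comma-separated shape that an `annotators` setting takes, but treat it as a lookup aid rather than a validated pipeline configuration.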