Natural-language understanding

Natural-language understanding (NLU) or natural-language interpretation (NLI) is a subtopic of natural-language processing in artificial intelligence that deals with machine reading comprehension.

Natural-language understanding is considered an AI-hard problem.

There is considerable commercial interest in the field because of its application to automated reasoning, machine translation, question answering, news-gathering, text categorization, voice-activation, archiving, and large-scale content analysis.

History

The program STUDENT, written in 1964 by Daniel Bobrow for his PhD dissertation at MIT, is one of the earliest known attempts at natural-language understanding by a computer.

Eight years after John McCarthy coined the term artificial intelligence, Bobrow's dissertation (titled Natural Language Input for a Computer Problem Solving System) showed how a computer could understand simple natural language input to solve algebra word problems.

A year later, in 1965, Joseph Weizenbaum at MIT wrote ELIZA, an interactive program that carried on a dialogue in English on any topic, the most popular being psychotherapy.

ELIZA worked by simple parsing and substitution of key words into canned phrases, and Weizenbaum sidestepped the problem of giving the program a database of real-world knowledge or a rich lexicon.

Yet ELIZA gained surprising popularity as a toy project and can be seen as a very early precursor to current commercial systems such as those used by Ask.com.

In 1969 Roger Schank at Stanford University introduced the conceptual dependency theory for natural-language understanding.

This model, partially influenced by the work of Sydney Lamb, was extensively used by Schank's students at Yale University, such as Robert Wilensky, Wendy Lehnert, and Janet Kolodner.

In 1970, William A. Woods introduced the augmented transition network (ATN) to represent natural language input.

Instead of phrase structure rules, ATNs used an equivalent set of finite-state automata that were called recursively. ATNs and their more general format, called "generalized ATNs", continued to be used for a number of years.
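
To illustrate the recursive-call idea, here is a minimal sketch in Python. It shows a plain recursive transition network, not Woods's full ATN (which additionally attached registers, tests, and actions to arcs), and the grammar, lexicon, and greedy traversal strategy are invented for the example:

```python
# Minimal recursive-transition-network sketch. Illustrative only: real
# ATNs add registers/tests on arcs and backtrack over alternatives.
LEXICON = {"the": "DET", "a": "DET", "dog": "N", "cat": "N", "saw": "V"}

# Each network is a tiny finite-state automaton: a map from state to
# (arc, next_state) pairs. An arc is either a lexical category to
# consume ("DET", "N", "V") or another network to call recursively.
NETWORKS = {
    "S":  {0: [("NP", 1)], 1: [("VP", 2)], 2: []},   # S  -> NP VP
    "NP": {0: [("DET", 1)], 1: [("N", 2)], 2: []},   # NP -> DET N
    "VP": {0: [("V", 1)], 1: [("NP", 2)], 2: []},    # VP -> V NP
}
FINAL = {"S": 2, "NP": 2, "VP": 2}

def traverse(net, words, pos):
    """Return the input position reached if `net` accepts a prefix of
    words[pos:], or None on failure (greedy, no backtracking)."""
    state = 0
    while True:
        if state == FINAL[net]:
            return pos
        for arc, nxt in NETWORKS[net][state]:
            if arc in NETWORKS:                          # recursive call
                sub = traverse(arc, words, pos)
                if sub is not None:
                    pos, state = sub, nxt
                    break
            elif pos < len(words) and LEXICON.get(words[pos]) == arc:
                pos, state = pos + 1, nxt
                break
        else:
            return None

words = "the dog saw a cat".split()
print(traverse("S", words, 0) == len(words))  # True: sentence accepted
```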

In 1971 Terry Winograd finished writing SHRDLU for his PhD thesis at MIT.

SHRDLU could understand simple English sentences in a restricted world of children's blocks to direct a robotic arm to move items. The successful demonstration of SHRDLU provided significant momentum for continued research in the field.

Winograd continued to be a major influence in the field with the publication of his book Language as a Cognitive Process.

At Stanford, Winograd would later advise Larry Page, who co-founded Google.

In the 1970s and 1980s the natural language processing group at SRI International continued research and development in the field.

A number of commercial efforts based on the research were undertaken, e.g., in 1982 Gary Hendrix formed Symantec Corporation originally as a company for developing a natural language interface for database queries on personal computers.

However, with the advent of mouse-driven graphical user interfaces, Symantec changed direction.

A number of other commercial efforts were started around the same time, e.g., Larry R. Harris at the Artificial Intelligence Corporation and Roger Schank and his students at Cognitive Systems Corp.

In 1983, Michael Dyer developed the BORIS system at Yale which bore similarities to the work of Roger Schank and W. G. Lehnert.

The third millennium saw the introduction of systems using machine learning for text classification, such as IBM Watson.

However, experts debate how much "understanding" such systems demonstrate: e.g., according to John Searle, Watson did not even understand the questions.

John Ball, cognitive scientist and inventor of Patom Theory, supports this assessment.

Natural language processing has made inroads for applications to support human productivity in service and ecommerce, but this has largely been made possible by narrowing the scope of the application.

There are thousands of ways to request something in a human language that still defies conventional natural language processing.

"To have a meaningful conversation with machines is only possible when we match every word to the correct meaning based on the meanings of the other words in the sentence – just like a 3-year-old does without guesswork."

Scope and context

The umbrella term "natural-language understanding" can be applied to a diverse set of computer applications, ranging from small, relatively simple tasks such as short commands issued to robots, to highly complex endeavors such as the full comprehension of newspaper articles or poetry passages.

Many real-world applications fall between the two extremes; for instance, text classification for the automatic analysis of emails and their routing to a suitable department in a corporation does not require an in-depth understanding of the text, but needs to deal with a much larger vocabulary and more diverse syntax than the management of simple queries to database tables with fixed schemata.
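
As a minimal sketch of such a shallow-but-broad task, the following Python toy routes emails by per-department keyword frequencies; the departments, training texts, and scoring scheme are invented purely for illustration, and a production system would use a proper learned classifier over a much larger vocabulary:

```python
from collections import Counter

# Toy training data: (email text, department) pairs, invented for the
# example.
TRAINING = [
    ("invoice payment overdue refund", "billing"),
    ("refund charge credit card invoice", "billing"),
    ("password reset login error crash", "support"),
    ("crash bug error cannot login", "support"),
]

# Per-department word-frequency tables.
counts = {}
for text, dept in TRAINING:
    counts.setdefault(dept, Counter()).update(text.split())

def route(email):
    """Pick the department whose training vocabulary best overlaps the
    email: crude bag-of-words scoring, no deep understanding of the text."""
    words = email.lower().split()
    return max(counts, key=lambda d: sum(counts[d][w] for w in words))

print(route("please refund the invoice you charged"))      # billing
print(route("the app shows an error and i cannot login"))  # support
```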

Throughout the years, various attempts at processing natural language or English-like sentences presented to computers have taken place at varying degrees of complexity.

Some attempts have not resulted in systems with deep understanding, but have helped overall system usability.

For example, Wayne Ratliff originally developed the Vulcan program with an English-like syntax to mimic the English speaking computer in Star Trek.

Vulcan later became the dBase system whose easy-to-use syntax effectively launched the personal computer database industry.

Systems with an easy-to-use or English-like syntax are, however, quite distinct from systems that use a rich lexicon and include an internal representation (often as first-order logic) of the semantics of natural language sentences.

Hence the breadth and depth of "understanding" aimed at by a system determine both the complexity of the system (and the implied challenges) and the types of applications it can deal with.

The "breadth" of a system is measured by the sizes of its vocabulary and grammar.

The "depth" is measured by the degree to which its understanding approximates that of a fluent native speaker.

At the narrowest and shallowest, English-like command interpreters require minimal complexity, but have a small range of applications.

Narrow but deep systems explore and model mechanisms of understanding, but they still have limited application.

Systems that attempt to understand the contents of a document such as a news release beyond simple keyword matching and to judge its suitability for a user are broader and require significant complexity, but they are still somewhat shallow.

Systems that are both very broad and very deep are beyond the current state of the art.

Components and architecture

Regardless of the approach used, most natural-language-understanding systems share some common components.

The system needs a lexicon of the language and a parser and grammar rules to break sentences into an internal representation.
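
As a deliberately tiny illustration of these components, the following sketch assumes the NLTK library (`pip install nltk`) and uses a toy lexicon and grammar to break a sentence into a tree-shaped internal representation; the grammar and vocabulary are invented for the example:

```python
import nltk

# Toy lexicon and grammar rules, invented for the example.
grammar = nltk.CFG.fromstring("""
    S   -> NP VP
    NP  -> DET N
    VP  -> V NP
    DET -> 'the' | 'a'
    N   -> 'dog' | 'cat'
    V   -> 'saw'
""")

# The parser applies the grammar rules to produce an internal
# representation of the sentence as a tree.
parser = nltk.ChartParser(grammar)
for tree in parser.parse("the dog saw a cat".split()):
    print(tree)
    # (S (NP (DET the) (N dog)) (VP (V saw) (NP (DET a) (N cat))))
```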

The construction of a rich lexicon with a suitable ontology requires significant effort, e.g., the WordNet lexicon required many person-years of effort.

The system also needs theory from semantics to guide the comprehension.

The interpretation capabilities of a language-understanding system depend on the semantic theory it uses.

Competing semantic theories of language have specific trade-offs in their suitability as the basis of computer-automated semantic interpretation.

These range from naive semantics or stochastic semantic analysis to the use of pragmatics to derive meaning from context.

Semantic parsers convert natural-language texts into formal meaning representations.
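
As a minimal sketch of the idea, the following Python toy maps a narrow family of commands onto predicate-logic-style forms; the patterns and predicate names are invented for illustration, and real semantic parsers are far more general:

```python
import re

# Toy semantic parser: pattern -> builder of a formal meaning
# representation. Patterns and predicates are invented for the example.
RULES = [
    (re.compile(r"move the (\w+) (\w+) onto the (\w+) (\w+)"),
     lambda m: f"move(block({m[2]},{m[1]}), block({m[4]},{m[3]}))"),
    (re.compile(r"is the (\w+) (\w+) on the (\w+) (\w+)"),
     lambda m: f"ask(on(block({m[2]},{m[1]}), block({m[4]},{m[3]})))"),
]

def semantic_parse(text):
    """Return a formal meaning representation for `text`, or None."""
    for pattern, build in RULES:
        match = pattern.match(text.lower())
        if match:
            return build(match)
    return None

print(semantic_parse("Move the red pyramid onto the green cube"))
# -> move(block(pyramid,red), block(cube,green))
```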

Advanced applications of natural-language understanding also attempt to incorporate logical inference within their framework.

This is generally achieved by mapping the derived meaning into a set of assertions in predicate logic, then using logical deduction to arrive at conclusions.
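
A toy sketch of that stage follows; the predicates and the single transitivity rule are invented for illustration, and real systems use full unification and resolution rather than this hand-written loop:

```python
# Forward-chaining deduction over predicate-logic-style assertions.
# Facts are tuples: ("on", "a", "b") stands for on(a, b).
facts = {("on", "a", "b"), ("on", "b", "c")}

def forward_chain(facts):
    """Apply two rules to a fixed point:
       on(X, Y) => above(X, Y)
       above(X, Y) and above(Y, Z) => above(X, Z)"""
    facts = set(facts)
    while True:
        new = set()
        for rel, x, y in facts:
            if rel == "on":
                new.add(("above", x, y))
        for r1, x, y in facts:
            for r2, y2, z in facts:
                if r1 == r2 == "above" and y == y2:
                    new.add(("above", x, z))
        if new <= facts:           # nothing new deduced: fixed point
            return facts
        facts |= new

print(("above", "a", "c") in forward_chain(facts))  # True: deduced
```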

Therefore, systems based on functional languages such as Lisp need to include a subsystem to represent logical assertions, while logic-oriented systems such as those using the language Prolog generally rely on an extension of the built-in logical representation framework.

The management of context in natural-language understanding can present special challenges.

A large variety of examples and counter examples have resulted in multiple approaches to the formal modeling of context, each with specific strengths and weaknesses.
