Creating a Syntax Highlighter for Ace
Creating a new syntax highlighter for Ace is extremely simple. You'll need to define two pieces of code: a new mode, and a new set of highlighting rules.
Where to Start
We recommend using the Ace Mode Creator when defining your highlighter. It lets you inspect your code's tokens and provides a live preview of the syntax highlighter in action.
Ace Mode Creator: https://ace.c9.io/tool/mode_creator.html
Defining a Mode
Every language needs a mode. A mode contains the paths to a language's syntax highlighting rules, indentation rules, and code folding rules. Without defining a mode, Ace won't know anything about the finer aspects of your language.
Here is the starter template we'll use to create a new mode:
What's going on here? First, you're defining the path to TextMode (more on this later). Then you're pointing the mode to your definitions for the highlighting rules, as well as your rules for code folding. Finally, you're setting everything up to find those rules, and exporting the Mode so that it can be consumed. That's it!
Regarding TextMode, you'll notice that it's only being used once: oop.inherits(Mode, TextMode);. If your new language depends on the rules of another language, you can choose to inherit the same rules, while expanding on them with your language's own requirements. For example, PHP inherits from HTML, since it can be embedded directly inside .html pages. You can either inherit from TextMode, or from any other existing mode, if it already relates to your language.
All Ace modes can be found in the lib/ace/mode folder.
Defining Syntax Highlighting Rules
The Ace highlighter can be considered to be a state machine. Regular expressions define the tokens for the current state, as well as the transitions into another state. Let's define mynew_highlight_rules.js, which our mode above uses.
All syntax highlighters start off looking something like this:
The token state machine operates on whatever is defined in this.$rules. The highlighter always begins at the start state, and progresses down the list, looking for a matching regex. When one is found, the resulting text is wrapped within a <span class="ace_<token>"> tag, where <token> is defined as the token property. Note that all tokens are preceded by the ace_ prefix when they're rendered on the page.
Once again, we're inheriting from TextHighlightRules here. We could choose to make this any other language set we want, if our new language requires previously defined syntaxes. For more information on extending languages, see "Extending Highlighters" below.
Defining Tokens
The Ace highlighting system is heavily inspired by the TextMate language grammar. Most tokens will follow the conventions of TextMate when naming grammars. A thorough (albeit incomplete) list of tokens can be found on the Ace Wiki.
Token list: https://github.com/ajaxorg/ace/wiki/Creating-or-Extending-an-Edit-Mode#commonTokens
For the complete list of tokens, see tool/tmtheme.js (https://github.com/ajaxorg/ace/blob/master/tool/tmtheme.js). It is possible to add new token names, but that is outside the scope of this document.
Multiple tokens can be applied to the same text by adding dots in the token, e.g. token: support.function wraps the text in a <span class="ace_support ace_function"> tag.
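The naming convention for dotted tokens can be sketched in a few lines. The helper tokenToClassName is hypothetical and only mirrors the rule described above; it is not Ace's rendering code.

```javascript
// Illustration only: mirrors the naming convention described above,
// not Ace's actual rendering code.
function tokenToClassName(token) {
    return token.split(".").map(function(part) {
        return "ace_" + part;
    }).join(" ");
}

var className = tokenToClassName("support.function");
// className is "ace_support ace_function"
```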
Defining Regular Expressions
Regular expressions can be given either as a RegExp or as a String.
If you're using a RegExp literal, remember to start and end it with the / character.
A caveat of using string-based regular expressions is that every \ character must be escaped. That means that even an innocuous regular expression containing \s or \w must actually be written with doubled backslashes (\\s, \\w).
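To make the difference concrete, here is a small sketch of one rule written both ways. The token name and the pattern are chosen for illustration; the last line only checks that the two forms compile to the same pattern.

```javascript
// The same rule written with a RegExp literal and with a string.
var literalRule = {
    token: "keyword",
    regex: /function\s*\(\w+\)/
};
var stringRule = {
    token: "keyword",
    regex: "function\\s*\\(\\w+\\)" // every \ is doubled inside the string
};

// Both describe the same pattern once the string is compiled.
var samePattern = new RegExp(stringRule.regex).source === literalRule.regex.source;
```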
Groupings
You can also include flat regexps, such as (var), or have matching groups, such as ((a+)(b+)). There is a strict requirement whereby matching groups must cover the entire matched string; thus, (hel)lo is invalid. If you want to create a non-matching (non-capturing) group, simply start the group with ?:; thus, (hel)(?:lo) is okay. You can, of course, create longer non-matching groups. For example:
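A short sketch of a rule that uses a non-matching group to hold alternatives; the token name follows the TextMate naming style and the pattern is illustrative.

```javascript
// (?:...) groups the alternatives without creating a capture group.
var booleanRule = {
    token: "constant.language.boolean",
    regex: /(?:true|false)\b/
};

var match = booleanRule.regex.exec("false");
// match[0] is "false" and match.length is 1: nothing was captured.
```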
For flat regular expression matches, token can be a String, or a Function that takes a single argument (the match) and returns a string token. When the regular expression contains groups, a token function should instead take the same number of arguments as there are groups, and return an array of tokens.
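Both function forms might look like the sketch below. The word lists, token names, and the assignment pattern are invented for this example.

```javascript
// Hypothetical word lists for a CSS-like language.
var colors = { red: true, green: true, blue: true };
var fonts = { arial: true, georgia: true, monospace: true };

var wordRule = {
    // Flat regex: the function receives the matched text, returns one token.
    token: function(value) {
        var word = value.toLowerCase();
        if (colors.hasOwnProperty(word)) return "support.constant.color";
        if (fonts.hasOwnProperty(word)) return "support.constant.fonts";
        return "text";
    },
    regex: "\\-?[a-zA-Z_][a-zA-Z0-9_\\-]*"
};

var assignmentRule = {
    // Grouped regex: one argument per group, one token per group.
    token: function(name, operator, value) {
        return ["variable", "keyword.operator", "constant.numeric"];
    },
    regex: "(\\w+)(\\s*=\\s*)(\\d+)"
};
```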
For grouped regular expressions, token can be a String, in which case all matched groups are given that same token.
More commonly, though, token is an Array (of the same length as the number of groups), whereby each group is given the token at the same position in the array. For a complicated regular expression, like one defining a function, that might look something like this:
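The two grouped forms can be sketched as follows; the patterns and token names are illustrative.

```javascript
// Every group receives the same token.
var propertyRule = {
    token: "identifier",
    regex: "(\\w+\\s*:)(\\w*)"
};

// One token per group, matched by position.
var functionRule = {
    token: ["storage.type", "text", "entity.name.function"],
    regex: "(function)(\\s+)([a-zA-Z_][a-zA-Z0-9_]*\\b)"
};

var groups = new RegExp(functionRule.regex).exec("function greet").slice(1);
// groups is ["function", " ", "greet"], lining up with the three tokens.
```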
Defining States
The syntax highlighting state machine stays in the start state until a matching rule names a next state for it to advance to. At that point, the tokenizer stays in that new state until another rule moves it on. Eventually, your rules should return it to the original start state.
Here's an example:
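A minimal sketch of such a rule set, assuming it would be assigned to this.$rules. The helper stateAfter is a simplified stand-in written for this example to show how the state changes across lines; it is not Ace's tokenizer.

```javascript
// Rule set in the shape used for this.$rules.
var cdataRules = {
    "start": [
        { token: "text", regex: "<\\!\\[CDATA\\[", next: "cdata" }
    ],
    "cdata": [
        { token: "text", regex: "\\]\\]>", next: "start" },
        { defaultToken: "text" }
    ]
};

// Simplified stand-in for the tokenizer (illustration only):
// returns the state reached after scanning one line.
function stateAfter(line, state) {
    var pos = 0;
    while (pos < line.length) {
        var best = null;
        cdataRules[state].forEach(function(rule) {
            if (!rule.regex) return; // skip defaultToken entries
            var re = new RegExp(rule.regex, "g");
            re.lastIndex = pos;
            var m = re.exec(line);
            if (m && (best === null || m.index < best.index)) {
                best = { index: m.index, length: m[0].length, next: rule.next };
            }
        });
        if (!best) break; // nothing else matches on this line
        pos = best.index + Math.max(best.length, 1);
        if (best.next) state = best.next;
    }
    return state;
}
```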
In this extremely short sample, we're defining some highlighting rules for when Ace detects a <![CDATA tag. When one is encountered, the tokenizer moves from start into the cdata state. It remains there, applying the text token to any string it encounters. Finally, when it hits the closing ]]> symbol, it returns to the start state and continues to tokenize anything else.
Using the TMLanguage Tool
There is a tool that will take an existing tmlanguage file and do its best to convert it into JavaScript for Ace to consume. Here's what you need to get started:
- In the Ace repository, navigate to the tool folder.
- Run npm install to install required dependencies.
- Run node tmlanguage.js <path_to_tmlanguage_file>; for example, node tmlanguage.js /Users/Elrond/elven.tmLanguage
Two files are created and placed in lib/ace/mode: one for the language mode, and one for the set of highlight rules. You will still need to add an entry for the new mode to ace/ext/modelist.js, and add a sample file for testing.
A Note on Accuracy
Your .tmlanguage file will then be converted to the best of the converter's ability. It is an understatement to say that the tool is imperfect; language mode creation will probably never be fully automatable. There's a list of items the converter cannot determine on its own; for example:
- The use of regular expression lookbehinds
  JavaScript engines without ES2018 support do not have lookbehinds, so they need to be faked.
- Deciding which state to transition to
  While the tool does create new states correctly, it labels them with generic names like state_2, state_10, etc.
- Extending modes
  Many modes say something like include source.c, to mean "add all the rules in C highlighting." That syntax does not make sense to Ace or this tool (though of course you can extend existing highlighters).
- Rule preference order
- Gathering keywords
  Most likely, you'll need to take keywords from your language file and run them through createKeywordMapper().
However, the tool is an excellent way to get a quick start if you already possess a tmlanguage file for your language.
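For the keyword-gathering item in the list above, a rule typically hands token selection to a keyword mapper. The sketch below uses keywordMapperSketch, a simplified stand-in written for this example, to show the idea; in a real highlighter you would call this.createKeywordMapper(...) on your highlight rules instead. The keyword lists are invented.

```javascript
// Simplified stand-in for createKeywordMapper (illustration only).
function keywordMapperSketch(map, defaultToken) {
    var lookup = Object.create(null);
    Object.keys(map).forEach(function(tokenName) {
        map[tokenName].split("|").forEach(function(word) {
            lookup[word] = tokenName;
        });
    });
    return function(value) {
        return lookup[value] || defaultToken;
    };
}

// Invented keyword lists, as they might be collected from a tmlanguage file.
var keywordMapper = keywordMapperSketch({
    "keyword": "if|else|while|return",
    "constant.language": "true|false|nil"
}, "identifier");

// The mapper is used as a token function on a flat identifier regex.
var keywordRule = {
    token: keywordMapper,
    regex: "[a-zA-Z_][a-zA-Z0-9_]*\\b"
};
```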
Extending Highlighters
Suppose you're working on a LuaPage, PHP embedded in HTML, or a Django template. You'll need to create a syntax highlighter that takes all the rules from the original language (Lua, PHP, or Python) and extends them with some additional identifiers (<?lua, <?php, {%, for example). Ace allows you to easily extend a highlighter using a few helper functions.
Getting Existing Rules
To get the existing syntax highlighting rules for a particular language, use the getRules() function. For example:
Extending a Highlighter
The addRules method does one thing, and it does it well: it adds new rules to an existing rule set, and prefixes any state with a given tag. For example, let's say you've got two sets of rules: this.$rules and newRules.
To incorporate newRules into this.$rules, call addRules with the new rules and the prefix to apply to their states.
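The effect of the prefix can be sketched with a simplified stand-in. addRulesSketch is a hypothetical helper written for this example; inside a highlighter you would call this.addRules(newRules, "new-") instead. Both rule sets are invented.

```javascript
// Simplified stand-in for addRules (illustration only): copies each state
// of `rules` into `target` under a prefixed name and prefixes `next` too.
function addRulesSketch(target, rules, prefix) {
    Object.keys(rules).forEach(function(state) {
        target[prefix + state] = rules[state].map(function(rule) {
            var copy = Object.assign({}, rule);
            if (typeof copy.next === "string" && copy.next.indexOf(prefix) !== 0) {
                copy.next = prefix + copy.next;
            }
            return copy;
        });
    });
    return target;
}

var existingRules = {
    "start": [ { token: "text", regex: "\\w+" } ]
};
var newRules = {
    "start": [ { token: "string", regex: '"', next: "string" } ],
    "string": [ { token: "string", regex: '"', next: "start" }, { defaultToken: "string" } ]
};

addRulesSketch(existingRules, newRules, "new-");
// existingRules now has the states "start", "new-start", and "new-string".
```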
Extending Two Highlighters
The last function available to you combines both of these concepts, and it's called embedRules. It takes three parameters:
- An existing rule set to embed
- A prefix to apply to each state in the existing rule set
- A set of new rules to add
Like addRules, embedRules adds on to the existing this.$rules object.
To explain this visually, let's take a look at the syntax highlighter for Lua pages, which combines all of these concepts:
Here, this.$rules starts off as a set of HTML highlighting rules. To this set, we add two new checks for <%= and <?lua=. We also specify that if one of these rules is matched, the tokenizer should move on to the lua-start state. Next, embedRules takes the already existing set of LuaHighlightRules and applies the lua- prefix to each state there. Finally, it adds two new checks for %> and ?>, allowing the state machine to return to start.
Code Folding
Adding new folding rules to your mode can be a little tricky. First, require your fold mode in your mode definition and assign an instance of it to this.foldingRules in the Mode constructor.
You'll define your code folding rules in the lib/ace/mode/folding folder. Here's a template that you can use to get started:
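A minimal sketch of such a template follows. The file name folding/mynew.js, the exported name MyNewFoldMode, and the wrapper function name are assumptions of this example; the markers are left empty on purpose and must be set to regular expressions for your language.

```javascript
// Sketch of lib/ace/mode/folding/mynew.js (hypothetical file name), written
// as the function body you would hand to define(...).
function myNewFoldModeModule(require, exports, module) {
    "use strict";

    var oop = require("../../lib/oop");
    var BaseFoldMode = require("./fold_mode").FoldMode;

    var MyNewFoldMode = exports.MyNewFoldMode = function() {};
    oop.inherits(MyNewFoldMode, BaseFoldMode);

    (function() {
        // Regular expressions that identify where a fold starts and stops.
        this.foldingStartMarker = null; // set to a RegExp for your language
        this.foldingStopMarker = null;  // set to a RegExp for your language

        this.getFoldWidgetRange = function(session, foldStyle, row) {
            var line = session.getLine(row);
            // Inspect `line` and return a Range to collapse; returning
            // undefined means nothing on this row can be folded.
            return undefined;
        };
    }).call(MyNewFoldMode.prototype);
}
```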
Just like with TextMode for syntax highlighting, BaseFoldMode contains the starting point for code folding logic. foldingStartMarker defines your opening folding point, while foldingStopMarker defines the stopping point. For example, for a C-style folding system, these values might look like this:
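A sketch of C-style markers, modeled on the kind of expressions a C-style fold mode uses. In a fold mode they would be assigned to this.foldingStartMarker and this.foldingStopMarker; here they are plain variables so they can be tried out directly.

```javascript
// Start: a { or [ that is still open at the end of the line, or a line
// that begins a /* block comment.
var foldingStartMarker = /(\{|\[)[^\}\]]*$|^\s*(\/\*)/;
// Stop: a } or ] with no opening bracket before it on the line, or a line
// that ends a block comment.
var foldingStopMarker = /^[^\[\{]*(\}|\])|^[\s\*]*(\*\/)/;

var opensFold = foldingStartMarker.test("function hello_world() {"); // true
var closesFold = foldingStopMarker.test("}");                        // true
var oneLiner = foldingStartMarker.test("if (x) { y(); }");           // false
```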
These regular expressions identify the various symbols ({, [, //) to pay attention to. getFoldWidgetRange matches on these regular expressions and, when a match is found, returns the range of relevant folding points. For more information on the Range object, see the Ace API documentation.
Again, for a C-style folding mechanism, a range to return for the starting fold might look like this:
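A sketch of that logic, written as a standalone function that is meant to be installed on the fold mode's prototype. It relies on two helpers that are not defined here: openingBracketBlock, which the fold mode inherits from Ace's base FoldMode, and getCommentFoldRange, an EditSession method. Error handling and stop markers are left out.

```javascript
// Sketch of a C-style getFoldWidgetRange, meant to live on the fold mode's
// prototype so that `this` provides foldingStartMarker and the helpers.
function getFoldWidgetRange(session, foldStyle, row) {
    var line = session.getLine(row);
    var match = line.match(this.foldingStartMarker);
    if (!match) return;

    var column = match.index;
    if (match[1]) {
        // A bracket opened the fold: ask the bracket helper for the block.
        return this.openingBracketBlock(session, match[1], row, column);
    }
    // Otherwise a block comment opened it.
    return session.getCommentFoldRange(row, column + match[0].length);
}
```

With the C-style start marker, a line such as hello_world() { matches at the brace, so the bracket helper is asked for the block that starts there.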
Let's say we stumble across the code block hello_world() {. Our range object here becomes:
Testing Your Highlighter
The best way to test your tokenizer is to see it live, right? To do that, you'll want to modify the live Ace demo to preview your changes. You can find this file in the root Ace directory with the name kitchen-sink.html.
- Add an entry to supportedModes in ace/ext/modelist.js
- Add a sample file to demo/kitchen-sink/docs/ with the same name as the mode file
Once you set this up, you should be able to witness a live demonstration of your new highlighter.
Adding Automated Tests
Automated tests for a highlighter are not required, but they are simple to add and can help during development.
In lib/ace/mode/_test, create a file named text_<modeName>.txt with some example code. (You can skip this if the document you have added in demo/kitchen-sink/docs both looks good and covers various edge cases in your language syntax.)
Run node highlight_rules_test.js -gen to save the current output of your tokenizer in tokens_<modeName>.json.
After this, running highlight_rules_test.js optionalLanguageName will compare the output of your tokenizer with the saved output.
Any files ending with the _test.js suffix are automatically run by Ace's Travis CI server.
