LMD SyntaxEdit Schemes Language Reference

Syntax Blocks and Folding

Text parsing proceeds in two stages:

•Stage 1: token searching, using <RegexRule>, <RegexBlock> and <Keyword[s]>. The syntax tokens produced by these rules are collected for the second stage.

•Stage 2: the parser uses the tokens produced at stage 1 to generate folds from high-level syntax constructs, using <SyntaxBlock> elements.
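
The two stages can be pictured with a small Python sketch. It is an illustration only and not how LMD SyntaxEdit is implemented: the token rules, the one-character codes and the function names (tokenize, encode, find_block_start) are assumptions made for this example. Stage 1 turns the text into named tokens; stage 2 matches a pattern against the token sequence rather than against characters.

```python
import re

# Stage 1 (assumed, simplified rules): turn text into (token_name, text) pairs.
TOKEN_RULES = [
    ("keyword", r"\b(?:while|for|do|end)\b"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("number", r"\d+"),
    ("symbol", r"[^\sA-Za-z_0-9]"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{rule})" for name, rule in TOKEN_RULES))

def tokenize(text):
    return [(m.lastgroup, m.group()) for m in MASTER.finditer(text)]

# Stage 2: replace every token by a one-character code, so that a pattern
# can be matched against the token sequence instead of the raw text.
KEYWORD_CODES = {"while": "W", "for": "F", "do": "D"}
NAME_CODES = {"keyword": "k", "identifier": "i", "number": "n", "symbol": "s"}

def encode(tokens):
    return "".join(
        KEYWORD_CODES.get(text, "k") if name == "keyword" else NAME_CODES[name]
        for name, text in tokens
    )

def find_block_start(tokens):
    # Hand-translated counterpart of: [ keyword:while keyword:for ] .+? keyword:do
    m = re.search(r"[WF].+?D", encode(tokens))
    # One code per token, so match offsets are token indexes.
    return None if m is None else tokens[m.start():m.end()]

tokens = tokenize("for i = 1, 10 do print(i) end")
print(find_block_start(tokens))
```

Because every token becomes exactly one code, the match offsets map directly back to token positions, which is the information a fold needs.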

Element: <SyntaxBlock>

•Attribute: priority, type: Integer
Sets the priority of this rule when a token sequence could be matched by several <SyntaxBlock> rules. The meaning of the priority attribute is the same as for the priority attribute of <RegexRule> and <RegexBlock> elements.

•Attribute: start, type: Regular expression in special token syntax.
Regular expressions used in <SyntaxBlock> are written in a special syntax in which each atom of the expression is the name of a token produced at stage 1. As with the <RegexBlock> rule, you can use a <Start>regex</Start> sub-element instead of the start attribute.

Example 1:

[ keyword:while keyword:for ] .+? keyword:do

This matches the start of a Lua “while” or “for” construct. The expression looks like an ordinary regular expression, with one difference: instead of single characters, the atoms are token names, optionally followed by the token content. You cannot use character-class constructs such as “\s, \S, \W, \w, \d, \D, \0xFF, \U{Unicode_cat}” or character-related modifiers such as (?ims) here. There are no characters, only integer codes for tokens, so case insensitivity is meaningless, and the whole token sequence is always treated as a single line.

For the token names keyword, identifier and symbol you can use the shortcuts kw, id and sym respectively.

Example 2:

[ kw:while kw:for ] .+? kw:do

Example 3:

Any five keywords, followed by the start of a Lua while/for construct.

kw{5} [ kw:while kw:for ] .+? kw:do

Example 4:

JavaScript function: the “function” keyword, any identifier (detected by the <KeywordRegex> rule), a “(” symbol, any tokens except the “;”, “{” and “}” symbols, a “)” symbol, and a “{” symbol.

kw:function id
sym:(
[^ sym:; sym:} sym:{ ]*
sym:) sym:{
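
To make the token syntax more concrete, here is a minimal Python sketch that translates such an expression into an ordinary regular expression over a serialized token stream. It is not the parser used by LMD SyntaxEdit. The names encode, atom, compile_token_regex and find_tokens are hypothetical, and the sketch covers only a subset of the syntax: quantifiers are recognized on bare token names, on “.” and on a closing “]”, and backslash escapes in token content are simply dropped.

```python
import re

STX, US, ETX = "\x02", "\x1f", "\x03"   # token start, name/text separator, token end
ANY = f"{STX}[^{ETX}]*{ETX}"            # regex fragment matching any single token
SHORTCUTS = {"kw": "keyword", "id": "identifier", "sym": "symbol"}
QUANT = re.compile(r"(?:[*+?]\??|\{\d+(?:,\d*)?\}\??)?$")  # optional trailing quantifier

def encode(tokens):
    """Serialize (name, text) tokens into one string the re module can scan."""
    return "".join(f"{STX}{name}{US}{text}{ETX}" for name, text in tokens)

def atom(piece):
    """Translate 'name' or 'name:content' into a regex for exactly one token."""
    name, has_content, content = piece.partition(":")
    name = re.escape(SHORTCUTS.get(name, name))
    if has_content:
        # Assumption: escapes such as \{ just mean the literal character.
        content = re.escape(content.replace("\\", ""))
        return f"{STX}{name}{US}{content}{ETX}"
    return f"{STX}{name}{US}[^{ETX}]*{ETX}"

def compile_token_regex(pattern):
    out, members, negated = [], None, False
    for piece in pattern.split():
        if piece in ("[", "[^"):                              # token class opens
            members, negated = [], piece == "[^"
        elif members is not None and piece.startswith("]"):   # class closes
            alt = "|".join(members)
            body = f"(?!{alt}){ANY}" if negated else alt
            out.append(f"(?:{body}){piece[1:]}")
            members = None
        elif members is not None:                             # class member
            members.append(atom(piece))
        elif piece.startswith("."):                           # any token
            out.append(f"(?:{ANY}){piece[1:]}")
        elif ":" in piece:                                    # name with content
            out.append(atom(piece))
        else:                                                 # bare name, maybe quantified
            cut = QUANT.search(piece).start()
            out.append(f"(?:{atom(piece[:cut])}){piece[cut:]}")
    return re.compile("".join(out))

def find_tokens(pattern, tokens):
    """Return the half-open (first, last) token index range of the first match."""
    stream = encode(tokens)
    m = compile_token_regex(pattern).search(stream)
    if m is None:
        return None
    first = stream.count(ETX, 0, m.start())
    return first, first + m.group().count(ETX)

js = [("keyword", "function"), ("identifier", "area"), ("symbol", "("),
      ("identifier", "w"), ("symbol", ","), ("identifier", "h"),
      ("symbol", ")"), ("symbol", "{")]
print(find_tokens("kw:function id sym:( [^ sym:; sym:} sym:{ ]* sym:) sym:{", js))
```

Wrapping every token in start and end markers keeps a match from beginning or ending in the middle of a token, which is what lets a character-based regex engine stand in for a token-based one here.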

•Attribute: end, type: Regular expression in special token syntax.
The syntax is the same as for the start attribute, with one difference: you can use the variables $0..$9 to reference groups matched by the start expression, as in the end attribute of <RegexBlock>.

•Attribute: capture, type: Boolean (“true/false” or “0/1”)
Specifies whether this <SyntaxBlock> produces a fold in TLMDEditView or is only matched and skipped. See the JavaScript function example for details. You can also use <SkipSyntaxToken> elements in the scheme to specify tokens that are ignored during syntax parsing.

Example 1 (syntax blocks):

<Scheme name='JavaScriptMain' defaultToken='default'
keywordsIgnoreCase='false'>

for in if else return while
function new this var with arguments
throw try catch finally with
</Keywords>

<Regex token0='symbol'
regex='[ \} \{ \] \[  > < ]' />

<Start> kw:function id
sym:(
[^ sym:; sym:} sym:{ ]*
sym:)
sym:{
</Start>
</SyntaxBlock>

<Start>
[ kw:while kw:do kw:if kw:else kw:try
kw:catch kw:finally kw:switch ]
[^ sym:; sym:} ]*? sym:\{
</Start>
</SyntaxBlock>

<!-- We don't want folds for code in simple { .. }
We should just skip it to keep the braces balanced,
because other constructs end with } too. -->
</SyntaxBlock>

</Scheme>

Example 2: VB syntax (using references to the start of the block)

<Start>
[ kw:sub kw:class kw:if
kw:function kw:property
kw:select kw:with ]
</Start>
</SyntaxBlock>

Here we fold every construct of the form Sub FuncName ... End Sub, Class ClassName ... End Class, and so on.

Element: <SkipSyntaxToken>

This element is a sub-element of <Scheme>; it works as a helper for the <SyntaxBlock> element.

•Attribute: token, type: string, case-sensitive, token reference.
Specifies a token that is ignored during high-level syntax parsing.

Example:

All comments are skipped at the syntax parsing stage, so you can write

kw:function id
sym:(
[^ sym:; sym:} sym:{ ]*
sym:) sym:{

instead of

kw:function comment* id comment*
sym:( comment*
[^ sym:; sym:} sym:{ ]*
sym:) comment* sym:{

for a JavaScript function.

You can define multiple <SkipSyntaxToken> elements in a scheme.
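
The effect of skipping tokens can be sketched in Python as a filter applied between the two parsing stages. This is only an illustration under stated assumptions: the function name drop_skipped and the token list are invented for the example and are not part of LMD SyntaxEdit.

```python
def drop_skipped(tokens, skipped_names):
    """Remove tokens whose name is listed as skipped before syntax matching."""
    return [(name, text) for name, text in tokens if name not in skipped_names]

# Hypothetical stage-1 output for: function /* area */ area( // no args
#                                  ) {
tokens = [
    ("keyword", "function"), ("comment", "/* area */"), ("identifier", "area"),
    ("symbol", "("), ("comment", "// no args"), ("symbol", ")"), ("symbol", "{"),
]

visible = drop_skipped(tokens, {"comment"})
# The syntax stage now sees: keyword identifier symbol symbol symbol
print([name for name, _ in visible])
```

With the comment tokens removed up front, the shorter expression without comment* atoms is enough to match the function header.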