A library that helps tokenize text using TextMate grammars.
Alec Larson 167bbbd509
feat: add child combinator ">" (and fix a specificity bug) (#233)
* feat: fix 2 bugs and add child combinator ">"

Bug: Just because a trie element exists for a more specific scope doesn't mean its parent scopes will match, so we need to collect the trie elements with less specific scopes too.

Bug: If the number of scope names in both rules' scope paths is not
equal, the parent scope names won't be compared at all. Instead, the rule with
the longest scope path is preferred. This goes against the TextMate
manual (https://macromates.com/manual/en/scope_selectors). In
particular, the following line in “Ranking Matches”:
> Rules 1 and 2 applied again to the scope selector when removing the deepest element (in the case of a tie)

Feature: Add support for the child combinator (the `>` operator). This
allows for styling a parent-child relationship specifically.

* fix: proceed to next scope after successful match

* fix: increment parent indexes after comparison

* chore: add comments and some small improvements

* test: add 3 new cases

- One for the new child combinator
- One for bug 1 ("Theme resolving falls back to less specific rules")
- One for bug 2 ("Theme resolving should give deeper scopes higher specificity")

* chore(revert): undo bug 1 fix

After trying to reproduce the alleged bug, I realized it's not a bug. When a ThemeTrieElement is created, it inherits the `_rulesWithParentScopes` array of its parent element, which is guaranteed to be populated due to the lexicographical sorting of the theme rules in `resolveParsedThemeRules`.

* test: remove test case for non-existent bug

…and update the other bug's test case to actually fail on the main branch
2024-07-06 11:56:49 +00:00
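
To illustrate the difference between the descendant relationship (a space) and the child combinator (`>`) described in this commit, here is a minimal sketch. It is not the library's implementation: the helper names `matchesScopeName`, `parseSelector`, `matchAncestors` and `matchesSelector` are hypothetical, and the sketch assumes that the last selector element must match the innermost scope of the token and that a pattern matches a scope name by dot-separated prefix.

```javascript
// Hypothetical helpers for illustration only, not the vscode-textmate source.

// "entity.name" matches "entity.name" and "entity.name.function.js",
// but not "entity.namespace".
function matchesScopeName(scopeName, pattern) {
    return scopeName === pattern || scopeName.startsWith(pattern + '.');
}

// Turns "a > b c" into steps; `direct` is true when a step must be the
// immediate child of the step before it.
function parseSelector(selector) {
    const steps = [];
    let direct = false;
    for (const token of selector.trim().split(/\s+/)) {
        if (token === '>') {
            direct = true;
            continue;
        }
        if (token !== '') {
            steps.push({ pattern: token, direct });
        }
        direct = false;
    }
    return steps;
}

// steps[stepIndex] is already matched at scopeStack[scopeIndex];
// try to match the earlier steps against the enclosing scopes.
function matchAncestors(steps, stepIndex, scopeStack, scopeIndex) {
    if (stepIndex === 0) {
        return true;
    }
    const parentPattern = steps[stepIndex - 1].pattern;
    const mustBeDirectParent = steps[stepIndex].direct;
    for (let i = scopeIndex - 1; i >= 0; i--) {
        if (matchesScopeName(scopeStack[i], parentPattern) &&
            matchAncestors(steps, stepIndex - 1, scopeStack, i)) {
            return true;
        }
        if (mustBeDirectParent) {
            break; // only the immediately enclosing scope may be tried
        }
    }
    return false;
}

// scopeStack is ordered from outermost to innermost scope.
function matchesSelector(selector, scopeStack) {
    const steps = parseSelector(selector);
    if (steps.length === 0 || scopeStack.length === 0) {
        return false;
    }
    const innermost = scopeStack.length - 1;
    const last = steps.length - 1;
    if (!matchesScopeName(scopeStack[innermost], steps[last].pattern)) {
        return false;
    }
    return matchAncestors(steps, last, scopeStack, innermost);
}
```

With the scope stack `['source.js', 'meta.function.js', 'meta.block', 'entity.name.function.js']`, the selector `meta.function entity.name` matches under these assumptions, while `meta.function > entity.name` does not, because `meta.block` sits between the two scopes.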
.github/workflows Disable rich navigation workflow (#221) 2023-12-15 10:36:26 +01:00
.vscode Fix PoliCheck issues (#212) 2023-09-18 11:29:35 -07:00
benchmark CRLF -> LF 2022-02-04 14:21:30 +01:00
build Engineering - remove custom code to create git tag (#204) 2023-03-14 13:09:13 +01:00
scripts Fix the build pipeline (#184) 2022-10-13 17:37:11 +02:00
src feat: add child combinator ">" (and fix a specificity bug) (#233) 2024-07-06 11:56:49 +00:00
test-cases Remove unused fixtures (fixes CodeQL warning) (#223) 2024-01-08 11:07:50 +01:00
typings update deps 2020-10-07 21:48:12 +00:00
.gitattributes CRLF -> LF 2022-02-04 14:21:30 +01:00
.gitignore ignore release folder 2022-04-07 14:17:57 +02:00
.lsifrc.json Add indexing action 2021-10-01 10:58:22 +02:00
.npmignore Engineering - Scaffold pipeline (#182) 2022-10-13 14:43:37 +02:00
LICENSE.md CRLF -> LF 2022-02-04 14:21:30 +01:00
README.md CRLF -> LF 2022-02-04 14:21:30 +01:00
SECURITY.md Add SECURITY.md 2021-03-10 14:56:46 +01:00
ThirdPartyNotices.txt CRLF -> LF 2022-02-04 14:21:30 +01:00
package-lock.json bump version (#235) 2024-07-06 13:56:22 +02:00
package.json bump version (#235) 2024-07-06 13:56:22 +02:00
tsconfig.json Update target from ES5 to ES2020 2022-05-02 09:22:12 +02:00
webpack.config.js Uses webpack copy plugin to include all d.ts files in the release. 2022-05-09 10:18:24 +02:00

README.md

VSCode TextMate

An interpreter for grammar files as defined by TextMate. TextMate grammars use the Oniguruma regular expression dialect (https://github.com/kkos/oniguruma). Supports loading grammar files in JSON or PLIST format. This library is used in VS Code. Cross-grammar injections are currently not supported.
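
As a rough sketch of the JSON case: to my knowledge, `parseRawGrammar` accepts an optional file path as its second argument and treats the content as JSON when that path ends in `.json`, falling back to PLIST otherwise. The helper name `parseGrammarFile` and its injected `vsctm` and `fs` parameters are assumptions made for illustration, not part of the library.

```javascript
// Hypothetical helper, not part of vscode-textmate.
// `vsctm` is the result of require('vscode-textmate') and `fs` is Node's fs
// module; both are passed in so the sketch does not hide where they come from.
function parseGrammarFile(vsctm, fs, filePath) {
    // Read the grammar as text; parseRawGrammar expects a string.
    const content = fs.readFileSync(filePath, 'utf8');
    // Passing filePath lets the parser choose between JSON and PLIST.
    return vsctm.parseRawGrammar(content, filePath);
}
```

A call such as `parseGrammarFile(vsctm, fs, './JavaScript.tmLanguage.json')` would then return a raw grammar; the file name here is only an example.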

Installing

npm install vscode-textmate

Using

const fs = require('fs');
const path = require('path');
const vsctm = require('vscode-textmate');
const oniguruma = require('vscode-oniguruma');

/**
 * Utility to read a file as a promise
 */
function readFile(filePath) {
    // The parameter is named filePath so it does not shadow the `path` module.
    return new Promise((resolve, reject) => {
        fs.readFile(filePath, (error, data) => error ? reject(error) : resolve(data));
    });
}

const wasmBin = fs.readFileSync(path.join(__dirname, './node_modules/vscode-oniguruma/release/onig.wasm')).buffer;
const vscodeOnigurumaLib = oniguruma.loadWASM(wasmBin).then(() => {
    return {
        createOnigScanner(patterns) { return new oniguruma.OnigScanner(patterns); },
        createOnigString(s) { return new oniguruma.OnigString(s); }
    };
});

// Create a registry that can create a grammar from a scope name.
const registry = new vsctm.Registry({
    onigLib: vscodeOnigurumaLib,
    loadGrammar: (scopeName) => {
        if (scopeName === 'source.js') {
            // https://github.com/textmate/javascript.tmbundle/blob/master/Syntaxes/JavaScript.plist
            return readFile('./JavaScript.plist').then(data => vsctm.parseRawGrammar(data.toString()));
        }
        console.log(`Unknown scope name: ${scopeName}`);
        return null;
    }
});

// Asynchronously load the JavaScript grammar and any other grammars it includes.
registry.loadGrammar('source.js').then(grammar => {
    const text = [
        `function sayHello(name) {`,
        `\treturn "Hello, " + name;`,
        `}`
    ];
    let ruleStack = vsctm.INITIAL;
    for (let i = 0; i < text.length; i++) {
        const line = text[i];
        const lineTokens = grammar.tokenizeLine(line, ruleStack);
        console.log(`\nTokenizing line: ${line}`);
        for (let j = 0; j < lineTokens.tokens.length; j++) {
            const token = lineTokens.tokens[j];
            console.log(` - token from ${token.startIndex} to ${token.endIndex} ` +
              `(${line.substring(token.startIndex, token.endIndex)}) ` +
              `with scopes ${token.scopes.join(', ')}`
            );
        }
        ruleStack = lineTokens.ruleStack;
    }
});

/* OUTPUT:

Unknown scope name: source.js.regexp

Tokenizing line: function sayHello(name) {
 - token from 0 to 8 (function) with scopes source.js, meta.function.js, storage.type.function.js
 - token from 8 to 9 ( ) with scopes source.js, meta.function.js
 - token from 9 to 17 (sayHello) with scopes source.js, meta.function.js, entity.name.function.js
 - token from 17 to 18 (() with scopes source.js, meta.function.js, punctuation.definition.parameters.begin.js
 - token from 18 to 22 (name) with scopes source.js, meta.function.js, variable.parameter.function.js
 - token from 22 to 23 ()) with scopes source.js, meta.function.js, punctuation.definition.parameters.end.js
 - token from 23 to 24 ( ) with scopes source.js
 - token from 24 to 25 ({) with scopes source.js, punctuation.section.scope.begin.js

Tokenizing line:        return "Hello, " + name;
 - token from 0 to 1 (  ) with scopes source.js
 - token from 1 to 7 (return) with scopes source.js, keyword.control.js
 - token from 7 to 8 ( ) with scopes source.js
 - token from 8 to 9 (") with scopes source.js, string.quoted.double.js, punctuation.definition.string.begin.js
 - token from 9 to 16 (Hello, ) with scopes source.js, string.quoted.double.js
 - token from 16 to 17 (") with scopes source.js, string.quoted.double.js, punctuation.definition.string.end.js
 - token from 17 to 18 ( ) with scopes source.js
 - token from 18 to 19 (+) with scopes source.js, keyword.operator.arithmetic.js
 - token from 19 to 20 ( ) with scopes source.js
 - token from 20 to 24 (name) with scopes source.js, support.constant.dom.js
 - token from 24 to 25 (;) with scopes source.js, punctuation.terminator.statement.js

Tokenizing line: }
 - token from 0 to 1 (}) with scopes source.js, punctuation.section.scope.end.js

*/
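
The loop above carries `ruleStack` from one line to the next, which is what lets multi-line constructs tokenize correctly. A minimal sketch of that pattern as a reusable function follows; the name `tokenizeLines` and the shape of its return value are assumptions for illustration, and it relies only on the `tokenizeLine` call and token fields shown in the example above.

```javascript
// Hypothetical helper, not part of vscode-textmate.
// `grammar` is any object with the tokenizeLine(line, ruleStack) method used
// above; `initialStack` would be vsctm.INITIAL for the first line of a file.
function tokenizeLines(grammar, lines, initialStack) {
    let ruleStack = initialStack;
    const result = [];
    for (const line of lines) {
        const lineTokens = grammar.tokenizeLine(line, ruleStack);
        // Keep the token text next to its scopes for easier inspection.
        result.push(lineTokens.tokens.map(token => ({
            text: line.substring(token.startIndex, token.endIndex),
            scopes: token.scopes
        })));
        // The state at the end of this line is the start state of the next.
        ruleStack = lineTokens.ruleStack;
    }
    return result;
}
```

Inside the `registry.loadGrammar('source.js').then(grammar => { ... })` callback, this could be called as `tokenizeLines(grammar, text, vsctm.INITIAL)`.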

For grammar authors

See vscode-tmgrammar-test, which can help you write unit tests for your grammar.

API doc

See the main.ts file

Developing

  • Clone the repository
  • Run npm install
  • Compile in the background with npm run watch
  • Run tests with npm test
  • Run benchmark with npm run benchmark
  • Troubleshoot a grammar with npm run inspect -- PATH_TO_GRAMMAR PATH_TO_FILE

Code of Conduct

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

License

MIT