
Parser

Yuku’s parser turns JavaScript and TypeScript source code into an Abstract Syntax Tree (AST).

npm install yuku-parser

import { parse } from "yuku-parser";
const { program, comments, diagnostics } = parse("const x = 1 + 2;");

See yuku-parser on npm for the full API.

zig fetch --save git+https://github.com/yuku-toolchain/yuku.git

In your build.zig:

const yuku_dep = b.dependency("yuku", .{
    .target = target,
    .optimize = optimize,
});
my_module.addImport("parser", yuku_dep.module("parser"));

Then import the module and parse:

const std = @import("std");
const parser = @import("parser");

pub fn main() !void {
    // the page allocator is used as the backing allocator for the tree's internal arena
    var tree = try parser.parse(std.heap.page_allocator, "const x = 5;", .{});
    defer tree.deinit();

    for (tree.diagnostics.items) |d| {
        std.debug.print("{s}\n", .{d.message});
    }
}

parse takes an Options struct to configure the parsing mode:

const tree = try parser.parse(allocator, source, .{
    .source_type = .module,
    .lang = .jsx,
});

| Field | Values | Default | Description |
| --- | --- | --- | --- |
| source_type | .script, .module | .module | Script mode or ES module mode (strict mode) |
| lang | .js, .ts, .jsx, .tsx, .dts | .js | Language variant and syntax features to enable |
| preserve_parens | true, false | true | Keep ParenthesizedExpression nodes in the AST |
| allow_return_outside_function | true, false | false | Allow return statements at the top level |

The source_type and lang fields can both be inferred from a file path:

const tree = try parser.parse(allocator, source, .{
    .source_type = parser.ast.SourceType.fromPath("app.cjs"), // .script
    .lang = parser.ast.Lang.fromPath("app.tsx"), // .tsx
});
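
To illustrate what path-based inference involves, here is a minimal JavaScript sketch. It is not Yuku's lookup table: only the two results shown above (app.cjs gives .script, app.tsx gives .tsx) come from this page, and every other extension rule, along with the name inferOptions, is an assumption made for the example.

```javascript
// Toy model of path-based inference. The extension rules below are
// assumptions for illustration, not Yuku's actual lookup table.
function inferOptions(path) {
  const lower = path.toLowerCase();

  // Assumed rule: only CommonJS extensions (.cjs, .cts) select script mode.
  const sourceType = /\.c[jt]s$/.test(lower) ? "script" : "module";

  // Test the longest suffix first so "types.d.ts" is not read as plain ".ts".
  let lang = "js";
  if (/\.d\.[mc]?ts$/.test(lower)) lang = "dts";
  else if (lower.endsWith(".tsx")) lang = "tsx";
  else if (/\.[mc]?ts$/.test(lower)) lang = "ts";
  else if (lower.endsWith(".jsx")) lang = "jsx";

  return { sourceType, lang };
}

console.log(inferOptions("app.cjs")); // script mode, plain JavaScript
console.log(inferOptions("app.tsx")); // module mode, TSX
```

The ordering of the checks is the part worth noticing: a declaration file such as types.d.ts also ends in .ts, so the more specific suffix has to be tested first.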

parse returns a Tree containing the full AST, diagnostics, and source metadata. The allocator passed to parse is used as the backing allocator for the tree’s internal arena. All memory is owned by this arena, and tree.deinit() frees everything at once.

var tree = try parser.parse(allocator, source, .{});
defer tree.deinit();

// read the root program node
const program = tree.getData(tree.program);

// read a node's source location
const span = tree.getSpan(tree.program);

// read string content from a node
const name = tree.getString(some_identifier.name);

// iterate variable-length children (e.g. program body)
for (tree.getExtra(program.program.body)) |child_index| {
    const child = tree.getData(child_index);
    // ...
}

See the AST reference for the full node type catalog and memory model.
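
For intuition about the accessor pattern used above (getData, getSpan, and getExtra all take indices or handles rather than pointers), here is a toy flat tree in JavaScript. The class ToyTree, its fields, and the { start, len } handle are hypothetical; this sketches the general index-based technique and makes no claim about Yuku's real layout.

```javascript
// Toy flat AST: nodes refer to each other by index, not by pointer.
// The layout is a guess at the general technique, not Yuku's real one.
class ToyTree {
  constructor() {
    this.nodes = []; // node payloads, addressed by index
    this.spans = []; // spans[i] is the source range of nodes[i]
    this.extra = []; // shared pool holding variable-length child lists
  }

  addNode(data, start, end) {
    this.nodes.push(data);
    this.spans.push({ start, end });
    return this.nodes.length - 1;
  }

  // Store a child list in the pool and return a small fixed-size handle.
  addList(indices) {
    const start = this.extra.length;
    this.extra.push(...indices);
    return { start, len: indices.length };
  }

  getData(index) { return this.nodes[index]; }
  getSpan(index) { return this.spans[index]; }
  getExtra(range) { return this.extra.slice(range.start, range.start + range.len); }
}

// Build the tree for "a; b;" by hand.
const toy = new ToyTree();
const firstId = toy.addNode({ kind: "identifier", name: "a" }, 0, 1);
const secondId = toy.addNode({ kind: "identifier", name: "b" }, 3, 4);
const rootId = toy.addNode({ kind: "program", body: toy.addList([firstId, secondId]) }, 0, 5);

// Same shape as the Zig loop: resolve the handle, then look up each child.
for (const childIndex of toy.getExtra(toy.getData(rootId).body)) {
  console.log(toy.getData(childIndex).name, toy.getSpan(childIndex));
}
```

Because every node keeps a fixed-size handle instead of its own growable list, all storage lives in a few flat arrays, which is what makes freeing everything at once cheap.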

The parser recovers from errors and continues, so a single parse produces the full AST alongside all diagnostics:

for (tree.diagnostics.items) |d| {
    std.debug.print("[{s}] {s} at {d}..{d}\n", .{
        d.severity.toString(), d.message, d.span.start, d.span.end,
    });
    for (d.labels) |label| {
        std.debug.print(" {d}..{d}: {s}\n", .{ label.span.start, label.span.end, label.message });
    }
    if (d.help) |help| {
        std.debug.print(" help: {s}\n", .{help});
    }
}

Each diagnostic has a severity (.error, .warning, .hint, .info), a message, a source span, optional labels pointing to related code regions, and optional help text.
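
As a rough illustration of how these fields fit together, the following JavaScript sketch renders one diagnostic as text. The object shape mirrors the fields listed above but is hypothetical (the npm package's actual diagnostic type may differ), the sample message is made up, and spans are assumed to be offsets with an exclusive end.

```javascript
// Hypothetical diagnostic shape, mirroring the fields described above.
// Spans are assumed to be { start, end } offsets with an exclusive end.
function renderDiagnostic(source, d) {
  const lines = [`[${d.severity}] ${d.message} at ${d.span.start}..${d.span.end}`];

  // Quote the offending source text so the offsets are easy to verify.
  lines.push(`  source: ${JSON.stringify(source.slice(d.span.start, d.span.end))}`);

  // Labels and help are optional, so tolerate their absence.
  for (const label of d.labels ?? []) {
    lines.push(`  ${label.span.start}..${label.span.end}: ${label.message}`);
  }
  if (d.help != null) lines.push(`  help: ${d.help}`);

  return lines.join("\n");
}

const sampleSource = "let x = 1; let x = 2;";
const sampleDiagnostic = {
  severity: "error",
  message: "Identifier 'x' has already been declared", // made-up text
  span: { start: 15, end: 16 },
  labels: [{ span: { start: 4, end: 5 }, message: "first declared here" }],
  help: "rename one of the bindings",
};

console.log(renderDiagnostic(sampleSource, sampleDiagnostic));
```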

The traverser system walks the AST and calls visitor hooks at every node. There are four modes with increasing context:

| Mode | Context | Result |
| --- | --- | --- |
| Basic | Path (parents, ancestors, depth) | |
| Scoped | Path + lexical scopes | ScopeTree |
| Semantic | Path + scopes + symbols/references | ScopeTree + SymbolTable |
| Transform | Path + mutable tree | |

const traverser = parser.traverser;

const MyVisitor = struct {
    pub fn enter_function(self: *MyVisitor, func: parser.ast.Function, index: parser.ast.NodeIndex, ctx: *traverser.basic.Ctx) traverser.Action {
        // called when entering any function node
        // Zig rejects unused parameters, so discard the ones this stub ignores
        _ = self;
        _ = func;
        _ = index;
        _ = ctx;
        return .proceed;
    }
};

var visitor = MyVisitor{};
try traverser.basic.traverse(MyVisitor, &tree, &visitor);

See Traverse for the full API.
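
The visitor pattern described in this section can be sketched in a few lines of JavaScript. This is a toy walker over a hand-built tree, not Yuku's traverser: the enter_ hook naming and the proceed action follow the example above, while the "skip" action, the children field, and the ctx shape are assumptions made for illustration.

```javascript
// Toy traverser. Hook naming follows the enter_function example above;
// the "skip" action and the ctx shape are assumptions for illustration.
function traverse(node, visitor, path = []) {
  const hook = visitor[`enter_${node.type}`];
  const ctx = { parents: path, depth: path.length };
  const action = hook ? hook.call(visitor, node, ctx) : "proceed";

  // Anything other than "skip" descends into the children.
  if (action !== "skip") {
    for (const child of node.children ?? []) {
      traverse(child, visitor, [...path, node]);
    }
  }
}

const toyAst = {
  type: "program",
  children: [
    { type: "function", name: "outer", children: [
      { type: "function", name: "inner", children: [] },
    ] },
    { type: "function", name: "skipped", children: [
      { type: "function", name: "hidden", children: [] },
    ] },
  ],
};

const seen = [];
traverse(toyAst, {
  enter_function(node, ctx) {
    seen.push(`${node.name}@${ctx.depth}`);
    // Returning "skip" prunes the subtree; "proceed" descends into it.
    return node.name === "skipped" ? "skip" : "proceed";
  },
});

console.log(seen.join(", ")); // outer@1, inner@2, skipped@1
```

The path passed to each hook is what the Basic mode row in the table refers to: the hook can inspect its ancestors and depth without the tree storing parent links.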

The ECMAScript specification defines a set of early errors that conformant implementations must report before execution. Some are detectable during parsing from local context alone: return outside a function, yield outside a generator, invalid destructuring. Others require knowledge of the program’s scope structure and bindings: redeclarations, unresolved exports, private fields used outside their class, and more.

Yuku defers these scope-dependent checks to a separate semantic analysis pass. This keeps parsing fast and lets each consumer opt in only to the work it actually needs. A formatter, for example, only needs the AST and should not pay the cost of scope resolution.
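
To make the distinction concrete, here is a toy redeclaration check in JavaScript. It shows why this class of error needs scope information: the same declaration is legal or illegal depending on which scope it lands in. The event-list input and the function name findRedeclarations are hypothetical simplifications, only block-scoped let-style bindings are modelled (no var hoisting, functions, or imports), and none of this reflects Yuku's implementation.

```javascript
// Toy scope-dependent check: report lexical redeclarations.
// Input is a simplified event list, not a real AST.
function findRedeclarations(events) {
  const scopes = [new Map()]; // innermost scope is last
  const errors = [];

  for (const ev of events) {
    if (ev.kind === "enter") {
      scopes.push(new Map());
    } else if (ev.kind === "exit") {
      scopes.pop();
    } else if (ev.kind === "declare") {
      // Only the innermost scope matters: an outer binding may be shadowed.
      const scope = scopes[scopes.length - 1];
      if (scope.has(ev.name)) {
        errors.push({ name: ev.name, first: scope.get(ev.name), again: ev.offset });
      } else {
        scope.set(ev.name, ev.offset);
      }
    }
  }
  return errors;
}

// Events for: let x; { let x; } let x;
const declEvents = [
  { kind: "declare", name: "x", offset: 4 },
  { kind: "enter" },
  { kind: "declare", name: "x", offset: 13 }, // shadows the outer x, legal
  { kind: "exit" },
  { kind: "declare", name: "x", offset: 22 }, // same scope as the first, illegal
];

console.log(findRedeclarations(declEvents));
```

A parser looking at any one of these declarations in isolation cannot tell the legal one from the illegal one, which is why the check belongs in a pass that tracks scopes.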

semantic.analyze builds a scope tree and symbol table, resolves identifier references to their declarations, and reports the remaining early errors. Together, parsing and semantic analysis cover the full set of early errors required by the specification.

var tree = try parser.parse(allocator, source, .{});
defer tree.deinit();
// run semantic analysis
const result = try parser.semantic.analyze(&tree);
// result.scope_tree - all lexical scopes
// result.symbol_table - all symbols and references

Semantic diagnostics are appended directly to tree.diagnostics alongside parse errors. After analysis, tree.hasErrors() reflects both.

All allocations (scope tree, symbol table) use the tree’s arena, so they are valid for the lifetime of the tree and freed by tree.deinit().