Lime Parser Generator 0.1.0
Runtime-extensible LALR(1) parser with SIMD tokenization and LLVM JIT
This document compares Lime with other widely-used parser generators to help you choose the right tool for your project and understand the trade-offs involved.
| Tool | Algorithm | Target Languages | License | First Release |
|---|---|---|---|---|
| Lime | LALR(1) | C | Public Domain | 2024 (Lemon: 2001) |
| Yacc | LALR(1) | C | Proprietary / BSD | 1975 |
| Bison | LALR(1), GLR, IELR(1) | C, C++, Java, D | GPL v3 | 1985 |
| ANTLR | LL(*), ALL(*) | Java, C#, Python, JS, Go, C++, Swift, Dart | BSD | 1992 |
| Menhir | LR(1) | OCaml | GPL v2 | 2005 |

| Feature | Lime | Yacc | Bison | ANTLR | Menhir |
|---|---|---|---|---|---|
| Grammar class | LALR(1) | LALR(1) | LALR(1)/GLR/IELR | LL(*)/ALL(*) | LR(1) |
| Left recursion | Yes | Yes | Yes | Direct only (v4) | Yes |
| Right recursion | Yes | Yes | Yes | Yes | Yes |
| Operator precedence | Yes | Yes | Yes | Via alternatives | Yes |
| Mid-rule actions | No | Yes | Yes | N/A (listeners) | No |
| Named parameters | A, B, C... | $1, $2... | $1, $2 or $name | Labels | Pattern vars |
| Start symbol control | %start_symbol | %start or first rule | %start | grammar rule | %start |
| Multiple start symbols | No | No | Yes (GLR mode) | Yes | Yes |
| Parameterized rules | No | No | No | Yes | Yes |
| Unicode support | Via tokenizer | Via lex | Via lex | Full Unicode | Full Unicode |
| Grammar modularity | Runtime extensions | None | None | import grammars | % includes |
| Conditional compilation | %ifdef/%ifndef | No | No | No | No |

| Feature | Lime | Yacc | Bison | ANTLR | Menhir |
|---|---|---|---|---|---|
| Output language | C | C | C, C++, Java, D | 10+ languages | OCaml |
| Parser type | Push (event-driven) | Pull (yyparse loop) | Pull (yyparse); push optional | Recursive descent | Push or pull |
| Reentrant by default | Yes | No | Optional (pure-parser) | Yes | Yes |
| Thread safety | Yes (atomic refcount) | Manual | Manual | Yes | Yes |
| Generated file size | Small | Small | Medium | Large | Medium |
| Header generation | Automatic | Automatic | Automatic | N/A | Automatic |
| Custom template | -T flag | No | No (skeleton) | StringTemplate | No |
| Parser prefix/namespace | %name | -p flag | %name-prefix | Class name | Module name |
| Extra parser arguments | %extra_argument | Global variables | %parse-param | Constructor args | %parameter |
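
The "Push (event-driven)" and "Reentrant by default" rows are easier to picture with a small example. The following is a minimal sketch of the push style, not Lime's generated code or API: `ToyParser`, `toy_init`, `toy_push`, and the token codes are hypothetical names, and the "grammar" is a hand-written state machine for `NUM (PLUS NUM)* END` rather than an LALR(1) table.

```c
#include <stdbool.h>

/* Hypothetical token codes for a toy grammar: sum ::= NUM (PLUS NUM)* END */
enum { TOK_END = 0, TOK_NUM = 1, TOK_PLUS = 2 };

/* All parser state lives in a caller-owned struct, so several parsers
 * can run side by side without sharing globals. */
typedef struct {
    bool expect_num; /* true when the next token must be a NUM */
    bool failed;     /* set on the first syntax error */
    bool done;       /* set once TOK_END has been accepted */
    long total;      /* running semantic value (the sum so far) */
} ToyParser;

static void toy_init(ToyParser *p) {
    p->expect_num = true;
    p->failed = false;
    p->done = false;
    p->total = 0;
}

/* Push one token into the parser. The caller owns the loop, so tokens
 * can come from any source and arrive one at a time. */
static void toy_push(ToyParser *p, int token, long value) {
    if (p->failed || p->done) {
        return; /* ignore input after the parse has finished */
    }
    if (p->expect_num) {
        if (token == TOK_NUM) {
            p->total += value;
            p->expect_num = false;
        } else {
            p->failed = true;
        }
    } else if (token == TOK_PLUS) {
        p->expect_num = true;
    } else if (token == TOK_END) {
        p->done = true;
    } else {
        p->failed = true;
    }
}
```

A caller would initialize one `ToyParser` per input stream and call `toy_push` whenever its tokenizer produces a token. Because no state is global, interleaving calls on two parser instances does not cause interference, which is the property a pull parser built around a global `yylex()` loop has to work harder to provide.
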

| Feature | Lime | Yacc | Bison | ANTLR | Menhir |
|---|---|---|---|---|---|
| Error recovery | error token | error token | error token | Built-in recovery | error token |
| Syntax error callback | %syntax_error | yyerror() | yyerror() | ErrorListener | %on_error_reduce |
| Parse failure callback | %parse_failure | None | None | None | None |
| Conflict reporting | .out file | y.output | .output file | N/A (no conflicts) | .conflicts |
| Precedence conflict info | -p flag | No | %expect | N/A | Warnings |
| Error token destructor | %destructor | No | %destructor | N/A | N/A |
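
To show why the "Error token destructor" row matters, here is a rough sketch of what a destructor does during error recovery: when the parser pops stack entries to find a state it can resume from, each popped semantic value has to be released or it leaks. The types and functions below (`StackEntry`, `destroy_entry`, `unwind_stack`, `copy_string`) are hypothetical and only illustrate the idea; they are not taken from Lime's runtime.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stack entry: a grammar symbol plus a heap-allocated value. */
typedef struct {
    int   symbol;
    char *value;
} StackEntry;

/* Plays the role of a per-symbol destructor: releases the semantic value. */
static void destroy_entry(StackEntry *e) {
    free(e->value);
    e->value = NULL;
}

/* Helper: copy a string onto the heap (portable stand-in for strdup). */
static char *copy_string(const char *s) {
    size_t n = strlen(s) + 1;
    char *out = malloc(n);
    if (out != NULL) {
        memcpy(out, s, n);
    }
    return out;
}

/* Pop entries until the top of the stack holds `resume_symbol` (assumed
 * here to mark a state that can accept the error token), destroying each
 * popped value. Returns the new depth and reports how many values were
 * destroyed through `destroyed`. */
static size_t unwind_stack(StackEntry *stack, size_t depth,
                           int resume_symbol, size_t *destroyed) {
    *destroyed = 0;
    while (depth > 0 && stack[depth - 1].symbol != resume_symbol) {
        destroy_entry(&stack[depth - 1]);
        depth--;
        (*destroyed)++;
    }
    return depth;
}
```

Without the call to `destroy_entry` inside the loop, every value discarded during recovery would stay allocated, which is the leak pattern the table attributes to parsers that lack destructor support.
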

| Feature | Lime | Yacc | Bison | ANTLR | Menhir |
|---|---|---|---|---|---|
| Parse time complexity | O(n) | O(n) | O(n) / O(n^3) GLR | O(n^4) worst case | O(n) |
| SIMD tokenization | Yes (AVX2/NEON) | No | No | No | No |
| JIT compilation | Yes (LLVM) | No | No | No | No |
| Table compression | Yes (-c to disable) | Yes | Yes | N/A | Yes |
| Typical parse latency | 0.2–3 µs | 1–10 µs | 1–10 µs | 5–50 µs | 1–10 µs |
| Memory footprint | ~500 KB–1 MB | ~50–200 KB | ~50–200 KB | ~2–10 MB | ~100–500 KB |

| Feature | Lime | Yacc | Bison | ANTLR | Menhir |
|---|---|---|---|---|---|
| Runtime grammar changes | Yes (extension API) | No | No | No | No |
| Add tokens at runtime | Yes | No | No | No | No |
| Add rules at runtime | Yes | No | No | No | No |
| Modify precedence at runtime | Yes | No | No | No | No |
| Extension conflict resolution | Callback-based | N/A | N/A | N/A | N/A |
| Copy-on-write snapshots | Yes | N/A | N/A | N/A | N/A |
| Hot grammar reload | Yes (zero downtime) | No | No | No | No |
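
The "Copy-on-write snapshots" row, together with the atomic reference counting mentioned under thread safety, can be sketched roughly as follows. This is an illustration of the general technique under simplifying assumptions, not Lime's extension API: `Snapshot`, `snapshot_new`, `snapshot_acquire`, `snapshot_release`, and `snapshot_extend` are hypothetical names, and a plain array of integers stands in for the real grammar tables.

```c
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical immutable grammar snapshot; `rules` stands in for the
 * parse tables a real implementation would hold. */
typedef struct {
    atomic_int refcount;
    size_t     rule_count;
    int       *rules;
} Snapshot;

/* Create a snapshot that owns a private copy of `rules`. */
static Snapshot *snapshot_new(const int *rules, size_t n) {
    Snapshot *s = malloc(sizeof *s);
    if (s == NULL) {
        return NULL;
    }
    s->rules = malloc((n ? n : 1) * sizeof *s->rules);
    if (s->rules == NULL) {
        free(s);
        return NULL;
    }
    if (n > 0) {
        memcpy(s->rules, rules, n * sizeof *rules);
    }
    s->rule_count = n;
    atomic_init(&s->refcount, 1);
    return s;
}

/* A reader takes a reference before parsing with the snapshot. */
static Snapshot *snapshot_acquire(Snapshot *s) {
    atomic_fetch_add(&s->refcount, 1);
    return s;
}

/* Drop a reference; the last holder frees the snapshot. */
static void snapshot_release(Snapshot *s) {
    if (atomic_fetch_sub(&s->refcount, 1) == 1) {
        free(s->rules);
        free(s);
    }
}

/* Copy-on-write extension: build a new snapshot with one extra rule.
 * The base snapshot is never modified, so readers holding it are safe. */
static Snapshot *snapshot_extend(const Snapshot *base, int new_rule) {
    size_t n = base->rule_count;
    int *tmp = malloc((n + 1) * sizeof *tmp);
    if (tmp == NULL) {
        return NULL;
    }
    if (n > 0) {
        memcpy(tmp, base->rules, n * sizeof *tmp);
    }
    tmp[n] = new_rule;
    Snapshot *extended = snapshot_new(tmp, n + 1);
    free(tmp);
    return extended;
}
```

In this sketch a "hot reload" amounts to publishing the pointer returned by `snapshot_extend` for new parses while in-flight parses keep their acquired reference to the old snapshot; the old tables are freed only when the last reader calls `snapshot_release`. The cost is the extra copy per extension, which is in line with the higher base memory the document reports for Lime.
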

| Feature | Lime | Yacc | Bison | ANTLR | Menhir |
|---|---|---|---|---|---|
| Build system | Single C file | System package | System package | Java JAR | opam |
| Dependencies | None (core) | None | None | Java 11+ | OCaml |
| Self-contained | Yes | Yes | Yes | No (runtime lib) | No (runtime lib) |
| Meson integration | Yes | Manual | Manual | Gradle/Maven | dune |
| Nix support | Yes (flake.nix) | Via nixpkgs | Via nixpkgs | Via nixpkgs | Via nixpkgs |
Lime descends from the same LALR(1) tradition as Yacc, but with a modernized design. Key differences:
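- Lime parsers are reentrant by default. Yacc parsers keep their state in global variables (`yylval`, `yychar`) and require careful modification for thread safety.
- Lime supports `%destructor` directives to free semantic values during error recovery, avoiding the memory leaks common in Yacc parsers.
- Lime generates a push parser that the caller feeds one token at a time, whereas a Yacc parser runs its own loop and calls `yylex()` internally. The push model is easier to integrate with event-driven architectures and custom tokenizers.
- Lime rules refer to symbols by name (A, B, C) instead of Yacc's positional `$1`, `$2` notation. This makes rules more readable when they have many symbols.

Bison is the GNU successor to Yacc, with many additional features. Lime takes a different approach: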
ANTLR and Lime represent fundamentally different parsing philosophies:
Menhir is an LR(1) parser generator for OCaml. Comparison is most relevant for users choosing between the C and OCaml ecosystems:
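- Menhir relies on OCaml's static type system for semantic values, whereas Lime uses C `void*` and struct types, which offer less compile-time safety.
- Menhir supports custom syntax error messages through `.messages` files. Lime provides `%syntax_error` and `%parse_failure` callbacks.

- **`%expect`** – to document and track expected conflicts

If you are migrating from another parser generator to Lime, see the dedicated migration guides: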
These guides include directive mapping tables, syntax translation rules, and worked examples from the examples/bootstrap/ directory (a real PostgreSQL grammar converted from Bison to Lime).
These numbers are from the Lime benchmark suite on Linux x86_64 (GCC 14, -O2). See PERFORMANCE.md for full methodology.
| Grammar Size | Lime (interpreted) | Lime (JIT) | Typical Bison/Yacc |
|---|---|---|---|
| Small (64 states) | 424 ns | 168 ns | ~500 ns |
| Medium (256 states) | 1,244 ns | 412 ns | ~1,500 ns |
| Large (512 states) | 2,890 ns | 689 ns | ~3,500 ns |
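
The latency table is easier to interpret as speedup ratios. The sketch below simply divides the interpreted latency by the JIT latency for each row; the struct and function names are illustrative, and the only data used are the figures from the table.

```c
#include <stddef.h>

/* Latencies in nanoseconds, copied from the table above. */
typedef struct {
    const char *grammar;
    double interpreted_ns;
    double jit_ns;
} LatencyRow;

static const LatencyRow ROWS[] = {
    {"Small (64 states)",    424.0, 168.0},
    {"Medium (256 states)", 1244.0, 412.0},
    {"Large (512 states)",  2890.0, 689.0},
};
static const size_t ROW_COUNT = sizeof ROWS / sizeof ROWS[0];

/* Speedup of JIT mode over interpreted mode for one table row. */
static double jit_speedup(const LatencyRow *row) {
    return row->interpreted_ns / row->jit_ns;
}
```

Applied to the three rows, `jit_speedup` gives roughly 2.5x, 3.0x, and 4.2x, so in these measurements the benefit of JIT compilation grows with the number of parser states.
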

| Implementation | Throughput |
|---|---|
| Lime (scalar) | ~343 MB/s |
| Lime (AVX2) | ~1.5 GB/s (estimated) |
| Typical flex-generated | ~200–400 MB/s |
| ANTLR lexer | ~50–150 MB/s |

| Tool | Base Snapshot | Per-Thread Overhead |
|---|---|---|
| Lime | ~300 KB | ~4 KB (ParseContext + stack) |
| Bison/Yacc | ~50–200 KB | ~2 KB (parser stack) |
| ANTLR | ~2–10 MB | ~500 KB (parse tree nodes) |
Note: Lime's higher base memory reflects the snapshot architecture that enables runtime extensibility and thread safety. When no extensions are loaded, the active parsing overhead is comparable to Bison/Yacc.
| Tool | License | Generated Code License |
|---|---|---|
| Lime | Public Domain | Public Domain |
| Yacc | Varies (BSD on most systems) | Same as tool |
| Bison | GPL v3 | GPL with special exception (generated parsers may be used in non-GPL projects) |
| ANTLR | BSD 3-Clause | BSD 3-Clause |
| Menhir | GPL v2 | Unencumbered (with --only-preprocess) |
Lime's Public Domain status means there are zero licensing restrictions on the generator tool, the generated parser code, or the runtime template. This makes it suitable for any project regardless of its own license.