|
Lime Parser Generator 0.1.0
Runtime-extensible LALR(1) parser with SIMD tokenization and LLVM JIT
|
JIT (Just-In-Time) compilation in Lime targets action table lookups during parsing. While JIT can deliver a 2-5x speedup on the lookup step itself, action table lookups account for only 10-20% of total SQL parse time. The dominant bottleneck is tokenization, which consumes 40-60% of parse time. As a result, JIT compilation of action tables alone yields a modest 1.1-1.5x overall improvement. The highest-value JIT optimization is tokenizer JIT, which targets the actual bottleneck and can deliver 1.8-2.8x overall speedup.
This document provides a detailed cost-benefit analysis to help users decide when and how to apply JIT compilation in their Lime-based parsers.
Understanding where time is spent during parsing is essential for evaluating JIT impact.
| Phase | Share of Total Parse Time | Description |
|---|---|---|
| Tokenization | 40-60% | Lexical analysis, keyword classification |
| Semantic actions | 20-30% | User-defined reduce actions, AST builds |
| Action table lookup | 10-20% | State transitions, shift/reduce dispatch |
| Stack operations | 5-10% | Push/pop, state stack management |
The action table lookup phase is the target of the core JIT optimization. It involves looking up the correct parser action (shift, reduce, accept, or error) given the current state and lookahead token.
JIT compilation replaces the interpreter-style table lookup (a sequence of comparisons or a binary search over the compressed action table) with native machine code that directly encodes state transitions as a computed jump.
The formula: if action lookup is 15% of total time and JIT gives a 3x speedup on that portion, overall speedup is 1 / (0.85 + 0.15/3) = 1.11x.
JIT compilation itself has a cost that must be amortized over multiple parse sessions.
| JIT Backend | Compilation Latency | Memory Overhead |
|---|---|---|
| MCJIT | 10-50ms | All states compiled |
| OrcJIT | Near-zero startup | Only hot states held |
The break-even point is the number of parse sessions required before JIT overhead is recovered through faster parsing.
| Backend | Break-Even (approx.) | Notes |
|---|---|---|
| MCJIT | ~100-500 parse sessions | High upfront cost, full compilation |
| OrcJIT | ~10 parse sessions | Lazy compilation, minimal startup cost |
For short-lived processes that parse only a handful of inputs, JIT overhead may never be recovered. For long-running servers, the cost is negligible.
JIT compilation provides meaningful value in the following scenarios:
JIT compilation adds complexity and overhead with little payoff in these cases:
Lime supports two strategies for compiled action tables: ahead-of-time (AOT) compilation at parser generation time, and runtime JIT compilation during parsing.
AOT compilation generates native action table code at parser generation time and embeds it directly in the output C file.
Advantages:
Disadvantages:
Runtime JIT compiles action table lookups during parser execution, potentially adapting to observed usage patterns.
Advantages:
Disadvantages:
Use AOT for most deployments. AOT compilation provides the same steady-state performance as runtime JIT with none of the runtime complexity. It eliminates the LLVM dependency from the generated parser, simplifying deployment and reducing the binary size.
Use runtime JIT for adaptive scenarios where workload patterns are unknown at build time or vary significantly across deployments. This is primarily relevant for general-purpose database engines that must handle diverse query patterns.
LLVM provides two JIT compilation frameworks. Lime initially targets MCJIT but should migrate to OrcJIT for production use.
MCJIT compiles an entire LLVM module at once. For Lime, this means all parser states are compiled in a single batch.
| Metric | MCJIT Behavior |
|---|---|
| Compilation model | Whole-module, all states at once |
| Startup latency | 10-50ms (scales with grammar size) |
| Memory usage | All states resident in compiled code |
| Granularity | Cannot compile individual states |
MCJIT is suitable for prototyping and testing the JIT pipeline but is not recommended for production due to its high startup cost and all-or-nothing compilation model.
OrcJIT (On-Request Compilation JIT) supports lazy, per-function compilation. Each parser state can be compiled independently, on demand.
| Metric | OrcJIT Behavior |
|---|---|
| Compilation model | Per-state, on demand |
| Startup latency | Near-zero (compile on first use) |
| Memory usage | Only hot states compiled and retained |
| Granularity | Individual state functions |
| CPU savings | 5-10x less compilation CPU vs MCJIT |
OrcJIT advantages over MCJIT:
The tokenizer JIT targets the actual parsing bottleneck and represents the single highest-impact JIT optimization available in Lime.
Tokenization consumes 40-60% of total parse time. The primary cost within tokenization is keyword classification: determining whether an identifier like SELECT, FROM, or WHERE is a reserved keyword or a user identifier.
Traditional approaches use hash table lookups for keyword classification. While hash tables provide O(1) average-case lookup, they involve:
The tokenizer JIT replaces hash table keyword lookup with a JIT-compiled trie (prefix tree) that maps directly to native branch instructions.
How it works:
Advantages of the trie approach:
| Metric | Value |
|---|---|
| Speedup on tokenization | 2-3x |
| Share of total time affected | 40-60% |
| Overall parse speedup | 1.8-2.8x |
| Compilation cost | < 5ms (small keyword set) |
The overall speedup is calculated as: if tokenization is 50% of total time and trie JIT gives a 2.5x speedup, overall speedup is 1 / (0.50 + 0.50/2.5) = 1.67x. Combined with action table JIT on the remaining 15% of time, the compound speedup can reach 1.8-2.8x depending on grammar and query characteristics.
| Aspect | Action Table JIT | Tokenizer JIT (5E) |
|---|---|---|
| Target | 10-20% of parse time | 40-60% of parse time |
| Speedup on target | 2-5x | 2-3x |
| Overall impact | 1.1-1.5x | 1.8-2.8x |
| Complexity | Moderate | Moderate |
| Recommendation | Implement second | Implement first |
The following defaults provide a reasonable starting point for most deployments:
| Parameter | Default Value | Description |
|---|---|---|
LIME_JIT_ENABLE | 0 (off) | Enable runtime JIT compilation |
LIME_JIT_MIN_STATES | 100 | Minimum grammar states to trigger JIT |
LIME_JIT_HOT_THRESHOLD | 10 | State visits before lazy JIT fires |
LIME_JIT_COMPILE_ALL | 0 (off) | Force compilation of all states (MCJIT) |
LIME_TOKENIZER_JIT | 0 (off) | Enable tokenizer trie JIT |
LIME_JIT_MIN_STATES if JIT is triggering on small grammars where the overhead is not justified. Values of 200-500 are reasonable for conservative deployments.LIME_JIT_HOT_THRESHOLD for servers that need fast warmup on initial queries. A value of 1 compiles states on first visit (similar to AOT but at runtime).LIME_JIT_HOT_THRESHOLD to reduce total compilation work when only a small fraction of states are truly hot. Values of 50-100 ensure only the most frequently visited states are compiled.LIME_JIT_COMPILE_ALL only when using MCJIT and you want deterministic performance (no lazy compilation jitter). This is equivalent to AOT but performed at runtime.LIME_TOKENIZER_JIT for any deployment where tokenization is the bottleneck (which is most SQL parsing workloads). This is the single most impactful optimization.