diff --git a/.cargo/config.toml b/.cargo/config.toml new file mode 100644 index 0000000..452f762 --- /dev/null +++ b/.cargo/config.toml @@ -0,0 +1,5 @@ +# `cargo regen` rebuilds every input to web/assets/bench.json with one command: +# the timing benches, the two memory benches (a separate process each, since +# they install a counting global allocator), and finally the export. +[alias] +regen = "run --release --bin sqlbench -- regen" diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index ee0ae2d..10f1a11 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -17,11 +17,21 @@ No unsafe code is allowed (`unsafe_code = "forbid"`). Clippy runs with pedantic The site under `web/` is a Dioxus -> WASM app that renders a committed snapshot, `web/assets/bench.json`, produced by `sqlbench export`. CI (`.github/workflows/pages.yml`) only builds and deploys the committed crates, so regenerate the snapshot manually after changing the corpus or parsers: ```bash -cargo bench # write target/bench_dist/ timings (long) -cargo run --bin sqlbench -- export # write web/assets/bench.json -cd web && dx serve # preview at http://127.0.0.1:8080/sql_ast_benchmark/ +cargo regen # one command: timing benches + memory benches + export (long) +cd web && dx serve # preview at http://127.0.0.1:8080/sql_ast_benchmark/ ``` +`cargo regen` (alias in `.cargo/config.toml` for `cargo run --release --bin sqlbench -- regen`) runs the producers in order and ends with the export. The memory benches install a counting global allocator, so they each run in their own process, separate from the timing bench and from export. That is the only reason this is a pipeline rather than a single binary. 
To run a stage on its own: + +```bash +cargo bench # write target/bench_dist/ + target/batch_dist/ timings +cargo run --release -p membench # write target/mem_dist/ per-statement memory +cargo run --release -p membench -- batch # write target/batch_mem_dist/ whole-script memory +cargo run --bin sqlbench -- export # read all of the above, write web/assets/bench.json +``` + +`export` reads whatever timing, memory, and batch summaries are present under `target/` and warns (rather than fails) for any that are missing, so the memory and batch columns stay empty until their producers have been run. + The charts are rendered in the browser from the JSON by the shared `viz` crate (plotters, SVG backend), so no chart images are committed. ## Coverage @@ -31,4 +41,4 @@ tar --zstd -xf datasets.tar.zst # coverage runs the bench in smoke mode, which cargo tarpaulin # LLVM engine, includes the bench ``` -`tarpaulin.toml` runs the benchmark in verify-only mode (`--test`) under the LLVM engine, since the benchmark is the main exercise of the `BenchParser` layer. With the corpus present it covers `benches/parsing.rs` and the dialect-mapping / accept / reprint paths in `src/lib.rs`. +`tarpaulin.toml` runs the benchmark in verify-only mode (`--test`) under the LLVM engine, since the benchmark is the main exercise of the `BenchParser` layer. With the corpus present it covers `benches/parsing.rs` and `benches/batch_parsing.rs` (both in smoke mode) and the dialect-mapping / accept / reprint paths in `src/lib.rs`. 
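Reviewer note on the CONTRIBUTING.md change above: the "warns rather than fails" contract for `export` can be illustrated with a minimal sketch. This is not the repository's code (the diff's `read_batch_perf` / `read_batch_mem` fall back to an empty `Vec` and print a single note in `run`). The helper names `read_optional_summary` and `data_rows` are hypothetical and exist only for this illustration.

```rust
use std::fs;
use std::path::Path;

/// Hypothetical helper (not in the diff): read a summary file if it exists.
/// A missing or unreadable file is reported on stderr and mapped to `None`,
/// so the caller can leave the matching columns empty instead of aborting.
fn read_optional_summary(path: &Path) -> Option<String> {
    match fs::read_to_string(path) {
        Ok(content) => Some(content),
        Err(err) => {
            eprintln!(
                "note: {} not readable ({err}); its columns will be empty",
                path.display()
            );
            None
        }
    }
}

/// Hypothetical helper: count data rows, skipping the CSV header line and
/// any blank lines.
fn data_rows(content: &str) -> usize {
    content
        .lines()
        .skip(1)
        .filter(|line| !line.trim().is_empty())
        .count()
}
```

A caller would combine the two as `read_optional_summary(path).map_or(0, |c| data_rows(&c))`, which yields zero rows for a producer that has not been run yet.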
diff --git a/Cargo.toml b/Cargo.toml index b6be095..287786b 100644 --- a/Cargo.toml +++ b/Cargo.toml @@ -51,6 +51,10 @@ syntect = { version = "5", default-features = false, features = ["default-fancy" name = "parsing" harness = false +[[bench]] +name = "batch_parsing" +harness = false + [[bin]] name = "sqlbench" path = "src/bin/sqlbench.rs" diff --git a/README.md b/README.md index da37f54..77d3a1b 100644 --- a/README.md +++ b/README.md @@ -38,19 +38,29 @@ Per-parser repository metadata (stars, contributors, fuzzing, test and benchmark 311,594 statements across 34 files and 13 dialects, committed compressed as `datasets.tar.zst` (5.3 MB) and unpacked to `datasets/{dialect}/{name}.txt`, one statement per line. The commands below extract it automatically on first use. All sources are openly licensed (Apache-2.0, MIT, BSD, public domain or CC-BY), drawn from each engine's own regression suites and official samples. Natural-language-with-embedded-SQL datasets are intentionally excluded. -Correctness is defined per dialect. Dialects with a runnable engine are graded against that real database engine, run in Docker via testcontainers by the `oracle` crate: a statement is valid unless the engine reports a syntax error (a missing table or column still counts as parsed). The validity labels are computed once and committed under `oracle/labels`, so grading and CI need no Docker. That reference splits the corpus into valid and invalid and scores recall, false positives, round-trip, and fidelity. Dialects with no runnable engine (cloud services, heavy JVM engines) have no reference, so their statements count as provenance-valid (sourced from each engine's own suites) and the metric is acceptance rate. Speed is a per-statement parse-time distribution over every accepted statement, timed with an adaptive iteration count on a no-`catch_unwind` path. +Correctness is defined per dialect. 
Dialects with a runnable engine are graded against that real database engine, run in Docker via testcontainers by the `oracle` crate: a statement is valid unless the engine reports a syntax error (a missing table or column still counts as parsed). The validity labels are computed once and committed under `oracle/labels`, so grading and CI need no Docker. That reference splits the corpus into valid and invalid and scores recall, false positives, round-trip, and fidelity. Dialects with no runnable engine (cloud services, heavy JVM engines) have no reference, so their statements count as provenance-valid (sourced from each engine's own suites) and the metric is acceptance rate. Speed is a per-statement parse-time distribution over every accepted statement, timed with an adaptive iteration count on a no-`catch_unwind` path. Memory is measured separately with a counting allocator, as peak live bytes and retained (AST) bytes per statement. A companion batch axis parses each parser's whole accepted set as one script and normalizes the time and memory by the statement count, showing what bulk parsing amortizes against parsing one statement at a time. A batch that does not parse the whole set (a parser that bails out partway) is dropped rather than reported, and parsers without a multi-statement entry point (databend-common-ast) sit out the batch axis. ## Running -The corpus auto-extracts on first use, so just run: +The corpus auto-extracts on first use. To rebuild the whole explorer snapshot (`web/assets/bench.json`) with one command: + +```bash +cargo regen # timing benches + memory benches + export, in order +``` + +That is an alias (see `.cargo/config.toml`) for `cargo run --release --bin sqlbench -- regen`. The memory measurement installs a counting global allocator, so it has to run in its own process, separate from the timing bench (which must stay on the default allocator for fair numbers). The `regen` command orchestrates that sequence so you do not have to. 
The individual steps, if you want to run one on its own: ```bash cargo run --release --bin sqlbench correctness --per-file # per-file acceptance, every dialect cargo run --release --bin sqlbench correctness # reference + provenance correctness -cargo bench # parse-throughput, every dialect +cargo bench # parse time (per-statement and batch), every dialect +cargo run --release -p membench # per-statement memory (peak + retained bytes) +cargo run --release -p membench -- batch # whole-script (batch) memory, per statement cargo run --release --bin sqlbench export # regenerate web/assets/bench.json for the explorer ``` +`cargo bench` runs both the per-statement (`parsing`) and whole-script (`batch_parsing`) timing benches. Add `--bench batch_parsing` to run only the batch one. `export` reads whatever the benches left under `target/`, warning rather than failing for any missing source, so the memory and batch columns stay empty until their producers have run. + Validity labels for the reference dialects are produced by the `oracle` crate (real engines in Docker via testcontainers) and committed under `oracle/labels`, so `correctness` and `export` need no Docker. Regenerate them with `cargo run --release -p oracle`. ### Requirements diff --git a/benches/batch_parsing.rs b/benches/batch_parsing.rs new file mode 100644 index 0000000..01528b4 --- /dev/null +++ b/benches/batch_parsing.rs @@ -0,0 +1,245 @@ +//! Multi-dialect BATCH (whole-script) parse-time benchmark over the full +//! `datasets/` corpus. +//! +//! Companion to `benches/parsing.rs`. Where `parsing` times each statement in +//! isolation, this concatenates every statement a parser accepts in a dialect +//! into one script and times parsing that whole script in a single call, then +//! divides by the statement count to get a normalized per-statement cost. The +//! contrast between this and the per-statement median isolates what a batch API +//! 
pays or amortizes, the effect raised in issue #15: `Parser::parse_sql` grows +//! a `Vec` of large `Statement` values, so bulk parsing can behave differently +//! from many single-statement calls. +//! +//! Both axes are measured over the SAME accepted set (statements the parser +//! parses in that dialect), so the two numbers are directly comparable. +//! +//! Only parsers with a multi-statement entry point take part (see +//! `BenchParser::can_batch`); `databend-common-ast` parses one statement per +//! call and is simply skipped here. +//! +//! Output (under `target/batch_dist/`), self-contained for now (not yet wired +//! into the web export): +//! - `summary.csv` : per-pair statement count, statements the parser saw, +//! batch size in bytes, whole-script time, and time normalized per +//! statement. +//! +//! Full run: `cargo bench --bench batch_parsing` +//! Smoke (default): `cargo test` or `cargo bench --bench batch_parsing -- --test` +//! +//! The full run unpacks `datasets.tar.zst` automatically if `datasets/` is +//! missing. The smoke path needs no corpus, so `cargo test` stays fast. + +use sql_ast_benchmark::batch::join_batch; +use sql_ast_benchmark::datasets::Dialect; +use sql_ast_benchmark::report::load_dialect; +use sql_ast_benchmark::BenchParser; +use std::fs; +use std::hint::black_box; +use std::io::Write as _; +use std::time::Instant; + +/// Deep statements can exhaust the default stack inside recursive-descent +/// parsers, and a stack overflow aborts the process, so time on a large stack. 
+const WORKER_STACK: usize = 1024 * 1024 * 1024; + +const OUT_DIR: &str = "target/batch_dist"; + +const DIALECTS: &[Dialect] = &[ + Dialect::Postgresql, + Dialect::Sqlite, + Dialect::Mysql, + Dialect::Clickhouse, + Dialect::Duckdb, + Dialect::Hive, + Dialect::SparkSql, + Dialect::Trino, + Dialect::Tsql, + Dialect::Oracle, + Dialect::Bigquery, + Dialect::Redshift, + Dialect::Multi, +]; + +/// Whole-script parse time (ns/batch): adaptive iteration count so a short +/// script still accumulates enough work per round, capped low because one batch +/// call already does a lot. Best (min) of `ROUNDS` rounds. +fn time_batch(mut f: impl FnMut() -> usize) -> f64 { + const TARGET_NS: u128 = 2_000_000; // aim for ~2 ms of work per round + const ROUNDS: usize = 5; + + black_box(f()); // warm up + let probe = Instant::now(); + black_box(f()); + let single = probe.elapsed().as_nanos().max(1); + let iters = u64::try_from((TARGET_NS / single).clamp(1, 1_000)).unwrap_or(1); + + let mut best = f64::MAX; + for _ in 0..ROUNDS { + let start = Instant::now(); + for _ in 0..iters { + black_box(f()); + } + let per = start.elapsed().as_nanos() as f64 / iters as f64; + best = best.min(per); + } + best +} + +struct Row { + dialect: &'static str, + parser: &'static str, + /// Statements fed into the batch (the parser's accepted set). + n_accepted: usize, + /// Statements the parser reported parsing from the batch (coverage). + n_parsed: usize, + batch_bytes: usize, + /// Whole-script parse time (ns). + batch_ns: f64, + /// `batch_ns / n_accepted`: time per statement in batch context. + ns_per_stmt: f64, +} + +/// Time one (parser, dialect) pair: build the accepted set, concatenate it into +/// one script, time the whole-script parse, and normalize per statement. 
+fn run_pair(parser: BenchParser, dialect: Dialect, stmts: &[String]) -> Row { + let accepted: Vec<&str> = stmts + .iter() + .filter(|s| parser.accepts(s, dialect) == Some(true)) + .map(String::as_str) + .collect(); + + let mut row = Row { + dialect: dialect.dir_name(), + parser: parser.name(), + n_accepted: accepted.len(), + n_parsed: 0, + batch_bytes: 0, + batch_ns: 0.0, + ns_per_stmt: 0.0, + }; + if accepted.is_empty() { + return row; + } + + let batch = join_batch(&accepted); + row.batch_bytes = batch.len(); + row.n_parsed = parser.parse_batch(&batch, dialect).unwrap_or(0); + row.batch_ns = time_batch(|| parser.parse_batch(&batch, dialect).unwrap_or(0)); + row.ns_per_stmt = row.batch_ns / accepted.len() as f64; + row +} + +/// Quick smoke check used by `cargo test`: every batch-capable parser parses a +/// tiny multi-statement script per supported dialect without panicking. Needs +/// no corpus, so it stays instant. +fn smoke() { + std::panic::set_hook(Box::new(|_| {})); + let script = "SELECT 1;\nSELECT 2;\nSELECT 3"; + for &dialect in DIALECTS { + for parser in BenchParser::all() { + if !parser.can_batch() || !parser.supports(dialect) { + continue; + } + black_box(parser.parse_batch(script, dialect)); + } + } + println!("smoke ok"); +} + +fn main() { + // Match `benches/parsing.rs`: only an explicit `cargo bench` (which passes + // `--bench` and not `--test`) does the full, datasets-backed run. `cargo + // test` and a bare run take the fast smoke path, which needs no corpus. + let args: Vec<String> = std::env::args().collect(); + let full_run = args.iter().any(|a| a == "--bench") && !args.iter().any(|a| a == "--test"); + if !full_run { + smoke(); + return; + } + + // Acceptance checks are panic-guarded; suppress the default panic message so + // a caught panic does not spam stderr. 
+ std::panic::set_hook(Box::new(|_| {})); + + if let Err(e) = sql_ast_benchmark::datasets::ensure_corpus() { + eprintln!("ERROR: could not prepare datasets/: {e}"); + std::process::exit(1); + } + fs::create_dir_all(OUT_DIR).expect("create out dir"); + + let mut summary = fs::File::create(format!("{OUT_DIR}/summary.csv")).expect("summary.csv"); + writeln!( + summary, + "dialect,parser,n_accepted,n_parsed,batch_bytes,batch_ns,ns_per_stmt" + ) + .unwrap(); + + let parsers = BenchParser::all(); + let start_all = Instant::now(); + + for &dialect in DIALECTS { + let stmts = load_dialect(dialect); + if stmts.is_empty() { + continue; + } + for parser in &parsers { + let parser = *parser; + if !parser.can_batch() || !parser.supports(dialect) { + continue; + } + let job_start = Instant::now(); + // Run on a large stack: deeply nested accepted statements can + // otherwise overflow the default stack and abort the process. + let result = std::thread::scope(|scope| { + std::thread::Builder::new() + .stack_size(WORKER_STACK) + .spawn_scoped(scope, || run_pair(parser, dialect, &stmts)) + .expect("spawn worker") + .join() + }); + let Ok(row) = result else { + eprintln!( + " [warn] {}/{} panicked, skipping pair", + dialect.dir_name(), + parser.name() + ); + continue; + }; + + writeln!( + summary, + "{},{},{},{},{},{:.1},{:.1}", + row.dialect, + row.parser, + row.n_accepted, + row.n_parsed, + row.batch_bytes, + row.batch_ns, + row.ns_per_stmt, + ) + .unwrap(); + summary.flush().unwrap(); + + let coverage = if row.n_accepted == 0 { + 0.0 + } else { + 100.0 * row.n_parsed as f64 / row.n_accepted as f64 + }; + println!( + "{:<11} {:<24} n={:>6} seen={:>6} ({:>3.0}%) batch={:>9.0}ns/stmt ({:.1}s)", + row.dialect, + row.parser, + row.n_accepted, + row.n_parsed, + coverage, + row.ns_per_stmt, + job_start.elapsed().as_secs_f64(), + ); + } + } + + println!( + "\nDone in {:.1}s. 
summary.csv in {OUT_DIR}/", + start_all.elapsed().as_secs_f64() + ); +} diff --git a/membench/src/main.rs b/membench/src/main.rs index 281480a..4b5e1b4 100644 --- a/membench/src/main.rs +++ b/membench/src/main.rs @@ -11,7 +11,14 @@ //! window. The libpg_query bindings parse in C and report `None` (their memory //! is invisible to the Rust allocator). //! -//! Run locally: `cargo run --release -p membench` +//! A `batch` subcommand measures whole-script memory instead: per (parser, +//! dialect) it concatenates the accepted set into one script, parses it holding +//! every AST live, and records peak/retained bytes normalized per statement to +//! `target/batch_mem_dist/summary.csv`. Databend has no batch entry point and +//! is skipped there. +//! +//! Run locally: `cargo run --release -p membench` (per-statement) +//! `cargo run --release -p membench -- batch` (whole-script) use std::alloc::{GlobalAlloc, Layout, System}; use std::fmt::Write as _; @@ -19,6 +26,7 @@ use std::fs; use std::io::Write as _; use std::path::Path; +use sql_ast_benchmark::batch::join_batch; use sql_ast_benchmark::datasets::{ensure_corpus, Dialect}; use sql_ast_benchmark::stats::slug; use sql_ast_benchmark::BenchParser; @@ -59,6 +67,7 @@ unsafe impl GlobalAlloc for Counting { static GLOBAL: Counting = Counting; const OUT_DIR: &str = "target/mem_dist"; +const BATCH_OUT_DIR: &str = "target/batch_mem_dist"; /// Deep statements can overflow the stack in recursive-descent parsers, so run /// the whole measurement on a large stack (and a single thread). @@ -163,11 +172,79 @@ fn run() { } } +/// Whole-script (batch) memory: one (peak, retained) pair per (parser, dialect), +/// normalized per statement, written to a single summary file. Only parsers with +/// a batch entry point whose memory is visible to the Rust allocator take part. 
+fn run_batch() { + fs::create_dir_all(BATCH_OUT_DIR).expect("create batch_mem_dist dir"); + let mut summary = + fs::File::create(format!("{BATCH_OUT_DIR}/summary.csv")).expect("create summary.csv"); + writeln!( + summary, + "dialect,parser,n_accepted,n_parsed,peak_bytes,retained_bytes,peak_per_stmt,retained_per_stmt" + ) + .expect("write header"); + + for &dialect in DIALECTS { + let stmts = load_dialect(dialect); + if stmts.is_empty() { + continue; + } + for parser in BenchParser::all() { + if !parser.can_batch() || !parser.supports(dialect) { + continue; + } + let accepted: Vec<&str> = stmts + .iter() + .filter(|s| parser.accepts(s, dialect) == Some(true)) + .map(String::as_str) + .collect(); + if accepted.is_empty() { + continue; + } + let batch = join_batch(&accepted); + // Warm up: let one-time caches/lazy statics allocate first, so they + // raise the baseline rather than this measurement. Also skips + // parsers whose memory is invisible to the Rust allocator (None). + if parser.measure_mem_batch(&batch, dialect).is_none() { + continue; + } + let Some((peak, retained)) = parser.measure_mem_batch(&batch, dialect) else { + continue; + }; + // Statements the parser actually consumed from the script, so the + // export can drop a pair whose batch parse bailed out early. 
+ let n_parsed = parser.parse_batch(&batch, dialect).unwrap_or(0); + + let n = accepted.len() as f64; + writeln!( + summary, + "{},{},{},{n_parsed},{peak},{retained},{:.1},{:.1}", + dialect.dir_name(), + parser.name(), + accepted.len(), + peak as f64 / n, + retained as f64 / n, + ) + .expect("write row"); + summary.flush().expect("flush summary"); + let coverage = 100.0 * n_parsed as f64 / n; + eprintln!( + "batch-mem {} {}: n={} seen={n_parsed} ({coverage:.0}%) peak={peak} retained={retained}", + dialect.dir_name(), + parser.name(), + accepted.len(), + ); + } + } +} + fn main() { ensure_corpus().expect("dataset corpus"); + let batch = std::env::args().any(|a| a == "batch"); std::thread::Builder::new() .stack_size(WORKER_STACK) - .spawn(run) + .spawn(move || if batch { run_batch() } else { run() }) .expect("spawn worker") .join() .expect("measurement thread panicked"); diff --git a/src/batch.rs b/src/batch.rs new file mode 100644 index 0000000..4a169ed --- /dev/null +++ b/src/batch.rs @@ -0,0 +1,51 @@ +//! Shared construction of a multi-statement script for the batch benchmarks. +//! +//! Both the batch time bench (`benches/batch_parsing.rs`) and the batch memory +//! bench (`membench -- batch`) must feed parsers byte-identical input, so the +//! join lives here in one place rather than in each binary. + +/// Join accepted statements into a single multi-statement script. +/// +/// Each corpus statement is one line, so a `;`-and-newline separator yields an +/// unambiguous script. A trailing `;` on a statement is stripped first to avoid +/// an empty statement between terminators. The last statement gets no terminator +/// (none is required at end of input). 
+#[must_use] +pub fn join_batch(accepted: &[&str]) -> String { + let mut out = String::with_capacity(accepted.iter().map(|s| s.len() + 2).sum()); + for (i, s) in accepted.iter().enumerate() { + if i > 0 { + out.push_str(";\n"); + } + out.push_str(s.trim().trim_end_matches(';').trim_end()); + } + out +} + +#[cfg(test)] +mod tests { + use super::join_batch; + + #[test] + fn joins_with_terminators_and_strips_trailing_semicolons() { + assert_eq!( + join_batch(&["SELECT 1;", "SELECT 2"]), + "SELECT 1;\nSELECT 2" + ); + // Already-terminated and whitespace-padded statements normalize cleanly. + assert_eq!( + join_batch(&[" SELECT 1 ; ", "SELECT 2 ;"]), + "SELECT 1;\nSELECT 2" + ); + } + + #[test] + fn single_statement_has_no_terminator() { + assert_eq!(join_batch(&["SELECT 1"]), "SELECT 1"); + } + + #[test] + fn empty_input_is_empty() { + assert_eq!(join_batch(&[]), ""); + } +} diff --git a/src/bench_dist.rs b/src/bench_dist.rs index c308620..c6dd9f6 100644 --- a/src/bench_dist.rs +++ b/src/bench_dist.rs @@ -13,6 +13,12 @@ pub const DIST_DIR: &str = "target/bench_dist"; /// Directory where `membench` writes raw per-statement memory files. pub const MEM_DIR: &str = "target/mem_dist"; +/// Directory where the batch (whole-script) time bench writes its summary. +pub const BATCH_DIST_DIR: &str = "target/batch_dist"; + +/// Directory where `membench -- batch` writes its batch-memory summary. +pub const BATCH_MEM_DIR: &str = "target/batch_mem_dist"; + /// Ascending-sorted byte values for one `(dialect, parser, kind)`, where `kind` /// is `"peak"` or `"retained"`, from `target/mem_dist/{dialect}__{slug}.{kind}.txt` /// (empty if absent). diff --git a/src/bin/sqlbench.rs b/src/bin/sqlbench.rs index ec838ea..adf4522 100644 --- a/src/bin/sqlbench.rs +++ b/src/bin/sqlbench.rs @@ -12,6 +12,8 @@ //! prints the per-dataset acceptance matrix instead //! of per-dialect reference metrics. //! export write `web/assets/bench.json` for the explorer. +//! 
regen run the whole data pipeline (timing + memory +//! benches, then export) with one command. //! //! The grading logic lives in the library (`report`). This binary is argument //! dispatch plus table formatting. @@ -182,10 +184,56 @@ fn run_coverage() { println!("\n(Reference dialects are graded against the real database engine, run in Docker by the `oracle` crate and cached under oracle/labels.)"); } +// regen (run the whole data pipeline with one command). + +/// Run every input producer for `bench.json` in order, then export. +/// +/// The memory bench installs a custom global allocator, so it must run in its +/// own process, separate from the timing bench (which must stay on the default +/// allocator for fair numbers) and from export. That is why this shells out to +/// the timing and memory benches rather than calling them in-process; export +/// runs in-process at the end since it needs no special allocator. +fn run_regen() { + if let Err(e) = sql_ast_benchmark::datasets::ensure_corpus() { + eprintln!("ERROR: could not prepare datasets/: {e}"); + std::process::exit(1); + } + // Each step writes under target/, which export then reads. 
+ let steps: [(&str, &[&str]); 3] = [ + ("cargo", &["bench"]), // target/bench_dist/ + target/batch_dist/ + ("cargo", &["run", "--release", "-p", "membench"]), // target/mem_dist/ + ( + "cargo", + &["run", "--release", "-p", "membench", "--", "batch"], + ), // target/batch_mem_dist/ + ]; + let total = steps.len() + 1; + for (i, (cmd, args)) in steps.iter().enumerate() { + eprintln!("\n[regen {}/{total}] {cmd} {}", i + 1, args.join(" ")); + let status = std::process::Command::new(cmd) + .args(*args) + .status() + .unwrap_or_else(|e| { + eprintln!("ERROR: could not launch `{cmd} {}`: {e}", args.join(" ")); + std::process::exit(1); + }); + if !status.success() { + eprintln!("ERROR: step failed: `{cmd} {}`", args.join(" ")); + std::process::exit(1); + } + } + eprintln!("\n[regen {total}/{total}] export"); + if let Err(e) = export::run() { + eprintln!("ERROR: {e}"); + std::process::exit(1); + } +} + fn usage() -> ! { eprintln!("usage: sqlbench "); eprintln!(" correctness [--per-file] grade parsers over datasets/"); eprintln!(" export write web/assets/bench.json for the site"); + eprintln!(" regen run timing + memory benches, then export"); std::process::exit(2); } @@ -217,6 +265,7 @@ fn main() { std::process::exit(1); } } + Some("regen") => run_regen(), Some("-h" | "--help" | "help") => usage(), Some(other) => { eprintln!("unknown subcommand: {other}"); diff --git a/src/export.rs b/src/export.rs index 05d7ff7..78b76e8 100644 --- a/src/export.rs +++ b/src/export.rs @@ -14,8 +14,8 @@ use crate::{bench_dist, stats, BenchParser}; use std::cmp::Ordering; use std::path::Path; use viz::{ - Bundle, CoverageFile, CoverageMatrix, DialectData, MemDist, ParserFailures, ParserMem, - ParserMetrics, ParserPerf, + Bundle, CoverageFile, CoverageMatrix, DialectData, MemDist, ParserBatch, ParserFailures, + ParserMem, ParserMetrics, ParserPerf, }; /// Output path (relative to repo root, where `cargo run` runs from). 
@@ -235,6 +235,127 @@ fn mem_for(dir: &str, parsers: &[BenchParser]) -> Vec { out } +/// One row of the batch time summary (`batch_dist/summary.csv`): +/// `dialect,parser,n_accepted,n_parsed,batch_bytes,batch_ns,ns_per_stmt`. +struct BatchPerfRow { + dialect: String, + parser: String, + n_accepted: usize, + n_parsed: usize, + ns_per_stmt: f64, +} + +/// One row of the batch memory summary (`batch_mem_dist/summary.csv`): +/// `dialect,parser,n_accepted,n_parsed,peak_bytes,retained_bytes,peak_per_stmt,retained_per_stmt`. +struct BatchMemRow { + dialect: String, + parser: String, + n_accepted: usize, + n_parsed: usize, + peak_per_stmt: f64, + retained_per_stmt: f64, +} + +/// Whether a batch parse consumed the whole accepted set, so its normalized cost +/// can be trusted. A fail-fast parser that errors partway yields `n_parsed` +/// below `n_accepted`; statements with internal `;` only push `n_parsed` higher, +/// so `>=` is the right "fully consumed" test. +const fn batch_complete(n_parsed: usize, n_accepted: usize) -> bool { + n_accepted > 0 && n_parsed >= n_accepted +} + +fn parse_batch_perf(content: &str) -> Vec<BatchPerfRow> { + content + .lines() + .skip(1) + .filter_map(|line| { + let f: Vec<&str> = line.split(',').collect(); + if f.len() < 7 { + return None; + } + Some(BatchPerfRow { + dialect: f[0].to_string(), + parser: f[1].to_string(), + n_accepted: f[2].trim().parse().ok()?, + n_parsed: f[3].trim().parse().ok()?, + ns_per_stmt: f[6].trim().parse().ok()?, + }) + }) + .collect() +} + +fn parse_batch_mem(content: &str) -> Vec<BatchMemRow> { + content + .lines() + .skip(1) + .filter_map(|line| { + let f: Vec<&str> = line.split(',').collect(); + if f.len() < 8 { + return None; + } + Some(BatchMemRow { + dialect: f[0].to_string(), + parser: f[1].to_string(), + n_accepted: f[2].trim().parse().ok()?, + n_parsed: f[3].trim().parse().ok()?, + peak_per_stmt: f[6].trim().parse().ok()?, + retained_per_stmt: f[7].trim().parse().ok()?, + }) + }) + .collect() +} + +fn read_batch_perf() -> Vec<BatchPerfRow> { 
+ let path = format!("{}/summary.csv", bench_dist::BATCH_DIST_DIR); + std::fs::read_to_string(path).map_or_else(|_| Vec::new(), |c| parse_batch_perf(&c)) +} + +fn read_batch_mem() -> Vec<BatchMemRow> { + let path = format!("{}/summary.csv", bench_dist::BATCH_MEM_DIR); + std::fs::read_to_string(path).map_or_else(|_| Vec::new(), |c| parse_batch_mem(&c)) +} + +/// Merge batch time and batch memory rows for one dialect into per-parser +/// `ParserBatch`. A parser appears only if at least one axis parsed the whole +/// accepted set (see [`batch_complete`]); an axis whose batch bailed out early +/// is dropped to `None` so the explorer never shows a misleading number. Pure, +/// so the merge and the guard are testable. +fn batch_for(dir: &str, perf: &[BatchPerfRow], mem: &[BatchMemRow]) -> Vec<ParserBatch> { + use std::collections::BTreeMap; + let mut map: BTreeMap<&str, ParserBatch> = BTreeMap::new(); + let blank = |parser: &str, n: usize| ParserBatch { + parser: parser.to_string(), + n_accepted: n, + ns_per_stmt: None, + peak_per_stmt: None, + retained_per_stmt: None, + }; + for r in perf.iter().filter(|r| r.dialect == dir) { + let e = map + .entry(r.parser.as_str()) + .or_insert_with(|| blank(&r.parser, r.n_accepted)); + e.n_accepted = r.n_accepted; + if batch_complete(r.n_parsed, r.n_accepted) { + e.ns_per_stmt = Some(r.ns_per_stmt); + } + } + for r in mem.iter().filter(|r| r.dialect == dir) { + let e = map + .entry(r.parser.as_str()) + .or_insert_with(|| blank(&r.parser, r.n_accepted)); + if batch_complete(r.n_parsed, r.n_accepted) { + e.peak_per_stmt = Some(r.peak_per_stmt); + e.retained_per_stmt = Some(r.retained_per_stmt); + } + } + // Drop parsers whose every axis was incomplete (nothing trustworthy to show). 
+ map.into_values() + .filter(|b| { + b.ns_per_stmt.is_some() || b.peak_per_stmt.is_some() || b.retained_per_stmt.is_some() + }) + .collect() +} + fn coverage_for(dialect: Dialect, all_parsers: &[BenchParser]) -> CoverageMatrix { let (parsers, files) = report::coverage_dialect(dialect, all_parsers); let cols: Vec<String> = parsers.iter().map(|p| p.name().to_string()).collect(); @@ -454,6 +575,15 @@ pub fn run() -> Result<(), Box> { bench_dist::DIST_DIR ); } + let batch_perf = read_batch_perf(); + let batch_mem = read_batch_mem(); + if batch_perf.is_empty() && batch_mem.is_empty() { + eprintln!( + "note: no batch summaries in {} / {}; batch columns will be empty. Run `cargo bench --bench batch_parsing` and `cargo run --release -p membench -- batch`.", + bench_dist::BATCH_DIST_DIR, + bench_dist::BATCH_MEM_DIR + ); + } let mut dialects = Vec::new(); for &d in &ORDER { @@ -472,6 +602,7 @@ pub fn run() -> Result<(), Box> { coverage: coverage_for(d, &parsers), failures: failures_for(d.dir_name(), &parsers), memory: mem_for(d.dir_name(), &parsers), + batch: batch_for(d.dir_name(), &batch_perf, &batch_mem), }); } @@ -495,8 +626,8 @@ pub fn run() -> Result<(), Box> { #[cfg(test)] mod tests { use super::{ - build_coverage_matrix, format_failure_tsv, git_short, metrics, now_utc, parse_summary, pct, - perf_row_to_perf, PerfRow, + batch_complete, batch_for, build_coverage_matrix, format_failure_tsv, git_short, metrics, + now_utc, parse_batch_mem, parse_batch_perf, parse_summary, pct, perf_row_to_perf, PerfRow, }; use crate::datasets::Dialect; use crate::report::{DialectReport, FileCoverage}; @@ -648,6 +779,82 @@ mod tests { assert_eq!(tsv, "statement\treason\nSELECT 1\t\n"); } + #[test] + fn batch_perf_parses_and_skips_short_lines() { + let csv = "dialect,parser,n_accepted,n_parsed,batch_bytes,batch_ns,ns_per_stmt\n\ + postgresql,sqlparser-rs,100,100,5000,400000.0,4000.0\n\ + short,row\n"; + let rows = parse_batch_perf(csv); + assert_eq!(rows.len(), 1); + assert_eq!(rows[0].parser, 
"sqlparser-rs"); + assert_eq!(rows[0].n_accepted, 100); + assert_eq!(rows[0].n_parsed, 100); + assert!((rows[0].ns_per_stmt - 4000.0).abs() < 1e-9); + } + + #[test] + fn batch_mem_parses_peak_and_retained_columns() { + let csv = "dialect,parser,n_accepted,n_parsed,peak_bytes,retained_bytes,peak_per_stmt,retained_per_stmt\n\ + sqlite,turso_parser,50,50,100000,40000,2000.0,800.0\n"; + let rows = parse_batch_mem(csv); + assert_eq!(rows.len(), 1); + assert_eq!(rows[0].n_parsed, 50); + assert!((rows[0].peak_per_stmt - 2000.0).abs() < 1e-9); + assert!((rows[0].retained_per_stmt - 800.0).abs() < 1e-9); + } + + #[test] + fn batch_complete_requires_full_consumption() { + assert!(batch_complete(10, 10)); // exactly consumed + assert!(batch_complete(12, 10)); // internal-semicolon inflation is fine + assert!(!batch_complete(9, 10)); // bailed out early + assert!(!batch_complete(0, 0)); // nothing accepted + } + + #[test] + fn batch_merge_combines_time_and_memory_and_filters_by_dialect() { + let perf = parse_batch_perf( + "h,h,h,h,h,h,h\n\ + postgresql,sqlparser-rs,10,10,1,100.0,10.0\n\ + postgresql,pg_query.rs,10,10,1,80.0,8.0\n", + ); + let mem = parse_batch_mem( + "h,h,h,h,h,h,h,h\n\ + postgresql,sqlparser-rs,10,10,1,1,500.0,200.0\n", + ); + let merged = batch_for("postgresql", &perf, &mem); + assert_eq!(merged.len(), 2); + let sp = merged.iter().find(|x| x.parser == "sqlparser-rs").unwrap(); + assert_eq!(sp.ns_per_stmt, Some(10.0)); + assert_eq!(sp.peak_per_stmt, Some(500.0)); + assert_eq!(sp.retained_per_stmt, Some(200.0)); + // pg_query has batch time but no Rust-visible batch memory. + let pg = merged.iter().find(|x| x.parser == "pg_query.rs").unwrap(); + assert_eq!(pg.ns_per_stmt, Some(8.0)); + assert_eq!(pg.peak_per_stmt, None); + // A different dialect yields nothing from the same rows. 
+ assert!(batch_for("sqlite", &perf, &mem).is_empty()); + } + + #[test] + fn batch_merge_drops_incomplete_parses() { + // Time bailed out early (5 of 10): its ns_per_stmt is untrustworthy and + // dropped. Memory parsed fully, so it survives and keeps the entry. + let perf = parse_batch_perf("h,h,h,h,h,h,h\npostgresql,sqlparser-rs,10,5,1,100.0,10.0\n"); + let mem = + parse_batch_mem("h,h,h,h,h,h,h,h\npostgresql,sqlparser-rs,10,10,1,1,500.0,200.0\n"); + let merged = batch_for("postgresql", &perf, &mem); + assert_eq!(merged.len(), 1); + assert_eq!(merged[0].ns_per_stmt, None); // dropped: incomplete + assert_eq!(merged[0].peak_per_stmt, Some(500.0)); + + // Both axes incomplete -> the parser is omitted entirely. + let perf2 = parse_batch_perf("h,h,h,h,h,h,h\npostgresql,sqlparser-rs,10,2,1,100.0,10.0\n"); + let mem2 = + parse_batch_mem("h,h,h,h,h,h,h,h\npostgresql,sqlparser-rs,10,3,1,1,500.0,200.0\n"); + assert!(batch_for("postgresql", &perf2, &mem2).is_empty()); + } + #[test] fn tsv_respects_the_cap() { let rows: Vec<String> = (0..2000).map(|i| format!("SELECT {i}")).collect(); diff --git a/src/lib.rs b/src/lib.rs index b2e8b1f..0b6400b 100644 --- a/src/lib.rs +++ b/src/lib.rs @@ -473,6 +473,83 @@ impl BenchParser { } } + /// Whether this parser exposes a multi-statement (batch) parse entry point, + /// so it can consume a whole script in one call. Only `databend-common-ast` + /// parses a single statement at a time, so it is excluded from the batch + /// benchmark (reported n/a there). + #[must_use] + pub const fn can_batch(self) -> bool { + !matches!(self, Self::Databend) + } + + /// Parse a whole multi-statement script `sql` in `dialect` for batch timing, + /// WITHOUT panic protection (like [`Self::parse_once`]). Returns the number + /// of statements the parser reports parsing, or `None` if the parser does + /// not model the dialect or has no batch entry point ([`Self::can_batch`]).
+ /// Fail-fast parsers (those returning a `Vec` or erroring on the first bad + /// statement) yield `0` if the whole batch fails; streaming parsers + /// (`sqlite3-parser`, `turso_parser`) yield the count parsed before the + /// first error or EOF. Batches are built from already-accepted statements, + /// so a clean run parses all of them; the count is kept for coverage. + #[must_use] + pub fn parse_batch(self, sql: &str, dialect: Dialect) -> Option<usize> { + match self { + Self::Sqlparser => { + Some(Parser::parse_sql(&*sqlparser_dialect(dialect), sql).map_or(0, |v| v.len())) + } + Self::PgQuery => (dialect == Dialect::Postgresql) + .then(|| pg_query::parse(sql).map_or(0, |r| r.protobuf.stmts.len())), + Self::PgQuerySummary => (dialect == Dialect::Postgresql) + .then(|| pg_query::summary(sql, -1).map_or(0, |r| r.statement_types.len())), + Self::Qusql => qusql_dialect(dialect).map(|d| { + let opts = ParseOptions::new() + .dialect(d) + .arguments(qusql_parse::SQLArguments::Dollar); + let mut issues = Issues::new(sql); + let stmts = parse_statements(sql, &mut issues, &opts); + // Resilient parser: report a full count only when error-free. + if issues.get().iter().any(|i| i.level == Level::Error) { + 0 + } else { + stmts.len() + } + }), + Self::Polyglot => { + Some(polyglot_parse(sql, polyglot_dialect(dialect)).map_or(0, |v| v.len())) + } + // Single-statement parser: no batch entry point.
+ Self::Databend => None, + Self::Orql => { + (dialect == Dialect::Oracle).then(|| orql_parser::parse(sql).map_or(0, |v| v.len())) + } + Self::Sqlglot => Some( + sqlglot_rust::parser::parse_statements(sql, sqlglot_dialect(dialect)) + .map_or(0, |v| v.len()), + ), + Self::Sqlite3 => (dialect == Dialect::Sqlite).then(|| { + let mut parser = sqlite3_parser::lexer::sql::Parser::new(sql.as_bytes()); + let mut n = 0; + loop { + match parser.next() { + Ok(Some(_)) => n += 1, + Ok(None) | Err(_) => break n, + } + } + }), + Self::Turso => (dialect == Dialect::Sqlite).then(|| { + let mut parser = turso_parser::parser::Parser::new(sql.as_bytes()); + let mut n = 0; + loop { + match parser.next_cmd() { + Ok(Some(_)) => n += 1, + Ok(None) | Err(_) => break n, + } + } + }), + } + } + /// Parse `sql` while the `membench` allocator is active and return /// `(peak, retained)` bytes: the high-water mark of live allocations during /// the parse, and the bytes still live afterwards (the produced AST plus any @@ -594,6 +671,22 @@ impl BenchParser { } } + /// Like [`Self::measure_mem`], but for a whole multi-statement script: it + /// holds every statement's AST live at once, so the `(peak, retained)` it + /// reports reflects batch parsing (a grown `Vec` of statements, all ASTs + /// retained together). `None` when the parser has no batch entry point + /// ([`Self::can_batch`], so databend), when its memory is invisible to the + /// Rust allocator (the `libpg_query` bindings), or when it does not model + /// `dialect`. Called single-threaded from the `membench` binary; under any + /// other binary the counters are zero and it returns `Some((0, 0))`. + #[must_use] + pub fn measure_mem_batch(self, sql: &str, dialect: Dialect) -> Option<(usize, usize)> { + if !self.can_batch() { + return None; + } + self.measure_mem(sql, dialect) + } + /// Parse and pretty-print, returning `None` if the parser has no printer, does not /// model `dialect`, or fails to parse `sql`. 
#[must_use] @@ -666,6 +759,7 @@ impl BenchParser { } } +pub mod batch; pub mod bench_dist; pub mod datasets; pub mod export; @@ -859,4 +953,79 @@ mod tests { .fidelity("SELECT 1", Dialect::Sqlite) .is_some()); } + + #[test] + fn can_batch_excludes_only_databend() { + // databend-common-ast parses one statement per call; every other parser + // has a multi-statement entry point. + assert!(!BenchParser::Databend.can_batch()); + for p in BenchParser::all() { + if p != BenchParser::Databend { + assert!(p.can_batch(), "{} should support batch parsing", p.name()); + } + } + } + + #[test] + fn parse_batch_counts_a_three_statement_script() { + let script = "SELECT 1; SELECT 2; SELECT 3"; + // Multi-dialect Vec parsers count every statement. + assert_eq!( + BenchParser::Sqlparser.parse_batch(script, Dialect::Postgresql), + Some(3) + ); + // Streaming SQLite parsers count every statement too. + assert_eq!( + BenchParser::Sqlite3.parse_batch(script, Dialect::Sqlite), + Some(3) + ); + assert_eq!( + BenchParser::Turso.parse_batch(script, Dialect::Sqlite), + Some(3) + ); + } + + #[test] + fn parse_batch_is_none_when_unavailable() { + // No batch entry point, even on a dialect databend models. + assert_eq!( + BenchParser::Databend.parse_batch("SELECT 1", Dialect::Postgresql), + None + ); + // Unsupported dialect for a dialect-specific parser. + assert_eq!( + BenchParser::Sqlite3.parse_batch("SELECT 1", Dialect::Postgresql), + None + ); + assert_eq!( + BenchParser::Orql.parse_batch("SELECT 1", Dialect::Postgresql), + None + ); + } + + #[test] + fn measure_mem_batch_gating() { + // The counting allocator only exists in the membench binary, so under + // cargo test these assert the gating (Some vs None), not byte values. + let script = "SELECT 1; SELECT 2"; + // Batch-capable, Rust-visible parser: Some (value is (0, 0) here). + assert!(BenchParser::Sqlparser + .measure_mem_batch(script, Dialect::Postgresql) + .is_some()); + // No batch entry point. 
+ assert_eq!( + BenchParser::Databend.measure_mem_batch(script, Dialect::Postgresql), + None + ); + // Memory invisible to the Rust allocator (parses in C). + assert_eq!( + BenchParser::PgQuery.measure_mem_batch(script, Dialect::Postgresql), + None + ); + // Unsupported dialect for a dialect-specific parser. + assert_eq!( + BenchParser::Sqlite3.measure_mem_batch(script, Dialect::Postgresql), + None + ); + } } diff --git a/viz/src/chart.rs b/viz/src/chart.rs index 9e7328d..51ea41d 100644 --- a/viz/src/chart.rs +++ b/viz/src/chart.rs @@ -377,6 +377,7 @@ mod tests { }, failures: vec![], memory: vec![], + batch: vec![], } } diff --git a/viz/src/lib.rs b/viz/src/lib.rs index c2a0126..663bd8d 100644 --- a/viz/src/lib.rs +++ b/viz/src/lib.rs @@ -14,6 +14,6 @@ pub mod schema; pub use chart::{box_lines, box_svg, ecdf_lines, ecdf_svg, mem_line, Line}; pub use color::{parser_hex, parser_rgb}; pub use schema::{ - Bundle, CoverageFile, CoverageMatrix, DialectData, MemDist, ParserFailures, ParserMem, - ParserMetrics, ParserPerf, + Bundle, CoverageFile, CoverageMatrix, DialectData, MemDist, ParserBatch, ParserFailures, + ParserMem, ParserMetrics, ParserPerf, }; diff --git a/viz/src/schema.rs b/viz/src/schema.rs index e4024ca..978e796 100644 --- a/viz/src/schema.rs +++ b/viz/src/schema.rs @@ -37,6 +37,36 @@ pub struct DialectData { /// Per-parser memory distribution (peak and retained bytes per statement). #[serde(default)] pub memory: Vec<ParserMem>, + /// Per-parser whole-script (batch) results: the cost of parsing the whole + /// accepted set as one script, normalized per statement. + #[serde(default)] + pub batch: Vec<ParserBatch>, +} + +/// Whole-script (batch) parse results for one parser in one dialect. +/// +/// The parser's whole accepted set is concatenated into a single script and +/// parsed in one call; the cost is divided by the statement count.
This +/// complements the per-statement [`ParserPerf`]/[`ParserMem`], exposing the +/// amortization a batch API gains or loses (a grown `Vec` of statements, all +/// ASTs held at once). The values are means (total over count), so they compare +/// to the per-statement `mean`. Fields are `Option` because the batch time and +/// batch memory benches run separately and either may be absent (and the +/// `libpg_query` bindings have batch time but no Rust-visible batch memory). +#[derive(Serialize, Deserialize, Clone, Debug)] +pub struct ParserBatch { + pub parser: String, + /// Statements fed into the batch (the parser's accepted set). + pub n_accepted: usize, + /// Whole-script parse time divided by statement count (ns). + #[serde(default)] + pub ns_per_stmt: Option<f64>, + /// Peak live bytes during the whole-script parse, per statement. + #[serde(default)] + pub peak_per_stmt: Option<f64>, + /// Retained bytes after the whole-script parse, per statement. + #[serde(default)] + pub retained_per_stmt: Option<f64>, } /// Per-statement memory distribution for one parser in one dialect. Bytes, diff --git a/web/src/components.rs b/web/src/components.rs index fe7c389..71ae7e5 100644 --- a/web/src/components.rs +++ b/web/src/components.rs @@ -669,6 +669,8 @@ pub fn ParserView(name: String) -> Element { "fidelity", "median ns", "p90 ns", + "mean ns", + "batch ns/stmt", ] .iter() .map(ToString::to_string) @@ -694,6 +696,8 @@ pub fn ParserView(name: String) -> Element { Cell::pct(m.and_then(|m| m.fidelity_pct)), Cell::ns(p.map(|p| p.median)), Cell::ns(p.map(|p| p.p90)), + Cell::ns(p.map(|p| p.mean)), + Cell::ns(batch_of(d, &parser).and_then(|x| x.ns_per_stmt)), ], }) .collect(); @@ -745,7 +749,7 @@ pub fn ParserView(name: String) -> Element { "Results by dialect" } p { class: "table-cap", - "One row per dialect. \"accept / recall\" is recall where a reference parser exists, otherwise the acceptance rate. \"false pos\" is the share of invalid statements wrongly accepted (lower is better).
\"round-trip\" is the share of accepted statements that re-parse unchanged, \"fidelity\" the share whose printed form matches the original. \"median ns\" and \"p90 ns\" are parse times (lower is faster)." + "One row per dialect. \"accept / recall\" is recall where a reference parser exists, otherwise the acceptance rate. \"false pos\" is the share of invalid statements wrongly accepted (lower is better). \"round-trip\" is the share of accepted statements that re-parse unchanged, \"fidelity\" the share whose printed form matches the original. \"median ns\" and \"p90 ns\" are per-statement parse times (lower is faster), \"mean ns\" the per-statement average, and \"batch ns/stmt\" the whole accepted set parsed as one script divided by its statement count, so compare it to the adjacent mean (blank where not measured or no batch entry point)." } SortTable { caption: format!("Per-dialect results for {}", parser), @@ -784,8 +788,12 @@ fn parser_memory_section(b: &viz::Bundle, parser: &str) -> Element { cells: vec![ Cell::bytes(Some(m.peak.median)), Cell::bytes(Some(m.peak.p90)), + Cell::bytes(Some(m.peak.mean)), + Cell::bytes(batch_of(d, parser).and_then(|x| x.peak_per_stmt)), Cell::bytes(Some(m.retained.median)), Cell::bytes(Some(m.retained.p90)), + Cell::bytes(Some(m.retained.mean)), + Cell::bytes(batch_of(d, parser).and_then(|x| x.retained_per_stmt)), ], }) }) @@ -793,10 +801,19 @@ fn parser_memory_section(b: &viz::Bundle, parser: &str) -> Element { if rows.is_empty() { return rsx! 
{}; } - let columns = ["peak p50", "peak p90", "retained p50", "retained p90"] - .iter() - .map(ToString::to_string) - .collect(); + let columns = [ + "peak p50", + "peak p90", + "peak mean", + "batch peak/stmt", + "retained p50", + "retained p90", + "retained mean", + "batch ret/stmt", + ] + .iter() + .map(ToString::to_string) + .collect(); let peak_lines: Vec = b .dialects .iter() @@ -844,7 +861,7 @@ fn parser_memory_section(b: &viz::Bundle, parser: &str) -> Element { "Memory by dialect" } p { class: "table-cap", - "One row per dialect, bytes per statement. \"peak\" is the high-water mark of live memory during a parse, \"retained\" what the produced AST keeps alive afterwards." + "One row per dialect, bytes per statement. \"peak\" is the high-water mark of live memory during a parse, \"retained\" what the produced AST keeps alive afterwards. \"peak mean\" and \"retained mean\" are the per-statement averages, and \"batch peak/stmt\" and \"batch ret/stmt\" are the same over the whole accepted set parsed as one script divided by its statement count, so compare each batch column to the adjacent mean (blank where not measured or no batch entry point)." } div { class: "charts", {chart_figure(&format!("chart-{}-mempeak-ecdf", slug(parser)), &peak_ecdf, &format!("Empirical CDF of {parser} peak memory, one curve per dialect."), "Peak live memory per parse, one curve per dialect. Further left is leaner (log scale).", &format!("{}-peak-memory-ecdf", slug(parser)))} @@ -1473,11 +1490,23 @@ fn missed_val(d: &DialectData, p: &ParserPerf) -> Option { // ---- tables ---- +/// The batch (whole-script) result for one parser in a dialect, if measured. 
+fn batch_of<'a>(d: &'a DialectData, parser: &str) -> Option<&'a viz::ParserBatch> { + d.batch.iter().find(|x| x.parser == parser) +} + fn perf_table(d: &DialectData) -> Element { - let columns = ["median ns", "p90 ns", "missed %", "RT %"] - .iter() - .map(ToString::to_string) - .collect(); + let columns = [ + "median ns", + "p90 ns", + "mean ns", + "batch ns/stmt", + "missed %", + "RT %", + ] + .iter() + .map(ToString::to_string) + .collect(); let rows = d .perf .iter() @@ -1487,6 +1516,8 @@ fn perf_table(d: &DialectData) -> Element { cells: vec![ Cell::ns(Some(p.median)), Cell::ns(Some(p.p90)), + Cell::ns(Some(p.mean)), + Cell::ns(batch_of(d, &p.parser).and_then(|x| x.ns_per_stmt)), Cell::with(missed_pct(d, p), missed_val(d, p)), Cell::pct(p.roundtrip_pct), ], @@ -1499,7 +1530,7 @@ fn perf_table(d: &DialectData) -> Element { "Speed" } p { class: "table-cap", - "One row per parser. \"median ns\" and \"p90 ns\" are per-statement parse times in nanoseconds (lower is faster). \"missed %\" is the share of expected statements not accepted, \"RT %\" the round-trip rate, the share of accepted statements that re-parse unchanged." + "One row per parser. \"median ns\" and \"p90 ns\" are per-statement parse times in nanoseconds (lower is faster). \"mean ns\" is the per-statement average, and \"batch ns/stmt\" is the whole accepted set parsed as one script divided by its statement count, so comparing those two adjacent averages shows what bulk parsing saves or costs (batch blank where not measured or no batch entry point). \"missed %\" is the share of expected statements not accepted, \"RT %\" the round-trip rate, the share of accepted statements that re-parse unchanged." } SortTable { caption: format!("Per-parser parse time in nanoseconds for {}", d.display_name), @@ -1516,10 +1547,19 @@ fn memory_table(d: &DialectData) -> Element { if d.memory.is_empty() { return rsx! 
{}; } - let columns = ["peak p50", "peak p90", "retained p50", "retained p90"] - .iter() - .map(ToString::to_string) - .collect(); + let columns = [ + "peak p50", + "peak p90", + "peak mean", + "batch peak/stmt", + "retained p50", + "retained p90", + "retained mean", + "batch ret/stmt", + ] + .iter() + .map(ToString::to_string) + .collect(); // Order parsers the same way as every other plot/table on the page (the // speed order in `d.perf`), so a parser sits in the same legend slot // everywhere. @@ -1535,8 +1575,12 @@ fn memory_table(d: &DialectData) -> Element { cells: vec![ Cell::bytes(Some(m.peak.median)), Cell::bytes(Some(m.peak.p90)), + Cell::bytes(Some(m.peak.mean)), + Cell::bytes(batch_of(d, &m.parser).and_then(|x| x.peak_per_stmt)), Cell::bytes(Some(m.retained.median)), Cell::bytes(Some(m.retained.p90)), + Cell::bytes(Some(m.retained.mean)), + Cell::bytes(batch_of(d, &m.parser).and_then(|x| x.retained_per_stmt)), ], }) .collect(); @@ -1561,7 +1605,7 @@ fn memory_table(d: &DialectData) -> Element { "Memory" } p { class: "table-cap", - "Bytes per statement, measured with a counting allocator. \"peak\" is the high-water mark of live memory during the parse, \"retained\" what the produced AST keeps alive afterwards. The libpg_query bindings are omitted (they parse in C, invisible to the Rust allocator)." + "Bytes per statement, measured with a counting allocator. \"peak\" is the high-water mark of live memory during the parse, \"retained\" what the produced AST keeps alive afterwards. \"peak mean\" and \"retained mean\" are the per-statement averages, and \"batch peak/stmt\" and \"batch ret/stmt\" are the same over the whole accepted set parsed as one script divided by its statement count, so compare each batch column to the adjacent mean (batch retained is higher when every statement's AST is held at once; blank where not measured or no batch entry point). The libpg_query bindings are omitted (they parse in C, invisible to the Rust allocator)." 
} div { class: "charts", {chart_figure(&format!("chart-{}-mempeak-ecdf", d.dir_name), &peak_ecdf, &format!("Empirical CDF of peak memory for {}, one curve per parser.", d.display_name), "Peak live memory per parse, one curve per parser. Further left is leaner (log scale).", &format!("{}-peak-memory-ecdf", d.dir_name))}