M7b: $match + $replace regex builtins #25
Merged
…NOTHING keys Shares the regex-value apply helpers (third use, for $match/$replace) and stops a match object's function-valued `next` field from producing invalid JSON in $string. Adds D3010/D3011/D3012/D3040/T1010 error messages. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… walker Reshapes the match closure's start->index, iterates global matches, unwraps a single result (jsonata sequence semantics), D3040 on negative limit. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…limits String pattern does a literal replace (replacement verbatim); regex pattern iterates matches with a ported jsonata $-scanner ($$ literal, $0 whole, $N group via maxDigits) or a function replacer (D3012 if it returns non-string). D3010 empty pattern, D3011 negative limit. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…essions Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…fidelity) Adversarial review found lrexlib yields `false` for an unmatched optional group; jsonata uses null. We crashed ($replace string replacer concat on a boolean) and mis-rendered ($match groups had false). Normalize false->V.NULL at the regex.lua boundary, and the $N substituter only emits actual string captures (null -> empty). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…case016 (39/39) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
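The `false` → null normalization described in the fidelity commit above can be sketched as follows. This is a minimal illustration, not the code from the PR: `NULL` is a hypothetical sentinel standing in for the project's `V.NULL`, and `normalize_captures` and `capture_text` are hypothetical helper names. The capture list is hand-written rather than produced by lrexlib.

```lua
-- Hypothetical sentinel standing in for the project's V.NULL value.
local NULL = setmetatable({}, { __tostring = function() return "null" end })

-- Convert a raw capture list, in which an unmatched optional group is
-- `false`, into one that uses NULL instead. The group count `n` is passed
-- explicitly so the loop does not depend on the length operator.
local function normalize_captures(raw, n)
  local groups = {}
  for i = 1, n do
    local c = raw[i]
    if c == false or c == nil then
      groups[i] = NULL
    else
      groups[i] = c
    end
  end
  return groups
end

-- Text contributed by a $N reference: only real string captures are
-- emitted, so a NULL group contributes the empty string.
local function capture_text(groups, index)
  local c = groups[index]
  if type(c) == "string" then
    return c
  end
  return ""
end

-- Hand-written capture list: the second (optional) group did not participate.
local raw = { "a", false, "c" }
local groups = normalize_captures(raw, 3)
print(capture_text(groups, 1) .. capture_text(groups, 2) .. capture_text(groups, 3))
-- prints ac
```

Without the normalization, passing the raw list to `table.concat` raises an error because `false` is neither a string nor a number, which matches the crash the commit describes.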
Summary
Implements the two remaining regex builtins on M7a's foundation, taking the regex group fully green.
- `2b6ef89`: hoist `is_regex` + the lazy `apply` into shared `H` (third use); `H.serialize` now skips function-valued (and NOTHING) object keys so a raw match object serializes to valid JSON (no `…"next":}`); adds D3010/D3011/D3012/D3040/T1010.
- `$match` (`7d42bc6`) `<s-f<s:o>n?:a<o>>`: applies the regex value, walks global matches via the closure's `.next`, reshapes each to `{match, index, groups}` (jsonata's `start` → `index`), singleton-unwraps a single result, D3040 on negative limit.
- `$replace` (`e254c4b`) `<s-(sf)(sf)n?:s>`: string pattern → literal replace (replacement verbatim); regex pattern → a ported jsonata `$`-scanner (`$$` → `$`, `$0` → whole, `$N` → group via the maxDigits rule) or a function replacer (D3012 if it returns non-string); D3010 empty pattern, D3011 negative limit.
- `63a9efe`: adversarial review caught a real crash: `lrexlib` returns Lua `false` for a non-participating optional capture group where jsonata uses `null`, which made `$replace`'s string replacer call `table.concat` on a boolean and put `false` in `$match` groups. One root-cause fix (normalize `false` → `null` at the `regex.lua` boundary; the `$N` substituter only emits actual string captures) closes both symptoms and `regex/case016`.

Results
- `regex` group 10 → 39/39 (fully green), function-replace value cases +5, matchers +1.
- `$contains`/`$split` and the HOFs (apply-hoist call path) byte-identical.
- `$match` shapes/limits, the full `$N` scanner incl. maxDigits + alternation/non-participating edges, function replacers, string-vs-regex patterns, errors, and the serialize fix: all oracle-faithful.

Deferred (non-suite)
`$match(…)[0].groups` singleton-unwrap (`"b"` vs `["b"]`): affects zero suite cases (a probe-only edge); the holistic cons-array navigation fix remains a separate follow-up. Plus minor cleanups (`H.is_callable` hoist, module-level `parse_int`, `$split` O(n²) re-slice).

Test plan
- `busted spec/`: 528/0
- `busted spec/jsonata_suite_spec.lua`: zero-regression guard green
- `bash scripts/run-suite.sh`: 1296/1682, regex 39/39
- `$N` maxDigits + non-participating-group edges

🤖 Generated with Claude Code
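For reviewers, the `$match` walk described in the Summary (follow the closure's `.next`, rename `start` to `index`, unwrap a single result, reject a negative limit) can be sketched roughly as below. This is an illustration under stated assumptions, not the PR's implementation: `make_matcher` and `match_all` are hypothetical names, the match-object shape `{match, start, groups, next}` is assumed from the description, Lua patterns stand in for lrexlib so the sketch is self-contained, and the error is raised as a plain table carrying the code.

```lua
-- Hypothetical stand-in for a regex value: returns a function that yields a
-- match object {match, start, groups, next}, or nil when nothing matches.
local function make_matcher(pattern)
  local function match_from(str, init)
    local found = { string.find(str, pattern, init) }
    local s, e = found[1], found[2]
    -- Stop on no match; also stop on an empty match so the walk terminates.
    if not s or e < s then
      return nil
    end
    local groups = {}
    for i = 3, #found do
      groups[#groups + 1] = found[i]
    end
    return {
      match = str:sub(s, e),
      start = s - 1, -- zero-based offset, as in jsonata
      groups = groups,
      next = function() return match_from(str, e + 1) end,
    }
  end
  return function(str) return match_from(str, 1) end
end

-- Walk every match, reshape start -> index, and unwrap a lone result.
local function match_all(matcher, str, limit)
  if limit ~= nil and limit < 0 then
    error({ code = "D3040" }) -- negative limit
  end
  local results = {}
  local m = matcher(str)
  while m and (limit == nil or #results < limit) do
    results[#results + 1] = { match = m.match, index = m.start, groups = m.groups }
    m = m.next()
  end
  if #results == 1 then
    return results[1] -- singleton sequence collapses to its only item
  end
  return results
end

local matcher = make_matcher("a(b+)")
for _, r in ipairs(match_all(matcher, "ababbabbcc")) do
  print(r.match, r.index, r.groups[1])
end
```

With a limit of 1 the same call returns the single match object itself rather than a one-element list, which is the singleton-unwrap behaviour the Summary mentions.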