feat(pii): add redaction timing metrics across sidecar and persist path#5264
Conversation
- Log per-request duration in the Presidio sidecar (/analyze, /anonymize) - Add durationMs to the mask-batch endpoint log line - Emit per-execution PII redaction timing (stringCount, totalBytes, durationMs, scrubbed)
|
The latest updates on your projects. Learn more about Vercel for GitHub. |
|
@greptile review |
PR SummaryLow Risk Overview The Presidio service ( The internal Redaction behavior and fail-safe semantics are unchanged. Reviewed by Cursor Bugbot for commit 4b2f1f2. Bugbot is set up for automated code reviews on this repo. Configure here. |
Greptile SummaryThis PR adds timing logs for PII redaction. The main changes are:
Confidence Score: 4/5The changed flow looks mergeable after tightening the new diagnostics.
apps/sim/lib/logs/execution/pii-redaction.ts, apps/sim/app/api/guardrails/mask-batch/route.ts, apps/pii/server.py Important Files Changed
|
| @@ -151,12 +152,14 @@ export async function redactPIIFromExecution( | |||
| if (collected.length === 0) return payload | |||
There was a problem hiding this comment.
No-String Executions Skip Metrics
When an execution payload has no collected strings, this early return skips the new PII redaction completed log. Those valid no-op executions become indistinguishable from cases where the redaction path did not run or logging failed, so per-execution dashboards undercount the persist path.
| if (collected.length === 0) return payload | |
| if (collected.length === 0) { | |
| logger.info('PII redaction completed', { | |
| stringCount: 0, | |
| totalBytes, | |
| durationMs: Math.round(performance.now() - startedAt), | |
| scrubbed: false, | |
| }) | |
| return payload | |
| } |
| results = analyzer.analyze( | ||
| text=req.text, | ||
| language=req.language, | ||
| entities=req.entities or None, | ||
| score_threshold=req.score_threshold, |
There was a problem hiding this comment.
Greptile SummaryThis PR adds timing metrics for PII redaction paths. The main changes are:
Confidence Score: 4/5The redaction behavior looks unchanged, but the new production metrics can be missing on the paths they are meant to illuminate.
apps/pii/server.py, apps/sim/lib/logs/execution/pii-redaction.ts, apps/sim/app/api/guardrails/mask-batch/route.ts Important Files Changed
|
| anonymizer = AnonymizerEngine() | ||
|
|
||
| # Propagates to uvicorn's root handler, so timing lands in the container log stream. | ||
| logger = logging.getLogger("sim.pii") |
There was a problem hiding this comment.
Under the sidecar's default uvicorn server:app startup, uvicorn configures its own named loggers rather than a root handler for sim.pii. The new /analyze and /anonymize logger.info calls can be dropped silently, so the timing logs described by the PR never reach the container log stream.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
| logger.info('PII redaction completed', { | ||
| stringCount: collected.length, | ||
| totalBytes, | ||
| durationMs: Math.round(performance.now() - startedAt), | ||
| scrubbed, | ||
| }) |
There was a problem hiding this comment.
Production Suppresses Completion Logs
The app logger's production default suppresses info messages unless the deployment lowers LOG_LEVEL. In that common production state, successful redactions never emit PII redaction completed, so the per-execution timing metric added here is not queryable after deployment.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Summary
apps/pii/server.py) now logs per-request duration for/analyzeand/anonymize— previously it only emitted uvicorn200 OKaccess lines with no timing, so the redact path had no observable latency anywheremask-batchendpoint log line now carriesdurationMs(per-chunk wall time incl. the 8-way concurrent analyze+anonymize fan-out)redactPIIFromExecutionemits a per-executionPII redaction completedlog:stringCount,totalBytes,durationMs, and ascrubbedflag distinguishing real masking from the fail-safe pathType of Change
Testing
Tested manually:
python3 -m py_compile apps/pii/server.pypasses;bun run lintandbun run check:api-validation:strictboth pass. No behavioral change to redaction — timing instrumentation only.Queryable once deployed: filter the
/ecs/sim-*/presidiolog group forduration_ms(model time), and the app log group forPII redaction completed(per-execution total).Checklist