
feat(sample transform): add optional discarded_event_tags callback #25477

Draft
p-parekh wants to merge 1 commit into vectordotdev:master from p-parekh:sample-discarded-event-tags


@p-parekh

Summary

Adds an optional with_discarded_event_tags builder on the Sample transform's runtime type. When set, the callback is invoked at the drop site and its returned (key, value) pairs are merged into the component_discarded_events_total counter labels. The field defaults to None, so behavior is unchanged for every existing caller.
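
As a rough illustration of the shape of this API, here is a minimal, std-only sketch. Everything except the names with_discarded_event_tags and the intentional label is an assumption made for the example: the Event struct, the DiscardedEventTags alias, the simplified Sample::new(rate) constructor, and the discard_labels helper are hypothetical stand-ins, not Vector's actual types or signatures.

```rust
use std::sync::Arc;

// Hypothetical stand-in for Vector's `Event`; the real type is far richer.
#[derive(Debug, Clone)]
pub struct Event {
    pub group: String,
}

// Assumed callback shape: borrows the dropped event and returns label pairs.
pub type DiscardedEventTags = Arc<dyn Fn(&Event) -> Vec<(String, String)> + Send + Sync>;

pub struct Sample {
    rate: u64,
    discarded_event_tags: Option<DiscardedEventTags>,
}

impl Sample {
    // Simplified constructor for the sketch; the callback starts out as `None`.
    pub fn new(rate: u64) -> Self {
        Self { rate, discarded_event_tags: None }
    }

    pub fn rate(&self) -> u64 {
        self.rate
    }

    // Opt-in builder: callers that never call this keep the `None` default.
    pub fn with_discarded_event_tags<F>(mut self, f: F) -> Self
    where
        F: Fn(&Event) -> Vec<(String, String)> + Send + Sync + 'static,
    {
        self.discarded_event_tags = Some(Arc::new(f));
        self
    }

    // Hypothetical helper: labels for one dropped event, i.e. the fixed
    // `intentional` label plus whatever the callback returns (if set).
    pub fn discard_labels(&self, event: &Event) -> Vec<(String, String)> {
        let mut labels = vec![("intentional".to_string(), "true".to_string())];
        if let Some(tags) = &self.discarded_event_tags {
            labels.extend(tags(event));
        }
        labels
    }
}
```

A library consumer would opt in with something like `Sample::new(100).with_discarded_event_tags(|e: &Event| vec![("group".to_string(), e.group.clone())])`; code that never calls the builder gets only the fixed label.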

Motivation

Downstream wrappers (concretely, the Observability Pipelines Worker — a Datadog-internal consumer that uses Vector as a library) configure Sample with a group_by template but cannot currently attach per-event group-by tags to the discard metric. The discard counter today carries only intentional:true, so when a customer pipeline drops events the operator has to grep logs to know which group caused the drops.

Maintaining a wrapping fork and duplicating Sample wholesale were both rejected; this small extension point lets library consumers attach the tags they own without Vector taking a position on what the tags mean.

Vector configuration

No new TOML/YAML config. The callback is a programmatic API used by library consumers, not the Vector binary. Existing pipelines using sample see no behavior change.

How did you test this PR?

  • Added discarded_event_tags_callback_is_invoked_on_drop in src/transforms/sample/tests.rs that:
    • constructs a Sample with an aggressive rate=100
    • attaches a callback that increments an atomic counter and returns a fixed tag pair
    • drives 200 random events through Sample::transform
    • asserts the callback was invoked at least 100 times (most events drop)
  • Confirmed existing tests pass unchanged (the discarded_event_tags field is None in every existing test path).
  • Manually verified the existing SampleEventDiscarded emission path is taken when the callback is None, so the ComponentEventsDropped<INTENTIONAL> plumbing is unaffected for current users.
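
The test described above can be sketched roughly as follows. This is a self-contained approximation, not the code in src/transforms/sample/tests.rs: the Event and Sample types here are hypothetical stand-ins, and the sampler is assumed to be a plain counter that forwards one event out of every rate, which makes the drop count deterministic (the real test uses random events and therefore asserts only a lower bound).

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

// Hypothetical stand-in for Vector's `Event`.
pub struct Event(pub u64);

type TagsFn = Box<dyn Fn(&Event) -> Vec<(String, String)> + Send + Sync>;

// Simplified sampler: forwards one event out of every `rate`, drops the rest.
// `rate` must be non-zero.
pub struct Sample {
    rate: u64,
    count: u64,
    discarded_event_tags: Option<TagsFn>,
}

impl Sample {
    pub fn new(rate: u64) -> Self {
        Self { rate, count: 0, discarded_event_tags: None }
    }

    pub fn with_discarded_event_tags(mut self, f: TagsFn) -> Self {
        self.discarded_event_tags = Some(f);
        self
    }

    // Returns `Some(event)` when the event is forwarded, `None` when dropped.
    pub fn transform(&mut self, event: Event) -> Option<Event> {
        let keep = self.count % self.rate == 0;
        self.count += 1;
        if keep {
            return Some(event);
        }
        if let Some(tags) = &self.discarded_event_tags {
            // A real drop site would attach these labels to the discard counter.
            let _labels = tags(&event);
        }
        None
    }
}

// Mirrors the described test: rate=100, 200 events, count callback invocations.
// Returns (forwarded events, callback invocations).
pub fn run_drop_test() -> (usize, usize) {
    let calls = Arc::new(AtomicUsize::new(0));
    let seen = Arc::clone(&calls);
    let mut sample = Sample::new(100).with_discarded_event_tags(Box::new(move |_e: &Event| {
        seen.fetch_add(1, Ordering::SeqCst);
        vec![("group".to_string(), "a".to_string())]
    }));
    let forwarded = (0..200).filter_map(|i| sample.transform(Event(i))).count();
    (forwarded, calls.load(Ordering::SeqCst))
}
```

Asserting on a shared atomic counter keeps the check independent of any metrics recorder: it only verifies that the callback runs once per dropped event.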

Change Type

  • Bug fix
  • New feature
  • Dependencies
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

The new field defaults to None. Sample::new and Sample::new_with_dynamic signatures are unchanged. The only addition is the with_discarded_event_tags builder method.

Does this PR include user facing changes?

  • Yes. Added changelog.d/<id>_sample_discarded_event_tags.enhancement.md.

Library consumers gain a new opt-in API; pipeline operators see no change unless a downstream consumer of Vector wires the callback through.

Notes

  • The metric continues to be component_discarded_events_total; only the label set varies based on the (optional) callback's return value.
  • When the callback is set, the drop site emits the counter directly (via metrics::counter!) and logs a debug message shaped like the one ComponentEventsDropped<INTENTIONAL>::emit produces. This is necessary because ComponentEventsDropped registers its counter with a fixed label set at registration time, so dynamic per-event labels cannot flow through it.
  • The bypass is only active when the callback is Some. With None, the existing emit!(SampleEventDiscarded) path runs unchanged.
  • The callback signature takes &Event (not &LogEvent) so trace-event drops also benefit if a future consumer wants per-trace tags. Today the existing trace-event handling in Sample::transform calls emit!(SampleEventDiscarded) via the same branch, so it gains tag support automatically when the callback is set.
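
To make the branching in these notes concrete, here is a small std-only sketch of a drop site that picks its label set per event. It deliberately does not use the metrics crate: Recorder is a hypothetical in-memory stand-in for a metrics backend, and Event, Labels, and on_drop are names invented for the example. Only the metric name and the intentional label come from the description above.

```rust
use std::collections::BTreeMap;

pub type Labels = Vec<(String, String)>;

// Hypothetical stand-in for Vector's `Event`.
pub struct Event {
    pub group: String,
}

// Hypothetical stand-in for a metrics recorder: one count per (name, label set).
#[derive(Default)]
pub struct Recorder {
    pub counters: BTreeMap<(String, Labels), u64>,
}

impl Recorder {
    pub fn increment(&mut self, name: &str, mut labels: Labels) {
        // Sort so that label order does not create distinct series.
        labels.sort();
        *self.counters.entry((name.to_string(), labels)).or_insert(0) += 1;
    }
}

pub const DISCARDED: &str = "component_discarded_events_total";

// Sketch of the drop site: the metric name never changes, only the labels do.
pub fn on_drop(
    recorder: &mut Recorder,
    tags: Option<&dyn Fn(&Event) -> Labels>,
    event: &Event,
) {
    let mut labels = vec![("intentional".to_string(), "true".to_string())];
    match tags {
        // Callback set: attach per-event labels, bypassing any fixed-label handle.
        Some(f) => labels.extend(f(event)),
        // No callback: stands in for the unchanged `emit!(SampleEventDiscarded)` path.
        None => {}
    }
    recorder.increment(DISCARDED, labels);
}
```

In this sketch each distinct label set becomes its own counter series, which is the property a handle registered once with fixed labels cannot provide. It also means the callback's return values determine series cardinality.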

Adds Sample::with_discarded_event_tags(f) — an optional callback whose
returned (key, value) pairs are merged into the
component_discarded_events_total counter labels at the drop site. Default
None; behavior unchanged for every existing caller.

Library consumers (e.g., downstream wrappers that pre-group events) can
use this to attach per-event tags to the discard metric without forking
the transform. When the callback is set, the drop site bypasses the
standard ComponentEventsDropped<INTENTIONAL> path because that type
registers its counter with a fixed label set at registration time —
dynamic per-event labels cannot flow through it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions (bot) added the "domain: transforms" label (Anything related to Vector's transform components) on May 20, 2026
@github-actions
Contributor

Thank you for your contribution! Before we can merge this PR, please sign our Contributor License Agreement.

To sign, copy and post the phrase below as a new comment on this PR.

Note: If the bot says your username was not found, the email used in your git commit may not be linked to your GitHub account. Fix this at github.com/settings/emails, then comment recheck to retry.


I have read the CLA Document and I hereby sign the CLA


You can retrigger this bot by commenting recheck in this Pull Request. Posted by the CLA Assistant Lite bot.

