BDMS-221-225: Polymorphic DataProvenance model #245

Merged
jacob-a-brown merged 7 commits into bdms-221 from kas-bdms-221-225-core-well-info-models-schemas on Nov 17, 2025

Conversation

@ksmuczynski

Contributor

Why

This PR addresses the following problem / context:

  • Centralizes provenance metadata for foundational/static records to improve data traceability.
  • Enables models to associate provenance information efficiently using reusable mixins.
  • Refines location and thing models to support provenance tracking and standardized metadata.
  • The purpose of this PR is to make these intermediate updates available to @jacob-a-brown as we continue to co-work on the main BDMS-221 task.

How

Implementation summary - the following was changed / added / removed:

  • Added new DataProvenance model in db/data_provenance.py to centralize provenance metadata for foundational/static records.
  • Introduced DataProvenanceMixin in db/base.py for reusable polymorphic relationships, enabling models to associate provenance efficiently.
  • Added DataProvenanceMixin to the Location and Thing models to support provenance tracking for location and thing records.
  • Enhanced field definitions, relationships, and documentation in all affected files for clarity and maintainability.

Notes

Any special considerations, workarounds, or follow-up work to note?

  • Lexicon categories and terms for provenance fields need to be added to cover all relevant controlled vocabularies.
  • Further integration with other models may be required as provenance tracking is extended.

The current schema lacks a way to store and track provenance (origin) data across the database.

Created db/data_provenance.py with a polymorphic DataProvenance model for tracking foundational metadata across tables.
Added mixin DataProvenanceMixin to db/base.py for reusable polymorphic relationships.
Improved documentation and comments in db/base.py for mixins and helper functions.
Introduced DataProvenanceMixin to the `Thing` and `Location` models to enable reusable, efficient, polymorphic relationships to the DataProvenance table.
Comment thread db/data_provenance.py
Comment thread db/base.py Outdated
Comment thread db/data_provenance.py
Comment on lines +62 to +74
collection_method: Mapped[str] = lexicon_term(
    nullable=True,
    comment="Indicates the method used to collect the data (e.g., 'GPS - Survey Grade').",
)
# TODO: Values from the following NMAquifer tables should be included as terms in the lexicon: 'LU_CoordinateAccuracy'.
accuracy_value: Mapped[float] = mapped_column(
    nullable=True, comment="A numeric value representing the data's accuracy."
)
# TODO: Values from the following NMAquifer tables should be included as terms in the lexicon: 'LU_CoordinateAccuracy'.
accuracy_unit: Mapped[str] = lexicon_term(
    nullable=True,
    comment="The unit for the `accuracy_value` (e.g., 'meters', 'feet').",
)

Contributor

These fields are used for only a small subset of fields in a subset of tables. Should they move to those tables rather than live here, since they won't apply to many of the fields for which DataProvenance will be used?

Contributor Author

Maybe, but I saw this as one of the advantages of the Provenance model: all the sparse, optional, and evolving metadata is organized in one central place. If we move these fields to the Location table, we'd have to add even more fields (coordinate_accuracy, coordinate_collection_method, coordinate_accuracy_value, coordinate_accuracy_unit, plus the same ones for elevation). I think storing this type of metadata is more efficient with the DataProvenance table, but I'll let @jirhiker weigh in, too.

Member

It seems appropriate to store these fields here. We can reevaluate later if user requirements dictate.
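
To make the trade-off in this thread concrete, the following is a small sketch using Python's built-in `sqlite3`. It assumes a `field_name` column and a `target_id` column on the provenance table, neither of which is confirmed by the excerpt above, and the inserted values are illustrative only. The point is that metadata applying to only some fields becomes extra rows with nullable columns, rather than a set of `coordinate_*` and `elevation_*` columns on `location`.

```python
import sqlite3


def build_demo() -> sqlite3.Connection:
    """Create an in-memory database with one location and two provenance rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE location (id INTEGER PRIMARY KEY, elevation REAL);
        CREATE TABLE data_provenance (
            id INTEGER PRIMARY KEY,
            target_table TEXT NOT NULL,
            target_id INTEGER NOT NULL,   -- assumed column name
            field_name TEXT NOT NULL,     -- hypothetical column
            collection_method TEXT,       -- sparse: NULL when not applicable
            accuracy_value REAL,
            accuracy_unit TEXT
        );
        """
    )
    conn.execute("INSERT INTO location (id, elevation) VALUES (1, 1520.0)")
    conn.executemany(
        "INSERT INTO data_provenance "
        "(target_table, target_id, field_name, collection_method, "
        "accuracy_value, accuracy_unit) VALUES (?, ?, ?, ?, ?, ?)",
        [
            # Illustrative values only, not data from the project.
            ("location", 1, "coordinates", "GPS - Survey Grade", 0.5, "meters"),
            ("location", 1, "elevation", None, None, None),
        ],
    )
    return conn


def provenance_for(conn: sqlite3.Connection, table: str, row_id: int) -> dict:
    """Return {field_name: (collection_method, accuracy_value, accuracy_unit)}."""
    rows = conn.execute(
        "SELECT field_name, collection_method, accuracy_value, accuracy_unit "
        "FROM data_provenance WHERE target_table = ? AND target_id = ? "
        "ORDER BY field_name",
        (table, row_id),
    ).fetchall()
    return {row[0]: row[1:] for row in rows}
```

Adding accuracy metadata for another field in this layout means inserting a row, not altering the `location` schema.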

ksmuczynski and others added 5 commits November 13, 2025 09:27
The database tables are snake_case, so for consistency and ease of debugging, the `target_table` values should also use snake_case.

Refined the _thing_target and _location_target relationships to ensure DataProvenance.target_table uses snake_case ('thing', 'location') for the target table name.
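
The commit above sets the `target_table` values on the relationships themselves. If those values were instead derived from model class names, a helper along these lines could keep them in snake_case; `to_snake_case` is a hypothetical name and is not part of this PR.

```python
import re


def to_snake_case(name: str) -> str:
    """Convert a CamelCase class name to snake_case."""
    # Insert an underscore wherever a lowercase letter or digit is
    # followed by an uppercase letter, then lowercase the result.
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()
```

Under this rule `Thing` maps to `thing`, `Location` to `location`, and `DataProvenance` to `data_provenance`.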
… for class-level usage

- Relocated DataProvenanceMixin from base.py to data_provenance.py for better modularity and provenance management.
- Refactored mixin to use cls in @declared_attr for proper class-level relationship definition.
…nformation.

- Added new `origin_source` and `collection_method` categories and terms.
- Added 'meters' as a term associated with the `unit` category.
- Added `OriginStatus` to `enums.py`.
…nformation.

- Added new `origin_source` and `collection_method` categories and terms.
- Added 'meters' as a term associated with the `unit` category.
- Added `OriginStatus` to `enums.py`.
@jacob-a-brown jacob-a-brown merged commit 66a843b into bdms-221 Nov 17, 2025
4 checks passed
@jacob-a-brown jacob-a-brown deleted the kas-bdms-221-225-core-well-info-models-schemas branch November 17, 2025 18:00
@jacob-a-brown jacob-a-brown restored the kas-bdms-221-225-core-well-info-models-schemas branch November 17, 2025 18:01
@ksmuczynski ksmuczynski deleted the kas-bdms-221-225-core-well-info-models-schemas branch December 10, 2025 18:24
3 participants