BDMS-221-225: Polymorphic DataProvenance model #245
Merged: jacob-a-brown merged 7 commits into `bdms-221` from `kas-bdms-221-225-core-well-info-models-schemas` on Nov 17, 2025
57bd631 feat: add DataProvenance model and enhance base mixins (ksmuczynski)
0e601fd feat: add DataProvenanceMixin for polymorphic provenance tracking (ksmuczynski)
f2184d2 refactor: refine polymorphic parent relationships. (ksmuczynski)
27b7c82 refactor: move DataProvenanceMixin to data_provenance.py and refactor… (ksmuczynski)
73d3a48 refactor: Update lexicon and `enums.py` with DataProvenance related i… (ksmuczynski)
781d3f4 refactor: Update lexicon and `enums.py` with DataProvenance related i… (ksmuczynski)
a1614f6 Merge branch 'bdms-221' into kas-bdms-221-225-core-well-info-models-s… (jacob-a-brown)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,123 @@ | ||
| """ | ||
| SQLAlchemy model for the DataProvenance table. | ||
|
|
||
| This is the central polymorphic repository for all provenance (origin) metadata | ||
| for foundational or static data in the database, such as elevation details or | ||
| well construction information. | ||
|
|
||
| ***NOTE:*** | ||
| This table is **not** used to store routine, transactional analytical metadata | ||
| (such as lab qualifiers, detection limits, or analysis dates). That information | ||
| is an intrinsic part of a lab result and is stored in the `Observation` and | ||
| `LabLimit` tables. This table is for sourcing foundational data, such as a well's | ||
| construction details or a site's coordinates. | ||
|
|
||
| """ | ||
|
|
||
| from typing import TYPE_CHECKING | ||
|
|
||
| from sqlalchemy import Integer, Index, and_ | ||
| from sqlalchemy.orm import relationship, Mapped, mapped_column, declared_attr, foreign | ||
|
|
||
| from db.base import Base, AutoBaseMixin, ReleaseMixin, pascal_to_snake | ||
|
|
||
| from db import lexicon_term | ||
|
|
||
| if TYPE_CHECKING: | ||
| from db.thing import Thing | ||
| from db.location import Location | ||
|
|
||
|
|
||
| class DataProvenance(AutoBaseMixin, ReleaseMixin, Base): | ||
| """ | ||
| Represents a single piece of provenance metadata that can be attached to | ||
| any other record or field in the database. | ||
| """ | ||
|
|
||
| # --- Polymorphic Columns --- | ||
| target_id: Mapped[int] = mapped_column( | ||
| Integer, | ||
| nullable=False, | ||
| comment="The primary key (`id`) of the parent record this metadata is about (e.g., the `thing_id` of a well).", | ||
| ) | ||
| target_table: Mapped[str] = mapped_column( | ||
| nullable=False, | ||
| comment="The name of the parent table this metadata is for (e.g., 'Thing', 'Location', etc).", | ||
| ) | ||
|
|
||
| # --- Columns --- | ||
| field_name: Mapped[str] = mapped_column( | ||
| nullable=True, | ||
| comment="The specific column in the parent table that this metadata applies to (e.g., 'well_depth_ft', 'coordinates'). " | ||
| "If `NULL`, the record applies to the entire parent object.", | ||
| ) | ||
| # Values from the following NMAquifer tables are included as `origin_source` terms in the lexicon: | ||
| # 'LU_DataSource', 'LU_Depth_CompletionSource'. | ||
| origin_source: Mapped[str] = lexicon_term( | ||
| nullable=True, | ||
| comment="Indicates the origin source of the data (e.g., 'Driller's Log', 'Well Report').", | ||
| ) | ||
| # Values from the following NMAquifer tables are included as `collection_method` terms in the lexicon: | ||
| # 'LU_AltitudeMethod','LU_CoordinateMethod'. | ||
| collection_method: Mapped[str] = lexicon_term( | ||
| nullable=True, | ||
| comment="Indicates the method used to collect the data (e.g., 'GPS - Survey Grade').", | ||
| ) | ||
| accuracy_value: Mapped[float] = mapped_column( | ||
| nullable=True, comment="A numeric value representing the data's accuracy." | ||
| ) | ||
| # Unit values from the following NMAquifer tables are included as 'unit' terms in the lexicon: 'LU_CoordinateAccuracy'. | ||
| accuracy_unit: Mapped[str] = lexicon_term( | ||
| nullable=True, | ||
| comment="The unit for the `accuracy_value` (e.g., 'meters', 'feet').", | ||
| ) | ||
|
|
||
| # --- Polymorphic Parent Relationships (Internal) --- | ||
| # These are view-only relationships used by the 'target' property below. | ||
| # They tell SQLAlchemy exactly how to join `DataProvenance` to the parent/target table. | ||
| _thing_target: Mapped["Thing"] = relationship( | ||
| "Thing", | ||
| primaryjoin="and_(foreign(DataProvenance.target_id) == Thing.id, DataProvenance.target_table == 'thing')", | ||
| viewonly=True, | ||
| ) | ||
| _location_target: Mapped["Location"] = relationship( | ||
| "Location", | ||
| primaryjoin="and_(foreign(DataProvenance.target_id) == Location.id, DataProvenance.target_table == 'location')", | ||
| viewonly=True, | ||
| ) | ||
|
|
||
|
|
||
| @property | ||
| def target(self): | ||
| """ | ||
| A generic property to get the parent object (Thing, Location, etc.). | ||
| This is useful for simplifying application code by providing a single, | ||
| consistent way to access the parent of a polymorphic record. | ||
| """ | ||
| return getattr(self, f"_{self.target_table.lower()}_target") | ||
|
|
||
| # --- Table Arguments --- | ||
| __table_args__ = ( | ||
| # Composite index for fast polymorphic lookups | ||
| Index("ix_provenance_targets", "target_id", "target_table"), | ||
| ) | ||
|
|
||
|
|
||
| class DataProvenanceMixin: | ||
| """ | ||
| Mixin for models that can have data provenance records (e.g., Thing, Location). | ||
| It automatically creates a polymorphic One-to-Many relationship to the | ||
| DataProvenance table. | ||
| """ | ||
|
|
||
| @declared_attr | ||
| def data_provenance(cls): | ||
| # One-to-Many polymorphic relationship | ||
| return relationship( | ||
| "DataProvenance", | ||
| primaryjoin=and_( | ||
| cls.id == foreign(DataProvenance.target_id), | ||
| DataProvenance.target_table == pascal_to_snake(cls.__name__), | ||
| ), | ||
| lazy="selectin", | ||
| viewonly=True, | ||
| ) | ||
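
The relationships in this file all reduce to the same lookup: match `target_id` against the parent's primary key and `target_table` against a literal table name. Below is a minimal sketch of that lookup using only the standard-library `sqlite3` module rather than SQLAlchemy. The table names (`thing`, `location`, `data_provenance`), the trimmed column set, and the sample rows are assumptions made for illustration; they are not taken from the project's actual schema or migrations.

```python
import sqlite3

# In-memory database with a trimmed, hypothetical version of the schema.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE thing (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE location (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE data_provenance (
        id INTEGER PRIMARY KEY,
        target_id INTEGER NOT NULL,
        target_table TEXT NOT NULL,
        field_name TEXT,
        origin_source TEXT
    );
    -- Same column order as the composite index in the diff.
    CREATE INDEX ix_provenance_targets
        ON data_provenance (target_id, target_table);
    """
)

# Both parents deliberately share id = 1, so target_id alone is ambiguous.
conn.execute("INSERT INTO thing (id, name) VALUES (1, 'Well A')")
conn.execute("INSERT INTO location (id, name) VALUES (1, 'Site A')")
conn.executemany(
    "INSERT INTO data_provenance "
    "(target_id, target_table, field_name, origin_source) VALUES (?, ?, ?, ?)",
    [
        (1, "thing", "well_depth_ft", "Driller's Log"),
        (1, "location", "coordinates", "Well Report"),
    ],
)


def provenance_for(table: str, row_id: int) -> list:
    """Return (field_name, origin_source) rows for one parent record."""
    return conn.execute(
        "SELECT field_name, origin_source FROM data_provenance "
        "WHERE target_id = ? AND target_table = ? ORDER BY id",
        (row_id, table),
    ).fetchall()


thing_rows = provenance_for("thing", 1)
location_rows = provenance_for("location", 1)
print(thing_rows)
print(location_rows)
```

Because the two parents share the same `id`, the `target_table` filter is what keeps their provenance rows apart. Note that in this sketch `target_id` carries no foreign-key constraint, since a single column cannot reference several parent tables at once; integrity has to be maintained by the application.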
These fields are used for a small subset of fields in a subset of tables. Should they move to those tables rather than live here, since they won't apply to a number of the fields for which `DataProvenance` will be used?
Maybe, but I saw this as one of the advantages of the Provenance model: all the sparse, optional, and evolving metadata is organized in one central place. If we move these fields to the Location table, we'd have to add even more fields (`coordinate_accuracy`, `coordinate_collection_method`, `coordinate_accuracy_value`, `coordinate_accuracy_unit`, plus the same ones for elevation). I think storing this type of metadata is more efficient with the `DataProvenance` table, but will let @jirhiker weigh in, too.
It seems appropriate to store these fields here. We can reevaluate later if user requirements dictate.
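
To make the trade-off in this thread concrete, here is a rough sketch of the "one row per described field" layout that the reply argues for. It uses plain dataclasses instead of the ORM. `ProvenanceRecord` and `provenance_for_field` are hypothetical names, and the accuracy numbers are invented for the example; only the field names and the sample method and unit strings come from the model's column comments.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProvenanceRecord:
    """Plain-Python stand-in for one DataProvenance row (illustrative only)."""

    target_table: str
    target_id: int
    field_name: Optional[str] = None
    origin_source: Optional[str] = None
    collection_method: Optional[str] = None
    accuracy_value: Optional[float] = None
    accuracy_unit: Optional[str] = None


# Coordinates and elevation each get their own row, keyed by field_name,
# instead of separate per-field columns on the Location table.
records = [
    ProvenanceRecord(
        "location",
        1,
        "coordinates",
        collection_method="GPS - Survey Grade",
        accuracy_value=0.5,  # invented value
        accuracy_unit="meters",
    ),
    ProvenanceRecord(
        "location",
        1,
        "elevation",
        accuracy_value=3.0,  # invented value
        accuracy_unit="feet",
    ),
]


def provenance_for_field(rows, target_table, target_id, field_name):
    """Return the first record describing one field of one parent row, or None."""
    for row in rows:
        key = (row.target_table, row.target_id, row.field_name)
        if key == (target_table, target_id, field_name):
            return row
    return None


coords = provenance_for_field(records, "location", 1, "coordinates")
elevation = provenance_for_field(records, "location", 1, "elevation")
missing = provenance_for_field(records, "location", 1, "well_depth_ft")
```

Fields with no provenance simply have no row (`missing` is `None`), which is the sparseness the reply points to. The cost raised in the first comment also shows up here: columns such as `accuracy_unit` stay empty on every record that describes a field without an accuracy.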