Timestamp proposal#264
Conversation
Codecov Report❌ Patch coverage is
|
There was a problem hiding this comment.
Pull request overview
This PR proposes the addition of three new date fields to preserve critical temporal data during migration from the legacy AMPAPI system to NMSampleLocations: Location.legacy_date_created, Location.inventoried_on, and Thing.well_completed_on. Analysis revealed that without these fields, 3,638 historical inventory dates (9.14% of locations) and 11,634 well completion dates (30% of wells) would be permanently lost, including historical context spanning up to 54 years.
Key Changes:
- Enhanced CSV reading utility to check local data directory before cache/GCS
- Added comprehensive test scenarios for well completion date field functionality
- Added comprehensive test scenarios for location legacy date field functionality
Reviewed changes
Copilot reviewed 3 out of 4 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| transfers/util.py | Modified CSV reading logic to prioritize local data directory, improving development workflow and data access patterns |
| tests/features/well-completion-date.feature | Added BDD scenarios covering creation, retrieval, updates, and filtering of wells with completion dates |
| tests/features/location-legacy-dates.feature | Added BDD scenarios for location legacy date fields, including edge cases like 54-year gaps between inventory and database entry |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # populate legacy date fields | ||
| data_dict["properties"]["legacy_date_created"] = data_dict.get( | ||
| "legacy_date_created" | ||
| ) | ||
| data_dict["properties"]["legacy_site_date"] = data_dict.get("legacy_site_date") |
There was a problem hiding this comment.
there are no features in feature files indicating that this data is required. should it be returned in the response? or just left in the database for audits/when folks are interested?
There was a problem hiding this comment.
TODO: create a feature file for this
There was a problem hiding this comment.
should this file go into the repo OcotilloBDD rather than here?
There was a problem hiding this comment.
suggestion: Move most features in OcotilloBDD into this repo going forward.
|
I appreciate the structure and comprehensiveness of this PR. I think it would benefit from some additional context regarding the history of NMAquifer and how its been used in the past. |
| AMP folks said that the earlier date between DateCreated and SiteDate is when | ||
| the site was inventoried, whereas the later is when the record was made in | ||
| the database. This was because they were used interchangeably. | ||
| """ |
There was a problem hiding this comment.
This is the vexing logic this PR is avoiding now.
| @@ -0,0 +1,354 @@ | |||
| # =============================================================================== | |||
There was a problem hiding this comment.
TODO: mark this as disposable post-migration
Timestamp Field Migration: AMPAPI → NMSampleLocations
Status: Phase 1 (Location AMPAPI Date Fields) - Implemented
Overview
Analysis of AMPAPI timestamp fields revealed that critical temporal data would be lost during migration. This guide documents the Location AMPAPI date fields implemented in Phase 1 using the
nma_prefix (New Mexico Aquifer Mapping).Implemented Fields (Phase 1)
1.
Location.nma_date_created(Date)Source: AMPAPI
Location.DateCreatedDate(day-precision, no timezone)LocationResponseonly (NOT inCreateLocationorUpdateLocation- read-only)AutoBaseMixin.created_atas true audit timestamp for new system; store AMPAPI creation date separately2.
Location.nma_site_date(Date)Source: AMPAPI
Location.SiteDateDate(day-precision, no timezone)LocationResponseonly (NOT inCreateLocationorUpdateLocation- read-only)nma_site_dateusing AMPAPI/NMA prefix to clarify this is migration data, not an active field in the new systemData Analysis Summary
Location Date Relationships (3,638 records with both dates)
Top 5 Largest Time Gaps
Design Decisions
Why
Dateinstead ofDateTime?00:00:00)Why separate
nma_date_createdfromcreated_at?AutoBaseMixin.created_atsemantics (when record created in new system)Why
nma_site_date?nma_prefix clarifies these are AMPAPI/NMA-specific fields (not active fields in the new system)SiteDate)Thing.first_visit_datefor ongoing field visit trackingLocation.nma_site_datevsThing.first_visit_dateThese fields serve different purposes and live on different entities:
nma_site_datefirst_visit_dateKey Differences:
Entity Level:
nma_site_dateis onLocation(just a geographic point)first_visit_dateis onThing(a monitored physical object at that location)Temporal Context:
nma_site_date: Historical site reconnaissance (may predate well installation)first_visit_date: First interaction with the specific Thing being monitoredData Population:
nma_site_date: Only 9% populated, from AMPAPI migration, read-onlyfirst_visit_date: Actively maintained for new Things, user-editableUse Cases:
nma_site_date: Historical research, understanding data gaps from legacy systemfirst_visit_date: Current monitoring workflow, field visit schedulingExample Scenario:
nma_site_date = 1954-05-01)first_visit_date = 2003-12-10)nma_date_created = 2008-05-28)Post-Migration Write Permissions
Location AMPAPI date fields are read-only (migration-only):
Location.nma_date_createdLocation.nma_site_dateImplementation Constraints:
nma_date_createdandnma_site_dateMUST NOT be inCreateLocationorUpdateLocationschemas (read-only)LocationResponseschemanma_prefix (New Mexico Aquifer Mapping) to indicate AMPAPI provenanceData Provenance:
nma_date_created, may havenma_site_dateImplementation Notes
GeoJSON Response Structure
Issue: AMPAPI date fields need to be accessible in nested GeoJSON responses.
Root Cause:
current_locationas aLocationGeoJSONResponsepropertiesSolution (schemas/location.py:110-112, 158-159):
Impact:
response["current_location"]["properties"]["nma_date_created"]Created At Field Removal
Issue: Transfer scripts were manually setting
created_atfrom AMPAPI dates.Root Cause:
DateCreatedandSiteDateto determinecreated_atAutoBaseMixin.created_atshould reflect when record is created in NMSampleLocations, not AMPAPISolution (transfers/util.py:247-268):
Impact:
created_atisNoneduring transfer (set automatically when saved to database)Fields NOT Being Migrated
WLReportDeliverChemistryReportDeliverImpact Assessment
Without These Fields (Phase 1)
With These Fields (Phase 1)
nma_prefix clearly indicates AMPAPI provenance