
BDMS-227-231-additional-well-info-transfer-scripts#258

Merged
jacob-a-brown merged 20 commits into bdms-227 from kas-227-231-additional-well-info-transfer-scripts on Dec 1, 2025


@ksmuczynski ksmuczynski commented Nov 25, 2025

Contributor

Why

This PR addresses the following problem / context:

  • Need to migrate legacy NM_Aquifer data to new Ocotillo database schema
  • Legacy data has aquifer and formation information split across multiple lookup tables (LU_AquiferClass, LU_AquiferType, LU_Formations, LU_Lithology)
  • Wells can have multiple aquifer characteristics (e.g., both "Fractured" and "Confined") which wasn't properly modeled
  • Detailed stratigraphy data with depth intervals and lithology needs to be transferred from Stratigraphy.csv

The purpose of this PR is to make these intermediate updates available to @jacob-a-brown as we continue to co-work on the main BDMS-227 task.

How

Implementation summary - the following was changed / added / removed:

Transfer Scripts

  • Created aquifer_system_transfer.py to transfer aquifer systems from LU_AquiferClass (creates aquifer names with placeholder primary_type)
  • Created geologic_formation_transfer.py to transfer formation codes from LU_Formations (stores only formation_code, ignores legacy MEANING values)
  • Created stratigraphy_transfer.py to transfer detailed lithology logs from Stratigraphy.csv and create ThingGeologicFormationAssociation records (creates associations with depth intervals and updates formation lithology)
  • Updated well_transfer.py to:
    • Map WellData.AqClass codes to aquifer names using lexicon mapper
    • Parse multi-character AquiferType codes (e.g., "FC" → ["F", "C"])
    • Create AquiferType records for each characteristic
    • Update aquifer primary_type from "Unknown" to actual observed type
    • Handle missing AqClass by falling back to aquifer type name
    • Create ThingAquiferAssociation records linking wells to aquifer systems
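
The compound-code parsing described in the list above can be sketched roughly as follows. This is not the code from well_transfer.py: the function names and the `AQUIFER_TYPE_NAMES` table are hypothetical, and only the "F" and "C" entries are taken from the "FC" example in this PR.

```python
# Hypothetical code-to-name table; real values would come from LU_AquiferType.
AQUIFER_TYPE_NAMES = {"F": "Fractured", "C": "Confined"}


def extract_aquifer_type_codes(raw):
    """Split a compound code such as "FC" into single-character codes."""
    # Treat None and float NaN (a missing pandas value) as "no codes".
    if raw is None or (isinstance(raw, float) and raw != raw):
        return []
    codes = []
    for char in str(raw).strip().upper():
        # Skip whitespace and repeated characters, keeping first-seen order.
        if char.isspace() or char in codes:
            continue
        codes.append(char)
    return codes


def aquifer_type_names(raw):
    """Map each parsed code to its name, skipping codes not in the table."""
    return [
        AQUIFER_TYPE_NAMES[code]
        for code in extract_aquifer_type_codes(raw)
        if code in AQUIFER_TYPE_NAMES
    ]
```

Keeping first-seen order matters if the first code is later treated as the primary type; silently skipping unknown codes is an assumption, and a real transfer may prefer to log them.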

Utility Updates:

  • Updated lexicon_mapper to include LU_AquiferClass (CODE → aquifer name mapping)
  • Updated lexicon_mapper to include LU_Lithology (ABBREVIATION → TERM mapping)
  • Added special handling for LU_Lithology which uses ABBREVIATION/TERM instead of CODE/MEANING columns
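
A minimal sketch of the per-table column handling described above, assuming the lookup tables are read as rows of dicts (for example via `csv.DictReader`). `build_lookup`, `DEFAULT_COLUMNS`, and `COLUMN_OVERRIDES` are hypothetical names, not the actual lexicon_mapper API.

```python
# Most lookup tables use CODE/MEANING; LU_Lithology uses ABBREVIATION/TERM.
DEFAULT_COLUMNS = ("CODE", "MEANING")
COLUMN_OVERRIDES = {"LU_Lithology": ("ABBREVIATION", "TERM")}


def build_lookup(table_name, rows):
    """Return a {key: value} mapping for one lookup table.

    rows is an iterable of dicts keyed by column name.
    """
    key_col, value_col = COLUMN_OVERRIDES.get(table_name, DEFAULT_COLUMNS)
    lookup = {}
    for row in rows:
        key = (row.get(key_col) or "").strip()
        value = (row.get(value_col) or "").strip()
        # Ignore rows where either side of the mapping is blank.
        if key and value:
            lookup[key] = value
    return lookup
```

Holding the exception in a small override table keeps the special case in one place if another lookup table turns out to use different column names.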

Transfer Workflow:

  1. aquifer_system_transfer.py - Creates named aquifer systems (e.g., "Ogallala Aquifer")
  2. geologic_formation_transfer.py - Creates formation code records (e.g., "121ALVM")
  3. well_transfer.py - Creates wells, aquifer associations, and AquiferType records
  4. stratigraphy_transfer.py - Creates geologic formation associations with depth intervals and lithology
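
Since each step depends on records created by the earlier ones, the ordering above could be enforced by a small runner like the one below. This is a sketch only: `run_transfers` is a hypothetical helper, and how the scripts are actually wired into transfers/transfer.py is not shown in this PR.

```python
def run_transfers(steps):
    """Run (name, callable) pairs in order and stop at the first failure."""
    completed = []
    for name, step in steps:
        try:
            step()
        except Exception as exc:
            # Later steps depend on earlier ones, so do not continue.
            raise RuntimeError(
                f"transfer step {name!r} failed after {completed}"
            ) from exc
        completed.append(name)
    return completed
```

A caller would pass the four transfers as a list in the order given above, for example `[("aquifer_system_transfer", run_aquifers), ...]`, where `run_aquifers` stands in for whatever entry point each script exposes.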

Notes

Any special considerations, workarounds, or follow-up work to note?
Follow-up work:

  • Create formations.json file with USGS formation code-to-name mappings for API
  • Update API endpoints to map formation codes to names using formations.json
  • May need to handle cases where wells have conflicting lithology data for same formation code

Design Decisions:

  • Formation names intentionally NOT stored in database; API will map formation codes to names using authoritative formations.json file that can be updated independently
  • GeologicFormation.lithology populated during stratigraphy transfer, not formation transfer, since lithology varies by location

…bles to `util.py`.

Maps the `LU_AquiferClass` and `LU_AquiferType` lookup tables to the lexicon.
…31-additional-well-info-transfer-scripts

# Conflicts:
#	schemas/aquifer_system.py
…formation transfers.

Add stratigraphy_transfer.py to handle detailed lithology log import, create well-formation associations, and update formation lithology fields from stratigraphy data. This script is essential for linking wells to geologic formations with depth intervals.

@jirhiker jirhiker left a comment

Member

Looks good. A few minor tweaks needed

Comment thread transfers/aquifer_system_transfer.py Outdated
Comment thread transfers/stratigraphy_transfer.py Outdated

@jacob-a-brown jacob-a-brown left a comment

Contributor

Overall I think it looks good! Just left a few comments

Comment thread schemas/geologic_formation.py Outdated
Comment thread schemas/geologic_formation.py Outdated
Comment thread transfers/aquifer_system_transfer.py Outdated
Comment thread transfers/stratigraphy_transfer.py
Comment thread transfers/stratigraphy_transfer.py
…phy transfer

- Changed log level from 'warning' to 'critical' in the depth validation section
- Added error tracking and clearer error messages.
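
A rough sketch of what depth validation with critical-level logging and error tracking might look like. The specific rules (missing, negative, or inverted depths) and the name `validate_depth_interval` are assumptions; the commit message does not say which checks the script performs.

```python
import logging

logger = logging.getLogger("stratigraphy_transfer_sketch")


def validate_depth_interval(point_id, top, bottom, errors):
    """Return True when the interval is usable; otherwise log and record it."""
    problem = None
    if top is None or bottom is None:
        problem = "missing depth"
    elif top < 0 or bottom < 0:
        problem = "negative depth"
    elif bottom <= top:
        # Assumption: zero-thickness intervals are rejected as well.
        problem = "bottom depth is not below top depth"
    if problem is None:
        return True
    message = f"{point_id}: {problem} (top={top}, bottom={bottom})"
    logger.critical(message)
    # Collect the message so a summary can be reported after the run.
    errors.append(message)
    return False
```

Collecting messages in a list, in addition to logging them, allows the transfer to report every rejected row at the end instead of only in the log stream.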
…ystem transfer

- Changed log level from 'warning' to 'critical' in the depth validation section
- Added error tracking
…s to well transfer

- Implement _extract_aquifer_type_codes() to parse compound codes (e.g., "FC" -> Fractured + Confined)
- Add get_or_create_aquifer() helper to manage unique aquifer system records
- Add get_or_create_formation() helper to manage geologic formation records
- Integrate aquifer association logic in `transfer_wells()` to create ThingAquiferAssociation and AquiferType records
- Integrate formation association logic to create ThingGeologicFormationAssociation records with depth data
- Support lexicon mapping for both AqClass (aquifer name) and AquiferType (characteristics) fields
- Add comprehensive error handling and logging for aquifer/formation associations

This enables proper tracking of wells' aquifer systems with multiple type characteristics
and their associated geologic formations, preserving all source data from NM_Aquifer.

ksmuczynski commented Nov 25, 2025

Contributor Author

Made some updates to well_transfer.py:

  • Implemented _extract_aquifer_type_codes() function to parse compound aquifer type codes (e.g., "FC" splits into ["F", "C"] for Fractured and Confined)
  • Implemented get_or_create_aquifer() helper function that creates one aquifer system record per named aquifer (e.g., "Santa Fe Group") with deduplication logic
  • Implemented get_or_create_formation() helper function to manage geologic formation records with deduplication
  • Integrated aquifer association logic in transfer_wells() that:
    • Maps AqClass codes to aquifer names using lexicon mapper
    • Creates ThingAquiferAssociation records linking wells to aquifer systems
    • Creates separate AquiferType records for each characteristic (supports wells with multiple types)
    • Checks for existing associations to prevent duplicates
  • Integrated formation association logic that:
    • Creates ThingGeologicFormationAssociation records with depth intervals
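
The deduplication behind a helper like get_or_create_aquifer() can be illustrated with an in-memory stand-in. The real helper presumably works against a database session; `AquiferRegistry` and the case-insensitive name matching below are assumptions made for the sketch.

```python
class AquiferRegistry:
    """In-memory stand-in for a get-or-create lookup keyed by aquifer name."""

    def __init__(self):
        self._by_name = {}

    def get_or_create(self, name, primary_type="Unknown"):
        """Return (record, created), reusing the record for a known name."""
        # Assumption: names match case-insensitively after trimming.
        key = name.strip().lower()
        record = self._by_name.get(key)
        created = record is None
        if created:
            record = {"name": name.strip(), "primary_type": primary_type}
            self._by_name[key] = record
        return record, created
```

Calling `get_or_create("Santa Fe Group")` twice returns the same record, with `created` true only on the first call, which is the behavior needed for one aquifer system record per named aquifer.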

Comment thread transfers/well_transfer.py
Comment thread transfers/well_transfer.py Outdated
Comment thread transfers/well_transfer.py Outdated
- Set primary_type placeholder to "Unknown" instead of None when creating an aquifer_system in `well_transfer.py`
…ations

- Updated the logic so that a formation association is created only if valid well depth data exist.
…th AMMP

It is assumed that the first recorded type of a compound type is the primary type of the aquifer.
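
Under that assumption, replacing the "Unknown" placeholder might look like the sketch below. `resolve_primary_type` is a hypothetical name, and leaving an already-set primary type untouched is an additional assumption.

```python
def resolve_primary_type(current, type_codes):
    """Replace the "Unknown" placeholder with the first observed type code."""
    # Keep an already-resolved type, and keep the placeholder if no codes exist.
    if current != "Unknown" or not type_codes:
        return current
    return type_codes[0]
```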
Comment thread transfers/well_transfer.py Outdated
if row.WellDepth and not isna(row.WellDepth)
else 100.0
)
# Only create association if valid well depth data exists

Contributor

good validation to check 👍

Comment thread transfers/well_transfer.py Outdated
Comment on lines +581 to +590
# Create association using actual well depth
top_depth = 0.0
bottom_depth = float(row.WellDepth)

formation_assoc = ThingGeologicFormationAssociation(
thing=well,
geologic_formation=formation,
top_depth=top_depth,
bottom_depth=bottom_depth,
)

Contributor

An issue that comes to mind (and that we just discussed): what should be done if the well record has a formation zone but nothing in the stratigraphy table? This will create associations indicating that the full well is in that zone (top depth = 0 and bottom depth = well depth). Perhaps we can store WellData.FormationZone in a field called nma_formation_zone? That would keep Ocotillo strict while preserving legacy data if/when it is wanted.

Contributor Author

This is a thoughtful comment. I've added a formation_completion_code field to the Thing model and I've updated this section so it simply records which formation the well was completed in. ThingGeologicFormationAssociation records are no longer being created via the well transfer, they are only being created via the stratigraphy transfer.

Contributor

Should the formation_completion_code be included in the geologic_formations field in the response? That is, added to the list of geologic formations a well passes through? Or should it be its own field to indicate this is where the well was completed?

On that note, @chasetmartin the step in the feature file says "And the response should include the formation as the formation zone of well completion." Should we return all geologic formations associated with the well, or just the formation in which the well was completed?

Contributor

If we are just returning the formation in which the well terminates, the following steps will need to be taken:

  1. update the response schema
  2. update the testing data
  3. update the test for "the response should include the formation as the formation zone of well completion"

Contributor Author

I think the formation_completion_code should be its own field. I think there is value in distinguishing between what formation a well is completed in and what formations the borehole passes through.

@chasetmartin chasetmartin Dec 1, 2025

Collaborator

@jacob-a-brown @ksmuczynski So just so I understand the question, there are multiple formations associated with a well in NMAquifer?

Contributor

they're in the stratigraphy table (though there's also the field FormationZone for where the well terminates)

Collaborator

Ah yeah ok, the feature file was just referencing that FormationZone termination field, so Kelsey's suggestion above seems like a good plan.

…s in `well_transfer.py`

The current implementation for creating a Formation Association was problematic for the following reasons:
	1. The FormationZone field from the WellData csv indicates completion zone, not the entire well stratigraphy
	2. Setting top_depth = 0.0 incorrectly implies:
		* The formation starts at ground surface
		* The well only penetrates one formation
		* The entire well depth is within that single formation
	3. ThingGeologicFormationAssociation currently implies a full stratigraphic column with depth intervals
	4. Forcing FormationZone into a depth-based association creates misleading data

Implementation
- Added `formation_completion_code` Field to Thing Model
	- This provides a clear separation between `formation_completion_code` = "What formation is the well completed in" and `formation_associations` = "What formations does the borehole pass through?"
- Updated well_transfer.py so that `ThingGeologicFormationAssociation` records are only being created from the Stratigraphy.csv (they were previously being created from WellData, too).
@ksmuczynski

Contributor Author

Made some updates to well_transfer.py and Thing model.

Context:
The current implementation for creating a Formation Association was problematic for the following reasons:

  1. The FormationZone field from the WellData csv indicates completion zone, not the entire well stratigraphy
  2. Setting top_depth = 0.0 incorrectly implies:
    • The formation starts at ground surface
    • The well only penetrates one formation
    • The entire well depth is within that single formation
  3. ThingGeologicFormationAssociation currently implies a full stratigraphic column with depth intervals
  4. Forcing FormationZone into a depth-based association creates misleading data

Updates:

  • Added formation_completion_code Field to Thing model
    • This provides a clear separation between formation_completion_code = "What formation is the well completed in" and formation_associations = "What formations does the borehole pass through?"
  • Updated Thing schema to include formation_completion_code field in CreateWell( )
  • Updated well_transfer.py so that ThingGeologicFormationAssociation records are only being created from the Stratigraphy.csv (they were previously being created from WellData, too).
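
The separation described in these updates can be pictured with plain dataclasses. This is only an illustration: the real Thing model is not shown in this PR, `WellSketch` and `FormationInterval` are hypothetical names, and only the two field names come from the comment above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FormationInterval:
    """One stratigraphy row: a formation with its depth interval."""

    formation_code: str
    top_depth: float
    bottom_depth: float


@dataclass
class WellSketch:
    """Hypothetical stand-in for the well-related fields on Thing."""

    name: str
    # Formation the well is completed in (from WellData.FormationZone).
    formation_completion_code: Optional[str] = None
    # Formations the borehole passes through (from Stratigraphy.csv only).
    formation_associations: List[FormationInterval] = field(default_factory=list)
```

With this shape, a well that has a FormationZone but no stratigraphy rows carries a completion code and an empty association list, rather than a fabricated interval running from 0 to the well depth.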

@jacob-a-brown jacob-a-brown left a comment

Contributor

I think that it looks good - I just have one minor comment. The last thing to do is add the new transfer scripts to transfers/transfer.py and make sure the transfers are working (and do some spot checking between Ocotillo and NM_Aquifer to make sure the data is being transferred and persisted as expected)

Comment thread schemas/thing.py Outdated
@jacob-a-brown jacob-a-brown merged commit 12ebaac into bdms-227 Dec 1, 2025
4 checks passed
@ksmuczynski ksmuczynski deleted the kas-227-231-additional-well-info-transfer-scripts branch December 10, 2025 18:24

4 participants