NO TICKET: thing update and fix by jacob-a-brown · Pull Request #179 · DataIntegrationGroup/OcotilloAPI

jacob-a-brown · 2025-10-08T17:34:59Z

Why

This PR addresses the following problem / context:

a ThingResponse should always have an active_location attribute, no matter if that response is from the /thing router or contained in another response, like the ContactResponse
all thing fields should be tested
depths associated with a well should be validated against each other

How

Implementation summary - the following was changed / added / removed:

To ensure that every thing object has an active_location property, I eagerly load location_associations for each thing, and for each location_association I eagerly load the location. By doing this I can set a thing's active_location property as

@property
def active_location(self):
    """
    Returns the currently active Location by sorting the effective_start
    field. Thing eagerly loads location_association, which eagerly loads
    location, which will hopefully prevent N+1 query problems.
    """
    active_location = sorted(
        self.location_associations, key=lambda x: x.effective_start
    )
    return active_location[0].location if active_location else None

added missing fields to thing schemas and tests
added some validations for well_depth, hole_depth, and well_casing_depth and tested those validations
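
As a rough illustration of what validating the depths against each other could look like, here is a minimal sketch. The specific rules (the well no deeper than the hole, the casing no deeper than the well) are assumptions, since the description above does not list them, and a plain dataclass named WellDepths stands in for the project's Pydantic schema so the sketch has no dependencies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WellDepths:
    """Hypothetical stand-in for the well schema; all depths are in feet."""

    well_depth: Optional[float] = None
    hole_depth: Optional[float] = None
    well_casing_depth: Optional[float] = None

    def __post_init__(self) -> None:
        # Assumed rule: the finished well cannot be deeper than the drilled hole.
        if (
            self.well_depth is not None
            and self.hole_depth is not None
            and self.well_depth > self.hole_depth
        ):
            raise ValueError("well_depth must not exceed hole_depth")
        # Assumed rule: the casing cannot extend below the bottom of the well.
        if (
            self.well_casing_depth is not None
            and self.well_depth is not None
            and self.well_casing_depth > self.well_depth
        ):
            raise ValueError("well_casing_depth must not exceed well_depth")
```

Each check is skipped when either depth is missing, so partially populated records still pass.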

Notes

Any special considerations, workarounds, or follow-up work to note?

Because so many updates have been made, the well and water level transfer scripts are now broken. After this PR has been amended, merged, and accepted I'll open a PR for both water level and thing transfer updates. Then I'll work on adding more metadata to the thing model.
Because I eagerly load with lazy="joined", I needed to invoke .unique() when using SQLAlchemy's select (not necessary for query).
Because a thing's active_location is now a property of the object itself, it no longer needs to be retrieved in the thing_helper functions.
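
The need for .unique() comes from how a JOIN against a one-to-many relationship behaves: the parent row is repeated once per child row. The sketch below uses the standard-library sqlite3 module rather than SQLAlchemy, with simplified, hypothetical thing and location_association tables, to show the duplication that has to be collapsed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE thing (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE location_association (
        id INTEGER PRIMARY KEY,
        thing_id INTEGER REFERENCES thing(id),
        effective_start TEXT
    );
    INSERT INTO thing VALUES (1, 'well-a'), (2, 'gage-b');
    INSERT INTO location_association VALUES
        (1, 1, '2020-01-01'),
        (2, 2, '2019-05-01'),
        (3, 2, '2023-08-15');
    """
)

# A joined load is a single statement, but each thing appears once per association.
rows = conn.execute(
    """
    SELECT thing.id, thing.name, la.effective_start
    FROM thing
    LEFT JOIN location_association AS la ON la.thing_id = thing.id
    ORDER BY thing.id, la.effective_start DESC
    """
).fetchall()
joined_thing_ids = [row[0] for row in rows]  # thing 2 is listed twice

# De-duplicating on the primary key, preserving order, leaves one entry per thing.
unique_thing_ids = list(dict.fromkeys(joined_thing_ids))
```

In SQLAlchemy 2.0-style code the corresponding step is calling .unique() on the result before .scalars(), along the lines of session.execute(select(Thing)).unique().scalars().all(), where session and Thing are assumed names for the session and the mapped class.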

when a different object needs to call a thing, like a contact, the `active_location` should be present in the thing response. this is achieved by eagerly loading the `active_location`

codecov-commenter · 2025-10-08T17:37:06Z

Codecov Report

❌ Patch coverage is 99.04762% with 1 line in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

Files with missing lines	Patch %	Lines
services/thing_helper.py	85.71%	1 Missing ⚠️

Files with missing lines	Coverage Δ
api/search.py	`97.77% <100.00%> (ø)`
api/thing.py	`98.46% <ø> (ø)`
db/location.py	`95.00% <100.00%> (ø)`
db/thing.py	`100.00% <100.00%> (ø)`
schemas/thing.py	`99.15% <100.00%> (ø)`
services/geospatial_helper.py	`74.46% <100.00%> (ø)`
tests/conftest.py	`98.59% <100.00%> (ø)`
tests/test_geospatial.py	`88.23% <100.00%> (ø)`
tests/test_observation.py	`95.65% <ø> (ø)`
tests/test_thing.py	`100.00% <100.00%> (ø)`
... and 1 more

TylerAdamMartinez

Overall, lgtm

TylerAdamMartinez · 2025-10-08T18:09:38Z

+
+
+class ValidateWell(BaseModel):
+    well_depth: float | None = None  # in feet


Suggested change

-    well_depth: float | None = None  # in feet
+    well_depth_in_feet: float | None = Field(None, alias="well_depth")

This change would improve clarity by using _in_feet to make units explicit, reducing ambiguity and preventing any future unit-mixing errors, while the alias preserves database compatibility. It also makes the API more 'self-documenting.'

Rather than add the unit to the field name, I can add another field called well_depth_unit and have it automatically be set to ft. A while ago we made the decision not to include units in the names, so I'll proceed with this method to keep a consistent style.

Fields can have descriptions, so I'll add that to document it for create/update

TylerAdamMartinez · 2025-10-08T18:12:32Z

    well_depth: float | None = None  # in feet
    hole_depth: float | None = None  # in feet
    well_construction_notes: str | None = None
+    well_casing_diameter: float | None = None  # in inches


Same thing here. Could you add the same level of expressiveness by including _in_inches to make the units explicit and consistent with the other fields?

TylerAdamMartinez · 2025-10-08T18:14:24Z

+        """
+        Returns the currently active Location by sorting the effective_start
+        field. Thing eagerly loads location_association, which eagerly loads
+        location, which will hopefully prevent N+1 query problems.


ksmuczynski · 2025-10-08T18:20:04Z

        cascade="all, delete-orphan",
        passive_deletes=True,
        order_by="LocationThingAssociation.effective_start.desc()",
+        lazy="joined",


Won't this fetch the entire location history for a Thing every time you load it? I would think we would only want a list of historical locations if it's explicitly asked for, which is what the default lazy loading provides.

I understand lazy loading can cause N+1 query problems if you loop through many Things and access the history for each one, but is that really something that will be done very often?

That does load the entire location history for a Thing every time thing records are returned from the database. To access the active (current?) location, though, we need to have access to all location associations so that we can get the latest one (currently based off of effective_start).

If the user gets a list of things, whether from /thing, /thing/water-well, or /thing/spring, extra queries will need to be performed for every object in that list. Weaver, for example, fetches many thousands of wells at a time, so if we use lazy loading there'd be many thousands of extra queries made, which would impact the performance of the API.

I apologize if I'm misinterpreting this discussion, but is the current/active location something that we can cache on the server side (with a reasonable TTL) so that the location history query doesn't have to happen every time a list of things is requested? Along those same lines, the many thousands of things with active locations are potentially most relevant to the /geospatial endpoint, for requesting many GeoJSONs on a map or in a shapefile all at once (the Weaver story you highlighted, and eventually maps in Ocotillo). So perhaps a cache or precalculated results would help with that once we arrive at working on that type of user story?

By eagerly loading the location associations only one query is made; if done correctly it shouldn't impact the performance of the API. Where it can run into issues, I think, is if there is too much data being transferred. I don't think that's too much of an issue for the time being since the LocationThingAssociation and Location tables are pretty small, but it could become a problem in the future.
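
To make the query-count difference concrete, here is a small sketch using the standard-library sqlite3 module with simplified, hypothetical tables (100 things, one association each). It is not the project's SQLAlchemy code; it only counts how many statements each access pattern sends to the database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE thing (id INTEGER PRIMARY KEY);
    CREATE TABLE location_association (
        id INTEGER PRIMARY KEY, thing_id INTEGER, effective_start TEXT
    );
    """
)
conn.executemany("INSERT INTO thing VALUES (?)", [(i,) for i in range(1, 101)])
conn.executemany(
    "INSERT INTO location_association (thing_id, effective_start) VALUES (?, ?)",
    [(i, "2020-01-01") for i in range(1, 101)],
)

executed = []  # SQL text of every statement sent through run()


def run(sql, params=()):
    """Execute a statement and record it so the statements can be counted."""
    executed.append(sql)
    return conn.execute(sql, params).fetchall()


def lazy_load():
    """N+1 pattern: one query for the things, then one more per thing."""
    executed.clear()
    for (thing_id,) in run("SELECT id FROM thing"):
        run(
            "SELECT effective_start FROM location_association WHERE thing_id = ?",
            (thing_id,),
        )
    return len(executed)


def joined_load():
    """Eager pattern: things and their associations in a single JOIN."""
    executed.clear()
    run(
        "SELECT t.id, la.effective_start FROM thing AS t "
        "LEFT JOIN location_association AS la ON la.thing_id = t.id"
    )
    return len(executed)
```

With 100 things, lazy_load() issues 101 statements and joined_load() issues 1. The trade-off raised above still applies: the single joined statement returns one row per association, so its result grows with the length of each thing's location history.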

There are other eager loading techniques I can look into, like requesting specific columns (I think), but right now I think all of a Location's fields are in the response.

Would the caching technique be difficult to implement?

I'm checking with Copilot whether there are methods to keep the active/current location as an attribute while avoiding both eager loading and N+1. If I can get that to work I'll push these updates.

Seems like the two best options, if we want to take care of this via the API, are:

1. use eager loading (new method)
2. write explicit queries (previous method)

The issue with 1 arises if too much data is loaded, that is, if a thing has hella associations. Wells only have one location, so this is a moot point for the time being. Rain gages may move, but do they move all that much? Would other types of things have many location associations?

The issue with 2 is that this needs to be implemented every place where a thing is needed, which is cumbersome and error-prone. (This was how I first identified the issue because a contact response has its associated things, and none of those had active locations loaded)

Sounds like you're on a good track then 👍. We don't need to make anything more complicated when we don't have a concrete use case or story showing otherwise.

ksmuczynski · 2025-10-08T18:29:19Z

    )

+    @property
+    def active_location(self):


How do you feel about current_location instead of active_location?

I'm okay with either name. @jirhiker do you have a preference?

current_location is fine. It's perhaps less ambiguous than active_location.

ksmuczynski · 2025-10-08T18:57:20Z

+        location, which will hopefully prevent N+1 query problems.
+        """
+        active_location = sorted(
+            self.location_associations, key=lambda x: x.effective_start


Would sorting by effective_end have any advantages? An effective_end of NULL indicates an active/current location. Just curious since that's how I was imagining querying for current location.

In updating the retrieval function I also made sure that the effective_end is None:

return (
    active_location[0].location
    if active_location and active_location[0].effective_end is None
    else None
)

My concern would be that someone would forget to set effective_end, and since effective_start is required I thought that'd be a safer way to sort.
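
The two ideas in this thread (order on the required effective_start, then confirm that effective_end is unset) can be combined as in the sketch below. The Association dataclass is a hypothetical stand-in for LocationThingAssociation, and the sketch assumes that "current" means the association with the most recent effective_start.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Sequence


@dataclass
class Association:
    """Hypothetical stand-in for LocationThingAssociation."""

    location: str
    effective_start: date
    effective_end: Optional[date] = None


def current_location(associations: Sequence[Association]) -> Optional[str]:
    """Return the location of the newest association if it is still open."""
    if not associations:
        return None
    # effective_start is required, so it is always safe to order on.
    newest = max(associations, key=lambda a: a.effective_start)
    # An association that has been closed out is no longer current.
    return newest.location if newest.effective_end is None else None
```

For a history containing a closed association starting in 2019 and an open one starting in 2021, current_location returns the 2021 location; if the newest association has an effective_end, it returns None.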

jacob-a-brown · 2025-10-08T21:03:04Z

@jirhiker @ksmuczynski @chasetmartin @TylerAdamMartinez

In making these updates I realized that the /geospatial endpoint would return multiple records for a thing if it had multiple location associations. I amended that section of the code (/services/geospatial_helper.py) so that if there are multiple location associations only the current/active geometry is used. This wasn't in the original PR and should be reviewed if you are re-reviewing.
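
The change in /services/geospatial_helper.py is not shown in this thread, so the following is only a sketch of the underlying problem and one possible fix. It uses the standard-library sqlite3 module with hypothetical, simplified tables, and assumes the current association is the one whose effective_end is NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE location (id INTEGER PRIMARY KEY, wkt TEXT);
    CREATE TABLE location_association (
        thing_id INTEGER,
        location_id INTEGER REFERENCES location(id),
        effective_start TEXT,
        effective_end TEXT
    );
    INSERT INTO location VALUES
        (1, 'POINT(-106.9 34.1)'),
        (2, 'POINT(-106.8 34.2)');
    INSERT INTO location_association VALUES
        (10, 1, '2019-01-01', '2022-06-30'),
        (10, 2, '2022-07-01', NULL);
    """
)

# Joining without a filter yields one record per association, so a thing
# that has moved shows up once for every location it has ever had.
all_rows = conn.execute(
    """
    SELECT la.thing_id, l.wkt
    FROM location_association AS la
    JOIN location AS l ON l.id = la.location_id
    ORDER BY la.effective_start
    """
).fetchall()

# Keeping only open associations leaves a single, current geometry per thing.
current_rows = conn.execute(
    """
    SELECT la.thing_id, l.wkt
    FROM location_association AS la
    JOIN location AS l ON l.id = la.location_id
    WHERE la.effective_end IS NULL
    ORDER BY la.thing_id
    """
).fetchall()
```

Here all_rows holds two records for thing 10, while current_rows holds only the geometry from the open association.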

jirhiker · 2025-10-08T21:31:33Z

@jacob-a-brown @chasetmartin @ksmuczynski What use case or UI feature leads to this statement: "a ThingResponse should always have an active_location attribute, no matter if that response is from the /thing router or contained in another response"?

jacob-a-brown · 2025-10-08T22:22:40Z

Upon reflection I wasn't really thinking about a UI feature. I wanted to get everything standardized so that every time the API needs a particular response or object (e.g. ThingResponse) it will always have the same information for the same record, no matter where it is invoked. I thought that if something was missing and it needed to be used elsewhere, it would be difficult to track down, so when I ran into this issue I went ahead and made the updates. Looking at the contact table in Ocotillo, clicking on the thing takes you to the detail page, so active_location (soon to be current_location) is truly needed there.

jacob-a-brown · 2025-10-08T22:35:19Z

In light of those needs (or lack thereof), is a reversion recommended? Or should I keep eager loading for standardizing the ThingResponse every time it is invoked?

jirhiker · 2025-10-09T02:30:18Z

I think your intuition is correct with regard to having a consistent ThingResponse. Can you do some performance testing to ensure that this does not have any negative effects?

jacob-a-brown · 2025-10-09T22:18:00Z

I did some testing locally and don't see any impacts, though sometimes doing these things locally can mask them (I'm still haunted by expand, but that was due to an N+1 issue [which I'm trying to avoid with eager loading])

jacob-a-brown added 4 commits October 7, 2025 17:24

fix: eager load active location for things

03acb5f

when a different object needs to call a thing, like a contact, the `active_location` should be present in the thing response. this is achieved by eagerly loading the `active_location`

Merge branch 'staging' into jab-thing-update-fix

a528178

Merge branch 'staging' into jab-thing-update-fix

1cf12ac

feat: add depth validations for a well

1f9eca0

jacob-a-brown requested review from TylerAdamMartinez, chasetmartin, jirhiker and ksmuczynski October 8, 2025 17:34

TylerAdamMartinez approved these changes Oct 8, 2025

ksmuczynski reviewed Oct 8, 2025

Comment thread db/thing.py Outdated

ksmuczynski reviewed Oct 8, 2025

jacob-a-brown added 4 commits October 8, 2025 13:18

feat: add units in well response | describe depth fields in well create

0171821

note: add note about eager loading of location associations

4ef32fe

fix: only get geospatial things with current location

1c5d814

update note

a900dd6

jacob-a-brown requested review from TylerAdamMartinez, chasetmartin and ksmuczynski and removed request for chasetmartin October 8, 2025 20:59

refactor: rename active_location to current_location

111aef4

jacob-a-brown mentioned this pull request Oct 9, 2025

BDMS 159: well transfers updates, water level transfer updates, & other revisions #180

Merged

jirhiker closed this pull request by merging all changes into staging in 75a2193 Oct 10, 2025

TylerAdamMartinez deleted the jab-thing-update-fix branch February 5, 2026 18:08


