docs(#1429): add Cascade Specification #182
New normative spec at `src/reference/specs/cascade.md` (~200 lines) covering how DataJoint propagates restrictions across the FK graph for cascade delete and cascade preview. Pairs with datajoint-python #1468.

Contents:

- Overview of cascade entry points (`Table.delete`, `Diagram.cascade`)
- Dependency graph structure: nodes, edges, alias nodes for aliased FKs
- The three forward propagation rules (F1 copy, F2 aliased rename, F3 projection) and the three symmetric upward rules (U1, U2, U3)
- Cascade flow diagram and multi-pass termination
- `part_integrity` modes (enforce / ignore / cascade)
- Part-to-Master upward propagation: Master identification by naming convention, FK-path walk via `nx.shortest_path`, alias-node transparency, intermediate-Part restrictions, materialization at the Master
- Seed-is-Part case
- Two worked examples mirroring #1429 Case 1 (renamed FK) and Case 2 (Part-of-Part chain)
- Algorithmic complexity
- Out-of-scope cross-references (`Diagram.trace` #1423, cross-schema cascade, custom rules)

Examples use core DataJoint types (`int32`) per project convention. Nav entry added under Reference > Specifications > Data Operations.
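To make the "Master identification by naming convention" item concrete, here is a minimal sketch of name-based matching. `guess_master` and its regular expression are hypothetical stand-ins written for illustration; they are not a copy of `dependencies.extract_master`, and the table names are invented.

```python
import re

# Assumed convention: a Part's full name is `schema`.`master__part`,
# where the master name may carry a '#' or underscore tier prefix.
_PART_NAME = re.compile(r"(?P<master>`\w+`\.`#?\w+)__\w+`")


def guess_master(full_table_name):
    """Return the presumed Master's full name, or None if the name is not Part-like."""
    match = _PART_NAME.fullmatch(full_table_name)
    return match["master"] + "`" if match else None


print(guess_master("`lab`.`subject__session`"))    # `lab`.`subject`
print(guess_master("`lab`.`subject`"))             # None (not a Part)
print(guess_master("`lab`.`subject_session`"))     # None (single underscore breaks the convention)
print(guess_master("`lab`.`__analysis__trial`"))   # `lab`.`__analysis`
```

A name that does not follow the double-underscore convention simply yields `None`, which is the case where the upward walk would be skipped.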
MilagrosMarin
left a comment
Cross-checked the spec against the implementation in #1468 line-by-line:
✅ Forward rules F1/F2/F3 match _apply_propagation_rule (diagram.py:549-605) exactly
✅ Upward rules U1/U2/U3 match _apply_propagation_rule_upward (diagram.py:607-661) exactly
✅ Alias-node convention (integer-named nodes, both half-edges carry same attr_map/aliased) matches _find_real_edge_props
✅ extract_master naming convention verified in dependencies.py:20
✅ nx.shortest_path for FK path walking matches _propagate_part_to_master
✅ Materialization rationale (MySQL 1093 error 'You can't specify target table for update in FROM clause') precisely stated; the spec's (master_ft & restrictions).proj().to_arrays() is equivalent to the impl's self._restricted_table(master_name).proj().to_arrays()
✅ Seed-is-Part pre-loop trigger explanation matches diagram.py:467-474
✅ All cross-links resolve (master-part.md, diagram.md, data-manipulation.md, delete-data.md)
✅ load_all_downstream exists in dependencies.py:229
✅ Nav placement under Reference → Specifications → Data Operations is sensible
✅ Both worked examples trace correctly through the rules
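For reference, the materialization step checked above can be pictured with a small dependency-free sketch. It assumes the Master's primary keys have already been fetched as plain rows, and that an empty fetch is recorded as `[False]`, as the implementation in #1468 appears to do. `materialize_master_restriction` and the table names are hypothetical.

```python
def materialize_master_restriction(master_pk_rows):
    """Build a literal restriction from already-fetched Master primary-key rows."""
    rows = [dict(row) for row in master_pk_rows]
    # Assumed empty-result handling: restrict by False so the Master is
    # still listed, with zero affected rows.
    return rows if rows else [False]


restrictions = {}
restrictions["`lab`.`subject`"] = materialize_master_restriction(
    [{"subject_id": 1}, {"subject_id": 2}]
)
restrictions["`lab`.`cage`"] = materialize_master_restriction([])
print(restrictions)
```

Because the restriction is a list of literal key values rather than a subquery, the later DELETE does not have to read from the table it is modifying.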
A few minor wording observations, none blocking:
1. Worked Example 1, step 3b. Line 148 says:
Apply U3 —
`Subject` is restricted by `Subject.Session.proj()` (projected to `subject_id`).
The (projected to subject_id) parenthetical is slightly misleading. Subject.Session.proj() projects to Session's PK ({subject_id, session_id}), not to subject_id alone. Only subject_id matters in the resulting restriction because that's the shared column with Subject's PK in the natural join. Tighter wording: "…projected to Session's PK; the natural join with Subject filters on the shared subject_id".
2. Empty-result case in materialization not mentioned. Impl handles len(master_pk_values) == 0 by setting restrictions[master_name] = [False] — the master appears with zero affected rows. Spec's Materialization section doesn't mention this branch. One sentence would round it out.
3. Multiple FK paths from Master to Part. Same observation I raised on #1468. The spec uses nx.shortest_path (line 90) but doesn't note the limitation that multiple FK chains could exist between the same Master/Part pair. Worth one sentence in "What is not part of this specification" or a brief caveat where shortest_path is mentioned.
4. Naming-convention fragility. Line 86: "The Master is identified by naming convention via dependencies.extract_master". Worth noting this is fragile — a Part whose __ convention is broken, or a Part referenced from a different schema, wouldn't be matched. Current fallback ("otherwise the upward walk is skipped") is correct behavior, but the failure mode could be surprising.
5. Optional: brief notation section. Terms child_attrs, parent_attrs, parent_pk, child_pk, attr_map, aliased are used without an explicit glossary. A short "Notation" subsection (4–5 definitions) would lower the bar for newer contributors reading the spec.
6. Algorithmic complexity bound. Line 190's "O(N · E) per pass, with at most one pass per master pulled in" is fine but academically loose. Cleaner: O(P · N · E) where P is the number of distinct masters pulled in by upward propagation. Cosmetic.
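To illustrate item 3, here is a dependency-free sketch of how a shortest-path walk picks one FK chain while others exist. The graph and table names are hypothetical. The two helpers are plain-Python stand-ins; with networkx the corresponding calls would be `nx.shortest_path` and `nx.all_simple_paths`.

```python
from collections import deque


def shortest_path(graph, source, target):
    """Breadth-first search: return one shortest path, or None if unreachable."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in graph.get(path[-1], ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def all_simple_paths(graph, source, target, _prefix=()):
    """Depth-first enumeration of every cycle-free path from source to target."""
    prefix = list(_prefix) + [source]
    if source == target:
        return [prefix]
    paths = []
    for nxt in graph.get(source, ()):
        if nxt not in prefix:
            paths.extend(all_simple_paths(graph, nxt, target, prefix))
    return paths


# Hypothetical FK graph: PartB references the Master directly and also PartA.
fk_graph = {
    "Master": ["Master.PartA", "Master.PartB"],
    "Master.PartA": ["Master.PartB"],
    "Master.PartB": [],
}

chosen = shortest_path(fk_graph, "Master", "Master.PartB")
candidates = all_simple_paths(fk_graph, "Master", "Master.PartB")
print(chosen)           # ['Master', 'Master.PartB']
print(len(candidates))  # 2
```

Only the direct chain is walked; the chain through `Master.PartA` is never considered. A caveat in the spec could say that, or an implementation could compare the count of simple paths against 1 and warn.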
Approving — the spec is accurate, well-organized, and the worked examples are exactly the right grounding. The minor items are optional polish.
Summary
Adds a detailed normative spec at `src/reference/specs/cascade.md` covering how DataJoint propagates restrictions across the foreign-key graph for cascade delete and cascade preview. Pairs with datajoint-python #1468, which fixes the Part-to-Master upward propagation under `part_integrity="cascade"` (#1429).
Closes the docs side of datajoint-python #1429. Slated for DataJoint 2.3.
Contents
Examples use core DataJoint types (`int32`) per project convention.
Nav placement
New entry under Reference → Specifications → Data Operations, between Data Manipulation and AutoPopulate.
Sequencing
Reviewable now. Should land alongside or after datajoint-python #1468 so the spec doesn't describe code that isn't shipped.
Test plan