Fix division by zero in clDice loss harmonic mean#8739
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughThe change clarifies the Estimated code review effort🎯 1 (Trivial) | ⏱️ ~5 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Hi @Mr-Neutr0n thanks for this change, there should be a value to prevent divide-by-zero but I would suggest following the pattern found with the other Dice losses. I would add a |
0e408ee to
746afac
Compare
|
Thanks @ericspod — the
The original commit only patched the harmonic-mean denominator, but the same 0/0 bug existed in the individual Verified locally — all 5 tests in Latest push: |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
monai/losses/cldice.py (1)
95-106:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winAdd one explicit
smooth_drregression test.
tests/losses/test_cldice_loss.pyonly exercises the defaultsmooth_dr, so a wiring regression in these new public parameters would still pass. Please add one case that sets a non-defaultsmooth_drand asserts the forward path stays finite and uses that value.As per coding guidelines, "Ensure new or modified definitions will be covered by existing or new unit tests."
Also applies to: 131-143, 169-182
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@monai/losses/cldice.py` around lines 95 - 106, Add a regression unit test in tests/losses/test_cldice_loss.py that calls the public soft_dice API with an explicit non-default smooth_dr (e.g., 1e-3) and verifies the forward output is finite (no NaN/inf) and that the smooth_dr value actually affects the result by comparing outputs for two different smooth_dr values (they should differ). Do the same for the other public clDice-related forward functions referenced in the diff (the functions around lines 131-143 and 169-182) so each new/modified parameter is exercised and asserted to produce finite outputs and to change when smooth_dr changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@monai/losses/cldice.py`:
- Around line 175-176: Update the docstring for the parameter smooth_dr in
clDice (referencing smooth_dr and the harmonic mean computation using tprec and
tsens) to explicitly state that smooth_dr is added not only to each individual
ratio denominator but also to the harmonic-mean denominator (the final tprec +
tsens + self.smooth_dr term) so the docs match the implementation; locate the
parameter description in the clDice/clDiceLoss docstring and reword it to
mention both the per-ratio denominators and the combined harmonic denominator.
---
Outside diff comments:
In `@monai/losses/cldice.py`:
- Around line 95-106: Add a regression unit test in
tests/losses/test_cldice_loss.py that calls the public soft_dice API with an
explicit non-default smooth_dr (e.g., 1e-3) and verifies the forward output is
finite (no NaN/inf) and that the smooth_dr value actually affects the result by
comparing outputs for two different smooth_dr values (they should differ). Do
the same for the other public clDice-related forward functions referenced in the
diff (the functions around lines 131-143 and 169-182) so each new/modified
parameter is exercised and asserted to produce finite outputs and to change when
smooth_dr changes.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: da0bf21b-1698-4174-88d1-fd388e9fb0d2
📒 Files selected for processing (1)
monai/losses/cldice.py
|
Hi @Mr-Neutr0n please have a look at the conflicts that need resolving, There's been changes to the section of code you wanted to update, unfortunately you'll have to untangle where your changes should go. |
746afac to
45216c2
Compare
|
Hi @ericspod — I rebased onto the current dev branch and resolved the conflict. While resolving it I saw that the core guard you asked for is already present on dev (added in #8703): the precision/sensitivity denominators use smooth_dr and the harmonic mean uses smooth, with smooth validated as positive. So the source-level division-by-zero issue is already covered upstream. I turned this PR into the missing regression coverage instead:
The branch is now clean on top of dev. Let me know if you'd rather I close this and open a fresh test-only PR. |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/losses/test_cldice_loss.py`:
- Around line 117-132: The current tests for SoftclDiceLoss only cover all-zero
inputs, so they miss the non-overlap case this change is meant to protect. Add
explicit regression tests using non-overlapping input/target tensors for each
affected loss class in test_cldice_loss.py, and assert the returned loss is
finite; use the existing SoftclDiceLoss test methods as the pattern and extend
the corresponding loss-specific test sections referenced by the current
additions.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 051147a5-e817-4147-a8d1-548927d84a7f
📒 Files selected for processing (2)
monai/losses/cldice.pytests/losses/test_cldice_loss.py
|
Hi @Mr-Neutr0n I guess other changes got to this fix first, sorry about that. I'm fine with keeping this PR and merging shortly, if you could add the tests as Coderabbit suggests we'd be good to go. |
…overlap inputs The source division-by-zero fix is already present on upstream/dev via Project-MONAI#8703 (it added smooth/smooth_dr guards and validates smooth > 0). This commit adds the missing regression coverage: - test_zero_input_is_finite for SoftclDiceLoss and SoftDiceclDiceLoss - test_non_default_smooth_dr_changes_result to prove the new parameter is wired through and actually affects the output - test_non_overlapping_input_is_finite for both loss classes, covering the scenario the original bug report was about Also clarifies the SoftclDiceLoss smooth_dr docstring.
45216c2 to
be465ab
Compare
|
Hi @ericspod — added the non-overlap finiteness tests for both SoftclDiceLoss and SoftDiceclDiceLoss as suggested. The branch is rebased on dev and should be ready for another look. |
ericspod
left a comment
There was a problem hiding this comment.
@Mr-Neutr0n looks good now thanks, I can pass your DCO for you if it's okay with you.
|
Thanks for the review, @ericspod. Yes, please go ahead and pass the DCO for me — the commit messages are already signed-off. Let me know if anything else is needed. |
Summary
SoftclDiceLossandSoftDiceclDiceLosswhen computing the harmonic mean of topology precision and sensitivity1e-7) to the denominator(tprec + tsens)to preventNaNwhen both values are zerosmooth=0Details
The clDice loss computes
cl_dice = 1.0 - 2.0 * (tprec * tsens) / (tprec + tsens). When bothtprecandtsensare zero (e.g., empty inputs, non-overlapping predictions/targets, orsmooth=0), this results in0/0 = NaN, which propagates through the loss and crashes training.While the default
smooth=1.0preventstprecandtsensfrom being exactly zero in most cases, settingsmooth=0(a valid configuration) exposes this bug whenever skeleton overlap is zero. The fix adds1e-7to the harmonic mean denominator, which:cl_dice = 1.0(maximum loss) when both precision and sensitivity are zero, which is the correct semantic resultsmooth_drinDiceLoss)Test plan
test_cldice_loss.pytests still pass (perfect overlap cases)test_zero_input_no_nan: verifies zero-valued inputs withsmooth=0do not produce NaNtest_no_overlap_no_nan: verifies non-overlapping predictions/targets withsmooth=0do not produce NaN