Disable cuDNN 9.23.0/9.23.1 for MXFP8 attention by cyanguwa · Pull Request #3173 · NVIDIA/TransformerEngine

cyanguwa · 2026-07-02T22:49:37Z

Description

There are Inf/NaN issues with MXFP8 attention when running with cuDNN 9.23.0 and 9.23.1. They are fixed in cuDNN 9.23.2.

https://docs.nvidia.com/deeplearning/cudnn/backend/latest/release-notes.html#cudnn-9-23-2
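
To check whether an environment is affected, the installed cuDNN version can be compared against the two releases named above. The sketch below assumes the version is available as the integer that cuDNN 9 reports (`major * 10000 + minor * 100 + patch`, so 9.23.1 is `92301`). The helper names are hypothetical and are not part of Transformer Engine.

```python
# Releases this PR reports as producing Inf/NaN with MXFP8 attention.
AFFECTED_CUDNN_VERSIONS = ((9, 23, 0), (9, 23, 1))


def decode_cudnn9_version(encoded: int) -> tuple:
    """Split a cuDNN 9 style integer (e.g. 92301) into (major, minor, patch)."""
    major, rest = divmod(encoded, 10000)
    minor, patch = divmod(rest, 100)
    return (major, minor, patch)


def is_affected(encoded: int) -> bool:
    """True if the encoded version is one of the two affected releases."""
    return decode_cudnn9_version(encoded) in AFFECTED_CUDNN_VERSIONS
```

Under this encoding assumption, `is_affected(92301)` is true and `is_affected(92302)` is false.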

Type of change

Documentation change (change only to the documentation, either a fix or a new content)
Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Infra/Build change
Code refactoring

Changes

See Description.

Checklist:

I have read and followed the contributing guidelines
The functionality is complete
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes

Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>


greptile-apps · 2026-07-02T22:52:34Z

Greptile Summary

This PR adds a targeted version guard to disable MXFP8 FusedAttention when the detected cuDNN version is 9.23.0 or 9.23.1, both of which have known bugs causing Inf/NaN results in SDPA. The fix is minimal and scoped to the MXFP8 branch of the version-gating logic in get_attention_backend.

The cudnn_version in ((9, 23, 0), (9, 23, 1)) check is inserted between the existing < (9, 21, 0) lower-bound guard and the qkv_format == "thd" check, correctly allowing 9.21.x–9.22.x and 9.23.2+ through while blocking the two buggy point releases.
Affected users silently fall back to an alternate attention backend with no warning-level log message, making it harder to diagnose if the fallback causes unexpected performance or accuracy differences.
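
The gating order described above can be sketched as a small standalone function. This is a minimal illustration based on this summary, not the code in utils.py: the function name `mxfp8_fused_attention_allowed`, its parameters, and the returned reason strings are hypothetical, and the compute capability is assumed to be a `(major, minor)` tuple with sm100 written as `(10, 0)`.

```python
from typing import Optional, Tuple

# cuDNN point releases reported in this PR to produce Inf/NaN with MXFP8 attention.
BUGGY_CUDNN_VERSIONS = ((9, 23, 0), (9, 23, 1))


def mxfp8_fused_attention_allowed(
    compute_capability: Tuple[int, int],
    fp8_mha: bool,
    cudnn_version: Tuple[int, int, int],
    qkv_format: str,
) -> Tuple[bool, Optional[str]]:
    """Return (allowed, reason); reason is None when FusedAttention stays enabled."""
    if compute_capability < (10, 0):
        return False, "compute capability below sm100"
    if fp8_mha:
        return False, "fp8_mha is set"
    if cudnn_version < (9, 21, 0):
        return False, "cuDNN older than 9.21.0"
    # The new guard: an exact match on the two buggy point releases only.
    if cudnn_version in BUGGY_CUDNN_VERSIONS:
        return False, "cuDNN 9.23.0/9.23.1 have a known Inf/NaN bug"
    if qkv_format == "thd":
        return False, "qkv_format is thd"
    return True, None
```

Under these assumptions, `(9, 22, 0)` and `(9, 23, 2)` pass the version checks, while `(9, 23, 0)` and `(9, 23, 1)` do not. Because Python compares tuples element by element, the lower-bound check and the membership check need no string parsing.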

Confidence Score: 4/5

Safe to merge — the guard correctly blocks the two known-buggy cuDNN point releases and does not affect any other version.

The version tuple check and its placement in the conditional chain are correct. The only noteworthy gap is that the disable is logged at debug level, so users running a buggy cuDNN build get no visible indication they have fallen back to a slower or different attention backend.

transformer_engine/pytorch/attention/dot_product_attention/utils.py — specifically the log level used for the new disable message at line 634.
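
One way to make the fallback visible is to log this specific case at warning level. The sketch below uses Python's standard `logging` module. The logger name and the helper `report_disabled_backend` are hypothetical and do not reflect how utils.py is written.

```python
import logging

logger = logging.getLogger("attention_backend_sketch")  # hypothetical logger name

BUGGY_CUDNN_VERSIONS = ((9, 23, 0), (9, 23, 1))


def report_disabled_backend(cudnn_version):
    """Log why FusedAttention was disabled and return the log level used."""
    version_str = ".".join(str(part) for part in cudnn_version)
    if cudnn_version in BUGGY_CUDNN_VERSIONS:
        # Known-buggy release: surface the fallback so users can see it.
        level = logging.WARNING
        message = (
            "Disabling FusedAttention for MXFP8: cuDNN %s has a known "
            "Inf/NaN bug; upgrade to 9.23.2 or later"
        )
    else:
        # Other disable reasons stay at debug level, as in the rest of the block.
        level = logging.DEBUG
        message = "Disabling FusedAttention for MXFP8 with cuDNN %s"
    logger.log(level, message, version_str)
    return level
```

With the default logging configuration, the warning-level message is printed while debug-level messages are suppressed, which is the difference in visibility this review point is about.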

Important Files Changed

Filename	Overview
transformer_engine/pytorch/attention/dot_product_attention/utils.py	Adds a cuDNN version guard to disable MXFP8 FusedAttention on 9.23.0 and 9.23.1, which have known Inf/NaN bugs. Logic and placement are correct; the disable message uses debug-level logging like the rest of the block, but this is a silent correctness issue rather than a capability gap.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[fp8_recipe.mxfp8 is True] --> B{device_compute_capability < sm100?}
    B -- Yes --> Z1[Disable FusedAttention]
    B -- No --> C{fp8_recipe.fp8_mha?}
    C -- Yes --> Z2[Disable FusedAttention]
    C -- No --> D{cudnn_version < 9.21.0?}
    D -- Yes --> Z3[Disable FusedAttention]
    D -- No --> E{cudnn_version == 9.23.0 or 9.23.1?}
    E -- Yes --> Z4["Disable FusedAttention (NEW — known Inf/NaN bug)"]
    E -- No --> F{qkv_format == 'thd'?}
    F -- Yes --> Z5[Disable FusedAttention]
    F -- No --> G[FusedAttention Enabled]


Reviews (1): Last reviewed commit: "[pre-commit.ci] auto fixes from pre-comm..."

cyanguwa · 2026-07-02T23:10:19Z

CI pipelines: 56671690 for cuDNN 9.23.0, 56671787 for 9.23.1, and 56672233 for 9.23.2. Nightly CI uses 9.24, which confirms that 9.24 works.

Local testing confirms the fix in 9.23.2.

disable 9.23.0/.1 for mxfp8 attention

7a94efa

Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>

cyanguwa added the 2.17 label Jul 2, 2026

cyanguwa requested a review from KshitijLakhani July 2, 2026 22:50

cyanguwa changed the title ~~disable 9.23.0/.1 for mxfp8 attention~~ Disable 9.23.0/.1 for MXFP8 attention Jul 2, 2026

[pre-commit.ci] auto fixes from pre-commit.com hooks

ad4cbef

for more information, see https://pre-commit.ci

cyanguwa changed the title ~~Disable 9.23.0/.1 for MXFP8 attention~~ Disable 9.23.0/9.23.1 for MXFP8 attention Jul 2, 2026

cyanguwa changed the title ~~Disable 9.23.0/9.23.1 for MXFP8 attention~~ Disable cuDNN 9.23.0/9.23.1 for MXFP8 attention Jul 2, 2026

greptile-apps Bot reviewed Jul 2, 2026

cyanguwa wants to merge 2 commits into NVIDIA:main from cyanguwa:disable_9.23
