Remove cuDNN frontend submodule #3169

Open
vcherepanov-nv wants to merge 2 commits into NVIDIA:main from vcherepanov-nv:no-cudnn-submodule

@vcherepanov-nv

Description

Some time ago, TE started using the Python bindings of the cuDNN frontend (FE), packaged as nvidia-cudnn-frontend, while the C++ code in TE kept using the cuDNN FE from the git submodule. This change removes the git submodule so that all of TE uses the nvidia-cudnn-frontend package exclusively.

One consequence of this change is that installing a different version of nvidia-cudnn-frontend after installing TE now requires rebuilding TE.
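
As a rough illustration of that consequence, a mismatch between the frontend version TE was built against and the one currently installed could be detected as sketched below. This is only a sketch: the PR does not describe such a check, the helper names `installed_version` and `frontend_mismatch` are hypothetical, and the assumption that the build-time version string is recorded somewhere is mine.

```python
from importlib import metadata

# Distribution name taken from the PR description.
DIST_NAME = "nvidia-cudnn-frontend"


def installed_version(dist_name=DIST_NAME):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def frontend_mismatch(build_version, current_version):
    """Return True when the installed frontend differs from the build-time one.

    `build_version` is assumed to have been recorded when TE was compiled
    (hypothetical; the PR does not say that TE stores it).
    """
    return current_version is None or current_version != build_version
```

A caller could compare `installed_version()` against the recorded build-time string and ask the user to rebuild TE when `frontend_mismatch(...)` returns True.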

Fixes # (issue)

Type of change

  • Documentation change (change only to the documentation, either a fix or a new content)
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Infra/Build change
  • Code refactoring

Changes

Please list the changes introduced in this PR:

  • Remove cudnn-frontend git submodule
  • Update TE to use cuDNN FE from the pip-installable package

Checklist:

  • I have read and followed the contributing guidelines
  • The functionality is complete
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Use nvidia-cudnn-frontend for the C++ headers and Python bindings. Keep the cuDNN library discovery helper in tree and update common, PyTorch, JAX, packaging, and test build paths.

Signed-off-by: Vladimir Cherepanov <vcherepanov@nvidia.com>
Signed-off-by: Vladimir Cherepanov <vcherepanov@nvidia.com>
@greptile-apps

greptile-apps Bot commented Jul 2, 2026

Greptile Summary

This PR removes the 3rdparty/cudnn-frontend git submodule and replaces every reference to it with the pip-installable nvidia-cudnn-frontend>=1.25.0 package, unifying how both C++ headers and the Python cudnn bindings are sourced.

  • A new cudnn_frontend_include_path() Python helper locates the C++ headers from the installed pip package via importlib.metadata.distribution(...).locate_file("include") and validates that cudnn_frontend.h is present before returning the path; this path is injected into CMake as -DCUDNN_FRONTEND_INCLUDE_DIR and into the JAX extension's include_dirs.
  • A copy of cuDNN.cmake is brought in from the removed submodule as transformer_engine/common/cmake/cuDNN.cmake and added to MANIFEST.in so it ships with sdists; both the common-library and C++-test CMakeLists are updated to reference the new local path.
  • nvidia-cudnn-frontend>=1.25.0 is added to pyproject.toml build requirements, and to install_requirements() for both the PyTorch and JAX backends, so pip ensures the package is present at build- and run-time.
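
The header-discovery step in the first bullet can be sketched roughly as follows. This is not the code from build_tools/utils.py: it follows only what the summary states (`importlib.metadata.distribution(...).locate_file("include")` plus a check that cudnn_frontend.h exists). The split into a separate `validate_include_dir` helper, the exception types, and the error messages are assumptions made for illustration.

```python
from importlib import metadata
from pathlib import Path


def validate_include_dir(include_dir):
    """Return include_dir as a Path if it contains cudnn_frontend.h, else raise."""
    include_dir = Path(include_dir)
    header = include_dir / "cudnn_frontend.h"
    if not header.is_file():
        raise FileNotFoundError(f"cudnn_frontend.h not found in {include_dir}")
    return include_dir


def cudnn_frontend_include_path(dist_name="nvidia-cudnn-frontend"):
    """Locate the C++ headers shipped by the installed pip package."""
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError as exc:
        # Error type and wording are illustrative assumptions.
        raise RuntimeError(f"{dist_name} is not installed") from exc
    # locate_file resolves a path relative to the distribution's install root,
    # which for a regular install is site-packages.
    return validate_include_dir(dist.locate_file("include"))
```

The returned path would then be formatted into the CMake argument named in the summary, for example `f"-DCUDNN_FRONTEND_INCLUDE_DIR={cudnn_frontend_include_path()}"`, or appended to the JAX extension's `include_dirs`.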

Confidence Score: 4/5

Safe to merge; the build refactor is well-structured and the Python helper validates the header before returning the path, making misconfiguration detectable early.

The newly-copied cuDNN.cmake has two minor issues inherited from the submodule: find_package_handle_standard_args uses LIBRARY as its package name (misleading CMake output but no functional impact), and there is no else branch for cuDNN major versions outside 8 and 9 (silent underlink risk for any future major release). Both are non-blocking for the current supported cuDNN versions. The rest of the change — include-path discovery, build requirement wiring, and Python import simplification — is correct and well-guarded.

transformer_engine/common/cmake/cuDNN.cmake — the two issues described in inline comments are confined to this new file.

Important Files Changed

| Filename | Overview |
|---|---|
| transformer_engine/common/cmake/cuDNN.cmake | New file copied from the removed submodule; contains two issues: wrong first argument (LIBRARY) to find_package_handle_standard_args, and no fallthrough for cuDNN versions outside 8/9. |
| build_tools/utils.py | Adds cudnn_frontend_include_path() which correctly uses locate_file("include") to find site-packages/include/cudnn_frontend.h and validates the header exists before returning. |
| transformer_engine/common/CMakeLists.txt | Replaces the hardcoded submodule path with a CUDNN_FRONTEND_INCLUDE_DIR variable (injected from setup.py) and switches to the new local cuDNN.cmake. |
| tests/cpp/CMakeLists.txt | Updates to include the local cuDNN.cmake instead of the removed submodule path; C++ tests do not use cudnn_frontend.h directly so no include-path gap. |
| setup.py | Passes -DCUDNN_FRONTEND_INCLUDE_DIR from the new helper function to CMake for the common library build. |
| build_tools/jax.py | Replaces the multi-step submodule path search with a single cudnn_frontend_include_path() call and adds nvidia-cudnn-frontend>=1.25.0 as a runtime requirement. |
| build_tools/pytorch.py | Adds nvidia-cudnn-frontend>=1.25.0 as a runtime install requirement for the PyTorch backend. |
| transformer_engine/jax/cpp_extensions/flex_attention.py | Removes the vendored sys.path injection for the cudnn module; now relies solely on the installed nvidia-cudnn-frontend package. |
| transformer_engine/pytorch/attention/dot_product_attention/flex_attention.py | Removes the dual-path (vendored vs installed) cudnn import logic; now wraps importlib.import_module in a try/except with a helpful error message. |
| pyproject.toml | Adds nvidia-cudnn-frontend>=1.25.0 to build-system requires so pip installs it before invoking setup.py. |
| MANIFEST.in | Adds the new cmake directory to source distributions so cuDNN.cmake is included in sdists. |
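
The import pattern described for the PyTorch flex_attention.py (wrapping `importlib.import_module` in a try/except with a helpful error message) might look roughly like the sketch below. The function name `import_cudnn_frontend` and the wording of the message are assumptions; only the module name `cudnn` and the package requirement come from the summary.

```python
import importlib


def import_cudnn_frontend(module_name="cudnn"):
    """Import the cuDNN frontend Python bindings, or fail with an install hint."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Message text is illustrative, not the wording used in the PR.
        raise ImportError(
            f"Could not import '{module_name}'. Install the bindings with "
            "pip install \"nvidia-cudnn-frontend>=1.25.0\"."
        ) from exc
```

Chaining with `from exc` keeps the original traceback, so a broken install (as opposed to a missing one) stays diagnosable.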

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["pip install nvidia-cudnn-frontend>=1.25.0"] --> B["site-packages/include/cudnn_frontend.h"]
    A --> C["site-packages/cudnn/ Python bindings"]

    B --> D["cudnn_frontend_include_path<br/>build_tools/utils.py"]
    D -->|validate header exists| D
    D --> E["setup.py<br/>-DCUDNN_FRONTEND_INCLUDE_DIR=..."]
    D --> F["build_tools/jax.py<br/>include_dirs.append"]

    E --> G["transformer_engine/common/CMakeLists.txt<br/>CUDNN_FRONTEND_INCLUDE_DIR check + include"]
    G --> H["transformer_engine/common/cmake/cuDNN.cmake<br/>finds libcudnn.so"]
    G --> I["target_include_directories transformer_engine<br/>CUDNN_FRONTEND_INCLUDE_DIR"]

    C --> J["PyTorch flex_attention.py<br/>importlib.import_module cudnn"]
    C --> K["JAX flex_attention.py<br/>importlib.import_module cudnn"]

    H --> L["CUDNN::cudnn_all target<br/>linked into libtransformer_engine.so"]

Reviews (1): Last reviewed commit: "Fix JAX isolated build requirements"

Comment on lines +85 to +88
find_package_handle_standard_args(
    LIBRARY REQUIRED_VARS
    CUDNN_INCLUDE_DIR ${CUDNN_LIBRARY_VAR}
)

P2 Wrong package name in find_package_handle_standard_args

The first argument should be the logical package name (e.g., cuDNN), not LIBRARY. As written, CMake sets LIBRARY_FOUND instead of CUDNN_FOUND and prints "LIBRARY found" in its status output, which is misleading. Because CUDNN_FOUND is then set explicitly below, there is no functional impact — but the status/warning messages during cmake configuration will be confusing, and if this call ever gets upgraded to use the standard-args-produced _FOUND variable, the wrong variable will be in scope.

Comment on lines +121 to +163
if(CUDNN_MAJOR_VERSION EQUAL 8)
    find_cudnn_library(cudnn_adv_infer)
    find_cudnn_library(cudnn_adv_train)
    find_cudnn_library(cudnn_cnn_infer)
    find_cudnn_library(cudnn_cnn_train)
    find_cudnn_library(cudnn_ops_infer)
    find_cudnn_library(cudnn_ops_train)

    target_link_libraries(
        CUDNN::cudnn_all
        INTERFACE
        CUDNN::cudnn_adv_train
        CUDNN::cudnn_ops_train
        CUDNN::cudnn_cnn_train
        CUDNN::cudnn_adv_infer
        CUDNN::cudnn_cnn_infer
        CUDNN::cudnn_ops_infer
    )
elseif(CUDNN_MAJOR_VERSION EQUAL 9)
    find_cudnn_library(cudnn_graph)
    find_cudnn_library(cudnn_engines_runtime_compiled)
    find_cudnn_library(cudnn_ops OPTIONAL)
    find_cudnn_library(cudnn_cnn OPTIONAL)
    find_cudnn_library(cudnn_adv OPTIONAL)
    find_cudnn_library(cudnn_engines_precompiled OPTIONAL)
    find_cudnn_library(cudnn_heuristic OPTIONAL)
    find_cudnn_library(cudnn_ext OPTIONAL)

    target_link_libraries(
        CUDNN::cudnn_all
        INTERFACE
        $<$<BOOL:${CUDNN_STATIC}>:-Wl,--whole-archive>
        CUDNN::cudnn_graph
        CUDNN::cudnn_engines_runtime_compiled
        CUDNN::cudnn_ops
        CUDNN::cudnn_cnn
        CUDNN::cudnn_adv
        $<$<NOT:$<BOOL:${CUDNN_SKIP_PRECOMPILED_LINK}>>:CUDNN::cudnn_engines_precompiled>
        CUDNN::cudnn_heuristic
        $<$<BOOL:${CUDNN_STATIC}>:-Wl,--no-whole-archive>
        $<$<BOOL:${CUDNN_STATIC}>:CUDA::cublasLt_static $<IF:$<TARGET_EXISTS:CUDA::nvrtc_static>,CUDA::nvrtc_static,CUDA::nvrtc> ZLIB::ZLIB>
    )
endif()

P2 No handling for cuDNN major versions outside 8 and 9

If CUDNN_MAJOR_VERSION is anything other than 8 or 9 (e.g., a future cuDNN 10+), neither branch runs. CUDNN::cudnn_all will link only against CUDNN::cudnn for the non-static case and will have no sub-libraries linked. This would silently produce an underlinked target and likely cause symbol-resolution failures at load time. Adding an else() branch with a message(WARNING ...) or FATAL_ERROR would make the failure loud and actionable rather than silent.
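
The "fail loudly on an unknown major version" behaviour suggested here can be modeled in a few lines. The sketch below is written in Python rather than CMake purely to illustrate the control flow; it is not a proposed patch to cuDNN.cmake. The library names are the ones passed to find_cudnn_library in the quoted snippet, while the function name `sublibraries_for` and the error text are hypothetical.

```python
# Sub-libraries looked up per cuDNN major version, as listed in the quoted snippet.
CUDNN_SUBLIBRARIES = {
    8: [
        "cudnn_adv_infer", "cudnn_adv_train",
        "cudnn_cnn_infer", "cudnn_cnn_train",
        "cudnn_ops_infer", "cudnn_ops_train",
    ],
    9: [
        "cudnn_graph", "cudnn_engines_runtime_compiled",
        "cudnn_ops", "cudnn_cnn", "cudnn_adv",
        "cudnn_engines_precompiled", "cudnn_heuristic", "cudnn_ext",
    ],
}


def sublibraries_for(major_version):
    """Return the sub-library names for a major version, raising on unknown ones."""
    try:
        return CUDNN_SUBLIBRARIES[major_version]
    except KeyError:
        # The equivalent of an else() branch with message(FATAL_ERROR ...).
        supported = sorted(CUDNN_SUBLIBRARIES)
        raise ValueError(
            f"Unsupported cuDNN major version {major_version}; expected one of {supported}"
        ) from None
```

The point of the explicit error is that an unrecognized version stops configuration immediately instead of yielding a target with no sub-libraries attached.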

@vcherepanov-nv vcherepanov-nv requested a review from timmoon10 July 2, 2026 18:04
@cyanguwa cyanguwa added the 2.18 label Jul 2, 2026