Remove cuDNN frontend submodule#3169
Conversation
Use nvidia-cudnn-frontend for the C++ headers and Python bindings. Keep the cuDNN library discovery helper in tree and update common, PyTorch, JAX, packaging, and test build paths. Signed-off-by: Vladimir Cherepanov <vcherepanov@nvidia.com>
Signed-off-by: Vladimir Cherepanov <vcherepanov@nvidia.com>
Greptile SummaryThis PR removes the
Confidence Score: 4/5Safe to merge; the build refactor is well-structured and the Python helper validates the header before returning the path, making misconfiguration detectable early. The newly-copied cuDNN.cmake has two minor issues inherited from the submodule: find_package_handle_standard_args uses LIBRARY as its package name (misleading CMake output but no functional impact), and there is no else branch for cuDNN major versions outside 8 and 9 (silent underlink risk for any future major release). Both are non-blocking for the current supported cuDNN versions. The rest of the change — include-path discovery, build requirement wiring, and Python import simplification — is correct and well-guarded. transformer_engine/common/cmake/cuDNN.cmake — the two issues described in inline comments are confined to this new file. Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[pip install nvidia-cudnn-frontend>=1.25.0] --> B[site-packages/include/cudnn_frontend.h]
A --> C[site-packages/cudnn/ Python bindings]
B --> D[cudnn_frontend_include_path
build_tools/utils.py]
D -->|validate header exists| D
D --> E[setup.py
-DCUDNN_FRONTEND_INCLUDE_DIR=...]
D --> F[build_tools/jax.py
include_dirs.append]
E --> G[transformer_engine/common/CMakeLists.txt
CUDNN_FRONTEND_INCLUDE_DIR check + include]
G --> H[transformer_engine/common/cmake/cuDNN.cmake
finds libcudnn.so]
G --> I[target_include_directories transformer_engine
CUDNN_FRONTEND_INCLUDE_DIR]
C --> J[PyTorch flex_attention.py
importlib.import_module cudnn]
C --> K[JAX flex_attention.py
importlib.import_module cudnn]
H --> L[CUDNN::cudnn_all target
linked into libtransformer_engine.so]
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
flowchart TD
A[pip install nvidia-cudnn-frontend>=1.25.0] --> B[site-packages/include/cudnn_frontend.h]
A --> C[site-packages/cudnn/ Python bindings]
B --> D[cudnn_frontend_include_path
build_tools/utils.py]
D -->|validate header exists| D
D --> E[setup.py
-DCUDNN_FRONTEND_INCLUDE_DIR=...]
D --> F[build_tools/jax.py
include_dirs.append]
E --> G[transformer_engine/common/CMakeLists.txt
CUDNN_FRONTEND_INCLUDE_DIR check + include]
G --> H[transformer_engine/common/cmake/cuDNN.cmake
finds libcudnn.so]
G --> I[target_include_directories transformer_engine
CUDNN_FRONTEND_INCLUDE_DIR]
C --> J[PyTorch flex_attention.py
importlib.import_module cudnn]
C --> K[JAX flex_attention.py
importlib.import_module cudnn]
H --> L[CUDNN::cudnn_all target
linked into libtransformer_engine.so]
Reviews (1): Last reviewed commit: "Fix JAX isolated build requirements" | Re-trigger Greptile |
| find_package_handle_standard_args( | ||
| LIBRARY REQUIRED_VARS | ||
| CUDNN_INCLUDE_DIR ${CUDNN_LIBRARY_VAR} | ||
| ) |
There was a problem hiding this comment.
Wrong package name in
find_package_handle_standard_args
The first argument should be the logical package name (e.g., cuDNN), not LIBRARY. As written, CMake sets LIBRARY_FOUND instead of CUDNN_FOUND and prints "LIBRARY found" in its status output, which is misleading. Because CUDNN_FOUND is then set explicitly below, there is no functional impact — but the status/warning messages during cmake configuration will be confusing, and if this call ever gets upgraded to use the standard-args-produced _FOUND variable, the wrong variable will be in scope.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
| if(CUDNN_MAJOR_VERSION EQUAL 8) | ||
| find_cudnn_library(cudnn_adv_infer) | ||
| find_cudnn_library(cudnn_adv_train) | ||
| find_cudnn_library(cudnn_cnn_infer) | ||
| find_cudnn_library(cudnn_cnn_train) | ||
| find_cudnn_library(cudnn_ops_infer) | ||
| find_cudnn_library(cudnn_ops_train) | ||
|
|
||
| target_link_libraries( | ||
| CUDNN::cudnn_all | ||
| INTERFACE | ||
| CUDNN::cudnn_adv_train | ||
| CUDNN::cudnn_ops_train | ||
| CUDNN::cudnn_cnn_train | ||
| CUDNN::cudnn_adv_infer | ||
| CUDNN::cudnn_cnn_infer | ||
| CUDNN::cudnn_ops_infer | ||
| ) | ||
| elseif(CUDNN_MAJOR_VERSION EQUAL 9) | ||
| find_cudnn_library(cudnn_graph) | ||
| find_cudnn_library(cudnn_engines_runtime_compiled) | ||
| find_cudnn_library(cudnn_ops OPTIONAL) | ||
| find_cudnn_library(cudnn_cnn OPTIONAL) | ||
| find_cudnn_library(cudnn_adv OPTIONAL) | ||
| find_cudnn_library(cudnn_engines_precompiled OPTIONAL) | ||
| find_cudnn_library(cudnn_heuristic OPTIONAL) | ||
| find_cudnn_library(cudnn_ext OPTIONAL) | ||
|
|
||
| target_link_libraries( | ||
| CUDNN::cudnn_all | ||
| INTERFACE | ||
| $<$<BOOL:${CUDNN_STATIC}>:-Wl,--whole-archive> | ||
| CUDNN::cudnn_graph | ||
| CUDNN::cudnn_engines_runtime_compiled | ||
| CUDNN::cudnn_ops | ||
| CUDNN::cudnn_cnn | ||
| CUDNN::cudnn_adv | ||
| $<$<NOT:$<BOOL:${CUDNN_SKIP_PRECOMPILED_LINK}>>:CUDNN::cudnn_engines_precompiled> | ||
| CUDNN::cudnn_heuristic | ||
| $<$<BOOL:${CUDNN_STATIC}>:-Wl,--no-whole-archive> | ||
| $<$<BOOL:${CUDNN_STATIC}>:CUDA::cublasLt_static $<IF:$<TARGET_EXISTS:CUDA::nvrtc_static>,CUDA::nvrtc_static,CUDA::nvrtc> ZLIB::ZLIB> | ||
| ) | ||
| endif() |
There was a problem hiding this comment.
No handling for cuDNN major versions outside 8 and 9
If CUDNN_MAJOR_VERSION is anything other than 8 or 9 (e.g., a future cuDNN 10+), neither branch runs. CUDNN::cudnn_all will link only against CUDNN::cudnn for the non-static case and will have no sub-libraries linked. This would silently produce an underlinked target and likely cause symbol-resolution failures at load time. Adding an else() branch with a message(WARNING ...) or FATAL_ERROR would make the failure loud and actionable rather than silent.
Description
Some time ago TE started using Python bindings of the cuDNN FE, packaged as nvidia-cudnn-frontend, while C++ code in TE kept using cuDNN FE from the git submodule. This change removes the git submodule, making all of the TE use nvidia-cudnn-frontend package exclusively.
One consequence of this change is that now installing different version of nvidia-cudnn-frontend after installing TE will require rebuilding TE.
Fixes # (issue)
Type of change
Changes
Please list the changes introduced in this PR:
Checklist: