gpu-programming

Here are 63 public repositories matching this topic...

exaloop / codon

A high-performance, zero-overhead, extensible Python compiler with built-in NumPy support

python compiler high-performance numpy llvm parallel-programming gpu-programming

Updated May 21, 2026
Python

plasma-umass / scalene

Scalene: a high-performance, high-precision CPU, GPU, and memory profiler for Python with AI-powered optimization proposals

python cpu profiler gpu performance-analysis memory-allocation profiling cpu-profiling memory-consumption gpu-programming python-profilers scalene profiles-memory performance-cpu

Updated May 16, 2026
Python

geomstats / geomstats

Computations and statistics on manifolds with geometric structures.

machine-learning statistics deep-learning geometry neural-networks manifold lie-groups geodesic gpu-programming riemannian-geometry

Updated May 14, 2026
Python

nabla-ml / nabla

Nabla: High-Performance Scientific Computing

python machine-learning mojo amd pytorch autograd nvidia scientific-computing tensor autodiff gpu-programming jax apple-silicon

Updated Mar 6, 2026
Python

lucidrains / triton-transformer

Implementation of a Transformer, but completely in Triton

deep-learning transformers artificial-intelligence attention-mechanism gpu-programming

Updated Apr 5, 2022
Python

r-aristov / simba-ps

Fast deterministic all-Python Lennard-Jones particle simulator that utilizes Numba for GPU-accelerated computation.

simulation physics cuda pygame particles numba lennard-jones gpu-programming

Updated Jul 15, 2023
Python

andi611 / Apriori-and-Eclat-Frequent-Itemset-Mining

Python implementation of the Apriori and Eclat algorithms, two of the best-known basic algorithms for mining frequent item sets in a set of transactions.

python data-mining gpu gcc transaction cuda plot transactions gpu-acceleration apriori frequent-itemset-mining data-mining-algorithms frequent-pattern-mining apriori-algorithm frequent-itemsets pycuda gpu-programming eclat eclat-algorithm

Updated Dec 12, 2018
Python

alexfromapex / tensorexperiments

Boilerplate for GPU-accelerated TensorFlow and PyTorch code on an M1 MacBook

python tensorflow gpu python3 pytorch gpu-acceleration gpu-computing m1 gpu-programming tensorflow2 m1-mac

Updated May 27, 2022
Python

bfGraph / STGraph

🌟 Compiler for vertex-centric programming of GNNs/TGNNs

python graphs gpu-programming graph-neural-networks temporal-graphs temporal-graph-neural-networks

Updated Nov 12, 2024
Python

WenqiJiang / Convolution-Neural-Network-by-pyCUDA

pyCUDA implementation of forward propagation for Convolutional Neural Networks

gpu parallel-computing cuda cnn gpu-acceleration pycuda gpu-programming

Updated Jan 4, 2019
Python

MolSSI-Education / gpu_programming_beginner

Fundamentals of heterogeneous parallel programming with CUDA C/C++ at the beginner level.

gpu-acceleration cuda-toolkit nvidia-cuda nvidia-gpu gpu-programming gpu-profiler cuda-programming

Updated Mar 18, 2026
Python

coderonion / cuda-beginner-course-python-version

Companion code for the bilibili video series "Introduction to CUDA 12.x Parallel Programming (Python Edition)"

python rust cpp gpu cuda cublas nvidia cudnn nvcc cupy parallel-programming gpu-programming cuda-programming

Updated Mar 18, 2024
Python

AICL-Lab / diy-flash-attention

Learn Triton by building FlashAttention from scratch — V2 kernels, persistent threads, mask DSL, profiling toolkit, bilingual docs

tutorial cuda pytorch triton educational attention-mechanism gpu-programming forward-pass flash-attention kernel-optimization online-softmax

Updated May 22, 2026
Python

arpankapoor / pycuda-vgg16

VGG16 inference implementation using TensorFlow, NumPy, and PyCUDA

deep-learning tensorflow cuda gpu-programming

Updated Jun 17, 2022
Python

dipta007 / gpu-wait

A package to run commands when GPU resources are available

machine-learning gpu ml gpu-programming

Updated Dec 27, 2024
Python

mihi-r / numba_timer

A helper package to easily time Numba CUDA GPU events ⌛

gpu-programming parellel-programming

Updated Sep 25, 2020
Python

rbga / A51-Realtime-AI-Object-Detection-with-Pyglet-Powered-UI

Real-time object detection app using YOLOv5/YOLOv8 with custom UI built from scratch using Pyglet & OpenGL. UI animations made in Adobe After Effects, rendered as GIFs, and integrated via uxElements.py. Multi-core processing enables live capture, detection, and display with low latency. Uses Open Images v7 dataset. Train mode is WIP.

Updated May 17, 2025
Python

aepokcorporation / gpu-setup-tool

Simplify GPU Setup: Drivers, CUDA, Frameworks, and more!

aws azure gpu cuda nvidia drivers cudnn gpu-programming gpu-support

Updated Jan 24, 2025
Python

AIComputing101 / reinforcement-learning-101

An opinionated, end‑to‑end tutorial project for learning Reinforcement Learning (RL) from first principles to deployment. No notebooks. Everything is an explicit, inspectable Python script you can diff, profile, containerize, and ship.

reinforcement-learning docker-container distributed-computing deep-reinforcement-learning q-learning reinforcement-learning-algorithms gpu-programming reinforcement-learning-agent rlhf

Updated Oct 6, 2025
Python

Zilize / AutoGraph

A Taichi component for automatically compiling and launching compute graphs.

gpu taichi gpu-programming just-in-time ahead-of-time

Updated Jun 19, 2023
Python
