Efficient in-memory representation for ONNX, in Python
Updated Jun 1, 2026 - Python
TinyML & Edge AI: On-device inference, model quantization, embedded ML, ultra-low-power AI for microcontrollers and IoT devices.
Code of the ICASSP 2022 paper "Gradient Variance Loss for Structure Enhanced Super-Resolution"
Ralph Loop Optimizer: an AI-driven framework that turns any evaluatable codebase into a self-improving optimization loop for strategies, models, prompts, and workflows
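This repository contains a complete deep learning application development workflow, using classic handwritten character recognition as an example and built on the LeNet network. The inference part uses the torch, onnxruntime, and openvino frameworks💖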
Vision-language model example code.
ptdeco is a library for model optimization by matrix decomposition built on top of PyTorch
Minimal Reproducibility Study of (https://arxiv.org/abs/1911.05248). Experiments with Compression of Deep Neural Networks
DA2Lite is an automated model compression toolkit for PyTorch.
Mobile AI: iOS CoreML, Android TFLite, on-device inference, ONNX, TensorRT, and ML deployment for smartphones.
Convert and optimize BirdNET models for ONNX Runtime inference on GPUs, CPUs, and embedded devices
40x faster AI inference: ONNX to TensorRT optimization with FP16/INT8 quantization, multi-GPU support, and deployment
Compares different pretrained object classification models under per-layer and per-channel quantization using PyTorch
Pytorch-TurboQuant: High-performance weight-only quantization for PyTorch. Optimized for fast inference and reduced memory footprint.
Model quantization techniques for efficient LLM inference. Experiments with INT8, INT4, and mixed-precision quantization.
This project presents a robust machine learning solution to predict KCET (Karnataka Common Entrance Test) ranks based on marks and provide personalized college recommendations. The system aids students in estimating their competitive rank prior to official results and assists in selecting suitable colleges based on predicted ranks and branch.
First thermal super-resolution system to achieve 34.2 dB PSNR at 229+ FPS using novel IMDN architecture with specialized thermal adaptations. Features breakthrough RGB→thermal transfer learning, thermal-aware multi-component loss, and real-time inference (2x: 270.6 FPS, 3x: 256.1 FPS, 4x: 250.9 FPS). Production-ready PyTorch + CUDA implementation
A hardware-agnostic profiler for tracking the FLOPs and BOPs of Machine and Deep Learning algorithms.
Tools and experiments for converting Human Activity Recognition (HAR) models to TensorFlow Lite for efficient on-device inference on mobile and wearable devices.
This project is built to detect spam messages using a Long Short-Term Memory (LSTM) model combined with Word2Vec as the word embedding technique. The model has been optimized using Grid Search, achieving a best accuracy of 95.65%.