Repositories list
33 repositories
- TLLM_QMM extracts the quantized-kernel implementation from NVIDIA's TensorRT-LLM, removes the NVInfer dependency, and exposes an easy-to-use PyTorch module. We modified the dequantization and weight preprocessing to align with popular quantization algorithms such as AWQ and GPTQ, and combined them with new FP8 quantization.
- griffith (Public): A React-based web video player
- rucene (Public)
- 🎆 A well-designed local image and video selector for Android
- redis-shard (Public)
- zetta-client-go (Public)
- SERank (Public): An efficient and effective learning to rank algorithm by mining information across ranking candidates. This repository contains the tensorflow implementation of SERank model. The code is developed based on TF-Ranking.
- chaika (Public)
- promate (Public): Graphite On VictoriaMetrics
- cuBERT (Public): Fast implementation of BERT inference directly on NVIDIA (CUDA, CUBLAS) and Intel MKL
- mirror (Public)
- kids (Public)
- zetta-client-java (Public)
- cmdb (Public)
- presto-connectors (Public)
- tache (Public)
- SugarAdapter (Public)
- RxLifecycle (Public)
- AndroidGodEye (Public)
- hive (Public)
- zhihu-rxjava-meetup (Public)
- protobuf (Public)
- phabricator (Public)
- libphutil (Public)
- arcanist (Public)
- puppet-cdh (Public)