gty111/README.md
  • Ph.D. student at Sun Yat-sen University

  • LLM inference, HPC, simulators, GPU, architecture

  • Visit my personal website

PRs for Projects

  • xDiT: Fix parallel vae link
  • DistVAE: Fix batch dimension link
  • vLLM: [Benchmark] Refactor sample_requests in benchmark_throughput link
  • vLLM: [Bugfix] fix automatic prefix args and add log info link
  • vLLM: [Minor Fix] Fix comments in benchmark_serving link
  • vLLM: [Minor Fix] Remove unused code in benchmark_prefix_caching.py link
  • TVM: [Doc] Fix minor error in "Expressions in Relay" link
  • TVM: [Doc] Fix minor error in doc (Add an operator to Relay) link

Pinned

  1. vllm-project/vllm Public

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python · 31.1k stars · 4.7k forks

  2. PTX-EMU Public

    PTX-EMU is a simple emulator for CUDA programs.

    C++ · 24 stars · 2 forks

  3. SimpleUseGpgpuSim Public

    GPGPU-SIM usage guide

    Shell · 13 stars · 1 fork

  4. GEMM_MMA Public

    Optimize GEMM with Tensor Cores step by step

    15 stars · 4 forks

  5. arcsysu/SYSU-ARCH Public

    SYSU-ARCH is a lab that focuses on using and extending simulators.

    Cuda · 9 stars · 4 forks

  6. ConvNN Public

    A simple CNN training framework with support for CPU and GPU (cuDNN)

    C++ · 3 stars