- 2023: HPCA, ASPLOS, DATE, EUROSYS, MLSYS, ISCA, DAC, ATC, MICRO, ICCAD, SC, ISSCC, CICC, VLSI, ESSCIRC
- 2024: HPCA, ASPLOS, DATE, EUROSYS, MLSYS, ISCA, DAC, ATC, MICRO, ICCAD, SC, ISSCC, CICC, VLSI
- An Optimizing Framework on MLIR for Efficient FPGA-based Accelerator Generation
- E2EMap: End-to-End Reinforcement Learning for CGRA Compilation via Reverse Mapping
- PruneGNN: Algorithm-Architecture Pruning Framework for Graph Neural Network Acceleration
- RELIEF: Relieving Memory Pressure In SoCs Via Data Movement-Aware Accelerator Scheduling
- ZENO: A Type-based Optimization Framework for Zero Knowledge Neural Network Inference
- Carat: Unlocking Value-Level Parallelism for Multiplier-Free GEMMs
- FPGA Technology Mapping Using Sketch-Guided Program Synthesis
- MaxK-GNN: Extremely Fast GPU Kernel Design for Accelerating Graph Neural Networks Training
- PIM-DL: Expanding the Applicability of Commodity DRAM-PIMs for Deep Learning via Algorithm-System Co-Optimization
- TGLite: A Lightweight Programming Framework for Continuous-Time Temporal Graph Neural Networks
- 8-bit Transformer Inference and Fine-tuning for Edge Accelerators
- EVT: Accelerating Deep Learning Training with Epilogue Visitor Tree
- GMorph: Accelerating Multi-DNN Inference via Model Fusion
- Improving GPU Energy Efficiency through an Application-transparent Frequency Scaling Policy with Performance Assurance
- Minuet: Accelerating 3D Sparse Convolutions on GPUs
- Orion: Interference-aware, Fine-grained GPU Sharing for ML Applications