Deploying to gh-pages from @ e9b3098 🚀
shintaro-iwasaki committed Apr 14, 2024
1 parent 547be90 commit e4357c5
Showing 3 changed files with 72 additions and 9 deletions.
Binary file removed 2024/pics/PedroValero-Lara.jpeg
Binary file not shown.
Binary file added 2024/pics/PhilippeTillet.jpeg
81 changes: 72 additions & 9 deletions 2024/program.html
@@ -52,16 +52,79 @@
</div> <!-- End of heading -->

<div id="sub-frame">
TBD
<!-- <div class="midBox1">
<h1>Best Paper Award</h1>
<h3>TBD</h3>
</div> -->
<div class="midBox1">
<h1>Opening Remarks</h1>
<h3>10:30 am - 10:40 am</h3>

<!-- <div class="midBox1">
<h1>Closing Remarks</h1>
<h3>TBD</h3>
</div> -->
<h1>Session 1: High-Performance Computing</h1>
<h3>10:40 am - 12:00 pm</h3>
<h3>Session Chair: Shintaro Iwasaki, Meta</h3>
<ul>
<li>
10:40 am - 11:00 am<br/>
<b>Performance Versus Maintainability: A Case Study of Scream on Frontier</b><br/>
James White
</li>
<li>
11:00 am - 11:30 am<br/>
<b>ParaGraph: Weighted Graph Representation for Performance Optimization of HPC Kernels</b><br/>
Ali Tehranijamsaz, Alok Mishra, Akash Dutta, Abid M. Malik, Barbara Chapman, and Ali Jannesari
</li>
<li>
11:30 am - 12:00 pm<br/>
<b>Alternative Quadrant Representations with Morton Index and AVX2 Vectorization for AMR Algorithms within the p4est Software Library</b><br/>
Mikhail Kirilin and Carsten Burstedde
</li>
</ul>

<h2>Lunch Break</h2>
<h3>12:00 pm - 1:00 pm</h3>
<ul>
<li>
Lunch will not be provided by the conference.
</li>
</ul>

<h1>Keynote</h1>
<h3>1:00 pm - 2:00 pm</h3>
<h4><b>Block-based GPU Programming with Triton</b></h4>
<h4><b>Philippe Tillet, OpenAI</b></h4>
<h4><b>Abstract:</b>
<font color="#FFFFFF"><img src="pics/PhilippeTillet.jpeg" alt="Philippe Tillet" border="1" align="right" class="right"/></font>
Traditional single instruction, multiple threads (SIMT) programming with CUDA, for all its benefits, can be daunting to machine learning researchers in need of fast custom kernels. We'll shed light on alternative programming models capable of improving GPU programmability without too much of an impact on expressivity. Some such models have recently emerged (e.g., Exo, MLIR Affine), but these are rarely applicable beyond dense tensor algebra — making them a poor fit for workloads requiring (for example) custom data structures. We'll describe the design and implementation of Triton, a mid-level programming language that uses block-based abstractions to simplify kernel development and fusion for researchers without any GPU programming expertise.
</h4>
<h4><b>Bio:</b>
Philippe Tillet began working with GPUs in 2011 as a contributor to the ViennaCL library. He then received his B.S. from Telecom SudParis (France) in 2012, his M.S. from NCTU (Taiwan) in 2014, and his Ph.D. from Harvard University in 2020 with a dissertation on compilers for blocked algorithms on GPUs. He joined OpenAI full time in 2020 to pursue his work on the Triton compiler, a project he started in 2018 after being frustrated by the difficulty of writing auto-tuners for matrix multiplications in CUDA. Since then, he has grown the Triton language into a reference for block-based programming models and written all the training kernels that were used by GPT-4.
</h4>

<h1>Session 2: Accelerating AI/ML Workloads</h1>
<h3>2:00 pm - 3:10 pm</h3>
<h3>Session Chair: Carl Pearson, Sandia National Laboratories</h3>
<ul>
<li>
2:00 pm - 2:30 pm<br/>
<b>Avoiding Training in the Platform-Aware Optimization Process for Faster DNN Latency Reduction</b><br/>
Raúl Marichal, Ernesto Dufrechou, and Pablo Ezzatti
</li>
<li>
2:30 pm - 2:50 pm<br/>
<b>A Comparative Study on Simulation Frameworks for AI Accelerator Evaluation</b><br/>
Christoffer Åleskog, Håkan Grahn, and Anton Borg
</li>
<li>
2:50 pm - 3:10 pm<br/>
<b>Extending the SYCL Joint Matrix for Binarized Neural Networks</b><br/>
Zheming Jin
</li>
</ul>

<h1>Closing Remarks</h1>
<h3>3:10 pm - 3:20 pm</h3>

<h2>Presentation</h2>
All presentations will be in-person.
Presenters should plan for a 25-minute talk (full papers) or a 15-minute talk (short papers), followed by 5 minutes for questions.
</div>
</div>

