Merge pull request #688 from LLNL/aug-news
Catching up July & August news
hauten authored Aug 20, 2024
2 parents b01c925 + 8e94189 commit 108246b
Showing 13 changed files with 54 additions and 8 deletions.
2 changes: 1 addition & 1 deletion _posts/2021-04-21-str.md
Original file line number Diff line number Diff line change
@@ -5,7 +5,7 @@ categories: story

The latest issue of LLNL's *Science & Technology Review* magazine showcases Computing in the cover story (see abstract below) and Commentary. Open source software plays a prominent role in the initiatives described in the story. The cover art shows an advection simulation powered by open source repos [MFEM](https://mfem.org/) and [GLVis](https://glvis.org/).

* [Full issue](https://str.llnl.gov/2021-02) (and [PDF version](https://str.llnl.gov/content/pages/2021-02/pdf/02.21.pdf))
* [Full issue](https://str.llnl.gov/past-issues/february-2021) (and [PDF version](https://str.llnl.gov/sites/str/files/2024-04/02.21.pdf))
* Commentary: [To Exascale and Beyond](https://str.llnl.gov/2021-02/comfeb21) by Computing associate director Bruce Hendrickson
* Cover story: [The Exascale Software Portfolio](https://str.llnl.gov/2021-02/diachin) by Holly Auten and featuring Lori Diachin, Rob Neely, Jeff Hittinger, Ulrike Meier Yang, David Beckingsale, and Tzanio Kolev

2 changes: 1 addition & 1 deletion _posts/2021-04-28-fluxmou.md
@@ -3,4 +3,4 @@ title: "LLNL, IBM, and Red Hat Joining Forces"
categories: story
---

Under a new memorandum of understanding, researchers at LLNL, IBM, and Red Hat will aim to enable next-generation workloads by integrating LLNL’s open source [Flux scheduling framework](http://flux-framework.org/) with Red Hat OpenShift to allow more traditional HPC jobs to take advantage of cloud and container technologies. “Cloud systems are increasingly setting the directions of the broader computing ecosystem, and economics are a primary driver,” said Bronis de Supinski, CTO of Livermore Computing at LLNL. “With the growing prevalence of cloud-based systems, we must align our HPC strategy with cloud technologies, particularly in terms of their software environments, to ensure the long-term sustainability and affordability of our mission-critical HPC systems.” [Read more about the agreement at LLNL News.](https://www.llnl.gov/archive/news/llnl-ibm-red-hat-joining-forces-explore-standardized-hpc-resource-management-interface)
Under a new memorandum of understanding, researchers at LLNL, IBM, and Red Hat will aim to enable next-generation workloads by integrating LLNL’s open source [Flux scheduling framework](http://flux-framework.org/) with Red Hat OpenShift to allow more traditional HPC jobs to take advantage of cloud and container technologies. “Cloud systems are increasingly setting the directions of the broader computing ecosystem, and economics are a primary driver,” said Bronis de Supinski, CTO of Livermore Computing at LLNL. “With the growing prevalence of cloud-based systems, we must align our HPC strategy with cloud technologies, particularly in terms of their software environments, to ensure the long-term sustainability and affordability of our mission-critical HPC systems.” [Read more about the agreement at LLNL News.](https://www.llnl.gov/article/47511/llnl-ibm-red-hat-joining-forces-explore-standardized-hpc-resource-management-interface)
2 changes: 1 addition & 1 deletion _posts/2022-11-09-sustain.md
@@ -3,4 +3,4 @@ title: "Spack: Sustaining the HPC Software Ecosystem"
categories: event
---

Todd Gamblin, an LLNL Distinguished Member of Technical Staff, gave a presentation on November 9 for the Dell Technologies HPC Community. His talk, "Sustaining the HPC Software Ecosystem," described how HPC software can be managed more easily for all customers and users with [Spack](https://spack.io) and included an overview of recent developments in the Spack community such as a partnership with AWS to provide infrastructure for a worldwide binary cache, a recent machine learning special interest group within Spack, and work to handle the complexities of installing software for GPUs. Slides can be [downloaded](https://d21hwc2yj2s6ok.cloudfront.net/assets/uploads/306737/asset/Sustaining_the_HPC_Software_Ecosystem.pdf?1668109004).
Todd Gamblin, an LLNL Distinguished Member of Technical Staff, gave a presentation on November 9 for the Dell Technologies HPC Community. His talk, "Sustaining the HPC Software Ecosystem," described how HPC software can be managed more easily for all customers and users with [Spack](https://spack.io) and included an overview of recent developments in the Spack community such as a partnership with AWS to provide infrastructure for a worldwide binary cache, a recent machine learning special interest group within Spack, and work to handle the complexities of installing software for GPUs.
2 changes: 1 addition & 1 deletion _posts/2023-05-26-dftf-new.md
@@ -3,4 +3,4 @@ title: "New Repo: DFTF"
categories: new-repo
---

[DFTF](https://github.com/LLNL/dftf), or Drink From The Firehose, is a Python program that subscribes to Redfish events on Cray/HPE hardware and republishes them to topics in Kafka. In an attempt to tame the "firehose" of information from CrayTelemetry, DFTF drops any repeated metrics so only the most recent value for each unique metric is maintained. In effect, this usually means values are reported roughly every five seconds rather than every second. See the [`example.conf` file](https://github.com/LLNL/dftf/blob/main/example.conf) for example configuration. DFTF is being used during Livermore Computing's efforts to site the Lab's upcoming exascale machine El Capitan.
[DFTF](https://github.com/LLNL/dftf), or Drink From The Firehose, is a Python program that subscribes to Redfish events on Cray/HPE hardware and republishes them to topics in Kafka. In an attempt to tame the "firehose" of information from CrayTelemetry, DFTF drops any repeated metrics so only the most recent value for each unique metric is maintained. In effect, this usually means values are reported roughly every five seconds rather than every second. DFTF is being used during Livermore Computing's efforts to site the Lab's upcoming exascale machine El Capitan.
2 changes: 1 addition & 1 deletion _posts/2023-08-23-rd100.md
@@ -14,7 +14,7 @@ The annual R&D 100 Awards recognize new S&T products, technologies, and

**Variorum (vendor-agnostic power management)**

- Tapasya Patki, Stephanie Brink, Aniruddha Marathe, Barry Rountree, Kathleen Shoga, and Eric Green
- Tapasya Patki, Stephanie Brink, Aniruddha Marathe, Barry Rountree, Kathleen Shoga, and Elena Green
- [Variorum video on YouTube](https://www.youtube.com/watch?v=rgJGgPERBao) (6:57)
- [Variorum project summary](https://computing.llnl.gov/projects/variorum)
- [Variorum GitHub repository](https://github.com/LLNL/variorum)
2 changes: 1 addition & 1 deletion _posts/2024-04-15-ssapy-new.md
@@ -3,4 +3,4 @@ title: "New Repo: SSAPy"
categories: new-repo
---

The Space Situational Awareness for Python ([SSAPy](https://github.com/LLNL/SSAPy)) is a Python package allowing for fast and precise orbital modeling. All documentation is hosted at [software.llnl.gov/SSAPy/](https://software.llnl.gov/SSAPy/), and the team has established a separate repository for data at [SSAPy-Data](https://github.com/LLNL/SSAPy-Data).
The Space Situational Awareness for Python ([SSAPy](https://github.com/LLNL/SSAPy)) is a Python package allowing for fast and precise orbital modeling. The team has also established a separate repository for data at [SSAPy-Data](https://github.com/LLNL/SSAPy-Data).
2 changes: 1 addition & 1 deletion _posts/2024-06-04-scr.md
@@ -3,4 +3,4 @@ title: "Evolving at the Speed of Exascale"
categories: story
---

Bugs, broken codes, or system failures require added time for troubleshooting and increase the risk of data loss. LLNL has addressed failure recovery by developing the Scalable Checkpoint/Restart ([SCR](https://github.com/LLNL/scr)) framework. [Read more in *Science & Technology Review*.](https://str.llnl.gov/2024-03/2024-03-evolving-speed-exascale)
Bugs, broken codes, or system failures require added time for troubleshooting and increase the risk of data loss. LLNL has addressed failure recovery by developing the Scalable Checkpoint/Restart ([SCR](https://github.com/LLNL/scr)) framework. [Read more in *Science & Technology Review*.](https://str.llnl.gov/past-issues/march-2024/evolving-speed-exascale)
2 changes: 1 addition & 1 deletion _posts/2024-06-04-str.md
@@ -3,4 +3,4 @@ title: "S&TR Cover Story: The Laboratory’s Habit of Innovation"
categories: story
---

LLNL’s HPC capabilities play a significant role in international science research and innovation, and Lab researchers have won 10 R&D 100 Awards in the Software–Services category in the past decade. [Read the full story in *Science & Technology Review*.](https://str.llnl.gov/2024-03/2024-03-laboratorys-habit-innovation)
LLNL’s HPC capabilities play a significant role in international science research and innovation, and Lab researchers have won 10 R&D 100 Awards in the Software–Services category in the past decade. [Read the full story in *Science & Technology Review*.](https://str.llnl.gov/past-issues/march-2024/laboratorys-habit-innovation)
6 changes: 6 additions & 0 deletions _posts/2024-07-10-lcuss-new.md
@@ -0,0 +1,6 @@
---
title: "New Repo: LCUSS"
categories: new-repo
---

Livermore Computing User and System Scripts ([LCUSS](https://github.com/LLNL/LCUSS)) is a collection of scripts used to improve productivity on HPC systems for both administrators and general users. These may include general scripts for user management, scripts for helping users interact with Livermore Computing (LC) resource management software (e.g., SLURM and Flux), and scripts to automate common user command-line tasks on LC and other HPC machines.
6 changes: 6 additions & 0 deletions _posts/2024-07-23-ctm-new.md
@@ -0,0 +1,6 @@
---
title: "New Repo: CTM"
categories: new-repo
---

The Common Electric Power Transmission System Model [(CTM)](https://github.com/LLNL/ctm) is an intuitive, extensible, language-agnostic, and range-validating specification of electric power network components' parameter names and units, and of the relations between components. It is intended for use by the research community developing new computational methods for power systems operations and simulation. This repository specifies CTM as a JSON Schema and provides documentation, derivative (code-generated) implementations of CTM, and example data and usage of the schema for important use cases.
22 changes: 22 additions & 0 deletions _posts/2024-08-08-rd100.md
@@ -0,0 +1,22 @@
---
title: "R&D 100 Award Winners: UMap and UnifyFS"
categories: story
---

The annual R&D 100 Awards recognize the technological significance of new S&T products, technologies, and materials that are available for sale or license. [The 2024 winners](https://www.rdworldonline.com/rd-100-winners-for-2024-are-announced/) were announced on August 8. Congratulations to these teams:

**UMap (fast, extensible memory-mapping library for diverse data storage)**

- LLNL developers: Maya Gokhale, Marty McFadden, Elena Green, Roger Pearce, Keita Iwabuchi, Karim Youssef
- [UMap video on YouTube](https://www.youtube.com/watch?v=oC5Zh8CMAUM) (4:55)
- [UMap project summary](https://computing.llnl.gov/projects/umap)
- [UMap GitHub repository](https://github.com/LLNL/umap)

**UnifyFS (user-level file system for supercomputers)**

- LLNL developers: Kathryn Mohror, Cameron Stanavige, Chen Wang, Hariharan Devarajan, Ned Bass, Tony Hutter
- [UnifyFS video on YouTube](https://www.youtube.com/watch?v=I-O5hdcQRGw) (5:46)
- [UnifyFS project summary](https://computing.llnl.gov/projects/unifyfs)
- [UnifyFS GitHub repository](https://github.com/LLNL/UnifyFS)

LLNL has had a long history of R&D 100 Awards since the awards began in 1963. Software- and computing-related projects have been recognized in the decades since with [PRUNERS](https://pruners.github.io/), [Babel](https://software.llnl.gov/Babel/#page=home), [Sapphire](https://computing.llnl.gov/projects/sapphire), [LLMDA](https://gs.llnl.gov/biosecurity-center/llmda), [STAT](https://github.com/LLNL/STAT), [ROSE compiler](https://github.com/rose-compiler/rose), [*hypre*](https://github.com/LLNL/hypre), and others. Since 2019, LLNL teams have produced eight open source finalists and/or winners: [Spack](https://spack.io/), [SCR](https://github.com/LLNL/scr), [MFEM](https://mfem.org/), [Flux](https://flux-framework.org/), [Variorum](https://computing.llnl.gov/projects/variorum), [zfp](https://computing.llnl.gov/projects/zfp), UMap, and UnifyFS.
6 changes: 6 additions & 0 deletions _posts/2024-08-13-carpentry.md
@@ -0,0 +1,6 @@
---
title: "HPC Carpentry at LLNL"
categories: story
---

The Lab invited HPC Carpentry to present their two-day user workshop twice in June. The event was popular with LLNL summer students and staff. [Visit the HPC Carpentry blog](https://www.hpc-carpentry.org/blog/2024/08/llnl-workshop-blog-post.html) to read more about the lessons.
6 changes: 6 additions & 0 deletions _posts/2024-08-14-ppoaf-new.md
@@ -0,0 +1,6 @@
---
title: "New Repo: PPO-AF"
categories: new-repo
---

PPO And Friends ([PPO-AF](https://github.com/LLNL/ppo_and_friends)) is an MPI-distributed PyTorch implementation of Proximal Policy Optimization along with various extra optimizations and add-ons (friends). It is currently compatible with the following environment frameworks: Gymnasium, Gym (including versions <= 0.21), PettingZoo, and Abmarl Gridworld.
