add events #39

Merged
merged 1 commit into from
Sep 24, 2023
3 changes: 3 additions & 0 deletions config/_default/menus.yaml
@@ -22,3 +22,6 @@ main:
  - name: 🎤 Publications
    url: "#pubs"
    weight: 60
  - name: 🎪 Events
    url: "event"
    weight: 70
Binary file added content/event/20230427/featured.png
62 changes: 62 additions & 0 deletions content/event/20230427/index.md
@@ -0,0 +1,62 @@
---
title: 'CROSS, Skyhook, and Polyphy'

event: 'Skyhook POSE Workshop Series'
#event_url: https://example.org

location: University of California, Santa Cruz
address:
  street: 1156 High St
  city: Santa Cruz
  region: CA
  postcode: '95064'
  country: United States

summary: 'Institutional support for creating Paths to Open Source Ecosystems for Open Source Products in Research.'
abstract: ''

# Talk start and end times.
# End time can optionally be hidden by prefixing the line with `#`.
date: '2023-04-27'
#date_end: '2023-04-27'
all_day: true

# Schedule page publish date (NOT talk date).
publishDate: '2023-03-20'

authors: [admin, slieggi]
tags: []

# Is this a featured talk? (true/false)
featured: false

image:
  caption: ''
  focal_point: Right

url_code: ''
url_pdf: ''
url_slides: ''
url_video: ''

# Markdown Slides (optional).
# Associate this talk with Markdown slides.
# Simply enter your slide deck's filename without extension.
# E.g. `slides = "example-slides"` references `content/slides/example-slides.md`.
# Otherwise, set `slides = ""`.
slides:

# Projects (optional).
# Associate this post with one or more of your projects.
# Simply enter your project's folder or file name without extension.
# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
# Otherwise, set `projects = []`.
projects:
---

We would like to invite you to a day-long hybrid workshop on April 27, 2023, entitled “CROSS, Skyhook, and Polyphy: Institutional support for creating Paths to Open Source Ecosystems for Open Source Products in Research.” The workshop will be held at UC Santa Cruz, with remote access for participants unable to attend in person.

With support from an NSF Pathways to Enable Open Source Ecosystems (POSE) Phase 1 grant, the NSF Institute for Research and Innovation in Software for High-Energy Physics (IRIS-HEP), and the Alfred P. Sloan Foundation, the workshop aims to scope out self-sustaining ecosystems for the [Skyhook](https://github.com/skyhookdm) and [Polyphy](https://polyphy.io/) projects and to establish an institutional model within the [Center for Research in Open Source Software (CROSS)](https://cross.ucsc.edu) and the [Open Source Program Office (OSPO) UC Santa Cruz](/) that creates paths to sustainable open source ecosystems for research products across UC campuses.

A draft agenda for the event ~~will be provided by March 27~~ is available [here](https://cross.ucsc.edu/news/news/20230427poseevent.html) as well as a [white paper](https://docs.google.com/document/d/1znmoRvnmoZk1YMGWu7wIusC7KxlI6R_sSuXuK5PQ9LQ/edit?usp=sharing) that will form the basis for the day’s discussion. The workshop’s interactive sessions will discuss strategies for creating and sustaining institutional support for creating open source ecosystems, using
Skyhook and Polyphy as examples. We will also hear updates from our CROSS fellows about their research projects that are already creating open source products that likely will benefit from institutional support for open source ecosystems in the future.
Binary file added content/event/20230817/featured.png
75 changes: 75 additions & 0 deletions content/event/20230817/index.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,75 @@
---
title: 'Computational I/O Stack Workshop'

event: 'Skyhook POSE Workshop Series'
#event_url: https://example.org

location: University of California, Santa Cruz
address:
  street: 1156 High St
  city: Santa Cruz
  region: CA
  postcode: '95064'
  country: United States

summary: A workshop featuring keynote speaker Yoichiro Tanaka (Tohoku University) that will take place on August 17, 2023 at UC Santa Cruz in the Engineering 2 building (room to be confirmed)
abstract: ''

# Talk start and end times.
# End time can optionally be hidden by prefixing the line with `#`.
date: '2023-08-17'
#date_end: '2023-08-17'
all_day: true

# Schedule page publish date (NOT talk date).
publishDate: '2023-03-20'

authors: [carlos.maltzahn, slieggi]
tags: []

# Is this a featured talk? (true/false)
featured: true

image:
  caption: ''
  focal_point: Right

url_code: ''
url_pdf: ''
url_slides: ''
url_video: ''

# Markdown Slides (optional).
# Associate this talk with Markdown slides.
# Simply enter your slide deck's filename without extension.
# E.g. `slides = "example-slides"` references `content/slides/example-slides.md`.
# Otherwise, set `slides = ""`.
slides:

# Projects (optional).
# Associate this post with one or more of your projects.
# Simply enter your project's folder or file name without extension.
# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
# Otherwise, set `projects = []`.
projects:
---

👋 Join us for an exciting event featuring IEEE Distinguished Lecturer {{% mention yoichiro.tanaka %}} (Tohoku University) discussing technological and institutional innovations to make the computational I/O stack a reality!

{{% callout note %}}
This is now a past event. In case you missed it, we have started posting speaker slides and recordings of the event in the agenda below.
{{% /callout %}}

The introduction of computational data management services into the I/O stack, especially in storage and networking devices, requires both technological innovations and new relations between university and industry. This one-day workshop will convene experts from storage systems, open source, and community architecture to discuss technologies and strategies for a computational I/O stack with low market entry barriers.

The workshop will take place on August 17, 2023 from 10am to 5pm, at UC Santa Cruz, Engineering 2, Room 506 (5th floor, north-west of the lobby/elevators, see [floor plans](https://facilities.soe.ucsc.edu/floor-plans)), and is jointly organized by the [IEEE Magnetics Society's Distinguished Lecturers Program][web-ieee-lecturers], the Skyhook Data Management community with funding by the National Science Foundation ([TI-2229773][web-nsf-award]), the Center for Research in Open Source Software ([cross.ucsc.edu][web-cross]), and the Open Source Program Office, UC Santa Cruz ([ospo.ucsc.edu][web-ospo]).

{{< table path="agenda.csv" header="true" caption="Table: Agenda" >}}


<!-- Resources -->
[web-ieee-lecturers]: https://ieeemagnetics.org/membership/educational-outreach/distinguished-lecturers
[web-nsf-award]: https://www.nsf.gov/awardsearch/showAward?AWD_ID=2229773
[web-cross]: https://cross.ucsc.edu
[web-ospo]: https://ospo.ucsc.edu

Binary file added content/event/20230907/featured.jpg
91 changes: 91 additions & 0 deletions content/event/20230907/index.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,91 @@
---
title: 'Adam Kennedy (Voltron Data): Polygraph'

event: 'Eusocial Interest Group Meeting'
#event_url: https://example.org

location: University of California, Santa Cruz
address:
  street: 1156 High St
  city: Santa Cruz
  region: CA
  postcode: '95064'
  country: United States

summary: Adam Kennedy (Voltron Data) is speaking about Polygraph, a new effort to make processing and optimizations of query plans more efficient
abstract: ''

# Talk start and end times.
# End time can optionally be hidden by prefixing the line with `#`.
date: '2023-09-07T14:00:00-0700'
date_end: '2023-09-07T15:00:00-0700'
all_day: false

# Schedule page publish date (NOT talk date).
publishDate: '2023-09-06'

authors: [adam.kennedy]
tags: []

# Is this a featured talk? (true/false)
featured: false

image:
  caption: ''
  focal_point: Right

url_code: ''
url_pdf: ''
url_slides: ''
url_video: 'https://www.icloud.com/iclouddrive/0920UPOGUXIosE6viyjHJe6BQ#video1862580471'

# Markdown Slides (optional).
# Associate this talk with Markdown slides.
# Simply enter your slide deck's filename without extension.
# E.g. `slides = "example-slides"` references `content/slides/example-slides.md`.
# Otherwise, set `slides = ""`.
slides:

# Projects (optional).
# Associate this post with one or more of your projects.
# Simply enter your project's folder or file name without extension.
# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
# Otherwise, set `projects = []`.
projects:
---

{{% callout note %}}
This is an expanded version of Adam Kennedy's presentation at the [2nd International Workshop on Composable Data Management Systems 2023 (CDMS)](https://ceur-ws.org/Vol-3462/CDMS0.pdf) ([agenda](https://ceur-ws.org/Vol-3462)). The following abstract is copied from [there](https://ceur-ws.org/Vol-3462/CDMS12.pdf).
{{% /callout %}}

The maturity and substantial investment in Apache Calcite establish it as the open source standard for query planning and
optimization across numerous data tools. Nevertheless, utilizing Apache Calcite for dynamic query planning in a diverse tool
stack with multiple languages has proven challenging. Through the integration of Apache Arrow, we introduce Polygraph:
a language-independent, parse-free, and efficient format for query plans. Its purpose is to enhance plan interoperability,
diminish latency and overheads, and facilitate dynamic query optimization. This experimental format allows for the efficient
exchange of query plans between tools in diverse languages with minimal serialization overhead.

While future query engines are steering away from Java, Calcite remains the solitary mature option for query planning
across a broad spectrum of workloads. Few alternatives come close to matching its features. However, Calcite relies on
tree-based JSON or XML plan representations that do not readily lend themselves to certain optimizations and necessitate
substantial overhead for serialization, I/O, and parsing. The commingling of planners and engines across languages is rare,
unusual, and complex. Such approaches typically result in ad hoc, internal formats with limited reusability. Addressing
these challenges, Polygraph relocates the query plan to Arrow. Polygraph employs a graph structure encoded with columnar
storage techniques. Preliminary experiments indicate an order of magnitude reduction in query plan size compared to JSON
encoding, without incurring copying and serialization overheads. Arrow provides zero-copy, shared-memory, and parse-free
capabilities, along with fast RPC via Arrow Flight. In this representation, plan consumers only need to load the components
and properties of a query plan required for a given computation. These efficiencies substantially reduce the latency between
plan generation and query execution. Moreover, we envision significant potential for other advancements, including resource
planning, ML preprocessing, and integration into ML training and inference.

Until recently, there was no urgent imperative to represent query plans efficiently. However, the escalating complexity
and size of query graphs will persist as data tools become more deeply integrated into intricate ML workloads. Polygraph’s
agile and decomposable graph representation empowers data engines to contribute to query optimization and resource
management. Enhanced integration with top-tier ML systems becomes more viable, facilitating the incorporation of run-time
compute planning and resource management into the query plan, utilizing tools like Apache Acero. The benefits extend
beyond improvements in space efficiency and latency. Query sub-plans can be optimized in-situ using real-time hardware
metrics. Value relations and broadcast tables can be seamlessly embedded in the plan as Arrow objects, accessed in a zero-copy
manner. Large models can be directly incorporated into the query plan, incurring no loading cost until required. Increased
investment in query plan representations, exemplified by Polygraph, supports the data community in keeping pace with
advancements in new architectures and problem domains, such as AI.

64 changes: 64 additions & 0 deletions content/event/20230929/index.md
@@ -0,0 +1,64 @@
---
title: 'Workshop: Creating the SkyhookDM Ecosystem for the Computational I/O Stack'

event: '2023 UC Open Source Symposium'
event_url: https://ucospo23.sched.com

location: University of California, Santa Cruz
address:
  street: 1156 High St
  city: Santa Cruz
  region: CA
  postcode: '95064'
  country: United States

summary: The lack of an open and shared computational I/O software stack ecosystem hampers composability and innovation, and increases design cost. This workshop invites participants to discuss a roadmap for the technologies and governance of the Skyhook Data Management effort.
abstract: ''

# Talk start and end times.
# End time can optionally be hidden by prefixing the line with `#`.
date: '2023-09-29T13:00:00-0700'
date_end: '2023-09-29T14:20:00-0700'
all_day: false

# Schedule page publish date (NOT talk date).
publishDate: '2023-09-23'

authors: [carlos.maltzahn]
tags: []

# Is this a featured talk? (true/false)
featured: false

image:
  caption: ''
  focal_point: Right

url_code: ''
url_pdf: ''
url_slides: ''
url_video: ''

# Markdown Slides (optional).
# Associate this talk with Markdown slides.
# Simply enter your slide deck's filename without extension.
# E.g. `slides = "example-slides"` references `content/slides/example-slides.md`.
# Otherwise, set `slides = ""`.
slides:

# Projects (optional).
# Associate this post with one or more of your projects.
# Simply enter your project's folder or file name without extension.
# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
# Otherwise, set `projects = []`.
projects:
---

{{% callout note %}}
This event is part of the [2023 UC Open Source Symposium](https://ucospo23.sched.com), September 27-29, 2023 ([this workshop's link](https://ucospo23.sched.com/event/1RHgi/workshop-creating-the-skyhookdm-ecosystem-for-the-computational-io-stack))
{{% /callout %}}

Hardware acceleration for computational I/O, that is, the integration of specialized computational devices into the I/O path, is one of the most promising technologies to further improve performance and energy efficiency of analyzing high-volume and high-velocity datasets and streams. Despite the general availability of a number of devices such as Data Processing Units (DPUs, also known as SmartNICs) and Samsung's SmartSSDs, the open source data science ecosystem lacks an open and shared computational I/O software stack ecosystem. This lack hampers composability and innovation, and increases design cost. To address this, the Center for Research in Open Source Software launched Skyhook Data Management to create open source blueprints for a computational I/O stack that can be adopted by industry. With seed funding from industry component makers, SkyhookDM had a promising start: a blueprint using the unmodified Ceph open source distributed storage system was contributed to Apache Arrow in 2022 and has been included in every release since v7.0.0. It serves as a use case for the SNIA Computational Storage TWG and has attracted world-leading experts from industry and national labs.

This workshop invites participants to help put together a roadmap for an open and shared computational I/O software stack ecosystem at UC Santa Cruz following best practices in open source software techniques, strategies, and governance. We will discuss technical and organizational opportunities, leveraging readily available technologies and institutions.

2 changes: 1 addition & 1 deletion content/event/_index.md
@@ -1,5 +1,5 @@
---
-title: Recent & Upcoming Events
+title: 🎪 Recent & Upcoming Events

# Listing view
view: compact
Binary file removed content/event/example/featured.jpg
Binary file not shown.
63 changes: 0 additions & 63 deletions content/event/example/index.md

This file was deleted.
