-
Notifications
You must be signed in to change notification settings - Fork 3
/
Copy pathindex.qmd
74 lines (40 loc) · 9.63 KB
/
index.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
---
title: "Streamlining API Integration: Jon Harmon’s Journey with the *api2r* Package"
description: "The R package _api2r_ simplifies the process of wrapping APIs in R. The R Consortium interviewed Jon Harmon, Principal Data Solutions Engineer at Atorus, about this R Consortium-funded project."
author: "R Consortium"
image: "api2r-main-image.png"
date: "10/30/2024"
---
The R Consortium recently interviewed [Jon Harmon](https://www.linkedin.com/in/jonthegeek/), Principal Data Solutions Engineer at Atorus, about his ISC-funded project, *api2r*. Jon's extensive experience working with APIs from platforms like Slack, YouTube, and LinkedIn, along with his leadership of the Data Science Learning Community (DSLC.io), inspired him to develop *api2r* to simplify the process of wrapping APIs in R. The package streamlines the creation of R packages for various APIs by leveraging the OpenAPI standard, making it easier for developers to integrate external services into their R workflows.
![](JonHarmon.png){width=50%}
Jon discusses the challenges he faced during the package's development, particularly around authentication and endpoint setup, and how *api2r* differs from other tools like {httr2} and {paws}.
The R Consortium funded this project.
**What inspired you to create the api2r package, and how do you envision benefiting the R community?**
I run the [Data Science Learning Community (DSLC) at dslc.io](dslc.io), and through that, as well as some paid work, I frequently work with various APIs like Slack, YouTube, Zoom, LinkedIn, and different task management systems. I noticed that most of these are documented using the [OpenAPI Specification](https://www.openapis.org/) or an older variant, which made it easier for me to read and write R code for them.
However, I found myself repeatedly looking up how I had done similar tasks before, which made me realize it would be much easier if I created a tool to streamline this process. Since I love writing R packages, it felt natural to pursue this idea.
In terms of how the R community could benefit, I want to make it easier for others to wrap APIs in R so people can create packages for whatever tasks they might need.
**Can you elaborate on the challenges you foresee in implementing authentication and endpoint calls in the initial milestone (version 0.1.0)?**
I've actually moved past the initial milestone of setting up authentication, but it definitely presented—and continues to present—challenges. First, I ended up splitting the package into three parts. There's *[rapid](https://rapid.api2r.org/)*, which parses API descriptions into a standard object, *[nectar](https://nectar.api2r.org/)*, which wraps the `httr2` package to handle common tasks, and *[beekeeper](https://beekeeper.api2r.org/)*, which is the main package. *Nectar* was particularly helpful because I was able to abstract the standard types of authentication into just a few simple wrappers. While `httr2` handles much of the heavy lifting, it doesn't guide you through what you need to find in the documentation. That's where *Nectar* steps in, helping streamline the setup process.
![](beekeeperlogo.png){width=50%}
That said, there are still ongoing challenges. Many sites don't fully or correctly implement the OpenAPI Specification, making it difficult to find the necessary information. Even when they do, details on how to obtain an API key can be tricky to track down. I spent a lot of time trying to address this issue, although there's no perfect solution. Fortunately, after I wrote my grant application, a new version of `httr2` was released that simplifies a lot of these processes, which has been a big help.
**How does the api2r package aim to address the inconsistencies and inefficiencies in the current process of wrapping APIs into R packages?**
One thing I really wanted to focus on is the challenge of testing packages that wrap APIs. I've helped many people navigate this issue, as it's easy for your package to get rejected from CRAN if the tests are not set up correctly or if they require an internet connection. To address this, I made sure that part of the automatic package building process includes generating a test suite that follows best practices for API testing.
With the *Nectar* package I mentioned earlier, it simplifies things by abstracting common setups—normally done with something like `httr2`—down to just what's necessary, based on the API documentation. This creates a straightforward, one-to-one correspondence between what the API specifies and what you need to do, automatically generating a version of the package that handles this process smoothly.
**What are the key differences between the api2r package and other API-related tools like {httr}, {httr2}, and {paws}?**
*Paws* is actually a family of packages designed specifically for working with Amazon Web Services (AWS) APIs, which is just one use case. The goal of the *API to R* family of packages I created is to make it easier to write packages like *Paws*. Essentially, *Paws* could be a package you’d create using my tools.
I extensively use <code><em>httr2</em></code> (the updated version of <code><em>httr</em></code>), which helps simplify writing API calls.
However, with <code><em>httr2</em></code>, you still have to manually piece everything together. The idea behind <em>API to R</em> is to take the API documentation and automatically build the <code><em>httr2</em></code> calls for you, eliminating the need to manually complete all the intermediate steps.
**How do you plan to engage the R community in the development and testing of the api2r package, and what role do you see for collaboration?**
I engage with many members of the R community on social media, primarily through [Mastodon](https://fosstodon.org/@jonthegeek) and [LinkedIn](https://www.linkedin.com/in/jonthegeek/), where I share updates about the packages I’m working on, like *API to R*. I also maintain websites for these packages, and you can visit [beekeeper.api2r.org](beekeeper.api2r.org) to find links to all of them. I keep everything up to date there.
As I mentioned earlier, I run the Data Science Learning Community at dslc.io, which now has over 19,000 members, with 400-500 active weekly. Most of our members are R users, and I regularly keep the group updated. We also host a monthly project club where members present their work, and I’ve presented about this project a few times. I'm also active on other R platforms like the ROpenSci Slack team and various Discord channels.
That being said, I’ve reached a point where I could use some help. I’d love for people to check out the GitHub repositories for these packages and share how they’re using them. If you encounter any bugs or have feedback, let me know. Just last week, I had an interaction on Mastodon where someone was working with a new API and asked if I could help. While we haven’t fully connected yet, I was able to send them the documentation to get started on creating the package and exploring what works and what doesn’t. It’s exciting to see where this could lead!
**What are your long-term goals for maintaining and expanding api2r, and how do you plan to ensure its sustainability in the R ecosystem?**
I strongly believe in developing everything openly, so the project is available on [GitHub](https://github.com/jonthegeek/beekeeper/tree/main) for anyone to explore. You can access it from the website. Since it's public, people are free to use it however they like. If I don’t take the project in the direction they prefer, they can fork it and work on their own version.
I also work closely with [rOpenSci](https://ropensci.org/) and other package developers to ensure that there's a supportive community to keep things running. That said, if anyone is particularly interested in this package and its goals, I would really welcome an active collaborator or two to help manage and advance it. The standard we're using—like the object version of the OpenAPI Specification—does evolve, and as APIs update, the package will need to adapt as well. My aim is to keep this project going long-term, and I don't want it to rely entirely on me. Collaboration would be key to ensuring it grows and stays up to date.
**How has it been working with the R Consortium? Would you recommend applying for an ISC grant to other R developers?**
The R Consortium has been fantastic to work with—completely reasonable, transparent, and supportive at every step. The challenge for me, however, was that I seriously underestimated how much work this project would require. It wasn’t just about the difficulty; I also ended up going off on tangents that weren’t part of the original plan. Since the grant was milestone-based, there are still some milestones I haven’t reached, and that’s been tough to manage.
I definitely recommend applying for an ISC grant if you’re working on something, but I would advise being very careful with how you structure your proposal. I got a bit stuck in my details, and now I’m at a point where I have to find extra time to work on it. There's still some grant money left, but not enough to fully cover the remaining work. So, I tend to work on the project when I have a specific use for it and can justify the time.
Overall, having the grant was incredibly helpful in giving me the time and space to explore this package, but it's something to approach with caution regarding planning and execution.
## About ISC Funded Projects
A major goal of the R Consortium is to strengthen and improve the infrastructure supporting the R Ecosystem. We seek to accomplish this by funding projects that will improve both technical infrastructure and social infrastructure.
[https://r-consortium.org/all-projects/](https://r-consortium.org/all-projects/)