All of Us Workbench API Structure & Technologies

The All of Us Workbench API is largely built on Spring Boot. It also uses Swagger to autogenerate REST API code from YAML definitions.

Database

The primary application database is a Google Cloud SQL instance running MySQL. We modify its schema using Liquibase migrations, which live in db/changelog/.

When creating a new database migration, add an XML file to the changelog directory that describes the migration to perform. This new file should be named db.changelog-###-description-of-migration.xml, where ### is the number of the newest migration + 1. Then, add the file to the end of the db.changelog-master.xml file.
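The numbering rule above can be sketched as a small helper. This is an illustration only: `MigrationNames` and its `next` method are hypothetical and not part of the Workbench codebase, and the sketch emits the number without zero padding, which may need adjusting to match the existing files.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the Workbench codebase.
final class MigrationNames {
  // Matches db.changelog-<number>-<description>.xml; the master file has no number and is skipped.
  private static final Pattern NAME = Pattern.compile("db\\.changelog-(\\d+)-.*\\.xml");

  /** Returns the file name for the next migration, given the existing changelog file names. */
  static String next(List<String> existingFileNames, String description) {
    int newest = 0;
    for (String name : existingFileNames) {
      Matcher m = NAME.matcher(name);
      if (m.matches()) {
        newest = Math.max(newest, Integer.parseInt(m.group(1)));
      }
    }
    // Assumption: no zero padding; adjust the format if existing files are padded.
    return String.format("db.changelog-%d-%s.xml", newest + 1, description);
  }
}
```

The new file still has to be added by hand to the end of db.changelog-master.xml.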

We generally name database objects in lower_snake_case.

Liquibase migrations are applied (indirectly) via a bash script called db/run-migrations.sh. Do not run this script directly; instead, call one of the following project.rb commands:

dev-up (starts a local dev server)
run-local-all-migrations (runs migrations on the workbench and cdr schemas)
run-local-rw-migrations (runs migrations only on the workbench schema)

DAOs

Data Access Objects (DAOs) are interfaces that describe what operations the application can perform on the database. Generally, we give DAOs the CamelCased name of the database table. All of the Workbench DAOs extend Spring's CrudRepository, which provides simple operations such as find-by-primary-key, save, and delete-by-primary-key.

CrudRepository also allows queries to be automatically derived from the name of a method defined on the interface. As such, you will see methods like findByUserIdAndWorkspaceId with no apparent implementation. The implementation is generated by Spring.
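To build intuition for how a method name alone can carry a query, here is a toy sketch that turns a `findBy...And...` name into a JPQL-style where clause. It is a simplified illustration, not Spring Data's actual parser, which supports many more keywords; the class name `DerivedQuerySketch` and the entity alias `e` are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration only; Spring Data's real query derivation handles far more cases.
final class DerivedQuerySketch {
  /** Turns a name like findByUserIdAndWorkspaceId into a JPQL-style where clause. */
  static String whereClause(String methodName) {
    String prefix = "findBy";
    if (!methodName.startsWith(prefix) || methodName.length() == prefix.length()) {
      throw new IllegalArgumentException("Unsupported method name: " + methodName);
    }
    // Split on "And" only when it is followed by a capitalized property name.
    String[] parts = methodName.substring(prefix.length()).split("And(?=[A-Z])");
    List<String> conditions = new ArrayList<>();
    for (int i = 0; i < parts.length; i++) {
      if (parts[i].isEmpty()) {
        throw new IllegalArgumentException("Empty property in: " + methodName);
      }
      // UserId -> userId, bound to the positional parameter matching its order.
      String property = Character.toLowerCase(parts[i].charAt(0)) + parts[i].substring(1);
      conditions.add("e." + property + " = ?" + (i + 1));
    }
    return "where " + String.join(" and ", conditions);
  }
}
```

Each property in the method name maps, in order, to one method argument, which is why the argument order of a derived query method matters.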

To define a query that uses a foreign key, there are a couple of options. First, you could use a @JoinColumn annotation together with @OneToOne or @ManyToOne on the corresponding DB model, as described below. Second, you could use an @Query annotation to write the query out explicitly as a string.

DB Models

DB Models represent a single row of a single table in the database, possibly with extender tables attached. We denote DB models by prefixing Db to the name of the database table. DB models heavily use jakarta.persistence annotations to describe where in the database table each part of the model lives.
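The DAO and DB model naming conventions can be summarized in a short sketch. The helper class `NamingConventions` is hypothetical, and the `Dao` suffix is inferred from names such as UserDao used elsewhere in this document.

```java
// Hypothetical helper that spells out the naming conventions; not part of the codebase.
final class NamingConventions {
  /** Converts a lower_snake_case table name to CamelCase. */
  static String camelCase(String tableName) {
    StringBuilder sb = new StringBuilder();
    for (String word : tableName.split("_")) {
      if (word.isEmpty()) {
        continue; // tolerate doubled or leading underscores
      }
      sb.append(Character.toUpperCase(word.charAt(0))).append(word.substring(1));
    }
    return sb.toString();
  }

  /** DAO name: the CamelCased table name plus a Dao suffix (suffix is an assumption). */
  static String daoName(String tableName) {
    return camelCase(tableName) + "Dao";
  }

  /** DB model name: Db prefixed to the CamelCased table name. */
  static String dbModelName(String tableName) {
    return "Db" + camelCase(tableName);
  }
}
```

For example, assuming a table named `cohort`, the sketch yields the model name DbCohort.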

To automatically gather data from extender tables, we use the @JoinColumn, @OneToOne, @OneToMany, and @ManyToOne annotations, along with the mappedBy attribute of the relationship annotations. Documentation and blog posts on their usage can be found at these links:

https://www.baeldung.com/jpa-join-column
https://www.baeldung.com/hibernate-one-to-many
https://www.baeldung.com/jpa-joincolumn-vs-mappedby
https://docs.jboss.org/hibernate/jpa/2.1/api/javax/persistence/JoinColumn.html (contains See Also links to the rest of these annotations)

Examples of the use of these annotations can be found in the following places:

DbUser
DbAddress
DbCohort

Much of the structure of DB models can be autogenerated in IntelliJ by defining the class members and then doing Generate -> Getter and Setter and Generate -> equals() and hashCode(). This will not automatically add the required annotations - those must be added manually.

Services and ServiceImpls

Services are interfaces that declare methods of business logic. ServiceImpls are classes implementing their correspondingly named Service and the business logic methods declared therein.

Practically, this allows one ServiceImpl to implement multiple interfaces. This is used in several places, for example UserServiceImpl.
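A minimal sketch of one implementation class backing two service interfaces follows. The interface and class names (`UserLookupService`, `UserRegistrationService`, `SketchUserServiceImpl`) and their methods are hypothetical; the real Workbench services declare different methods.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interfaces, invented for this sketch.
interface UserLookupService {
  boolean exists(String username);
}

interface UserRegistrationService {
  void register(String username);
}

// One implementation class backs both interfaces.
final class SketchUserServiceImpl implements UserLookupService, UserRegistrationService {
  private final List<String> usernames = new ArrayList<>();

  @Override
  public boolean exists(String username) {
    return usernames.contains(username);
  }

  @Override
  public void register(String username) {
    if (!exists(username)) {
      usernames.add(username);
    }
  }
}
```

Callers can depend on whichever interface they need, so a class that only looks users up never sees the registration methods.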

Services / ServiceImpls are generally named after a feature or functional area, like 'UserService', 'CohortService', or 'MonitoringService'.

ServiceImpls can call into other services, but we try to keep the number of dependencies down. We try to avoid having ServiceImpls call into a large number of DAOs, especially DAOs not directly related to the functional area for which the ServiceImpl is responsible. Ideally, each DAO would roll up to exactly one Service.

As a positive example, the UserServiceImpl calls into the UserDao, the UserTermsOfServiceDao, and the VerifiedInstitutionalAffiliationDao, all three of which are directly related to the UserService. As an example of something to avoid, the WorkspaceServiceImpl (and a number of Controllers!) talk to the UserDao without going through the UserService.

ServiceImpls should handle only application logic. They should not need explicit knowledge of the database, the REST API, or the UI.
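The dependency guidance above can be sketched as follows: the workspace service reaches user data through UserService rather than through UserDao directly. Only the layering is the point here; every method name (`findEmailByUserId`, `getEmailOrThrow`, `describeOwner`) is an assumption made for this example, and the classes are simplified stand-ins for the real ones.

```java
import java.util.Optional;

// Simplified stand-ins; all method names are hypothetical.
interface UserDao {
  Optional<String> findEmailByUserId(long userId);
}

interface UserService {
  String getEmailOrThrow(long userId);
}

final class UserServiceImpl implements UserService {
  private final UserDao userDao; // the only class in this sketch that touches UserDao

  UserServiceImpl(UserDao userDao) {
    this.userDao = userDao;
  }

  @Override
  public String getEmailOrThrow(long userId) {
    return userDao
        .findEmailByUserId(userId)
        .orElseThrow(() -> new IllegalArgumentException("No user with id " + userId));
  }
}

final class WorkspaceServiceImpl {
  private final UserService userService; // depends on the service, not on UserDao

  WorkspaceServiceImpl(UserService userService) {
    this.userService = userService;
  }

  String describeOwner(long ownerId) {
    return "Workspace owned by " + userService.getEmailOrThrow(ownerId);
  }
}
```

With this shape, UserDao rolls up to exactly one service, and a change to how users are stored only affects UserServiceImpl.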

Controllers

Controllers take Request objects from the API, call into the Service layer, and return Response objects back to the API. Controllers implement autogenerated APIControllers. All Controllers must be located in the org.pmiops.workbench.api package.

Controller methods should generally be small and should not contain any actual logic. The most common conditional in the Controller layer should be checking a feature flag. Most of the work done by Controllers should be unpacking data from a Request object or packaging data into a Response object.
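As a rough sketch of that shape, the controller method below checks a feature flag, unpacks the request, delegates to a service, and packages the response. All types and names here (`CreateCohortRequest`, `CohortResponse`, `CohortService`, `CohortControllerSketch`) are hypothetical, and the boolean field stands in for however feature flags are actually looked up.

```java
// Hypothetical request, response, and service types, invented for this sketch.
record CreateCohortRequest(String name) {}

record CohortResponse(long id, String name) {}

interface CohortService {
  long create(String name);
}

final class CohortControllerSketch {
  private final CohortService cohortService;
  private final boolean cohortCreationEnabled; // stand-in for a feature flag lookup

  CohortControllerSketch(CohortService cohortService, boolean cohortCreationEnabled) {
    this.cohortService = cohortService;
    this.cohortCreationEnabled = cohortCreationEnabled;
  }

  CohortResponse createCohort(CreateCohortRequest request) {
    // The only conditional in the controller: a feature flag check.
    if (!cohortCreationEnabled) {
      throw new UnsupportedOperationException("Cohort creation is disabled");
    }
    // Unpack the request and delegate the real work to the service layer.
    long id = cohortService.create(request.name());
    // Package the result into a response object.
    return new CohortResponse(id, request.name());
  }
}
```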

Mappers

There are four types of models in the system: API models, DB models, API Request models, and API Response models. Mappers are utility interfaces that declare methods for converting between these types of models. We use MapStruct annotations to describe these mappings and autogenerate the implementation of the Mappers.
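The sketch below shows the kind of conversion a Mapper declares, with a hand-written implementation standing in for the class MapStruct would generate. It deliberately omits the MapStruct annotations so that it runs on its own; the model classes and field names (`DbCohortSketch`, `ApiCohortSketch`, `cohortId`, `id`) are invented for illustration.

```java
// Hypothetical DB model: a plain class with getters and setters, as a JPA entity would be.
final class DbCohortSketch {
  private long cohortId;
  private String name;

  long getCohortId() { return cohortId; }
  void setCohortId(long cohortId) { this.cohortId = cohortId; }
  String getName() { return name; }
  void setName(String name) { this.name = name; }
}

// Hypothetical API model.
record ApiCohortSketch(long id, String name) {}

// The mapper interface only declares the conversions.
interface CohortMapperSketch {
  ApiCohortSketch dbModelToApi(DbCohortSketch db);

  DbCohortSketch apiToDbModel(ApiCohortSketch api);
}

// Hand-written stand-in for the implementation a code generator would produce.
final class CohortMapperSketchImpl implements CohortMapperSketch {
  @Override
  public ApiCohortSketch dbModelToApi(DbCohortSketch db) {
    // Field names differ between the models (cohortId vs. id), so the mapping is explicit.
    return new ApiCohortSketch(db.getCohortId(), db.getName());
  }

  @Override
  public DbCohortSketch apiToDbModel(ApiCohortSketch api) {
    DbCohortSketch db = new DbCohortSketch();
    db.setCohortId(api.id());
    db.setName(api.name());
    return db;
  }
}
```

Keeping the conversions in one place means Controllers and Services never copy fields between model types by hand.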

API

The actual REST API is defined in a series of YAML files according to the Swagger OpenAPI spec. swagger-codegen takes these YAML files, merges them, and generates all the classes defined by the YAMLs. The command to do so is ./project.rb compile-generated-java.

One non-Swagger-standard thing about these YAML files is that the tag param is used to indicate which Controller / APIController an endpoint should be associated with.

Each REST endpoint is defined under PATHS in the YAML file.

API Models

API models are defined under DEFINITIONS in the YAML file. Swagger autogenerates classes from these. Both API models and DB models are currently used in the Controller and Service layers. While it is agreed that only one type of model should be used throughout the Service layer, there is no consensus on which type that should be, nor on how high a priority conforming to such a standard should be.

Requests and Responses

Swagger interprets the data sent to a given endpoint according to the Request model definition associated with that endpoint and throws an error if the data is improperly formatted. Likewise, Swagger validates that the data being returned from a given endpoint conforms to the Response model definition associated with that endpoint and throws an error if it does not.
