
Project: Rebuild IDMS stack #1

Open
1 of 13 tasks
mikethicke opened this issue Dec 5, 2024 · 0 comments
Assignees
Labels
wip work-in-progress (issue is incomplete)

Comments

mikethicke commented Dec 5, 2024

The current IDMS stack is spread across multiple long-running EC2 instances, running services that have rarely been updated and are often not tracked in source control.

The primary aims of this project are:

  • Consolidate the IDMS stack in a single repository
  • Dockerize the IDMS stack for deployment on ECS or Kubernetes
  • Upgrade core services to current versions

Additionally, this project will include transitioning from PWM for password management to COmanage.

Other than the transition from PWM, this project does not include any feature development. The aim is to rebuild the stack with feature parity to the current stack, which will facilitate feature development and maintenance in the future.

Overview of the current system

See also: Humanities Commons Authentication and Identity Management Stack

[Diagram: KC IDMS Stack - IDMS Stack (Miro board)]

The Knowledge Commons IDMS stack provides identity information and authorizes users for two services: The Knowledge Commons WordPress site and the Knowledge Commons Works repository. It consists of the following main services:

  • SATOSA (proxy.hcommons.org): Acts as a proxy between the KC applications and the KC identity providers. Also returns user information such as email address and organizational & group memberships to the applications upon successful authentication.
  • COmanage Registry (registry.hcommons.org): Manages a database that is the primary source of truth for user information. Handles user registration flows. Connects to external APIs to obtain organizational and group membership data. Provisions user data to LDAP.
  • Registry LDAP (registry.hcommons.org): Contains user and membership data. Provisioned by COmanage. Queried by SATOSA.
  • Shibboleth (registry.hcommons.org): Duplicates Simplesamlphp functionality, specifically for logging in to COmanage. Probably redundant and unnecessary.
  • Simplesamlphp (hc-idp.hcommons.org): Authenticates users. For "KC" (username / password) login, checks credentials against PWM LDAP. For other login methods (Google, ORCID, MLA, etc.), redirects to the external IdP. When the user successfully authenticates, redirects to SATOSA.
  • PWM (hc-idp.hcommons.org): Password manager. Provisions to PWM LDAP, which is queried by Simplesamlphp.
  • PWM LDAP (hc-idp.hcommons.org): Contains data used solely for logging in users with username and password.
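
As a rough illustration of the Simplesamlphp behaviour described in the list above, the following Python sketch picks where credentials are verified for a given login method. It is not taken from the existing configuration: the method keys, the `AuthDecision` type, and the `route_login` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass

# Hypothetical method keys. The issue names Google, ORCID, and MLA as examples
# of external login methods and "KC" as the username / password method.
EXTERNAL_METHODS = {"google", "orcid", "mla"}


@dataclass(frozen=True)
class AuthDecision:
    handler: str   # component that verifies the user's identity
    next_hop: str  # where the user is sent after authenticating


def route_login(method: str) -> AuthDecision:
    """Choose the credential check for a login method."""
    key = method.strip().lower()
    if key == "kc":
        # Username / password logins are checked against PWM LDAP.
        return AuthDecision(handler="pwm-ldap", next_hop="satosa")
    if key in EXTERNAL_METHODS:
        # Other methods are delegated to an external identity provider.
        return AuthDecision(handler=f"external-idp:{key}", next_hop="satosa")
    raise ValueError(f"unknown login method: {method!r}")
```

In both branches the user ends up back at SATOSA, which matches the description above; only the place where credentials are verified differs.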

User stories

User logs in to Knowledge Commons using Google

  1. User clicks "Login" on Knowledge Commons (WordPress) site
  2. Proxy checks the KC WordPress entity ID against its metadata file and confirms that KC WP is registered.
  3. User is redirected to SATOSA (proxy)
  4. User is redirected to discovery page (registry)
  5. User clicks "Google" login method
  6. User is redirected to Simplesamlphp Google gateway (google-gateway.hcommons.org)
  7. User authenticates with Google
  8. User is returned to Simplesamlphp (google-gateway.hcommons.org)
  9. User is redirected to SATOSA (proxy)
  10. SATOSA queries Registry LDAP for user data, matching against Google identifier
  11. SATOSA adds returned data to SAML assertion
  12. User is redirected to Knowledge Commons

(Process is essentially the same for Works)
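
The registration check in step 2 can be sketched as follows. This is a minimal illustration that assumes the metadata file is standard SAML 2.0 metadata XML; the sample document and its entity IDs are placeholders, and `registered_entity_ids` and `is_registered` are hypothetical helper names rather than part of SATOSA.

```python
import xml.etree.ElementTree as ET

# Namespace used by SAML 2.0 metadata documents.
MD_NS = "urn:oasis:names:tc:SAML:2.0:metadata"

# Illustrative metadata only; these entity IDs are placeholders.
SAMPLE_METADATA = """\
<md:EntitiesDescriptor xmlns:md="urn:oasis:names:tc:SAML:2.0:metadata">
  <md:EntityDescriptor entityID="https://wp.example.org/sp"/>
  <md:EntityDescriptor entityID="https://works.example.org/sp"/>
</md:EntitiesDescriptor>
"""


def registered_entity_ids(metadata_xml: str) -> set[str]:
    """Collect every entityID declared in a metadata document."""
    root = ET.fromstring(metadata_xml)
    tag = f"{{{MD_NS}}}EntityDescriptor"
    # iter() also visits the root element, so a file whose root is a single
    # EntityDescriptor is handled as well.
    return {el.attrib["entityID"] for el in root.iter(tag) if "entityID" in el.attrib}


def is_registered(entity_id: str, metadata_xml: str) -> bool:
    """Return True if the service provider's entity ID appears in the metadata."""
    return entity_id in registered_entity_ids(metadata_xml)
```

For example, `is_registered("https://wp.example.org/sp", SAMPLE_METADATA)` is true, while an entity ID that is absent from the metadata is rejected.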

User logs in to Knowledge Commons using "KC" (username / password) method

  1. User clicks "Login" on Knowledge Commons (WordPress) site
  2. Proxy checks the KC WordPress entity ID against its metadata file and confirms that KC WP is registered.
  3. User is redirected to SATOSA (proxy)
  4. User is redirected to discovery page (registry)
  5. User clicks "KC" login method
  6. User is redirected to Simplesamlphp HC gateway (hc-idp.hcommons.org)
  7. User supplies username and password to Simplesamlphp
  8. Simplesamlphp checks supplied credentials against those in PWM LDAP
  9. User is redirected to SATOSA (proxy)
  10. SATOSA queries Registry LDAP for user data, matching against hc-idp identifier
  11. SATOSA adds returned data to SAML assertion
  12. User is redirected to Knowledge Commons
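
Steps 10 and 11 (look up the user by identifier, then add the returned data to the assertion) might look roughly like this. The sketch uses an in-memory list as a stand-in for Registry LDAP, and the record layout, attribute names, and identifier format are all assumptions made for illustration. What SATOSA actually does when no record matches is not described in this issue; the sketch simply returns the incoming attributes unchanged.

```python
from typing import Optional

# Stand-in for Registry LDAP. The record layout and identifier strings are
# hypothetical; the real directory schema is not described in this issue.
DIRECTORY = [
    {
        "identifiers": {"hc-idp:ada", "google:1234"},
        "mail": "ada@example.org",
        "memberships": ["HC", "MLA"],
    },
]


def find_user(identifier: str, directory: list) -> Optional[dict]:
    """Return the first record that lists the given login identifier."""
    for record in directory:
        if identifier in record["identifiers"]:
            return record
    return None


def enrich_attributes(idp_attributes: dict, identifier: str, directory: list) -> dict:
    """Merge directory data into the attributes received from the IdP."""
    merged = dict(idp_attributes)  # copy so the caller's dict is not mutated
    record = find_user(identifier, directory)
    if record is None:
        # Assumption: with no match, pass the attributes through unchanged.
        return merged
    merged["mail"] = record["mail"]
    merged["memberships"] = list(record["memberships"])
    return merged
```

Because one record can carry several identifiers, the same lookup would serve both the Google and the "KC" flows; only the identifier being matched differs.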

User registers for Knowledge Commons

See also: Enrollment Flows Flow (Miro)

  1. User clicks "Register" on Knowledge Commons (WordPress) site
  2. User is shown some information about KC and registration
  3. User clicks "Register now"
  4. User is linked directly to an enrollment flow in COmanage (this is specific to each organization, so each "Register now" link on KC has to have a different URL).
  5. After successful enrollment (see "Enrollment Flows Flow" linked above), user is redirected to SATOSA
  6. SATOSA queries Registry LDAP for user data
  7. User is redirected back to Knowledge Commons
  8. WP recognizes that the user is authenticated but does not have an account. An account is created and the user is logged in.
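
The last step is a form of just-in-time account creation. The Python sketch below shows the idea only; the real behaviour lives in WordPress, and the `AccountStore` class, its fields, and the `mail` attribute name are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AccountStore:
    """Toy stand-in for the application's user table."""

    accounts: dict = field(default_factory=dict)

    def login(self, username: str, attributes: dict) -> tuple[dict, bool]:
        """Log in an authenticated user, creating the account on first visit.

        Returns the account and a flag saying whether it was just created.
        """
        created = username not in self.accounts
        if created:
            # The user is authenticated but unknown locally: create the account
            # from the attributes delivered with the assertion.
            self.accounts[username] = {
                "username": username,
                "email": attributes.get("mail"),
            }
        return self.accounts[username], created
```

A first call to `login` for a new username creates the account and reports `created` as true; later calls return the existing account with `created` false.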

Redesigned IDMS stack

[Diagram: HC IDMS Stack - IDMS Redesign (Miro board)]

Referencing the above diagram:

  • In contrast to the current stack, each service runs in its own container, rather than having services spread over multiple EC2 instances. This contrast is indicated by sharp-cornered rectangles in the diagram.
  • The most common "happy" path for user login has arrows color-coded in red for the outward journey before successful authentication, and green for the return journey after.
  • As in the current system, MSU login does not go through Simplesamlphp; instead, Okta communicates directly with SATOSA. This design decision might be revisited after the rebuild.
  • Application state exists in two locations: the database backing COmanage ("RDS") and the LDAP filesystem, which in production could be in EFS or EBS.
  • PWM has been removed, as discussed above. Password management is done by COmanage and provisioned to LDAP along with other user data. The PWM LDAP has been removed.
  • Shibboleth has been removed, and COmanage is expected to use SATOSA / Simplesamlphp for login. This might need to be revisited if there is some unanticipated requirement of COmanage that can't be satisfied by Simplesamlphp.
  • An additional service, "Discovery", exists in the redesigned system. In the current system, the discovery page is served by the same Apache webserver that serves COmanage. In the redesign, discovery pages are served from a separate container, likely using Nginx.
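
To make the points above easier to check against the eventual container definitions, the redesigned stack can be written down as a small inventory. This is only a sketch: the `Service` type and the lowercase service keys are hypothetical, while the state locations ("RDS", "EFS or EBS") and the removed services come from the list above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Service:
    name: str
    state_store: Optional[str] = None  # where persistent state lives, if any


# One entry per container in the redesigned stack.
REDESIGNED_STACK = (
    Service("satosa"),
    Service("comanage", state_store="RDS"),
    Service("openldap", state_store="EFS or EBS"),
    Service("simplesamlphp"),
    Service("discovery"),
)

# Services present in the current stack but dropped in the redesign.
REMOVED_SERVICES = {"pwm", "pwm-ldap", "shibboleth"}


def stateful_services(stack: tuple) -> list[str]:
    """Names of the services that hold application state."""
    return [s.name for s in stack if s.state_store is not None]
```

Listing the stateful services separately highlights which containers need persistent storage and backups; the remaining containers should be rebuildable from the repository alone.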

Plan of Work

  • Create IDMS stack repository
  • Copy configuration files and custom code to IDMS stack repository
  • Get services running in local dev containers
    • SATOSA
    • COmanage
    • OpenLDAP
    • Simplesamlphp
    • Discovery
  • Create local SP stand-in for testing (takes the place of WP / Works when doing local dev)
  • Create local test configuration (COmanage DB)
  • Deploy dev stack to AWS for testing
  • Testing
  • Deploy to production
@mikethicke mikethicke self-assigned this Dec 5, 2024
@mikethicke mikethicke added the wip work-in-progress (issue is incomplete) label Dec 5, 2024
@mikethicke mikethicke assigned tzouris and mikethicke and unassigned mikethicke and tzouris Dec 6, 2024

2 participants