webapi.source JDBC string results in connection refused in develop branch but not in main branch #93

Open
RomainTching opened this issue Jun 9, 2023 · 11 comments

Comments

@RomainTching

Hi!
In order to try out the openldap implementation for user management, I switched to the develop branch and managed to implement it (again, thanks for the help!).
But now I have noticed a few inconsistent behaviours from https://<hostname>/WebAPI/source/sources and https://<hostname>/WebAPI/source/refresh, depending on the Broadsea setup:

  • When using the main branch (and HTTPS protocol) I get the list of sources as expected, including the default EUNOMIA and the source I added manually to webapi.source and webapi.source_daimon
  • When switching to the develop branch BUT not configuring the OpenLDAP server security option, I get an empty page, and checking docker logs ohdsi-webapi shows the error "Connection to localhost:5432 refused...". I assume I can't use localhost and need to use the actual host IP, and maybe even change some postgres configuration, as suggested here (not tested yet). But removing my source data (leaving only the Eunomia entries) and restarting the container at least shows the Eunomia entries in https://<hostname>/WebAPI/source/refresh and https://<hostname>/WebAPI/source/sources
  • When setting up the openldap solution nothing shows up, not even Eunomia

So on one hand I wanted to point out this difference between the main and develop branches (no error message in main) to make sure you were aware of it and it is an expected behaviour.
On the other, I was hoping to clarify why not even Eunomia shows up when the OpenLDAP is configured and how to fix that, since then I can't see the sources in ATLAS even when I do figure out exactly what to put as hostname in the JDBC...

@alondhe
Collaborator

alondhe commented Jun 9, 2023

With security enabled, those "source" endpoints aren't supposed to be available just straightaway. They're supposed to only be available with a session token from your ldap.

The way to verify sources are working would be through the Atlas Config GUI, or by using an API call with a session token, such as the method provided in ROhdsiWebApi.

@RomainTching
Author

But then once you are connected to ATLAS through LDAP, do you not have a valid session token that allows access to the sources? Does this also need to be configured in Apache Directory studio?

Interestingly, I notice that when I set the JDBC string host to broadsea-atlasdb (source IBD in the screenshot below) instead of localhost or the IP of the host where the CDM database is, the check is validated. Could you also clarify this? Should I use broadsea-atlasdb for all sources, even though the database is not in any of the Broadsea dockers?
[Screenshot: atlas_sources_check]

Finally, I was also wondering, does this affect the HADES/RStudio server since it's on the same network? Do I need to somehow (maybe using ROhdsiWebApi as you suggest) retrieve a LDAP token to query the database from RStudio, even though it does not use the LDAP authentication?

I'm really grateful for your help!

@alondhe
Collaborator

alondhe commented Jun 12, 2023

> But then once you are connected to ATLAS through LDAP, do you not have a valid session token that allows access to the sources? Does this also need to be configured in Apache Directory studio?

What I mean is, going via web browser to the /source API endpoints is not valid because the browser doesn't have the session token on its own. That's expected because with security enabled, we should not allow any users to refresh the sources as that is an admin task. To use those endpoints, you'd need to use Atlas and log in (and be an admin), or use a tool like Postman or some other programmatic method (e.g. R, Python, etc) to log in (get a session token) and then invoke the API endpoints.

TL;DR: going to the /source endpoints with your browser after enabling security is not supposed to work.
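
As a rough illustration of the "programmatic method" mentioned above, here is a minimal Python sketch that logs in, picks up the session token, and then calls /source/sources. It is not taken from Broadsea or WebAPI documentation: the login route `/user/login/ldap`, the form fields `login` and `password`, and the `Bearer` response header are assumptions modelled on how ROhdsiWebApi appears to authenticate, and the helper names are hypothetical. Check them against your WebAPI version before relying on this.

```python
import json
import urllib.parse
import urllib.request


def build_login_request(base_url: str, login: str, password: str) -> urllib.request.Request:
    """Build the POST that asks WebAPI for a session token via LDAP."""
    # ASSUMPTION: the LDAP login route is <base_url>/user/login/ldap and it
    # accepts form-encoded 'login' and 'password' fields.
    body = urllib.parse.urlencode({"login": login, "password": password}).encode()
    url = base_url.rstrip("/") + "/user/login/ldap"
    return urllib.request.Request(url, data=body, method="POST")


def auth_header_from(response_headers) -> dict:
    """Turn the login response headers into an Authorization header."""
    # ASSUMPTION: the token is returned in a response header named 'Bearer'.
    token = response_headers.get("Bearer")
    if not token:
        raise RuntimeError("login response carried no Bearer header")
    return {"Authorization": f"Bearer {token}"}


def list_sources(base_url: str, login: str, password: str):
    """Log in, then fetch /source/sources with the session token attached."""
    with urllib.request.urlopen(build_login_request(base_url, login, password)) as resp:
        headers = auth_header_from(resp.headers)
    req = urllib.request.Request(base_url.rstrip("/") + "/source/sources", headers=headers)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Something like `list_sources("https://<hostname>/WebAPI", "<ldap user>", "<ldap password>")` would then return the source list only if the login succeeds and the account is allowed to read it, which is the behaviour a plain browser request cannot reproduce.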

> Interestingly, I notice that when I set the JDBC string host to broadsea-atlasdb (source IBD in the screenshot below) instead of localhost or the IP of the host where the CDM database is, the check is validated. Could you also clarify this? Should I use broadsea-atlasdb for all sources, even though the database is not in any of the Broadsea dockers?

Can you walk me through where the 3 non-Eunomia CDMs are hosted?

@RomainTching
Author

> What I mean is, going via web browser to the /source API endpoints is not valid because the browser doesn't have the session token on its own. That's expected because with security enabled, we should not allow any users to refresh the sources as that is an admin task. To use those endpoints, you'd need to use Atlas and log in (and be an admin), or use a tool like Postman or some other programmatic method (e.g. R, Python, etc) to log in (get a session token) and then invoke the API endpoints.

Ah ok, sorry I misunderstood what you meant. Yeah that makes sense.

> Can you walk me through where the 3 non-Eunomia CDMs are hosted?

They are all hosted in different schemas of a postgres database on a server (the database is not in a container). The Broadsea containers are on that same server. The difference between the three is that I tried different JDBC strings in webapi.source (I'm using the broadsea-atlasdb container for webapi) since I don't have direct access to the web application and need someone else to connect and try to see if connection works. Here is the JDBC string for the three, with redacted username and password for the database:

jdbc:postgresql://broadsea-atlasdb:5432/postgres?user=<default>&password=<default> # IBD, identical to the default EUNOMIA entry
jdbc:postgresql://<resolvable hostname>:5432/charite?user=<db user>&password=<user password> # Diabetes
jdbc:postgresql://localhost:5432/charite?user=<db user>&password=<user password> # Cancer
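
To make the difference between these three strings easier to see, here is a small Python sketch that pulls the host out of a PostgreSQL JDBC string and flags loopback hosts. The function names are hypothetical and this is only a diagnostic aid, not something Broadsea or WebAPI provides. The point it illustrates is that a loopback host such as localhost is resolved by whichever machine or container opens the connection.

```python
import ipaddress
from urllib.parse import urlsplit


def jdbc_host(jdbc_url: str) -> tuple[str, int]:
    """Return (host, port) from a jdbc:postgresql:// connection string."""
    if not jdbc_url.startswith("jdbc:postgresql://"):
        raise ValueError("not a PostgreSQL JDBC URL")
    # Drop the 'jdbc:' prefix so the rest parses like an ordinary URL.
    parts = urlsplit(jdbc_url[len("jdbc:"):])
    # 5432 is the PostgreSQL default when the string names no port.
    return parts.hostname, parts.port or 5432


def is_loopback(host: str) -> bool:
    """True for 'localhost' and loopback IP literals such as 127.0.0.1."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # a DNS or Docker service name, not an IP literal
```

Under this reading, the Cancer string points at whatever is listening on port 5432 inside the container that opens the connection, while the IBD string names the broadsea-atlasdb container itself, which may be why its check passes even though the CDM data lives elsewhere.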

@alondhe
Collaborator

alondhe commented Jun 12, 2023

Is your Broadsea Host env variable set to 127.0.0.1? Perhaps that is the reason for the issue with the cancer one being on localhost.

For the diabetes one hosted on another server, I think it might be good to verify that there's no firewall / AWS sec group rules blocking it.

@RomainTching
Author

No, the Broadsea Host is set to the <resolvable hostname>. I'll look into potential firewall issues.
But then is there any explanation why the connection for the IBD one works? Again, all three sources are in the same database on the host, just in different schemas; I did not load any of them into the containers. The JDBC strings differ only because I wanted to try out different combinations and see if I could find one that works.

@RomainTching
Author

Also, I forgot to mention, but only the Cancer entry shows up in the Data Sources tab (not even Eunomia, which I haven't touched).
[Screenshot: atlas_data_source_only_cancer]

I also tried checking the logs from ohdsi-webapi and here are errors I found for each of the three sources:

  • Cancer: Connection to localhost:5432 refused. Check that the hostname and port are correct and that the postmaster is accepting TCP/IP connections.
  • Diabetes: FATAL: no pg_hba.conf entry for host "<redacted, container IP>", user "<db user>", database "charite", SSL off
  • IBD: I could not find an error message directly related to IBD (although there was one complaining about missing Achilles results for it, which makes sense since Achilles has not been run yet). There were, however, error lines such as jobName=warming cache: EUNOMIA,IBD, source_key=EUNOMIA,IBD, source_id=-1 where EUNOMIA and IBD seemed to be combined as one ID, and I have no clue where that comes from

This makes me wonder whether I misconfigured the pg_hba.conf file, but I've also tried several combinations there without success.

@alondhe
Collaborator

alondhe commented Jun 15, 2023

So it sounds like IBD is fine, just need achilles tables. That cache warming line, was there an actual error stated? I just see the job description.

For Cancer, I think resolving "localhost" seems to be an issue for the ohdsi-webapi container, it must be trying to use localhost within the container. If 127.0.0.1 is not set as your BROADSEA_HOST, maybe try using that for the host name?
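
One way to test this idea is a plain TCP check run from inside a container on the Broadsea network, assuming a Python interpreter is available there (that is an assumption, not something the images are known to ship). The helper name below is hypothetical, and the check only tells you whether the port can be reached, not whether PostgreSQL will accept the login.

```python
import socket


def can_connect(host: str, port: int = 5432, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and names that do not resolve.
        return False
```

If `can_connect("localhost")` is False inside the container while the same call succeeds on the host, then localhost is being resolved within the container, which would match the "Connection to localhost:5432 refused" message.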

For Diabetes, I'm not clear on the error here. If you use the develop branch (and update to the latest version of it), you can set the WebAPI log levels to get more details (again, only with latest commits from develop branch): https://github.com/OHDSI/Broadsea/blob/develop/.env#L33-L35

@alondhe
Collaborator

alondhe commented Jun 28, 2023

Hi @RomainTching - any luck?

@RomainTching
Author

Hi @alondhe, sorry I got caught up in testing and setting other things up. But yes, a little over a week ago I finally managed to establish the connection. The issue was that for the host I needed to use the IPv4 address from the "docker bridge" entry of the network. For people like me who struggle with these network configuration notions:

  1. Run ifconfig on the host
  2. Identify the docker0: entry and note the inet IPv4 address (from what I understand online, it is 172.17.0.1 by default). That is the host address to use in connection parameters from within the containers, including the JDBC connection string
  3. If you are restricting connection access to the database in the pg_hba.conf file (when using a postgres database for the data), make sure the address range of the containers also has an entry there (it can be found by connecting to one of the containers, e.g. with docker exec -it broadsea-atlasdb /bin/bash, and running ifconfig from there)
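
For step 3, a small Python sketch can check whether a container's address falls inside the range you are about to allow and print a matching pg_hba.conf line. The function names, the database and user values, and the choice of auth method are placeholders; only the record layout (host, database, user, address, method) is PostgreSQL's own.

```python
import ipaddress


def covers(subnet: str, client_ip: str) -> bool:
    """True if client_ip falls inside subnet (given in CIDR notation)."""
    network = ipaddress.ip_network(subnet, strict=False)
    return ipaddress.ip_address(client_ip) in network


def pg_hba_line(database: str, user: str, subnet: str,
                method: str = "scram-sha-256") -> str:
    """Format a pg_hba.conf 'host' record for a client subnet."""
    # strict=False lets a host address such as 172.17.0.1/16 be normalised
    # to its network address, 172.17.0.0/16.
    network = ipaddress.ip_network(subnet, strict=False)
    return f"host    {database}    {user}    {network}    {method}"
```

For example, `covers("172.17.0.0/16", "172.18.0.3")` is False, so a container on a separate Docker network with an address like that would still be rejected by an entry written for the default bridge range. PostgreSQL also has to listen on the bridge address (listen_addresses in postgresql.conf) for the connection to get as far as the pg_hba.conf check.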

Now things seem to work fine, thanks a lot for all your help and patience!

@RomainTching
Author

> Hi @alondhe, sorry I got caught up in testing and setting other things up. But yes, a little over a week ago I finally managed to establish the connection.

Actually I stand corrected, I managed to query the database with that JDBC string from the HADES/RStudio container, but with that same string used in webapi.source access is still denied in ATLAS and I can't see the sources in the Data Sources tab...
