-
Notifications
You must be signed in to change notification settings - Fork 119
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update for TestTokenQueries.py and TestWebserver.py #975
Conversation
Using str.format() to update the log message to be matched with the current conda environment name. Then using encode() to convert it back to a byte string. ------ Python byte strings https://stackoverflow.com/questions/6224052/what-is-the-difference-between-a-string-and-a-byte-string https://www.digitalocean.com/community/tutorials/python-string-encode-decode https://stackoverflow.com/questions/67680296/syntaxerror-f-string-expression-part-cannot-include-a-backslash Found encode() / decode() usage in bottle.py e-mission-server/emission/net/api/bottle.py utf-8 not needed as it is default.
…bles ------- A. TestWebserver.py Modified the code to have both approaches to read from config file as well as environment variables. The config files approach is based on the previous code logic that was used in this file to copy webserver.conf.sample as webserver.conf. e-mission@b20cc36 The difference is I'm taking a backup of the webserver.conf file if present. Then writing the test values to this file before running tests. In teardown(), the original webserver.conf is restored from the backup file. ----------- B. cfc_webapp.py WEBSERVER_NOT_FOUND_REDIRECT was missing in the config var_map passed to get_config. It was being read using config.get in the variable not_found_redirect. Hence added it.
Tests failed in CI/CD. On testing locally it was able to fetch the `emissiontest` name. But in the CI/CD, it is defaulting to use the `emission` environment name. I did test locally using the steps in the manual install workflow file to use the activate_tests.sh and setup_tests.sh scripts. Now testing locally by running the docker-compose.tests.yml which is used in the test-with-docker workflow.
The tests failed since despite using ENV vars, it wasn’t picking up value. This was because ENV vars are being deleted in setUp(). Added CONDA_DEFAULT_ENV to the skipped env vars. However, this approach of fetching ENV vars programmatically in the assert statement isn’t correct. This is because the code in get_config() itself uses ENV vars to set the ret_val which is the exact value as being fetched in setUp(). Thus the values will always match. ——— Then I thought of adding another log message for the `emission` environment name. While testing this I found that the 1st message isn’t being used in the normal situations when testing locally. This message is observed when the conf/storage/db.conf file is used instead of the sample file. But in normal operation, the server repo doesn’t have any conf files to begin with. Additionally, we now set the DB_HOST as db and WEB_SERVER_HOST as 0.0.0.0 in the Dockerfile by default. Hence the assert log message should also contain `Config file not found…` message like the other two messages. It should also have DB_HOST set to Db like the 3rd message.
The CI/CD tests had failed for both manual install and test-with-docker workflow. On looking at the test
So I went ahead and added CONDA_DEFAULT_ENV to the list to be excluded from deletion. Updated the config log messages and it all seemed okay. |
I followed an incorrect testing approach in But then I questioned myself, is this the right approach to test using Probably not. In this case, values will always match and I was testing a value with itself. |
For the changes to TestTokenQueries.py. I updated the 1st assertion log message to include the DB_HOST and the While testing, I found that the 1st message isn’t being used in the normal situations when testing locally which involves working in the This message is observed when the Additionally, we now set the Hence the assert log message should also contain This was confirmed on running the test locally by building the docker image and running container:
|
@MukuFlash03 Before diving into the details of the changes, I would like to understand the problem that these changes are trying to fix. The CI/CD pipeline was working fine without the changes. Why do we need the changes in the first place?
This is not incorrect, but I am not sure why it is required. The tests were working before this change. Can you clarify what is the specific use case you are trying to address here? What is the scenario in which this failed and why do you need to test that scenario?
I don't understand this comment. What is the exact value you are fetching? The token code doesn't use or rely on
But the Dockerfile is not involved in the local testing scenario.
EDIT: I see that this is documented internally. However, I still think it would be helpful to map each change to the specific problem that you are trying to solve. |
# Backwards_Compatibility method to read from config files | ||
self.webserver_conf_path = "conf/net/api/webserver.conf" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when will we ever have a webserver.conf
file that we need to back up in a test environment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when will we ever have a webserver.conf file that we need to back up in a test environment?
After clarification regarding the testing environments, I do not think we'll have this situation and won't need changes in TestWebserver.py
Note also that we currently support testing in three environments:
In a linux CI/CD environment, using docker compose
In a linux CI/CD environment, running from a local install
In a Mac environment, running from a local install
The keywords here are "test environment".
I had assumed that we might have conf files present in test environment in case someone wanted to check whether values were being read correctly from the config file.
If the test was run in the presence of webserver.conf file then it was failing and hence I thought it was something to fix.
But if it is true that we will not have any conf (non-sample) files in testing environments, then the changes related to TestWebserver.py
are not required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think we need to consider the use case of an arbitrary config file in automated tests. The whole point of the automated tests is that they run in a known, reproducible environment and generate known, reproducible results. This is particularly true for backwards compatible code, which will be removed soon.
Having said all that, more tests are not bad. As I said "This is not incorrect, but I am not sure why it is required.". If you can make the case for having this test, we can include it and remove it when we remove the backwards compat code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can skip the changes in the test since I wasn't adding any new tests. I had just added code that ensured we read from the config file (non-sample) if present. But it's clear to me now that this isn't necessary in a known testing environment.
@@ -168,24 +168,24 @@ def test_run_script_show(self): | |||
esdt.insert({'token':'y'}) | |||
esdt.insert({'token':'z'}) | |||
sp = subprocess.run(["python3", "bin/auth/insert_tokens.py", "--show"], capture_output=True) | |||
# The first message is displayed when we run tests locally | |||
# The first message is displayed when we run tests locally with the emission environment; we are now setting DB_HOST to `db` by default |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is true only in dockerized environments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I see the difference now in the testing environment I was using.
With regards to TestTokenQueries.py, the two issues I've observed are: Problem 1: Conda environment name differs:
|
With regards to Problem 2: Tests failing Seeing extra values regarding apple security, codex, cryprex, etcScenario: Testing locally directly on my MacOS without running any Docker container with the emission server code. Found some info about these These mention that it's a part of security measures that Apple releases. On running
To confirm if my Mac is the only one facing issues, I requested @nataliejschultz to run it on her Mac as well. Steps to reproduce / test
Mukul's Test logs:
Natalie's Test logs
|
Also, for the 2nd problem for I understand @shankari did not observe this when running it locally on her Mac:
Ignoring the So, the config message would have been this as per the comments in TestTokenQueries.py:
But I with the way ecbc.get_config works, if config files are not present, the
So, I'm guessing when Shankari ran the tests on her Mac, the config file would have been present and hence the Additionally, I'm assuming the Environment variables will not be set manually in a non-docker local setup. These are the relevant commits where these log messages were added to the TestTokenQueries.py (commit) and backwards_compat_config.py (commit) |
So summarizing config file related concern from these comments (1, 2) If valid assumption:
|
For the CONDA_DEFAULT_ENV discussion in this comment, what I meant by The following execution path is followed in TestTokenQueries.py for the two tests with the config messages assertions:
bin/auth/insert_tokens invokes code in
On comparing the flow of value of Step 3:
vs Step 4
Thus I'm not sure if this is the right approach to have assertion tests. |
Details in this PR comment: e-mission#975 (comment)
…NV variables" This reverts commit 1dc4275.
WEBSERVER_NOT_FOUND_REDIRECT was missing in the config var_map passed to get_config. It was being read using config.get in the variable not_found_redirect.
Correct, I had a local config file for manual testing which I should have removed before running the automated tests.
Correct. Not sure what the implication of that is. Are you seeing the values printed in the log messages? Or are you not seeing them but you expect to see them?
More tests are not bad. But tests to check custom configurations are not that useful. If I had a custom configuration in which I had loaded a bunch of data into the database, tests that count the number of entries (for example) will fail if the UUID is the same. If we can frame this testing as focused on checking the backwards compat code, it is likely to be more meaningful.
Agreed.
Yes, you need to fix this before we are done.
I don't understand this. You are not comparing Concretely, if you have an expected string "Config file not found... emissiontest....", but you are running the tests when the environment is emission, you can change the expected string to "Config file not found.... emission....". This will now match the string to test. I don't see what is wrong with that or how it is "testing with itself". |
The PATH values vary based on a user's local system and could have additional values which could fail the test in a local environment run without docker containers. Why do we even need the PATH, PYTHONPATH values in the assertion log messages. This is because we are printing the config in get_database.py which contains environment variables or config file values. Now, the question is why to print the entire set of values if we are only using DB_HOST, DB_RESULT_LIMIT? I am now printing only these values in the get_database.py. This has two benefits: - Allows to eliminate variable environment variables like PATH that might vary in local testing environments for every user. - Since PATH is no longer printed, helps avoid issue of 'apple security' values. - Helps reduce the number of log messages the 4 testing environments: local without docker, local with docker, CI/CD without docker, CI/CD with docker. The two docker environments have the same log message; similarly for the two non-docker environments. Thus we only need two log messages in the assertion tests.
Right, understood. My changes in TestWebserver.py aren't relevant, as I hadn't added any new tests but added code to read in from config file (non-sample) too if present.
Fixed in this latest commit, by not printing out the PATH and hence no need to handle the security values. Additionally, PATH might include additional paths and vary for each user in their local environment. For instance, Natalie's PATH value had
Right I understand.
|
emission/core/get_database.py
Outdated
url = config.get("DB_HOST", "localhost") | ||
result_limit = config.get("DB_RESULT_LIMIT", 250000) | ||
print("Retrieved config: {'DB_HOST': '%s', 'DB_RESULT_LIMIT': '%s'}" % (url, result_limit)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MukuFlash03 the problem with printing these values after the config.get
calls is that it is not sufficient for debugging. You can see the final value of the variables here, but you also see them on line 34. The point behind putting the "Retrieved config" before the config.get
is so that we could see the inputs from the environment before they were modified by the defaults.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Logging these values before using default values.
Notes in commit message.
With reference to this commit: e-mission#975 (comment) So we have two key things to solve: - Unnecessary variables like PATH, that may vary for each user to be removed (helps avoid "apple security" values). - Need to see ENV vars values before being modified by default values in config.get() lines. Two approaches: 1. Delete PATH, PYTHONPATH from the retrieved config. - This wouldn't help much since we still need to handle the scenario when db config environment values haven't been set. 2. Using another config dictionary before modifying with default values - If db config related keys exist, set it to the value from the retrieved config. - Else if key does not exist in retrieved config, which means config file not present AND environment variables not set, then set it to None. Note: - This is not the same as changing the retrieved config value as the original dictionary is still intact. - Additionally, in the case where config file not present AND environment variables not set, there are no values to before modifying with default values. So manually setting to None should be okay.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this has now taken 10 days and 10 commits and is not yet fixed. I am taking over.
db_config = {} | ||
for key in ["DB_HOST", "DB_RESULT_LIMIT"]: | ||
if key in config: | ||
db_config[key] = config.get(key) | ||
else: | ||
db_config[key] = None | ||
print("Retrieved config: %s" % db_config) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really don't like the approach of changing code to make the tests pass, particularly when this is not even critical to the functionality exercised by the failing test.
The test that is failing really needs to check whether the script is working properly or not.
To rephrase, the previous logs actually have other benefits in terms of logging - for example, they can tell us whether the variables came from the environment (there are other variables) or from the config file (there are not).
This may even be meaningful if this was affecting multiple tests.
But this is really only affecting one test, which doesn't even care about the values of the variables, and only cares about the operation of the script with various input options. So changing the code doesn't appear to be the correct approach to fix this.
Principled fix for the issue outlined in e-mission#975 The fundamental issue is that we check the output to validate that the script is behaving as expected. However, the script output can vary in different environments. We had originally handled this by having multiple expected output strings. But as we expand the scope of what we print out, the set of expected output strings expanded. Some of the strings also included paths, so had the username embedded, so would be hard to enumerate in the test. Those logs are also not critical to this test, which is focused on the response to various arguments passed in to the script. So we fix this by simply removing the parts that vary across environments, replacing them by `REDACTED`. As a bonus, this also allows us to collapse the previous two strings into one, and supports all three testing environments: (i) locally on laptop, (ii) locally in docker, and (iii) docker CI/CD (although (ii) is not really supported since it is not tested extensively). Testing done: ``` $ ./e-mission-py.bash emission/tests/storageTests/TestTokenQueries.py ---------------------------------------------------------------------- Ran 21 tests in 22.862s OK ```
Was causing an exception leading to "push not configured" message everytime irrespective of whether config files present or env variables configured. Code with error: ``` try: logging.info(f"Push configured for app {push_config.get('PUSH_SERVER_AUTH_TOKEN')} using platform {os.getenv('PUSH_PROVIDER')} with token {os.getenv('PUSH_SERVER_AUTH_TOKEN')[:10]}... of length {len(os.getenv('PUSH_SERVER_AUTH_TOKEN'))}") except: logging.warning("push service not configured, push notifications not supported") ```
Squash merging to avoid commit churn |
Made some fixes to these tests based on issues observed while testing the internal images.
A. TestTokenQueries.py
The tests (tests_run_script_empty() and test_run_script_show()) were failing since the conda environment name wasn't matching the standard environment name on running tests locally.
emissiontest
is used in the tests manual Github actions workflow whereasemission
is the standard one.The solution was to use the
CONDA_DEFAULT_ENV
value from the environment variables that will ensure that the config log messages used in the assert statements matches the current environment.B. TestWebserver.py
After a lot of failed testing and some different approaches, I decided to bring back some of the earlier logic that was used to handle the testing in this file by making copies of the config file. The commit for the earlier logic is here.
I am now taking a backup of
webserver.conf
(if exists), running tests, restoring from the backup file.The newer approach to read from environment variables entirely is also retained.
The testing was done with three scenarios similar to Shankari's checks in this commit.
No backup is taken as
webserver.conf
is not there in the first place.Test values used directly.
Conf file present, hence backup taken.
Tests value including
404_redirect
are stored inwebserver.conf
file.Restored file values show that
404_redirect
is absent since original backup is restored.Conf file present, hence backup taken.
Tests value including
404_redirect
are stored inwebserver.conf
file.Restored file values show that
404_redirect
is present since original backup is restored.404_redirect
URL is different from test values.