Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

85 - 86 - 87 - Fix medic-users-meta views and make compatible with daily telemetry #91

Merged
merged 6 commits into from
May 21, 2021

Conversation

mrsarm
Copy link
Contributor

@mrsarm mrsarm commented May 18, 2021

Deprecate old medic-users-meta views and create new ones fixing bugs and making compatible with upcoming daily telemetry.

Issues: #85 , #86 , #87

Changes:

Affected views

  • Feedback view: useview_feedback
  • Telemetry view: useview_telemetry

CC @derickl @garethbowen @kitsao

mrsarm added 2 commits May 17, 2021 16:26
…and making compatible with upcoming daily telemetry

- The migration script is backward compatible with the old script introduced in v3.2.x, running different scripts depending if the deprecated scripts were executed before (in the end the last SQL script executed is the same)
- Remove the indexes for both views (telemetry and feedback) based on `user_name` and `period_start` of the record that could cause collisions, instead the `uuid` of the records are used as index to allow concurrent refresh of the views
- The column `period_start` from the telemetry view has the same `DATE` type, but the day component is parsed from the JSON `metadata.day` field if present, otherwise defaulted to 1 as it was before to maintain backward compatibility while accepting daily telemetry records from the upcoming CHT release
- Fix bug that causes records being parsed with months 0-indexed as 1-indexed and vice versa
@mrsarm mrsarm requested a review from dianabarsan May 18, 2021 02:43
@mrsarm mrsarm mentioned this pull request May 18, 2021
@mrsarm
Copy link
Contributor Author

mrsarm commented May 18, 2021

BTW, I kept the file 202102191153.do.54-create-couchdb-users-meta-table.sql in the deprecated/ folder but is not executed anymore either when is executed against a new Porgres DB or an existent one, I just wanted to keep it for reference due the immutability nature of the SQL scripts, but maybe is better to just remove it from HEAD and keep the copy only in our git history.

Copy link
Member

@dianabarsan dianabarsan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for picking this. I have a question inline.

// The first migration that contains bugs was executed earlier,
// the migrations under ./migrations/patch are executed to drop
// the views without dropping the tables with the data
const postgratorPatch = new Postgrator({
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I follow. Maybe we don't have to drop the table. Just the materialized view and re-create it. If that's possible, I'd rather do that and keep the code clean without a bunch of hardcoded stuff relating to migration dates and so on.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I follow. Maybe we don't have to drop the table.

Yes, actually the table were not dropped, only the views (and its dependents indexes that were also part of the problem).

I'd rather do that and keep the code clean without a bunch of hardcoded stuff relating to migration dates and so on.

Yes, my fault, I rolled back the changes and added just the changes needed in one script. For some reason testing different scenarios: - installing with clean DBs - installing with existent DB running last stable version - same but the previous stable version that don't have the meta views... and in some of the tests I got an error like the migration couldn't run because previous data already imported, but I guess I misinterpreted the results.

I tested again with the latest changes all the possible scenarios and it worked. Also after run the migrations for the first time, I tried stopping the the process, adding and removing records in Couch, and restarting the process did refresh not just the data but the materialized vires (I could wait also the process to re-run without stopping).

WHERE
doc->>'type'='feedback';

CREATE UNIQUE INDEX idx_useview_feedback_uuid ON useview_feedback(uuid); --> Only to allow the refresh of the view CONCURRENTLY
Copy link
Member

@dianabarsan dianabarsan May 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to this thread, app_services opted to keep the old indexes, without the unique constraint for querying purposes: https://medic.slack.com/archives/C0813CQRX/p1621248880109500?thread_ts=1621245953.105300&cid=C0813CQRX
Same for telemetry.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, added again without the unique constraint.

--
-- The views are created again at ../202105171933.do.86.userMetaViews.sql

DROP MATERIALIZED VIEW IF EXISTS useview_feedback;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we just run this patch for everyone and remove all the conditional code in the index file?
The IF EXISTS should cover situations where this didn't run right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@mrsarm mrsarm requested a review from dianabarsan May 18, 2021 22:57
@mrsarm
Copy link
Contributor Author

mrsarm commented May 18, 2021

Updated description of the PR with the changes implemented.

Copy link
Member

@dianabarsan dianabarsan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work. Let's hope this solves some of the current couch2pg problems.

doc#>>'{metadata,year}', --> year
CASE --> month of the year
WHEN
string_to_array(substring(doc#>>'{metadata,versions,app}' FROM '(\d+.\d+.\d+)'),'.')::int[] < '{3,8,0}'::int[]
Copy link
Member

@dianabarsan dianabarsan May 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this wasn't your addition.
Converting semver to int for comparing won't always yield correct results, for example : 3.13.3 compared to 3.7.12.
Let's hope this won't bite us later on, hopefully we won't have > 10 patch releases for a single minor ever.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 yes in fact I couldn't test this properly in my local environment, the field metadata.versions.app is set to "unknown" in CHT 3.10 and 3.11, either if the record comes from desktop or mobile (at least in my local tests), but for those cases the result ends up with months 1-indexed.

I'll add one more condition: now that in 3.12+ we will add the metadata.day field, at least for these cases the month will be always 1-indexed, otherwise the condition above will be used.

Copy link
Member

@dianabarsan dianabarsan May 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

metadata.versions.app is set to "unknown"

That's because you're not deploying via horticulturalist.... so your ddoc is missing deploy_info

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, didn't know that. The change to compute the month was introduced in 3.2.0 with errors and @derickl provided the patch because the condition was the opposite to the one needed as you also pointed out. I'm not sure what value to expect there , but I guess it is a string value that is cast by that SELECT clause. In the case you mention : "app": "3.13.3", the result from the expression seems to be right. Here is the query with that value that I tested in a psql session:

postgres=# SELECT string_to_array(substring('3.13.3' FROM '(\d+.\d+.\d+)'),'.')::int[] < '{3,8,0}'::int[];
 ?column? 
----------
 f
(1 row)

I also tested it locally editing records in my CouchDB with the values "3.13.3" for one record, and another with "3.7.1", and in both cases I got the result expected in Postgres after run the process. Also testing the "unknown" value and no value at, the results are the expected:

postgres=# SELECT CASE WHEN string_to_array(substring('unknown' FROM '(\d+.\d+.\d+)'),'.')::int[] < '{3,8,0}'::int[] THEN TRUE ELSE FALSE END;
 case 
------
 f
(1 row)
postgres=# SELECT CASE WHEN string_to_array(substring(NULL FROM '(\d+.\d+.\d+)'),'.')::int[] < '{3,8,0}'::int[] THEN TRUE ELSE FALSE END;
 case 
------
 f
(1 row)

So I'm not going to add that extra condition checking the day of the month after all.

@mrsarm mrsarm changed the title 86 - 87 - Fix medic-users-meta views and make compatible with daily telemetry 85 - 86 - 87 - Fix medic-users-meta views and make compatible with daily telemetry May 19, 2021
@derickl
Copy link
Member

derickl commented May 21, 2021

@mrsarm it's probably time we consider introducing this into couch2pg to avoid a similar aftermath with DROP CASCADE (in case it needs to happen again) - with this script we'd have logged and recreated the dependencies.

cc @kennsippell @michaelkohn @yrimal

@mrsarm
Copy link
Contributor Author

mrsarm commented May 21, 2021

@mrsarm it's probably time we consider introducing this into couch2pg to avoid a similar aftermath with DROP CASCADE (in case it needs to happen again) - with this script we'd have logged and recreated the dependencies.

cc @kennsippell @michaelkohn @yrimal

@derickl we were discussing with members of the product team and from other teams about the recent incidents, and we don't have yet agree exactly what are the root causes and the final solution, but I created some tickets with my findings and proposals, and next week we will start a dipper discussion about it ( @njogz is also working on this).

The errors I found and the way I think we can prevent them are described here:

But beyond the technical problems we have, let me define the problem with an hypothetical scenario that can help to understand the root of the problem better: image some teams start to help in the evolution of the code of the CHT, we use Git, that allows to keep track of the incremental changes we make to the source code, and despite Git is a "distributed" version control, in the end anyone that use Git centralize the management of code in a single source of truth: the Github repo.

But imagine that some teams in this hypothetical scenario wants to use Google Drive instead of Git / Github, and other just local versions from their desktop computers. They do make changes, and deploy these changes in prod environments. At some point somebody wants to also deploy cool features added to the CHT but available in the Github repo, and doing so it drops all the changes deployed before. Now we want to rollback the changes, but we are not sure what was deployed, and whether all the previous content was backed up in one of the other sources.

Unfortunately this hypothetical case is not so different of how we manage the changes in the PG databases, we have more than one source of truth , and they are not in sync because are different management systems.

Adding the scripts pointed above will add more sources of truth, and we need only one. Moreover, this script may rollback changes made by the last script that were applied to objects that because were dependents of the object targeted, are rolled back but turns out these views were also modified in the last script executed after delete them, therefore we may be "restoring" the DB to an inconsistent state.

Right now there is no fault on using external scripts to make modifications to the Postgres databases, because we never have provided a way to centralize the changes and allow custom made modifications for each project that can keep in sync with the "main" scripts that are in this repo, that is way I created #93 as a proposal to address this (although better ideas are welcome).

Also as one of the other tickets points out, not handle these changes in a transactional way is a waste of one of the most powerful features that Postgres provide: ACID transaction even for schema changes (other DBs like MySQL don't fully support it).

@mrsarm mrsarm merged commit 84b4047 into master May 21, 2021
@mrsarm mrsarm mentioned this pull request May 21, 2021
@ngaruko ngaruko requested review from ngaruko and removed request for ngaruko May 23, 2021 22:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants