
Deploy to ec2/Manage OTP servers at the application level #463

Merged
landonreed merged 42 commits into dev from deploy-to-ec2 on Oct 9, 2019

Conversation

@landonreed (Contributor) commented Aug 7, 2019

Checklist

  • Appropriate branch selected (all PRs must first be merged to dev before they can be merged to master)
  • Any modified or new methods or classes have helpful JSDoc and code is thoroughly commented
  • The description lists all applicable issues this PR seeks to resolve
  • The description lists any configuration setting(s) that differ from the default settings
  • All tests and CI builds passing

Description

Note: this code was taken from catalogueglobal#256

This is the UI PR to accompany ibi-group/datatools-server#225. It moves the management of OTP servers from the project settings to the site admin settings and allows a server to be available either to all projects in an application or to a single project. This also resolves catalogueglobal#234, another deployment-related ticket.

This also fixes the merge project feeds job to conform to the standard approach of attaching just the ID of the related entity (the project, in this case) to the job JSON object rather than the whole project object, as sketched below.
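For illustration, the change looks roughly like this (a hedged sketch; the "type" value and field names are assumptions, not taken from this PR):

Before (whole project object embedded):

{
    "type" : "MERGE_PROJECT_FEEDS",
    "project" : { "id" : "abc123", "name" : "Example Project" }
}

After (just the ID):

{
    "type" : "MERGE_PROJECT_FEEDS",
    "projectId" : "abc123"
}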

To test out the actual deployment, @evansiroky, you can create an OTP server in the app at http://localhost:9966/admin/servers based on the following JSON, which I copied from my Mongo instance:

{
    "admin" : false,
    "instanceType" : "r4.xlarge",
    "internalUrl" : [],
    "name" : "IBI Test",
    "publicUrl" : "ibi-test-otp-828479082.us-east-1.elb.amazonaws.com",
    "targetGroupArn" : "arn:aws:elasticloadbalancing:us-east-1:739199387312:targetgroup/ibi-test-otp/c28d4b2ff45fe3e2",
    "projectId" : null,
    "s3Bucket" : "datatools-production"
}
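With "projectId" : null as above, the server is available to all projects. A project-specific server would instead set "projectId" to the ID of that project; the value below is purely hypothetical:

    "projectId" : "5d2f8e1b0c9a4b0001a1b2c3"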

@codecov-io commented Aug 8, 2019

Codecov Report

Merging #463 into dev will decrease coverage by 0.01%.
The diff coverage is 15.64%.

@@            Coverage Diff             @@
##              dev     #463      +/-   ##
==========================================
- Coverage   16.27%   16.26%   -0.02%     
==========================================
  Files         309      313       +4     
  Lines       15308    15739     +431     
  Branches     4666     4789     +123     
==========================================
+ Hits         2492     2560      +68     
- Misses      10954    11260     +306     
- Partials     1862     1919      +57
Flag               | Coverage Δ
#end_to_end_tests  | ?
#unit_tests        | 16.26% <15.64%> (-0.02%) ⬇️

Impacted Files                                    | Coverage Δ
lib/types/reducers.js                             | 0% <ø> (ø) ⬆️
lib/types/index.js                                | 0% <ø> (ø) ⬆️
lib/common/util/config.js                         | 75.38% <ø> (+7.52%) ⬆️
lib/public/components/PublicLandingPage.js        | 0% <ø> (ø) ⬆️
lib/admin/components/AdminPage.js                 | 0% <0%> (ø)
lib/admin/components/UserList.js                  | 0% <0%> (ø) ⬆️
lib/manager/containers/ActiveDeploymentViewer.js  | 66.66% <0%> (-13.34%) ⬇️
lib/common/containers/App.js                      | 0% <0%> (ø) ⬆️
lib/admin/components/OrganizationList.js          | 0% <0%> (ø) ⬆️
lib/admin/components/ServerSettings.js            | 0% <0%> (ø)
... and 29 more

Continue to review the full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 20fe8b5...4d0ef23.

<li>
Deployed to: {this._getServerLabel()}
</li>
<li>Using ELB?: {ec2Info ? 'Yes' : 'No (built graph over wire)'}</li>
Contributor:

Since everything below here is so dependent on using an ELB, I think it may be useful to have 3 distinct types of deployments:

  • deploy just the bundle to s3
  • build graph over wire deployment
  • ELB/EC2 deployment

In doing this, we can also modify the forms such that only the needed fields for each deployment type are shown.
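
To make the three types concrete, here is a minimal sketch of how a deployment type could gate which form fields are shown (the type names and mapping are hypothetical, not code from this PR; the field names come from the OtpServer JSON above):

// Hypothetical mapping from deployment type to the form fields it requires.
const DEPLOYMENT_TYPE_FIELDS = {
  S3_BUNDLE_ONLY: ['s3Bucket'],
  BUILD_GRAPH_OVER_WIRE: ['internalUrl', 'publicUrl'],
  ELB_EC2: ['s3Bucket', 'instanceType', 'targetGroupArn', 'publicUrl']
}

// The server settings form would then render only the relevant fields.
function visibleFields (deploymentType) {
  return DEPLOYMENT_TYPE_FIELDS[deploymentType] || []
}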

Contributor (Author):

These are not mutually exclusive types though. For example, you can build the graph over the wire and deploy to s3. Also, the S3 bucket field is necessary for ELB. At the moment this UI is essentially just limited to application admins (i.e., us), so while I think it is important to improve the UI of this form, I think that deserves a separate conversation at this point.

Contributor:

I think it would make things simpler if those 3 items were mutually exclusive. Anyways, this will be something to explore as we get feedback from clients.

@landonreed (Contributor, Author) commented:

I've made a number of changes/improvements. Thanks for the great feedback!

One thing you'll need to change on your end is updating the OtpServer field name in MongoDB from ec2Info.iamRoleArn -> ec2Info.iamInstanceProfileArn. Other than that, I think all you need to do is click Deploy and test out the changes!
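
For reference, that rename can be expressed in the mongo shell roughly as follows (a sketch; the collection name is an assumption, so adjust it for your instance):

// Collection name "OtpServer" is an assumption, not confirmed by this PR.
db.OtpServer.updateMany(
  {},
  { $rename: { 'ec2Info.iamRoleArn': 'ec2Info.iamInstanceProfileArn' } }
)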

@evansiroky (Contributor) commented:

While these changes are a step in the right direction and do provide raw information about what the ec2 instances are running, I still think we should rethink the design of deployments as noted here and here. Right now, when I go to view a deployment, the amount of screen real estate devoted to information about the active deployment is very small compared to the map and other items, which, if changed, have no effect on the active deployment. Furthermore, the raw information about the bundle is helpful, but the details about which feed versions were used are hard to retrace.

Also, in my testing of deploying 10 instances, basically every time I tried, there was some kind of failure. I still think there are opportunities to implement more detailed fault tolerance with automatic retrying, and better progress reporting, so that it is easier for a user to understand what's going on and debug what went wrong. Ideally, we'd have more detailed logs that can be automatically parsed to pinpoint where failures occurred, so that we know where to focus our efforts and can detect any regressions.
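
As one illustration of the automatic retrying suggested above (a hypothetical sketch, not code from this PR), a flaky step such as an upload could be wrapped in a helper like this:

// Hypothetical helper: retry a flaky async step (e.g., an S3 upload)
// with exponential backoff before giving up.
async function withRetries (task, maxAttempts = 3) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await task()
    } catch (err) {
      if (attempt === maxAttempts) throw err
      await new Promise(resolve => setTimeout(resolve, 1000 * 2 ** attempt))
    }
  }
}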

Also, there's no way to add an extra instance to a deployment, which I can imagine might be desired if not all of the instances start up correctly. If we're trying to get something shipped fast, then we can probably merge this once the Travis builds pass, but we should be prepared to tell clients that things aren't perfect and train them a bit on how the UI works.

@landonreed (Contributor, Author) commented:

Can you describe more fully all of the errors you encountered when starting up the 10 instances? Also, what commits/feeds were you using? Yesterday, I pushed some more commits due to some issues I faced with the FDOT instance that may fix a few things.

I think our goal for this PR should be to eliminate the failing instances you're encountering during startup. The purpose of this PR is to introduce functionality that dramatically reduces the risk of downtime from deploying with build graph over wire. I understand your interest in additional features like starting up or terminating single instances, but my perspective remains that, for now, that can be handled entirely by re-deploying to the ELB. Sure, that may change after we've used this feature for a bit, but I don't want to over-speculate on what needs to be in here and spend time building unneeded features.

@evansiroky (Contributor) commented:

Each time, there was a different error, unfortunately. But I just tried again this morning and everything went smoothly. Most of the time, the errors seemed to be interruptions in uploading or downloading stuff. But based on my experience, I think we should be prepared for our clients to also see various random errors.

I still also think the UI could use a redesign to have more rigidly defined deployment types and to better distinguish between active deployments and deployment templates, but if we really want to get this shipped and improve those items later then we can merge.

@evansiroky (Contributor) commented:

Travis builds need to pass, but this should be good to go after that.

@evansiroky evansiroky assigned landonreed and unassigned evansiroky Oct 2, 2019
@landonreed landonreed assigned evansiroky and unassigned landonreed Oct 8, 2019
@landonreed (Contributor, Author) commented:

@evansiroky, any ideas on why the snapshots are failing here but passing on our local machines?

@landonreed landonreed merged commit 610cabc into dev Oct 9, 2019
@landonreed landonreed deleted the deploy-to-ec2 branch October 9, 2019 21:37
@landonreed (Contributor, Author) commented:

🎉 This PR is included in version 4.1.0 🎉

The release is available as a GitHub release.

Your semantic-release bot 📦🚀

Successfully merging this pull request may close these issues: Deploy to R5.

3 participants