Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deploy OTP to AWS load balancer and manage OTP servers in separate collection (and misc other fixes) #225

Merged
merged 50 commits into from
Oct 9, 2019

Conversation

landonreed
Copy link
Contributor

@landonreed landonreed commented Aug 8, 2019

Checklist

  • Appropriate branch selected (all PRs must first be merged to dev before they can be merged to master)
  • Any modified or new methods or classes have helpful JavaDoc and code is thoroughly commented
  • The description lists all applicable issues this PR seeks to resolve
  • The description lists any configuration setting(s) that differ from the default settings
  • All tests and CI builds passing

Description

Note: this contains code originally contained in catalogueglobal#124

We have come to realize that the way we deploy new graphs to OTP (making a "build over the wire" request to existing EC2 servers) isn't reliable at all. This PR in conjunction with a UI PR adds to Data Tools the option to set up a "server" that targets a load balancer (rather than an existing EC2 server). It will:

  • spin up a new EC2 instance,
  • have the user script build the graph,
  • optionally upload the graph to S3, and finally
  • run OTP.

Another important change here that is related to deployments is to move the OtpServer objects stored in the Project POJO to their own MongoDB collection. Now, these objects are tied to projects by projectId and fetched via a JsonProperty-annotated getter method. This additionally allows for one of these server targets to be tied to a project or (if projectId is null) available across all projects in the application.

Finally, this PR does some other housekeeping:

fixes catalogueglobal#120
removes unused/legacy snapshot classes
fixes catalogueglobal#110
fixes catalogueglobal/datatools-ui#234

Notes on configuration

Instructions for setting up EC2 auto deploy:

1. Create a new application load balancer in the desired VPC.
2. Create a target group for the load balancer (arn is defined in OTP server)
3. Ensure you have an AMI with the appropriate software installed and nginx configuration (set at `deployment.ec2.default_ami`).
4. Create an OTP Server at http://localhost:9966/admin/servers with `Use Elastic Load Balancer (ELB)?` checked. Make sure to add required fields (Subnet ID, Security Group ID, IAM Role ARN, Key file name (without .pem), Target Group ARN) to the server.
6. Deploy a feed to your new server!

Deployment module in server.yml should look like:

  deployment:
    enabled: true
    ec2:
      enabled: true
      default_ami: ami-041ee0ca5cd75f7d7

@codecov-io
Copy link

codecov-io commented Aug 8, 2019

Codecov Report

Merging #225 into dev will decrease coverage by 1.1%.
The diff coverage is 3.32%.

Impacted file tree graph

@@             Coverage Diff              @@
##                dev     #225      +/-   ##
============================================
- Coverage     19.83%   18.72%   -1.11%     
  Complexity      361      361              
============================================
  Files           149      150       +1     
  Lines          7462     7963     +501     
  Branches        996     1076      +80     
============================================
+ Hits           1480     1491      +11     
- Misses         5850     6339     +489     
- Partials        132      133       +1
Flag Coverage Δ Complexity Δ
#unit_tests 18.72% <3.32%> (-1.11%) 361 <6> (ø)
Impacted Files Coverage Δ Complexity Δ
...atools/manager/controllers/api/UserController.java 39.35% <ø> (ø) 16 <0> (ø) ⬇️
...com/conveyal/datatools/editor/models/Snapshot.java 0% <ø> (ø) 0 <0> (ø) ⬇️
...veyal/datatools/manager/persistence/FeedStore.java 26.35% <ø> (ø) 13 <0> (ø) ⬇️
...ls/manager/jobs/NotifyUsersForSubscriptionJob.java 0% <ø> (ø) 0 <0> (ø) ⬇️
...ools/manager/controllers/api/StatusController.java 13.04% <ø> (ø) 2 <0> (ø) ⬇️
...onveyal/datatools/manager/models/Organization.java 0% <ø> (ø) 0 <0> (ø) ⬇️
...om/conveyal/datatools/manager/models/Snapshot.java 0% <0%> (ø) 0 <0> (ø) ⬇️
...m/conveyal/datatools/manager/models/OtpServer.java 0% <0%> (ø) 0 <0> (ø) ⬇️
...ols/manager/controllers/api/ProjectController.java 12.5% <0%> (+0.19%) 2 <0> (ø) ⬇️
...com/conveyal/datatools/manager/jobs/DeployJob.java 0% <0%> (ø) 0 <0> (ø) ⬇️
... and 24 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 568af07...240a6e0. Read the comment docs.

if (!deployment.r5) {
lines.add(String.format("aws s3 --region us-east-1 cp %s/%s %s ", routerDir, OTP_GRAPH_FILENAME, getS3GraphURI()));
String s3BuildLogPath = joinToS3FolderURI(getBuildLogFilename());
lines.add(String.format("aws s3 --region us-east-1 cp $BUILDLOGFILE %s ", s3BuildLogPath));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may make more sense to upload the logs first in case the graph wasn't built. If the graph wasn't built, the script might error out on a file not found error perhaps.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will that actually happen? I suppose I can make the change regardless, but I don't think the execution of any line is dependent on the previous line, is it?

@evansiroky
Copy link
Contributor

See other comments here.

* Construct the user data script (as string) that should be provided to the AMI and executed upon EC2 instance
* startup.
*/
private String constructUserData(boolean graphAlreadyBuilt) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm increasingly hesitant about the design of this method. I think this is going to limit us moving forward to be writing bash in java strings. Something I would really like to see is more detailed reporting and logging of the progress of building and loading a graph and automatically retrying if something fails that can be retried. We could attempt to do this with bash, but it would be ugly. If we're trying to ship this quick, then fine I can live with it, but I think we should be making a separate startup script specifically for this where we can do more detailed things. Furthermore, right now this is basically using Amazon s3 as a database of sorts for storing various information when we already have a mongo and postgres instance. Maybe it would make sense for the startup script to directly edit the mongo db or maybe make queries to datatools-server. Either way, we should be thinking about this feature long-term and about what the easiest way to create maintainable code will be.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the comments. Perhaps there is a better way to handle what gets run in user data, but at some level it is going to require running some bash (unless we opt to include some kind of startup script on the AMI or write a script to a file that gets fetched by the instance). Because of this, I think our best option for now is to keep everything contained in a single place -- this bash script.

Regarding using S3 as a database, can you describe what you mean more fully? We're just storing static files that were generated as a result of the deploy process. We also store GTFS zip files in S3 as well. I'm really not too sure of what the concern there is. The alternative you're proposing as I understand it (the OTP instance connecting to the mongo instance and handing off files/messages) sounds far more complex.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what I'm saying. I'm recommending creating a script in some kind of programming language - be it java, python or JavaScript. Then this startup script would communicate directly with either mongo db or datatools-server via HTTP requests perhpas. As it is right now, all of this bash is in one place, but we have to construct it using strings in java. Therefore, we can't view it nicely formatted in an IDE, we can't lint or do static typing analysis, it's much harder to import 3rd party libraries and it is impossible to write isolated tests.

Regarding s3 I'm fine with most of what's going on. Most of the data makes sense to store in a raw format (logs, GTFS files, OSM pbf). The thing in particular that I'm concerned about is constantly checking for bundle download and graph build completion and who knows what else we may add in the future by checking for the existence of an s3 file. I'm not recommending that OTP connect to the mongo instance, but I am recommending that a replacement of the bash script do this and handle the checks of bundle download, graph build, and graph load. These intermediate files that are waited upon seem like the more complex option to me when the bash replacement script could just be writing to the mongo database directly or making an HTTP request to datatools-server, or maybe even using some other messaging framework.

Furthermore, using s3 as a database like this will cost more money since s3 has per-request pricing so the more we scale the more this will cost us. The cost seems lowish at $0.005 per 1k requests, but let's say new graphs are built on a nightly basis. And let's say graph building takes 1 hour for a largeish region with elevation data. So, it'll make an s3 request every 5 seconds for an hour or 3,600 requests per day and 1.3 million a year. If we have 100 regions doing this, the cost of just these s3 requests would be $657 a year compared to $0 for using a database or sending over an HTTP request to the datatools-server instance.

@@ -121,6 +130,16 @@ public String getDeploymentId () {
return deployment.id;
}

/** Increment the completed servers count (for use during ELB deployment) and update the job status. */
public void incrementCompletedServers() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

drake yes

@@ -88,52 +92,70 @@ public void jobLogic() {
}
// First, check that OTP has started up.
status.update("Prepping for graph build...", 20);
String bundleUrl = String.join("/", ipUrl, BUNDLE_DOWNLOAD_COMPLETE_FILE);
long bundleDownloadStartTime = System.currentTimeMillis();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this addition as well.

Copy link
Contributor

@evansiroky evansiroky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm giving my approval, but only on the understanding that this PR is a first proof-of-concept feature that will probably need a more thorough update in the future. The 2 items I still have issues with are:

  1. Writing the startup script in bash in strings in Java code. I think that ideally, it would be good to replace this with a startup script written in a scripting language like python even JavaScript for the reasons mentioned here.
  2. Having a more discrete way viewing and managing deployments as noted here. Depending on how clients and users manage deployments, this may or may not be a good idea. We'll see.

@evansiroky evansiroky assigned landonreed and unassigned evansiroky Oct 2, 2019
@landonreed
Copy link
Contributor Author

🎉 This PR is included in version 3.6.0 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
3 participants