Deploy OTP to AWS load balancer and manage OTP servers in separate collection (and misc other fixes) #225
Conversation
Adds a separate controller/collection to manage OTP servers and allow for deploying to new EC2 servers and adding them to a load balancer, rather than building an OTP graph over the wire.
This change allows servers to be assigned to a specific project or left open for any project in the application.
Merge IBI dev branch into catalogueglobal dev (conflicts: pom.xml)
Codecov Report
```
@@             Coverage Diff             @@
##                dev     #225     +/-  ##
============================================
- Coverage     19.83%   18.72%   -1.11%
  Complexity      361      361
============================================
  Files           149      150       +1
  Lines          7462     7963     +501
  Branches        996     1076      +80
============================================
+ Hits           1480     1491      +11
- Misses         5850     6339     +489
- Partials        132      133       +1
```
Continue to review full report at Codecov.
Resolved review threads (outdated) on:
- src/main/java/com/conveyal/datatools/manager/controllers/api/ServerController.java (3 threads)
- src/main/java/com/conveyal/datatools/manager/controllers/api/DeploymentController.java
- src/main/java/com/conveyal/datatools/manager/jobs/DeployJob.java (3 threads)
```java
if (!deployment.r5) {
    lines.add(String.format("aws s3 --region us-east-1 cp %s/%s %s ", routerDir, OTP_GRAPH_FILENAME, getS3GraphURI()));
    String s3BuildLogPath = joinToS3FolderURI(getBuildLogFilename());
    lines.add(String.format("aws s3 --region us-east-1 cp $BUILDLOGFILE %s ", s3BuildLogPath));
```
It may make more sense to upload the logs first in case the graph wasn't built. If the graph wasn't built, the script might error out with a file-not-found error.
Will that actually happen? I suppose I can make the change regardless, but I don't think the execution of any line is dependent on the previous line, is it?
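The concern above can be sketched in Python (the real script is bash generated from Java; the function and file names here are illustrative, not from the PR): upload the build log before the graph, and skip artifacts that were never written, so a failed graph build still leaves a log in S3.

```python
import os

def upload_artifacts(router_dir, upload, graph_filename="Graph.obj",
                     build_log_filename="otp-build.log"):
    """Upload the build log before the graph so the log survives even when
    the graph build failed and the graph file was never written. `upload`
    stands in for an `aws s3 cp`-style call; filenames are illustrative."""
    uploaded = []
    for name in (build_log_filename, graph_filename):
        path = os.path.join(router_dir, name)
        if os.path.exists(path):  # skip missing artifacts instead of erroring
            upload(path)
            uploaded.append(name)
    return uploaded
```

With this ordering, a build that crashes before producing a graph still uploads its log, and no upload command fails on a missing file.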
src/main/java/com/conveyal/datatools/manager/jobs/MonitorServerStatusJob.java (resolved thread)
See other comments here.
```java
/**
 * Construct the user data script (as string) that should be provided to the AMI and executed upon EC2 instance
 * startup.
 */
private String constructUserData(boolean graphAlreadyBuilt) {
```
I'm increasingly hesitant about the design of this method. I think it is going to limit us moving forward to writing bash in Java strings. Something I would really like to see is more detailed reporting and logging of the progress of building and loading a graph, with automatic retries when something fails that can be retried. We could attempt to do this with bash, but it would be ugly. If we're trying to ship this quickly, then fine, I can live with it, but I think we should be making a separate startup script specifically for this where we can do more detailed things.

Furthermore, right now this is basically using Amazon S3 as a database of sorts for storing various information, when we already have mongo and postgres instances. Maybe it would make sense for the startup script to directly edit the mongo db, or to make queries to datatools-server. Either way, we should be thinking about this feature long-term and about the easiest way to create maintainable code.
Thanks for the comments. Perhaps there is a better way to handle what gets run in user data, but at some level it is going to require running some bash (unless we opt to include some kind of startup script on the AMI, or write a script to a file that gets fetched by the instance). Because of this, I think our best option for now is to keep everything contained in a single place -- this bash script.

Regarding using S3 as a database, can you describe what you mean more fully? We're just storing static files that were generated as a result of the deploy process, and we already store GTFS zip files in S3. I'm not sure what the concern is there. The alternative you're proposing, as I understand it (the OTP instance connecting to the mongo instance and handing off files/messages), sounds far more complex.
That's what I'm saying. I'm recommending creating a script in some kind of programming language -- be it Java, Python, or JavaScript. This startup script would then communicate directly with either mongo db or datatools-server, perhaps via HTTP requests. As it is right now, all of this bash is in one place, but we have to construct it using strings in Java. Therefore, we can't view it nicely formatted in an IDE, we can't lint it or do static-type analysis, it's much harder to import 3rd-party libraries, and it is impossible to write isolated tests.
Regarding s3 I'm fine with most of what's going on. Most of the data makes sense to store in a raw format (logs, GTFS files, OSM pbf). The thing in particular that I'm concerned about is constantly checking for bundle download and graph build completion and who knows what else we may add in the future by checking for the existence of an s3 file. I'm not recommending that OTP connect to the mongo instance, but I am recommending that a replacement of the bash script do this and handle the checks of bundle download, graph build, and graph load. These intermediate files that are waited upon seem like the more complex option to me when the bash replacement script could just be writing to the mongo database directly or making an HTTP request to datatools-server, or maybe even using some other messaging framework.
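The suggestion above can be sketched in Python. Note that the endpoint URL and payload shape below are entirely hypothetical -- datatools-server does not expose this API; this only illustrates reporting deploy progress over HTTP instead of via S3 marker files.

```python
import json
from urllib import request

def build_status_payload(job_id, stage, ok=True, detail=None):
    """Pure helper so the payload can be unit tested in isolation.
    All field names are hypothetical."""
    return {"jobId": job_id, "stage": stage, "ok": ok, "detail": detail}

def report_status(base_url, job_id, stage, ok=True, detail=None):
    """POST a deploy-stage update to a hypothetical datatools-server endpoint."""
    body = json.dumps(build_status_payload(job_id, stage, ok, detail)).encode()
    req = request.Request(
        f"{base_url}/api/manager/deployjob/{job_id}/status",  # hypothetical route
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return resp.status
```

A startup script built this way pushes each transition (bundle downloaded, graph built, graph loaded) as a single request, rather than having the server poll S3 for marker files.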
Furthermore, using s3 as a database like this will cost more money, since s3 has per-request pricing: the more we scale, the more this costs us. The price seems low at $0.005 per 1k requests, but let's say new graphs are built on a nightly basis and graph building takes 1 hour for a largeish region with elevation data. Polling s3 every 5 seconds for an hour is about 720 requests per day, or roughly 263,000 a year. If we have 100 regions doing this, that's about 26 million requests a year, or roughly $130 a year for just these s3 requests, compared to $0 for using a database or sending an HTTP request to the datatools-server instance.
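Working the arithmetic through (one poll every 5 seconds during a 1-hour nightly build, 100 regions, at the quoted $0.005 per 1k requests):

```python
# Rough cost of polling S3 for build-status marker files.
S3_PRICE_PER_1K_REQUESTS = 0.005  # USD, approximate request pricing

polls_per_build = 60 * 60 // 5                    # every 5 s for 1 h -> 720
polls_per_region_year = polls_per_build * 365     # nightly builds -> 262,800
polls_all_regions = polls_per_region_year * 100   # 100 regions -> 26,280,000
annual_cost = polls_all_regions / 1000 * S3_PRICE_PER_1K_REQUESTS  # ~ $131
```

Even at that scale the absolute cost is small; the stronger argument against S3-as-database is maintainability, not price.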
```java
@@ -121,6 +130,16 @@ public String getDeploymentId () {
        return deployment.id;
    }

    /** Increment the completed servers count (for use during ELB deployment) and update the job status. */
    public void incrementCompletedServers() {
```
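The reason this counter needs care can be sketched in Python (class and method names are illustrative, not from the PR): several per-instance monitor jobs may report completion concurrently, so updates must be guarded against lost increments.

```python
import threading

class ServerDeployTracker:
    """Track how many EC2 instances have finished deploying behind the ELB.
    Concurrent monitor jobs may report completion at the same time, so the
    counter is guarded by a lock."""

    def __init__(self, total):
        self.total = total
        self._completed = 0
        self._lock = threading.Lock()

    def increment_completed(self):
        with self._lock:
            self._completed += 1
            return self._completed

    @property
    def percent_complete(self):
        with self._lock:
            return 100 * self._completed // self.total
```

In the Java code the equivalent concern is typically handled with a synchronized method or an AtomicInteger.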
```java
@@ -88,52 +92,70 @@ public void jobLogic() {
    }
    // First, check that OTP has started up.
    status.update("Prepping for graph build...", 20);
    String bundleUrl = String.join("/", ipUrl, BUNDLE_DOWNLOAD_COMPLETE_FILE);
    long bundleDownloadStartTime = System.currentTimeMillis();
```
I like this addition as well.
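The monitor job polls a URL until a marker file appears, tracking elapsed time. A generic sketch of that poll-with-timeout pattern in Python (interval and timeout values are illustrative; the injectable clock and sleep exist only to make the sketch testable):

```python
import time

def wait_for(check, timeout_s=300, interval_s=5,
             clock=time.monotonic, sleep=time.sleep):
    """Poll `check` until it returns True or `timeout_s` elapses.
    Returns elapsed seconds on success; raises TimeoutError otherwise."""
    start = clock()
    while clock() - start < timeout_s:
        if check():
            return clock() - start
        sleep(interval_s)
    raise TimeoutError(f"condition not met within {timeout_s}s")
```

Recording the start time, as the diff above does with `bundleDownloadStartTime`, is what makes it possible to report how long each stage took and to give up cleanly instead of spinning forever.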
I'm giving my approval, but only on the understanding that this PR is a first proof-of-concept feature that will probably need a more thorough update in the future. The 2 items I still have issues with are:
- Writing the startup script as bash in strings in Java code. Ideally, it would be good to replace this with a startup script written in a scripting language like Python or even JavaScript, for the reasons mentioned here.
- Having a more discrete way of viewing and managing deployments, as noted here. Depending on how clients and users manage deployments, this may or may not be a good idea. We'll see.
🎉 This PR is included in version 3.6.0 🎉 The release is available on GitHub release. Your semantic-release bot 📦🚀
Checklist
- Appropriate branch selected (all PRs must first be merged to dev before they can be merged to master)

Description
Note: this includes code originally contained in catalogueglobal#124
We have come to realize that the way we deploy new graphs to OTP (making a "build over the wire" request to existing EC2 servers) isn't reliable at all. This PR, in conjunction with a UI PR, adds to Data Tools the option to set up a "server" that targets a load balancer rather than an existing EC2 server: it starts new EC2 instances, builds or loads the graph there, and adds the instances to the load balancer.
Another important change here that is related to deployments is to move the OtpServer objects stored in the Project POJO to their own MongoDB collection. Now, these objects are tied to projects by projectId and fetched via a JsonProperty-annotated getter method. This additionally allows for one of these server targets to be tied to a project or (if projectId is null) available across all projects in the application.
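The projectId-or-null lookup described above can be sketched as follows (field names are assumed from the description, and MongoDB documents are modeled as plain dicts for illustration):

```python
def servers_for_project(servers, project_id):
    """Return OtpServer records usable by a project: those tied to it via
    projectId, plus those whose projectId is None and are therefore
    available to all projects in the application."""
    return [s for s in servers if s.get("projectId") in (None, project_id)]
```

In the real implementation this filtering happens server-side via a JsonProperty-annotated getter backed by a MongoDB query on the new OtpServer collection.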
Finally, this PR does some other housekeeping:
- fixes catalogueglobal#120
- removes unused/legacy snapshot classes
- fixes catalogueglobal#110
- fixes catalogueglobal/datatools-ui#234
Notes on configuration

Instructions for setting up EC2 auto deploy: the deployment module in server.yml should look like: