Deploy OTP to AWS load balancer and manage OTP servers in separate collection (and misc other fixes) #225

Merged 50 commits on Oct 9, 2019
Showing changes from 4 commits
Commits
ac0b00c WIP spin up EC2 (no user data) (landonreed, Aug 7, 2018)
9c5e92c Merge branch 'remove-r5' into deploy-to-ec2 (landonreed, Sep 18, 2018)
d34b96a Merge branch 'dev' into deploy-to-ec2 (landonreed, Nov 6, 2018)
560c4c6 refactor(snapshot): remove legacy MapDB-based snapshot jobs (landonreed, Nov 12, 2018)
dbbd130 fix(delete): delete SQL namespace when feed version/snapshot deleted (landonreed, Nov 12, 2018)
c01eadc feature(deploy-ec2): deployment enhancements for load balancers (landonreed, Nov 12, 2018)
f777b02 fix(user-mgmt): better error handling when Auth0 cannot update/create… (landonreed, Nov 12, 2018)
8898376 fix: move toGtfsDate from deleted class to FeedTx (landonreed, Nov 12, 2018)
fa5ce83 refactor: fix whitespace (landonreed, Nov 12, 2018)
e0a1eb6 refactor: add missing aws pom entry (landonreed, Nov 12, 2018)
61c04d2 build(pom): update gtfs-lib dependency (landonreed, Nov 12, 2018)
7231a95 feature(server-mgmt): manage deployment servers at the application level (landonreed, Nov 15, 2018)
b372303 refactor(server-job): attach just the project ID to the merge feeds job (landonreed, Nov 29, 2018)
7c92c1d Merge branch 'dev' into deploy-to-ec2 (landonreed, Nov 30, 2018)
aa0553b refactor(deploy): shuffle deploy job code for clarity (landonreed, Nov 30, 2018)
c52d6f0 Merge pull request #133 from ibi-group/dev (landonreed, Aug 7, 2019)
56f7642 Merge branch 'dev' into deploy-to-ec2 (landonreed, Aug 7, 2019)
6ae2055 refactor: fix issues resulting from merge (landonreed, Aug 7, 2019)
c33c290 refactor(deployment): tweak user script and update default config (landonreed, Aug 8, 2019)
05ec4df refactor(deployment): improve validation of server fields (landonreed, Aug 9, 2019)
b6d7363 refactor(deployments): modify OtpServer fields and refactor server cr… (landonreed, Aug 9, 2019)
7a020c8 refactor: remove unused import (landonreed, Aug 9, 2019)
c753a31 refactor(ServerController): add comment about checking S3 permissions (landonreed, Aug 13, 2019)
b793727 refactor(ServerController): add missing exceptions to logMessageAndHalt (landonreed, Aug 14, 2019)
a3ed73c refactor(deploy): fix check for s3 graph object (landonreed, Aug 20, 2019)
a07935a refactor(deploy): revert to default instance type if none specified (landonreed, Aug 20, 2019)
a177e79 refactor(deploy): make instance profile arn optional (landonreed, Aug 20, 2019)
2507ab5 refactor(deploy): use set method rather than with for instance profile (landonreed, Aug 22, 2019)
e1fe1a3 Merge branch 'dev' into deploy-to-ec2 (landonreed, Sep 3, 2019)
4a1ef29 refactor(deploy): move ec2 config into OtpServer (landonreed, Sep 9, 2019)
9b957dd refactor(deploy): tweak deployJob for NPE fix and fix server delete (landonreed, Sep 10, 2019)
bf0f1bc ci(config): update server.yml.tmp for e2e (landonreed, Sep 10, 2019)
ade0b40 refactor(EC2InstanceSummary): add empty constructor for serialization (landonreed, Sep 10, 2019)
95a2333 Merge branch 'dev' into deploy-to-ec2 (landonreed, Sep 12, 2019)
c273350 test(.gitignore): don't ignore test config (landonreed, Sep 12, 2019)
3493570 test(mtc): fix broken MTC feed merge test with new test config (landonreed, Sep 12, 2019)
4992b32 refactor(ServerController): isolate jackson parse to utility method (landonreed, Sep 12, 2019)
3441fa2 refactor(ServerController): surround validation method calls in try/c… (landonreed, Sep 12, 2019)
a36a7d9 refactor(deploy-to-ec2): address PR comments (landonreed, Sep 20, 2019)
18bd9d3 refactor(deploy-to-ec2): add json property latest; add server ID to s… (landonreed, Sep 20, 2019)
c28c215 Merge branch 'dev' into deploy-to-ec2 (landonreed, Sep 20, 2019)
41c21a8 refactor(deploy): fix check for S3 jar (landonreed, Sep 20, 2019)
7e1528b refactor(deploy-to-ec2): address PR comments (landonreed, Sep 24, 2019)
f34affb refactor(deploy-to-ec2): surround s3 checks in try/catch (landonreed, Sep 24, 2019)
4cc9b68 refactor(deploy-to-ec2): actually skip termination request (landonreed, Sep 24, 2019)
fb44a61 refactor(deploy): fix duration calc (landonreed, Sep 30, 2019)
101b7f9 refactor(deploy): use onboard nginx to signal ec2 deploy status (landonreed, Oct 1, 2019)
523801d refactor(deploy): bump default otp version to 1.4 (landonreed, Oct 3, 2019)
c399189 refactor(deploy): add terminate EC2 instance HTTP endpoint (landonreed, Oct 8, 2019)
240a6e0 refactor(deploy): refine terminate instances endpoint and check for g… (landonreed, Oct 8, 2019)
@@ -7,6 +7,7 @@
import org.slf4j.LoggerFactory;

import java.io.File;
+ import java.io.Serializable;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
@@ -18,7 +19,8 @@
/**
* Created by landon on 6/13/16.
*/
- public abstract class MonitorableJob implements Runnable {
+ public abstract class MonitorableJob implements Runnable, Serializable {
+ private static final long serialVersionUID = 1L;
private static final Logger LOG = LoggerFactory.getLogger(MonitorableJob.class);
public final String owner;

@@ -129,7 +131,6 @@ public void run () {
boolean parentJobErrored = false;
boolean subTaskErrored = false;
String cancelMessage = "";
- long startTimeNanos = System.nanoTime();
try {
// First execute the core logic of the specific MonitorableJob subclass
jobLogic();
@@ -188,8 +189,7 @@ public void run () {
LOG.error("Job failed", ex);
status.update(true, ex.getMessage(), 100, true);
}
- status.startTime = TimeUnit.NANOSECONDS.toMillis(startTimeNanos);
- status.duration = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startTimeNanos);
+ status.duration = System.currentTimeMillis() - status.startTime;
LOG.info("{} {} {} in {} ms", type, jobId, status.error ? "errored" : "completed", status.duration);
}

@@ -243,7 +243,7 @@ public static class Status {
/** How much of task is complete? */
public double percentComplete;

- public long startTime;
+ public long startTime = System.currentTimeMillis();
public long duration;

// When was the job initialized?
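
The timing change in this file stops deriving `status.startTime` from `System.nanoTime()`. That clock counts from an arbitrary origin, so converting it to milliseconds does not give a wall-clock timestamp. The sketch below illustrates the revised approach under stated assumptions: `JobTimingSketch`, its nested `Status`, and `durationMillis` are hypothetical stand-ins, not the project's classes.

```java
import java.util.concurrent.TimeUnit;

final class JobTimingSketch {
    /** Hypothetical stand-in for MonitorableJob.Status: startTime is epoch milliseconds. */
    static final class Status {
        long startTime = System.currentTimeMillis();
        long duration;
    }

    /** Duration as computed after the change: wall-clock "now" minus wall-clock start. */
    static long durationMillis(long startTimeMillis, long nowMillis) {
        return nowMillis - startTimeMillis;
    }

    public static void main(String[] args) throws InterruptedException {
        Status status = new Status();
        long startNanos = System.nanoTime();
        Thread.sleep(50); // stands in for the job's work
        status.duration = durationMillis(status.startTime, System.currentTimeMillis());
        // nanoTime has an arbitrary origin, so this value is not a usable timestamp.
        long notATimestamp = TimeUnit.NANOSECONDS.toMillis(startNanos);
        System.out.println("startTime (epoch ms): " + status.startTime);
        System.out.println("duration (ms): " + status.duration);
        System.out.println("nanoTime as ms (arbitrary origin): " + notATimestamp);
    }
}
```

One consequence worth checking against the real code: because the start time is now set by a field initializer, any delay between constructing the status object and entering `run()` would be counted in the reported duration.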
@@ -1,5 +1,6 @@
package com.conveyal.datatools.manager.auth;

+ import com.auth0.jwt.JWTExpiredException;
import com.auth0.jwt.JWTVerifier;
import com.auth0.jwt.pem.PemReader;
import com.conveyal.datatools.manager.DataManager;
@@ -90,6 +91,9 @@ public static void checkUser(Request req) {
// The user attribute is used on the server side to check user permissions and does not have all of the
// fields that the raw Auth0 profile string does.
req.attribute("user", profile);
+ } catch (JWTExpiredException e) {
+ LOG.warn("JWT token has expired for user.");
+ logMessageAndHalt(req, 401, "User's authentication token has expired. Please re-login.");
} catch (Exception e) {
LOG.warn("Login failed to verify with our authorization provider.", e);
logMessageAndHalt(req, 401, "Could not verify user's token");
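
The new `catch` clause relies on Java's rule that a more specific exception type must be caught before its supertype, so expired tokens get their own message while every other failure falls through to the generic handler. A minimal sketch of that ordering follows. `TokenCheckSketch`, `TokenExpiredException`, and `Verifier` are hypothetical names used only for illustration, and the real code halts the request rather than returning a string.

```java
final class TokenCheckSketch {
    /** Hypothetical stand-in for the JWT library's expired-token exception. */
    static class TokenExpiredException extends Exception {
        TokenExpiredException(String message) {
            super(message);
        }
    }

    /** Hypothetical verifier abstraction; the real code calls the JWT library directly. */
    interface Verifier {
        void verify(String token) throws Exception;
    }

    /** Returns the message a caller would send back for the given token. */
    static String check(Verifier verifier, String token) {
        try {
            verifier.verify(token);
            return "ok";
        } catch (TokenExpiredException e) {
            // Specific case first: tell the user to log in again.
            return "401: token expired, please log in again";
        } catch (Exception e) {
            // Everything else: generic verification failure.
            return "401: could not verify token";
        }
    }

    public static void main(String[] args) {
        System.out.println(check(t -> { }, "valid"));
        System.out.println(check(t -> { throw new TokenExpiredException("expired"); }, "old"));
        System.out.println(check(t -> { throw new IllegalStateException("bad"); }, "junk"));
    }
}
```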
@@ -7,6 +7,7 @@
import com.amazonaws.services.ec2.model.Filter;
import com.amazonaws.services.ec2.model.Instance;
import com.amazonaws.services.ec2.model.Reservation;
+ import com.amazonaws.services.s3.AmazonS3URI;
import com.conveyal.datatools.common.utils.SparkUtils;
import com.conveyal.datatools.manager.DataManager;
import com.conveyal.datatools.manager.auth.Auth0UserProfile;
@@ -18,8 +19,10 @@
import com.conveyal.datatools.manager.models.JsonViews;
import com.conveyal.datatools.manager.models.OtpServer;
import com.conveyal.datatools.manager.models.Project;
import com.conveyal.datatools.manager.persistence.FeedStore;
import com.conveyal.datatools.manager.persistence.Persistence;
import com.conveyal.datatools.manager.utils.json.JsonManager;
import com.mongodb.client.FindIterable;
import org.bson.Document;
import org.eclipse.jetty.http.HttpStatus;
import org.slf4j.Logger;
@@ -38,6 +41,7 @@
import java.util.Map;
import java.util.stream.Collectors;

+ import static com.conveyal.datatools.common.utils.S3Utils.downloadFromS3;
import static com.conveyal.datatools.common.utils.SparkUtils.logMessageAndHalt;
import static spark.Spark.delete;
import static spark.Spark.get;
@@ -84,6 +88,46 @@ private static Deployment deleteDeployment (Request req, Response res) {
return deployment;
}

+ /**
+ * HTTP endpoint for downloading a build artifact (e.g., otp build log or Graph.obj) from S3.
+ */
+ private static String downloadBuildArtifact (Request req, Response res) {
+ Deployment deployment = checkDeploymentPermissions(req, res);
+ DeployJob.DeploySummary summaryToDownload = null;
+ String uriString = null;
+ // If a jobId query param is provided, find the matching job summary.
+ String jobId = req.queryParams("jobId");
+ if (jobId != null) {
+ for (DeployJob.DeploySummary summary : deployment.deployJobSummaries) {
+ if (summary.jobId.equals(jobId)) {
+ summaryToDownload = summary;
+ break;
+ }
+ }
+ } else {
+ summaryToDownload = deployment.latest();
+ }
+ if (summaryToDownload == null) {
+ // Try to construct the URI string
+ OtpServer server = Persistence.servers.getById(deployment.deployedTo);
+ if (server == null) {
+ uriString = String.format("s3://%s/bundles/%s/%s/%s", "S3_BUCKET", deployment.projectId, deployment.id, jobId);
+ logMessageAndHalt(req, 400, "Cannot construct URI for build artifact. " + uriString);
+ return null;
+ }
+ uriString = String.format("s3://%s/bundles/%s/%s/%s", server.s3Bucket, deployment.projectId, deployment.id, jobId);
+ LOG.warn("Could not find deploy summary for job. Attempting to use {}", uriString);
+ } else {
+ uriString = summaryToDownload.buildArtifactsFolder;
+ }
+ AmazonS3URI uri = new AmazonS3URI(uriString);
+ String filename = req.queryParams("filename");
+ if (filename == null) {
+ logMessageAndHalt(req, HttpStatus.BAD_REQUEST_400, "Must provide filename query param for build artifact.");
+ }
+ return downloadFromS3(FeedStore.s3Client, uri.getBucket(), String.join("/", uri.getKey(), filename), false, res);
+ }
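
The endpoint above builds an `s3://bucket/bundles/<project>/<deployment>/<job>` folder URI, splits it into bucket and key, and appends the requested filename. The sketch below shows that string handling in isolation. It swaps in `java.net.URI` for the AWS SDK's `AmazonS3URI` so that it runs without dependencies, and `ArtifactUriSketch`, its method names, and the sample bucket and IDs are all hypothetical.

```java
import java.net.URI;

final class ArtifactUriSketch {
    /** Builds the folder URI using the same format string as the endpoint. */
    static String buildFolderUri(String bucket, String projectId, String deploymentId, String jobId) {
        return String.format("s3://%s/bundles/%s/%s/%s", bucket, projectId, deploymentId, jobId);
    }

    /** Bucket name, read here from the URI host (assumes a host-safe bucket name). */
    static String bucketOf(String s3Uri) {
        return URI.create(s3Uri).getHost();
    }

    /** Object key: the folder path without its leading slash, joined with the filename. */
    static String objectKey(String s3Uri, String filename) {
        String path = URI.create(s3Uri).getPath();
        String folderKey = path.startsWith("/") ? path.substring(1) : path;
        return String.join("/", folderKey, filename);
    }

    public static void main(String[] args) {
        String uri = buildFolderUri("example-bucket", "project1", "deployment1", "job1");
        System.out.println(uri);
        System.out.println(bucketOf(uri));
        System.out.println(objectKey(uri, "otp-build.log"));
    }
}
```

Note that `String.format` renders a `null` argument as the text `null`, so a missing `jobId` would produce a folder URI ending in `/null` rather than raising an error.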

/**
* Download all of the GTFS files in the feed.
*
@@ -266,6 +310,9 @@ public static List<EC2InstanceSummary> fetchEC2InstanceSummaries(Filter... filte
return fetchEC2Instances(filters).stream().map(EC2InstanceSummary::new).collect(Collectors.toList());
}

+ /**
+ * Fetch EC2 instances from AWS that match the provided set of filters (e.g., tags, instance ID, or other properties).
+ */
public static List<Instance> fetchEC2Instances(Filter... filters) {
List<Instance> instances = new ArrayList<>();
DescribeInstancesRequest request = new DescribeInstancesRequest().withFilters(filters);
@@ -291,7 +338,10 @@ private static String deploy (Request req, Response res) {
logMessageAndHalt(req, 400, "Internal reference error. Deployment's project ID is invalid");
// Get server by ID
OtpServer otpServer = Persistence.servers.getById(target);
- if (otpServer == null) logMessageAndHalt(req, 400, "Must provide valid OTP server target ID.");
+ if (otpServer == null) {
+ logMessageAndHalt(req, 400, "Must provide valid OTP server target ID.");
+ return null;
+ }

// Check that permissions of user allow them to deploy to target.
boolean isProjectAdmin = userProfile.canAdministerProject(deployment.projectId, deployment.organizationId());
@@ -350,7 +400,11 @@ public static void register (String apiPrefix) {
}), json::write);
options(apiPrefix + "secure/deployments", (q, s) -> "");
get(apiPrefix + "secure/deployments/:id/download", DeploymentController::downloadDeployment);
+ get(apiPrefix + "secure/deployments/:id/artifact", DeploymentController::downloadBuildArtifact);
get(apiPrefix + "secure/deployments/:id/ec2", DeploymentController::fetchEC2InstanceSummaries, json::write);
+ // TODO: In the future, we may have need for terminating a single EC2 instance. For now, an admin using the AWS
+ // console should suffice.
+ // delete(apiPrefix + "secure/deployments/:id/ec2", DeploymentController::terminateEC2Instance, json::write);
get(apiPrefix + "secure/deployments/:id", DeploymentController::getDeployment, json::write);
delete(apiPrefix + "secure/deployments/:id", DeploymentController::deleteDeployment, json::write);
get(apiPrefix + "secure/deployments", DeploymentController::getAllDeployments, json::write);
@@ -68,8 +68,8 @@ public static Organization createOrganization (Request req, Response res) {

public static Organization updateOrganization (Request req, Response res) throws IOException {
String organizationId = req.params("id");
- requestOrganizationById(req);
- Organization organization = Persistence.organizations.update(organizationId, req.body());
+ Organization updatedOrganization = requestOrganizationById(req);
+ Persistence.organizations.replace(organizationId, updatedOrganization);

// FIXME: Add back in hook after organization is updated.
// See https://github.com/catalogueglobal/datatools-server/issues/111
@@ -101,7 +101,7 @@ public static Organization updateOrganization (Request req, Response res) throws
// p.save();
// }

- return organization;
+ return updatedOrganization;
}

public static Organization deleteOrganization (Request req, Response res) {