EC2 launch
Given AWS credentials, within an hour of downloading our code, you can have a machine running on EC2 exactly replicating our experiments from the NEXT paper. The purpose of this guide is to describe the necessary steps to launch an adaptive learning experiment on NEXT, starting from scratch. If you’re new to AWS or EC2, you can begin by following our AWS-Account-Quickstart or the more in-depth official AWS account set-up guide.
Note: Restarting EC2 machines is mentioned in Instance teardown and database backups in bullet point 7.
-
We’ll begin by setting our AWS secret access key and access key ID as environment variables using:
$ export AWS_SECRET_ACCESS_KEY=[your_secret_aws_access_key_here]
$ export AWS_ACCESS_KEY_ID=[your_aws_access_key_id_here]
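Before moving on, it can help to confirm that both variables are actually set in the shell you will run the remaining commands from. The following is a minimal sketch and not part of NEXT; the helper name `missing_aws_env` is hypothetical.

```python
import os

# The two variables this guide asks you to export.
REQUIRED_VARS = ("AWS_SECRET_ACCESS_KEY", "AWS_ACCESS_KEY_ID")


def missing_aws_env(environ=None):
    """Return the names of required variables that are unset or empty."""
    if environ is None:
        environ = os.environ
    return [name for name in REQUIRED_VARS if not environ.get(name)]


missing = missing_aws_env()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("AWS credentials are set.")
```

If a name is reported as missing, re-run the matching `export` command in the same terminal session.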
-
Clone the open-source NEXT repository using:
$ git clone https://github.com/nextml/NEXT.git
-
Enter the repository and install the local requirements with:
$ cd NEXT
$ sudo pip install -r local_requirements.txt
-
For persistent data storage, we first need to create a bucket in AWS S3 using the createbucket action, run from our EC2 launch folder:
$ cd ec2
$ python next_ec2.py --key-pair=[keypair] --identity-file=[key-file] createbucket [cluster-name]
where:
- [keypair] is the name of your EC2 key pair
- [key-file] is the private key file for your key pair
- [cluster-name] is the custom name you create and assign to your cluster
This will print out another environment variable command:
export AWS_BUCKET_NAME=[bucket_uid]
Copy and paste this command into your terminal.
-
Now we can fire up the NEXT system using the launch command. This command will create a new EC2 instance, pull the NEXT repository to that instance, install all of the relevant Docker images, and finally run all Docker containers:
$ python next_ec2.py --key-pair=[keypair] --identity-file=[key-file] launch [cluster-name]
For example, if you would like to fire up a c3.8xlarge instance type, after filling in your credentials, your launch command will look something like this:
$ python next_ec2.py --key-pair=my_aws_key --identity-file=/home/username/Downloads/my_aws_key.pem \
--instance-type=c3.8xlarge launch next_test_instance
The previous command will take a few minutes to run. In addition to launch, next_ec2.py can perform many other actions, passed as [action], including:
- launch: launch a new cluster of any instance type
- docker_up: automagically build and run the NEXT docker modules
- start, stop, login, and destroy: perform basic EC2 operations (destroy terminates the machine)
- rsync: sync local code changes to your machine
- get-master: obtain the public DNS and URL for your machine
- backup: force immediate database backups
- restore: restore your database from S3
This script is pretty simple to use. You can interact with all of these functions in the terminal with the following format:
$ python next_ec2.py --key-pair=[keypair] --identity-file=[key-file] [action] [cluster-name]
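Because every action follows the same format, the command line can be assembled programmatically. This is a minimal sketch, not part of NEXT: the helper `build_next_ec2_command` is hypothetical, and it only uses flags that appear in this guide.

```python
def build_next_ec2_command(key_pair, identity_file, action, cluster_name, extra_flags=()):
    """Assemble the argument list for one next_ec2.py invocation."""
    command = [
        "python",
        "next_ec2.py",
        "--key-pair=" + key_pair,
        "--identity-file=" + identity_file,
    ]
    # Optional flags (for example --instance-type=...) go before the action.
    command.extend(extra_flags)
    command.extend([action, cluster_name])
    return command


command = build_next_ec2_command(
    "my_aws_key",
    "/home/username/Downloads/my_aws_key.pem",
    "launch",
    "next_test_instance",
    extra_flags=["--instance-type=c3.8xlarge"],
)
print(" ".join(command))
```

The resulting list can be handed to `subprocess.call(command)` from inside the `ec2` folder, which avoids shell-quoting problems with paths that contain spaces.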
- You have simulated your first active learning experiments on the NEXT system. The rest of this tutorial is optional, but covers key topics including:
- stopping and restarting your EC2 machine for later
- immediate data backups to S3
- terminating your EC2 machine
- database recovery from S3
- rsyncing local code changes to your EC2 machine.
- To stop your EC2 machine so that you can restart it later, use the stop action in the usual format:
$ python next_ec2.py --key-pair=[keypair] --identity-file=[key-file] stop [cluster-name]
and to restart that same instance and launch the NEXT system:
$ python next_ec2.py --key-pair=[keypair] --identity-file=[key-file] start [cluster-name]
- If you instead want to terminate your machine, you can first save all of the database records for future use by backing up the database to AWS S3:
$ python next_ec2.py --key-pair=[keypair] --identity-file=[key-file] --backup-filename=[backup-filename] backup [cluster-name]
For example, your command should look like the following:
$ python next_ec2.py --key-pair=my_aws_key --identity-file=/home/username/Downloads/my_aws_key.pem \
--backup-filename=next_tutorial_backup_db backup next_test_instance
Once completed, you can safely terminate your EC2 instance using:
$ python next_ec2.py --key-pair=[keypair] --identity-file=[key-file] destroy [cluster-name]
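The order matters here: destroy should only run once the backup has succeeded. The following is a minimal sketch of that ordering, assuming next_ec2.py returns a non-zero exit code on failure (we have not verified this). The helpers `teardown_steps` and `run_in_order` are hypothetical.

```python
def teardown_steps(key_pair, identity_file, cluster_name, backup_filename):
    """Return the backup and destroy commands, in the order they must run."""
    base = [
        "python",
        "next_ec2.py",
        "--key-pair=" + key_pair,
        "--identity-file=" + identity_file,
    ]
    backup = base + ["--backup-filename=" + backup_filename, "backup", cluster_name]
    destroy = base + ["destroy", cluster_name]
    return [backup, destroy]


def run_in_order(steps, runner):
    """Run each step with runner(step); stop at the first non-zero exit code.

    Returns the list of steps that completed successfully.
    """
    completed = []
    for step in steps:
        if runner(step) != 0:
            break
        completed.append(step)
    return completed


steps = teardown_steps(
    "my_aws_key",
    "/home/username/Downloads/my_aws_key.pem",
    "next_test_instance",
    "next_tutorial_backup_db",
)
for step in steps:
    print(" ".join(step))
```

To execute the steps for real, pass `subprocess.call` as the `runner` from inside the `ec2` folder; if the backup fails, destroy is never attempted.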
-
When it comes time to access the data you backed up, you can fire up a new EC2 instance and launch the NEXT system using, again, the following command:
$ python next_ec2.py --key-pair=[keypair] --identity-file=[key-file] launch [cluster-name]
After a few minutes, when the system is back online, you can verify that this is a fresh NEXT installation by navigating to the experiment dashboard and checking that it does not reflect our past simulations:
http://your_public_ec2_DNS_here:8000/dashboard/experiment_list
You can restore your database with the contents of your previous backup using:
$ python next_ec2.py --key-pair=[keypair] --identity-file=[key-file] --backup-filename=[backup-filename] restore [cluster-name]
Using our previous backup example, the command is:
$ python next_ec2.py --key-pair=my_aws_key --identity-file=/home/username/Downloads/my_aws_key.pem \
--backup-filename=next_tutorial_backup_db.tar.gz restore [cluster-name]
where [cluster-name] is the name of your recently launched instance.
- Navigate back to our experiment dashboard to see that all of your backed up experiments are available:
http://your_public_ec2_DNS_here:8000/dashboard/experiment_list
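The dashboard address is just the machine's public DNS (which the get-master action reports) combined with port 8000 and the path shown above. As a minimal sketch, with a hypothetical helper name and a made-up DNS name:

```python
def dashboard_url(public_dns, port=8000):
    """Build the experiment dashboard URL for a NEXT machine."""
    return "http://{0}:{1}/dashboard/experiment_list".format(public_dns, port)


# Hypothetical DNS name, for illustration only.
print(dashboard_url("ec2-203-0-113-10.us-west-2.compute.amazonaws.com"))
```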
When working through the troubleshooting hints below, check the end of the output your script gives you. The keywords you're looking for will probably be in the last couple of lines.
Key pairs and AMIs are specific to a region. Changing the default region, or passing in the --ami and --region flags for the region you created the key pair in, should work.
Solution: Assuming that your key-pair and key-file are indeed correct, the error is most likely caused by a mismatch in region. AWS EC2 keys exist in only one region, not globally, and the next_ec2.py script uses the default region us-west-2.
There are two solutions:
- Change your AWS region to Oregon (us-west-2) and create a key pair there. The image below shows how. This is documented on GitHub in NEXT issue #11.
- Specify a region using --region=. If you do this, you must also specify an AMI. We recommend an Ubuntu-backed AMI; you can use the official Ubuntu guide to help you choose a volume.
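The rule in the second solution can be sketched as a small check: outside the default region, both a region and an AMI must be supplied. This is an illustration only. The helper `region_flags` is hypothetical, the `--ami=` spelling with an equals sign is an assumption, and the AMI ID in the example is a placeholder rather than a real image.

```python
DEFAULT_REGION = "us-west-2"  # default used by next_ec2.py, per this guide


def region_flags(key_pair_region, ami=None):
    """Return the extra flags needed when the key pair lives in key_pair_region."""
    if key_pair_region == DEFAULT_REGION:
        # Key pair and script default agree, so no extra flags are needed.
        return []
    if not ami:
        raise ValueError("A non-default region also requires an AMI from that region.")
    return ["--region=" + key_pair_region, "--ami=" + ami]


# Placeholder AMI ID, for illustration only.
print(region_flags("us-east-1", ami="ami-00000000"))
```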
Errors such as SyntaxError: Missing parentheses in call to 'print'
This happens because Python 3 is being used to run a Python 2 script. The issue can be avoided by running the script under a Python 2 interpreter. This change should be local; it could be made system-wide, but should not be!
Solution: Activate a virtualenv for Python 2 (I would go with Python 2.7). To do that, you can use either virtualenv or conda (but only if you use the Anaconda Python distro).
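# if using Anaconda. Can be run from anywhere
conda create --name py27 python=2.7 anaconda
source activate py27 # activating the conda environment (older conda releases; newer ones use `conda activate py27`)
# setup if using virtualenv
mkdir env # making the environment folder
virtualenv -p python2.7 env # creating the virtualenv with a Python 2.7 interpreter (assumes python2.7 is installed)
# virtualenv activation. Must have the path to `activate`
source env/bin/activate
# leave the environment when you are done
deactivate # if using virtualenv
source deactivate # if using conda (older conda releases; newer ones use `conda deactivate`)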
Python 3 is not backwards compatible with Python 2, and our project uses Python 2. If this error happens, it's almost certainly because you're using Python 3 by default.
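A quick way to confirm which major version you are running is to inspect `sys.version_info`. The following is a minimal sketch that runs under both Python 2 and Python 3; the helper name `is_python2` is hypothetical.

```python
import sys


def is_python2(version_info=None):
    """Return True when the given (or running) interpreter is Python 2."""
    if version_info is None:
        version_info = sys.version_info
    return version_info[0] == 2


if not is_python2():
    print("This interpreter is Python {0}; next_ec2.py expects Python 2.".format(sys.version_info[0]))
```

Run it with the same `python` command you use for next_ec2.py, after activating your environment, to check that the activation took effect.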
Machine | Dedicated instance price (2016) |
---|---|
c3.4xlarge | $0.798/hr |
c3.8xlarge | $1.596/hr |
m4.large | $0.114/hr |
The prices may have changed; see https://aws.amazon.com/ec2/pricing/on-demand/ or https://aws.amazon.com/ec2/previous-generation/ for modern-day prices.
Spot instance prices vary with demand. Scott uses m4.large for testing. c3.8xlarge is better suited to live production machines (e.g., the New Yorker experiments).
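To get a rough idea of what an experiment will cost, multiply the hourly price by the number of hours the machine stays up. This is a minimal sketch using the 2016 prices from the table above, which may be out of date; the helper name `estimate_cost` is hypothetical, and the figure ignores storage and data transfer charges.

```python
# Hourly prices copied from the 2016 table above; they may be out of date.
HOURLY_PRICE_USD = {
    "c3.4xlarge": 0.798,
    "c3.8xlarge": 1.596,
    "m4.large": 0.114,
}


def estimate_cost(instance_type, hours):
    """Return the approximate cost in USD of running one instance for `hours`."""
    return HOURLY_PRICE_USD[instance_type] * hours


print("m4.large for 24 hours: ${0:.2f}".format(estimate_cost("m4.large", 24)))
print("c3.8xlarge for 24 hours: ${0:.2f}".format(estimate_cost("c3.8xlarge", 24)))
```

Remember that a stopped machine can be restarted later, so stopping it between sessions keeps the billed hours down.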