Merge pull request #4 from Internet2/caseyd
Caseyd
aravipaticloudskills authored Apr 12, 2023
2 parents b273a8b + 5bae48e commit b654294
Showing 23 changed files with 1,573 additions and 8 deletions.
28 changes: 20 additions & 8 deletions caseyd/README.md
@@ -3,32 +3,44 @@
Working directory for Casey Dinsmore


# Jupyterhub Common Commands
# Common Commands

## Helm

### Install / Reconfigure

```
helm upgrade --cleanup-on-fail \
--install jhub jupyterhub/jupyterhub \
--namespace jhub \
--create-namespace \
--version=2.0.0 \
--values config.yaml

```
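The upgrade command above assumes the `jupyterhub` chart repository is already registered with Helm. A minimal sketch of that prerequisite, assuming the repository URL published in the Zero to JupyterHub documentation; the function name `jhub_add_repo` is hypothetical and not part of this repository:

```shell
# Register the JupyterHub chart repository and refresh the local index.
# Assumption: the URL is the one given in the Zero to JupyterHub docs.
jhub_add_repo() {
  helm repo add jupyterhub https://hub.jupyter.org/helm-chart/ &&
    helm repo update
}
```

Call `jhub_add_repo` once per workstation before the first `helm upgrade --install`.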
## Kubectl

### Get Proxy Address

kubectl -n jhub get service proxy-public
```
kubectl -n jhub get service proxy-public
```

### Show all pod states
```
kubectl get pods -A
```

### View details about a pod including deployment errors

```
kubectl -n jhub describe pod <pod.name>

```
### Get the logs for a pod

```
kubectl -n jhub logs <pod.name>
```

### Get Persistent Volumes/Claims
```
kubectl -n jhub get pv
```
```
kubectl -n jhub get pvc
```
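When a deployment misbehaves, the commands above are often combined with a filter for unscheduled pods and a look at recent events. A small sketch of two helpers; the function names are hypothetical, and the `jhub` namespace default is taken from the Helm command earlier in this file:

```shell
# List pods that are still Pending, across all namespaces.
pending_pods() {
  kubectl get pods -A --field-selector=status.phase=Pending
}

# Show events in a namespace (default: jhub), sorted by creation time.
recent_events() {
  kubectl -n "${1:-jhub}" get events --sort-by=.metadata.creationTimestamp
}
```

Run `pending_pods` first, then `recent_events` (or `recent_events kube-system`) to see why a pod has not been scheduled.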
25 changes: 25 additions & 0 deletions caseyd/aws/eksctl/README.md
@@ -0,0 +1,25 @@
# AWS EKS cluster config with eksctl

## Resources

https://www.arhea.net/posts/2020-06-18-jupyterhub-amazon-eks


## Issues

* us-west-2 has 4 availability zones, and eksctl randomly picks three of them,
so the deployment sometimes fails.

Adding the availabilityZones: stanza to cluster.yaml resolves the issue, as outlined here:

https://github.com/weaveworks/eksctl/blob/main/examples/05-advanced-nodegroups.yaml

availabilityZones: ["us-west-2a", "us-west-2b", "us-west-2d"]

* Hub pods end up stuck in the Pending state with the following error:

running PreBind plugin "VolumeBinding": binding volumes: timed out waiting for the condition

The resolution suggested here does not seem to work:

https://discourse.jupyter.org/t/hub-pod-stuck-on-pending-timed-out-binding-volumes/17176
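Both issues above can be inspected from the command line. The following is a hedged sketch, not a fix: `list_zones` and `inspect_hub_storage` are hypothetical helper names, and the claim name `hub-db-dir` is an assumption about how the JupyterHub chart names the hub's volume claim.

```shell
# List the availability zone names visible to the account in a region.
list_zones() {
  aws ec2 describe-availability-zones --region "${1:-us-west-2}" \
    --query 'AvailabilityZones[].ZoneName' --output text
}

# Inspect the hub's volume claim and the cluster's storage classes.
# Assumption: the hub claim in the jhub namespace is named "hub-db-dir".
inspect_hub_storage() {
  kubectl -n jhub describe pvc hub-db-dir
  kubectl get storageclass
}
```

The events at the bottom of the `describe pvc` output usually state why a volume could not be provisioned or bound.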
113 changes: 113 additions & 0 deletions caseyd/aws/eksctl/cluster.yaml
@@ -0,0 +1,113 @@
# file: cluster.yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: jupyterhub
  region: us-west-2

iam:
  withOIDC: true
  serviceAccounts:
    - metadata:
        name: cluster-autoscaler
        namespace: kube-system
        labels:
          aws-usage: "cluster-ops"
          app.kubernetes.io/name: cluster-autoscaler
      attachPolicy:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Action:
              - "autoscaling:DescribeAutoScalingGroups"
              - "autoscaling:DescribeAutoScalingInstances"
              - "autoscaling:DescribeLaunchConfigurations"
              - "autoscaling:DescribeTags"
              - "autoscaling:SetDesiredCapacity"
              - "autoscaling:TerminateInstanceInAutoScalingGroup"
              - "ec2:DescribeLaunchTemplateVersions"
            Resource: '*'
    - metadata:
        name: ebs-csi-controller-sa
        namespace: kube-system
        labels:
          aws-usage: "cluster-ops"
          app.kubernetes.io/name: aws-ebs-csi-driver
      attachPolicy:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Action:
              - "ec2:AttachVolume"
              - "ec2:CreateSnapshot"
              - "ec2:CreateTags"
              - "ec2:CreateVolume"
              - "ec2:DeleteSnapshot"
              - "ec2:DeleteTags"
              - "ec2:DeleteVolume"
              - "ec2:DescribeInstances"
              - "ec2:DescribeSnapshots"
              - "ec2:DescribeTags"
              - "ec2:DescribeVolumes"
              - "ec2:DetachVolume"
            Resource: '*'

managedNodeGroups:
  - name: ng-us-west-2a
    instanceType: t3.medium
    volumeSize: 30
    desiredCapacity: 1
    privateNetworking: true
    availabilityZones:
      - us-west-2a
    tags:
      k8s.io/cluster-autoscaler/enabled: "true"
      k8s.io/cluster-autoscaler/jupyterhub: "owned"
  - name: ng-us-west-2b
    instanceType: t3.medium
    volumeSize: 30
    desiredCapacity: 1
    privateNetworking: true
    availabilityZones:
      - us-west-2b
    tags:
      k8s.io/cluster-autoscaler/enabled: "true"
      k8s.io/cluster-autoscaler/jupyterhub: "owned"
  # The name says "2c", but this group is placed in us-west-2d,
  # matching the cluster-level availabilityZones list below.
  - name: ng-us-west-2c
    instanceType: t3.medium
    volumeSize: 30
    desiredCapacity: 1
    privateNetworking: true
    availabilityZones:
      - us-west-2d
    tags:
      k8s.io/cluster-autoscaler/enabled: "true"
      k8s.io/cluster-autoscaler/jupyterhub: "owned"

availabilityZones: ["us-west-2a", "us-west-2b", "us-west-2d"]

# Adding the EBS CSI driver addon to try to resolve permissions.
# As of 2023/04/11 this does not seem to work.
# https://discourse.jupyter.org/t/hub-pod-stuck-on-pending-timed-out-binding-volumes/17176
addons:
  - name: aws-ebs-csi-driver
    attachPolicy:
      Version: "2012-10-17"
      Statement:
        - Effect: Allow
          Action:
            - "ec2:AttachVolume"
            - "ec2:CreateSnapshot"
            - "ec2:CreateTags"
            - "ec2:CreateVolume"
            - "ec2:DeleteSnapshot"
            - "ec2:DeleteTags"
            - "ec2:DeleteVolume"
            - "ec2:DescribeInstances"
            - "ec2:DescribeSnapshots"
            - "ec2:DescribeTags"
            - "ec2:DescribeVolumes"
            - "ec2:DetachVolume"
          Resource: '*'
12 changes: 12 additions & 0 deletions caseyd/aws/jup-default.yaml
@@ -0,0 +1,12 @@
# This file can update the JupyterHub Helm chart's default configuration values.
#
# For reference see the configuration reference and default values, but make
# sure to refer to the Helm chart version of interest to you!
#
# Introduction to YAML: https://www.youtube.com/watch?v=cdLNKUoMc6c
# Chart config reference: https://zero-to-jupyterhub.readthedocs.io/en/stable/resources/reference.html
# Chart default values: https://github.com/jupyterhub/zero-to-jupyterhub-k8s/blob/HEAD/jupyterhub/values.yaml
# Available chart versions: https://jupyterhub.github.io/helm-chart/
#


62 changes: 62 additions & 0 deletions caseyd/aws/tf/README.md
@@ -0,0 +1,62 @@

## Terraform Commands

### Install TF Requirements
```
terraform init
```

### Validate Terraform file syntax
```
terraform validate
```

### Preview Changes
```
terraform plan
```

### Apply Terraform files
```
terraform apply
```

When provisioning completes, Terraform prints details about the cluster:

```
cluster_endpoint = "https://E44319CC44678D8EE100B7C42A46AE5D.gr7.us-west-2.eks.amazonaws.com"
cluster_name = "education-eks-pAGhwfz9"
cluster_security_group_id = "sg-01f527e90fdbf2f6d"
region = "us-west-2"
```
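To make sure that what gets applied is exactly what was previewed, the plan can be saved to a file and that file applied. A minimal sketch; the function name `plan_and_apply` and the file name `eks.tfplan` are arbitrary choices for illustration (the name matches the `*.tfplan` pattern in the `.gitignore` added under `provision-eks-cluster`):

```shell
# Save the plan to a file, then apply exactly that saved plan.
# Assumption: "eks.tfplan" is an arbitrary file name chosen for this sketch.
plan_and_apply() {
  terraform plan -out=eks.tfplan &&
    terraform apply eks.tfplan
}
```

Applying a saved plan does not prompt for confirmation, so review the `terraform plan` output before the apply step runs.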

### Show the current terraform state
```
terraform show
```

This also shows the cluster output information.


## Configure kubectl for the new cluster

```
aws eks update-kubeconfig --name <clustername>
```

Update the kubeconfig from the Terraform output (run from the EKS Terraform directory):
```
aws eks update-kubeconfig --name $(terraform output -raw cluster_name)
```
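The command above relies on the AWS CLI's default region. Since the Terraform outputs shown earlier also include `region`, both values can be passed explicitly. A sketch under that assumption; the function name `update_kubeconfig_from_tf` is hypothetical:

```shell
# Point kubectl at the new cluster using the region and cluster name
# that Terraform reports as outputs (run from the EKS Terraform directory).
update_kubeconfig_from_tf() {
  aws eks update-kubeconfig \
    --region "$(terraform output -raw region)" \
    --name "$(terraform output -raw cluster_name)"
}
```

After running it, `kubectl get nodes` is a quick check that the kubeconfig points at the new cluster.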



## Deleting a terraform deployment
```
terraform destroy
```


## References
* [Terraform EKS Example](https://developer.hashicorp.com/terraform/tutorials/kubernetes/eks)
* [Terraform Helm Example](https://developer.hashicorp.com/terraform/tutorials/kubernetes/helm-provider)
27 changes: 27 additions & 0 deletions caseyd/aws/tf/provision-eks-cluster/.gitignore
@@ -0,0 +1,27 @@
# Local .terraform directories
**/.terraform/*

# .tfstate files
*.tfstate
*.tfstate.*
*.tfplan

# Crash log files
crash.log

# Exclude all .tfvars files, which are likely to contain sensitive data, such as
# passwords, private keys, and other secrets. These should not be part of version
# control as they are data points which are potentially sensitive and subject
# to change depending on the environment.
*.tfvars

# Ignore override files as they are usually used to override resources locally and so
# are not checked in
override.tf
override.tf.json
*_override.tf
*_override.tf.json

# Ignore CLI configuration files
.terraformrc
terraform.rc