Integrating Serge in your orchestration

Juan Calderon-Perez edited this page Jul 10, 2023 · 9 revisions

Docker Compose example

You can deploy Serge using only docker-compose, with Let's Encrypt certificates configured on your host (a domain name is required).

  1. Add a new container with traefik configured, which will serve web traffic on ports 80 and 443:

Replace DOMAIN.COM with a domain name that you own and that points to your server's IP address in DNS.

The following examples configure CHAT.DOMAIN.COM to serve Serge traffic.

  traefik:
    container_name: traefik
    image: "traefik:latest"
    command:
      - --entrypoints.web.address=:80
      - --entrypoints.websecure.address=:443
      - --entrypoints.web.http.redirections.entrypoint.to=websecure
      - --entrypoints.web.http.redirections.entrypoint.scheme=https
      - --providers.docker=true
      - --providers.docker.exposedByDefault=false
      - --api
      - --api.insecure=true
      - --certificatesresolvers.leresolver.acme.httpchallenge=true
      - --certificatesresolvers.leresolver.acme.email=EMAIL@DOMAIN.COM
      - --certificatesresolvers.leresolver.acme.storage=/data/acme.json
      - --certificatesresolvers.leresolver.acme.httpchallenge.entrypoint=web
      - --log.level=WARN
    network_mode: "host" 
    ports:
      - "80:80"
      - "443:443"
      - "127.0.0.1:8080:8080"
    volumes:
      - "/var/run/docker.sock:/var/run/docker.sock:ro"
      - "/home/ubuntu/traefik-data/:/data"
    labels:
      - "traefik.http.routers.http-catchall.rule=hostregexp(`{host:.+}`)"
      - "traefik.http.routers.http-catchall.entrypoints=web"
      - "traefik.http.routers.http-catchall.middlewares=redirect-to-https"
      - "traefik.http.middlewares.redirect-to-https.redirectscheme.scheme=https"
      - "traefik.http.routers.localhost.rule=Host(`host.DOMAIN.COM`)"
      - "traefik.http.routers.localhost.tls.certresolver=leresolver"
      - "traefik.http.services.localhost.loadbalancer.server.port=3000"
  2. Add labels to the serge container so that traefik acts as a reverse proxy for it:
....
    ports:
      - "8008:8008"
    labels:
      # Traefik uses labels for container to add it to configuration
      - "traefik.enable=true"
      - "traefik.http.services.jitsu.loadbalancer.server.port=8008"
      - "traefik.http.routers.jitsu.rule=Host(`CHAT.DOMAIN.COM`)"
      - "traefik.http.routers.jitsu.tls.certresolver=leresolver"
      - "traefik.http.routers.jitsu.entrypoints=websecure"
....
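
The traefik service above stores its certificates in /data/acme.json, which maps to a directory on the host. Traefik normally creates this file itself, but it rejects an existing file whose permissions are more open than 600. If you prefer to prepare the file in advance, the following is a minimal sketch; the `TRAEFIK_DATA_DIR` variable is a hypothetical name introduced here, and you would point it at the host path used in the volume mapping (/home/ubuntu/traefik-data/ in the example).

```shell
#!/bin/sh
set -eu

# Assumption: this is the host directory mounted at /data in the traefik container.
# TRAEFIK_DATA_DIR is a hypothetical variable; it defaults to a local directory here.
TRAEFIK_DATA_DIR="${TRAEFIK_DATA_DIR:-./traefik-data}"

# Create the directory and an empty ACME storage file if they do not exist yet.
mkdir -p "$TRAEFIK_DATA_DIR"
touch "$TRAEFIK_DATA_DIR/acme.json"

# Restrict the file to its owner, since it will hold private keys.
chmod 600 "$TRAEFIK_DATA_DIR/acme.json"

# Show the resulting permissions for a quick visual check.
ls -l "$TRAEFIK_DATA_DIR/acme.json"
```

Run it once before starting the stack, for example with `TRAEFIK_DATA_DIR=/home/ubuntu/traefik-data sh prepare-acme.sh` (the script name is also just an example).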

Kubernetes example

You can deploy Serge using the manifests below, which contain the resource kinds required to run it on a Kubernetes cluster.

  1. Use this deployment manifest for your setup:
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: serge
  name: serge
  namespace: serge-ai
spec:
  ports:
    - name: "8008"
      port: 8008
      targetPort: 8008
    - name: "9124"
      port: 9124
      targetPort: 9124
  selector:
    app: serge
status:
  loadBalancer: {}
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: serge
  name: serge
  namespace: serge-ai
spec:
  replicas: 1
  selector:
    matchLabels:
      app: serge
  template:
    metadata:
      labels:
        app: serge
    spec:
      containers:
        - image: ghcr.io/serge-chat/serge:latest
          name: serge
          ports:
            - containerPort: 8008
            - containerPort: 9124
          resources:
            requests:
              cpu: 5000m
              memory: 5120Mi
            limits:
              cpu: 8000m
              memory: 8192Mi
          volumeMounts:
            - mountPath: /data/db
              name: datadb
            - mountPath: /usr/src/app/weights
              name: weights
      restartPolicy: Always
      volumes:
        - name: datadb
          persistentVolumeClaim:
            claimName: datadb
        - name: weights
          persistentVolumeClaim:
            claimName: weights
status: {}
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    app: serge
  name: weights
  namespace: serge-ai
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 64Gi
status: {}
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    app: serge
  name: datadb
  namespace: serge-ai
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 16Gi
status: {}
---

You can now deploy Serge with the following commands:

$ kubectl create ns serge-ai
$ kubectl apply -f manifest.yaml
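
As a declarative alternative to `kubectl create ns`, the namespace could also be described as a manifest. This is a minimal sketch, assuming you place it at the very top of manifest.yaml so that it is applied before the resources that live in it:

```yaml
---
# Namespace used by all Serge resources on this page.
apiVersion: v1
kind: Namespace
metadata:
  name: serge-ai
```

With this in place, `kubectl apply -f manifest.yaml` alone should be enough, and re-running it stays idempotent.
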
  2. After noting the Pod name, you can add the supported Alpaca models with the following commands:
$ kubectl get pod -n serge-ai
NAME                     READY   STATUS    RESTARTS   AGE
serge-58959fb6b7-px76v   1/1     Running   0          8m42s

$ kubectl exec -it serge-58959fb6b7-px76v -n serge-ai -- python3 /usr/src/app/api/utils/download.py tokenizer 7B
  3. If you have an IngressClass on your cluster, you can run Serge behind an ingress. Below is an example with an Nginx IngressClass:
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: serge-ingress
  namespace: serge-ai
  annotations:
    nginx.ingress.kubernetes.io/configuration-snippet: |
      proxy_set_header Upgrade $http_upgrade;
      proxy_set_header Connection upgrade;
      proxy_set_header Accept-Encoding gzip;
    nginx.org/websocket-services: serge
    nginx.ingress.kubernetes.io/cors-allow-methods: "PUT, GET, POST, OPTIONS, DELETE"
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - MY-DOMAIN.COM # EDIT HERE
      secretName: serge-tls
  rules:
    - host: MY-DOMAIN.COM # EDIT HERE
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: serge
                port:
                  number: 8008
---
  4. If you have Cert-Manager installed, you can create a TLS certificate with the following YAML file:
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: serge-tls
  namespace: serge-ai
spec:
  secretName: serge-tls
  issuerRef:
    name: acme-issuer
    kind: ClusterIssuer
  dnsNames:
  - 'MY-DOMAIN.COM' # EDIT HERE
  privateKey:
    algorithm: RSA
    encoding: PKCS1
    size: 4096
---
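
The Certificate above references a ClusterIssuer named acme-issuer, which this page does not define. If your cluster does not have one yet, the following is a minimal sketch of what it might look like, assuming Let's Encrypt with the HTTP-01 challenge solved through the nginx ingress class. The email address and the account-key secret name are placeholders chosen for illustration, not values from this page.

```yaml
---
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  # Must match issuerRef.name in the Certificate.
  name: acme-issuer
spec:
  acme:
    # Let's Encrypt production endpoint.
    server: https://acme-v02.api.letsencrypt.org/directory
    # Placeholder contact address: EDIT HERE
    email: EMAIL@MY-DOMAIN.COM
    privateKeySecretRef:
      # Hypothetical name of the Secret that stores the ACME account key.
      name: acme-issuer-account-key
    solvers:
      - http01:
          ingress:
            # Assumes the same Nginx ingress class as the Ingress example.
            class: nginx
```

A ClusterIssuer is cluster-scoped, so it takes no namespace and can be shared by certificates in other namespaces as well.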