# Deploying Your RI Platform Cluster

With requirements satisfied and configuration files populated, you are now ready to deploy the RI Platform. At a high level, there are two main ways to do this: as a standalone Kubernetes cluster (recommended) or integrated within an existing Kubernetes cluster. The former is achieved by specifying `create_eks = true`, whereas the latter is specified by `create_eks = false` and including a `cluster_name`.

The steps below may vary, depending on your infrastructure.

## Deploy the Cluster

1. Open a terminal session on a box containing the installation tools.
2. Verify that your Terraform configuration files (`main.tf` and `backend.tf`) are present in your working directory.
3. Authenticate your AWS CLI.
   ```shell
   aws sts get-caller-identity  # verify you're in the right account
   ```
4. Add the Robust Intelligence Helm repository (or your private registry, if configured).
   ```shell
   helm repo add robustintelligence https://robustintelligence.github.io/helm --force-update
   ```
5. Initialize your Terraform environment.
   ```shell
   terraform init
   ```
6. Verify your Terraform plan (recommended).
   ```shell
   terraform plan -out "rime.plan" | tee "rime-plan.txt"
   less rime-plan.txt  # proofread the changes
   ```
7. **Terraform apply!** (This step can take up to ~30 minutes.)
   ```shell
   terraform apply "rime.plan"  # if you skipped step 6, you can omit the "rime.plan"
   ```

---

## Validate Your Deployment

Once the `terraform apply` command completes, your cluster should be operational! The following actions can help verify that all services are up and running.

### Load Balancer ALPN Policy

1. Find the load balancer used by the `rime-kong-proxy` with `kubectl get svc rime-kong-proxy`.
2. Locate the load balancer in your AWS console.
3. In the "Listeners" section, verify that the `TLS: 443` listener's ALPN policy is set to `HTTP2Preferred`.

### Kubernetes Services

1. Point your local `kubectl` to the new cluster.
   ```shell
   aws eks --region us-west-2 update-kubeconfig --name
   ```
2. Inspect the running pods.
   ```shell
   kubectl get pods -n
   ```
   Your output should look something like this:
   ```
   NAME                                      READY   STATUS    RESTARTS   AGE
   rime-agent-job-monitor-6bddd4697d-t9118   1/1     Running   0          5m26s
   rime-agent-launcher-56bc47549c-dod60      1/1     Running   0          5m26s
   rime-frontend-cd6c89884-8ljrl             1/1     Running   0          5m26s
   ...
   ```
3. Verify you can access the web client at your `rime` sub-domain. This domain is the value you configured during DNS setup and will be of the form (`rime..com`).
4. Use cURL to query your version endpoint and verify that metadata is successfully returned:
   ```shell
   curl --location --request GET rime..com/v1/rime-info
   ```
5. Verify you can make an API token in the web client using this [guide](/for_admins/how_to_guides/api-authentication.md).
6. Test your Python SDK connection using the API token you made:
   ```shell
   pip install rime-sdk
   ```
   ```python
   from rime_sdk import Client

   rime_client = Client("rime..com", "")
   project = rime_client.create_project("Health Check", "Testing the SDK's upstream connection.")
   ```
7. Return to the web client and verify that a project was created.

If everything succeeds, you are ready to achieve ML Integrity with the RI Platform!

## Configure Backups

Backups ensure that your team can restore your testing data in the event of a disaster. If your cluster has been successfully deployed (and you opted in via `install_velero = true`), you can configure backups using the steps below.

1. Download Velero.
   ```shell
   curl -fsSL -o velero-v1.6.3-linux-amd64.tar.gz https://github.com/vmware-tanzu/velero/releases/download/v1.6.3/velero-v1.6.3-linux-amd64.tar.gz
   tar -xvf velero-v1.6.3-linux-amd64.tar.gz
   ```
2. Ensure that your backups are scheduled properly.
   ```shell
   ./velero schedule get -n rime-extras
   ```

---

## Troubleshooting

1. If you are getting timeouts in the SDK, ensure that you are connected to your VPN.
2. If the webapp is marked as insecure, verify that you have an ACM SSL/TLS certificate for your webapp.
3. On older operating systems, you may need to run `export GRPC_DNS_RESOLVER=native` in the shell. Otherwise, requests may hang due to IPv4 vs. IPv6 issues.
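
When diagnosing a deployment, the pod inspection step from the validation section can be scripted rather than read by eye. The following is a minimal sketch and not part of the RI Platform tooling: the helper name `find_unhealthy_pods` is hypothetical, and it assumes the default column layout of `kubectl get pods` (NAME, READY, and STATUS as the first three columns).

```python
from typing import List, Tuple


def find_unhealthy_pods(kubectl_output: str) -> List[Tuple[str, str, str]]:
    """Return (name, ready, status) for each pod that is not fully ready.

    Hypothetical helper: expects the plain-text table printed by
    `kubectl get pods`, including its header row.
    """
    unhealthy = []
    lines = [line for line in kubectl_output.strip().splitlines() if line.strip()]
    for line in lines[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 3:
            continue  # ignore truncated rows such as "..."
        name, ready, status = fields[0], fields[1], fields[2]
        ready_count, _, total_count = ready.partition("/")
        fully_ready = ready_count == total_count and status == "Running"
        # Pods from finished jobs report "Completed"; treat them as healthy.
        if not fully_ready and status != "Completed":
            unhealthy.append((name, ready, status))
    return unhealthy
```

Pass the function the text captured from `kubectl get pods` in your RI Platform namespace. An empty list means every listed pod is either `Running` with all containers ready or `Completed`; any other entry names a pod worth inspecting with `kubectl describe pod`.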