Installing Lamini Platform on Kubernetes

Lamini Platform on Kubernetes enables multi-node, multi-GPU inference and training running on your own GPUs, in the environment of your choice.

Prerequisites

Lamini Enterprise access

Contact us for access to the Kubernetes installer to host Lamini Platform on your own GPUs or in your cloud VPC.

Hardware system requirements

  • 64 GB CPU memory
  • 1 TB disk
  • CPU memory per GPU: 2× the GPU's HBM capacity

  • Example: an AMD MI250 has 64GB of HBM, so Lamini requires 128GB of RAM per GPU.

  • Example: an AMD MI300 has 192GB of HBM, so Lamini requires 384GB of RAM per GPU.
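Before installing, you can sanity-check a node's CPU memory and disk against these requirements. A quick sketch on Linux:

```shell
# Total CPU memory in GB (requirement: at least 64 GB,
# and 2x the HBM capacity of each GPU)
free -g | awk '/^Mem:/ {print "RAM: " $2 " GB"}'

# Free disk space on the install target (requirement: 1 TB)
df -h /
```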

NFS Provisioner

Lamini requires an NFS provisioner that supports RWX (ReadWriteMany) volumes. For example, you can set up a simple provisioner using nfs-subdir-external-provisioner:

helm repo add nfs-subdir-external-provisioner https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner/
helm install nfs-subdir-external-provisioner nfs-subdir-external-provisioner \
    --set nfs.server=<NFS_IP> \
    --set nfs.path=<NFS_SUBFOLDER_PATH>
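To confirm the provisioner works before installing Lamini, you can create a small throwaway RWX claim and check that it binds. The claim name below is hypothetical; nfs-client is the default storage class created by nfs-subdir-external-provisioner:

```yaml
# test-pvc.yaml -- apply with `kubectl apply -f test-pvc.yaml`,
# then confirm STATUS becomes Bound via `kubectl get pvc nfs-test`
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-test
spec:
  accessModes: ["ReadWriteMany"]
  storageClassName: nfs-client
  resources:
    requests:
      storage: 1Gi
```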

GPU Operator

  • For AMD:
    git clone https://github.com/ROCm/k8s-device-plugin.git
    kubectl create -f k8s-device-plugin/k8s-ds-amdgpu-dp.yaml
    
  • For NVIDIA:
    helm repo add nvidia https://helm.ngc.nvidia.com/nvidia \
      && helm repo update
    helm install --wait --generate-name \
      -n gpu-operator --create-namespace \
      nvidia/gpu-operator
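Once the device plugin (AMD) or GPU operator (NVIDIA) pods are running, the GPUs should appear as allocatable node resources (amd.com/gpu or nvidia.com/gpu). One quick way to check:

```shell
# Show node names and any GPU resource lines from the node descriptions
kubectl describe nodes | grep -E '^Name:|gpu'
```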
    

Docker

We recommend using Docker to generate the Helm charts as described below.

Make sure to switch to a user with sufficient RBAC privileges to deploy the Helm charts in the Kubernetes cluster. Typically that's root (sudo su).

Installation Steps

1. Update helm_config.yaml

  1. Set the name and size of the PVC used by the Lamini cluster. If the PVC has been created beforehand, ensure the name is correct, that it is in the lamini namespace, and set create to False:

    helm_config.yaml
    pvcLamini: {
       name: lamini-volume,
       size: 200Gi,
       create: True
    }
    
    We recommend more than 200Gi (and the more, the better!) for lamini-volume. Base models, trained weights, and datasets will all be stored on this volume.
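    If you pre-create the volume yourself (with create: False above), the claim might look like the following sketch. It must be RWX, live in the lamini namespace, and match the name in helm_config.yaml:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: lamini-volume
  namespace: lamini
spec:
  accessModes: ["ReadWriteMany"]
  storageClassName: nfs-client   # your provisioner's storage class
  resources:
    requests:
      storage: 200Gi
```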

  2. Update the PVC provisioner class name by changing the pvc_provisioner field.

    helm_config.yaml
    pvc_provisioner: nfs-client
    

  3. Confirm the top-level platform type (one of: amd, nvidia, or cpu) matches your hardware.

    helm_config.yaml
    type: "amd"
    

  4. Update the distribution of inference pods.

    helm_config.yaml
    inference: {
       type: ClusterIP,
       batch: 1,
       streaming: 1,
       embedding: 1,
       catchall: 1
    }
    
    The example above would create 4 pods using 4 GPUs in total. Each pod has 1 GPU. The example shows 1 inference pod allocated to batch inference, 1 pod dedicated only to streaming inference, 1 dedicated only to embedding inference (also used in classification), and 1 for the catchall pod, which is intended to handle requests for models that have not been preloaded on the batch pod. See Model Management for more details.
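    As a larger sketch, a node with 8 GPUs could dedicate more pods to batch and streaming inference while keeping one embedding pod and one catchall pod:

```yaml
inference: {
   type: ClusterIP,
   batch: 4,
   streaming: 2,
   embedding: 1,
   catchall: 1
}
```

    This would create 8 pods using 8 GPUs in total, with each pod still using 1 GPU.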

  5. Update the number of training pods and number of GPUs per pod:

    helm_config.yaml
    training: {
       type: ClusterIP,
       num_pods: 1,
       num_gpus_per_pod: 8
    }
    
    We recommend minimizing the number of pods per node. For example, instead of 2 pods with 4 GPUs, it's better to create 1 pod with all 8 GPUs.
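    Following that recommendation, a hypothetical two-node cluster with 8 GPUs per node would be configured as 2 pods of 8 GPUs each, rather than 4 pods of 4:

```yaml
training: {
   type: ClusterIP,
   num_pods: 2,
   num_gpus_per_pod: 8
}
```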

  6. Update the node affinity for the Lamini deployment. These are the nodes where Lamini pods will be deployed:

    helm_config.yaml
    nodes: [
       "node0"
    ]
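    The entries must match node names exactly as reported by kubectl get nodes. For example, to span two nodes (hypothetical names):

```yaml
nodes: [
   "node0",
   "node1"
]
```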
    

  7. (Optional) If you want to use a custom ingress pathway, update the ingress field:

    helm_config.yaml
    ingress: 'ingress/pathway'
    

2. Generate and Deploy Helm Charts

The Lamini Platform Kubernetes deployment consists of 2 Helm charts:

  • lamini: Most Lamini services. Will be upgraded when updating to a new version of Lamini Platform.
  • persistent-lamini: Components that are meant to be persistent and unchanging across Lamini Platform upgrades. These include the Lamini database, PVC, and the Kubernetes secret to download new Lamini images.

First time install

Run ./install.sh - This takes helm_config.yaml and dynamically creates the Helm charts to deploy lamini & persistent-lamini.

Upgrade

Run ./upgrade.sh - This recreates the Helm charts and redeploys lamini without touching persistent-lamini.
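After install.sh or upgrade.sh completes, you can check that all Lamini pods reach the Running state. A quick sketch:

```shell
# List Lamini pods and flag any that are not yet Running
kubectl get pods -n lamini --no-headers \
  | awk '$3 != "Running" {print "not ready: " $1; n++} END {print n+0 " pod(s) pending"}'
```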

That's it! You're up and running with Lamini Platform on Kubernetes.

Get Lamini version

Run the following command to find the tag of the container image:

kubectl get deployments -o wide -n lamini
Look for the tag of the images listed in the IMAGES column.
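Alternatively, a jsonpath query can print the tags directly (the registry path in the test below is illustrative):

```shell
# Print just the image tags of the Lamini deployments
kubectl get deployments -n lamini \
  -o jsonpath='{.items[*].spec.template.spec.containers[*].image}' \
  | tr ' ' '\n' | sed 's/.*://'
```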