Kubernetes Shim Design
GitHub repo: https://github.com/apache/incubator-yunikorn-k8shim
Please read the architecture doc before reading this one. You need to understand the 3-layer design of YuniKorn in order to understand what the Kubernetes shim is.
The Kubernetes shim
The YuniKorn Kubernetes shim is responsible for talking to Kubernetes. It translates the Kubernetes cluster resources and resource requests through the scheduler interface and sends them to the scheduler core. When a scheduling decision is made, the shim binds the pod to the specific node. All communication between the shim and the scheduler core goes through the scheduler-interface.
The admission controller
The admission controller runs in a separate pod. It runs a mutation webhook and a validation webhook, where:
- The mutation webhook mutates the pod spec by:
  - Adding schedulerName: yunikorn
    - By explicitly specifying the scheduler name, the pod will be scheduled by the YuniKorn scheduler.
  - Adding the applicationId label
    - When a label applicationId exists, reuse the given applicationId.
    - When a label spark-app-selector exists, reuse the given Spark app ID.
    - Otherwise, assign a generated application ID for this pod, using the convention yunikorn-<namespace>-autogen. This is unique per namespace.
  - Adding the queue label
    - When a label queue exists, reuse the given queue name. Note that if placement rules are enabled, the value set in the label is ignored.
    - Otherwise, add queue: root.default
  - Adding the disableStateAware label
    - If the pod was assigned a generated applicationId by the admission controller, also set disableStateAware: true. This causes the generated application to transition immediately from the Starting to the Running state so that it will not block other applications.
- The validation webhook validates the configuration set in the configmap
  - This is used to prevent writing malformed configuration into the configmap.
  - The validation webhook calls the scheduler validation REST API to validate configmap updates.
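
The mutation rules listed above can be sketched in Go as a small decision function. This is a minimal illustration, not the admission controller's actual code: the function name mutateLabels and its signature are hypothetical, and the choice to leave an existing spark-app-selector label untouched (rather than copying it into applicationId) is an assumption.

```go
package main

import "fmt"

// mutateLabels is a hypothetical helper that applies the documented
// defaults. It returns the scheduler name to set on the pod spec and a
// copy of the labels with applicationId, queue, and disableStateAware
// filled in where the rules call for it.
func mutateLabels(namespace string, labels map[string]string) (string, map[string]string) {
	out := make(map[string]string, len(labels)+3)
	for k, v := range labels {
		out[k] = v
	}

	// Application ID: reuse applicationId or spark-app-selector if present
	// (assumption: the existing label is simply left as it is).
	_, hasAppID := out["applicationId"]
	_, hasSparkID := out["spark-app-selector"]
	if !hasAppID && !hasSparkID {
		// Otherwise generate an ID that is unique per namespace and
		// mark the application so it skips the Starting state.
		out["applicationId"] = "yunikorn-" + namespace + "-autogen"
		out["disableStateAware"] = "true"
	}

	// Queue: reuse the given queue name, else fall back to root.default.
	if _, ok := out["queue"]; !ok {
		out["queue"] = "root.default"
	}

	// The scheduler name is always set explicitly.
	return "yunikorn", out
}

func main() {
	sched, labels := mutateLabels("dev", map[string]string{})
	fmt.Println(sched, labels["applicationId"], labels["queue"], labels["disableStateAware"])
}
```

A pod that already carries applicationId and queue labels passes through with those values unchanged and without a disableStateAware label.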
 
Admission controller deployment
Currently, the admission controller is deployed by a post-start hook in the scheduler deployment. Similarly, it is
uninstalled by a pre-stop hook. See the related code here.
During installation, the admission controller is expected to always be co-located with the scheduler pod. This is done
by adding pod affinity to the admission-controller pod, like:
podAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: component
          operator: In
          values:
          - yunikorn-scheduler
      topologyKey: "kubernetes.io/hostname"
It also tolerates all taints, in case the scheduler pod has some tolerations set.
tolerations:
- operator: "Exists"
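
To see why a toleration with only operator: "Exists" tolerates every taint, the Kubernetes matching rules can be sketched in simplified form. The Taint and Toleration structs and the tolerates function below are simplified stand-ins written for this illustration, not the Kubernetes API types themselves: an empty key matches any taint key, an empty effect matches any effect, and the Exists operator ignores the value.

```go
package main

import "fmt"

// Taint and Toleration are simplified stand-ins for the Kubernetes
// core/v1 types, reduced to the fields that matter for matching.
type Taint struct {
	Key, Value, Effect string
}

type Toleration struct {
	Key, Operator, Value, Effect string
}

// tolerates reports whether tol matches taint under the simplified rules.
func tolerates(tol Toleration, taint Taint) bool {
	// An empty effect on the toleration matches every effect.
	if tol.Effect != "" && tol.Effect != taint.Effect {
		return false
	}
	// An empty key on the toleration matches every key.
	if tol.Key != "" && tol.Key != taint.Key {
		return false
	}
	switch tol.Operator {
	case "Exists":
		// Exists ignores the taint value.
		return true
	case "", "Equal":
		// Equal is the default operator and compares values.
		return tol.Value == taint.Value
	default:
		return false
	}
}

func main() {
	all := Toleration{Operator: "Exists"}
	taint := Taint{Key: "dedicated", Value: "batch", Effect: "NoSchedule"}
	fmt.Println(tolerates(all, taint))
}
```

With both key and effect left empty, the toleration used by the admission controller matches regardless of which node the scheduler pod lands on.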