Lesson 01

SAS Viya Administration Operations
Lesson 01, Section 0 Exercise: Validate Environment

Validate the Workshop Environment

In this exercise you will validate the status of the SAS Viya platform deployment you will be using for the remainder of the workshop. In addition to verifying the status and health of the SAS Viya components, you will also validate functional aspects of both SAS Viya and the third-party applications you will work with later.

List existing namespaces

  1. From the collection’s Windows machine, open MobaXterm and initiate a new session to sasnode01, the Linux machine you will use to access the cluster.

  2. List the namespaces in the Kubernetes cluster.

    kubectl get ns

    Efficiency tip…

    Many of the commands you see throughout this workshop are long and somewhat challenging to type accurately. While you are certainly free to type the commands for yourself, we encourage you to copy the commands you see in the instructions and paste them into your MobaXterm session.

    To copy: Select the Copy to clipboard icon which will copy all of the code from the code portlet to your clipboard. You do not have to highlight the text yourself for this to work.


    To paste: In MobaXterm, right-click to paste the selected text.

  3. The key namespaces to verify are:

    • gelcorp
    • v4mlog
    • v4mmon

    There will be additional namespaces listed, but if you see these three, you can move on.
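
    If you prefer a more targeted check, you can ask kubectl for exactly these three namespaces; the command returns an error for any namespace that does not exist.

    kubectl get ns gelcorp v4mlog v4mmon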

  4. To simplify all subsequent kubectl commands, set the default namespace to gelcorp. This will effectively add --namespace gelcorp to every kubectl command.

    gel_setCurrentNamespace gelcorp

    You should see

    THE DEFAULT KUBERNETES DEPLOYMENT NAMESPACE IS: gelcorp
    
    If you want to change the current namespace, please run this command: gel_setCurrentNamespace [yourNamespace]
    
    and
    
    Context "gelcluster" modified.

Check status of SAS Viya pods

  1. Let’s verify the status of your SAS Viya pods. Maximize your MobaXterm window and run

    kubectl get pods -o wide

    You should see that all of the pods have a status of Running or Completed. The output will also show you how the Viya pods are distributed across the nodes in your cluster.

    A status of Running does not necessarily signal that the pod is ‘open for business’, so now let’s try reaching an endpoint on each pod.

  2. Run the gel_ReadyViya4 script to make sure all Viya pods are reporting in as Ready.

    gel_ReadyViya4 -n gelcorp

    You should see a message similar to the following:

    NOTE: POD labeled sas-readiness in namespace gelcorp is sas-readiness-6475759dc9-jld24
    NOTE: Viya namespace gelcorp is running Stable 2024.03 : 20240425.1714076884655
    NOTE: All checks passed. Marking as ready. The first recorded failure was 8m4s ago.
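
    If you prefer plain kubectl over the helper script, a rougher equivalent is to wait for the sas-readiness pod to report Ready. This is only a sketch: the app=sas-readiness label is an assumption based on the pod name shown above.

    kubectl wait pods -l app=sas-readiness --for condition=ready --timeout 600s
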
  3. Display cadence information of your Viya deployment.

    kubectl get configmaps -o yaml | grep CADENCE | head -8

    Except for SAS_CADENCE_RELEASE, your values should match those below.

    SAS_BASE_CADENCE_NAME: stable
    SAS_BASE_CADENCE_VERSION: "2024.03"
    SAS_CADENCE_DISPLAY_NAME: Long-Term Support 2024.03
    SAS_CADENCE_DISPLAY_SHORT_NAME: Long-Term Support
    SAS_CADENCE_DISPLAY_VERSION: "2024.03"
    SAS_CADENCE_NAME: lts
    SAS_CADENCE_RELEASE: "20240517.1700181118101"
    SAS_CADENCE_VERSION: "2024.03"

Validate Applications

  1. During the deployment, a file was written to the cloud-user’s home directory with application URLs. List that file.

    cat ~/urls.md

    Your listing should look like this, but with different hostname components embedded in the URLs; the URLs below will not work as shown. To open an application, highlight one of the URLs in your MobaXterm window (which automatically copies it), open a browser on the client machine, and paste the copied text into the address bar.

    # List of URLs for your environment
    
    * [Airflow]( http://airflow.*hostname*.race.sas.com )
    * [Alert Manager]( http://alertmanager.*hostname*.race.sas.com/ )
    * [Grafana (u=admin p=lnxsas)]( http://grafana.*hostname*.race.sas.com/ )
    * [OpenSearch Dashboards (u=logadm p=lnxsas, u=admin p=lnxsas)]( http://osd.*hostname*.race.sas.com/ )
    * [Prometheus]( http://prometheus.*hostname*.race.sas.com/ )
    * [SAS Drive (gelcorp) ]( https://gelcorp.*hostname*.race.sas.com/SASDrive/ )
    * [SAS Environment Manager (gelcorp) ]( https://gelcorp.*hostname*.race.sas.com/SASEnvironmentManager/ )
    * [SAS Studio (gelcorp) ]( https://gelcorp.*hostname*.race.sas.com/SASStudio/ )
    * [SAS Visual Analytics (gelcorp) ]( https://gelcorp.*hostname*.race.sas.com/SASVisualAnalytics/ )
  2. Verify that you can log in to SAS Environment Manager.

    User: geladm
    Password: lnxsas
    • Examine the users and groups
    • Do you see any pages that are new for Viya?
    • Are there pages missing that you are used to seeing in SAS Environment Manager?
  3. Verify that you can log in to Grafana.

    User: admin
    Password: lnxsas
    • See if you can display the SAS CAS Overview dashboard

    There will be much more on Grafana later in the workshop.

  4. Verify that you can log in to OpenSearch Dashboards.

    User: logadm
    Password: lnxsas
    • Locate the Dashboard page
    • Open the Log Message Volumes with Levels dashboard

    There will be more coverage of OpenSearch Dashboards later in the workshop.

Let the workshop leader know if you have trouble verifying access to any of the applications listed in urls.md.

This completes the exercise.


Lesson 02

SAS Viya Administration Operations
Lesson 02, Section 1 Exercise: Working with Labels

Working with Labels

Introduction

In this exercise you will experiment with selecting Kubernetes resources using labels. You will experiment with pod listings because the large number of pods provides plenty of options for practicing label selectors. You will learn how to:

  • Discover labels associated with pods
  • Filter pod listings for pods that match a given label selector
  • Combine multiple selector queries

As you work through the following steps you can refer to this table of operators for building selector queries.

Operator   Meaning                                    Example                    Example meaning
=          equal to                                   ‘env=prod’                 The env key has a value of prod
!=         not equal to                               ‘env!=qa’                  The env key has a value that is not qa
in         occurs in a list                           ‘env in (prod,test)’       The env key value is either prod or test
notin      does not occur in the listed values        ‘env notin (prod,dev)’     The env key value is not prod or dev; or the env key is unassigned
exists     (key only) the referenced key exists       ‘env’                      The env key exists but the value is not tested
!exists    (key only) the referenced key is not assigned  ‘!env’                 The env key does not exist for an object
,          logical AND joining multiple expressions   ‘env=prod, tier=compute’   The env key value is prod AND the tier key value is compute
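
For example, the comma (logical AND) operator and the in operator can be combined in a single query. The sketch below uses two label keys you will work with later in this exercise.

    kubectl get pods --selector 'sas.com/deployment=sas-viya, workload.sas.com/class in (stateful,stateless)'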

Set the default namespace

  • Set the default namespace to gelcorp so you can omit -n gelcorp from any kubectl command intended for the gelcorp namespace.

    gel_setCurrentNamespace gelcorp

Discovering labels

Before you can reference labels you probably need a way to find out which labels have been assigned.

  • The easiest way to show all pod labels is to employ the --show-labels option with a get pods command. As a warning, the following command generates a very wide pod listing so you may want to maximize your MobaXterm window before running it to minimize line wrapping.

    kubectl get pods --show-labels
  • You can also use the describe command to view the labels for any single object. For the following example, replace <paste-a-pod-name-here-from-your-console> with the name of any pod of interest from the output of your previous command.

    kubectl describe pod <paste-a-pod-name-here-from-your-console>

Useful labels for pod selection

Now let’s look at a few labels that are quite useful for identifying key groups of pods in Viya.

NOTE: You can use either -l or --selector to specify a label query.

  1. If you took time to scan all of the labels from the --show-labels output you may have noticed that almost every pod in SAS Viya has the sas.com/deployment=sas-viya label.

    kubectl get pods --selector 'sas.com/deployment=sas-viya'
  2. That listing is pretty long so let’s use the wc command to count how many pods have the sas.com/deployment=sas-viya label. The --no-headers option prevents the column headings from polluting our count.

    kubectl get pods --selector 'sas.com/deployment=sas-viya' --no-headers | wc -l
  3. How does that compare to our total number of pods?

    kubectl get pods --no-headers | wc -l
  4. So now let’s use a label trick to discover which pods do not have the sas.com/deployment=sas-viya label. In this example, the selector '!sas.com/deployment' selects pods that do not have a label key of sas.com/deployment assigned.

    kubectl get pods --selector '!sas.com/deployment'
  5. The workload.sas.com/class key is another useful label for viewing pods according to the type of workload they represent. This label key drives the Viya workload placement strategy, which directs certain types of workload to specific nodes in the cluster. Expected values of the key are stateful, stateless, compute, connect, and cas.

    For example, it is oftentimes useful to be able to identify the pods for all of the stateful services (Consul, Postgres, RabbitMQ, Redis, OpenDistro Elastic, and Workload Orchestrator).

    kubectl get pods --selector 'workload.sas.com/class=stateful'
  6. Now take a look at the pods for stateless services.

    kubectl get pods --selector 'workload.sas.com/class=stateless'
  7. What are the results when you look for pods with the label workload.sas.com/class=cas?

    kubectl get pods --selector 'workload.sas.com/class=cas'

    You should see no pods returned for the class=cas query. While this may seem odd, we do not use a workload placement strategy for CAS on our RACE collections, to minimize the resources we require. Because we allow CAS pods to be scheduled on any node, the CAS pods themselves do not get the workload.sas.com/class label. This is not a recommended practice for real-world deployments.
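
    In an environment that does use workload placement for CAS, you could check how nodes are labeled with the -L option, which adds a column showing the value of the named label for each node (on this collection the column may be empty for some nodes):

    kubectl get nodes -L workload.sas.com/class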

Your turn

Using what you have learned so far, try to write a pod selector query that will return the requested list of pods. There are multiple ways of solving these queries but an example solution is provided for each one if you get stuck.

  1. List the pods related to Postgres, but not those of the Postgres server itself. You may see a different number of sas-crunchy-platform-postgres-repo1 jobs (or none at all), depending on how long your reservation has been running.

    NAME                                                      READY   STATUS      RESTARTS   AGE
    sas-crunchy-platform-postgres-backup-gzj6-xwhvh           0/1     Completed   0          4d21h
    sas-crunchy-platform-postgres-repo-host-0                 2/2     Running     0          4d21h
    sas-crunchy-platform-postgres-repo1-full-28057320-sjmzp   0/1     Completed   0          3d12h
    sas-crunchy-platform-postgres-repo1-incr-28058760-5jwcx   0/1     Completed   0          2d12h
    sas-crunchy-platform-postgres-repo1-incr-28060200-7cmjf   0/1     Completed   0          36h
    sas-crunchy-platform-postgres-repo1-incr-28061640-nbdqn   0/1     Completed   0          12h

    Click here to see one possible solution

    kubectl get pods --selector 'postgres-operator.crunchydata.com/cluster=sas-crunchy-platform-postgres, postgres-operator.crunchydata.com/role notin (replica,master)'
  2. List all of the CAS server pods that are managed by the CAS Operator.

    NAME                                READY   STATUS    RESTARTS   AGE
    sas-cas-server-default-controller   3/3     Running   0          4d21h

    Click here to see one possible solution

    kubectl get pods --selector app.kubernetes.io/managed-by=sas-cas-operator
  3. List all of the pods that have a job-name key assigned (your output will differ from the example)

    NAME                                                      READY   STATUS      RESTARTS   AGE
    sas-backup-purge-job-28055535-2xcsl                       0/2     Completed   0          4d17h
    sas-backup-purge-job-28056975-tg4q6                       0/2     Completed   0          3d17h
    sas-backup-purge-job-28058415-q5dzw                       0/2     Completed   0          2d17h
    sas-backup-purge-job-28059855-v5kf9                       0/2     Completed   0          41h
    sas-backup-purge-job-28061295-qflmk                       0/2     Completed   0          17h
    sas-create-openssl-ingress-certificate-6b4qm              0/1     Completed   0          4d21h
    sas-crunchy-platform-postgres-backup-gzj6-xwhvh           0/1     Completed   0          4d21h
    sas-crunchy-platform-postgres-repo1-full-28057320-sjmzp   0/1     Completed   0          3d12h
    sas-crunchy-platform-postgres-repo1-incr-28058760-5jwcx   0/1     Completed   0          2d12h
    sas-crunchy-platform-postgres-repo1-incr-28060200-7cmjf   0/1     Completed   0          36h
    sas-crunchy-platform-postgres-repo1-incr-28061640-nbdqn   0/1     Completed   0          12h
    sas-import-data-loader-28062120-v4pt4                     0/1     Completed   0          4h9m
    sas-import-data-loader-28062240-gmsl4                     0/1     Completed   0          129m
    sas-import-data-loader-28062360-zt5sk                     0/1     Completed   0          9m11s
    sas-pyconfig-cjinitial-t7zhw                              0/1     Completed   0          4d21h
    sas-scheduled-backup-job-28057020-gndz9                   0/2     Completed   0          3d17h
    sas-update-checker-28055349-x6zsp                         0/1     Completed   0          4d21h

    Click here to see one possible solution

    kubectl get pods --selector job-name

You have completed this exercise!


SAS Viya Administration Operations
Lesson 02, Section 2 Exercise: Kustomize Basics

In this hands-on you will use Kustomize to make changes to your Viya deployment.

In this hands-on we will demonstrate the use of Kustomize to:

  • Create a new Kubernetes resource: a PersistentVolumeClaim (PVC)
  • Update an existing Kubernetes resource to use the PersistentVolumeClaim

We will use a Kubernetes PersistentVolumeClaim to make data available to CAS. Our PVC is backed by NFS, but that detail is abstracted from the user. In the cloud, the PVC would most likely use a different type of storage. Don’t worry if you don’t understand all of the Kubernetes concepts yet; this section is intended to help you get oriented to Kustomize and kubectl.

NOTE: In this hands-on we use yq to update the YAML files on the command line, which is less error-prone than editing the files interactively. In each case where we use the yq command, we also show you the change you could make in an editor. Please use the yq approach in class to ensure your success.

Preliminary Tasks

  1. Set the current namespace.

    gel_setCurrentNamespace gelcorp
  2. Review the current contents of the kustomization.yaml file. Notice the different sections for:

    • resources
    • configurations
    • transformers
    • patches
    • generators
    • etc.
    cd ~/project/deploy/
    
    cat ~/project/deploy/${current_namespace}/kustomization.yaml
  3. Keep a copy of the original manifest and kustomization.yaml files. We will use these copies to track the changes your kustomization processing makes to these two files.

    cp -p ~/project/deploy/${current_namespace}/site.yaml /tmp/${current_namespace}/manifest_02-021.yaml
    cp -p ~/project/deploy/${current_namespace}/kustomization.yaml /tmp/${current_namespace}/kustomization_02-021.yaml
    cp -p ~/project/deploy/${current_namespace}/kustomization.yaml /tmp/${current_namespace}/kustomization.yaml.orig

Use Kustomize to create Persistent Volume Claim resource

  1. Create the resource definition. Create a YAML file that contains a complete PersistentVolumeClaim definition. We will use Kustomize to add the PVC to the generated manifest. By convention, the file is created in the site-config sub-directory of the project directory.

    tee ~/project/deploy/${current_namespace}/site-config/gelcontent_pvc.yaml > /dev/null <<EOF
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: gelcontent-data
    spec:
      accessModes:
        - ReadWriteMany
      resources:
        requests:
          storage: 2Gi
    EOF
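
    As an optional sanity check that the YAML you just wrote is well formed, you can ask kubectl to parse it without creating anything; the actual apply happens later via the sas-orchestration deploy command.

    kubectl apply --dry-run=client -f ~/project/deploy/${current_namespace}/site-config/gelcontent_pvc.yaml
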
  2. Reference the resource definition in kustomization.yaml. Modify ~/project/deploy/gelcorp/kustomization.yaml to add a reference to the file site-config/gelcontent_pvc.yaml. The file contains a complete Kubernetes resource, so the reference is added to the resources section. In class we use yq to automate the update of kustomization.yaml.

    Run this command to update your kustomization.yaml file using the yq tool:

    [[ $(grep -c "site-config/gelcontent_pvc.yaml" ~/project/deploy/${current_namespace}/kustomization.yaml) == 0 ]] && \
    yq4 eval -i '.resources += ["site-config/gelcontent_pvc.yaml"]' ~/project/deploy/${current_namespace}/kustomization.yaml

    Alternatively, you could manually edit the resources section to add the line - site-config/gelcontent_pvc.yaml.

    [...]
    resources:
    [... previous resource items ...]
    - site-config/gelcontent_pvc.yaml
    [...]
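
    To confirm that the reference landed in the resources section, you can print that section with the same yq4 tool (an optional check):

    yq4 eval '.resources' ~/project/deploy/${current_namespace}/kustomization.yaml | grep gelcontent_pvc
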
  3. At this point we could perform the build and apply, but let’s make another change before we do that.

Use Kustomize to create mount of PVC to CAS Deployment

Now we will use a JSON patch to patch the CAS Deployment Kubernetes resource. The patch will:

  • be added to the patches section of kustomization.yaml
  • target all resources of type CASDeployment
  • add a volume that references the claimName at /spec/controllerTemplate/spec/volumes/-
  • add the volume mount path at /spec/controllerTemplate/spec/containers/0/volumeMounts/-
  1. Create a JSON patch file that updates the CASDeployment.

    tee ~/project/deploy/${current_namespace}/site-config/cas-gelcontent-mount-pvc.yaml > /dev/null << EOF
    - op: add
      path:  /spec/controllerTemplate/spec/volumes/-
      value:
        name: sas-viya-gelcontent-pvc-volume
        persistentVolumeClaim:
            claimName: gelcontent-data
    - op: add
      path: /spec/controllerTemplate/spec/containers/0/volumeMounts/-
      value:
        name: sas-viya-gelcontent-pvc-volume
        mountPath: /mnt/gelcontent
    EOF
  2. Reference the patch in kustomization.yaml. Modify ~/project/deploy/gelcorp/kustomization.yaml to reference the patch.

    Run this command to update kustomization.yaml using the yq tool:

    [[ $(grep -c "site-config/cas-gelcontent-mount-pvc.yaml" ~/project/deploy/${current_namespace}/kustomization.yaml) == 0 ]] && \
    yq4 eval -i '.patches += {
        "path": "site-config/cas-gelcontent-mount-pvc.yaml",
        "target":
        {"group": "viya.sas.com",
            "kind": "CASDeployment",
            "name": ".*",
            "version": "v1alpha1"}
        }' ~/project/deploy/${current_namespace}/kustomization.yaml

    Alternatively, you can manually edit the patches section to add the lines below that patch the CAS deployment.

    [...]
    patches:
    [... previous patches items ...]
    - path: site-config/cas-gelcontent-mount-pvc.yaml
      target:
        group: viya.sas.com
        kind: CASDeployment
        # The following name specification will target all CAS servers. To target specific
        # CAS servers, comment out the following line then uncomment and edit one of the lines
        # targeting specific CAS servers.
        name: .*
        # Uncomment this to apply to one particular named CAS server:
        #name: {{ NAME-OF-SERVER }}
        # Uncomment this to apply to the default CAS server:
        #labelSelector: "sas.com/cas-server-default"
        version: v1alpha1
    [...]
  3. What have we done so far? Two files (cas-gelcontent-mount-pvc.yaml and gelcontent_pvc.yaml) were created in the site-config sub-directory of the project, and the kustomization.yaml was updated to reference those files. Run the following command to view the changes made to kustomization.yaml. The changes are in green in the right column.

    icdiff /tmp/${current_namespace}/kustomization_02-021.yaml ~/project/deploy/${current_namespace}/kustomization.yaml

Build and Apply the Manifests

The sas-orchestration deploy command uses the sas-orchestration Docker container to build and apply the Kubernetes manifests. The deploy command performs the following steps within the container:

  • pulls the deployment assets
  • builds the manifests (with kustomize)
  • applies the manifests with kubectl apply commands and runs any necessary life-cycle operations
  1. Review the .gelcorp_vars file in which we store the parameters needed by the sas-orchestration deploy command.

    cat ~/project/deploy/.${current_namespace}_vars

    The output will show what cadence and release we are using.

    #deployment parameters
    _viyaMirrorReg=crcache-race-sas-cary.unx.sas.com
    _order=9CV11D
    _cadenceName=stable
    _cadenceVersion=2023.04
    _cadenceRelease=latest
  2. Run the sas-orchestration deploy command. Follow the output in the terminal; this command will take a few minutes to complete.

    cd ~/project/deploy
    rm -rf /tmp/${current_namespace}/deploy_work/*
    source ~/project/deploy/.${current_namespace}_vars
    
    docker run --rm \
        -v ${PWD}/license:/license \
        -v ${PWD}/${current_namespace}:/${current_namespace} \
        -v ${HOME}/.kube/config_portable:/kube/config \
        -v /tmp/${current_namespace}/deploy_work:/work \
        -e KUBECONFIG=/kube/config \
        --user $(id -u):$(id -g) \
    sas-orchestration \
    deploy \
        --namespace ${current_namespace} \
        --deployment-data /license/SASViyaV4_${_order}_certs.zip \
        --license /license/SASViyaV4_${_order}_license.jwt \
        --user-content /${current_namespace} \
        --cadence-name ${_cadenceName} \
        --cadence-version ${_cadenceVersion} \
        --image-registry ${_viyaMirrorReg}

    When the deploy command completes successfully, the final message should say The deploy command completed successfully, as shown in the log snippet below.

    The deploy command started
    Generating deployment artifacts
    Generating deployment artifacts complete
    Generating kustomizations
    Generating kustomizations complete
    Generating manifests
    Applying manifests
    > start_leading gelcorp
    
    [...more...]
    
    > kubectl delete --namespace gelcorp --wait --timeout 7200s --ignore-not-found configmap sas-deploy-lifecycle-operation-variables
    configmap "sas-deploy-lifecycle-operation-variables" deleted
    
    > stop_leading gelcorp
    
    Applying manifests complete
    The deploy command completed successfully
  3. If the sas-orchestration deploy command fails, check out the steps in 99_Additional_Topics/03_Troubleshoot_SAS_Orchestration_Deploy to help you troubleshoot the problem.

Test the changes were made in the cluster

  1. It may take some time for the PVC to be bound. Run the following command to wait for that to happen.

    while [[ $(kubectl get pvc gelcontent-data -o 'jsonpath={..status.phase}') != "Bound" ]];
    do
        echo "waiting for PVC status" && sleep 1;
    done
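
    On recent kubectl releases (1.23 and later) you can accomplish the same wait with a JSONPath condition; this is a sketch of the equivalent command.

    kubectl wait pvc/gelcontent-data --for=jsonpath='{.status.phase}'=Bound --timeout=300s
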
  2. Confirm that the PVC was created in the namespace.

    kubectl get pvc gelcontent-data

    Expected output:

    NAME              STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
    gelcontent-data   Bound    pvc-8dccb0ed-cfdc-4a26-afd0-76aad37bb7ff   2Gi        RWX            nfs-client     13m

  3. Delete the CAS pod so the PVC change can be picked up when CAS restarts.

    kubectl delete pod --selector='app.kubernetes.io/managed-by=sas-cas-operator'
  4. Confirm that the PVC is mounted into the pods at the location /mnt/gelcontent.

    kubectl describe pod -l casoperator.sas.com/node-type=controller | grep -A 3  sas-viya-gelcontent-pvc-volume

    You should see in the output:

        /mnt/gelcontent from sas-viya-gelcontent-pvc-volume (rw)
        /opt/sas/viya/config/etc/SASSecurityCertificateFramework/cacerts from security (rw,path="cacerts")
        /opt/sas/viya/config/etc/SASSecurityCertificateFramework/private from security (rw,path="private")
        /opt/sas/viya/home/commonfiles from commonfilesvols (ro)
    --
    sas-viya-gelcontent-pvc-volume:
        Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
        ClaimName:  gelcontent-data
        ReadOnly:   false

  5. To see the change made to the Viya manifest file run the following command.

    icdiff  /tmp/${current_namespace}/manifest_02-021.yaml /tmp/${current_namespace}/deploy_work/deploy/manifest.yaml

    NOTE: When using the sas-orchestration deploy command, the manifest is built inside the Docker container in a work directory. In order to access the manifest, we have mounted that work directory to a path on the local file system.

Review

In the practice exercise you:

  • Created two new overlays (a PVC definition and a JSON patch that updates the CAS deployment to use the PVC)
  • Edited the kustomization.yaml file to reference the two new overlays
  • Used the docker run command to run the sas-orchestration deploy command
  • Let the deploy command build and apply the manifests to make the changes in the Viya environment

Lesson 03

SAS Viya Administration Operations
Lesson 03, Section 0 Exercise: Configure Identities

In this hands-on you will manage how SAS Viya accesses POSIX attributes.

Review Current Identities Configuration

  1. Set up the environment.

    gel_setCurrentNamespace gelcorp
    export SAS_CLI_PROFILE=${current_namespace}
    export SSL_CERT_FILE=~/.certs/${current_namespace}_trustedcerts.pem
    export REQUESTS_CA_BUNDLE=${SSL_CERT_FILE}
    /opt/pyviyatools/loginviauthinfo.py
  2. In the class environment the identity provider can return POSIX attributes. However, the default settings for Viya do not return the user ID (UID) POSIX attribute. In this step we view the current settings and see that identifier.generateUids is true, meaning that the identities service will generate UIDs using a hashing algorithm.

    sas-viya configuration configurations show --id $(sas-viya configuration configurations list --definition-name sas.identities | jq -r '.items[0]["id"]') | grep identifier

    Expected output:

    identifier.disableGids : false
    identifier.disallowedUids : 1001
    identifier.generateGids : false
    identifier.generateUids : true

  3. As a result of these settings, the UID and GID are generated and cannot be overridden. You can check the current POSIX attributes returned to Viya using the show-user command of the identities plugin of the sas-viya CLI. The --show-advanced option shows the advanced attributes, including UID and GID. The UID and GID numbers below have been generated by the identities service.

    sas-viya --output text identities show-user --id Ahmed --show-advanced

    Expected output:

    Id                  Ahmed
    Name                Ahmed
    Title               Platform Administrator
    EmailAddresses      [map[value:Ahmed@gelenable.sas.com]]
    PhoneNumbers
    Addresses
    State               active
    ProviderId          ldap
    CreationTimeStamp   2021-11-02T15:15:25.000Z
    ModifiedTimeStamp   2022-11-21T15:05:37.000Z
    Uid                 7.3671692e+08
    Gid                 7.3671692e+08
    SecondaryGids       [2003 3000 3006 3007]

    NOTE: If the UID is generated for a user, their primary GID is always set to the same value. Because generateGids is false, the secondary GIDs are still loaded from LDAP.

  4. You can also use the pyviyatools script getposixidentity.py to view the UID and GID (and secondary GIDs).

    /opt/pyviyatools/getposixidentity.py -u Ahmed -o simplejson

    Expected output: log { "username": "Ahmed", "version": 1, "gid": 736716920, "secondaryGids": [ 2003, 3006, 3007 ], "id": 736716920, "uid": 736716920 }

  5. getposixidentity.py can return the UID, GID, and secondary GIDs for all users. NOTE: this can take a minute or two to complete.

    /opt/pyviyatools/getposixidentity.py -o csv

    Expected output:

    id ,uid ,gid ,secgid ,name
    "sasldap","520876611","520876611","[1003]","SAS LDAP Service Account"
    "sas","1999796463","1999796463","[2002, 2003, 1001, 3001, 1002, 3003, 3004]","SAS System Account"
    "cas","1847089512","1847089512","[1001, 3001, 1002, 3003, 3004]","CAS System Account"
    "sasadm","1681596511","1681596511","[2002, 2003, 3001, 3003, 3004, 3006, 3007]","SAS Administrator"
    "sastest1","1396998156","1396998156","[2003]","SAS Test User 1"
    "sastest2","1754724830","1754724830","[2003]","SAS Test User 2"
    "geladm","1794942382","1794942382","[2002, 2003, 3001, 3003, 3004, 3006, 3007]","geladm"
    "Douglas","1721890207","1721890207","[2003, 3003, 3007]","Douglas"
    "Delilah","408403145","408403145","[2003, 3001, 3007]","Delilah"
    "Alex","1571476065","1571476065","[2003, 3005]","Alex"
    "Amanda","1917500395","1917500395","[2003, 3006, 3007]","Amanda"
    "Ahmed","736716920","736716920","[2003, 3006, 3007]","Ahmed"
    "Fay","217599329","217599329","[2003, 3004]","Fay"
    "Fernanda","2063022101","2063022101","[2003, 3004]","Fernanda"
    "Fiona","820179103","820179103","[2003, 3004]","Fiona"
    "Frank","451444854","451444854","[2003, 3004]","Frank"
    "Fred","340329804","340329804","[2003, 3004]","Fred"
    "Hamish","127901971","127901971","[2003, 3001]","Hamish"
    "Hazel","1123162857","1123162857","[2003, 3001]","Hazel"
    "Heather","1263590372","1263590372","[2003, 3001]","Heather"
    "Helena","1414144599","1414144599","[2003, 3001, 3002]","Helena"
    "Henrik","274336581","274336581","[2003, 3001]","Henrik"
    "Hugh","176265822","176265822","[2003, 3001]","Hugh"
    "Santiago","792606553","792606553","[2003, 3003]","Santiago"
    "Sarah","1664936282","1664936282","[2003, 3003]","Sarah"
    "Sasha","243405028","243405028","[2003, 3003]","Sasha"
    "Sean","1900111114","1900111114","[2003, 3003]","Sean"
    "Sebastian","1640393434","1640393434","[2003, 3003]","Sebastian"
    "Shannon","1234018889","1234018889","[2003, 3003]","Shannon"
    "Sheldon","1621874476","1621874476","[2003, 3003]","Sheldon"
    "Sophia","1098749434","1098749434","[2003, 3002, 3003]","Sophia"
    "hrservice","1170696439","1170696439","[2003]","hrservice"
    "salesservice","1688566405","1688566405","[2003]","salesservice"
    "financeservice","1148835472","1148835472","[2003]","financeservice"

Update Identities Configuration to return UID from LDAP

  1. In our environment we want to return the POSIX attributes from our LDAP identity provider so that our SAS compute engines can access content secured to users and groups on shared storage.

  2. On sasnode01, where the LDAP server is deployed, we can see the POSIX attributes for Ahmed.

    id Ahmed

    Expected output: uid=4005(Ahmed) gid=2003(sasusers) groups=2003(sasusers),3006(GELCorpSystemAdmins),3007(powerusers)

  3. To return the POSIX attributes from LDAP, we will set the identifier.generateUids property to false and then refresh the identities cache. We will use the configuration CLI to achieve this; these steps could also be performed in SAS Environment Manager.

    MEDIATYPE=$(sas-viya configuration configurations download -d sas.identities | jq -r '.items[]["metadata"]["mediaType"] ' )
    
        echo ${MEDIATYPE}
    
    tee /tmp/update_identities.json > /dev/null << EOF
        {
        "name": "identities configurations",
        "items": [
            {
                "metadata": {
                    "isDefault": false,
                    "mediaType": "${MEDIATYPE}"
                },
                "identifier.generateUids": false
    
            }
        ]
        }
    EOF
    
    sas-viya configuration configurations update --file /tmp/update_identities.json
    sas-viya --output text identities refresh-cache
    sleep 20

    Expected output:

    The cache is refreshing.
    state      refreshing

  4. Now that the UID is not generated, let’s check what is returned for Ahmed. Notice that we are now getting the attributes from the LDAP identity provider.

    Tip: If Ahmed still has the same large values for Uid and Gid that he had before you updated the configuration, it is likely the cache has not refreshed yet. Wait a short time (no more than 30 seconds), and try again.

    sas-viya --output text identities show-user --id Ahmed --show-advanced

    Expected output:

    Id                  Ahmed
    Name                Ahmed
    Title               Platform Administrator
    EmailAddresses      [map[value:Ahmed@gelcorp.com]]
    PhoneNumbers
    Addresses           [map[country: locality:Cary postalCode: region:]]
    State               active
    ProviderId          ldap
    CreationTimeStamp   2023-05-04T08:29:03.000Z
    ModifiedTimeStamp   2023-05-04T08:29:03.000Z
    Uid                 4005
    Gid                 2003

  5. In this step, review the UID, GID, and secondary GIDs for all users. NOTE: this can take a minute or two to complete.

    /opt/pyviyatools/getposixidentity.py -o csv

    Expected output:

    id ,uid ,gid ,secgid ,name
    "sasldap","1003","1003","['']","SAS LDAP Service Account"
    "sas","1001","1001","[2002, 2003, 3001, 1002, 3003, 3004]","SAS System Account"
    "cas","1002","1001","[3001, 1002, 3003, 3004]","CAS System Account"
    "sasadm","2002","2002","[2003, 3001, 3003, 3004, 3006, 3007]","SAS Administrator"
    "sastest1","2003","2003","['']","SAS Test User 1"
    "sastest2","2004","2003","['']","SAS Test User 2"
    "geladm","4000","2002","[2003, 3001, 3003, 3004, 3006, 3007]","geladm"
    "Douglas","4001","2003","[3003, 3007]","Douglas"
    "Delilah","4002","2003","[3001, 3007]","Delilah"
    "Alex","4003","2003","[3005]","Alex"
    "Amanda","4004","2003","[3006, 3007]","Amanda"
    "Ahmed","4005","2003","[3006, 3007]","Ahmed"
    "Fay","4006","2003","[3004]","Fay"
    "Fernanda","4007","2003","[3004]","Fernanda"
    "Fiona","4008","2003","[3004]","Fiona"
    "Frank","4009","2003","[3004]","Frank"
    "Fred","4010","2003","[3004]","Fred"
    "Hamish","4011","2003","[3001]","Hamish"
    "Hazel","4012","2003","[3001]","Hazel"
    "Heather","4013","2003","[3001]","Heather"
    "Helena","4014","2003","[3001, 3002]","Helena"
    "Henrik","4015","2003","[3001]","Henrik"
    "Hugh","4016","2003","[3001]","Hugh"
    "Santiago","4017","2003","[3003]","Santiago"
    "Sarah","4018","2003","[3003]","Sarah"
    "Sasha","4019","2003","[3003]","Sasha"
    "Sean","4020","2003","[3003]","Sean"
    "Sebastian","4021","2003","[3003]","Sebastian"
    "Shannon","4022","2003","[3003]","Shannon"
    "Sheldon","4023","2003","[3003]","Sheldon"
    "Sophia","4024","2003","[3002, 3003]","Sophia"
    "hrservice","3001","2003","['']","hrservice"
    "salesservice","3002","2003","['']","salesservice"
    "financeservice","3003","2003","['']","financeservice"

Review

The POSIX attributes are now returned from the LDAP identity provider. This will facilitate securing and accessing files on the shared NFS server that uses the same LDAP identity provider.


SAS Viya Administration Operations
Lesson 03, Section 1 Exercise: Configure Persistent Storage

The Viya environment has an NFS server running on sasnode01. We can use this NFS server to mount directories and files from the host into pods in the Kubernetes cluster. This is useful for accessing data or code from a permanent location. The files and folders on the NFS server are secured, so we will also make sure that the CAS and SAS Programming Run-Time servers work with the secured NFS mount.
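
If you are curious which directories the NFS server exports, and assuming the showmount utility is installed on sasnode01, you could list the exports directly:

    showmount -e localhost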

In this hands-on you will mount a drive from an NFS file server into the CAS and Programming Run-time pods.

Set the namespace and authenticate

gel_setCurrentNamespace gelcorp
/opt/pyviyatools/loginviauthinfo.py

Use an NFS volume to make data available to the CAS deployment

Mount NFS Share to CAS Deployment

  1. Create an overlay for CAS to add the volume from the NFS server and the volume mount point inside the CAS container. The overlay targets the single CASDeployment that is available in the namespace.

    • volumeMounts : mountPath is the location inside the container
    • volumes : path is the location outside the container
    cd ~/project/deploy/
    
    _deploymentNodeFQDN=$(hostname -f)
    
    tee ~/project/deploy/${current_namespace}/site-config/cas-add-nfs-mount.yaml > /dev/null << EOF
    # cas-add-nfs-mount.yaml
    # Add additional mount
    apiVersion: builtin
    kind: PatchTransformer
    metadata:
      name: cas-add-mount
    patch: |-
      - op: add
        path: /spec/controllerTemplate/spec/volumes/-
        value:
          name: sas-viya-gelcorp-volume
          nfs:
            path: /shared/gelcontent
            server: ${_deploymentNodeFQDN}
      - op: add
        path: /spec/controllerTemplate/spec/containers/0/volumeMounts/-
        value:
          name: sas-viya-gelcorp-volume
          mountPath: /gelcontent
    target:
      group: viya.sas.com
      kind: CASDeployment
      # The following name specification will target all CAS servers. To target specific
      # CAS servers, comment out the following line then uncomment and edit one of the lines
      # targeting specific CAS servers.
      name: .*
      # Uncomment this to apply to one particular named CAS server:
      #name: {{ NAME-OF-SERVER }}
      # Uncomment this to apply to the default CAS server:
      #labelSelector: "sas.com/cas-server-default"
      version: v1alpha1
    EOF
  2. Modify ~/project/deploy/${current_namespace}/kustomization.yaml to reference the cas server overlay.

    In the transformers section add the line - site-config/cas-add-nfs-mount.yaml. Run this command to update kustomization.yaml using the yq tool:

    [[ $(grep -c "site-config/cas-add-nfs-mount.yaml" ~/project/deploy/${current_namespace}/kustomization.yaml) == 0 ]] && \
    yq4 eval -i '.transformers += ["site-config/cas-add-nfs-mount.yaml"]' ~/project/deploy/${current_namespace}/kustomization.yaml

    Alternatively, you can manually edit the transformers section to add the lines below

    [...]
    transformers:
        [... previous transformers items ...]
        - site-config/cas-add-nfs-mount.yaml
    [...]

Add path to CAS allowlist

  1. By default, the paths that CAS can access are restricted to /cas/data/caslibs. In this step we will add the NFS-mounted path so that users can create caslibs that use that path. This could also be done as a CAS super-user in SAS Environment Manager.

    tee ~/project/deploy/${current_namespace}/site-config/cas-add-allowlist-paths.yaml > /dev/null << EOF
    ---
    apiVersion: builtin
    kind: PatchTransformer
    metadata:
      name: cas-add-allowlist-paths
    patch: |-
      - op: add
        path: /spec/appendCASAllowlistPaths
        value:
          - /cas/data/caslibs
          - /gelcontent
          - /mnt/gelcontent/
    target:
      group: viya.sas.com
      kind: CASDeployment
      # The following name specification will target all CAS servers. To target specific
      # CAS servers, comment out the following line then uncomment and edit one of the lines
      # targeting specific CAS servers.
      name: .*
      # Uncomment this to apply to one particular named CAS server:
      #name: {{ NAME-OF-SERVER }}
      # Uncomment this to apply to the default CAS server:
      #labelSelector: "sas.com/cas-server-default"
      version: v1alpha1
    EOF
  2. Modify ~/project/deploy/${current_namespace}/kustomization.yaml to reference the cas allowlist overlay. In the transformers section add the line - site-config/cas-add-allowlist-paths.yaml

    Run this command to update kustomization.yaml using the yq tool:

    [[ $(grep -c "site-config/cas-add-allowlist-paths.yaml" ~/project/deploy/${current_namespace}/kustomization.yaml) == 0 ]] && \
    yq4 eval -i '.transformers += ["site-config/cas-add-allowlist-paths.yaml"]' ~/project/deploy/${current_namespace}/kustomization.yaml

    Alternatively, you can manually edit the transformers section to add the lines below

    [...]
    transformers:
        [... previous transformers items ...]
        - site-config/cas-add-allowlist-paths.yaml
    [...]
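
Once the change has been built and applied later in this exercise, you could confirm that the allowlist paths reached the CAS deployment with a command like the following (a sketch; CASDeployment and appendCASAllowlistPaths are the resource and field targeted above):

    kubectl get casdeployments -o yaml | grep -A 3 appendCASAllowlistPaths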

Use an NFS volume to make data available to the Programming Run-Time Servers

Mount NFS share to all Programming Run-Time

In Viya, programming run-time sessions are started by the launcher. The Launcher service looks for a Kubernetes PodTemplate that contains information that is used to construct a Kubernetes job request. The PodTemplate information is used to generate the container that is launched as a pod. The container in the pod performs the SAS processing.
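
You can list the launcher PodTemplates in the namespace by using the same label that the patch below targets:

    kubectl get podtemplates -l sas.com/template-intent=sas-launcher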

  1. Create a new .yaml file for the changes that need to be applied to the Kubernetes manifest to add the volume and volume mount. Save the file in the ${current_namespace} project.

    _deploymentNodeFQDN=$(hostname -f)
    
    tee ~/project/deploy/${current_namespace}/site-config/compute-server-add-nfs-mount.yaml > /dev/null << EOF
    
    ---
    apiVersion: builtin
    kind: PatchTransformer
    metadata:
      name: compute-server-add-nfs-mount
    patch: |-
      - op: add
        path: /template/spec/volumes/-
        value:
          name: sas-viya-gelcorp-volume
          nfs:
            path: /shared/gelcontent
            server: ${_deploymentNodeFQDN}
      - op: add
        path: /template/spec/containers/0/volumeMounts/-
        value:
          name: sas-viya-gelcorp-volume
          mountPath: /gelcontent
    target:
      kind: PodTemplate
      version: v1
      labelSelector: sas.com/template-intent=sas-launcher
    EOF
  2. Modify the ~/project/deploy/${current_namespace}/kustomization.yaml to reference the compute server overlay. Run this command to update kustomization.yaml using the yq tool:

    [[ $(grep -c "site-config/compute-server-add-nfs-mount.yaml" ~/project/deploy/${current_namespace}/kustomization.yaml) == 0 ]] && \
    yq4 eval -i '.transformers += ["site-config/compute-server-add-nfs-mount.yaml"]' ~/project/deploy/${current_namespace}/kustomization.yaml

    Alternatively, you can manually edit the transformers section to add the lines below

    [...]
    transformers:
        [... previous transformers items ...]
        - site-config/compute-server-add-nfs-mount.yaml
    [...]

    Update the Allowlist for SAS Programming Run-Time

  3. Starting with Stable 2020.1.4, lockdown is enabled by default on programming run-time servers. Update the allowlist for the compute server.

    tee /tmp/compute-autoexec.json > /dev/null << EOF
    {
    "items": [
        {
            "version": 1,
            "metadata": {
                "isDefault": false,
                "services": [
                    "compute"
                ],
                "mediaType": "application/vnd.sas.configuration.config.sas.compute.server+json;version=1"
            },
            "name": "autoexec_code",
            "contents": "/*Allow List*/  \n lockdown path='/gelcontent'; \n lockdown path='/mnt/gelcontent'; \n "
        }
    ],
    "version": 2
    }
    EOF
    
    gel_sas_viya configuration configurations update --file /tmp/compute-autoexec.json
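
    To check that the autoexec configuration instance was stored, you can list the configuration instances for the compute server definition. The sas.compute.server definition name is an assumption derived from the mediaType used above.

    gel_sas_viya configuration configurations list --definition-name sas.compute.server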

Build and Apply with sas-orchestration deploy

  1. Keep a copy of the current manifest file. We will use this copy to track the changes your kustomization processing makes to this file.

    cp -p /tmp/${current_namespace}/deploy_work/deploy/manifest.yaml /tmp/${current_namespace}/manifest_03-021.yaml
  2. Run the sas-orchestration deploy command.

    cd ~/project/deploy
    rm -rf /tmp/${current_namespace}/deploy_work/*
    source ~/project/deploy/.${current_namespace}_vars
    
    docker run --rm \
                -v ${PWD}/license:/license \
                -v ${PWD}/${current_namespace}:/${current_namespace} \
                -v ${HOME}/.kube/config_portable:/kube/config \
                -v /tmp/${current_namespace}/deploy_work:/work \
                -e KUBECONFIG=/kube/config \
                --user $(id -u):$(id -g) \
            sas-orchestration \
                deploy \
                --namespace ${current_namespace} \
                --deployment-data /license/SASViyaV4_${_order}_certs.zip \
                --license /license/SASViyaV4_${_order}_license.jwt \
                --user-content /${current_namespace} \
                --cadence-name ${_cadenceName} \
                --cadence-version ${_cadenceVersion} \
                --image-registry ${_viyaMirrorReg}

    When the deploy command completes successfully, the final message should say The deploy command completed successfully, as shown in the log snippet below.

    The deploy command started
    Generating deployment artifacts
    Generating deployment artifacts complete
    Generating kustomizations
    Generating kustomizations complete
    Generating manifests
    Applying manifests
    > start_leading gelcorp
    
    [...more...]
    
    > kubectl delete --namespace gelcorp --wait --timeout 7200s --ignore-not-found configmap sas-deploy-lifecycle-operation-variables
    configmap "sas-deploy-lifecycle-operation-variables" deleted
    
    > stop_leading gelcorp
    
    Applying manifests complete
    The deploy command completed successfully
  3. If the sas-orchestration deploy command fails, check out the steps in 99_Additional_Topics/03_Troubleshoot_SAS_Orchestration_Deploy to help you troubleshoot the problem.

  4. Run the following command to view the changes in the manifest. The changes are in green in the right column.

    icdiff  /tmp/${current_namespace}/manifest_03-021.yaml /tmp/${current_namespace}/deploy_work/deploy/manifest.yaml

Validation

In this section we will validate that the changes were successfully implemented in the Viya deployment.

Validate that the NFS directories were mounted into the CAS pods

  1. To pick up the CAS-related changes, we need to restart the CAS server. Delete the existing CAS pods, and the CAS operator will automatically start a new instance. In this case we use a selector to target all pods managed by the CAS operator.

    kubectl delete pod --selector='app.kubernetes.io/managed-by=sas-cas-operator'
    sleep 20
    kubectl wait pods -l casoperator.sas.com/node-type=controller --for condition=ready --timeout 360s
  2. Check that /shared/gelcontent is mounted into the pods at the location /gelcontent.

    kubectl describe pod -l casoperator.sas.com/node-type=controller | grep -A 3 sas-viya-gelcorp-volume

    You should see in the output:

        /gelcontent from sas-viya-gelcorp-volume (rw)
        /opt/sas/viya/home/share/refdata/qkb from sas-quality-knowledge-base-volume (rw)
        /rdutil from sas-rdutil-dir (rw)
        /sasviyabackup from backup (rw)
    --
    sas-viya-gelcorp-volume:
        Type:      NFS (an NFS mount that lasts the lifetime of a pod)
        Server:    pdcesx21071.race.sas.com
        Path:      /shared/gelcontent
  3. Exec into the CAS controller pod and check that files on the NFS share can be accessed.

    kubectl exec -it  $(kubectl get pod -l casoperator.sas.com/node-type=controller  --output=jsonpath={.items..metadata.name}) -c sas-cas-server -- ls -li /gelcontent/gelcorp

    You should see in the output:

    total 0
    461525615 drwxrws--- 8 sas 3004 86 Jan 10  2020 finance
    432176384 drwxrws--- 8 sas 3001 86 Mar 22  2020 hr
    490905538 drwxrwxr-x 2 sas 2003 58 Mar 22  2020 inventory
    381853994 drwxrws--- 8 sas 3003 86 Mar 22  2020 sales
    411319639 drwxrwsrwx 6 sas 2003 57 May 27  2021 shared
  4. Let’s see which user is running the CAS container and if we can read the data in the sales area.

    kubectl exec -it $(kubectl get pod -l casoperator.sas.com/node-type=controller --output=jsonpath={.items..metadata.name}) -c sas-cas-server -- bash -c "id && cat /gelcontent/gelcorp/sales/data/test.csv"

    It looks like we are running as the user sas (UID 1001), and it also appears that user 1001 cannot read the data. We will address this problem in the next hands-on.

    uid=1001(sas) gid=1001(sas) groups=1001(sas)
    cat: /gelcontent/gelcorp/sales/data/test.csv: Permission denied
    command terminated with exit code 1

Validate that the NFS directories were mounted into the programming run-time Pods

  1. Run the command below to generate a link for SAS Studio. Click on the link in the terminal window.

    gellow_urls | grep "SAS Studio"
  2. Log on as Henrik:lnxsas and select New Program. Copy and paste the following code into the SAS Studio editor. The code accesses the data from the NFS mount within the compute pod. Henrik is a member of the HR group, so he should be able to access the data. Save and submit the code and check the result; the libname should be assigned and the data printed.

    NOTE: You may have to wait 10 to 20 seconds for the SAS Studio compute context to initialize. If you get an error, try reselecting the SAS Studio compute context.

    /* HR data mounted from /shared/gelcontent/gelcorp/hr/data */
    libname hrdata "/gelcontent/gelcorp/hr/data";
    proc print data=hrdata.performance_lookup;
    run;

    Extra credit: try to read the data from the Sales area at /gelcontent/gelcorp/sales/data.

  3. Get the pod launched to run Henrik’s SAS Studio SAS session.

    kubectl get pods -l launcher.sas.com/requested-by-client=sas.studio,launcher.sas.com/username=Henrik --field-selector status.phase=Running --sort-by=.metadata.creationTimestamp
  4. View the log of the pod that was launched for user Henrik.

    kubectl logs $(kubectl get pods -l launcher.sas.com/requested-by-client=sas.studio,launcher.sas.com/username=Henrik --field-selector status.phase=Running --sort-by=.metadata.creationTimestamp --output=jsonpath={.items..metadata.name}) | grep HRDATA | gel_log

    You should see the libname being accessed in the log.

    INFO  2021-05-11 10:30:21.733 +0000 [compsrv] - NOTE: Libref HRDATA was successfully assigned as follows:
    INFO  2021-05-11 10:30:21.762 +0000 [compsrv] - NOTE: There were 4 observations read from the data set HRDATA.PERFORMANCE_LOOKUP.
    INFO  2021-05-11 10:30:23.991 +0000 [compsrv] - Request  [00000076] >> GET /compute/sessions/aa210bd4-874c-4839-9615-f50eb9dc1b71-ses0000/data/HRDATA

Review

In this practice exercise you:

  • mounted a directory from an NFS server into the CAS and programming run-time servers
  • updated the allowlists for the servers so that Viya can access the file-system location
  • validated that the mounted directories are available and accessible.

SAS Viya Administration Operations
Lesson 03, Section 2 Exercise: Permissions and Home Directories

In this hands-on you will ensure that permissions are respected and configure SAS Studio to access user home directories that are mounted from an NFS server.

Set the namespace and authenticate

gel_setCurrentNamespace gelcorp
/opt/pyviyatools/loginviauthinfo.py

Update Identities Configuration

  1. View the users’ home directories that are located on the NFS server at /shared/gelcontent/home.

    ls -al /shared/gelcontent/home

    Partial output:

    drwxr-xr-x 36 root           root      4096 Sep 21 17:38 .
    drwxrwxrwx  6 sas            sasusers    61 Sep 21 17:43 ..
    drwx------  3 Ahmed          sasusers    78 Sep 21 17:38 Ahmed
    drwx------  3 Alex           sasusers    78 Sep 21 17:38 Alex
    drwx------  3 Amanda         sasusers    78 Sep 21 17:38 Amanda
    drwx------  3 cas            sas         78 Sep 21 17:38 cas
    drwx------  3 Delilah        sasusers    78 Sep 21 17:38 Delilah
    drwx------  3 Douglas        sasusers    78 Sep 21 17:38 Douglas
    drwx------  3 Fay            sasusers    78 Sep 21 17:38 Fay
    drwx------  3 Fernanda       sasusers    78 Sep 21 17:38 Fernanda
    drwx------  3 financeservice sasusers    78 Sep 21 17:38 financeservice
    drwx------  3 Fiona          sasusers    78 Sep 21 17:38 Fiona
    drwx------  3 Frank          sasusers    78 Sep 21 17:38 Frank
    ...
  2. The attribute identifier.homeDirectoryPrefix must be set on the identities service to the home-directory root location at /shared/gelcontent/home. Once it is set, the software builds each user’s home directory by concatenating identifier.homeDirectoryPrefix with the username (for example, Henrik’s home directory resolves to /shared/gelcontent/home/Henrik) and accessing the NFS server specified in the launcher.sas.com/nfs-server annotation on the compute job context. The configuration updates in this hands-on are performed using the configuration plugin of the sas-viya CLI. These updates could also be completed in the Configuration area of SAS Environment Manager. View the current identities configuration.

    gel_sas_viya configuration configurations show --id $(gel_sas_viya configuration configurations list --definition-name sas.identities | jq -r '.items[0]["id"]')

    Expected output:

    id                   : e26c9e3d-9037-4860-bc56-2c01720d8e37
    metadata.isDefault   : false
    metadata.mediaType   : application/vnd.sas.configuration.config.sas.identities+json;version=5
    metadata.services    : [identities]
    cache.cacheRefreshInterval : 12h
    cache.enabled        : true
    cache.providerPageLimit : 1000
    defaultProvider      : local
    endpoints.secured.groups : false
    endpoints.secured.members : false
    endpoints.secured.memberships : false
    endpoints.secured.users : false
    identifier.disableGids : false
    identifier.disallowedUids : 1001
    identifier.generateGids : false
    identifier.generateUids : false
  3. Retrieve the mediaType (it can change across releases), then update the configuration property config/identities/sas.identities/identifier.homeDirectoryPrefix, setting its value to /shared/gelcontent/home.

    MEDIATYPE=$(/opt/sas/viya/home/bin/sas-viya configuration configurations download -d sas.identities | jq -r '.items[]["metadata"]["mediaType"] ' )
    echo ${MEDIATYPE}
    
    tee /tmp/update_identities.json > /dev/null << EOF
    {
    "items":
    [
        {
            "version": 1,
            "metadata": {
                "isDefault": false,
                "services": [
                    "identities"
                ],
                "mediaType": "${MEDIATYPE}"
            },
            "identifier.homeDirectoryPrefix": "/shared/gelcontent/home",
            "defaultProvider": "local"
        }
    ]
    }
    EOF
    
    gel_sas_viya configuration configurations update --file /tmp/update_identities.json
  4. This configuration change requires a restart of the identities service. Restart identities and wait for the pod to be ready before continuing (this typically takes around 2 minutes).

    kubectl delete pods -l app=sas-identities
    kubectl wait pods -l app=sas-identities --for condition=ready --timeout 180s
  5. View the updated identities configuration

    gel_sas_viya configuration configurations show --id $(gel_sas_viya configuration configurations list --definition-name sas.identities | jq -r '.items[0]["id"]')

    Expected output:

    id                   : e26c9e3d-9037-4860-bc56-2c01720d8e37
    metadata.isDefault   : false
    metadata.mediaType   : application/vnd.sas.configuration.config.sas.identities+json;version=5
    metadata.services    : [identities]
    cache.cacheRefreshInterval : 12h
    cache.enabled        : true
    cache.providerPageLimit : 1000
    defaultProvider      : local
    endpoints.secured.groups : false
    endpoints.secured.members : false
    endpoints.secured.memberships : false
    endpoints.secured.users : false
    identifier.disableGids : false
    identifier.disallowedUids : 1001
    identifier.generateGids : false
    identifier.generateUids : false
    identifier.homeDirectoryPrefix : /shared/gelcontent/home

Update the CAS Configuration

Change the CAS Account to make Secondary Groups available

Currently the CAS pod runs as the service account with UID 1001 and GID 1001. This does not provide the necessary permissions to access key content on the NFS share. In this step we will add supplemental groups for the user who runs CAS so that the CAS server can read the data from the NFS share.

  1. Update the user that CAS runs as.

    cd ~/project/deploy/
    
    tee ~/project/deploy/${current_namespace}/site-config/cas-modify-user.yaml > /dev/null << EOF
    ---
    apiVersion: builtin
    kind: PatchTransformer
    metadata:
      name: cas-modify-user
    patch: |-
      - op: replace
        path: /spec/controllerTemplate/spec/securityContext/supplementalGroups
        value:
          [2003,3000,3001,3002,3003,3004,3005,3006,3007]
    target:
      group: viya.sas.com
      kind: CASDeployment
      # The following name specification will target all CAS servers. To target specific
      # CAS servers, comment out the following line then uncomment and edit one of the lines
      # targeting specific CAS servers.
      name: .*
      # Uncomment this to apply to one particular named CAS server:
      #name: {{ NAME-OF-SERVER }}
      # Uncomment this to apply to the default CAS server:
      #labelSelector: "sas.com/cas-server-default"
      version: v1alpha1
    EOF
  2. Modify ~/project/deploy/${current_namespace}/kustomization.yaml to reference the cas server overlay. In the transformers section add the line - site-config/cas-modify-user.yaml

    Run this command to update kustomization.yaml using the yq tool:

    [[ $(grep -c "site-config/cas-modify-user.yaml" ~/project/deploy/${current_namespace}/kustomization.yaml) == 0 ]] && \
    yq4 eval -i '.transformers += ["site-config/cas-modify-user.yaml"]' ~/project/deploy/${current_namespace}/kustomization.yaml

    Alternatively, you can manually edit the transformers section to add the lines below

    [...]
    transformers:
        [... previous transformers items ...]
        - site-config/cas-modify-user.yaml
    [...]
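
The CAS user change is not applied until the build-and-apply step later in this exercise. Once it has been applied, a quick way to confirm the supplemental groups on the CAS custom resource is to query it directly (a sketch; the CASDeployment in this environment is named default, as used later in this exercise):

    kubectl get casdeployment default -o jsonpath='{.spec.controllerTemplate.spec.securityContext.supplementalGroups}'
    echo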

Add NFS mount for home-directories

  1. Add a mount for the home-directories. The mount point inside the container must match the identifier.homeDirectoryPrefix which is /shared/gelcontent/home.

    • volumeMounts : mountPath is the location inside the container
    • volumes : path is the location outside the container
    cd ~/project/deploy/
    
    _deploymentNodeFQDN=$(hostname -f)
    
    tee ~/project/deploy/${current_namespace}/site-config/cas-add-nfs-homedir-mount.yaml > /dev/null << EOF
    # cas-add-nfs-homedir-mount.yaml
    # Add additional mount
    apiVersion: builtin
    kind: PatchTransformer
    metadata:
      name: cas-add-mount-nfs-homedir
    patch: |-
      - op: add
        path: /spec/controllerTemplate/spec/volumes/-
        value:
          name: sas-viya-gelcorp-homedir
          nfs:
            path: /shared/gelcontent/home
            server: ${_deploymentNodeFQDN}
      - op: add
        path: /spec/controllerTemplate/spec/containers/0/volumeMounts/-
        value:
          name: sas-viya-gelcorp-homedir
          mountPath: /shared/gelcontent/home
    target:
      group: viya.sas.com
      kind: CASDeployment
      # The following name specification will target all CAS servers. To target specific
      # CAS servers, comment out the following line then uncomment and edit one of the lines
      # targeting specific CAS servers.
      name: .*
      # Uncomment this to apply to one particular named CAS server:
      #name: {{ NAME-OF-SERVER }}
      # Uncomment this to apply to the default CAS server:
      #labelSelector: "sas.com/cas-server-default"
      version: v1alpha1
    EOF
  2. Modify ~/project/deploy/${current_namespace}/kustomization.yaml to reference the cas server overlay.

    In the transformers section add the line - site-config/cas-add-nfs-homedir-mount.yaml

    Run this command to update kustomization.yaml using the yq tool:

    [[ $(grep -c "site-config/cas-add-nfs-homedir-mount.yaml" ~/project/deploy/${current_namespace}/kustomization.yaml) == 0 ]] && \
    yq4 eval -i '.transformers += ["site-config/cas-add-nfs-homedir-mount.yaml"]' ~/project/deploy/${current_namespace}/kustomization.yaml

    Alternatively, you can manually edit the transformers section to add the lines below

    [...]
    transformers:
        [... previous transformers items ...]
        - site-config/cas-add-nfs-homedir-mount.yaml
    [...]

Enabling Host Launched CAS Sessions

As an alternative to modifying the account that CAS runs under, or in addition to it, you can use the CASHostAccountRequired custom group. Members of this group run their CAS sessions under their own host account. There is also a CAS environment variable, CASALLHOSTACCOUNTS, which forces all CAS sessions to run as the host account (with the exception of session zero).
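
The exercise below uses the CASHostAccountRequired group. If you instead wanted to force host-launched sessions for everyone with CASALLHOSTACCOUNTS, one possible approach is an additional PatchTransformer that appends the environment variable to the CAS container and is referenced from the transformers section of kustomization.yaml like the other patches. This is only a sketch and is not part of this exercise; the patch path mirrors the volume-mount patches above, and the value shown is a placeholder:

    tee ~/project/deploy/${current_namespace}/site-config/cas-allhostaccounts.yaml > /dev/null << EOF
    ---
    apiVersion: builtin
    kind: PatchTransformer
    metadata:
      name: cas-allhostaccounts
    patch: |-
      # Assumption: the env var is appended to the first CAS container, mirroring the
      # volumeMounts patch used elsewhere in this exercise. The value is a placeholder;
      # check the cas-enable-host README in sas-bases for the exact setting expected.
      - op: add
        path: /spec/controllerTemplate/spec/containers/0/env/-
        value:
          name: CASALLHOSTACCOUNTS
          value: "true"
    target:
      group: viya.sas.com
      kind: CASDeployment
      name: .*
      version: v1alpha1
    EOF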

  1. As an additional security measure, host-launched CAS sessions also require that you include cas-enable-host.yaml in your kustomization.yaml. It must appear before sas-bases/overlays/required/transformers.yaml.

  2. Copy the example file.

    cp -p ~/project/deploy/${current_namespace}/sas-bases/examples/cas/configure/cas-enable-host.yaml ~/project/deploy/${current_namespace}/site-config
  3. Add the overlay for site-config/cas-enable-host.yaml (it must be placed before sas-bases/overlays/required/transformers.yaml).

    Run this command to update your kustomization.yaml file using the sed tool:

    sed -i '/sas-bases\/overlays\/required\/transformers.yaml/i \ \ \- site-config\/cas-enable-host.yaml' ~/project/deploy/${current_namespace}/kustomization.yaml

    Alternatively, you could manually edit the transformers section to add the line - site-config/cas-enable-host.yaml.

    [...]
    transformers:
        [... previous transformers items ...]
        - site-config/cas-enable-host.yaml
        - sas-bases/overlays/required/transformers.yaml
    [...]
  4. Create the CASHostAccountRequired group.

    gel_sas_viya --output text  identities create-group --id CASHostAccountRequired --name "CASHostAccountRequired" --description "Run CAS as users account"

    Expected output:

    Id            CASHostAccountRequired
    Name          CASHostAccountRequired
    Description   Run CAS as users account
    State         active
    The group was created successfully.
  5. Add users to the CASHostAccountRequired group. These users will launch their CAS sessions under their own user identity.

    sas-viya --output text identities add-member --group-id CASHostAccountRequired --user-member-id Henrik
    sas-viya --output text identities add-member --group-id CASHostAccountRequired --user-member-id Douglas
    sas-viya --output text identities add-member --group-id CASHostAccountRequired --user-member-id Delilah

    Expected output:

    Henrik has been added to group CASHostAccountRequired
    Douglas has been added to group CASHostAccountRequired
    Delilah has been added to group CASHostAccountRequired
  6. This process only works with a newly created permstore. Stop CAS so that we can delete the PVC. After the PVC is deleted, a new persistent volume is created when CAS is restarted. (It is best to make this change as part of the initial deployment so that this step is not needed.)

    kubectl delete casdeployment default
    kubectl delete pvc -l 'sas.com/backup-role=provider'

    You should see:

    persistentvolumeclaim "cas-default-data" deleted
    persistentvolumeclaim "cas-default-permstore" deleted

Make User Home-Directories Available

In the first step we updated identities to set the attribute identifier.homeDirectoryPrefix. In this step we will make the home-directories available.

Add an annotation to the launcher pod template to tell it that user home directories are on NFS

  1. In order for the home directories to be accessible inside the launched container, you must specify the NFS server via a pod template annotation.

    Setting launcher.sas.com/nfs-server: NFS_SERVER_LOCATION in the pod template annotation uses the NFS mount when launching containers. If the annotation is not set, the Launcher uses hostPath by default and assumes that the user directories are available locally on the Kubernetes nodes.

    _deploymentNodeFQDN=$(hostname -f)
    echo ${_deploymentNodeFQDN}
    
    tee ~/project/deploy/${current_namespace}/site-config/compute-server-annotate-podtempate.yaml > /dev/null << EOF
    - op: add
      path: "/metadata/annotations/launcher.sas.com~1nfs-server"
      value: ${_deploymentNodeFQDN}
    EOF

    Modify the ~/project/deploy/${current_namespace}/kustomization.yaml to reference the compute server overlay.

    In the patches section, add the lines below to patch the compute server pod template.

    Run this command to update kustomization.yaml using the yq tool:

    [[ $(grep -c "site-config/compute-server-annotate-podtempate.yaml" ~/project/deploy/${current_namespace}/kustomization.yaml) == 0 ]] && \
    yq4 eval -i '.patches += {
        "path": "site-config/compute-server-annotate-podtempate.yaml",
        "target":
            {"name": "sas-compute-job-config",
            "version": "v1",
            "kind": "PodTemplate"}
        }' ~/project/deploy/${current_namespace}/kustomization.yaml

    Alternatively, you can manually edit the patches section to add the lines below

    [...]
    patches:
        [... previous patches items ...]
        - path: site-config/compute-server-annotate-podtempate.yaml
          target:
            name: sas-compute-job-config
            version: v1
            kind: PodTemplate
    [...]
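
The annotation is only applied when the deployment is rebuilt later in this exercise. After that build and apply, a quick way to confirm that it landed on the compute pod template is:

    kubectl get podtemplate sas-compute-job-config -o yaml | grep nfs-server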

Configure SAS Studio

By default, SAS Studio cannot access the file system; the preference is for content such as SAS code to be stored in folders. However, it is possible to configure SAS Studio to access the file system and to make users' home directories available from an NFS mount. In these steps the configuration plugin of the sas-viya CLI is used. This could also be done interactively in SAS Environment Manager.

  1. Update the SAS Studio configuration to set config/SASStudio/sas.studio/:

    • "showServerFiles": true
    • "serverDisplayName": "NFS gelcontent"
    • "fileNavigationRoot": "USER"
  2. View the configuration before the change.

    gel_sas_viya configuration configurations show --id $(gel_sas_viya configuration configurations list --definition-name sas.studio | jq -r '.items[0]["id"]')

    Expected output:

    id                   : 98eeed20-a5f8-4b25-a771-c09ed7c781a9
    metadata.isDefault   : true
    metadata.mediaType   : application/vnd.sas.configuration.config.sas.studio+json;version=21
    metadata.services    : [SASStudio studio]
    abandonedSessionTimeout : 5
    allowCopyPasteData   : true
    allowDownload        : true
    allowExport          : true
    allowGit             : true
    allowGitKerberosAuthentication : false
    allowGitPassword     : ********
    allowGitSSHPassword  : ********
    allowGitSSLCertFilepath : false
    allowPrintData       : true
    allowUpload          : true
    defaultTextEncoding  : UTF-8
    enableAllFeatureFlags : false
    enableAutoCompleteLibraries : true
    enableAutoCompleteTables : true
    fileNavigationRoot   : USER
    flowColumnLimit      : 10000
    longPollingHoldTimeSeconds : 30
    maxGitFileSize       : 3e+07
    maxUploadSize        : 1.048576e+08
    showServerFiles      : false
    validMemName         : EXTEND
    validVarName         : ANY
    
  3. Retrieve the mediaType (it can change across releases) and update the configuration.

    MEDIATYPE=$(/opt/sas/viya/home/bin/sas-viya configuration configurations download -d sas.studio | jq -r '.items[]["metadata"]["mediaType"] ' )
    echo ${MEDIATYPE}
    
    tee /tmp/update_studio.json > /dev/null << EOF
    {
        "name": "configurations",
        "items": [
            {
            "metadata": {
                "isDefault": false,
                "mediaType": "${MEDIATYPE}"
            },
            "serverDisplayName": "NFS gelcontent",
            "showServerFiles": true,
            "fileNavigationRoot": "USER"
            }
        ]
    }
    EOF
    
    gel_sas_viya configuration configurations update --file /tmp/update_studio.json
  4. View the configuration after the change.

    gel_sas_viya configuration configurations show --id $(gel_sas_viya configuration configurations list --definition-name sas.studio | jq -r '.items[0]["id"]')

    Expected output:

    id                   : 98eeed20-a5f8-4b25-a771-c09ed7c781a9
    metadata.isDefault   : false
    metadata.mediaType   : application/vnd.sas.configuration.config.sas.studio+json;version=21
    metadata.services    : [SASStudio studio]
    abandonedSessionTimeout : 5
    allowCopyPasteData   : true
    allowDownload        : true
    allowExport          : true
    allowGit             : true
    allowGitKerberosAuthentication : false
    allowGitPassword     : ********
    allowGitSSHPassword  : ********
    allowGitSSLCertFilepath : false
    allowPrintData       : true
    allowUpload          : true
    defaultTextEncoding  : UTF-8
    enableAllFeatureFlags : false
    enableAutoCompleteLibraries : true
    enableAutoCompleteTables : true
    fileNavigationRoot   : USER
    flowColumnLimit      : 10000
    longPollingHoldTimeSeconds : 30
    maxGitFileSize       : 3e+07
    maxUploadSize        : 1.048576e+08
    serverDisplayName    : NFS gelcontent
    showServerFiles      : true
    validMemName         : EXTEND
    validVarName         : ANY

Build and Apply with sas-orchestration deploy

  1. Keep a copy of the current manifest file. We will use this copy to track the changes your kustomization processing makes to this file.

    cp -p /tmp/${current_namespace}/deploy_work/deploy/manifest.yaml /tmp/${current_namespace}/manifest_03-031.yaml
  2. Run the sas-orchestration deploy command.

    cd ~/project/deploy
    rm -rf /tmp/${current_namespace}/deploy_work/*
    source ~/project/deploy/.${current_namespace}_vars
    
    docker run --rm \
                -v ${PWD}/license:/license \
                -v ${PWD}/${current_namespace}:/${current_namespace} \
                -v ${HOME}/.kube/config_portable:/kube/config \
                -v /tmp/${current_namespace}/deploy_work:/work \
                -e KUBECONFIG=/kube/config \
                --user $(id -u):$(id -g) \
            sas-orchestration \
                deploy \
                --namespace ${current_namespace} \
                --deployment-data /license/SASViyaV4_${_order}_certs.zip \
                --license /license/SASViyaV4_${_order}_license.jwt \
                --user-content /${current_namespace} \
                --cadence-name ${_cadenceName} \
                --cadence-version ${_cadenceVersion} \
                --image-registry ${_viyaMirrorReg}

    When the deploy command completes successfully, the final message should say The deploy command completed successfully, as shown in the log snippet below.

    The deploy command started
    Generating deployment artifacts
    Generating deployment artifacts complete
    Generating kustomizations
    Generating kustomizations complete
    Generating manifests
    Applying manifests
    > start_leading gelcorp
    
    [...more...]
    
    > kubectl delete --namespace gelcorp --wait --timeout 7200s --ignore-not-found configmap sas-deploy-lifecycle-operation-variables
    configmap "sas-deploy-lifecycle-operation-variables" deleted
    
    > stop_leading gelcorp
    
    Applying manifests complete
    The deploy command completed successfully
  3. If the sas-orchestration deploy command fails, check out the steps in 99_Additional_Topics/03_Troubleshoot_SAS_Orchestration_Deploy to help you troubleshoot any problems.

  4. Run the following command to view the changes in the manifest. The changes are in green in the right column.

    icdiff  /tmp/${current_namespace}/manifest_03-031.yaml /tmp/${current_namespace}/deploy_work/deploy/manifest.yaml

Validate

Validate the Change of the CAS User

  1. Check which account the CAS server is running under and see whether the group and secondary group memberships are established so that we can read the data. (You may need to wait a few seconds for CAS to restart.)

    sleep 30
    kubectl wait pods -l "casoperator.sas.com/node-type==controller" --for condition=ready --timeout 620s
    kubectl exec -it \
             $(kubectl get pod -l casoperator.sas.com/node-type=controller --output=jsonpath={.items..metadata.name}) \
            -c sas-cas-server \
            -- bash -c "id && ls -al /gelcontent/gelcorp/sales/data/ && head -n 4 /gelcontent/gelcorp/sales/data/test.csv"

    It looks like we are running as the user sas (UID 1001) and all of our group and secondary group memberships are established. As a result we can read the file because we are a member of the Sales group.

    uid=1001(sas) gid=1001(sas) groups=1001(sas),2003,3000,3001,3002,3003,3004,3005,3006,3007
    total 55468
    drwxrws--- 2 sas  3003       50 Sep  7  2017 .
    drwxrws--- 8 sas  3003       86 Mar 22  2020 ..
    -rwxrwx--- 1 2002 3003 54198272 Sep  7  2017 salesmaster.sas7bdat
    -rwxrwx--- 1 2002 3003  2598077 Feb 14  2018 test.csv
    Store,Dept,Date,IsHoliday
    1,1,2012-11-02,FALSE
    1,1,2012-11-09,FALSE
    1,1,2012-11-16,FALSE

Validate the Change to Host Launched CAS Sessions

  1. Logon to SAS Environment Manager as Henrik and select the Data tab.

    gellow_urls | grep "SAS Environment Manager"
  2. Check whether the CAS session is running as Henrik (UID 4015).

    kubectl exec -it $(kubectl get pod -l casoperator.sas.com/node-type=controller --output=jsonpath={.items..metadata.name}) -c sas-cas-server -- bash -c "ps -ef | grep 4015"

    You should see a CAS session running as user 4015

    4015     12081  2360  0 18:23 ?        00:00:00 /opt/sas/viya/home/SASFoundation/utilities/bin/cas session 141 -role controller -id 0 -keyfile - -controlpid 53873 -port 5570 -cfgpath /cas/config
    sas      12225     0  0 18:25 pts/0    00:00:00 bash -c ps -ef | grep 4015
    sas      12232 12225  0 18:25 pts/0    00:00:00 grep 4015

Validate That the Home Directories Are Available in Compute and CAS

  1. Validate that the user's home directory is available. First, create a SAS program and put it on the NFS server.

    sudo tee /shared/gelcontent/home/Henrik/gel_launcher_details.sas > /dev/null << EOF
    data _null_;
    
    /* list the attributes of this launcher session */
    
    %put NOTE: I am &_CLIENTUSERNAME;
    %put NOTE: My home directory is &_USERHOME;
    %put NOTE: My Launcher POD IS &SYSHOSTNAME;
    run;
    
    /* is my CASUSER directory mounted from NFS */
    
    cas mysess;
    proc cas;
    builtins.userinfo;
    table.caslibinfo / caslib='CASUSER' verbose=true;
    run;
    quit;
    cas mysess terminate;
    
    EOF
    
    sudo chown Henrik:sasusers /shared/gelcontent/home/Henrik/gel_launcher_details.sas
    sudo chmod 700  /shared/gelcontent/home/Henrik/gel_launcher_details.sas
  2. Stay logged on as Henrik and select Develop Code and Flows to switch to SAS Studio.

  3. You will have to allow some time for the compute context to initialize. Select Explorer and note that:

    • the root of the file-system explorer is named “NFS gelcontent” (NOTE: in the latest release the node may still say SAS Server)
    • the user's home directory from the NFS server is available

  4. Open the program at NFS gelcontent > Home > gel_launcher_details.sas and run the code. The program outputs the name of the pod for this SAS session, the user name, and the home directory.

    In the log:

    
        79
        80   data _null_;
        81
        82   /* list the attributes of this launcher session */
        83
        84   %put NOTE: I am &_CLIENTUSERNAME;
        NOTE: I am Henrik
        85   %put NOTE: My home directory is &_USERHOME;
        NOTE: My home directory is /shared/gelcontent/home/Henrik
        86   %put NOTE: My Launcher POD IS &SYSHOSTNAME;
        NOTE: My Launcher POD IS sas-compute-server-bcbe621a-22dd-47c6-b434-70f3601d98c2-4kznd
        87   run;
        NOTE: DATA statement used (Total process time):
            real time           0.00 seconds
            cpu time            0.00 seconds
    
        88
        89   /* is my CASUSER directory mounted from NFS */
        90
        91   cas mysess;
        NOTE: The session MYSESS connected successfully to Cloud Analytic Services sas-cas-server-default-client using port 5570. The UUID
            is 9f201ebb-dd02-ef4d-a9c6-5016cfb2e4fb. The user is Henrik and the active caslib is CASUSER(Henrik).
        NOTE: The SAS option SESSREF was updated with the value MYSESS.
        NOTE: The SAS macro _SESSREF_ was updated with the value MYSESS.
        NOTE: The session is using 0 workers.
        92   proc cas;
        93   builtins.userinfo;
        94   table.caslibinfo / caslib='CASUSER' verbose=true;
        95   run;
        NOTE: Active Session now MYSESS.
        {userInfo={userId=Henrik,providedName=Henrik,uniqueId=Henrik,groups={sasusers,CASHostAccountRequired,HR,2003,3001},providerName=
        OAuth,anonymous=FALSE,hostAccount=TRUE,guest=FALSE}}
        96   quit;
        NOTE: The PROCEDURE CAS printed page 1.
        NOTE: PROCEDURE CAS used (Total process time):
            real time           0.04 seconds
            cpu time            0.05 seconds
    
        97   cas mysess terminate;
        NOTE: Deletion of the session MYSESS was successful.
        NOTE: The default CAS session MYSESS identified by SAS option SESSREF= was terminated. Use the OPTIONS statement to set the
            SESSREF= option to an active session.
        NOTE: Request to TERMINATE completed for session MYSESS.
        98
        99
        100
        101  /* region: Generated postamble */

    In the Result output we can see that Henrik’s CASUSER directory is also on the Shared File-system.

  5. Make a change to the program and Save As to NFS gelcontent > Files > /gelcorp/home/Henrik/gel_launcher_details_v2.sas

  6. Test that the program is persisted on the NFS server and available in Henrik’s home directory outside the pod. You should see the SAS program that you saved.

    sudo ls -ali /shared/gelcontent/home/Henrik
    total 20
    461775230 drwxr-xr-x 3 Henrik sasusers 145 Sep 24 19:04 .
    204782145 drwxrwxrwx 4 sas    sasusers  33 Sep 24 17:12 ..
    461775231 -rw-r--r-- 1 Henrik sasusers  18 Sep 24 16:27 .bash_logout
    461775244 -rw-r--r-- 1 Henrik sasusers 193 Sep 24 16:27 .bash_profile
    461775245 -rw-r--r-- 1 Henrik sasusers 231 Sep 24 16:27 .bashrc
    461775242 -rw-r--r-- 1 Henrik sasusers 227 Sep 24 18:56 gel_launcher_details.sas
    461775228 -rwxr-xr-x 1 Henrik sasusers 227 Sep 24 19:04 gel_launcher_details_v2.sas
    487108963 drwxr-xr-x 4 Henrik sasusers  39 Sep 24 16:27 .mozilla
  7. Exec into the running launcher pod and see the UID and GID of the user inside the pod and the files.

    id Henrik
    kubectl exec -it $(kubectl get pod -l launcher.sas.com/requested-by-client=sas.studio,launcher.sas.com/username=Henrik --output=jsonpath={.items..metadata.name}) -- bash -c "id && ls -al /gelcontent/home/Henrik"
    uid=4015 gid=2003 groups=2003,3001
    total 24
    drwx------  3 4015 2003  145 May  4 18:31 .
    drwxr-xr-x 35 root root 4096 May  4 08:52 ..
    -rw-------  1 4015 2003   18 May  4 08:52 .bash_logout
    -rw-------  1 4015 2003  193 May  4 08:52 .bash_profile
    -rw-------  1 4015 2003  231 May  4 08:52 .bashrc
    -rwx------  1 4015 2003  367 May  4 18:25 gel_launcher_details.sas
    -rwxr-xr-x  1 4015 2003  379 May  4 18:31 gel_launcher_details_v2.sas
    drwx------  4 4015 2003   39 May  4 08:52 .mozilla

Review

In this practice exercise you:

  • Updated the identities configuration to set identifier.homeDirectoryPrefix to the home-directory root location at /shared/gelcontent/home.
  • Updated the CAS configuration to change the CAS user, mount home-directories, and enable host-launched CAS sessions for a subset of users.
  • Ensured that home-directories are available to the SAS Programming Run-time.
  • Configured SAS Studio to access the file system and the home-directories.
  • Validated all the changes.

SAS Viya Administration Operations
Lesson 03, Section 2 Exercise: Preserve Data Permissions

In our introduction to kustomize we created a PVC that is mounted to CAS. In this section we copy data to that PVC using a Kubernetes job, which lets us ensure that the permissions are preserved. The job is passed its parameters through a ConfigMap.

Set namespace and authenticate

  1. In a MobaXterm session on sasnode01, set the current namespace to the gelcorp deployment.

    gel_setCurrentNamespace gelcorp
    /opt/pyviyatools/loginviauthinfo.py

Copy Data to the PVC

  1. Copy data to the PVC. First, create a ConfigMap with the job parameters.

    cd ~/project/deploy/
    
    # source directory from which all files will be copied
    _mysource=/shared/gelcontent/gelcorp
    
    # target persistent volume claim
    _targetclaim=gelcontent-data
    
    #target directory
    _targetdir=/gelcorp
    
    tee ~/project/deploy/${current_namespace}/site-config/gel-sas-copy-data-configmap.yaml > /dev/null << EOF
    ---
    apiVersion: v1
    data:
        _SOURCEDIR: ${_mysource}
        _TARGETCLAIM: ${_targetclaim}
        _TARGETDIR: ${_targetdir}
    kind: ConfigMap
    metadata:
        annotations: {}
        name: gel-copy-data-parameters
        namespace: ${current_namespace}
    EOF
  2. Create the Job to perform the copy

    tee ~/project/deploy/${current_namespace}/site-config/gel-sas-copy-data.yaml > /dev/null << EOF
    apiVersion: batch/v1
    kind: Job
    metadata:
      name:  gel-sas-copy-data
      labels:
        app.kubernetes.io/name: gel-sas-copy-data
    spec:
        template:
          spec:
            containers:
              - name: copydata
                image: registry.access.redhat.com/ubi7/ubi:latest
                envFrom:
                  - configMapRef:
                      name: gel-copy-data-parameters
                command: ["/bin/sh","-c"]
                args:
                  - echo Starting copy from \$(_SOURCEDIR) to PVC \$(_TARGETCLAIM) and directory \$(_TARGETDIR)  ;
                    mkdir -p /target_location\$(_TARGETDIR);
                    chmod 770 /target_location\$(_TARGETDIR);
                    cp -pr /source_location/* /target_location\$(_TARGETDIR);
                    ls -al /target_location\$(_TARGETDIR);
                    echo Completed;
                volumeMounts:
                  - name: viya-source-location
                    mountPath: /source_location
                  - name: viya-target-location
                    mountPath: /target_location
            securityContext:
              fsGroup: 1001
              runAsGroup: 1001
              runAsUser: 1001
              supplementalGroups:
                - 2003
                - 3000
                - 3001
                - 3002
                - 3003
                - 3004
                - 3005
                - 3006
                - 3007
            volumes:
              - name: viya-source-location
                nfs:
                  server: sasnode01
                  path: "${_mysource}"
              - name: viya-target-location
                persistentVolumeClaim:
                  claimName: ${_targetclaim}
            restartPolicy: Never
    EOF
  3. Run the Job to copy the data to the PVC

    cd ~/project/deploy/
    
    kubectl apply -f ~/project/deploy/${current_namespace}/site-config/gel-sas-copy-data-configmap.yaml
    kubectl apply -f ~/project/deploy/${current_namespace}/site-config/gel-sas-copy-data.yaml
  4. Check that the job completed

    kubectl get job gel-sas-copy-data

    Expected output:

    NAME                COMPLETIONS   DURATION   AGE
    gel-sas-copy-data   0/1           44s        44s

  5. Check if the data has been copied by viewing the log.

    kubectl logs -l job-name=gel-sas-copy-data

    Expected output:

    Starting copy from /shared/gelcontent/gelcorp to PVC gelcontent-data and directory /gelcorp
    total 0
    drwxrwx--- 7 1001 1001 75 May  4 18:47 .
    drwxrwxrwx 3 root root 21 May  4 18:47 ..
    drwxrws--- 8 1001 3004 86 Jan 10  2020 finance
    drwxrws--- 8 1001 3001 86 Mar 22  2020 hr
    drwxrwxr-x 2 1001 2003 58 Mar 22  2020 inventory
    drwxrws--- 8 1001 3003 86 Mar 22  2020 sales
    drwxrwsrwx 6 1001 2003 57 May 27  2021 shared
    Completed

Check that the Data is Available on the PVC

  1. Check that the data is available in the PVC and that we can read it.

    kubectl exec -it $(kubectl get pod -l casoperator.sas.com/node-type=controller --output=jsonpath={.items..metadata.name}) -c sas-cas-server -- sh -c "id && ls -al /mnt/gelcontent/gelcorp/sales/data/ && head -n 4 /mnt/gelcontent/gelcorp/sales/data/test.csv"

    Expected output:

    uid=1001(sas) gid=1001(sas) groups=1001(sas),2003,3000,3001,3002,3003,3004,3005,3006,3007
    total 55468
    drwxrws--- 2 sas 3003       50 Sep  7  2017 .
    drwxrws--- 8 sas 3003       86 Mar 22  2020 ..
    -rwxrwx--- 1 sas 3003 54198272 Sep  7  2017 salesmaster.sas7bdat
    -rwxrwx--- 1 sas 3003  2598077 Feb 14  2018 test.csv
    Store,Dept,Date,IsHoliday
    1,1,2012-11-02,FALSE
    1,1,2012-11-09,FALSE
    1,1,2012-11-16,FALSE

  2. Delete the job

    kubectl delete job gel-sas-copy-data

SAS Viya Administration Operations
Lesson 03, Section 3 Exercise: Load Content with Automation

In this exercise we will pull the SAS-provided sas-viya-cli Docker image and use it in an Apache Airflow flow to initialize a Viya environment.

Set the namespace and authenticate

In a MobaXterm session on sasnode01, set the current namespace to the gelcorp deployment.

gel_setCurrentNamespace gelcorp
source ~/project/deploy/.${current_namespace}_vars
export SAS_CLI_PROFILE=${current_namespace}
export SSL_CERT_FILE=~/.certs/${current_namespace}_trustedcerts.pem
export REQUESTS_CA_BUNDLE=${SSL_CERT_FILE}
/opt/pyviyatools/loginviauthinfo.py

Pull and use the SAS Provided sas-viya cli image

  1. The sas-viya CLI container image is available with a SAS Viya license. To pull the image you need the certificates that are included with your order. In this step we use mirrormgr to return the image name of the sas-viya CLI image in our order.

    source /opt/gellow_work/vars/vars.txt
    climage=$(mirrormgr list remote docker tags --deployment-data /home/cloud-user/project/deploy/license/SASViyaV4_${GELLOW_ORDER}_certs.zip --cadence ${GELLOW_CADENCE_NAME}-${GELLOW_CADENCE_VERSION} | grep sas-viya-cli:latest)
    echo  Order Number is ${GELLOW_ORDER} and latest image is ${climage}

    Expected output:

    Order Number is 9CYNLY and latest image is cr.sas.com/viya-4-x64_oci_linux_2-docker/sas-viya-cli:1.1.0-20240319.1710834807264

  2. Use mirrormgr to retrieve the logon credentials and log on to the Docker registry.

    logincmd=$(mirrormgr list remote docker login --deployment-data /home/cloud-user/project/deploy/license/SASViyaV4_${GELLOW_ORDER}_certs.zip)
    echo $logincmd
    eval $logincmd

    Expected output:

    docker login -u 9CYNLY -p '!|gd^X3Vq0fVJbiL1h9N1JVJ#0mqf986' cr.sas.com
    WARNING! Using --password via the CLI is insecure. Use --password-stdin.
    WARNING! Your password will be stored unencrypted in /home/cloud-user/.docker/config.json.
    Configure a credential helper to remove this warning. See
    https://docs.docker.com/engine/reference/commandline/login/#credentials-store

    Login Succeeded

  3. Get CLI image tags.

    climage=$(mirrormgr list remote docker tags --deployment-data /home/cloud-user/project/deploy/license/SASViyaV4_${GELLOW_ORDER}_certs.zip --cadence ${GELLOW_CADENCE_NAME}-${GELLOW_CADENCE_VERSION} | grep sas-viya-cli:latest)
    echo ${climage}
  4. Pull the image and tag it as sas-viya-cli:v1.

    docker pull ${climage}
    docker tag ${climage} sas-viya-cli:v1
  5. Use docker container run to test the image. Initially, let's just view the CLI help.

    docker container run -it sas-viya-cli:v1 --help
  6. We will need to provide the SAS Viya certificates to the container. In this step download the certificate file.

    kubectl cp $(kubectl get pod | grep "sas-logon-app" | head -1 | awk -F" " '{print $1}'):security/trustedcerts.pem /tmp/trustedcerts.pem
  7. To authenticate with a user ID and password, set the VIYA_USER, VIYA_PASSWORD, and SAS_SERVICES_ENDPOINT environment variables. Use the docker run command to authenticate as sasadm and run sas-viya identities whoami.

    NOTE: in this syntax you must use the export command to set the environment variables.

    export VIYA_USER=sasadm
    export VIYA_PASSWORD=lnxsas
    export SAS_SERVICES_ENDPOINT=https://${current_namespace}.$(hostname -f)
    docker run -it -e SAS_SERVICES_ENDPOINT -v /tmp:/security -e VIYA_USER -e VIYA_PASSWORD sas-viya-cli:v1 --output text identities whoami

    Expected output:

    https://gelcorp.pdcesx02038.race.sas.com
    Login succeeded. Token saved.
    Id                  sasadm
    Name                SAS Administrator
    Title
    EmailAddresses      [map[value:sasadm@gelcorp.com]]
    PhoneNumbers
    Addresses           [map[country: locality:Cary postalCode: region:]]
    State               active
    ProviderId          ldap
    CreationTimeStamp
    ModifiedTimeStamp

  8. You can also use the locally available profile and credentials files by mounting them into the container. This has the benefit of not authenticating on every call. You have to do a few things differently:

    • override the entrypoint so that the default authentication is not used
    • specify the user
    • mount in the CLI configuration and credentials files
    • specify the CLI profile to use as an environment variable
    docker container run -it \
    --entrypoint "/bin/bash" \
    --user 1000:1000 \
    -v /tmp:/tmp \
    -v ${SSL_CERT_FILE}:/cli-home/.certs/`basename ${SSL_CERT_FILE}` \
    -v ~/.sas/config.json:/cli-home/.sas/config.json \
    -v ~/.sas/credentials.json:/cli-home/.sas/credentials.json \
    -e SSL_CERT_FILE=/cli-home/.certs/`basename ${SSL_CERT_FILE}` \
    -e REQUESTS_CA_BUNDLE=/cli-home/.certs/`basename ${SSL_CERT_FILE}` \
    -e SAS_CLI_PROFILE=$SAS_CLI_PROFILE sas-viya-cli:v1  -c "./sas-viya --output text identities whoami"

    Expected output:

    Id                  geladm
    Name                geladm
    Title               Platform Administrator
    EmailAddresses      [map[value:geladm@gelcorp.com]]
    PhoneNumbers
    Addresses           [map[country: locality:Cary postalCode: region:]]
    State               active
    ProviderId          ldap
    CreationTimeStamp
    ModifiedTimeStamp

Use pre-built administration CLI images

You can build your own Docker image based on the SAS-provided sas-viya CLI image. This allows you to add additional tools to the CLI image. For the rest of the class we will use an image built from the SAS-provided Viya CLI image. The image has been pre-built and stored in the gelharbor Docker registry. The container images are automatically rebuilt weekly using a Jenkins process, with the Docker files and scripts stored in GitLab.
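
As an illustration of how such a custom image could be built on top of the image you pulled and tagged earlier, the sketch below layers a local helper script onto sas-viya-cli:v1. The script name and destination path are illustrative only; the class image in gelharbor is produced by the separate Jenkins build, not by this snippet.

    # Sketch only: add a local helper script on top of the SAS-provided CLI image
    mkdir -p /tmp/my-cli-image && cd /tmp/my-cli-image

    tee show-whoami.sh > /dev/null << 'EOF'
    #!/bin/sh
    # illustrative helper: print the authenticated identity
    ./sas-viya --output text identities whoami
    EOF

    tee Dockerfile > /dev/null << 'EOF'
    FROM sas-viya-cli:v1
    COPY show-whoami.sh /cli-home/show-whoami.sh
    EOF

    docker build -t my-sas-viya-cli:v1 .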

  1. Pull the latest build of the sas-viya cli container and test by displaying the version number.

    docker pull gelharbor.race.sas.com/admin-toolkit/sas-viya-cli:latest
    docker tag gelharbor.race.sas.com/admin-toolkit/sas-viya-cli:latest sas-viya-cli:latest
    docker container run -it sas-viya-cli:latest ./sas-viya --version

    Expected output:

    latest: Pulling from admin-toolkit/sas-viya-cli
    Digest: sha256:02650157e29f0950b62b9508c6b9e4bc105a7213f46c53bd2427d0c348ddab9b
    Status: Image is up to date for gelharbor.race.sas.com/admin-toolkit/sas-viya-cli:latest
    gelharbor.race.sas.com/admin-toolkit/sas-viya-cli:latest
    sas-viya version 1.22.3

  2. Run ad-hoc CLI processing using the container. In this example we will use the container to run a CLI command. The CLI configuration and credentials file are mounted into the container.

    docker container run -it \
        -v ${SSL_CERT_FILE}:/cli-home/.certs/`basename ${SSL_CERT_FILE}` \
        -v ~/.sas/config.json:/cli-home/.sas/config.json \
        -v ~/.sas/credentials.json:/cli-home/.sas/credentials.json \
        -e SSL_CERT_FILE=/cli-home/.certs/`basename ${SSL_CERT_FILE}` \
        -e REQUESTS_CA_BUNDLE=/cli-home/.certs/`basename ${SSL_CERT_FILE}` \
        -e SAS_CLI_PROFILE=${current_namespace} \
        sas-viya-cli:latest sas-viya --output text identities whoami

    Expected output:

    Id                  geladm
    Name                geladm
    Title               Platform Administrator
    EmailAddresses      [map[value:geladm@gelcorp.com]]
    PhoneNumbers
    Addresses           [map[country: locality:Cary postalCode: region:]]
    State               active
    ProviderId          ldap
    CreationTimeStamp
    ModifiedTimeStamp

  3. Using the containerized CLI requires a lot more typing. In the class environment we have defined two functions that let us run ad hoc commands and scripts more easily. Review the functions.

    cat ~/geladmin_common_functions.shinc | grep gel_sas_viya -A 24
    gel_sas_viya () {
    # if env var not set set it to Default
    SAS_CLI_PROFILE=${SAS_CLI_PROFILE:=Default}
    
    # run the sas-admin cli in a container
    docker container run -it \
    -v /tmp:/tmp \
    -v ${SSL_CERT_FILE}:/cli-home/.certs/`basename ${SSL_CERT_FILE}` \
    -v ~/.sas/config.json:/cli-home/.sas/config.json \
    -v ~/.sas/credentials.json:/cli-home/.sas/credentials.json \
    -e SSL_CERT_FILE=/cli-home/.certs/`basename ${SSL_CERT_FILE}` \
    -e REQUESTS_CA_BUNDLE=/cli-home/.certs/`basename ${SSL_CERT_FILE}` \
    -e SAS_CLI_PROFILE=$SAS_CLI_PROFILE gelharbor.race.sas.com/admin-toolkit/sas-viya-cli sas-viya $@
    }
    
    gel_sas_viya_batch () {
    
    if [ $# -eq 0 ]; then
    echo "ERROR: pass the function the full path to a script"
    return
    fi
    
    # if env var not set set it to Default
    SAS_CLI_PROFILE=${SAS_CLI_PROFILE:=Default}
    
    
    # run the sas-admin cli in a container
    docker container run -it  \
    -v /tmp:/tmp \
    -v /shared/gelcontent:/gelcontent \
    -v ${SSL_CERT_FILE}:/cli-home/.certs/`basename ${SSL_CERT_FILE}` \
    -v ~/.sas/config.json:/cli-home/.sas/config.json \
    -v ~/.sas/credentials.json:/cli-home/.sas/credentials.json \
    -e SSL_CERT_FILE=/cli-home/.certs/`basename ${SSL_CERT_FILE}` \
    -e REQUESTS_CA_BUNDLE=/cli-home/.certs/`basename ${SSL_CERT_FILE}` \
    -e SAS_CLI_PROFILE=$SAS_CLI_PROFILE gelharbor.race.sas.com/admin-toolkit/sas-viya-cli sh $@
    }
  4. Here we run the same CLI command using the function gel_sas_viya. Now the command to use the containerized CLI is basically the same as using the downloaded CLI.

    gel_sas_viya --output text identities whoami

Use Apache Airflow to orchestrate processing using the sas-viya CLI container

The flow executes scripts that run a series of sas-viya commands to perform specific administration tasks. The scripts are stored on an NFS server and are mounted into the pods in the flow.

The following scripts will be executed

  • Setup identities : 01-setup-identities.sh
  • Create a preliminary folder structure : 02-create-folders.sh
  • Apply an authorization schema to the folder structure: 03-setup-authorization.sh
  • Create some caslibs for data access: 04-setup-caslibs.sh
  • Load Data: 05-setup-loaddata.sh
  • Apply CAS authorization: 06-setup-casauth.sh
  • Load content from Viya packages: 07-load-content.sh
  • Validate the success of the process: 08-validate.sh
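
If you want to try one of these scripts outside Airflow, the gel_sas_viya_batch function shown earlier mounts /shared/gelcontent into the CLI container as /gelcontent, so after the copy step below a manual run could look like the sketch that follows. The Airflow flow remains the intended way to run the full sequence, and individual scripts may assume that earlier steps have already run.

    # Sketch: run the validation script on its own in the containerized CLI
    gel_sas_viya_batch /gelcontent/gelcorp_initenv/08-validate.sh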

Copy the scripts and configuration files

  1. Copy the scripts and files to the shared storage which will be mounted into the pods.

    cp -pr ~/PSGEL260-sas-viya-4.0.1-administration/files/gelcorp_initenv /shared/gelcontent/
    chmod -R 755 /shared/gelcontent/gelcorp_initenv
  2. Copy the SAS Viya CLI configuration and credential files to the project directory. The sas-viya CLI uses a profile to store the connection information for the Viya environment and a credentials file to store the access token, and it needs to be able to reference the certificates for the Viya environment. In this step these files are copied to our project directory, and then we generate configMaps that include their content. Ultimately the configMaps are mounted into the sas-viya CLI container so that it can access Viya.

    /opt/pyviyatools/loginviauthinfo.py
    mkdir -p ~/project/admincli/${current_namespace}
    cp -p ~/.sas/config.json ~/project/admincli/${current_namespace}/
    cp -p ~/.sas/credentials.json ~/project/admincli/${current_namespace}/
    cp -p ~/.certs/${current_namespace}_trustedcerts.pem ~/project/admincli/${current_namespace}/trustedcerts.pem
  3. Create configmaps for CLI config files in the airflow namespace.

    tee ~/project/admincli/${current_namespace}/kustomization.yaml > /dev/null << EOF
    ---
    generatorOptions:
      disableNameSuffixHash: true
    configMapGenerator:
      - name: cli-config
        files:
        - config.json
      - name: cli-token
        files:
        - credentials.json
      - name: cert-file
        files:
        - trustedcerts.pem
    EOF
    
    cd ~/project/admincli/${current_namespace}
    kustomize build -o ~/project/admincli/${current_namespace}/configmaps.yaml
    kubectl -n airflow apply --server-side=true -f ~/project/admincli/${current_namespace}/configmaps.yaml
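
Before moving on, you can confirm that the three configMaps were created in the airflow namespace:

    kubectl -n airflow get configmaps cli-config cli-token cert-file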

Create the Python script that defines the flow

The workflow is defined as a Python script. In this step we copy the script to the Airflow DAGs directory and review its content.

Notice the following in the flow definition:

  • the dag item defines each task, the order of execution, and the dependencies between the tasks
  • each task runs a script that is mounted into the pod from the NFS server
  • the container image used is gelharbor.race.sas.com/admin-toolkit/sas-viya-cli:latest
  • the credentials, certificates, and CLI profile are mounted into the pod from configMaps.
  1. Copy the python file that defines the flow to the airflow DAG directory and review the content.

    cp /home/cloud-user/PSGEL260-sas-viya-4.0.1-administration/files/dags/001-load-content.py  /shared/gelcontent/airflow/dags/001-load-content.py
    cat /shared/gelcontent/airflow/dags/001-load-content.py

Run the flow and review the results

  1. In a MobaXterm session on sasnode01, generate the Airflow URL and logon using admin:admin.

    gellow_urls | grep Airflow
  2. On the DAGs tab, notice that we have a flow 01-load-content-flow. The flow has been loaded into Airflow because the software is configured to register flows from any Python scripts copied to the DAGs directory. Open 01-load-content-flow and review the flow diagram.

  3. Select Graph

  4. Select the Run icon and select Trigger DAG. The flow should run, and if it is successful, all the nodes turn green.

  5. Click on task-04-setup-caslibs and then select Logs to view the log from the step.

  6. We can also view the log of each task using kubectl.

    kubectl -n airflow logs -l task_id=task-04-setup-caslibs --tail 50

    Expected output:

    The requested caslib "hrdl" has been added successfully.

     Caslib Properties
     Name                hrdl
     Server              cas-shared-default
     Description         gelcontent hrdl
     Source Type         PATH
     Path                /gelcontent/gelcorp/hr/data/
     Scope               global
    
     Caslib Attributes
     active              true
     personal            false
     subDirs             false
     The requested caslib "Financial Data" has been added successfully.
    
     Caslib Properties
     Name                Financial Data
     Server              cas-shared-default
     Description         gelcontent finance
     Source Type         PATH
     Path                /gelcontent/gelcorp/finance/data/
     Scope               global
    
     Caslib Attributes
     active              true
     personal            false
     subDirs             false


  7. If we look at the pods in the airflow namespace, we see a pod with the status Completed for each node in the flow.

    kubectl get pods -n airflow | grep task

    Expected output:

    task-01-setup-identities-ed8va44f      0/1   Completed   0   22m
    task-02-setup-folders-2jmd0504         0/1   Completed   0   22m
    task-03-setup-authorization-7262tzvh   0/1   Completed   0   22m
    task-04-setup-caslibs-krhgav20         0/1   Completed   0   22m
    task-05-setup-loaddata-7wp8rnjw        0/1   Completed   0   22m
    task-06-setup-casauth-37cs2f7n         0/1   Completed   0   21m
    task-07-load-content-rl3i11i4          0/1   Completed   0   21m
    task-08-validate-dgaurcjg              0/1   Completed   0   21m

  8. Clean up the pods. Because we set is_delete_operator_pod=False, the pods remain even when the task is complete. We did this so that we would have pods to inspect and logs to view. Now we can clean up the pods.

    kubectl -n airflow delete pods -l dag_id=01-load-content-flow

    Expected output:

    pod "task-01-setup-identities-81poa3v2" deleted
    pod "task-02-setup-folders-kuc8pm4f" deleted
    pod "task-03-setup-authorization-eiboo19i" deleted
    pod "task-04-setup-caslibs-9plc51t4" deleted
    pod "task-05-setup-loaddata-c0ijir2u" deleted
    pod "task-06-setup-casauth-81l4fovm" deleted
    pod "task-07-load-content-hmjars3f" deleted
    pod "task-08-validate-c3lk6ayj" deleted

Validate

The validation is run in the last step of the flow.

  1. View the validation report. In the MobaXterm SFTP tab for sasnode01, navigate to /shared/gelcontent/gelcorp_initenv/.

  2. Select the html file that starts with report-, right-click, select Open with and open the report with Google Chrome.

  3. Review the report to check that the folders for content were created, that CAS is running, and that the new caslibs are available.

  4. We could also use the CLI to validate. For example, list the folders:

    gel_sas_viya --output text folders list-members --path /gelcontent --recursive --tree
    |—— gelcontent
        |  |—— GELCorp
        |  |  |—— Finance
        |  |  |  |—— Reports
        |  |  |  |  |—— RevenueTrend (report)
        |  |  |  |  |—— FinanceOverTime (report)
        |  |  |  |  |—— Profit Pie Chart (jobDefinition)
        |  |  |  |  |—— Profit Bar Chart (jobDefinition)
        |  |  |  |  |—— Map of Profit by State (jobDefinition)
        |  |  |  |  |—— LossMakingProductRank (report)
        |  |  |  |—— Data
        |  |  |  |  |—— FinanceLASRAppendTables1 (dataPlan)
        |  |  |  |  |—— Source Data
        |  |  |—— Shared
        |  |  |  |—— Reports
        |  |  |  |  |—— GELCORP Shared HR Summary Report (report)
        |  |  |—— HR
        |  |  |  |—— Code
        |  |  |  |  |—— HRAnalysysProject
        |  |  |  |  |  |—— 4_LoadDataInCAS.sas (file)
        |  |  |  |  |  |—— 2_CreateDataInSAS.sas (file)
        |  |  |  |  |  |—— 1_CreateFormatsInSAS.sas (file)
        |  |  |  |  |  |—— 3_LoadFormatsInSAS.sas (file)
        |  |  |  |—— Work in Progress
        |  |  |  |—— Data Plans
        |  |  |  |—— WorkinProgress
        |  |  |  |—— Reports
        |  |  |  |  |—— Employee measure histograms (report)
        |  |  |  |  |—— Employee Attrition Overview (report)
        |  |  |  |  |—— Employee attrition factors heatmap (report)
        |  |  |  |  |—— Employee attrition factors correlation (report)
        |  |  |  |—— Analyses
        |  |  |  |  |—— Cluster Analysis for employees who left (report)
        |  |  |  |  |—— EmployeeSurveyDecisionTree (report)
        |  |  |  |  |—— Regression Analysis of Employee Attrition (report)
        |  |  |—— Sales
        |  |  |  |—— Data Plans
        |  |  |  |—— Work in Progress
        |  |  |  |—— WorkinProgress
        |  |  |  |—— Reports
        |  |  |  |  |—— Sales Forecast (report)
        |  |  |  |  |—— Sales Correlation (report)
        |  |  |  |  |—— Sales Overview (report)
        |  |  |  |—— Analyses
        |  |  |  |  |—— TemperaturevSales (report)
        |  |  |  |  |—— Sales Regression Analysis (report)
  5. Log on to SAS Drive as geladm : lnxsas and view the reports. Run the command below to generate a link for SAS Drive, then click the link in the terminal window.

    gellow_urls | grep "SAS Drive"
  • Navigate to SAS Content > gelcontent > GELCorp > HR > Reports

  • Open the Employee attrition factors correlation report

Review

In this practice exercise you:

  • pulled the SAS-provided sas-viya-cli Docker image
  • authenticated and used the CLI in the container
  • created an Apache Airflow process to initialize the environment for users; each step of the flow runs in the sas-viya CLI container and runs a script to perform its task
  • validated that the flow executed successfully.

Lesson 04

SAS Viya Administration Operations
Lesson 04, Section 1 Exercise: Configure Backup and Restore

Review Backup Settings and Change the Retain Policy

Set current namespace and authenticate.

gel_setCurrentNamespace gelcorp
/opt/pyviyatools/loginviauthinfo.py

Change the Retain Policy of the SAS Backup Persistent Volumes

The Viya documentation recommends setting the reclaim policy for the backup PVs to Retain. In this section we make that change. With the Retain policy, if the PersistentVolumeClaim is deleted, the corresponding PersistentVolume is not deleted, allowing the data to be manually recovered.
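
Before making any changes, it can be useful to list every persistent volume together with its current reclaim policy and the claim it is bound to; kubectl custom columns make this a one-liner:

    kubectl get pv -o custom-columns=NAME:.metadata.name,RECLAIMPOLICY:.spec.persistentVolumeReclaimPolicy,CLAIM:.spec.claimRef.name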

  1. Viya requires that at least one ReadWriteMany (RWX) StorageClass is defined and set as the default. The command below shows the storage classes in our cluster. Notice that the RECLAIMPOLICY is Delete. This means that any PVs created from the storage class inherit this policy.

    kubectl get storageclass

    Expected output:

    NAME                   PROVISIONER                                          RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
    nfs-client (default)   cluster.local/nfs-nfs-subdir-external-provisioner   Delete          Immediate           true                   9h

  2. View the Backup Persistent volume claims and their volumes.

    kubectl get    pvc -l 'sas.com/backup-role=storage'

    Expected output:

    NAME                     STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
    sas-cas-backup-data      Bound    pvc-4ea3e4e3-4efc-4e62-b5c1-a959759e4a79   8Gi        RWX            nfs-client     5d18h
    sas-common-backup-data   Bound    pvc-5ba4898e-0309-46dc-977b-13ba5d3e5b07   25Gi       RWX            nfs-client     5d18h

  3. Store the two backup volume names in environment variables.

    casbackvolname=$(kubectl get pvc sas-cas-backup-data -o jsonpath='{.spec.volumeName}')
    commonbackvolname=$( kubectl get pvc sas-common-backup-data -o jsonpath='{.spec.volumeName}')
    echo CAS Backup PV: ${casbackvolname} AND Common Backup PV: ${commonbackvolname}

    Expected output:

    CAS Backup PV: pvc-4f82795c-bb7d-4fd7-80f1-50785369259f AND Common Backup PV: pvc-5ff882a4-fb4d-4ae4-8b56-65b442966805

  4. Describe the sas-cas-backup-data volume and notice that the reclaim policy, inherited from the storage class, is Delete. For the backup data it is more appropriate to have a reclaim policy of Retain. With the Retain policy, if a user deletes a PersistentVolumeClaim, the corresponding PersistentVolume is not deleted, allowing the data to be manually recovered.

    kubectl describe pv ${casbackvolname}

    Expected output:

    Name:            pvc-4ea3e4e3-4efc-4e62-b5c1-a959759e4a79
    Labels:          <none>
    Annotations:     pv.kubernetes.io/provisioned-by: cluster.local/nfs-nfs-subdir-external-provisioner
    Finalizers:      [kubernetes.io/pv-protection]
    StorageClass:    nfs-client
    Status:          Bound
    Claim:           from35/sas-cas-backup-data
    Reclaim Policy:  Delete
    Access Modes:    RWX
    VolumeMode:      Filesystem
    Capacity:        8Gi
    Node Affinity:   <none>
    Message:
    Source:
        Type:      NFS (an NFS mount that lasts the lifetime of a pod)
        Server:    intnode01
        Path:      /srv/nfs/kubedata/from35-sas-cas-backup-data-pvc-4ea3e4e3-4efc-4e62-b5c1-a959759e4a79
        ReadOnly:  false
    Events:          <none>

  5. The two persistent volumes were dynamically provisioned. In order to update the ReclaimPolicy we must patch both Backup volumes, setting spec.persistentVolumeReclaimPolicy to Retain.

    kubectl patch pv ${casbackvolname} -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
    kubectl patch pv ${commonbackvolname} -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'

    Expected output:

    persistentvolume/pvc-30887465-dc86-4fd9-b6f2-eaaba3a7d91a patched
    persistentvolume/pvc-cc6ea395-991e-4bb7-9cb0-7cb623900643 patched

  6. Check that the reclaim policy has been updated. Notice that we now have two persistent volumes with a reclaim policy of Retain.

    kubectl get pv | grep Retain | grep backup

    Expected output:

    pvc-0a852e85-48cc-437b-95f7-92543c9b0e21   25Gi   RWX   Retain   Bound   gelcorp/sas-common-backup-data   nfs-client   20h
    pvc-a548e136-9209-430e-b627-6f129308b0d0   8Gi    RWX   Retain   Bound   gelcorp/sas-cas-backup-data      nfs-client   20h

  7. Changing the reclaim policy to Retain is a best practice for the Viya backup volumes. In the event of a problem in the namespace, the backup is preserved and the data can be used in a restore.

    NOTE: because Kubernetes will no longer automatically clean up these volumes, the Viya administrator should make sure that any data no longer needed on them is deleted; a sketch of that manual cleanup follows this note.
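
A minimal cleanup sketch, assuming you have confirmed that the data on a volume is no longer needed: a retained volume shows the status Released after its claim is deleted, and deleting the PV object then removes it from Kubernetes, while the files on the underlying NFS export may still need to be removed separately.

    # List retained volumes whose claims have been deleted
    kubectl get pv | grep Released

    # Remove a PV object once its contents are confirmed to be disposable
    # (replace <pv-name> with one of the names listed above)
    kubectl delete pv <pv-name>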

Review the Current Backup Settings

  1. List the backup and restore cronJobs. Notice that the two that are not suspended (SUSPEND is False) are the scheduled backup and purge cronJobs.

    kubectl get cronjobs | grep -E "backup|restore"

    Expected output:

    NAME                               SCHEDULE       SUSPEND   ACTIVE   LAST SCHEDULE   AGE
    sas-backup-purge-job               15 0 1/1 * ?   False     0        12h             15h
    sas-backup-pv-copy-cleanup-job     * * 30 2 *     True      0        <none>          15h
    sas-restore-job                    * * 30 2 *     True      0        <none>          15h
    sas-scheduled-backup-all-sources   0 1 * * 6      True      0        <none>          15h
    sas-scheduled-backup-incr-job      0 6 * * 1-6    True      0        <none>          15h
    sas-scheduled-backup-job           0 1 * * 0      False     0        <none>          15h

  2. Review the settings in the backup configMap. The configMap holds parameters for the regularly scheduled backup job and any ad hoc jobs created from it. First, get the name of the sas-backup-job-parameters configMap.

    BACKUP_CM=$(kubectl describe cronjob sas-scheduled-backup-job | grep -i sas-backup-job-parameters | awk '{print $1}'|head -n 1)
    echo ${BACKUP_CM}

    Expected output like:

    sas-backup-job-parameters-tbd6g9ttmh

  3. Describe the configMap.

    kubectl describe cm ${BACKUP_CM}

    Expected output:

    Name:         sas-backup-job-parameters-d24749d25f
    Namespace:    gelcorp
    Labels:       sas.com/admin=cluster-local
                  sas.com/deployment=sas-viya
    Annotations:

    Data
    ====
    SG_GO_MODULES_ENABLED:
    ----
    true
    SG_PROJECT:
    ----
    backup
    CNTR_REPO_PREFIX:
    ----
    convoy
    INCLUDE_POSTGRES:
    ----
    true
    JOB_TIME_OUT:
    ----
    1200
    SAS_BACKUP_JOB_DU_NAME:
    ----
    sas-backup-job
    SAS_LOG_LEVEL:
    ----
    DEBUG
    SAS_SERVICE_NAME:
    ----
    sas-backup-job
    SG_GO_MULTI_MODULES:
    ----
    true
    FILE_SYSTEM_BACKUP_FORMAT:
    ----
    tar
    RETENTION_PERIOD:
    ----
    2
    SAS_CONTEXT_PATH:
    ----
    backup
    SAS_DU_NAME:
    ----
    backup

    BinaryData
    ====

    Events:

  4. List the currently scheduled backups. As you can see from the schedule, by default, a backup is scheduled weekly on Sunday at 1:00 am UTC and the all-sources backup is suspended.

    kubectl get cronjobs -l "sas.com/backup-job-type=scheduled-backup"

    Expected output:

    NAME                               SCHEDULE    SUSPEND   ACTIVE   LAST SCHEDULE   AGE
    sas-scheduled-backup-all-sources   0 1 * * 6   True      0        <none>          15h
    sas-scheduled-backup-job           0 1 * * 0   False     0        <none>          13h

  5. Purging is performed through a CronJob that executes daily at 12:15 a.m. Get the details of the current backup purge job.

    kubectl get cronjobs -l "sas.com/backup-job-type=purge-backup"

    Expected output:

    NAME                   SCHEDULE       SUSPEND   ACTIVE   LAST SCHEDULE   AGE
    sas-backup-purge-job   15 0 1/1 * ?   False     0        13h             15h

OPTIONAL: Change the Backup Settings

In this OPTIONAL section we will update the backup configuration. We will change the default:

  • backup retention period
  • backup schedule

We will make the changes in the manifests and then build and apply to make the changes in the cluster.

Backup Job Parameters

To change the backup job parameters, update the sas-backup-job-parameters configMap. In this step we change the backup retention period and the level for log messages.

  1. Run this command to add a sas-backup-job-parameters entry with two literals to the configMapGenerator section of kustomization.yaml.

    [[ $(grep -c "name: sas-backup-job-parameters" ~/project/deploy/${current_namespace}/kustomization.yaml) == 0 ]] && \
    yq4 eval -i '.configMapGenerator += {
        "name": "sas-backup-job-parameters",
        "behavior": "merge",
        "literals": ["RETENTION_PERIOD=5","SAS_LOG_LEVEL=INFO"]
        }' ~/project/deploy/${current_namespace}/kustomization.yaml

    Alternatively, you could have manually edited kustomization.yaml to include the constants.

    [...]
    configMapGenerator:
        [... previous configMapGenerator items ...]
        - name: sas-backup-job-parameters
          behavior: merge
          literals:
            - RETENTION_PERIOD=5
            - SAS_LOG_LEVEL=INFO
    [...]
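
The new values take effect only after the build and apply later in this exercise regenerates the sas-backup-job-parameters configMap. Afterwards, you can re-run the configMap inspection from the previous section to confirm the new retention period, for example:

    BACKUP_CM=$(kubectl describe cronjob sas-scheduled-backup-job | grep -i sas-backup-job-parameters | awk '{print $1}' | head -n 1)
    kubectl describe cm ${BACKUP_CM} | grep -A 2 RETENTION_PERIOD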

Backup Schedule

  1. Create a patch transformer that will update the schedule of the default backup cronJob.

    tee ~/project/deploy/${current_namespace}/site-config/change-default-backup-schedule.yaml > /dev/null << EOF
    ---
    apiVersion: builtin
    kind: PatchTransformer
    metadata:
        name: sas-scheduled-backup-job-change-default-backup-transformer
    patch: |-
        - op: replace
          path: /spec/schedule
          value: '0 3 * * 6'
    target:
        name: sas-scheduled-backup-job
        kind: CronJob
        version: v1
    EOF
  2. Modify ~/project/deploy/${current_namespace}/kustomization.yaml to reference the patch transformer overlay.

    In the transformers section add the line - site-config/change-default-backup-schedule.yaml

    Run this command to update kustomization.yaml using the yq tool:

    [[ $(grep -c "site-config/change-default-backup-schedule.yaml" ~/project/deploy/${current_namespace}/kustomization.yaml) == 0 ]] && \
    yq4 eval -i '.transformers += ["site-config/change-default-backup-schedule.yaml"]' ~/project/deploy/${current_namespace}/kustomization.yaml

    Alternatively, you can manually edit the transformers section to add the lines below

    [...]
    transformers:
        [... previous transformers items ...]
        - site-config/change-default-backup-schedule.yaml
    [...]

Build and Apply with sas-orchestration deploy

  1. Keep a copy of the current manifest file. We will use this copy to track the changes your kustomization processing makes to this file.

    cp -p /tmp/${current_namespace}/deploy_work/deploy/manifest.yaml /tmp/${current_namespace}/manifest_03-051.yaml
  2. Run the sas-orchestration deploy command.

    cd ~/project/deploy
    rm -rf /tmp/${current_namespace}/deploy_work/*
    source ~/project/deploy/.${current_namespace}_vars
    
    docker run --rm \
                -v ${PWD}/license:/license \
                -v ${PWD}/${current_namespace}:/${current_namespace} \
                -v ${HOME}/.kube/config_portable:/kube/config \
                -v /tmp/${current_namespace}/deploy_work:/work \
                -e KUBECONFIG=/kube/config \
                --user $(id -u):$(id -g) \
            sas-orchestration \
                deploy \
                    --namespace ${current_namespace} \
                    --deployment-data /license/SASViyaV4_${_order}_certs.zip \
                    --license /license/SASViyaV4_${_order}_license.jwt \
                    --user-content /${current_namespace} \
                    --cadence-name ${_cadenceName} \
                    --cadence-version ${_cadenceVersion} \
        --image-registry ${_viyaMirrorReg}

    When the deploy command completes successfully, the final message should say The deploy command completed successfully, as shown in the log snippet below.

    The deploy command started
    Generating deployment artifacts
    Generating deployment artifacts complete
    Generating kustomizations
    Generating kustomizations complete
    Generating manifests
    Applying manifests
    > start_leading gelcorp
    
    [...more...]
    
    > kubectl delete --namespace gelcorp --wait --timeout 7200s --ignore-not-found configmap sas-deploy-lifecycle-operation-variables
    configmap "sas-deploy-lifecycle-operation-variables" deleted
    
    > stop_leading gelcorp
    
    Applying manifests complete
    The deploy command completed successfully
  3. If the sas-orchestration deploy command fails, check out the steps in 99_Additional_Topics/03_Troubleshoot_SAS_Orchestration_Deploy to help you troubleshoot the problem.

  4. Run the following command to view the changes in the manifest. The changes are in green in the right column.

    icdiff  /tmp/${current_namespace}/manifest_03-051.yaml /tmp/${current_namespace}/deploy_work/deploy/manifest.yaml
  5. Check that the default backup schedule has been updated.

    kubectl get cronjobs -l "sas.com/backup-job-type=scheduled-backup"

    Expected output:

    NAME                       SCHEDULE    SUSPEND   ACTIVE   LAST SCHEDULE   AGE
    sas-scheduled-backup-job   0 3 * * 6   False     0        <none>          13h

Review

  • Set the retain policy on the Viya backup PVs to protect the backup package from accidental deletion
  • Backup and restore are implemented using Kubernetes cronJobs
  • You can change the default backup settings using the provided overlays

SAS Viya Administration Operations
Lesson 04, Section 2 Exercise: Perform a Backup

In this hands-on you will perform an ad hoc backup of the Viya deployment and copy the resulting backup package outside of the cluster.

Run the Ad-hoc Backup

To run an ad hoc backup, create a backup job from the default SAS Viya scheduled backup cronJob.

  1. Create the ad hoc backup job from the scheduled backup.

    cd ~/project/deploy/${current_namespace}
    kubectl create job --from=cronjob/sas-scheduled-backup-job sas-scheduled-backup-job-adhoc-001
  2. It will take a moment for the job to be created and start running. Check if the job has started.

    kubectl get jobs sas-scheduled-backup-job-adhoc-001
  3. When the job starts, you can view the job progress in the log. This command will display the backup log as the job runs. You can press CTRL-C to stop viewing the log.

    kubectl logs -f job/sas-scheduled-backup-job-adhoc-001 -c sas-backup-job | gel_log

    Expected output:

    INFO 2023-08-24 14:55:58.735 +0000 [sas-backupjob] - Received a response from the backup agent for the "cas" data source and the "backup" job.
    INFO 2023-08-24 14:55:58.736 +0000 [sas-backupjob] - The "backup" job for the "cas" data source finished with the status "Completed".
    INFO 2023-08-24 14:55:58.736 +0000 [sas-backupjob] - Received a response from the backup agent for the "cas" data source and the "backup" job.
    INFO 2023-08-24 14:55:58.736 +0000 [sas-backupjob] - The "backup" job for the "cas" data source finished with the status "Completed".
    INFO 2023-08-24 14:55:58.736 +0000 [sas-backupjob] - Received a response from the backup agent for the "configurations" data source and the "backup" job.
    INFO 2023-08-24 14:55:58.736 +0000 [sas-backupjob] - The "backup" job for the "configurations" data source finished with the status "Completed".
    INFO 2023-08-24 14:56:58.736 +0000 [sas-backupjob] - Received a response from the backup agent for the "postgres" data source and the "backup" job.
    INFO 2023-08-24 14:56:58.736 +0000 [sas-backupjob] - The "backup" job for the "postgres" data source finished with the status "Completed".
    INFO 2023-08-24 14:57:58.737 +0000 [sas-backupjob] - Received a response from the backup agent for the "fileSystem" data source and the "backup" job.
    INFO 2023-08-24 14:57:58.737 +0000 [sas-backupjob] - The "backup" job for the "fileSystem" data source finished with the status "Completed".
    INFO 2023-08-24 14:57:58.737 +0000 [sas-backupjob] - Received a response from the backup agent for the "fileSystem" data source and the "backup" job.
    INFO 2023-08-24 14:57:58.737 +0000 [sas-backupjob] - The "backup" job for the "fileSystem" data source finished with the status "Completed".
    INFO 2023-08-24 14:57:58.740 +0000 [sas-backupjob] - Created the backup status file: /sasviyabackup/2023-08-24T14_54_48_648_0700/status.json
    INFO 2023-08-24 14:57:58.742 +0000 [sas-backupjob] - Added a backup job entry to the status file: 2023-08-24T14_54_48_648_0700
    INFO 2023-08-24 14:57:58.765 +0000 [sas-backupjob] - Updating the Kubernetes job sas-scheduled-backup-job-adhoc-001 with given label.
    INFO 2023-08-24 14:57:58.777 +0000 [sas-backupjob] - Updated the Kubernetes job the sas-scheduled-backup-job-adhoc-001.
    INFO 2023-08-24 14:57:58.777 +0000 [sas-backupjob] - backupjob-log-icu.backup.backup.status.info.log [jobStatus:Completed]

  4. You can check the status of the job as it runs. The job usually takes 5 minutes or so to complete. Please wait for the job to complete before moving on.

    kubectl get jobs -l "sas.com/backup-job-type=scheduled-backup" -L "sas.com/backup-job-type,sas.com/sas-backup-job-status"

    Reissue the command until you see a result that indicates the backup job has completed (you may have more than one job).

    NAME                                 COMPLETIONS   DURATION   AGE     BACKUP-JOB-TYPE    SAS-BACKUP-JOB-STATUS
    sas-scheduled-backup-job-adhoc-001   0/1           5m32s      5m32s   scheduled-backup   Completed
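
    As an alternative to reissuing the command, you could block until the job finishes with kubectl wait. This is just a sketch; adjust the timeout to suit your environment.

    # Wait for the ad hoc backup job to report the "complete" condition (assumed timeout of 20 minutes)
    kubectl wait --for=condition=complete --timeout=20m job/sas-scheduled-backup-job-adhoc-001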

Review the Ad-hoc Backup Results

To view the results of your backup, you will need the storage location and the unique ID of the backup package (the BackupID).

NOTE: the BackupID uniquely identifies a backup package and is formed from the date and timestamp when the backup was created.

  1. The code below captures the backup ID and the location of the persistent volumes. Within the backup package, we can then review the content of the status.json file, which contains the detailed status of each backup task.

    Check the following fields:

    • Status
    • TaskStatus for each task
    • the size of the backup for each source within each task

    bckpvname=sas-common-backup-data
    bckvolname=$( kubectl get pvc $bckpvname -o jsonpath='{.spec.volumeName}')
    echo Backup Persistent Volume is: $bckvolname
    
    caspvname=sas-cas-backup-data
    casvolname=$( kubectl get pvc $caspvname -o jsonpath='{.spec.volumeName}')
    echo CAS Backup Persistent Volume is: $casvolname
    
    backupid=$(yq4 eval '(.metadata.labels."sas.com/sas-backup-id")' <(kubectl get job sas-scheduled-backup-job-adhoc-001 -o yaml))
    
    echo Backup Id is: $backupid
    
    cat /srv/nfs/kubedata/${current_namespace}-${bckpvname}-${bckvolname}/${backupid}/status.json

    Expected output:

    {
        "BackupID": "2020-10-16T17_23_26_626_0700",
        "Status": "Completed",
        "StartTime": "2020-10-16T17:23:26.777020011Z",
        "EndTime": "2020-10-16T17:25:46.806417294Z",
        "Tasks": [
            {
                "TaskID": "5218cc6d-3a9a-4a76-8d9d-9d6ad89df33d",
                "TaskStatus": "Completed",
                "SourceType": "configurations",
                "DataSource": "sas-adhoc-backup-9hjtd4sc",
                "ResponseItems": [
                    {
                        "ServiceID": "configurations",
                        "Status": "Completed",
                        "StatusCode": 0,
                        "Size": "425K",
                        "StartTime": "2020-10-16T17:23:36.825629789Z",
                        "EndTime": "2020-10-16T17:23:37.150478583Z"
                    },
                    {
                        "ServiceID": "definitions",
                        "Status": "Completed",
                        "StatusCode": 0,
                        "Size": "425K",
                        "StartTime": "2020-10-16T17:23:37.156930002Z",
                        "EndTime": "2020-10-16T17:23:37.326226212Z"
                    },
                    {
                        "ServiceID": "consulProperties",
                        "Status": "Completed",
                        "StatusCode": 0,
                        "Size": "687",
                        "StartTime": "2020-10-16T17:23:37.332567231Z",
                        "EndTime": "2020-10-16T17:23:37.361899002Z"
                    }
                ]
            },
            {
                "TaskID": "608cde22-0ad0-455e-be46-22eb71a38faf",
                "TaskStatus": "Completed",
                "SourceType": "postgres",
                "DataSource": "sas-adhoc-backup-9hjtd4sc",
                "ResponseItems": [
                    {
                        "ServiceID": "sas-crunchy-data-postgres",
                        "Status": "Completed",
                        "StatusCode": 0,
                        "Size": "263M",
                        "StartTime": "2020-10-16T17:23:37.523873943Z",
                        "EndTime": "2020-10-16T17:25:37.831081135Z"
                    }
                ]
            },
            {
                "TaskID": "a055a85a-048a-4546-a842-b358c9df1925",
                "TaskStatus": "Completed",
                "SourceType": "cas",
                "DataSource": "cas-shared-default",
                "ResponseItems": [
                    {
                        "ServiceID": "cas-shared-default",
                        "Status": "Completed",
                        "StatusCode": 0,
                        "Size": "89K",
                        "StartTime": "2020-10-16T17:23:37.018146359Z",
                        "EndTime": "2020-10-16T17:23:37.327337438Z"
                    }
                ]
            },
            {
                "TaskID": "ae9296d8-c89b-45b1-bc0b-16ede4fc0a4e",
                "TaskStatus": "Completed",
                "SourceType": "fileSystem",
                "DataSource": "cas-shared-default",
                "ResponseItems": [
                    {
                        "ServiceID": "cas-shared-default",
                        "Status": "Completed",
                        "StatusCode": 0,
                        "Size": "1.2G",
                        "StartTime": "2020-10-16T17:23:36.840136553Z",
                        "EndTime": "2020-10-16T17:23:56.451535167Z"
                    }
                ]
            }
        ],
        "Version": "1.0"
    }

  2. Knowing the volume names from above, you can view the actual backup data and the cas backup data.

    ls -al /srv/nfs/kubedata/${current_namespace}-${bckpvname}-${bckvolname}/${backupid}/__default__
    ls -al /srv/nfs/kubedata/${current_namespace}-${caspvname}-${casvolname}/${backupid}/__default__

    Expected output:

    [cloud-user@rext03-0057 from35]$ ls -al /srv/nfs/kubedata/${current_namespace}-${bckpvname}-${bckvolname}/${backupid}/__default__
    total 0
    drwxr-xr-x 4 sas sas  36 Oct 19 13:13 .
    drwxr-xr-x 3 sas sas  44 Oct 19 13:15 ..
    drwxr-xr-x 2 sas sas 101 Oct 19 13:13 consul
    drwxr-xr-x 2 sas sas  93 Oct 19 13:15 postgres
    [cloud-user@rext03-0057 from35]$ ls -al /srv/nfs/kubedata/${current_namespace}-${caspvname}-${casvolname}/${backupid}/__default__
    total 0
    drwxr-xr-x 4 sas sas 35 Oct 19 13:13 .
    drwxr-xr-x 3 sas sas 25 Oct 19 13:13 ..
    drwxr-xr-x 3 sas sas 51 Oct 19 13:13 cas
    drwxr-xr-x 3 sas sas 51 Oct 19 13:14 fileSystem

Copy the Backup Package outside the cluster

Use the sas-backup-pv-copy-cleanup script

A script is provided to help manage the backup package. The script starts a Kubernetes job whose pod mounts the two backup PVCs (sas-common-backup-data and sas-cas-backup-data). The pod can then be used to copy files to and from the backup PVCs. In this section we will copy from the backup PVCs to the local file system. Keep in mind that the same technique can be used, if needed, to copy a package into a namespace prior to a restore.
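
For reference, the reverse direction would look roughly like the sketch below. This assumes the BACKUPCOPYPOD and backupid variables are set as in the steps that follow, and that a package already exists under /tmp on sasnode01; it is not part of this exercise.

    # Hypothetical reverse copy: push a backup package from the local file system into the backup PVCs
    kubectl cp /tmp/sas-common-backup-data/${backupid} ${BACKUPCOPYPOD}:/sasviyabackup/${backupid}
    kubectl cp /tmp/sas-cas-backup-data/${backupid} ${BACKUPCOPYPOD}:/cas/${backupid}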

  1. Run the script to start the job. The required parameters, in order, are the namespace, the operation, and the CAS server. To manage the backup package, use the copy operation.

    chmod 755 "/home/cloud-user/project/deploy/gelcorp/sas-bases/examples/restore/scripts/sas-backup-pv-copy-cleanup.sh"
    "/home/cloud-user/project/deploy/gelcorp/sas-bases/examples/restore/scripts/sas-backup-pv-copy-cleanup.sh" gelcorp copy default

    Expected output:

    The sas-backup-pv-copy-default-default-aa4e2b8 keeps on running until user terminates it.
    The copy pods are created, and they are in a running state.
    To check the status of copy pods, run the following command.
    kubectl -n gelcorp get pods -l sas.com/backup-job-type=sas-backup-pv-copy-cleanup | grep aa4e2b8

  2. Use the command below to get the name of the POD started by the script. This command will work when only one Job is running.

    NOTE: You also can use the command in the previous output to get the POD in the job.

    BACKUPCOPYPOD=$(kubectl -n gelcorp get pods -l sas.com/backup-job-type=sas-backup-pv-copy-cleanup --field-selector status.phase=Running --output=jsonpath={.items..metadata.name})
    echo The backup copy pod name is $BACKUPCOPYPOD

    Expected output:

    The backup copy pod name is sas-backup-pv-copy-default-default-020f428-57vlt

  3. View the CAS and Common backup mount locations inside the sas-backup-pv-copy POD. The directories are named for the BackupID of the backup package. One of the directories should match the BackupID of the ad-hoc backup.

    kubectl exec -it ${BACKUPCOPYPOD} -- ls -al /sasviyabackup
    kubectl exec -it ${BACKUPCOPYPOD} -- ls -al /cas

    Expected output:

    [cloud-user@pdcesx02193 ~]$ kubectl exec -it ${BACKUPCOPYPOD} -- ls -al /sasviyabackup
    Defaulted container "sas-backup-pv-copy-cleanup-job" out of: sas-backup-pv-copy-cleanup-job, sas-certframe (init)
    total 0
    drwxrwxrwx 3 root root 42 Nov 26 01:02 .
    drwxr-xr-x 1 root root 76 Nov 27 17:52 ..
    drwxr-xr-x 3 sas  sas  44 Nov 26 01:07 2023-11-26T01_02_07_607_0700
    [cloud-user@pdcesx02193 ~]$ kubectl exec -it ${BACKUPCOPYPOD} -- ls -al /cas
    Defaulted container "sas-backup-pv-copy-cleanup-job" out of: sas-backup-pv-copy-cleanup-job, sas-certframe (init)
    total 0
    drwxrwxrwx 3 root root 42 Nov 26 01:02 .
    drwxr-xr-x 1 root root 76 Nov 27 17:52 ..
    drwxr-xr-x 3 sas  sas  25 Nov 26 01:02 2023-11-26T01_02_07_607_0700

Copy Backup Package

  1. In this step we create the directories on sasnode01 to which we will copy the backup package.

    mkdir -p /tmp/sas-common-backup-data
    mkdir -p /tmp/sas-cas-backup-data
  2. Get the BACKUP ID of the last completed full backup. The backup ID is stored in a file on the sas-common-backup-data PV; we can access that file through the copy/cleanup pod.

    BACKUPCOPYPOD=$(kubectl -n gelcorp get pods -l sas.com/backup-job-type=sas-backup-pv-copy-cleanup --field-selector status.phase=Running --output=jsonpath={.items..metadata.name})
    LASTBACKUPFILE=$(kubectl exec  ${BACKUPCOPYPOD} -c sas-backup-pv-copy-cleanup-job  --  find /sasviyabackup/LastCompletedFullBackups -type f -name '*')
    backupid=$(kubectl exec ${BACKUPCOPYPOD} -c sas-backup-pv-copy-cleanup-job -- cat  ${LASTBACKUPFILE} | jq -r .backupID )
    echo Last Completed Backup File=${LASTBACKUPFILE} BACKUPID=${backupid}

    Expected output:

    Last Completed Backup File=/sasviyabackup/LastCompletedFullBackups/3fa746b9f419285865e9cef3bc4843dc6a57abb2 BACKUPID=20240904-143543F

  3. Copy the SAS Viya backup to a location outside the cluster.

    kubectl cp ${BACKUPCOPYPOD}:/sasviyabackup/${backupid} /tmp/sas-common-backup-data/${backupid}
    kubectl cp ${BACKUPCOPYPOD}:/cas/${backupid} /tmp/sas-cas-backup-data/${backupid}

    Expected output:

    Defaulted container "sas-backup-pv-copy-cleanup-job" out of: sas-backup-pv-copy-cleanup-job, sas-certframe (init)
    tar: Removing leading `/' from member names
    [cloud-user@rext03-0006 gelcorp]$ kubectl cp ${BACKUPCOPYPOD}:/cas/${backupid} /tmp/sas-cas-backup-data/${backupid}
    Defaulted container "sas-backup-pv-copy-cleanup-job" out of: sas-backup-pv-copy-cleanup-job, sas-certframe (init)
    tar: Removing leading `/' from member names
  4. Check the local file-system to see if the package has been copied.

    tree -L 4 "/tmp/sas-common-backup-data/"
    tree -L 4 "/tmp/sas-cas-backup-data/"

    Expected output:

    /tmp/sas-common-backup-data/
    └── 20240731-143921F
        ├── default
        │   ├── consul
        │   │   ├── configuration.dmp
        │   │   ├── definition.dmp
        │   │   ├── genericProperties.dmp
        │   │   └── status.json
        │   └── postgres
        │       ├── SharedServices_pg_dump.dmp
        │       ├── SharedServices_pg_dump.log
        │       └── status.json
        └── status.json

    5 directories, 7 files
    [cloud-user@pdcesx03198 gelcorp]$ tree -L 4 "/tmp/sas-cas-backup-data/"
    /tmp/sas-cas-backup-data/
    └── 20240731-143921F
        └── default
            ├── cas
            │   ├── cas-shared-default
            │   └── status.json
            └── fileSystem
                ├── cas-shared-default
                └── status.json

    6 directories, 2 files

  5. When you are done you can delete the job, or you can leave it running so that its pod remains available for future use. In our environment we will leave the job running so that we can use the pod later to access the backup packages.
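
    If you did want to clean up instead, a sketch of removing the copy/cleanup jobs (and their pods) by label:

    # Deletes all copy/cleanup jobs created by the sas-backup-pv-copy-cleanup script
    kubectl delete jobs -l sas.com/backup-job-type=sas-backup-pv-copy-cleanup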

Review

  1. The backup package can be copied using kubectl and the provided POD. If we wanted to restore this backup in a different Viya environment we could use the copy POD and the reverse kubectl cp commands to copy the package into the backup PVs in that cluster.

  2. You have completed a backup of your Viya environment and have copied the backup package outside of the cluster.


SAS Viya Administration Operations
Lesson 04, Section 3 Exercise: Restore a Backup

Restore a backup

In this hands-on you will restore a Viya Backup.

Select a Backup Package

  1. In a MobaXterm session on sasnode01, set the current namespace to the target deployment, and identify the sas-viya CLI profile to use.

    gel_setCurrentNamespace gelcorp
    /opt/pyviyatools/loginviauthinfo.py
  2. Determine the backup package to restore. The BackupID of the ad hoc backup run in the previous exercise can be retrieved from a label set on the completed backup job.

    backupid=$(yq4 eval '(.metadata.labels."sas.com/sas-backup-id")' <(kubectl get job sas-scheduled-backup-job-adhoc-001 -o yaml))
    echo ${backupid}
  3. If you do not know the name of the job that ran, or if it has been deleted, you can get the backup ID of the last successful backup from a file in the sas-common-backup-data PVC. The file can be accessed from the cleanup/copy pod.

    BACKUPCOPYPOD=$(kubectl get pods -l sas.com/backup-job-type=sas-backup-pv-copy-cleanup --field-selector status.phase=Running --output=jsonpath={.items..metadata.name})
    LASTBACKUPFILE=$(kubectl exec  ${BACKUPCOPYPOD} -c sas-backup-pv-copy-cleanup-job  --  find /sasviyabackup/LastCompletedFullBackups -type f -name '*')
    backupid=$(kubectl exec ${BACKUPCOPYPOD} -c sas-backup-pv-copy-cleanup-job -- cat  ${LASTBACKUPFILE} | jq -r .backupID )
    echo Last Completed Backup File=${LASTBACKUPFILE} BACKUPID=${backupid}

    Expected output:

    Last Completed Backup File=/sasviyabackup/LastCompletedFullBackups/3fa746b9f419285865e9cef3bc4843dc6a57abb2 BACKUPID=20240904-143543F

Remove Content

In this section, we will delete a folder and a Caslib to simulate a data loss. We will then create a basic inventory of the user content and the caslibs in the Viya environment. After we restore the backup we can check that the deleted folder and caslib were restored from the backup.

  1. Delete folder.

    gel_sas_viya -y folders delete --path /gelcontent/GELCorp/hr/reports --recursive

    Expected output

    The folder was deleted successfully.
  2. Delete CASlib

    gel_sas_viya -y cas caslibs delete --server cas-shared-default --caslib hrdl --su

    Expected output

    The caslib "hrdl" has been deleted from server "cas-shared-default".
  3. Create a basic inventory of user folder content. The inventory is written to a CSV file for use in the comparison after the restore.

    /opt/pyviyatools/listcontent.py -f /gelcontent -o csv > /tmp/contentbefore-restore.csv
  4. Create a basic inventory of caslibs. The inventory is written to a CSV file for use in the comparison after the restore.

    /opt/pyviyatools/listcaslibs.py > /tmp/casbefore-restore.csv

Restore the Backup

In this section we will restore the backup. The restore process happens in three steps.

  • Step 1: Update the restore configMap.
  • Step 2: Run the restore job, which restores the SAS Configuration Server and SAS Infrastructure Data Server and stops the CAS server(s).
  • Step 3: Clear the CAS PVCs and restart CAS in RESTORE mode.

Step 1: Update the restore configMap

  1. Identify the sas-restore-job-parameters configMap that needs to be modified. This step returns the config map for the restore job.

    restore_config_map=$(kubectl describe cronjob sas-restore-job | grep -i sas-restore-job-parameters | awk '{print $1}'|head -n 1)
    echo The current restore Config Map is: $restore_config_map

    Expected output:

    The current restore Config Map is: sas-restore-job-parameters-hgd4ftbmmm
  2. Edit the configmap to set the restore parameters. Set the SAS_BACKUP_ID to the backup id of the package to restore, the SAS_DEPLOYMENT_START_MODE to RESTORE and the SAS_LOG_LEVEL to DEBUG.

    kubectl patch cm $restore_config_map --type json -p '[ {"op": "replace", "path": "/data/SAS_BACKUP_ID", "value":"'${backupid}'"}, {"op": "replace", "path": "/data/SAS_DEPLOYMENT_START_MODE", "value":"RESTORE" }, {"op": "replace", "path": "/data/SAS_LOG_LEVEL", "value":"DEBUG" }]'

    Expected output:

    configmap/sas-restore-job-parameters-hgd4ftbmmm patched
  3. View the updated config map. Make sure that the SAS_BACKUP_ID and SAS_DEPLOYMENT_START_MODE parameters are correctly set.

    kubectl describe cm $restore_config_map

    Expected output

    Name:         sas-restore-job-parameters-hgd4ftbmmm
    Namespace:    gelcorp
    Labels:       app.kubernetes.io/name=sas-restore-job
                sas.com/admin=cluster-local
                sas.com/deployment=sas-viya
    Annotations:  <none>
    
    Data
    ====
    SAS_SERVICE_NAME:
    ----
    sas-restore-job
    SG_PROJECT:
    ----
    backup
    OAUTH2_CLIENT_ACCESSTOKENVALIDITY:
    ----
    72000
    SAS_BACKUP_ID:
    ----
    2023-10-24T08_11_39_639_0700
    SAS_CONTEXT_PATH:
    ----
    restore
    SAS_DEPLOYMENT_START_MODE:
    ----
    RESTORE
    SAS_RESTORE_JOB_DU_NAME:
    ----
    sas-restore-job
    
    BinaryData
    ====
    
    Events:  <none>

Step 2: Run the restore job to restore the SAS Infrastructure Data Server and Configuration Server

  1. Start the restore job from the restore cronJob. This process restores the SAS Infrastructure Data Server and the SAS Configuration Server. In addition, it stops the CAS server to prepare for the CAS restore.

    kubectl create job --from=cronjob/sas-restore-job sas-restore-job

    Expected output:

    job.batch/sas-restore-job created
  2. Check that the restore job is running.

    kubectl get jobs -l sas.com/backup-job-type=restore -L sas.com/sas-backup-id,sas.com/backup-job-type,sas.com/sas-restore-status

    Expected output:

    NAME              COMPLETIONS   DURATION   AGE     SAS-BACKUP-ID                  BACKUP-JOB-TYPE   SAS-RESTORE-STATUS
    sas-restore-job   0/1           2m55s      2m55s   2021-01-11T15_43_38_638_0700   restore           Running
  3. View the log of the restore job as it runs. You will get the command prompt back when the restore job completes.

    kubectl logs -l job-name=sas-restore-job -f -c sas-restore-job | gel_log
  4. Check for specific messages in the log of the restore job to check the status.

    kubectl logs -l "job-name=sas-restore-job" -c sas-restore-job --tail 1000 | gel_log | grep "restore job completed successfully" -B 3 -A 1

    Expected output:

    INFO  2023-10-24 08:48:27.196 +0000 [sas-restorejob] - Successfully completed post-restore operations to enable CAS restore.
    INFO  2023-10-24 08:48:27.208 +0000 [sas-restorejob] - Updating the Kubernetes job sas-restore-job with given label.
    INFO  2023-10-24 08:48:27.220 +0000 [sas-restorejob] - Updated the Kubernetes job the sas-restore-job.
    INFO  2023-10-24 08:48:27.220 +0000 [sas-restorejob] - The restore job completed successfully.
    INFO  2023-10-24 08:48:27.234 +0000 [sas-restorejob] - Updating the Kubernetes job sas-restore-job with given label.

Step 3: Restore the CAS Server

The process will start the CAS Server where data and configuration will be migrated during startup.

  1. The restore job should have stopped the CAS server. The CAS server must be stopped in order to perform the CAS restore, so let's check that CAS is not running.

    kubectl get pods --selector="app.kubernetes.io/managed-by==sas-cas-operator"

    Expected output:

    No resources found in target namespace.
  2. The process uses two provided CAS scripts.

    • sas-backup-pv-copy-cleanup.sh deletes the existing data from the CAS PV’s.
    • scale-up-cas.sh starts the CAS server(s) in RESTORE mode.

    Make the provided CAS scripts executable.

    chmod +x ~/project/deploy/${current_namespace}/sas-bases/examples/restore/scripts/*.sh
  3. The CAS file system restore requires a clean volume. Run the sas-backup-pv-copy-cleanup script to clean up the CAS PVs. This step deletes the existing data in the CAS permstore and CAS data PVCs. The parameters to pass, in order, are the namespace, the operation, and a comma-delimited list of CAS servers.

    ~/project/deploy/${current_namespace}/sas-bases/examples/restore/scripts/sas-backup-pv-copy-cleanup.sh gelcorp remove "default"

    Expected output:

    The cleanup pods are created, and they are in a running state.
    Ensure that all pods are completed. To check the status of the cleanup pods, run the following command.
    kubectl -n gelcorp get pods -l sas.com/backup-job-type=sas-backup-pv-copy-cleanup | grep d2c055a
  4. In the output from the previous step, the last line is a kubectl command that displays the status of the cleanup. Copy and run that kubectl command to check whether the cleanup pod is in a completed state. Expected output:

    sas-backup-pv-cleanup-default-default-d2c055a-n98wk   0/1     Completed   0          3m45s
  5. Use the provided script to start the CAS server, which initiates the CAS restore.

    ~/project/deploy/${current_namespace}/sas-bases/examples/restore/scripts/scale-up-cas.sh gelcorp "default"

    Expected output:

    casdeployment.viya.sas.com/default patched
  6. Check the results. First, make sure the CAS server is up.

    kubectl wait --for=condition=ready --timeout=600s pod -l "app.kubernetes.io/instance=default"

    If you see messages like this, reissue the same command until the prompt does not immediately return control to you. This simply means that the CAS pods are not yet running. It can take 2-3 minutes before the CAS pods are able to respond.

    error: no matching resources found

    Eventually, you should see confirmation that the CAS pods are up.

    pod/sas-cas-server-default-controller condition met
  7. Check the log to see if the CAS server performed the restore. The logs should show the start of the restore process that restores the backup content to the target CAS persistent volumes.

    kubectl logs sas-cas-server-default-controller -c sas-cas-server  | gel_log | grep -A 10 "RESTORE"

    Expected output:

    Mon Jan 18 03:07:34 UTC 2021 - INFO: SAS_DEPLOYMENT_START_MODE is set to RESTORE, Initiating restore process
    Mon Jan 18 03:07:34 UTC 2021 - -------------------------------------------------
    Mon Jan 18 03:07:34 UTC 2021 - INFO: Evaluating backup content for restore
    Mon Jan 18 03:07:34 UTC 2021 - INFO: Listing cas data volume contents at: /cas/data
    Mon Jan 18 03:07:34 UTC 2021 - INFO: Initiated restoring files
    Mon Jan 18 03:07:34 UTC 2021 - INFO: copying volume data
    '/sasviyabackup/2021-01-18T01_48_36_636_0700/__default__/fileSystem/cas-shared-default/cas-default-data-volume/apps' -> '/cas/data/apps'
    '/sasviyabackup/2021-01-18T01_48_36_636_0700/__default__/fileSystem/cas-shared-default/cas-default-data-volume/apps/projects' -> '/cas/data/apps/projects'
    '/sasviyabackup/2021-01-18T01_48_36_636_0700/__default__/fileSystem/cas-shared-default/cas-default-data-volume/apps/sashealth' -> '/cas/data/apps/sashealth'
    '/sasviyabackup/2021-01-18T01_48_36_636_0700/__default__/fileSystem/cas-shared-default/cas-default-data-volume/caslibs' -> '/cas/data/caslibs'
    '/sasviyabackup/2021-01-18T01_48_36_636_0700/__default__/fileSystem/cas-shared-default/cas-default-data-volume/caslibs/modelMonitorLibrary' -> '/cas/data/caslibs/modelMonitorLibrary'
  8. Check to see that the permstore was restored.

    kubectl logs sas-cas-server-default-controller -c sas-cas-server -n ${current_namespace} | grep -B 1 -A 5 "Restoring CAS permstore"

    Expected output:

    [cloud-user@pdcesx02092 gelcorp]$ kubectl logs sas-cas-server-default-controller -c sas-cas-server -n ${current_namespace} | grep -B 1 -A 5 "Restoring CAS permstore"
    {"version": 1, "timeStamp": "2022-11-18T21:40:35.588564+00:00", "level": "info", "source": "cas-shared-default", "message": "SAS_DEPLOYMENT_START_MODE is set to RESTORE, Initiating restore process.", "properties": {"pod": "sas-cas-server-default-controller", "caller": "restore_cas.sh:221"}}
    {"version": 1, "timeStamp": "2022-11-18T21:40:35.669603+00:00", "level": "info", "source": "cas-shared-default", "message": "Restoring CAS permstore volume contents", "properties": {"pod": "sas-cas-server-default-controller", "caller": "restore_cas.sh:137"}}
    {"version": 1, "timeStamp": "2022-11-18T21:40:35.746658+00:00", "level": "info", "source": "cas-shared-default", "message": "Target CAS permstore volume contents at /cas/permstore (Should be empty): \n", "properties": {"pod": "sas-cas-server-default-controller", "caller": "restore_cas.sh:152"}}
    {"version": 1, "timeStamp": "2022-11-18T21:40:35.856896+00:00", "level": "info", "source": "cas-shared-default", "message": "changed ownership of '/cas/permstore/primaryctrl' from root:root to 1001:1001", "properties": {"pod": "sas-cas-server-default-controller", "caller": "restore_cas.sh:174"}}
    {"version": 1, "timeStamp": "2022-11-18T21:40:35.933131+00:00", "level": "info", "source": "cas-shared-default", "message": "Copying backup permstore volume contents from source /sasviyabackup/2022-11-18T20_05_28_628_0700/__default__/cas/cas-shared-default to target /cas/permstore/primaryctrl", "properties": {"pod": "sas-cas-server-default-controller", "caller": "restore_cas.sh:177"}}
    {"version": 1, "timeStamp": "2022-11-18T21:40:36.444409+00:00", "level": "info", "source": "cas-shared-default", "message": "sending incremental file list", "properties": {"pod": "sas-cas-server-default-controller", "caller": "restore_cas.sh:196"}}
    {"version": 1, "timeStamp": "2022-11-18T21:40:36.528909+00:00", "level": "info", "source": "cas-shared-default", "message": "06499622-f1d7-7646-a32c-68c301a729a4.admitm", "properties": {"pod": "sas-cas-server-default-controller", "caller": "restore_cas.sh:196"}}
  9. Reset the SAS restore job configMap parameters and check that the command worked.

    kubectl patch cm $restore_config_map --type json -p '[{ "op": "remove", "path": "/data/SAS_BACKUP_ID" },{"op": "remove", "path": "/data/SAS_DEPLOYMENT_START_MODE"}]'
    kubectl describe cm $restore_config_map

    Expected output:

    configmap/sas-restore-job-parameters-hgd4ftbmmm patched
    
    Name:         sas-restore-job-parameters-hgd4ftbmmm
    Namespace:    gelcorp
    Labels:       app.kubernetes.io/name=sas-restore-job
                sas.com/admin=cluster-local
                sas.com/deployment=sas-viya
    Annotations:  <none>
    
    Data
    ====
    OAUTH2_CLIENT_ACCESSTOKENVALIDITY:
    ----
    72000
    SAS_CONTEXT_PATH:
    ----
    restore
    SAS_RESTORE_JOB_DU_NAME:
    ----
    sas-restore-job
    SAS_SERVICE_NAME:
    ----
    sas-restore-job
    SG_PROJECT:
    ----
    backup
    
    BinaryData
    ====
    
    Events:  <none>

Validate

In this section we will check that our content has been restored from the backup.

  1. After the restore, create a basic inventory of user folder content.

    /opt/pyviyatools/listcontent.py -f /gelcontent -o csv > /tmp/contentafter-restore.csv
  2. After the restore, create a basic inventory of caslibs.

    /opt/pyviyatools/listcaslibs.py > /tmp/casafter-restore.csv
  3. Compare the two inventory files for CAS. File 1 was created before the restore and file 2 after the restore. Notice the CAS library hrdl has been restored from the backup.

    /opt/pyviyatools/comparecontent.py --file1 /tmp/casbefore-restore.csv --file2 /tmp/casafter-restore.csv

    Expected output:

    NOTE: Compare the content of file1=/tmp/casbefore-restore.csv and file2=/tmp/casafter-restore.csv
    NOTE: SUMMARY
    NOTE: there is nothing in file2 that is not in file1.
    NOTE: DETAILS
    NOTE: The content listed below is in file2 but not in file1:
    server,caslib
    
    cas-shared-default,hrdl
  4. Compare the two inventory files of the folder content. File 1 was created before the restore and file 2 after the restore. Notice that the folder /gelcontent/GELCorp/HR/Reports and its content have been restored from the backup.

    /opt/pyviyatools/comparecontent.py --file1 /tmp/contentbefore-restore.csv --file2 /tmp/contentafter-restore.csv

    Expected output:

    NOTE: Compare the content of file1=/tmp/contentbefore-restore.csv and file2=/tmp/contentafter-restore.csv
    NOTE: SUMMARY
    NOTE: there is nothing in file2 that is not in file1.
    NOTE: DETAILS
    NOTE: The content listed below is in file2 but not in file1:
    id ,pathtoitem ,name ,contentType ,createdBy ,creationTimeStamp ,modifiedBy ,modifiedTimeStamp ,uri
    
    "2d22881b-7529-4e13-b349-48f223e32861","/gelcontent/GELCorp/HR/Reports/","Employee attrition factors heatmap","report","sasadm","2017-10-25T07:34:25.488Z","sasadm","2024-09-05T02:12:20.089Z","/reports/reports/3a95ba0d-d2bd-4897-b379-3f7e97a55e83"
    
    "e43c09a3-9793-4df0-b6e5-e9adcc4abc6d","/gelcontent/GELCorp/HR/Reports/","Employee Attrition Overview","report","sasadm","2017-10-25T08:13:57.1Z","sasadm","2024-09-05T02:12:20.087Z","/reports/reports/61e4e9e1-0b8a-4d28-89f5-3d98cf90cdbd"
    
    "b1f234fa-a77b-4876-b249-337c6f944c6e","/gelcontent/GELCorp/HR/Reports/","Employee attrition factors correlation","report","sasadm","2017-10-25T07:29:24.103Z","sasadm","2024-09-05T02:12:20.087Z","/reports/reports/c809ca34-ab79-47f8-ba25-b24cf3ae0740"
    
    "5b905498-8a27-4ecd-8065-2aaf8c001138","/gelcontent/GELCorp/HR/Reports/","Employee measure histograms","report","sasadm","2017-10-25T07:38:27.981Z","sasadm","2024-09-05T02:12:20.024Z","/reports/reports/c8869178-3495-44f8-bc48-c68c8129835c"
    
    "03bbb4c9-dc48-40dc-bc16-6f9bfb530098","/gelcontent/GELCorp/HR/","Reports","folder","geladm","2024-09-05T02:11:11.334785Z","geladm","2024-09-05T02:11:11.334786Z","/folders/folders/b7b4b719-7a76-4e91-bb41-0252b55abe48"

Review

In this hands-on you selected a completed SAS Viya backup and restored it to your Viya environment.


Lesson 05

SAS Viya Administration Operations
Lesson 05, Section 0 Exercise: Configure a Reusable Compute Context

Create a reusable compute context with pool of compute servers

In this exercise, we create a reusable compute context for HR members to use, which runs as user hrservice, another member of HR.


Optional: Try existing SAS Studio compute context as Henrik

This step is optional.

Note: You may have already done something like this task in an earlier hands-on exercise. Feel free to skip this task and proceed to the next one if you like.

  1. Run the following command in MobaXterm:

    id Henrik

    Expected results:

    uid=4015(Henrik) gid=2003(sasusers) groups=2003(sasusers),3001(HR)

    From this you can see that Henrik’s uid number is 4015.

  2. Open SAS Studio and log in as Henrik:lnxsas.

    Tip: To generate the URL if you need it:

    gellow_urls | grep "SAS Studio"
  3. Make sure your compute session is running under the ‘SAS Studio compute context’: change the compute context if necessary. Wait for the session to start, if it has not already started.

  4. From the menu choose New > SAS Program, or click the ‘Program in SAS’ button in the Start Page to open a new SAS program pane.

    SAS Studio
  5. Check which user your compute session is running as. In SAS Studio’s SAS Program tab, paste and run the following code:

    %put NOTE: I am &_CLIENTUSERNAME;
    %put NOTE: My UID is &SYSUSERID;
    %put NOTE: My home directory is &_USERHOME;
    %put NOTE: My Compute POD IS &SYSHOSTNAME;

    Expected results - both the automatic macro variables _CLIENTUSERNAME and SYSUSERID return a value of Henrik:

    1    /* region: Generated preamble */
    79
    80   %put NOTE: I am &_CLIENTUSERNAME;
    NOTE: I am Henrik
    81   %put NOTE: My UID is &SYSUSERID;
    NOTE: My UID is Henrik
    82   %put NOTE: My home directory is &_USERHOME;
    NOTE: My home directory is /shared/gelcontent/home/Henrik
    83   %put NOTE: My Compute POD IS &SYSHOSTNAME;
    NOTE: My Compute POD IS sas-compute-server-4009df27-36f8-4656-9d19-6b2be004c8c2-34
    84
    85   /* region: Generated postamble */
    96
  6. Back in MobaXterm, connected to sasnode01 as cloud-user, exec into the running launcher pod and see the UID and GID of the user inside the pod and the files.

    kubectl exec -it $(kubectl get pod -l launcher.sas.com/requested-by-client=sas.studio,launcher.sas.com/username=Henrik --output=jsonpath={.items..metadata.name}) -- bash -c "id && ls -al /gelcontent/home/Henrik"

    Expected output - notice that the uid from the id command is also 4015, i.e. Henrik, that all the files are owned by user 4015, and that the home directory contains a file you saved there as Henrik in an earlier exercise:

    uid=4015 gid=2003 groups=2003,3000,3001
    total 20
    drwx------  4 4015 2003  125 Sep 23 16:11 .
    drwxr-xr-x 35 root root 4096 Sep 17 10:32 ..
    -rw-------  1 4015 2003   18 Sep 17 10:32 .bash_logout
    -rw-------  1 4015 2003  193 Sep 17 10:32 .bash_profile
    -rw-------  1 4015 2003  231 Sep 17 10:32 .bashrc
    drwx------  2 4015 2003    6 Sep 23 16:11 casuser
    -rwx------  1 4015 2003  367 Sep 17 11:22 gel_launcher_details.sas
    drwx------  4 4015 2003   39 Sep 17 10:32 .mozilla

    This gives us a baseline to compare with later on: we are definitely running this SAS Programming Run-Time session as Henrik.


Store credentials for hrservice in Compute Service

This will be the first step in the process of letting Henrik run a SAS Compute context with shared credentials for another account, hrservice.

  1. Use the gel_sas_viya script to run the sas-viya CLI’s compute plugin to list existing shared credentials - we don’t expect there to be any yet:

    gel_sas_viya compute credentials list

    Expected output:

    There are no shared service account credentials.
  2. Use gel_sas_viya to create (i.e. store) a shared credential for user hrservice:lnxsas.

    Note: Here we are passing the credentials directly in the script. To be more secure, you could store them in a protected file readable only to the user who runs the script (e.g. with permissions of 0600). Or, if you are creating the credentials as a one-off task, you could omit the -u (--user) and/or -p (--password) parameters from the command to be prompted for them interactively.

    gel_sas_viya compute credentials create -u hrservice -p lnxsas -d "Shared service account called hrservice"

    Expected output:

    2024/09/24 15:49:34 The shared service account credential for hrservice was created successfully.
  3. Then check that the credentials have been created and stored:

    gel_sas_viya compute credentials list

    Expected output:

    Shared Service Account Credentials:
    1. hrservice - compute-password - Shared
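
    If you later need to remove or update a stored credential, check what the credentials plug-in supports in your CLI release before relying on a specific subcommand. A quick sketch:

    # Display the operations supported by the compute credentials plug-in (the exact set depends on your CLI release)
    sas-viya compute credentials --help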

Create compute context to run as hrservice

To configure SAS Programming Run-Time servers as reusable, they must first be configured to run under a shared account, like the one for which we just saved credentials.

Tip: If you used Chrome as your main browser for SAS Studio so far, we suggest you use Firefox for this task, or the other way around. This allows you to be logged in as a user (Henrik, Ahmed etc.) in one browser, and as geladm in the other, without having to log out and log in again so often.

  1. In a different browser to the one you used to open SAS Studio as Henrik earlier, open SAS Environment Manager and log in as geladm:lnxsas.

    Tip: To generate the URL if you need it:

    gellow_urls | grep "SAS Environment Manager"

    As always, opt in to the SASAdministrators assumable group, and if prompted click ‘Skip setup’ and ‘Let’s go’.

  2. In SAS Environment Manager, as geladm, open the Contexts page.

    Context page icon
  3. Select the Compute contexts view.

  4. Right-click the SAS Studio Compute context and choose ‘Copy’ from the popup menu:

    Environment Manager Copy Contexts button
  5. In the New Compute Context dialog, set the properties of the new context to the following values. Add an attribute for runServerAs=hrservice:

    Property             Value
    Name:                SAS Studio compute context as hrservice
    Description:         A compute context for SAS Studio which allows members of HR to run code as hrservice.
    Launcher context:    SAS Studio launcher context
    Identity type:       Identities
    Groups:              HR
    Attributes:          runServerAs = hrservice
    Resources:           shrfmt
                         Shared formats - Base SAS I/O Engine
                         (Present only if you created it in an earlier exercise)
    Advanced:
      SAS options:       (none)
      Autoexec content:  (none)

    This is what the Basic tab of the dialog should look like:

    New compute context dialog
  6. Click Save, and after a moment, the new compute context should be created:

    List of compute contexts, including the new one

Test new compute context runs as hrservice

  1. Back in your main browser (e.g. Chrome), in SAS Studio, still logged in as Henrik, click the server context button in the top right-hand corner of SAS Studio to view the list of available contexts.

  2. Scroll down if necessary to see your new context “SAS Studio compute context as hrservice”.

  3. Choose SAS Studio compute context as hrservice. If prompted, click ‘Change’.

    After a moment a compute session under the new compute context will start.

  4. In a SAS Program tab, paste and run the same code you ran earlier:

    %put NOTE: I am &_CLIENTUSERNAME;
    %put NOTE: My UID is &SYSUSERID;
    %put NOTE: My home directory is &_USERHOME;
    %put NOTE: My Compute POD IS &SYSHOSTNAME;

    Expected results:

    1    /* region: Generated preamble */
    79
    80   %put NOTE: I am &_CLIENTUSERNAME;
    NOTE: I am Henrik
    81   %put NOTE: My UID is &SYSUSERID;
    NOTE: My UID is hrservice
    82   %put NOTE: My home directory is &_USERHOME;
    NOTE: My home directory is /shared/gelcontent/home/hrservice
    83   %put NOTE: My Compute POD IS &SYSHOSTNAME;
    NOTE: My Compute POD IS sas-compute-server-3f376ec7-504f-451e-bc6d-1b8e70d4afd3-37
    84
    85   /* region: Generated postamble */
    96

    Your _CLIENTUSERNAME is Henrik, but now your SYSUSERID is hrservice, and ‘your’ home directory is /shared/gelcontent/home/hrservice instead of /shared/gelcontent/home/Henrik.

  5. For added confirmation, exec into the running launcher pod and see the UID and GID of the user inside the pod and the files.

    Note: If you are paying very close attention, you may see that the inner kubectl command here is looking for slightly different labels on the sas-launcher pod than we looked for earlier. That’s because the labels it has are different, now that it is running as hrservice instead of the user who logged in. Earlier, we looked for Henrik’s launcher pod with:

    • -l launcher.sas.com/requested-by-client=sas.studio,launcher.sas.com/username=Henrik

    Now we are looking for a pod with these labels:

    • -l launcher.sas.com/requested-by-client=sas.compute,launcher.sas.com/username=hrservice

    Also, notice that we are listing the contents of hrservice’s home directory, instead of Henrik’s. You can see that the files in that home directory are different to those we saw earlier.

    kubectl exec -it $(kubectl get pod -l launcher.sas.com/requested-by-client=sas.compute,launcher.sas.com/username=hrservice --output=jsonpath={.items..metadata.name}) -c "sas-programming-environment" -- bash -c "id && ls -al /shared/gelcontent/home/hrservice"

    Expected output:

    uid=3001 gid=2003 groups=2003
    total 12
    drwx------ 3 3001 2003  78 Sep 17 10:32 .
    drwxr-xr-x 3 root root  23 Sep 25 12:43 ..
    -rw------- 1 3001 2003  18 Sep 17 10:32 .bash_logout
    -rw------- 1 3001 2003 193 Sep 17 10:32 .bash_profile
    -rw------- 1 3001 2003 231 Sep 17 10:32 .bashrc
    drwx------ 4 3001 2003  39 Sep 17 10:32 .mozilla

    You can see that this time, our uid is hrservice’s uid, 3001, instead of Henrik’s uid, 4015.

  6. Run an id command to verify that UID 3001 is the hrservice.

    id 3001

    You should see that UID 3001 is the hrservice.

    uid=3001(hrservice) gid=2003(sasusers) groups=2003(sasusers)

    This demonstrates two things.

    First, that Henrik has started a compute session in SAS Studio which runs as hrservice.

    Second, you can tell who Henrik is running his session as by inspecting the value of both &SYSUSERID and &_USERHOME, which indicates that the home directory is set to /shared/gelcontent/home/hrservice.

    And one more time to make the dual identity absolutely clear (Henrik running as hrservice), click on the user menu in the very top right of the application window. We are still logged in to SAS Studio as Henrik:

    SAS Studio user is still Henrik

Make servers that run with the new compute context reusable

  1. In your alternate browser (e.g. Firefox if you were mainly using Chrome), open SAS Environment Manager and sign in as geladm:lnxsas, if you aren’t already signed in.

  2. Return to the Contexts page, and the Compute contexts view.

  3. Edit your “SAS Studio compute context as hrservice” compute context. Add a new attribute, as follows:

    reuseServerProcesses=true

    Here are all the properties, with the new attribute marked (new):

    Property             Value
    Name:                SAS Studio compute context as hrservice
    Description:         A compute context for SAS Studio which allows members of HR to run code as hrservice.
    Launcher context:    SAS Studio launcher context
    Identity type:       Identities
    Groups:              HR
    Attributes:          runServerAs = hrservice
                         reuseServerProcesses = true   (new)
    Advanced:
      SAS options:       (none)
      Autoexec content:  (none)

    The modified compute context should look like this:

    Note: The attributes may be listed in the reverse order in your environment when you first add a new attribute. The order of the attributes is not important.

    SAS Studio new compute context is also reusable
  4. Save your change.

    The documentation describes some other properties which you can also set, if you don’t like the default values:

    Attribute               Default   Notes
    serverInactiveTimeout   600       Determines the time the server can remain idle before it is terminated. A server is considered to be idle if there is no active session in the server. The default value is 600 seconds (10 minutes).
    serverReuseLimit        (none)    Determines the number of times a server can be reused before it is terminated. If this attribute is not set, there is no limit on how many times the server can be reused.

Show that servers run with new compute context are reusable

  1. In your main browser (e.g. Chrome), sign out of SAS Studio. Click ‘Discard and Exit’ if prompted.

  2. Sign in again as Henrik:lnxsas.

    Tip: Really do sign out, and sign back in again. Do not just click the browser’s refresh button, and do not just choose Options > Reset SAS session.

    It appears that just clicking the browser refresh button, or resetting the session in SAS Studio, normally results in you getting a compute session in a new pod, instead of in the same pod as before.

    I think this is because the old compute session does not have enough time to end when you refresh the browser page or reset the session. When your refreshed SAS Studio requests a compute session, the old compute session is either still running or still terminating. So SAS Launcher has to start your new session in a new pod. If you sign out of SAS Studio, I think enough time elapses for your old SAS session to end, leaving the existing pod available to be re-used, and you normally get a session running in the same pod again.

  3. Ensure that the current compute context is still “SAS Studio compute context as hrservice”, and open a new program window again. Run the usual code:

    %put NOTE: I am &_CLIENTUSERNAME;
    %put NOTE: My UID is &SYSUSERID;
    %put NOTE: My home directory is &_USERHOME;
    %put NOTE: My Compute POD IS &SYSHOSTNAME;

    Expected results:

    1    /* region: Generated preamble */
    79
    80   %put NOTE: I am &_CLIENTUSERNAME;
    NOTE: I am Henrik
    81   %put NOTE: My UID is &SYSUSERID;
    NOTE: My UID is hrservice
    82   %put NOTE: My home directory is &_USERHOME;
    NOTE: My home directory is /shared/gelcontent/home/hrservice
    83   %put NOTE: My Compute POD IS &SYSHOSTNAME;
    NOTE: My Compute POD IS sas-compute-server-470fe399-46f2-463d-ae50-90686543615d-41
    84
    85   /* region: Generated postamble */
    96

    Make a note of the Compute pod name as you did before - for example, copy it from SAS Studio and paste it into a text editor like Notepad++.

  4. Sign out of SAS Studio (‘Discard and Exit’ if prompted) and sign in yet again as Henrik:lnxsas.

  5. Once again, check that the current compute context is still “SAS Studio compute context as hrservice”, and open a new program window again.

    Note: This time, your compute session may start a little more quickly than it did before!

  6. Run the usual code:

    %put NOTE: I am &_CLIENTUSERNAME;
    %put NOTE: My UID is &SYSUSERID;
    %put NOTE: My home directory is &_USERHOME;
    %put NOTE: My Compute POD IS &SYSHOSTNAME;

    Expected results:

    1    /* region: Generated preamble */
    79
    80   %put NOTE: I am &_CLIENTUSERNAME;
    NOTE: I am Henrik
    81   %put NOTE: My UID is &SYSUSERID;
    NOTE: My UID is hrservice
    82   %put NOTE: My home directory is &_USERHOME;
    NOTE: My home directory is /shared/gelcontent/home/hrservice
    83   %put NOTE: My Compute POD IS &SYSHOSTNAME;
    NOTE: My Compute POD IS sas-compute-server-470fe399-46f2-463d-ae50-90686543615d-41
    84
    85   /* region: Generated postamble */
    96

    Make a note of the Compute pod name as you did before - for example, copy it from SAS Studio and paste it into a text editor like Notepad++.

    Note: Notice that the name of the pod running this SAS Studio session is the same as the pod in the previous SAS Studio session.

    This shows that the compute server was reused.

Configure a pool of available servers

  1. In your main browser, sign out of SAS Studio. ‘Discard and Exit’ if prompted.

    IMPORTANT: To create a pool of available compute servers, the servers must be reusable, and must run under a service account. See the preceding tasks in this exercise, above.

  2. In your alternate browser (e.g. Firefox if you were mainly using Chrome, or the other way around), open SAS Environment Manager and sign in as geladm:lnxsas, if you aren’t already signed in.

  3. Return to the Contexts page, and the Compute contexts view.

  4. Edit your “SAS Studio compute context as hrservice” compute context. Add a new attribute, as follows:

    serverMinAvailable=1

    Here are all the properties, with the new attribute marked (new):

    Property             Value
    Name:                SAS Studio compute context as hrservice
    Description:         A compute context for SAS Studio which allows members of HR to run code as hrservice.
    Launcher context:    SAS Studio launcher context
    Identity type:       Identities
    Groups:              HR
    Attributes:          runServerAs = hrservice
                         reuseServerProcesses = true
                         serverMinAvailable = 1   (new)
    Advanced:
      SAS options:       (none)
      Autoexec content:  (none)

    The modified compute context should look like this:

    SAS Studio new compute context pre-start available servers

    Save your change.

  5. Run this command in MobaXterm, to find compute pods which were launched by the SAS Compute service:

    kubectl get pod -l launcher.sas.com/requested-by-client=sas.compute,launcher.sas.com/username=hrservice

    Expected output - the number of compute pods you see may vary depending on when you run the command in relation to setting serverMinAvailable to 1:

    NAME                                                         READY   STATUS    RESTARTS   AGE
    sas-compute-server-ef843aee-51e0-4965-86cd-fba68f36dfa6-42   2/2     Running   0          50s
  6. In your main browser, sign in to SAS Studio again as Henrik:lnxsas.

  7. Once again, check that the current compute context is still “SAS Studio compute context as hrservice”.

    Q: What do you notice about how long it took for your compute server in the “SAS Studio compute context as hrservice” context to be available?

    A: It should have been quicker than before - perhaps around 4 or 5 seconds. You connected to a pre-started compute server from the ‘pool’ (of 1, in this case!) of available compute servers that you requested be created for this compute context.

  8. Open a new program window again. Run the usual code:

    %put NOTE: I am &_CLIENTUSERNAME;
    %put NOTE: My UID is &SYSUSERID;
    %put NOTE: My home directory is &_USERHOME;
    %put NOTE: My Compute POD IS &SYSHOSTNAME;

    Expected results - we are running in a pod that was already ‘pre-started’ and available:

    1    /* region: Generated preamble */
    79
    80   %put NOTE: I am &_CLIENTUSERNAME;
    NOTE: I am Henrik
    81   %put NOTE: My UID is &SYSUSERID;
    NOTE: My UID is hrservice
    82   %put NOTE: My home directory is &_USERHOME;
    NOTE: My home directory is /shared/gelcontent/home/hrservice
    83   %put NOTE: My Compute POD IS &SYSHOSTNAME;
    NOTE: My Compute POD IS sas-compute-server-ef843aee-51e0-4965-86cd-fba68f36dfa6-42
    84
    85   /* region: Generated postamble */
    96

    Note: Notice that the compute pod in this SAS Studio session is the same one that was pre-started.

  9. Run this command again in MobaXterm, to find compute pods which were launched by the SAS Compute service:

    kubectl get pod -l launcher.sas.com/requested-by-client=sas.compute,launcher.sas.com/username=hrservice

    Expected output - the pod that was running before, plus one new pod which started when we signed in to SAS Studio as Henrik and took the existing pre-started compute pod:

    NAME                                                         READY   STATUS    RESTARTS   AGE
    sas-compute-server-988182b9-e8aa-48e3-9ea3-ab18726f104a-43   2/2     Running   0          49s
    sas-compute-server-ef843aee-51e0-4965-86cd-fba68f36dfa6-42   2/2     Running   0          4m50s

    Q: What does this show?

    A: Notice that one of the compute sessions started a few minutes ago, and the other when you signed in to SAS Studio more recently. This shows that when you modified the ‘SAS Studio compute context as hrservice’ compute context to set serverMinAvailable = 1, the SAS Launcher service started a new compute server under that context - a ‘pool’ of 1 compute server, under the username hrservice. Then, when you signed in to SAS Studio as Henrik, you connected to an (or rather, the only) available compute server, which means you got a compute session more quickly. As soon as you were connected to it, it was no longer ‘available’, it was in use. So the SAS Launcher service started another SAS Compute server under the same context, to be ‘available’ ready and waiting. When a user takes an available server from the pool, another server is started in its place, so that there is always the requested number of unused, ready and waiting servers available.

  10. OPTIONAL: In your main browser, use a stopwatch or timer to see how long it takes to switch:

    1. From “SAS Studio compute context as hrservice” to “SAS Studio compute context” (which does NOT have a pool of available servers)
    2. From “SAS Studio compute context” to “SAS Studio compute context as hrservice” (which you just configured to maintain a pool of 1 available server(s))

    Here is a sample of times we measured, all in seconds, over three context switches in each of the directions above, all in the same SAS Studio session as Henrik:

    Attempt “compute context as hrservice” to “compute context” “compute context” to “compute context as hrservice”
    1 28.7 5.9
    2 29.4 5.9
    3 29.2 5.1
    Average 29.1 5.7

    As you can see from these results, when you configure a SAS compute context to (1) run as a shared account, (2) be reusable, and (3) maintain a pool of available compute servers, the pre-started pool significantly reduces the time it takes to get a compute session when one is needed.
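
    If you want to observe the pool-replenishment behavior described in step 9 in real time, you can leave a watch running in MobaXterm while you connect to and disconnect from sessions (a sketch):

    # Watch compute pods running as hrservice appear and terminate as the pool is used and replenished
    kubectl get pod -l launcher.sas.com/requested-by-client=sas.compute,launcher.sas.com/username=hrservice --watch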

Create a reusable compute context with a pool of compute servers using a script

  1. Run this all at once in MobaXterm, connected to sasnode01 as cloud-user, to create another reusable compute context named “SAS Studio compute context as hrservice too” with one pre-started compute server, in a single easy-to-script step.

    tee /home/cloud-user/prestarted_reusable_hrservice_cc.json > /dev/null << EOF
    {
        "name": "SAS Studio compute context as hrservice too",
        "description": "Another compute context for SAS Studio which allows members of HR to run code as hrservice.",
        "attributes": {
            "runServerAs": "hrservice",
            "reuseServerProcesses": "true",
            "serverMinAvailable": "1"
        },
        "launchContext": {
            "contextName": "SAS Studio launcher context"
        },
        "launchType": "service",
        "authorizedGroups": [
            "HR"
        ]
    }
    EOF
    
    sas-viya compute contexts create -r -d @/home/cloud-user/prestarted_reusable_hrservice_cc.json
    sleep 30
  2. Run this command again in MobaXterm to find the compute server pods that were launched at the request of the SAS Compute service:

    kubectl get pod -l launcher.sas.com/requested-by-client=sas.compute,launcher.sas.com/username=hrservice

    Expected output - there is one new pod which started when we created the new compute context from the command line, with one pre-started compute server:

    NAME                                                         READY   STATUS    RESTARTS   AGE
    sas-compute-server-0fa95c99-57d9-441e-a4b2-6edd5e011aa8-44   2/2     Running   0          8m13s
    sas-compute-server-c71f3a6e-10b9-40ab-ab22-2f5b8c604253-45   2/2     Running   0          36s
    sas-compute-server-ef843aee-51e0-4965-86cd-fba68f36dfa6-42   2/2     Running   0          20m

You can, of course, modify the attributes and other properties in the JSON file to suit your requirements, subject to the resources available in your SAS Viya environment.
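
If you want to confirm that the new compute context was registered, a quick optional check is to list the compute contexts and filter on the new name. This is a sketch, assuming the compute plug-in's contexts list subcommand is available in your CLI version:

# List compute contexts and look for the one created from the JSON file
sas-viya --output text compute contexts list | grep -i "hrservice too"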


SAS Viya Administration Operations
Lesson 05, Section 1 Exercise: Configure Python Integration

In this hands-on you will complete the configuration necessary to integrate Python with SAS Viya.

Setup

  • In a MobaXterm session on sasnode01, set the current namespace to the gelcorp deployment.

    gel_setCurrentNamespace gelcorp
  • Keep a copy of the current manifest and kustomization.yaml files. We will use these copies to track the changes your kustomization processing makes to these two files.

    cp -p /tmp/${current_namespace}/deploy_work/deploy/manifest.yaml /tmp/${current_namespace}/manifest_03-036.yaml
    cp -p ~/project/deploy/${current_namespace}/kustomization.yaml /tmp/${current_namespace}/kustomization_03-036.yaml

Steps completed so far

As part of the workshop deployment, two of the steps normally included in this process have been completed for you. In the interest of time, we have already

  • Configured the ASTORES PVC that is required by MAS
  • Installed Python and R using the SAS Configurator for Open Source.

Mount the sas-pyconfig PVC

Because Python was installed using the SAS Configurator for Open Source, Python is located on the sas-pyconfig PVC. You now need to mount the sas-pyconfig PVC to the MAS, CAS, and launcher-based pods so they can access Python.

  1. Create a new directory in $deploy/site-config for the customizations

    mkdir -p ~/project/deploy/${current_namespace}/site-config/sas-open-source-config/python
  2. Copy $deploy/sas-bases/examples/sas-open-source-config/python/python-transformer.yaml to the site-config directory.

    export deploy=~/project/deploy/${current_namespace}
    cd ${deploy}/site-config/sas-open-source-config/python
    cp ${deploy}/sas-bases/examples/sas-open-source-config/python/python-transformer.yaml .
    chmod ug+w ./python-transformer.yaml
  3. Use sed to customize the python-transformer.yaml template.

    • Replace {{ VOLUME-ATTRIBUTES }} with persistentVolumeClaim: {claimName: sas-pyconfig} for all python-volume definitions.
    • Replace the default /python mount paths with /opt/sas/viya/home/sas-pyconfig, which is required when using the SAS Configurator for Open Source.
    • Replace the {{ PYTHON-EXE-DIR }} and {{ PYTHON-EXECUTABLE }} for the Java policy allow list.
    cd ${deploy}/site-config/sas-open-source-config/python
    sed -i "s/{{ VOLUME-ATTRIBUTES }}/persistentVolumeClaim: {claimName: sas-pyconfig}/g"  python-transformer.yaml
    sed -i "s/\/python/\/opt\/sas\/viya\/home\/sas-pyconfig/g" python-transformer.yaml
    sed -i "s/{{ PYTHON-EXE-DIR }}/default_py\/bin/g" python-transformer.yaml
    sed -i "s/{{ PYTHON-EXECUTABLE }}/python3/g" python-transformer.yaml
  4. Examine the differences to verify your changes, shown on the right, against the original template values on the left.

    icdiff -W ${deploy}/sas-bases/examples/sas-open-source-config/python/python-transformer.yaml ./python-transformer.yaml
  5. Modify ~/project/deploy/gelcorp/kustomization.yaml to reference site-config/sas-open-source-config/python/python-transformer.yaml. The python-transformer.yaml needs to be referenced before sas-bases/overlays/required/transformers.yaml.

    [[ $(grep -c "site-config/sas-open-source-config/python/python-transformer.yaml" ~/project/deploy/${current_namespace}/kustomization.yaml) == 0 ]] && \
    sed -i '/sas-bases\/overlays\/required\/transformers.yaml/i \ \ \- site-config\/sas-open-source-config\/python\/python-transformer.yaml' ~/project/deploy/${current_namespace}/kustomization.yaml

    Alternatively, you could have manually edited the transformers section to add the reference as shown below.

    transformers:
      ...
      - site-config/sas-open-source-config/python/python-transformer.yaml
      - sas-bases/overlays/required/transformers.yaml
      ...
  6. Verify that python-transformer.yaml was added before the required transformers.yaml. You should see it listed in green in the right column.

    icdiff -W /tmp/gelcorp/kustomization_03-036.yaml ${deploy}/kustomization.yaml
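
    If icdiff is not available, an optional alternative (not part of the original steps) is to print the two entries with their line numbers and confirm that the python transformer appears on a lower-numbered line than the required transformers.yaml:

    # Show the relative order of the two transformer entries in kustomization.yaml
    grep -n -e "site-config/sas-open-source-config/python/python-transformer.yaml" \
            -e "sas-bases/overlays/required/transformers.yaml" \
            ${deploy}/kustomization.yaml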

Connect Python command

With Python mounted to our SAS Viya pods, the next step is to provide MAS and compute pods with the fully qualified commands to Python and to additional configuration elements.

  1. Create ~/project/deploy/gelcorp/site-config/sas-open-source-config/python/kustomization.yaml to define environment variables pointing to the Python interpreter.

    • MAS_PYPATH which is used by SAS Micro Analytic Service
    • PROC_PYPATH which is used by PROC PYTHON in compute servers
    • DM_PYPATH which is used by the Open Source Code node in SAS Visual Data Mining and Machine Learning
    • SAS_EXTLANG_SETTINGS which controls access to Python from CAS (more on this later)
    • SAS_EXT_LLP_PYTHON which is used when the base distribution or packages for open-source software require additional run-time libraries that are not part of the shipped container image, similar to the LD_LIBRARY_PATH concept.
    tee ${deploy}/site-config/sas-open-source-config/python/kustomization.yaml > /dev/null << EOF
    configMapGenerator:
    - name: sas-open-source-config-python
      literals:
      - MAS_PYPATH=/opt/sas/viya/home/sas-pyconfig/default_py/bin/python3
      - MAS_M2PATH=/opt/sas/viya/home/SASFoundation/misc/embscoreeng/mas2py.py
      - PROC_PYPATH=/opt/sas/viya/home/sas-pyconfig/default_py/bin/python3
      - PROC_M2PATH=/opt/sas/viya/home/SASFoundation/misc/tk
      - DM_PYPATH=/opt/sas/viya/home/sas-pyconfig/default_py/bin/python3
      - SAS_EXTLANG_SETTINGS=/opt/sas/viya/home/sas-pyconfig/extlang.xml
      - SAS_EXT_LLP_PYTHON=/opt/sas/viya/home/sas-pyconfig/lib/python3.9/lib-dynload
    - name: sas-open-source-config-python-mas
      literals:
      - MAS_PYPORT= 31100
    EOF
  2. Modify ~/project/deploy/gelcorp/kustomization.yaml to add a reference to site-config/sas-open-source-config/python in the resources field.

    [[ $(grep -xc "site-config/sas-open-source-config/python" ~/project/deploy/${current_namespace}/kustomization.yaml) == 0 ]] && \
    yq4 eval -i '.resources += ["site-config/sas-open-source-config/python"]' ~/project/deploy/${current_namespace}/kustomization.yaml

    Alternatively, you could have manually edited the resources section to add the reference as shown below.

    resources:
      ...
      - site-config/sas-open-source-config/python
  3. Verify that site-config/sas-open-source-config/python was added to the resources field. You should see it listed in green in the right column.

    icdiff -W /tmp/gelcorp/kustomization_03-036.yaml ${deploy}/kustomization.yaml

Adjust LOCKDOWN to allow Python

For security reasons, SAS Viya compute servers are configured in LOCKDOWN mode which prohibits users from invoking external processes. The next step enables communication between Python and SAS Viya compute servers in LOCKDOWN.

  1. Copy sas-bases/examples/sas-programming-environment/lockdown/enable-lockdown-access-methods.yaml to site-config/sas-programming-environment/lockdown/enable-lockdown-access-methods.yaml.

    mkdir -p ${deploy}/site-config/sas-programming-environment/lockdown
    cp ${deploy}/sas-bases/examples/sas-programming-environment/lockdown/enable* "$_"
    chmod 644 ${deploy}/site-config/sas-programming-environment/lockdown/*.yaml
  2. The following code edits site-config/sas-programming-environment/lockdown/enable-lockdown-access-methods.yaml to enable the python, python_embed, and socket access methods. The socket method is required for the Python Code Editor.

    sed -i "s/{{ ACCESS-METHOD-LIST }}/python python_embed socket/g" $deploy/site-config/sas-programming-environment/lockdown/enable-lockdown-access-methods.yaml
  3. Modify ~/project/deploy/gelcorp/kustomization.yaml to add a reference to site-config/sas-programming-environment/lockdown/enable-lockdown-access-methods.yaml in the transformers field.

    [[ $(grep -c "site-config/sas-programming-environment/lockdown/enable-lockdown-access-methods.yaml" ~/project/deploy/${current_namespace}/kustomization.yaml) == 0 ]] && \
    yq4 eval -i '.transformers += ["site-config/sas-programming-environment/lockdown/enable-lockdown-access-methods.yaml"]' ~/project/deploy/${current_namespace}/kustomization.yaml

    Alternatively, you could have manually edited the transformers section to add the reference as shown below.

    transformers:
      ...
      - site-config/sas-programming-environment/lockdown/enable-lockdown-access-methods.yaml
  4. Verify that enable-lockdown-access-methods.yaml was added to the transformers field. You should see it listed in green in the right column.

    icdiff -W /tmp/gelcorp/kustomization_03-036.yaml ${deploy}/kustomization.yaml

Configure watchdog

While compute server sessions are locked down by default, Python processes are not. Fortunately, the SAS Compute Server provides the ability to execute SAS Watchdog, which monitors the spawned Python processes to ensure that they comply with the terms of LOCKDOWN system options.

SAS Watchdog emulates the restrictions imposed by LOCKDOWN by restricting access only to files that exist in folders that are allowed by LOCKDOWN.

  1. To enable watchdog, simply add a reference to sas-bases/overlays/sas-programming-environment/watchdog to the transformers field of your base kustomization.yaml before the required transformers.yaml.

    [[ $(grep -c "sas-bases/overlays/sas-programming-environment/watchdog" ~/project/deploy/${current_namespace}/kustomization.yaml) == 0 ]] && \
    sed -i '/sas-bases\/overlays\/required\/transformers.yaml/i \ \ \- sas-bases\/overlays\/sas-programming-environment\/watchdog' ~/project/deploy/${current_namespace}/kustomization.yaml

    Alternatively, you could have manually edited the transformers section to add the reference as shown below.

    transformers:
      ...
      - sas-bases/overlays/sas-programming-environment/watchdog
      - sas-bases/overlays/required/transformers.yaml
      ...
  2. Verify that watchdog was added before the required transformers.yaml. You should see it listed in green in the right column.

    icdiff -W /tmp/gelcorp/kustomization_03-036.yaml ${deploy}/kustomization.yaml

Configure CAS for external languages

There are three additional steps to configure CAS for external language integration. Two of the three steps were done in an earlier exercise but we will include them here in case you did not complete that work.

  1. The first step is to configure CAS for host access which enables CAS to do host identity session launching. You did this step in an earlier exercise but you can perform the following steps, understanding that any errors you see are likely due to the transformer already having been included.

    • Copy $deploy/sas-bases/examples/cas/configure/cas-enable-host.yaml to $deploy/site-config. If you see a Permission denied error, it can be ignored; it means that the file already exists from an earlier exercise.
    cp -p ${deploy}/sas-bases/examples/cas/configure/cas-enable-host.yaml ~/project/deploy/${current_namespace}/site-config
    • Add a reference to it in your base kustomization.yaml file’s transformers field before the required transformers.yaml.
    [[ $(grep -c "site-config/cas-enable-host.yaml" ~/project/deploy/${current_namespace}/kustomization.yaml) == 0 ]] && \
    sed -i '/sas-bases\/overlays\/required\/transformers.yaml/i \ \ \- site-config\/cas-enable-host.yaml' ~/project/deploy/${current_namespace}/kustomization.yaml

    Alternatively, you could have manually edited the transformers section to add the reference as shown below.

    transformers:
      ...
      - site-config/cas-enable-host.yaml
      - sas-bases/overlays/required/transformers.yaml
      ...
    • Verify that cas-enable-host.yaml was added before the required transformers.yaml. You should see it listed in the right column. If you added it just now it will be displayed in green. If you added it in an earlier exercise it will appear in white.
    icdiff -W /tmp/gelcorp/kustomization_03-036.yaml ~/project/deploy/gelcorp/kustomization.yaml
  2. The second step is to configure users who need host identity sessions. This was done earlier in exercise 03_031_Respecting_Permissions_and_Home_Directories so if you completed that work you can skip ahead to step #3.

    • Otherwise, create the CASHostAccountRequired group.
    gel_sas_viya --output text  identities create-group --id CASHostAccountRequired --name "CASHostAccountRequired" --description "Run CAS as users account"
    Id            CASHostAccountRequired
    Name          CASHostAccountRequired
    Description   Run CAS as users account
    State         active
    The group was created successfully.

    If you see the following instead, you have already created the CASHostAccountRequired group and can ignore the error.

    The following errors have occurred:
    The identity "CASHostAccountRequired" already exists.
    • Add some users to the CASHostAccountRequired group. These users will launch their CAS sessions under their own user identity.
    gel_sas_viya --output text identities add-member --group-id CASHostAccountRequired --user-member-id Henrik
    gel_sas_viya --output text identities add-member --group-id CASHostAccountRequired --user-member-id geladm
    gel_sas_viya --output text identities add-member --group-id CASHostAccountRequired --user-member-id Delilah
    Henrik has been added to group CASHostAccountRequired
    geladm has been added to group CASHostAccountRequired
    Delilah has been added to group CASHostAccountRequired
  3. The third step is to create an XML file that allows specified users to access external languages from CAS. Any referenced users must be in the CASHostAccountRequired group. Earlier, you initialized the SAS_EXTLANG_SETTINGS environment variable with /opt/sas/viya/home/sas-pyconfig/extlang.xml so that is the file we need to create. We are using the sas-pyconfig PVC for this file since it is a location that is accessible to CAS.

    • Get the path for the sas-pyconfig PVC.
    volume=$(kubectl describe pvc sas-pyconfig | grep Volume: | awk '{print $NF}')
    pvPath=$(kubectl describe pv ${volume} | grep Path: | awk '{print $NF}')
    echo pvPath is ${pvPath}
    • Create the extlang.xml file on the sas-pyconfig PVC. The permissions in this file allow only geladm, Henrik, and Delilah to access Python and R from CAS.
    sudo -u sas tee ${pvPath}/extlang.xml > /dev/null << EOF
    <EXTLANG version="1.0" mode="ALLOW" allowAllUsers="BLOCK">
        <DEFAULT  scratchDisk="/tmp"
                diskAllowlist="/opt/sas/viya/home/sas-pyconfig"
                userSetScratchDisk="BLOCK">
            <LANGUAGE name="PYTHON3"
                    interpreter="/opt/sas/viya/home/sas-pyconfig/default_py/bin/python3"
                    userSetEnv="BLOCK"
                    userSetInterpreter="BLOCK">
            </LANGUAGE>
            <LANGUAGE name="R"
                    interpreter="/opt/sas/viya/home/sas-pyconfig/default_r/bin/Rscript"
                    userSetEnv="BLOCK"
                    userSetInterpreter="BLOCK">
            </LANGUAGE>
        </DEFAULT>
        <GROUP name="geladm">
            <LANGUAGE name="PYTHON3"
                    userInlineCode="ALLOW"
                    userSetEnv="ALLOW"
                    userSetInterpreter="ALLOW" />
            <LANGUAGE name="R"
                    userInlineCode="ALLOW"
                    userSetEnv="ALLOW"
                    userSetInterpreter="ALLOW" />
        </GROUP>
        <GROUP name="analysts"
            users="Henrik,Delilah">
            <LANGUAGE name="PYTHON3"
                    userInlineCode="ALLOW"/>
            <LANGUAGE name="R"
                    userInlineCode="ALLOW"/>
        </GROUP>
    </EXTLANG>
    EOF
    • You should see extlang.xml in the listing of the sas-pyconfig volume.
    ls -alF ${pvPath}
    lrwxrwxrwx 1 sas sas   56 Apr 25 18:37 default_py -> /opt/sas/viya/home/sas-pyconfig/Python-3.9.16.1714081401
    lrwxrwxrwx 1 sas sas   50 Apr 25 18:16 default_r -> /opt/sas/viya/home/sas-pyconfig/R-4.2.3.1714081401
    -rw-r--r-- 1 sas sas 1198 Apr 26 12:28 extlang.xml
    -rw-r--r-- 1 sas sas 1154 Apr 25 18:37 md5sum
    drwxr-xr-x 8 sas sas   83 Apr 25 18:26 Python-3.9.16.1714081401
    drwxr-xr-x 5 sas sas   43 Apr 25 17:49 R-4.2.3.1714081401
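
    Optionally, if xmllint is installed on sasnode01 (an assumption; this check is not part of the original steps), you can confirm that the file you just created is well-formed XML:

    # Validate that extlang.xml parses cleanly (no output from xmllint means success)
    xmllint --noout ${pvPath}/extlang.xml && echo "extlang.xml is well-formed"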

Review kustomization changes

  • Run the following command to view the cumulative changes you have made to kustomization.yaml. Your changes are in green in the right column.

    icdiff -W /tmp/gelcorp/kustomization_03-036.yaml ${deploy}/kustomization.yaml

Apply changes

With the configuration complete, rebuild the SAS deployment to apply your changes to the cluster.

  • Apply your changes to the deployment using the sas-orchestration deploy command.

    cd ~/project/deploy
    rm -rf /tmp/${current_namespace}/deploy_work/*
    source ~/project/deploy/.${current_namespace}_vars
    
    docker run --rm \
               -v ${PWD}/license:/license \
               -v ${PWD}/${current_namespace}:/${current_namespace} \
               -v ${HOME}/.kube/config_portable:/kube/config \
               -v /tmp/${current_namespace}/deploy_work:/work \
               -e KUBECONFIG=/kube/config \
               --user $(id -u):$(id -g) \
           sas-orchestration \
              deploy \
                 --namespace ${current_namespace} \
                 --deployment-data /license/SASViyaV4_${_order}_certs.zip \
                 --license /license/SASViyaV4_${_order}_license.jwt \
                 --user-content /${current_namespace} \
                 --cadence-name ${_cadenceName} \
                 --cadence-version ${_cadenceVersion} \
                 --image-registry ${_viyaMirrorReg}

    When the deploy command completes successfully, the final message should say The deploy command completed successfully as shown in the log snippet below.

    The deploy command started
    Generating deployment artifacts
    Generating deployment artifacts complete
    Generating kustomizations
    Generating kustomizations complete
    Generating manifests
    Applying manifests
    > start_leading gelcorp
    
    [...more...]
    
    > kubectl delete --namespace gelcorp --wait --timeout 7200s --ignore-not-found configmap sas-deploy-lifecycle-operation-variables
    configmap "sas-deploy-lifecycle-operation-variables" deleted
    
    > stop_leading gelcorp
    
    Applying manifests complete
    The deploy command completed successfully
  • If the sas-orchestration deploy command fails, review the steps in 99_Additional_Topics/03_Troubleshoot_SAS_Orchestration_Deploy to help you troubleshoot any problems.

Validate Python integration

Let’s use a simple program in SAS Studio to verify that you can run Python code.

  • Get the SAS Studio URL.

    gellow_urls | grep "SAS Studio"
  • Open SAS Studio and log in as Henrik:lnxsas.

  • If the SAS Studio compute context does not initialize successfully, wait 2 minutes and then re-select the SAS Studio compute context which will try to launch another compute server for you. You may need to repeat this a few more times if the servers are under load.

  • Paste this code into the SAS Studio Code pane.

    proc python;
    submit;
    import sys
    print(sys.version)
    print("hello world")
    endsubmit;
    run;
  • Verify in the log that Python initialized and notice the log message that cites the Python release.

    80   proc python;
    81   submit
    NOTE: Python initialized.
    Python 3.9.16 (main, Apr 25 2024, 22:21:24)
    [GCC 8.5.0 20210514 (Red Hat 8.5.0-20)] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>>
    >>>
    81 !       ;
    82   import sys
    83   print(sys.version)
    84   print("hello world")
    85   endsubmit;
    86   run;
    >>>
    3.9.16 (main, Apr 25 2024, 22:21:24)
    [GCC 8.5.0 20210514 (Red Hat 8.5.0-20)]
    hello world
    >>>
    NOTE: PROCEDURE PYTHON used (Total process time):
          real time           1.15 seconds
          cpu time            0.05 seconds
  • Sign out of SAS Studio as you have completed the exercise.


Lesson 06

SAS Viya Administration Operations
Lesson 06, Section 1 Exercise: Configure a Queue

Queue Management

In this section, you will inspect the defined queues, configure a new queue and a new context, and then submit workloads to utilize them.

Table of contents

Explore and define queues

View the defined queues in SAS Environment Manager.

  1. Authenticate to the CLI as a SAS Administrator and view queues with the workload-orchestrator plugin.

    /opt/pyviyatools/loginviauthinfo.py
    sas-viya workload-orchestrator queues list

    Expected output:

    {
        "items": [
            {
                "configInfo": {
                    "activeOverride": "",
                    "isDefaultQueue": true,
                    "maxJobs": -1,
                    "maxJobsPerHost": -1,
                    "maxJobsPerUser": -1,
                    "priority": 10,
                    "scalingMinJobs": -1,
                    "scalingMinSecs": -1,
                    "willRestartJobs": false
                },
                "name": "default",
                "processingInfo": {
                    "jobsPending": 0,
                    "jobsRunning": 0,
                    "jobsSuspended": 0,
                    "state": "OPEN-ACTIVE"
                },
                "tenant": "uaa",
                "version": 1
            }
        ]
    }
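
    If you prefer a compact summary of the same information, an optional jq one-liner (a sketch, assuming jq is available as it is elsewhere in this workshop) extracts each queue's name and state from the JSON output:

    # Print "name: state" for every defined queue
    sas-viya workload-orchestrator queues list \
       | jq -r '.items[] | "\(.name): \(.processingInfo.state)"'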
  2. Log on to Environment Manager as geladm:lnxsas and opt in to Assumable Groups.

  3. Open the Workload Orchestrator page, switch to the Configuration tab, and open the Queues panel.

  4. Click the New queue button to define a new queue with the following settings:

    • Name: adhoc
    • Priority: 5
    • Maximum jobs per user: 2
    • Users: Finance (Hint: Click the ‘identities’ icon next to the Users field and in the dialog, add the Finance group to the Selected Identities panel.)
    • Administrators: Delilah
    • Limits:
      • maxMemory: 0.001
      • maxClockTime: 70
  5. Click the Save icon.

Submit and interact with jobs

  1. Run the following to authenticate to the CLI as user Delilah:

    # Create authinfo file for Delilah
    tee ~/.authinfo_Delilah > /dev/null << EOF
    default user Delilah password lnxsas
    EOF
    chmod 600 ~/.authinfo_Delilah
    
    # log in to the CLI as Delilah
    /opt/pyviyatools/loginviauthinfo.py -f ~/.authinfo_Delilah
  2. Submit a job as Delilah:

    sas-viya batch jobs submit-pgm --pgm /mnt/workshop_files/workshop_content/Utils/swo_work/doWork1mins.sas -c default --queue adhoc

    What does the message tell you about the result of issuing the above command?

    View the answer

    While Delilah is a queue administrator, she does not have permission to submit jobs to the adhoc queue.

  3. Inactivate the adhoc queue:

    sas-viya workload-orchestrator queues open-inactivate --queue adhoc

    Expected output:

    The queue "adhoc" is set successfully to "OPEN-INACTIVE".
  4. Now try submitting a job as geladm, a SAS Administrator, to the inactivated adhoc queue.

    /opt/pyviyatools/loginviauthinfo.py
    sas-viya batch jobs submit-pgm --pgm /mnt/workshop_files/workshop_content/Utils/swo_work/doWork1mins.sas -c default --queue adhoc
  5. Return to SAS Environment Manager and view the Workload Orchestrator > Jobs page.

    What is the status of the job you submitted?

    View the answer

    It is in a PENDING state, because inactivated queues can accept jobs, but will not process them until the queue is reactivated.

  6. Go to the Queues tab and activate the adhoc queue. Note that geladm has the privilege to do so as a SAS Administrator.

  7. Go to the Jobs tab and check to see that the job starts and runs.

  8. Return to MobaXterm and try running another job with the CLI, this time one that takes longer to run.

    sas-viya batch jobs submit-pgm --pgm /mnt/workshop_files/workshop_content/Utils/swo_work/doWork2mins.sas -c default --queue adhoc
  9. Run the following to view the status of the job as it executes.

    watch sas-viya --output text workload-orchestrator jobs list --queue adhoc --state ALL

    Wait for the job to finish execution. What happens to the doWork2mins job? Why?

    View the answer

    The job gets terminated after approximately 70 seconds due to the ‘maxClockTime’ limit you specified for the adhoc queue.

  10. Press Ctrl + C to return to the terminal prompt.

Associate contexts with queues

  1. Go to SAS Environment Manager’s Contexts area from the navigation menu. From the drop-down, select Batch contexts.

  2. Select the default context and then click the pencil icon to edit the context.

  3. On the Advanced tab, specify default for the SAS Workload Orchestrator queue field.

  4. Click Save.

  5. Once again try submitting another batch job to the adhoc queue with the default context.

    sas-viya batch jobs submit-pgm --pgm /mnt/workshop_files/workshop_content/Utils/swo_work/doWork10mins.sas -c default --queue adhoc
  6. Use SAS Environment Manager to see which queue the job is submitted to. When you find the answer, click the Cancel icon to terminate the job.
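
    If you would rather check from the command line, an optional alternative is to list the jobs in each queue, reusing the same workload-orchestrator flags as the earlier watch command, and see where the new job landed:

    # List jobs in each queue to determine where the job was routed
    sas-viya --output text workload-orchestrator jobs list --queue default --state ALL
    sas-viya --output text workload-orchestrator jobs list --queue adhoc --state ALL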


Lesson 07

SAS Viya Administration Operations
Lesson 07, Section 0 Exercise: Default CAS Server Review

Review the Default CAS Server

In this exercise you will examine the default CAS Server and its Kubernetes components.

Table of contents

Set the namespace

gel_setCurrentNamespace gelcorp

List all CAS relative pods

This lists all the pods that contain “sas-cas” in their name. It is a way to list both the initial CAS pods and the CAS server pods in a single command.

kubectl get pods \
            -o wide \
   | { head -1; grep "sas-cas"; }

Note: The braces “{}” group multiple commands on the receiving side of the pipe “|”, and the head -1 command keeps the header line of the kubectl output.
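
An equivalent single grep, shown here as an optional alternative, keeps the header by matching the NAME column heading as well as the pod names:

# Keep the header line (starts with NAME) and any sas-cas pods
kubectl get pods -o wide | grep -E "^NAME|sas-cas"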

List all CASDeployment

Each CASDeployment custom resource represents a single CAS server in the Viya deployment.

This command lists the CASDeployments (CAS server instances) that exist in your Viya deployment.

kubectl get casdeployments
NAME           AGE
default        56m

Look at the CAS pods and containers

Pods are the smallest deployable units of computing that you can create and manage in Kubernetes. A pod can contain a single container or a group of containers.

  1. List the CAS Server pods.

    The command below will list all CAS Server pods. The CAS operator pod manages all CASDeployments.

    kubectl get pods \
                --selector="app.kubernetes.io/managed-by==sas-cas-operator" \
                -o wide
    NAME                               READY   STATUS      RESTARTS   AGE   IP            NODE        NOMINATED NODE   READINESS GATES
    sas-cas-server-default-controller  3/3     Running     0          60m   10.42.2.221   intnode01   <none>           <none>

    Currently only a single default CAS Server exists. The default CAS Server is an SMP server so you do not see workers or backup controller pods in the listing. You will see those later in the workshop.

    In other CAS configurations you may see other pods listed such as

    • sas-cas-server-default-controller: a CAS Server controller (SMP & MPP).
    • sas-cas-server-default-backup: a CAS Server backup controller (MPP only).
    • sas-cas-server-default-worker-[0..N]: a CAS Server worker (MPP only).
  2. Look at the details of the CAS Server pod.

    This command lists details about the CAS controller which provides information about the type of CAS Server you have.

    kubectl describe pods \
                     sas-cas-server-default-controller \
       | grep " casoperator." \
       | awk -F"/" '{print $2}'

    Click here to see the output

    cas-cfg-mode=smp
    cas-env-consul-name=cas-shared-default
    controller-active=1
    controller-index=0
    instance-index=0
    node-type=controller
    server=default
    service-name=primary

    Possible values are:

    • cas-cfg-mode: smp or mpp
    • cas-env-consul-name and server are metadata information about the CAS Server
    • node-type: controller or worker
      • if controller:
        • controller-active: 1 or 0 (0=inactive; 1=active)
        • controller-index: 0 or 1 (0=primaryController; 1=secondaryController)
      • if worker:
        • worker-index: 0..N (the worker number)
    • instance-index: 0..N (exists only when state transfer is enabled)
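
    If you only need one of these values, a jq lookup of the pod's labels is a handy alternative. The fully qualified label key used here (casoperator.sas.com/cas-cfg-mode) is an assumption based on the filtered describe output above; adjust it if your deployment uses a different prefix:

    # Extract a single casoperator label value from the controller pod
    kubectl get pod sas-cas-server-default-controller -o json \
       | jq -r '.metadata.labels["casoperator.sas.com/cas-cfg-mode"]'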
  3. List the containers in your CAS Server pod

    The kubectl top pods command is normally used to get information about pod resource consumption, but with the --containers parameter it is also an easy way to list all of a pod’s containers.

    kubectl top pods \
                sas-cas-server-default-controller \
                --containers
    POD                                 NAME               CPU(cores)   MEMORY(bytes)
    sas-cas-server-default-controller   sas-cas-server     118m         70Mi
    sas-cas-server-default-controller   sas-backup-agent   1m           18Mi
    sas-cas-server-default-controller   sas-consul-agent   26m          21Mi

    The NAME field contains names of all CAS Server pod containers.

    Note that the sas-cas-server container was named cas before 2022.09

List the volumes available to CAS Server pods

A Kubernetes volume is essentially a storage area accessible to all containers running in a pod. In contrast to the container-local filesystem, the data in volumes is preserved across container restarts. Kubernetes supports many types of volumes. A pod can use any number of volume types simultaneously.

Note that the state transfer is enabled by default in this SAS Viya deployment for the cas-shared-default server: a GEL team deployment choice. This has an impact on the number of volumes that are mounted to the CAS server (more details below).

  1. List all current default CAS Server volumes.

    This command lists all Kubernetes volumes that are created for a CAS Server. You can see that a CAS Server uses volumes of many different types.

    kubectl get pods \
                sas-cas-server-default-controller \
                -o=json \
       | jq '[.spec.volumes[] |
             if   has("configMap") then "Name: "+.name, "Type: configMap", ""
             elif has("emptyDir") then "Name: "+.name, "Type: emptyDir", ""
             elif has("hostPath") then "Name: "+.name, "Type: hostPath", "Path: "+.hostPath.path, ""
             elif has("nfs") then "Name: "+.name, "Type: nfs", "Path: "+.nfs.path, "Server: "+.nfs.server, ""
             elif has("persistentVolumeClaim") then "Name: "+.name, "Type: persistentVolumeClaim", "Claim Name: "+.persistentVolumeClaim.claimName, ""
             elif has("secret") then "Name: "+.name, "Type: secret", ""
             else empty end]' \
       | tr -d '",[]'

    Click here to see the output

    Name: cas-default-permstore-volume
    Type: persistentVolumeClaim
    Claim Name: cas-default-permstore
    
    Name: cas-default-data-volume
    Type: persistentVolumeClaim
    Claim Name: cas-default-data
    
    Name: cas-default-cache-volume
    Type: emptyDir
    
    Name: cas-default-config-volume
    Type: emptyDir
    
    Name: cas-tmp-volume
    Type: emptyDir
    
    Name: cas-license-volume
    Type: secret
    
    Name: commonfilesvols
    Type: persistentVolumeClaim
    Claim Name: sas-commonfiles
    
    Name: backup
    Type: persistentVolumeClaim
    Claim Name: sas-cas-backup-data
    
    Name: tmp
    Type: emptyDir
    
    Name: consul-tmp-volume
    Type: emptyDir
    
    Name: certframe-token
    Type: secret
    
    Name: security
    Type: emptyDir
    
    Name: customer-provided-ca-certificates
    Type: configMap
    
    Name: sas-viya-gelcontent-pvc-volume
    Type: persistentVolumeClaim
    Claim Name: gelcontent-data
    
    Name: sudo-ts-tmp
    Type: emptyDir
    
    Name: sas-quality-knowledge-base-volume
    Type: persistentVolumeClaim
    Claim Name: sas-quality-knowledge-base
    
    Name: sas-rdutil-dir
    Type: configMap
    
    Name: cas-default-transfer-volume
    Type: persistentVolumeClaim
    Claim Name: sas-cas-transfer-data
    
    Name: astores-volume
    Type: persistentVolumeClaim
    Claim Name: sas-microanalytic-score-astores
    
    Name: sas-viya-gelcorp-volume
    Type: nfs
    Path: /shared/gelcontent
    Server: pdcesx03145.race.sas.com
    
    Name: cas-workers
    Type: secret

    The different types of volumes you see are:

    • Persistent volumes: used to store data that needs to be persisted when the pods restart.
      • persistentVolumeClaim: used to mount Persistent Volumes into CAS Server pods.
      • nfs: an NFS volume mounted directly to the CAS Server pod (Server = NFS server; Path = NFS path). It is automatically remounted each time the pod restarts.
    • Ephemeral volumes: are recreated each time the pod restarts.
      • configMap: each data item in the ConfigMap is represented by an individual file in the volume.
      • emptyDir: created when a pod is first assigned to a Kubernetes node and exists as long as that pod is running on that node.
      • secret: used to pass sensitive information, such as passwords, to pods.

      Note that emptyDir, configMap, and secret are local ephemeral storage managed by Kubernetes on each cluster node.

    The commonfilesvols persistentVolumeClaim exists to store all CAS Server binaries and files that are required for CAS servers to run. This volume is shared by all CAS server pods in a Viya deployment to help reduce the size of the cas container in each CAS Server pod.

    The sas-quality-knowledge-base-volume persistentVolumeClaim exists because the SAS Data Quality product is licensed.

    The sas-viya-gelcontent-pvc-volume persistentVolumeClaim exists because of the 03_021_Mount_NFS_to_Viya hands-on.

    The sas-viya-gelcorp-volume nfs volume exists because of the 02_021_Kustomize hands-on.

    These persistent volumes are the key volumes for the CAS server (their names contain cas-):

    • cas-default-permstore: persists the metadata for CAS including caslib definitions, permissions, etc.
    • cas-default-data-volume: stores data that is saved and possibly reloaded into the CAS Server.
    • sas-cas-backup-data: stores the CAS Server backups.
    • cas-default-transfer-volume: used for state transfer (exists only when state transfer is enabled for a CAS server).
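
    If you want a quick count of how many volumes of each type the pod uses, the following optional jq summary (a sketch, not part of the original steps) groups the volume definitions by their type key:

    # Count the CAS controller pod's volumes by volume type
    kubectl get pods \
                sas-cas-server-default-controller \
                -o=json \
       | jq '[.spec.volumes[] | keys[] | select(. != "name")] | group_by(.) | map({type: .[0], count: length})'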
  2. List the CAS Server cas container mounted volumes.

    This command lists all volumes mounted to the cas container of the default CAS server.

    kubectl get pods \
                sas-cas-server-default-controller \
                -o=json \
       | jq '[.spec.containers[0].volumeMounts[] | "Name: "+.name, "Mount path:"+.mountPath, ""]' \
       | tr -d '",[]'

    Note that containers[0] is the cas container; 0 is always the index of the cas container in the containers[] array.

    Click here to see the output

    Name: cas-default-permstore-volume
    Mount path:/cas/permstore
    
    Name: cas-default-data-volume
    Mount path:/cas/data
    
    Name: cas-default-cache-volume
    Mount path:/cas/cache
    
    Name: cas-default-config-volume
    Mount path:/cas/config
    
    Name: cas-tmp-volume
    Mount path:/tmp
    
    Name: cas-license-volume
    Mount path:/cas/license
    
    Name: commonfilesvols
    Mount path:/opt/sas/viya/home/commonfiles
    
    Name: podinfo
    Mount path:/etc/podinfo
    
    Name: backup
    Mount path:/sasviyabackup
    
    Name: security
    Mount path:/security
    
    Name: security
    Mount path:/opt/sas/viya/config/etc/SASSecurityCertificateFramework/cacerts
    
    Name: security
    Mount path:/opt/sas/viya/config/etc/SASSecurityCertificateFramework/private
    
    Name: sas-viya-gelcontent-pvc-volume
    Mount path:/mnt/gelcontent
    
    Name: sudo-ts-tmp
    Mount path:/run/sudo
    
    Name: sas-rdutil-dir
    Mount path:/rdutil
    
    Name: sas-quality-knowledge-base-volume
    Mount path:/opt/sas/viya/home/share/refdata/qkb
    
    Name: cas-default-transfer-volume
    Mount path:/cas/transferdir
    
    Name: astores-volume
    Mount path:/models/resources/viya
    
    Name: sas-viya-gelcorp-volume
    Mount path:/gelcontent
    
    Name: cas-workers
    Mount path:/var/casdata
    
    Name: kube-api-access-l5r68
    Mount path:/var/run/secrets/kubernetes.io/serviceaccount

    The Mount path is the local path in the cas container where the volume is attached.

    A single volume can be attached to multiple mount paths (for example, security).

  3. List the Persistent Volumes (pv) defined for the CAS server.

    The Persistent Volumes are created and managed at the Kubernetes cluster level. They are a Kubernetes cluster resource, not a namespace resource. Because of that, when the persistentVolumes resource is queried with the kubectl CLI, the --namespace argument is ignored.

    kubectl get persistentVolumes \
               -o wide \
       | { head -1; grep "cas-"; }

    Click here to see the output

    NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                           STORAGECLASS   REASON   AGE     VOLUMEMODE
    pvc-15d7fbbb-c15a-4685-a21e-e61adadc7056   8Gi        RWX            Delete           Bound    gelcorp/cas-default-data        nfs-client              4d5h    Filesystem
    pvc-1fd1be30-bf19-43c7-87ae-cd4f3af30028   100Mi      RWX            Delete           Bound    gelcorp/cas-default-permstore   nfs-client              4d5h    Filesystem
    pvc-ddf16678-4759-46df-8e85-2c03b00b5f52   8Gi        RWX            Delete           Bound    gelcorp/sas-cas-backup-data     nfs-client              4d5h    Filesystem
    pvc-fa215d9b-14e6-4701-a510-2577736cb566   8Gi        RWX            Delete           Bound    gelcorp/sas-cas-transfer-data   nfs-client              4d5h    Filesystem

    You can note from this output that four volumes are created for the cas-shared-default server (the fourth, sas-cas-transfer-data, exists because state transfer is enabled). In this Viya deployment they use an NFS storage class (nfs-client).

    All of these Viya persistentVolume names are prefixed with pvc-.

    The CLAIM field contains interesting information: <NAMESPACE>/<persistentVolumeClaim NAME>.
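
    Another optional way to see the PV-to-claim relationship, without relying on column positions, is to ask kubectl for just the fields of interest with custom-columns:

    # Show each PV with the namespace and name of the claim it is bound to
    kubectl get persistentVolumes \
                -o custom-columns=NAME:.metadata.name,CLAIM_NAMESPACE:.spec.claimRef.namespace,CLAIM_NAME:.spec.claimRef.name \
       | { head -1; grep "cas-"; }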

  4. List the Persistent Volume Claims (pvc) specific to the CAS server.

    The Persistent Volume Claims are created and managed at the namespace level. They are a namespace resource. Because of that, when the persistentVolumeClaims resource is queried with the kubectl CLI, the --namespace argument is important.

    kubectl get persistentVolumeClaims \
                -o wide \
       | { head -1; grep "cas-"; }

    Click here to see the output

    NAME                    STATUS   VOLUME                                     CAPACITY   ACCESS MODES    STORAGECLASS   AGE    VOLUMEMODE
    cas-default-data        Bound    pvc-15d7fbbb-c15a-4685-a21e-e61adadc7056   8Gi        RWX             nfs-client     4d5h   Filesystem
    cas-default-permstore   Bound    pvc-1fd1be30-bf19-43c7-87ae-cd4f3af30028   100Mi      RWX             nfs-client     4d5h   Filesystem
    sas-cas-backup-data     Bound    pvc-ddf16678-4759-46df-8e85-2c03b00b5f52   8Gi        RWX             nfs-client     4d5h   Filesystem
    sas-cas-transfer-data   Bound    pvc-fa215d9b-14e6-4701-a510-2577736cb566   8Gi        RWX             nfs-client     4d5h   Filesystem

    A persistentVolumeClaim creates the link between a persistentVolume (a Kubernetes cluster resource) and a volume defined for a pod in a specific namespace.

    If you compare this output with the previous command output, you can note that

    • The NAME is part of a persistentVolumes CLAIM field
    • The VOLUME corresponds to a persistentVolumes NAME field
    • CAPACITY, ACCESS MODES, STORAGECLASS, AGE, and VOLUMEMODE are exactly the same.

Lessons learned

  • The default CAS Server is an SMP server.

  • The CAS server pod has three containers

    • sas-cas-server
    • sas-backup-agent
    • sas-consul-agent
  • Three plus one specific volumes, each linked through a persistentVolumeClaim to a container-local filesystem mount path.

    “Three plus one” because the cas-default-transfer-volume volume exists only if state transfer is enabled for the CAS server. The mapping is summarized below.

    Mount path (cas container)   Volume (Kubernetes)            Claim                   Persistent volume
    /cas/permstore               cas-default-permstore-volume   cas-default-permstore   pvc-15d7fbbb-c15a-4685-a21e-e61adadc7056
    /cas/data                    cas-default-data-volume        cas-default-data        pvc-1fd1be30-bf19-43c7-87ae-cd4f3af30028
    /sasviyabackup               backup                         sas-cas-backup-data     pvc-ddf16678-4759-46df-8e85-2c03b00b5f52
    /cas/transferdir             cas-default-transfer-volume    sas-cas-transfer-data   pvc-fa215d9b-14e6-4701-a510-2577736cb566


SAS Viya Administration Operations
Lesson 07, Section 1 Exercise: Add a New CAS Server

Add a new CAS server

In this exercise you will add a new CAS server to your Viya deployment.

Table of contents

Set the namespace

gel_setCurrentNamespace gelcorp

Create the new CAS server

  1. The create-cas-server.sh script generates all of the manifests you need to create, deploy, and configure a new CAS server. Look at the options of the script create-cas-server.sh to get an idea of what you can do with it.

    bash ~/project/deploy/${current_namespace}/sas-bases/examples/cas/create/create-cas-server.sh \
         --help
    Flags:
      -h  --help     help
      -i, --instance CAS server instance name
      -o, --output   Output location. If undefined, default to working directory.
      -v, --version  CAS server creation utility version
      -w, --workers  Specify the number of CAS worker nodes. Default is 0 (SMP).
      -b, --backup   Set this to include a CAS backup controller. Disabled by default.
      -t, --tenant   Set the tenant name. default is shared.
      -r, --transfer Set this to enable support for state transfer between restarts. Disabled by default.
      -a, --affinity Specify the node affinity and toleration to use for this deployment.  Default is 'cas'.
      -q, --required-affinity Set this flag to have the node affinity be a required node affinity.  Default is preferred node affinity.

    Important notes:

    • The -a, --affinity and -q, --required-affinity options let the SAS Viya administrator decide on which Kubernetes node pool the CAS server pods are started, and whether that placement is mandatory.
    • The -r, --transfer option enables state transfer between CAS server restarts, which keeps loaded data and CAS sessions available across restarts. In this workshop we activate this option by default; you will see its impact on CAS servers later.

  2. Use create-cas-server.sh to create a new distributed gelcorp CAS server with a backup controller and two workers. We want the manifests for the new CAS server to be placed in the ~/project/deploy/gelcorp/site-config directory.

    bash ~/project/deploy/${current_namespace}/sas-bases/examples/cas/create/create-cas-server.sh \
        --instance gelcorp \
        --output ~/project/deploy/${current_namespace}/site-config \
        --workers 2 \
        --backup 1 \
        --transfer 1

    Note that we created the gelcorp CAS server using the --transfer option to enable the CAS server state transfer.

    The name of the new CAS server will be cas-shared-gelcorp.

    Fri May 13 12:08:48 EDT 2022 - instance = gelcorp
    Fri May 13 12:08:48 EDT 2022 - tenant =
    Fri May 13 12:08:48 EDT 2022 - output = /home/cloud-user/project/deploy/gelcorp/site-config
    
    make: *** No rule to make target `install'.  Stop.
    output directory does not exist: /home/cloud-user/project/deploy/gelcorp/site-config/
    creating directory: /home/cloud-user/project/deploy/gelcorp/site-config/
    Generating artifacts...
    100.0% [=======================================================================]
    |-cas-shared-gelcorp (root directory)
      |-cas-shared-gelcorp-cr.yaml
      |-kustomization.yaml
      |-shared-gelcorp-pvc.yaml
      |-annotations.yaml
      |-backup-agent-patch.yaml
      |-cas-consul-sidecar.yaml
      |-cas-fsgroup-security-context.yaml
      |-cas-sssd-sidecar.yaml
      |-kustomizeconfig.yaml
      |-provider-pvc.yaml
      |-transfer-pvc.yaml
      |-enable-binary-port.yaml
      |-enable-http-port.yaml
      |-configmaps.yaml
      |-state-transfer.yaml
      |-node-affinity.yaml
      |-require-affinity.yaml
    
    create-cas-server.sh complete!

    As shown in the command output, all of the gelcorp CAS server manifests are written to the /home/cloud-user/project/deploy/gelcorp/site-config/cas-shared-gelcorp directory.

    Click here if you want to list the gelcorp CAS server manifests

    ls -al ~/project/deploy/${current_namespace}/site-config/cas-shared-gelcorp

    You should see…

    total 80
    drwxrwxr-x 2 cloud-user cloud-user 4096 May 12 08:20 .
    drwxr-xr-x 7 cloud-user cloud-user 4096 May 12 08:20 ..
    -rw-rw-r-- 1 cloud-user cloud-user  203 May 12 08:20 annotations.yaml
    -rw-rw-r-- 1 cloud-user cloud-user 3761 May 12 08:20 backup-agent-patch.yaml
    -rw-rw-r-- 1 cloud-user cloud-user 2856 May 12 08:20 cas-consul-sidecar.yaml
    -rw-rw-r-- 1 cloud-user cloud-user  359 May 12 08:20 cas-fsgroup-security-context.yaml
    -rw-rw-r-- 1 cloud-user cloud-user 5814 May 12 08:20 cas-shared-gelcorp-cr.yaml
    -rw-rw-r-- 1 cloud-user cloud-user 2282 May 12 08:20 cas-sssd-sidecar.yaml
    -rw-rw-r-- 1 cloud-user cloud-user  259 May 12 08:20 configmaps.yaml
    -rw-rw-r-- 1 cloud-user cloud-user  304 May 12 08:20 enable-binary-port.yaml
    -rw-rw-r-- 1 cloud-user cloud-user  298 May 12 08:20 enable-http-port.yaml
    -rw-rw-r-- 1 cloud-user cloud-user  340 May 12 08:20 kustomization.yaml
    -rw-rw-r-- 1 cloud-user cloud-user 1267 May 12 08:20 kustomizeconfig.yaml
    -rw-rw-r-- 1 cloud-user cloud-user 1353 May 12 08:20 node-affinity.yaml
    -rw-rw-r-- 1 cloud-user cloud-user  291 May 12 08:20 provider-pvc.yaml
    -rw-rw-r-- 1 cloud-user cloud-user  486 May 12 08:20 require-affinity.yaml
    -rw-rw-r-- 1 cloud-user cloud-user  652 May 12 08:20 shared-gelcorp-pvc.yaml
    -rw-rw-r-- 1 cloud-user cloud-user  433 May 12 08:20 state-transfer.yaml
    -rw-rw-r-- 1 cloud-user cloud-user  396 May 12 08:20 transfer-pvc.yaml
  3. The next step is to modify ~/project/deploy/gelcorp/kustomization.yaml to include a reference to the cas-shared-gelcorp manifests.

    • Backup the current kustomization.yaml file.

      cp -p ~/project/deploy/${current_namespace}/kustomization.yaml /tmp/${current_namespace}/kustomization_07-021-01.yaml
    • Use this yq command to add a reference to the site-config/cas-shared-gelcorp manifests in the resources field of the Viya deployment kustomization.yaml file.

      [[ $(grep -c "site-config/cas-shared-gelcorp" ~/project/deploy/${current_namespace}/kustomization.yaml) == 0 ]] && \
      yq4 eval -i '.resources += ["site-config/cas-shared-gelcorp"]' ~/project/deploy/${current_namespace}/kustomization.yaml
    • Alternatively, you can update the ~/project/deploy/gelcorp/kustomization.yaml file using your favorite text editor:

      [...]
      resources:
        [... previous resources items ...]
        - site-config/cas-shared-gelcorp
      [...]
  4. Verify that the update is in place.

    cat ~/project/deploy/${current_namespace}/kustomization.yaml

    Search and ensure that site-config/cas-shared-gelcorp exists in the resources field of the Viya deployment kustomization.yaml file.

    Click here to see the output

    ---
    namespace: gelcorp
    resources:
      - sas-bases/base
      # GEL Specifics to create CA secret for OpenSSL Issuer
      - site-config/security/gel-openssl-ca
      - sas-bases/overlays/network/networking.k8s.io # Using networking.k8s.io API since 2021.1.6
      - site-config/security/openssl-generated-ingress-certificate.yaml # Default to OpenSSL Issuer in 2021.2.6
      - sas-bases/overlays/cas-server
      - sas-bases/overlays/crunchydata/postgres-operator # New Stable 2022.10
      - sas-bases/overlays/postgres/platform-postgres # New Stable 2022.10
      - sas-bases/overlays/internal-elasticsearch # New Stable 2020.1.3
      - sas-bases/overlays/update-checker # added update checker
      ## disable CAS autoresources to keep things simpler
      #- sas-bases/overlays/cas-server/auto-resources                                        # CAS-related
      #- sas-bases/overlays/crunchydata_pgadmin                                               # Deploy the sas-crunchy-data-pgadmin container - remove 2022.10
      - site-config/sas-prepull/add-prepull-cr-crb.yaml
      - sas-bases/overlays/cas-server/state-transfer # Enable state transfer for the cas-shared-default CAS server - new PVC sas-cas-transfer-data
      - site-config/sas-microanalytic-score/astores/resources.yaml
      - site-config/gelcontent_pvc.yaml
      - site-config/cas-shared-gelcorp
    configurations:
      - sas-bases/overlays/required/kustomizeconfig.yaml
    transformers:
      - sas-bases/overlays/internal-elasticsearch/sysctl-transformer.yaml # New Stable 2020.1.3
      - sas-bases/overlays/startup/ordered-startup-transformer.yaml
      - site-config/cas-enable-host.yaml
      - sas-bases/overlays/required/transformers.yaml
      - site-config/mirror.yaml
      #- site-config/daily_update_check.yaml      # change the frequency of the update-check
      #- sas-bases/overlays/cas-server/auto-resources/remove-resources.yaml    # CAS-related
      ## temporarily removed to alleviate RACE issues
      - sas-bases/overlays/internal-elasticsearch/internal-elasticsearch-transformer.yaml # New Stable 2020.1.3
      - sas-bases/overlays/sas-programming-environment/enable-admin-script-access.yaml # To enable admin scripts
      #- sas-bases/overlays/scaling/zero-scale/phase-0-transformer.yaml
      #- sas-bases/overlays/scaling/zero-scale/phase-1-transformer.yaml
      - sas-bases/overlays/cas-server/state-transfer/support-state-transfer.yaml # Enable state transfer for the cas-shared-default CAS server - enable and mount new PVC
      - site-config/change-check-interval.yaml
      - sas-bases/overlays/sas-microanalytic-score/astores/astores-transformer.yaml
      - site-config/sas-pyconfig/change-configuration.yaml
      - site-config/sas-pyconfig/change-limits.yaml
      - site-config/cas-add-nfs-mount.yaml
      - site-config/cas-add-allowlist-paths.yaml
      - site-config/cas-modify-user.yaml
    components:
      - sas-bases/components/crunchydata/internal-platform-postgres # New Stable 2022.10
      - sas-bases/components/security/core/base/full-stack-tls
      - sas-bases/components/security/network/networking.k8s.io/ingress/nginx.ingress.kubernetes.io/full-stack-tls
    patches:
      - path: site-config/storageclass.yaml
        target:
          kind: PersistentVolumeClaim
          annotationSelector: sas.com/component-name in (sas-backup-job,sas-data-quality-services,sas-commonfiles,sas-cas-operator,sas-pyconfig)
      - path: site-config/cas-gelcontent-mount-pvc.yaml
        target:
          group: viya.sas.com
          kind: CASDeployment
          name: .*
          version: v1alpha1
      - path: site-config/compute-server-add-nfs-mount.yaml
        target:
          labelSelector: sas.com/template-intent=sas-launcher
          version: v1
          kind: PodTemplate
      - path: site-config/compute-server-annotate-podtempate.yaml
        target:
          name: sas-compute-job-config
          version: v1
          kind: PodTemplate
    secretGenerator:
      - name: sas-consul-config
        behavior: merge
        files:
          - SITEDEFAULT_CONF=site-config/sitedefault.yaml
      - name: sas-image-pull-secrets
        behavior: replace
        type: kubernetes.io/dockerconfigjson
        files:
          - .dockerconfigjson=site-config/crcache-image-pull-secrets.json
    configMapGenerator:
      - name: ingress-input
        behavior: merge
        literals:
          - INGRESS_HOST=gelcorp.pdcesx03145.race.sas.com
      - name: sas-shared-config
        behavior: merge
        literals:
          - SAS_SERVICES_URL=https://gelcorp.pdcesx03145.race.sas.com
      # # This is to fix an issue that only appears in very slow environments.
      # # Do not do this at a customer site
      - name: sas-go-config
        behavior: merge
        literals:
          - SAS_BOOTSTRAP_HTTP_CLIENT_TIMEOUT_REQUEST='15m'
      - name: input
        behavior: merge
        literals:
          - IMAGE_REGISTRY=crcache-race-sas-cary.unx.sas.com
  5. Keep a copy of the current manifest.yaml file.

    cp -p /tmp/${current_namespace}/deploy_work/deploy/manifest.yaml /tmp/${current_namespace}/manifest_07-021-01.yaml
  6. Run the sas-orchestration deploy command.

    cd ~/project/deploy
    rm -rf /tmp/${current_namespace}/deploy_work/*
    source ~/project/deploy/.${current_namespace}_vars
    
    docker run --rm \
               -v ${PWD}/license:/license \
               -v ${PWD}/${current_namespace}:/${current_namespace} \
               -v ${HOME}/.kube/config_portable:/kube/config \
               -v /tmp/${current_namespace}/deploy_work:/work \
               -e KUBECONFIG=/kube/config \
               --user $(id -u):$(id -g) \
           sas-orchestration \
              deploy \
                 --namespace ${current_namespace} \
                 --deployment-data /license/SASViyaV4_${_order}_certs.zip \
                 --license /license/SASViyaV4_${_order}_license.jwt \
                 --user-content /${current_namespace} \
                 --cadence-name ${_cadenceName} \
                 --cadence-version ${_cadenceVersion} \
                 --image-registry ${_viyaMirrorReg}

    When the deploy command completes successfully the final message should say The deploy command completed successfully as shown in the log snippet below.

    The deploy command started
    
    [...]
    
    The deploy command completed successfully

    If the sas-orchestration deploy command fails, check out the steps in 99_Additional_Topics/03_Troubleshoot_SAS_Orchestration_Deploy to help you troubleshoot any problems.

  7. Look at the existing CASDeployment custom resources

    kubectl get casdeployment

    You should now see the new CAS server you created.

    NAME             AGE
    default          3h8m
    shared-gelcorp   48s

    It may take several more minutes for the gelcorp CAS server to fully initialize. The following command will notify you when the CAS server is ready.

    kubectl wait pods \
                 --selector="casoperator.sas.com/server==shared-gelcorp" \
                 --for condition=ready \
                 --timeout 15m

    You should see these messages in the output.

    pod/sas-cas-server-shared-gelcorp-backup condition met
    pod/sas-cas-server-shared-gelcorp-controller condition met
    pod/sas-cas-server-shared-gelcorp-worker-0 condition met
    pod/sas-cas-server-shared-gelcorp-worker-1 condition met

    While you are waiting for the CAS server to be ready, you can use OpenLens to monitor the CAS pods.

    • Open OpenLens and connect to your GEL Kubernetes cluster.
    • Navigate to Workloads –> Pods and then filter on
      • namespace: gelcorp
      • sas-cas-server

    You can sort by Age ascending to place the newest pods at the top of the list.

    07_021_Lens_Monitor_Gelcorp_CASServer_0000
    07_021_Lens_Monitor_Gelcorp_CASServer_0001
    07_021_Lens_Monitor_Gelcorp_CASServer_0002
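
    If you prefer to stay in the terminal, you can follow the same pod activity with kubectl instead of OpenLens. This is simply the get pods command shown below with the --watch flag added (press Ctrl+C to stop watching).

    kubectl get pods \
                --selector="casoperator.sas.com/server==shared-gelcorp" \
                --watch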

    When your CAS pods show a status of Running, you can display the status of all the gelcorp CAS server pods by running this command.

    kubectl get pods \
                --selector="casoperator.sas.com/server==shared-gelcorp" \
                -o wide

    You should see something like…

    NAME                                       READY   STATUS    RESTARTS   AGE      IP            NODE        NOMINATED NODE   READINESS GATES
    sas-cas-server-shared-gelcorp-backup       3/3     Running   0          10m26s   10.42.0.83    intnode02   <none>           <none>
    sas-cas-server-shared-gelcorp-controller   3/3     Running   0          10m26s   10.42.4.168   intnode04   <none>           <none>
    sas-cas-server-shared-gelcorp-worker-0     3/3     Running   0          10m21s   10.42.2.63    intnode03   <none>           <none>
    sas-cas-server-shared-gelcorp-worker-1     3/3     Running   0          10m21s   10.42.3.115   intnode05   <none>           <none>

    The gelcorp CAS server has now started and is ready to be used.

Examine the cas-shared-gelcorp server

  1. List the pod containers for the CAS controller.

    kubectl top pods \
                sas-cas-server-shared-gelcorp-controller \
                --containers
    POD                                        NAME               CPU(cores)   MEMORY(bytes)
    sas-cas-server-shared-gelcorp-controller   sas-cas-server     19m          63Mi
    sas-cas-server-shared-gelcorp-controller   sas-backup-agent   1m           29Mi
    sas-cas-server-shared-gelcorp-controller   sas-consul-agent   18m          24Mi

    Does the list of containers differ from what you saw for the default CAS server?
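
    To answer that question, you can run the same command against the default CAS server's controller. The sketch below assumes the default server's pods carry the label casoperator.sas.com/server=default, matching the default CASDeployment name you listed earlier with kubectl get casdeployment.

    # Assumed selector: pods of the default CAS server, based on its CASDeployment name
    kubectl top pods \
                --selector="casoperator.sas.com/server==default" \
                --containers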

  2. List the CAS server Persistent Volumes (pv).

    kubectl get persistentVolumes \
                -o wide \
       | { head -1; \
           grep -E "${current_namespace}\/(sas-)?cas-" \
           | grep "shared-gelcorp"; }

    Click here to see the output

    NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                          STORAGECLASS   REASON    AGE     VOLUMEMODE
    pvc-62c75280-6e36-44aa-a60c-53de62f0271a   8Gi        RWX            Delete           Bound    gelcorp/cas-shared-gelcorp-data                nfs-client               9m19s   Filesystem
    pvc-a018e7ff-a008-494b-92ac-43d1d28d919e   8Gi        RWX            Delete           Bound    gelcorp/sas-cas-transfer-data-shared-gelcorp   nfs-client               9m18s   Filesystem
    pvc-b0a73e8d-10b3-43d8-a982-c1724e62a19c   100Mi      RWX            Delete           Bound    gelcorp/cas-shared-gelcorp-permstore           nfs-client               9m19s   Filesystem
    pvc-b757acf9-3504-4a1e-9641-03a0dd793359   4Gi        RWX            Delete           Bound    gelcorp/sas-cas-backup-data-shared-gelcorp     nfs-client               9m19s   Filesystem
    

    Note that an additional persistent volume was created because we enabled state transfer for the gelcorp CAS server.

    Does the list of persistent volumes look different?

  3. List the CAS server Persistent Volumes Claims.

    kubectl get persistentvolumeclaims \
                -o wide \
       | { head -1; \
           grep "shared\-${current_namespace}"; }

    Click here to see the output

    NAME                                   STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE   VOLUMEMODE
    cas-shared-gelcorp-data                Bound    pvc-62c75280-6e36-44aa-a60c-53de62f0271a   8Gi        RWX            nfs-client     10m   Filesystem
    cas-shared-gelcorp-permstore           Bound    pvc-b0a73e8d-10b3-43d8-a982-c1724e62a19c   100Mi      RWX            nfs-client     10m   Filesystem
    sas-cas-backup-data-shared-gelcorp     Bound    pvc-b757acf9-3504-4a1e-9641-03a0dd793359   4Gi        RWX            nfs-client     10m   Filesystem
    sas-cas-transfer-data-shared-gelcorp   Bound    pvc-a018e7ff-a008-494b-92ac-43d1d28d919e   8Gi        RWX            nfs-client     10m   Filesystem

    Note that an additional persistent volume claim was created because we enabled state transfer for the gelcorp CAS server.

    Do you see any differences in the PVCs compared to the default CAS server?

Lessons learned

  • It is easy to add a new CAS server to your deployment using the create-cas-server.sh script.
  • You can add either an SMP or an MPP CAS server depending on the parameters you pass to the create-cas-server.sh script.
  • The create-cas-server.sh script creates all of the manifests needed to create, deploy, and configure a new CAS server.
  • You must add a reference in kustomization.yaml to the location of your new CAS server manifests to add the CAS server to your deployment.
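
If you want to double-check that last point, a quick grep works. This is only a sketch: it assumes your deployment's kustomization.yaml sits in ~/project/deploy/${current_namespace}, the directory that the deploy command mounts as user content.

# Assumption: kustomization.yaml lives in the user-content directory used by the deploy command
grep "cas-shared-gelcorp" ~/project/deploy/${current_namespace}/kustomization.yaml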

SAS Viya Administration Operations
Lesson 07, Section 2 Exercise: Stop and Restart CAS

Start/Stop/Restart a CAS server

In this exercise you will learn how to stop, start, and restart a CAS server.

Table of contents

Set the namespace

gel_setCurrentNamespace gelcorp

Check the status of the cas-shared-gelcorp server pods

Verify that the cas-shared-gelcorp server is running.

kubectl get pods \
            --selector="casoperator.sas.com/server==shared-gelcorp" \
            -o wide
NAME                                       READY   STATUS    RESTARTS   AGE   IP            NODE        NOMINATED NODE   READINESS GATES
sas-cas-server-shared-gelcorp-backup       3/3     Running   0          17h   10.42.0.83    intnode02   <none>           <none>
sas-cas-server-shared-gelcorp-controller   3/3     Running   0          17h   10.42.4.168   intnode04   <none>           <none>
sas-cas-server-shared-gelcorp-worker-0     3/3     Running   0          17h   10.42.2.63    intnode03   <none>           <none>
sas-cas-server-shared-gelcorp-worker-1     3/3     Running   0          17h   10.42.3.115   intnode05   <none>           <none>
  • Open OpenLens and connect to your GEL Kubernetes cluster.

  • Navigate to Workloads –> Pods and then filter on

    • namespace: gelcorp
    • sas-cas-server-shared-gelcorp
  • Sort by Age ascending.

    07_031_Lens_Monitor_Gelcorp_CASServer_0000

IMPORTANT: Leave OpenLens open and do not change the filtering as you will come back to this display during later steps.

Stop the cas-shared-gelcorp server

When you stop a CAS server, all CAS server pod instances are stopped and deleted and no new pod instances are automatically restarted by the operator, regardless of the replicas setting. The administrator will need to execute a start command to create new CAS server pod instances.

  1. Stop the cas-shared-gelcorp server by setting the value of /spec/shutdown to true in the CASDeployment.

    kubectl patch casdeployment \
                  shared-gelcorp \
                  --type=json -p='[{"op": "add", "path": "/spec/shutdown", "value":true}]'
    casdeployment.viya.sas.com/shared-gelcorp patched

    Quickly return to OpenLens and watch the impact of your command on the CAS pods.

    1. All cas-shared-gelcorp pods should show a status of Terminating.

      07_031_Lens_Monitor_Gelcorp_CASServer_0001
    2. Once the pods terminate, no new cas-shared-gelcorp pods should appear because the CAS server has stopped.

      07_031_Lens_Monitor_Gelcorp_CASServer_0002
  2. After a few seconds, run this command to list the cas-shared-gelcorp pods.

    kubectl get pods \
                --selector="casoperator.sas.com/server==shared-gelcorp" \
                -o wide

    You should see…

    No resources found in gelcorp namespace.

The cas-shared-gelcorp server has completely stopped.
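
Note that only the pods are gone; the shared-gelcorp CASDeployment custom resource still exists, which is what allows the server to be started again later. You can confirm this with the command you used earlier.

kubectl get casdeployment shared-gelcorp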

Start the cas-shared-gelcorp server

When the CAS server start command is executed, new instances of the CAS server pods are started and the CAS server is available for use. The CAS server is configured as defined but no previously loaded tables will be available except those tables pre-loaded at startup during session zero processing.

  1. Start the cas-shared-gelcorp server by setting the value of /spec/shutdown to false in the CASDeployment.

    kubectl patch casdeployment \
                  shared-gelcorp \
                  --type=json -p='[{"op": "add", "path": "/spec/shutdown", "value":false}]'
    casdeployment.viya.sas.com/shared-gelcorp patched

    It may take a few minutes for the gelcorp CAS server to fully restart. The following command will notify you when the gelcorp CAS server pods are ready.

    kubectl wait pods \
                 --selector="casoperator.sas.com/server==shared-gelcorp" \
                 --for condition=ready \
                 --timeout 15m

    CAS will take some time to start, so quickly switch back to OpenLens and monitor the impact of your command on the CAS pods.

    You should see that cas-shared-gelcorp pods go from Pending

    07_031_Lens_Monitor_Gelcorp_CASServer_0003

    …to Running with some containers still yellow…

    07_031_Lens_Monitor_Gelcorp_CASServer_0004

    … to Running with all containers green.

    07_031_Lens_Monitor_Gelcorp_CASServer_0005

    Return to your MobaXterm session and you should see these messages in the output informing you that the CAS server is ready.

    pod/sas-cas-server-shared-gelcorp-backup condition met
    pod/sas-cas-server-shared-gelcorp-controller condition met
    pod/sas-cas-server-shared-gelcorp-worker-0 condition met
    pod/sas-cas-server-shared-gelcorp-worker-1 condition met
  2. Now look once more at the status of the cas-shared-gelcorp pods.

    kubectl get pods \
                --selector="casoperator.sas.com/server==shared-gelcorp" \
                -o wide
    NAME                                       READY   STATUS    RESTARTS   AGE     IP            NODE        NOMINATED NODE   READINESS GATES
    sas-cas-server-shared-gelcorp-backup       3/3     Running   0          4m2s    10.42.0.85    intnode02   <none>           <none>
    sas-cas-server-shared-gelcorp-controller   3/3     Running   0          4m2s    10.42.4.200   intnode04   <none>           <none>
    sas-cas-server-shared-gelcorp-worker-0     3/3     Running   0          3m58s   10.42.2.65    intnode03   <none>           <none>
    sas-cas-server-shared-gelcorp-worker-1     3/3     Running   0          3m58s   10.42.3.117   intnode05   <none>           <none>

The cas-shared-gelcorp server is started and all required pods are running.

Restart the cas-shared-gelcorp server

When the administrator restarts a CAS server, all instances of the CAS pods are stopped and then a new instance of each CAS server pod is immediately restarted. The CAS server is restarted as configured but no previously loaded tables are available except those tables pre-loaded at startup during session zero processing.

  1. To restart the cas-shared-gelcorp server, simply delete the CAS server pods. Kubernetes will notice that your deployment no longer has the pods you requested and will automatically start new instances in their place.

    kubectl delete pod \
                   --selector="casoperator.sas.com/server==shared-gelcorp"

    You should see these messages in the output.

    pod "sas-cas-server-shared-gelcorp-backup" deleted
    pod "sas-cas-server-shared-gelcorp-controller" deleted
    pod "sas-cas-server-shared-gelcorp-worker-0" deleted
    pod "sas-cas-server-shared-gelcorp-worker-1" deleted

    Quickly switch back to OpenLens and watch what happens.

    1. You should see all cas-shared-gelcorp pods Terminating.

      07_031_Lens_Monitor_Gelcorp_CASServer_0001
    2. Then you should see new pods automatically start to replace the deleted ones.

      07_031_Lens_Monitor_Gelcorp_CASServer_0003
      07_031_Lens_Monitor_Gelcorp_CASServer_0004
      07_031_Lens_Monitor_Gelcorp_CASServer_0005

    It may take a few minutes for the CAS server to fully restart.

    The following command will notify you when the gelcorp CAS server pods are ready.

    kubectl wait pods \
                 --selector="casoperator.sas.com/server==shared-gelcorp" \
                 --for condition=ready \
                 --timeout 15m

    You should see these messages in the output.

    pod/sas-cas-server-shared-gelcorp-backup condition met
    pod/sas-cas-server-shared-gelcorp-controller condition met
    pod/sas-cas-server-shared-gelcorp-worker-0 condition met
    pod/sas-cas-server-shared-gelcorp-worker-1 condition met
  2. Check the status of the cas-shared-gelcorp pods.

    kubectl get pods \
                --selector="casoperator.sas.com/server==shared-gelcorp" \
                -o wide
    NAME                                       READY   STATUS    RESTARTS   AGE     IP            NODE        NOMINATED NODE   READINESS GATES
    sas-cas-server-shared-gelcorp-backup       3/3     Running   0          2m28s   10.42.3.119   intnode05   <none>           <none>
    sas-cas-server-shared-gelcorp-controller   3/3     Running   0          2m28s   10.42.2.67    intnode03   <none>           <none>
    sas-cas-server-shared-gelcorp-worker-0     3/3     Running   0          2m40s   10.42.4.202   intnode04   <none>           <none>
    sas-cas-server-shared-gelcorp-worker-1     3/3     Running   0          2m28s   10.42.0.87    intnode02   <none>           <none>

The cas-shared-gelcorp server has been successfully restarted.

Lessons learned

  • There is a difference between stopping a CAS server and restarting a CAS server.
  • Stopping a CAS server deletes all CAS server pod instances. New CAS server pod instances will be created only when the CAS Server is started.
  • Restarting a CAS server deletes all CAS server pod instances, and new pod instances are automatically and immediately started in their place.
  • Stopping or restarting a CAS server unloads all in-memory tables.
  • Stopping a CAS server can be useful in a multi-CAS server configuration to free up resources or to quickly remove access to the data loaded in a particular CAS server.
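
If you find yourself stopping, starting, and restarting CAS servers often, the three operations can be wrapped in small shell helpers. This is only a convenience sketch around the exact kubectl commands used in this exercise; the function names are illustrative and are not part of the workshop tooling.

# Convenience wrappers around the commands from this exercise (function names are illustrative only).
cas_stop()    { kubectl patch casdeployment "$1" --type=json \
                        -p='[{"op": "add", "path": "/spec/shutdown", "value":true}]'; }
cas_start()   { kubectl patch casdeployment "$1" --type=json \
                        -p='[{"op": "add", "path": "/spec/shutdown", "value":false}]'; }
cas_restart() { kubectl delete pod --selector="casoperator.sas.com/server==$1"; }

# Example usage against the server from this exercise:
#   cas_stop shared-gelcorp
#   cas_start shared-gelcorp
#   cas_restart shared-gelcorp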

SAS Viya Administration Operations
Lesson 07, Section 2 Exercise: Using State Transfer

Restart a CAS server using state transfer

In this exercise you will use the CAS server state transfer capability to restart the CAS server. This will preserve the sessions, tables, and state of a running CAS server.

Table of contents

Set the namespace

gel_setCurrentNamespace gelcorp

Current cas-shared-gelcorp CAS server pods

Using OpenLens, you can see the current status of the cas-shared-gelcorp server. The cas-shared-gelcorp server is MPP with a backup controller and two workers.

  • Navigate to Workloads –> Pods and then filter on
    • namespace: gelcorp
    • sas-cas-server-shared-gelcorp
07_032_Lens_Monitor_Gelcorp_CASServer_0000

The name of each cas-shared-gelcorp server pod follows the pattern: sas-cas-server-<CASServerInstanceName>-<CASServerNodeType>

If you double-click a cas-shared-gelcorp server pod, you can see detailed information about that pod.

07_032_Lens_Monitor_Gelcorp_CASServer_0001

Because the cas-shared-gelcorp server was created with the state transfer option, a new label is defined for each CAS server pod: casoperator.sas.com/instance-index.

By default, the instance-index is set to 0. This value is not used with the first instance of the CAS server pods; you will see later in this hands-on how it is used.
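
You can also display this label directly with kubectl rather than opening each pod in OpenLens; the -L option simply adds a column for the label you name.

kubectl get pods \
            --selector="casoperator.sas.com/server==shared-gelcorp" \
            -L casoperator.sas.com/instance-index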

Current CAS servers loaded tables

  1. Open SAS Environment Manager, log in as geladm, and assume the SASAdministrators membership.

    gellow_urls | grep "SAS Environment Manager"
  2. Look at the properties of the loaded tables.

    07_032_SASEnvironmentManager_Monitor_CAS_Tables_0000

    All of these tables are loaded on the cas-shared-default server. No table is currently loaded into the cas-shared-gelcorp server.

  3. DO NOT LOG OFF SAS ENVIRONMENT MANAGER. You will return to it later in this hands-on to monitor the Available (loaded) tables again.

Load tables in the cas-shared-gelcorp CAS server

  1. Open SAS Studio, log in as geladm, and assume the SASAdministrators membership.

    gellow_urls | grep "SAS Studio"
  2. In the Explorer panel, navigate to Folder Shortcuts \ Shortcut to My Folder \ My SAS Code, and then open the TestCASServerStateTransfer.sas program.

    07_032_SASStudio_CAS_Session_0000

    You can fold all of the code regions if you want, as in the screenshot below.

    07_032_SASStudio_CAS_Session_0001
  3. Execute the Step1 code to start a CAS session and load tables in the cas-shared-gelcorp server.

    07_032_SASStudio_CAS_Session_0002

    Select the first code region and submit it by clicking the Run button.

  4. In the Log panel, search for the CAS session UUID and note it. You will have to compare this value with the CAS session UUID from the new instance of the cas-shared-gelcorp server later in this hands-on.

    07_032_SASStudio_CAS_Session_0003
  5. Go back to your SAS Environment Manager session. Then, in the Data panel, refresh the Available tab content.

    Look at the properties of the loaded tables.

    07_032_SASEnvironmentManager_Monitor_CAS_Tables_0001

    Two tables were loaded into the cas-shared-gelcorp server, as expected from the SAS program above.

Execute the cas-shared-gelcorp CAS server state transfer and monitor it

  1. If you look at the SAS code in SAS Studio again, Step2 consists of waiting until the cas-shared-gelcorp server state transfer command is executed in MobaXterm.

    07_032_SASStudio_CAS_Session_0004
  2. Go back to your MobaXterm session to run the kubectl command below.

    This command will patch the cas-shared-gelcorp server CASDeployment custom resource to initiate the CAS server state transfer process.

    kubectl patch casdeployment \
                  shared-gelcorp \
                  --type='json' \
                  -p='[{"op": "replace", "path": "/spec/startStateTransfer", "value":true}]'
    casdeployment.viya.sas.com/shared-gelcorp patched
  3. Switch back to OpenLens to monitor the impact of the state transfer process against the cas-shared-gelcorp server pods.

    • Navigate to Workloads –> Pods and then filter on

      • namespace: gelcorp
      • sas-cas-server-shared-gelcorp
    1. State transfer step 1: a new instance of the cas-shared-gelcorp server pods is started in the SAS Viya deployment.

      07_032_Lens_Monitor_Gelcorp_CASServer_0002

      These pod instances are started for the same CAS server, based on the same CASDeployment custom resource. The cas-shared-gelcorp server instance name does not change (it matches the CASDeployment custom resource), but if you look at the cas-shared-gelcorp server pod names, you will see they have changed slightly.

      The name of each new instance of the cas-shared-gelcorp pods now follows the pattern: sas-cas-server-<CASServerInstanceName>-<CASServerInstanceIndex>-<CASServerNodeType>.

      07_032_Lens_Monitor_Gelcorp_CASServer_0003

      This step could take a few minutes depending on the Kubernetes cluster's resource availability. Kubernetes has to determine where the new instance of the CAS server pods can start, based on resource availability and other rules such as workload node placement.

    2. State transfer step 2: when both instances of the cas-shared-gelcorp server pods are running, all loaded data and CAS sessions are transferred to the new instance of the CAS server.

      07_032_Lens_Monitor_Gelcorp_CASServer_0004

      This step could take a while, since all loaded tables and existing CAS session information from the previous instance of the CAS server have to be saved as JSON files in the cas-default-transfer-volume volume. These JSON files are then used to reload the data into the new instance of the CAS server, and all CAS sessions are recreated as they were.

      The data that is transferred is all data that was loaded in the CAS server:

      • The global data
      • The users’ sessions data

      Be careful…

      • Two instances of the CAS server pods have to run simultaneously for a few minutes.
      • The data has to be loaded in both instances of the CAS server for a few minutes.

      Supporting this capability therefore requires a significant amount of extra resources in the Kubernetes cluster.

      SAS Viya administrators have to think carefully about the cost impact before enabling CAS server state transfer.

      The CAS server state transfer capability should be discussed with the architect, the customer, and the deployment team before SAS Viya is deployed.

    3. State transfer step 3: when the transfer of the data and sessions is finished, the cas-shared-gelcorp server is fully restarted (a new instance of the CAS server pods has started and the previous instance is terminated).

      07_032_Lens_Monitor_Gelcorp_CASServer_0005

      The new instance of the cas-shared-gelcorp server is ready to be used by the users.

      In the next steps of this hands-on, you will see the impact on users with respect to loaded tables and existing CAS sessions.

  4. Look at the new instance of the cas-shared-gelcorp server pods in OpenLens.

    07_032_Lens_Monitor_Gelcorp_CASServer_0006

    You can see that the new instance of the cas-shared-gelcorp server is running. Because it is a new instance, a label was updated on each pod of the CAS server: casoperator.sas.com/instance-index.

    The casoperator.sas.com/instance-index label value is incremented by one each time the state transfer is initiated for the CAS server.

Look at the impact of the CAS server state transfer on data and CAS sessions

Go back to your SAS Studio session.

  1. Execute Step3 of the SAS program.

    07_032_SASStudio_CAS_Session_0005

    Select the Step3 code region and submit it by clicking the Run button.

  2. In the Log panel, search for the CAS session UUID and compare it with the one that you noted earlier in this hands-on.

    07_032_SASStudio_CAS_Session_0006

    You can see that the CAS session UUID did not change. The CAS session was transferred to the new instance of the cas-shared-gelcorp server.

  3. Go back to your SAS Environment Manager session. Then, in the Data panel, refresh the Available tab content.

    Look at the properties of the loaded tables.

    07_032_SASEnvironmentManager_Monitor_CAS_Tables_0002

Lessons learned

By enabling the CAS server state transfer capability, it is possible to preserve the sessions, tables, and state of a running CAS server for a new CAS server instance that is started as part of a CAS server update (applying a new configuration, changing the topology, or updating the CAS server pods).

The CAS server state transfer capability requires double the resources that the CAS server normally uses, for a few minutes.

Using this restart process, users are less impacted because there is no downtime.
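
Because two complete sets of CAS server pods run side by side during a transfer, it is worth checking cluster headroom before initiating one. A minimal check, assuming the Kubernetes metrics server is available (it is in this workshop, since kubectl top was used earlier):

# Current node usage...
kubectl top nodes

# ...and the resources the existing CAS server pods are consuming
kubectl top pods \
            --selector="casoperator.sas.com/server==shared-gelcorp"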


SAS Viya Administration Operations
Lesson 07, Section 3 Exercise: Configure CAS Startup

Modify the CAS Server Configuration Files

In this exercise we will review CAS server configuration and configure CAS startup for session zero processing.

Modifying the CAS server configuration files requires a restart of your CAS servers, which results in the termination of all active connections and sessions and the loss of any in-memory data. However, all CAS server configurations and the permstore are persisted.

Table of contents

Set the namespace and authenticate

gel_setCurrentNamespace gelcorp
/opt/pyviyatools/loginviauthinfo.py

Make the HR data and custom formats available to the cas-shared-gelcorp server

The goal of this hands-on is to configure what happens during cas-shared-gelcorp CAS Server session zero processing.

When the cas-shared-gelcorp server starts, we would like for certain HR tables to be loaded and for the custom formats the tables reference to be available to CAS.

Workaround required for the current version of Viya

Without this workaround, the user that owns the cas-shared-gelcorp CAS server process, regardless of its group membership settings, will not be able to access the required data, because secondary group memberships are not defined until the end of the CAS server startup, which is too late for session zero processing.

# Change the permissions on required HR directories and files
sudo chmod o+rx /shared/gelcontent/gelcorp/hr
sudo chmod o+rx /shared/gelcontent/gelcorp/hr/data
sudo chmod o+r /shared/gelcontent/gelcorp/hr/data/*.*
sudo chmod o+rx /shared/gelcontent/gelcorp/hr/formats
sudo chmod o+r /shared/gelcontent/gelcorp/hr/formats/*.*

This adds rx (read/execute) permissions on the directories so that the cas user can access them, and adds r (read) permission on the files so that the cas user can read them.

These file accesses are required for you to be able to test the CAS session-zero script on the cas-shared-gelcorp server.

HOPE THIS BUG WILL BE FIXED SOON.
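
You can quickly confirm that the permission changes took effect before moving on (same paths as the workaround above):

# Verify that 'other' users now have read/execute on the directories and read on the files
ls -ld /shared/gelcontent/gelcorp/hr \
       /shared/gelcontent/gelcorp/hr/data \
       /shared/gelcontent/gelcorp/hr/formats
ls -l /shared/gelcontent/gelcorp/hr/data | head -5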

Make the HR user defined formats available to the cas-shared-gelcorp server

  1. List the current user-defined formats available in the cas-shared-gelcorp server.

    gel_sas_viya --output text \
                 cas format-libraries \
                     list --server cas-shared-gelcorp

    You should see…

    There are no SAS format libraries in the server "cas-shared-gelcorp".
  2. Make the HR user defined formats available to the cas-shared-gelcorp server.

    gel_sas_viya cas format-libraries \
                     create --server cas-shared-gelcorp \
                            --format-library HRFORMATS \
                            --search-order append \
                            --source-path /gelcontent/gelcorp/hr/formats/formats.sas7bcat \
                            --caslib "formats" \
                            --su \
                            --force

    You should see…

    The SAS format library "HRFORMATS" was successfully created and was appended to the end of the SAS format search order.

    This command accesses the HR user-defined formats SAS catalog (formats.sas7bcat) and loads it as a CAS server user format library (fmtLibName) named HRFORMATS. The new CAS user format library is then loaded inside the Formats global CAS library.

  3. Validate using the sas-viya CLI

    gel_sas_viya --output text \
                 cas format-libraries \
                     list --server cas-shared-gelcorp

    You should see…

    Format Library   Present in Format Search Path   Scope     Persisted   Caslib    Table
    HRFORMATS        false                           global    true        FORMATS   HRFORMATS
    HRFORMATS        true                            session   true        FORMATS   HRFORMATS
  4. Let’s look in the cas-shared-gelcorp server permstore to see what happened.

    Since we enabled state transfer for the cas-shared-gelcorp server, the names of the CAS server pods change each time a state transfer is initiated. The pod names contain the instance-index value until the CAS server pods are restarted (deleted and then recreated).

    Because the CAS server pod names are not stable, we have to extract the required pod name using label filtering in order to access a specific CAS server pod.

    The command below returns the current name of the cas-shared-gelcorp server controller pod.

    _CASControllerPodName=$(kubectl get pod \
          --selector "casoperator.sas.com/server==shared-gelcorp,casoperator.sas.com/node-type==controller,casoperator.sas.com/controller-index==0" \
          --no-headers \
       | awk '{printf $1}')
    
    echo ${_CASControllerPodName}
    
    kubectl exec -it ${_CASControllerPodName} \
                 -c sas-cas-server \
                 -- bash -c "cat /cas/permstore/primaryctrl/addfmtlibs_startup.lua"
    --------
    -- Format Library persistence file #1, Version 1.0
    --------
    log.info('----------------------------------------')
    log.info('Lua: Running add_fmt_libs.lua')
    log.info('----------------------------------------')
    s:sessionProp_addFmtLib{caslib="Formats",fmtLibName="HRFORMATS",name="hrformats.sashdat",replace=true,promote=true}

    A Lua file was created with the CAS format library definition. This means that the HR user-defined formats will be reloaded into the cas-shared-gelcorp server each time it is restarted.

  5. Validate using the SAS Environment Manager.

    1. Open SAS Environment Manager, log in as geladm, and assume the SASAdministrators membership.

      gellow_urls | grep "SAS Environment Manager"
    2. Navigate to the User-Defined Formats page and then select cas-shared-gelcorp from the Server: drop down list.

    3. Verify that the HRFORMATS is listed in the Format Library list.

    4. You are now able to see the HRFORMATS formats.

      07_041_SASEnvironmentManager_UDF_0000
    5. Look at the edu format properties.

      07_041_SASEnvironmentManager_UDF_0001

Make HR data available to the cas-shared-gelcorp server

  1. List the current global CAS libraries available to the cas-shared-gelcorp server.

    gel_sas_viya --output text \
                 cas caslibs \
                     list --server cas-shared-gelcorp

    You should see…

    Name                   Source Type   Description                                                                           Scope    Path
    CASUSER(geladm)        PATH          Personal File System Caslib                                                           global   /cas/data/caslibs/casuserlibraries/geladm/
    Formats                PATH          Stores user defined formats.                                                          global   /cas/data/caslibs/formats/
    ModelPerformanceData   PATH          Stores performance data output for the Model Management service.                      global   /cas/data/caslibs/modelMonitorLibrary/
    Models                 PATH          Stores models created by Visual Analytics for use in other analytics or SAS Studio.   global   /cas/data/caslibs/models/
    Public                 PATH          Shared and writeable caslib, accessible to all users.                                 global   /cas/data/caslibs/public/
    Samples                PATH          Stores sample data, supplied by SAS.                                                  global   /cas/data/caslibs/samples/
    SystemData             PATH          Stores application generated data, used for general reporting.                        global   /cas/data/caslibs/sysData/

    There should be no CAS library defined to access the HR department data (/gelcontent/gelcorp/hr/data).

  2. Create an HR CAS library for the cas-shared-gelcorp server.

    gel_sas_viya cas caslibs \
                     create path \
                            --caslib hrdl \
                            --path /gelcontent/gelcorp/hr/data \
                            --server cas-shared-gelcorp \
                            --description "gelcontent_for_HR_department" \
                            --superuser

    You should see…

    The requested caslib "hrdl" has been added successfully.
    
    Caslib Properties
    Name                hrdl
    Server              cas-shared-gelcorp
    Description         gelcontent_for_HR_department
    Source Type         PATH
    Path                /gelcontent/gelcorp/hr/data/
    Scope               global
    
    Caslib Attributes
    active              true
    personal            false
    subDirs             false
  3. Validate the new caslib using the sas-viya command.

    gel_sas_viya --output text \
                 cas caslibs \
                     list --server cas-shared-gelcorp

    You should see…

    Name                   Source Type   Description                                                                           Scope    Path
    CASUSER(geladm)        PATH          Personal File System Caslib                                                           global   /cas/data/caslibs/casuserlibraries/geladm/
    Formats                PATH          Stores user defined formats.                                                          global   /cas/data/caslibs/formats/
    hrdl                   PATH          gelcontent_for_the_HR_department                                                      global   /gelcontent/gelcorp/hr/data/
    ModelPerformanceData   PATH          Stores performance data output for the Model Management service.                      global   /cas/data/caslibs/modelMonitorLibrary/
    Models                 PATH          Stores models created by Visual Analytics for use in other analytics or SAS Studio.   global   /cas/data/caslibs/models/
    Public                 PATH          Shared and writeable caslib, accessible to all users.                                 global   /cas/data/caslibs/public/
    Samples                PATH          Stores sample data, supplied by SAS.                                                  global   /cas/data/caslibs/samples/
    SystemData             PATH          Stores application generated data, used for general reporting.                        global   /cas/data/caslibs/sysData/
  4. Validate the CAS library using SAS Environment Manager.

    1. Open SAS Environment Manager, log in as geladm, and assume the SASAdministrators membership.

      gellow_urls | grep "SAS Environment Manager"
    2. Navigate to the Data page and Select the Data Sources tab.

    3. Select cas-shared-gelcorp as the CAS server.

    4. You should now be able to see the hrdl CAS library.

      07_041_SASEnvironmentManager_CASLib_0000

Inspect CAS startup parameters

First, review the CAS configuration to see the parameters that control the behavior of the CAS server when it starts (session zero). In this task we will inspect the configuration using the sas-viya command-line interface.

  1. Use the configuration plugin to list all cas-shared-gelcorp server configuration instances.

    gel_sas_viya --output text \
                 configuration configurations \
                               list --service cas-shared-gelcorp \
                                    --definition-name sas.cas.instance.config

    Click here to see the output

    Id                                     DefinitionName            Name               Services             IsDefault
    213a06db-e5ac-42cb-a541-8120562b01c3   sas.cas.instance.config   config             cas-shared-gelcorp   true
    8ed88586-5b40-4da4-b86d-c7cd1d6973c4   sas.cas.instance.config   delete             cas-shared-gelcorp   true
    63347f67-7a35-44cc-b78a-94c3c98cd2fa   sas.cas.instance.config   logconfig          cas-shared-gelcorp   true
    cd90210f-a039-44c4-99aa-ff4eb3e25e1f   sas.cas.instance.config   sessionlogconfig   cas-shared-gelcorp   true
    eae686a8-2503-4710-8bfb-07b5b7c4691a   sas.cas.instance.config   settings           cas-shared-gelcorp   true
    1a7bc8e9-d706-451f-ad79-c4a907f15e51   sas.cas.instance.config   startup            cas-shared-gelcorp   true

    You will be able to see these configuration instances later using SAS Environment Manager.

  2. Let’s start by looking at the current startup settings by listing the details of the sas.cas.instance.config:startup configuration. To do this, we need to get the ID of that particular configuration instance so we can pass it to the show command.

    • Get the instance ID for the cas-shared-gelcorp startup configuration. This is basically the same command you just ran with a bit of extra code to strip out just the ID value.

      _CAS_Startup_ConfigInstance_Id=$(gel_sas_viya --output text \
                                                    configuration configurations \
                                                                  list --service cas-shared-gelcorp \
                                                                       --definition-name sas.cas.instance.config \
         | grep "startup" \
         | awk '{printf $1}')
      
      echo ID=${_CAS_Startup_ConfigInstance_Id}
    • Now that we have the instance ID, show the details of the cas-shared-gelcorp startup configuration instance.

      • First, use the sas-viya command:

        gel_sas_viya --output text \
                     configuration configurations \
                                   show --id=${_CAS_Startup_ConfigInstance_Id}
        id                   : 1a7bc8e9-d706-451f-ad79-c4a907f15e51
        metadata.isDefault   : true
        metadata.mediaType   : application/vnd.sas.configuration.config.sas.cas.instance.config+json;version=1
        metadata.services    : [cas-shared-gelcorp]
        name                 : startup
        contents             : -- CAS session-zero startup script extensions.
        --
        -- Lua-formatted SWAT client code
        -- that executes specified actions during session-zero prior to
        -- clients connecting to CAS.
        
        -- s:table_addCaslib{ name="sales", description="Sales data", dataSource={srcType="path"}, path="/data/sales" }

        The contents value contains Lua code that is to be executed during CAS Server startup (session zero). The configuration details are stored in the SAS Infrastructure Data Server (PostgreSQL).

        During CAS server initialization, the Lua code from the contents value is extracted from the configuration and written to the /cas/config/casstartup_usermods.lua file inside the CAS pod. When session zero processing takes place, CAS then reads the /cas/config/casstartup_usermods.lua file and carries out the instructions.

      • To prove that, take a look inside the CAS server controller pod and you should see that the contents of casstartup_usermods.lua match the value from the configuration instance.

        _CASControllerPodName=$(kubectl get pod \
              --selector "casoperator.sas.com/server==shared-gelcorp,casoperator.sas.com/node-type==controller,casoperator.sas.com/controller-index==0" \
              --no-headers \
           | awk '{printf $1}')
        
        echo ${_CASControllerPodName}
        
        kubectl exec -it ${_CASControllerPodName} \
                     -c sas-cas-server \
                     -- bash -c "cat /cas/config/casstartup_usermods.lua"
        -- CAS session-zero startup script extensions.
        --
        -- Lua-formatted SWAT client code
        -- that executes specified actions during session-zero prior to
        -- clients connecting to CAS.
        
        -- s:table_addCaslib{ name="sales", description="Sales data", dataSource={srcType="path"}, path="/data/sales" }
      • You can also use SAS Environment Manager to examine the configuration instance.

        1. Open SAS Environment Manager, log in as geladm, and assume the SASAdministrators membership.

          gellow_urls | grep "SAS Environment Manager"
        2. Navigate to the Configuration page and then select Definitions as the view.

        3. Filter on sas-cas, select sas.cas.instance.config, and click on the Collapse all icon (double arrows to top).

        4. You should now be able to see all of the sas.cas.instance.config definitions for all CAS servers (shared-default and shared-gelcorp).

          07_041_SASEnvironmentManager_Configuration_0000
        5. Expand the cas-shared-gelcorp: startup definition

          07_041_SASEnvironmentManager_Configuration_0001

          You can now see the Lua code that will populate the casstartup_usermods.lua file in the CAS server pod.

  3. Quiz time! Now that you have seen the Lua code many times, what processing will actually take place when the CAS server starts?

    Click here to see the answer

    Nothing!

    All lines are commented out (they start with --).

Modify the CAS server session zero processing to load HR tables

In the previous steps you defined an HR caslib to make HR data accessible to CAS but that does not force the loading of any tables into memory.

In this step, you will modify the cas-shared-gelcorp: startup definition to pre-load two HR tables each time the cas-shared-gelcorp server starts.

  1. Open SAS Environment Manager, log in as geladm, and assume the SASAdministrators membership.

    gellow_urls | grep "SAS Environment Manager"
  2. Navigate to the Configuration page and select Definitions as the view.

  3. Filter on sas-cas, select sas.cas.instance.config, and click on the Collapse all icon (double arrows to top).

  4. You should now be able to see all of the sas.cas.instance.config definitions for all existing CAS servers (shared-default and shared-gelcorp).

    07_041_SASEnvironmentManager_Configuration_0000
  5. Edit the cas-shared-gelcorp: startup definition instance by clicking the pencil icon.

    07_041_SASEnvironmentManager_Configuration_0002
  6. Add the following lines to the contents property’s text field below the existing text and then save your change. These Lua commands instruct CAS to load the HR_SUMMARY and HRDATA tables into memory from the hrdl caslib.

    
    -- Add User Defined Formats permanently and re-loadable
    ------------------------------------------------------
    -- Not required since defined CAS formats libraries are automatically loaded since Stable 2021.1.4
    
    -- Add HR tables to be reloaded at CAS Server start
    ---------------------------------------------------
    ---- Load HR summary table
    s:table_loadTable{caslib="hrdl",
                      casOut={caslib="hrdl",replication=0.0},
                      path="hr_summary.csv",
                      promote=true
                     }
    
    ---- Load HR data table
    s:table_loadTable{caslib="hrdl",
                      casOut={caslib="hrdl",replication=0.0},
                      path="hrdata.sas7bdat",
                      promote=true
                     }
    07_041_SASEnvironmentManager_Configuration_0003
  7. The cas-shared-gelcorp server needs to be restarted for the change to take effect.

    Since we enabled state transfer for the cas-shared-gelcorp server, you now have two choices for restarting the CAS server.

    • Choice 1: initiate the state transfer

      All loaded tables and active CAS sessions will be kept.

      The casoperator.sas.com/instance-index label for all pods of the CAS server will be incremented by 1.

      kubectl patch casdeployment \
              shared-gelcorp \
              --type='json' -p='[{"op": "replace", "path": "/spec/startStateTransfer", "value":true}]'
      sleep 60s
      kubectl wait pods \
              --selector="casoperator.sas.com/server==shared-gelcorp" \
              --for condition=ready \
              --timeout 15m
    • Choice 2: delete the CAS server pods

      All loaded tables and active CAS sessions will be lost.

      The casoperator.sas.com/instance-index label for all pods of the CAS server will be reset to 0.

      kubectl delete pod \
              --selector="casoperator.sas.com/server==shared-gelcorp"
      sleep 60s
      kubectl wait pods \
              --selector="casoperator.sas.com/server==shared-gelcorp" \
              --for condition=ready \
              --timeout 15m

    While you are waiting, switch to OpenLens and monitor the CAS pod activity.

    1. Open OpenLens, connect to the GEL Kubernetes cluster, navigate to Workloads/Pods, and then filter on:

      • namespace: gelcorp
      • sas-cas-server-shared-gelcorp
    2. As you saw in the last exercise, all cas-shared-gelcorp pods should terminate.

      07_041_Lens_Monitor_Gelcorp_CASServer_0000
    3. Then you should see the cas-shared-gelcorp pods restart.

      07_041_Lens_Monitor_Gelcorp_CASServer_0001
      07_041_Lens_Monitor_Gelcorp_CASServer_0002
    4. When all containers are green, the cas-shared-gelcorp server is started and ready for you to use.

      07_041_Lens_Monitor_Gelcorp_CASServer_0003
  8. Once the server is ready, verify that the HR tables have been loaded into memory.

    1. Using SAS Environment Manager

      Open SAS Environment Manager, log in as geladm, and assume the SASAdministrators membership.

      gellow_urls | grep "SAS Environment Manager"

      Navigate to the Data page.

      • On the Available tab, which shows tables loaded in memory, you should see that the two HR tables, HR_SUMMARY and HRDATA, are loaded.

        07_041_SASEnvironmentManager_Data_0000
      • Click the HRDATA table name to display its details, noticing that the custom format EDU. has been applied to the Education variable, which is a double.

        07_041_SASEnvironmentManager_Data_0001
      • Switch to the Sample Data tab and verify that the EDU. custom format has been applied to the Education values.

        07_041_SASEnvironmentManager_Data_0002
    2. Using sas-viya.

      1. Look at the cas-shared-gelcorp: startup configuration instance.

        • As you did before, get the cas-shared-gelcorp server startup configuration instance Id.

          _CAS_Startup_ConfigInstance_Id=$(gel_sas_viya --output text \
                                                        configuration configurations \
                                                                      list --service cas-shared-gelcorp \
                                                                           --definition-name sas.cas.instance.config \
             | grep "startup" \
             | awk '{printf $1}')
          
          echo ID=${_CAS_Startup_ConfigInstance_Id}
        • Show details of the cas-shared-gelcorp server startup configuration instance.

          gel_sas_viya --output text \
                       configuration configurations \
                                     show --id=${_CAS_Startup_ConfigInstance_Id}
        • You should see…

          id                   : 3b957407-eaf8-4a91-9ca1-99c4fb95790e
          metadata.isDefault   : false
          metadata.mediaType   : application/vnd.sas.configuration.config.sas.cas.instance.config+json;version=1
          metadata.services    : [cas-shared-gelcorp]
          name                 : startup
          contents             : -- CAS session-zero startup script extensions.
          --
          -- Lua-formatted SWAT client code
          -- that executes specified actions during session-zero prior to
          -- clients connecting to CAS.
          
          -- s:table_addCaslib{ name="sales", description="Sales data", dataSource={srcType="path"}, path="/data/sales" }
          
          -- Add User Defined Formats permanently and re-loadable
          ------------------------------------------------------
          -- Not required since defined CAS formats libraries are automatically loaded since Stable 2021.1.4
          
          -- Add HR tables to be reloaded at CAS Server start
          ---------------------------------------------------
          ---- Load HR summary table
          s:table_loadTable{caslib="hrdl",
                            casOut={caslib="hrdl",replication=0.0},
                            path="hr_summary.csv",
                            promote=true
                           }
          
          ---- Load HR data table
          s:table_loadTable{caslib="hrdl",
                            casOut={caslib="hrdl",replication=0.0},
                            path="hrdata.sas7bdat",
                            promote=true
                           }
      2. Verify that the HRFORMATS format library is in the list of available format libraries.

        gel_sas_viya --output text \
                     cas format-libraries \
                         list --server cas-shared-gelcorp

        You should see:

        Format Library   Present in Format Search Path   Scope     Persisted   Caslib    Table
        HRFORMATS        false                           global    true        FORMATS   HRFORMATS
        HRFORMATS        true                            session   true        FORMATS   HRFORMATS
      3. List the formats from the HRFORMATS CAS format library.

        gel_sas_viya --output text \
                     cas format-libraries \
                         show-formats --format-library HRFORMATS \
                                      --server cas-shared-gelcorp
        Format Name   Version
        edu           1
        perf          1
        rate          1
        work          1
      4. List the files from the hrdl global CAS library and see which ones were automatically loaded when the CAS server restarted.

        gel_sas_viya --output text \
                     cas tables \
                         list --server cas-shared-gelcorp \
                              --caslib hrdl
        Name                 Source Table Name             Scope    State
        EMPLOYEE_NEW         employee_new.sas7bdat         None     unloaded
        HR_SUMMARY           hr_summary.csv                global   loaded
        HRDATA               hrdata.sas7bdat               global   loaded
        PERFORMANCE_LOOKUP   performance_lookup.sas7bdat   None     unloaded
    3. Using kubectl, look at the cas-shared-gelcorp server casstartup_usermods.lua file content.

      _CASControllerPodName=$(kubectl get pod \
            --selector "casoperator.sas.com/server==shared-gelcorp,casoperator.sas.com/node-type==controller,casoperator.sas.com/controller-index==0" \
            --no-headers \
         | awk '{printf $1}')
      
      echo ${_CASControllerPodName}
      
      kubectl exec -it ${_CASControllerPodName} \
                   -c sas-cas-server \
                   -- bash -c "cat /cas/config/casstartup_usermods.lua"

      Click here to see the output

      -- CAS session-zero startup script extensions.
      --
      -- Lua-formatted SWAT client code
      -- that executes specified actions during session-zero prior to
      -- clients connecting to CAS.
      
      -- s:table_addCaslib{ name="sales", description="Sales data", dataSource={srcType="path"}, path="/data/sales" }
      
      -- Add User Defined Formats permanently and reloadable
      ------------------------------------------------------
      -- Not required since defined CAS formats libraries are automatically loaded since Stable 2021.1.4
      
      -- Add HR tables to be reloaded at CAS Server start
      ---------------------------------------------------
      ---- Load HR summary table
      s:table_loadTable{caslib="hrdl",
                        casOut={caslib="hrdl",replication=0.0},
                        path="hr_summary.csv",
                        promote=true
                       }
      
      ---- Load HR data table
      s:table_loadTable{caslib="hrdl",
                        casOut={caslib="hrdl",replication=0.0},
                        path="hrdata.sas7bdat",
                        promote=true
                       }

Lessons learned

  • CAS server usermods files must be managed using either SAS Environment Manager or the sas-viya CLI.
  • The CAS server usermods files should not be modified directly inside the CAS server pods cas container.
  • When pre-loading data in session zero processing, you must
    • Make sure you have a caslib defined for the data
    • Make sure user-defined formats used by the tables are available to CAS
    • Add Lua code to the sas.cas.instance.config:startup configuration to load the tables.
  • Modifying sas.cas.instance.config definitions requires restarting the CAS server to pick up the changes. You do not have to update the entire Viya deployment though.

SAS Viya Administration Operations
Lesson 07, Section 4 Exercise: Change Topology of an Additional CAS Server

Managing CAS Server Topology for Servers You Added

The steps to modify the topology of CAS servers you add to the deployment differ from the steps to modify the default CAS server.

In this hands-on you will learn how to modify the topology of CAS servers you have added to the deployment. You will re-run the create-cas-server.sh script with different parameters to generate a new set of manifests for the cas-shared-gelcorp server and modify its current topology. You will also confirm that the configuration changes you made earlier are preserved when you modify the topology, as long as the CAS instance name is kept the same.

The topology change technique you will follow here is different from the one recommended in the SAS Viya documentation, and it does not work for the cas-shared-default server. The technique for modifying the default CAS server’s topology is covered in the 07_052_CAS_Manage_Topology_Default_Optional.md hands-on.

After making the topology change, you will look at its impact on the cas-shared-gelcorp server’s content.

Table of contents

Set the namespace

gel_setCurrentNamespace gelcorp
/opt/pyviyatools/loginviauthinfo.py

The current topology of cas-shared-gelcorp server

Recall that in previous exercises you:

  • Created cas-shared-gelcorp as an MPP CAS Server with
    • a CAS controller
    • a CAS backup controller
    • two CAS workers
  • Configured cas-shared-gelcorp session zero to load specific HR tables
  • Relocated CAS_DISK_CACHE for cas-shared-gelcorp server from an emptyDir volume to a hostPath volume.
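
Before changing anything, you can confirm this topology from the pod labels. The node-type and controller-index labels below are the same ones used for pod selection earlier in this lesson; -L simply adds them as columns in the output.

kubectl get pods \
            --selector="casoperator.sas.com/server==shared-gelcorp" \
            -L casoperator.sas.com/node-type,casoperator.sas.com/controller-index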

Modify the cas-shared-gelcorp server topology

Now let’s modify the topology of the cas-shared-gelcorp server so that it has one CAS controller and four CAS workers, and no longer has a backup controller.

Because you added the cas-shared-gelcorp server using the create-cas-server.sh script, you can simply re-run the script with the parameter changes needed to define the new topology you want. Because we are updating an existing CAS server, we are careful to keep the --instance gelcorp parameter the same as when we created the server.

  1. Re-generate the cas-shared-gelcorp server manifests by re-running create-cas-server.sh with parameters to increase the number of workers to four and to remove the backup controller.

    echo "y" | bash ~/project/deploy/${current_namespace}/sas-bases/examples/cas/create/create-cas-server.sh \
       --instance gelcorp \
       --output ~/project/deploy/${current_namespace}/site-config \
       --workers 4 \
       --backup 0 \
       --transfer 1
    Fri May 13 12:10:25 EDT 2022 - instance = gelcorp
    Fri May 13 12:10:25 EDT 2022 - tenant =
    Fri May 13 12:10:25 EDT 2022 - output = /home/cloud-user/project/deploy/gelcorp/site-config
    
    make: *** No rule to make target `install'.  Stop.
    output directory does not exist: /home/cloud-user/project/deploy/gelcorp/site-config/
    creating directory: /home/cloud-user/project/deploy/gelcorp/site-config/
    Generating artifacts...
    100.0% [=======================================================================]
    |-cas-shared-gelcorp (root directory)
      |-cas-shared-gelcorp-cr.yaml
      |-kustomization.yaml
      |-shared-gelcorp-pvc.yaml
      |-annotations.yaml
      |-backup-agent-patch.yaml
      |-cas-consul-sidecar.yaml
      |-cas-fsgroup-security-context.yaml
      |-cas-sssd-sidecar.yaml
      |-kustomizeconfig.yaml
      |-provider-pvc.yaml
      |-transfer-pvc.yaml
      |-enable-binary-port.yaml
      |-enable-http-port.yaml
      |-configmaps.yaml
      |-state-transfer.yaml
      |-node-affinity.yaml
    
    create-cas-server.sh complete!
  2. Keep a copy of the current manifest.yaml file.

    cp -p /tmp/${current_namespace}/deploy_work/deploy/manifest.yaml /tmp/${current_namespace}/manifest_07-051-01.yaml
  3. Run the sas-orchestration deploy command.

    cd ~/project/deploy
    rm -rf /tmp/${current_namespace}/deploy_work/*
    source ~/project/deploy/.${current_namespace}_vars
    
    docker run --rm \
               -v ${PWD}/license:/license \
               -v ${PWD}/${current_namespace}:/${current_namespace} \
               -v ${HOME}/.kube/config_portable:/kube/config \
               -v /tmp/${current_namespace}/deploy_work:/work \
               -e KUBECONFIG=/kube/config \
               --user $(id -u):$(id -g) \
           sas-orchestration \
              deploy \
                 --namespace ${current_namespace} \
                 --deployment-data /license/SASViyaV4_${_order}_certs.zip \
                 --license /license/SASViyaV4_${_order}_license.jwt \
                 --user-content /${current_namespace} \
                 --cadence-name ${_cadenceName} \
                 --cadence-version ${_cadenceVersion} \
                 --image-registry ${_viyaMirrorReg}

    When the deploy command completes successfully, the final message should say The deploy command completed successfully, as shown in the log snippet below.

    The deploy command started
    
    [...]
    
    The deploy command completed successfully

    If the sas-orchestration deploy command fails, check out the steps in 99_Additional_Topics/03_Troubleshoot_SAS_Orchestration_Deploy to help you troubleshoot the problem.

  4. It may take several more minutes for the cas-shared-gelcorp server to fully initialize. The following command will notify you when the CAS Server is ready.

    kubectl wait pods \
                 --selector="casoperator.sas.com/server==shared-gelcorp" \
                 --for condition=ready \
                 --timeout 15m

    You should see these messages in the output.

    pod/sas-cas-server-shared-gelcorp-backup condition met
    pod/sas-cas-server-shared-gelcorp-controller condition met
    pod/sas-cas-server-shared-gelcorp-worker-0 condition met
    pod/sas-cas-server-shared-gelcorp-worker-1 condition met
    pod/sas-cas-server-shared-gelcorp-worker-2 condition met
    pod/sas-cas-server-shared-gelcorp-worker-3 condition met

    Now take a look at the cas-shared-gelcorp pods. Does anything look strange to you?

    kubectl get pods \
                --selector="casoperator.sas.com/server==shared-gelcorp" \
                -o wide

    You should see something like…

    NAME                                         READY   STATUS    RESTARTS   AGE    IP            NODE        NOMINATED NODE   READINESS GATES
    sas-cas-server-shared-gelcorp-3-backup       3/3     Running   0          27m    10.42.0.125   intnode03   <none>           <none>
    sas-cas-server-shared-gelcorp-3-controller   3/3     Running   0          27m    10.42.4.72    intnode05   <none>           <none>
    sas-cas-server-shared-gelcorp-3-worker-0     3/3     Running   0          27m    10.42.2.97    intnode02   <none>           <none>
    sas-cas-server-shared-gelcorp-3-worker-1     3/3     Running   0          27m    10.42.3.79    intnode04   <none>           <none>
    sas-cas-server-shared-gelcorp-3-worker-2     3/3     Running   0          6m7s   10.42.1.62    intnode01   <none>           <none>
    sas-cas-server-shared-gelcorp-3-worker-3     3/3     Running   0          6m7s   10.42.0.129   intnode03   <none>           <none>

    The cas-shared-gelcorp server has started but it does not have the topology you may have expected. The additional CAS workers have been added (+2) but the backup controller still exists even though you modified the topology to remove it.

    To fully implement the CAS server topology changes you must now restart the CAS server.

    Since state transfer is enabled for the cas-shared-gelcorp server, you now have two choices for restarting the CAS server.

    • Choice 1: initiate the state transfer

      All loaded tables and active CAS sessions will be kept.

      The casoperator.sas.com/instance-index label for all pods of the CAS server will be incremented by 1.

      kubectl patch casdeployment \
                    shared-gelcorp \
                    --type='json' -p='[{"op": "replace", "path": "/spec/startStateTransfer", "value":true}]'
      sleep 60s
      kubectl wait pods \
                   --selector="casoperator.sas.com/server==shared-gelcorp" \
                   --for condition=ready \
                   --timeout 15m
    • Choice 2: delete the CAS server pods

      All loaded tables and active CAS sessions will be lost.

      The casoperator.sas.com/instance-index label for all pods of the CAS server will be reset to 0.

      kubectl delete pod \
                     --selector="casoperator.sas.com/server==shared-gelcorp"
      sleep 60s
      kubectl wait pods \
                   --selector="casoperator.sas.com/server==shared-gelcorp" \
                   --for condition=ready \
                   --timeout 15m

    You should see something like this.

    pod/sas-cas-server-shared-gelcorp-controller condition met
    pod/sas-cas-server-shared-gelcorp-worker-0 condition met
    pod/sas-cas-server-shared-gelcorp-worker-1 condition met
    pod/sas-cas-server-shared-gelcorp-worker-2 condition met
    pod/sas-cas-server-shared-gelcorp-worker-3 condition met

    The cas-shared-gelcorp server now has the topology you configured: one controller and four workers.
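    As an optional cross-check, you can display the casoperator.sas.com/instance-index label mentioned above: after a state transfer it is incremented, while after a pod deletion it is reset to 0. The --label-columns option of kubectl get simply adds the label value as an extra column.

    kubectl get pods \
            --selector="casoperator.sas.com/server==shared-gelcorp" \
            --label-columns casoperator.sas.com/instance-index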

Look at the impact of the topology change on cas-shared-gelcorp server configuration

Using the kubectl CLI

  1. List the configuration files on the cas-shared-gelcorp server controller and note when they were created.

    _CASControllerPodName=$(kubectl get pod \
          --selector "casoperator.sas.com/server==shared-gelcorp,casoperator.sas.com/node-type==controller,casoperator.sas.com/controller-index==0" \
          --no-headers \
       | awk '{printf $1}')
    
    echo ${_CASControllerPodName}
    
    kubectl exec -it ${_CASControllerPodName} \
                 -c sas-cas-server \
                 -- bash -c "ls -al /cas/config/"

    Click here to see the output

    total 184
    drwxrwsrwx 6 root sas   4096 Jun  1 16:40 .
    drwxr-xr-x 1 root root    21 Jun  1 16:39 ..
    -rw-r--r-- 1 1002 sas   1549 Jun  1 16:40 casconfig_container.lua
    -rw-r--r-- 1 1002 sas    545 Jun  1 16:39 casconfig_deployment.lua
    -rw-r--r-- 1 1002 sas  11443 Jun  1 16:39 casconfig.lua
    -rw-r--r-- 1 1002 sas    287 Jun  1 16:34 casconfig_usermods.lua
    -rw-r--r-- 1 1002 sas    711 Jun  1 16:40 cas_container.settings
    -rwx------ 1 1002 sas     65 Jun  1 16:40 cas_key
    -rw-r--r-- 1 1002 sas      5 Jun  1 16:40 cas.pid
    -rw-r--r-- 1 1002 sas   1282 Jun  1 16:39 cas.settings
    -rw-r--r-- 1 1002 sas    855 Jun  1 16:39 casstartup.lua
    -rw-r--r-- 1 1002 sas   1124 Jun  1 16:34 casstartup_usermods.lua
    -rw-r--r-- 1 1002 sas    163 Jun  1 16:40 cas_usermods_bootstrap.log
    -rw-r--r-- 1 1002 sas    217 Jun  1 16:34 cas_usermods.settings
    -rw-r--r-- 1 1002 sas   1744 Jun  1 16:39 cas.yml
    drwxr-sr-x 2 1002 sas      6 Jun  1 16:39 conf.d
    -rw-r--r-- 1 1002 sas      8 Jun  1 16:39 .configrc
    -rw-r--r-- 1 1002 sas  41602 Jun  1 16:39 confLog.json
    -rwxr-x--- 1 1002 sas   5531 Jun  1 16:40 crsplanning-qp-logback.xml
    -rw-r--r-- 1 1002 sas     98 Jun  1 16:34 kv.log
    -rwxr-xr-x 1 1002 sas   3024 Jun  1 16:39 launchconfig
    -rw-r--r-- 1 1002 sas   1296 Jun  1 16:34 logconfig.session.xml
    -rw-r--r-- 1 1002 sas   3814 Jun  1 16:39 logconfig.trace.xml
    -rw-r--r-- 1 1002 sas   3210 Jun  1 16:34 logconfig.xml
    -rw-r--r-- 1 1002 sas   1354 Jun  1 16:39 node.lua
    -rw-r--r-- 1 1002 sas   2875 Jun  1 16:40 node_usermods.lua
    -rwxr-xr-x 1 1002 sas  12825 Jun  1 16:39 perms.xml
    -rw-r--r-- 1 1002 sas    456 Jun  1 16:40 sas-cas-container
    -rw-r--r-- 1 1002 sas   8089 Jun  1 16:40 sas-configuration-configurations-v1.json
    -rw-r--r-- 1 1002 sas   1216 Jun  1 16:40 sas-configuration-definitions-v1.json
    drwxr-sr-x 3 1002 sas     17 Jun  1 16:40 share
    drwxr-sr-x 2 1002 sas    171 Jun  1 16:40 start.d
    drwxr-sr-x 2 1002 sas     47 Jun  1 16:40 tokens
    -rw-r--r-- 1 1002 sas     91 Jun  1 16:40 usermodsdelete.sh

    You can see that all of the CAS server configuration files have been regenerated.

  2. List the casconfig_container.lua file, which contains the CAS environment variables. Was the previous configuration of CAS_DISK_CACHE retained even though you changed the topology?

    _CASControllerPodName=$(kubectl get pod \
          --selector "casoperator.sas.com/server==shared-gelcorp,casoperator.sas.com/node-type==controller,casoperator.sas.com/controller-index==0" \
          --no-headers \
       | awk '{printf $1}')
    
    echo ${_CASControllerPodName}
    
    kubectl exec -it ${_CASControllerPodName} \
                 -c sas-cas-server \
                 -- bash -c "cat /cas/config/casconfig_container.lua"
    -- Inserting section capturing variables set on container creation.
    cas.dqlocale = 'ENUSA'
    cas.hostknownby = 'controller.sas-cas-server-shared-gelcorp.gelcorp'
    cas.initialworkers = 4
    cas.dqsetuploc = 'QKB CI 33'
    cas.elastic = 'true'
    cas.gcport = 5571
    cas.servicesbaseurl = 'https://gelcorp.***********.race.sas.com'
    cas.machinelist = '/dev/null'
    cas.userloc = '/cas/data/caslibs/casuserlibraries/%USER'
    cas.permstore = '/cas/permstore'
    cas.mode = 'mpp'
    cas.initialbackups = 0
    cas.keyfile = '/cas/config/cas_key'
    cas.colocation = 'none'
    env.CONSUL_HTTP_ADDR = 'https://localhost:8500'
    env.CAS_VIRTUAL_HOST = 'controller.sas-cas-server-shared-gelcorp.gelcorp'
    env.CAS_DEPLOYED_LOGCFGLOC = '/opt/sas/viya/config/etc/cas/default/logconfig.xml'
    env.CASDATADIR_CASLIBS = '/cas/data/caslibs'
    env.CASDATADIR = '/cas/data'
    env.CAS_VIRTUAL_PORT = 8777
    env.CONSUL_CACERT = '/security/trustedcerts.pem'
    env.CAS_VIRTUAL_PATH = '/cas-shared-gelcorp-http'
    env.CAS_K8S_SERVICE_NAME = 'sas-cas-server-shared-gelcorp-client'
    env.CAS_USE_CONSUL = 'true'
    env.CONSUL_NAME = 'cas-shared-gelcorp'
    env.CASDEPLOYMENT_SPEC_ALLOWLIST_APPEND = '/cas/data/caslibs:/gelcontent:/mnt/gelcontent/'
    env.CASPERMSTORE = '/cas/permstore'
    env.CAS_VIRTUAL_PROTO = 'http'
    env.CASDATADIR_APPS = '/cas/data/apps'
    env.CAS_DISK_CACHE = '/casdiskcache/cdc01:/casdiskcache/cdc02:/casdiskcache/cdc03:/casdiskcache/cdc04'
    env.CAS_INSTANCE_MODE = 'shared'
    env.CAS_LICENSE = '/cas/license/license.sas'
    env.CLIENT_ID = 'cas-shared-gelcorp'
    env.CLIENT_SECRET_LOC = '/cas/config/tokens/client.secret'
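    If you only want the lines that answer the questions above, you can re-run the same kubectl exec with a grep instead of listing the whole file; the settings of interest are cas.initialworkers, cas.initialbackups, and env.CAS_DISK_CACHE.

    kubectl exec ${_CASControllerPodName} \
                 -c sas-cas-server \
                 -- bash -c "grep -E 'initialworkers|initialbackups|CAS_DISK_CACHE' /cas/config/casconfig_container.lua"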
  3. Now let’s make sure the CAS startup configuration for session zero processing is still in place.

    • Using the same code you ran in an earlier exercise, get the instance ID for the cas-shared-gelcorp server startup configuration.

      _CAS_Startup_ConfigInstance_Id=$(gel_sas_viya --output text \
                                                    configuration configurations \
                                                                  list --service cas-shared-gelcorp \
                                                                       --definition-name sas.cas.instance.config \
         | grep "startup" \
         | awk '{printf $1}')
      
      echo ID=${_CAS_Startup_ConfigInstance_Id}
    • Now that we have the instance ID, show the details of the cas-shared-gelcorp server startup configuration instance.

      gel_sas_viya --output text \
                   configuration configurations \
                                 show --id=${_CAS_Startup_ConfigInstance_Id}

      You should see that your previous configuration change to load the HR tables is still in place.

      id                   : 068e19bc-4819-44a1-aef7-493f26fceaae
      metadata.isDefault   : false
      metadata.mediaType   : application/vnd.sas.configuration.config.sas.cas.instance.config+json;version=1
      metadata.services    : [cas-shared-gelcorp]
      name                 : startup
      contents             : -- CAS session-zero startup script extensions.
      --
      -- Lua-formatted SWAT client code
      -- that executes specified actions during session-zero prior to
      -- clients connecting to CAS.
      
      -- s:table_addCaslib{ name="sales", description="Sales data", dataSource={srcType="path"}, path="/data/sales" }
      -- Add User Defined Formats permanently and re-loadable
      ------------------------------------------------------
      -- Not required since defined CAS formats libraries are automatically loaded since Stable 2021.1.4
      
      -- Add HR tables to be reloaded at CAS Server start
      ---------------------------------------------------
      ---- Load HR summary table
      s:table_loadTable{caslib="hrdl",
                        casOut={caslib="hrdl",replication=0.0},
                        path="hr_summary.csv",
                        promote=true
                     }
      
      ---- Load HR data table
      s:table_loadTable{caslib="hrdl",
                        casOut={caslib="hrdl",replication=0.0},
                        path="hrdata.sas7bdat",
                        promote=true
                     }
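      If you only need to confirm that the two HR table loads are still part of the startup configuration, you can pipe the same show command through grep. This assumes the _CAS_Startup_ConfigInstance_Id variable from the previous step is still set in your shell.

      gel_sas_viya --output text \
                   configuration configurations \
                                 show --id=${_CAS_Startup_ConfigInstance_Id} \
         | grep -E "hr_summary.csv|hrdata.sas7bdat"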

Using SAS Environment Manager

  1. Open SAS Environment Manager, log in as geladm, and assume the SASAdministrators membership.

    gellow_urls | grep "SAS Environment Manager"
  2. Verify the CAS server topology.

    • Open the Servers page.

    • Right-click on the cas-shared-gelcorp server and select the Configuration option.

    • Switch to the Nodes tab. You can see that cas-shared-gelcorp server has a primary controller and four workers.

      07_051_SASEnvironmentManager_Monitor_Gelcorp_CASServer_0000
  3. Verify the CAS libraries and tables.

    All global CAS libraries remain even after you restart the cas-shared-gelcorp server.

    • Open the Data page.

    • On the Data Sources tab expand the cas-shared-gelcorp connection to display its CAS libraries.

    • Verify that you still see the hrdl caslib created in a previous exercise.

      07_051_SASEnvironmentManager_Monitor_Gelcorp_CASServer_0001
    • Expand the hrdl CAS library to display its tables.

      After you restarted the cas-shared-gelcorp server, the data files (the data sources) remained available, and some tables are loaded into memory because of the session-zero configuration defined in a previous hands-on.

      07_051_SASEnvironmentManager_Monitor_Gelcorp_CASServer_0003

Lessons learned

  • It is easy to change the topology of a non-default CAS server.

    1. Re-run create-cas-server.sh with different --workers and --backup values, but keep the same --instance and --output values. This overwrites the CAS server's existing manifests.

    2. Regenerate and apply the SASDeployment custom resource.

    3. Restart the CAS server. In-memory data is lost unless the CAS server's session-zero configuration reloads it.

  • Changing the topology does not result in losing any preexisting configuration, such as:

    • User Defined Formats
    • Global CAS Libraries
    • Session zero processing
    • CAS_DISK_CACHE relocation

SAS Viya Administration Operations
Lesson 07, Section 4 Exercise: Change Topology of Default CAS Server

Managing cas-shared-default server topology - OPTIONAL

In this exercise you will:

  • Review the cas-shared-default server topology, configuration, and content
  • Convert cas-shared-default from an SMP server to an MPP server by adding CAS worker nodes.
  • Add a backup controller to the cas-shared-default MPP server.
  • Look at the impact of the topology changes on the cas-shared-default server content.

Table of contents

Set the namespace

gel_setCurrentNamespace gelcorp

The current cas-shared-default server

From the initial Viya deployment, the cas-shared-default server is SMP.

  1. Review your current CAS server configuration

    1. Access some of your CAS server metadata using the kubectl CLI

      kubectl describe pods \
                       sas-cas-server-default-controller \
         | grep " casoperator." \
         | awk -F"/" '{print $2}'

      Note that currently the cas-shared-default server is SMP.

      Click here to see the output

      cas-cfg-mode=smp
      cas-env-consul-name=cas-shared-default
      controller-active=1
      controller-index=0
      node-type=controller
      server=default
      service-name=primary
    2. List your CAS server pods.

      kubectl get pods \
                  --selector="casoperator.sas.com/server==default"

      Click here to see the output

       NAME                                READY   STATUS    RESTARTS   AGE
       sas-cas-server-default-controller   3/3     Running   0          2m19s
  2. Using SAS Environment Manager

    1. Open SAS Environment Manager, log in as geladm, and assume the SASAdministrators membership.

      gellow_urls | grep "SAS Environment Manager"
    2. Navigate to the Servers page, right-click the cas-shared-default server, and then click Configuration.

      07_052_SASEnvironmentManager_Monitor_Default_CASServer_0000
    3. Navigate to the Nodes tab. You can now see the current cas-shared-default server configuration: SMP (a single CAS controller, no workers).

      07_052_SASEnvironmentManager_Monitor_Default_CASServer_0001
  3. Using OpenLens

    1. Open OpenLens and connect to your GEL Kubernetes cluster.

    2. Navigate to Workloads –> Pods and then filter on

      • namespace: gelcorp
      • sas-cas-server-default
      07_052_Lens_Monitor_Default_CASServer_0000

Convert cas-shared-default server from SMP to MPP

To convert an SMP CAS server to MPP, you modify the CAS server deployment with a patchTransformer that changes the number of CAS workers from 0 to the desired number.

The number of workers is specified using the sas-bases/examples/cas/configure/cas-manage-workers.yaml manifest.

  1. View the cas-manage-workers.yaml file

    cat ~/project/deploy/${current_namespace}/sas-bases/examples/cas/configure/cas-manage-workers.yaml

    Click here to see the cas-manage-workers.yaml content

    # This block of code is for specifying the number of workers in an MPP
    # deployment. Do not use this block for SMP deployments. The default value is 2
    ---
    apiVersion: builtin
    kind: PatchTransformer
    metadata:
      name: cas-manage-workers
    patch: |-
       - op: replace
         path: /spec/workers
         value:
           {{ NUMBER-OF-WORKERS }}
    target:
      group: viya.sas.com
      kind: CASDeployment
      # Uncomment this to apply to all CAS servers:
      name: .*
      # Uncomment this to apply to one particular named CAS server:
      #name: {{ NAME-OF-SERVER }}
      # Uncomment this to apply to the default CAS server:
      #labelSelector: "sas.com/cas-server-default"
      version: v1alpha1
  2. Create a cas-manage-workers-cas-shared-default.yaml file with two workers in the site-config directory.

    1. Copy the cas-manage-workers.yaml manifest in the project site-config directory

      cp -p ~/project/deploy/${current_namespace}/sas-bases/examples/cas/configure/cas-manage-workers.yaml ~/project/deploy/${current_namespace}/site-config/cas-manage-workers-cas-shared-default.yaml
      chmod 664  ~/project/deploy/${current_namespace}/site-config/cas-manage-workers-cas-shared-default.yaml
    2. Change the Target filtering in the cas-manage-workers-cas-shared-default.yaml file

      Only the cas-shared-default server has to be modified. By default, the provided manifest targets all CAS servers, so you must modify it to apply the topology change only to the cas-shared-default server.

      sed -i 's/name: \.\*/\#name: \.\*/' ~/project/deploy/${current_namespace}/site-config/cas-manage-workers-cas-shared-default.yaml
      sed -i 's/\#labelSelector: /labelSelector: /g' ~/project/deploy/${current_namespace}/site-config/cas-manage-workers-cas-shared-default.yaml
    3. Change the number of workers.

      _numberOfWorkers=2
      sed -i "/value:/{n;s/.*/       ${_numberOfWorkers}/}" ~/project/deploy/${current_namespace}/site-config/cas-manage-workers-cas-shared-default.yaml

    Click here to see the cas-manage-workers-cas-shared-default.yaml content

    cat ~/project/deploy/${current_namespace}/site-config/cas-manage-workers-cas-shared-default.yaml
    # This block of code is for specifying the number of workers in an MPP
    # deployment. Do not use this block for SMP deployments. The default value is 2
    ---
    apiVersion: builtin
    kind: PatchTransformer
    metadata:
      name: cas-manage-workers
    patch: |-
       - op: replace
         path: /spec/workers
         value:
           2
    target:
      group: viya.sas.com
      kind: CASDeployment
      # Uncomment this to apply to all CAS servers:
      #name: .*
      # Uncomment this to apply to one particular named CAS server:
      #name: {{ NAME-OF-SERVER }}
      # Uncomment this to apply to the default CAS server:
      labelSelector: "sas.com/cas-server-default"
      version: v1alpha1
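    As an optional check that the sed edits produced what you expect, you can print just the patch block with yq (this assumes the yq4 command used elsewhere in this workshop is the Go-based yq version 4); the value shown should be 2.

    yq4 eval '.patch' ~/project/deploy/${current_namespace}/site-config/cas-manage-workers-cas-shared-default.yaml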
  3. Modify ~/project/deploy/gelcorp/kustomization.yaml to reference the cas server manifest.

    • Backup the current kustomization.yaml file.

      cp -p ~/project/deploy/${current_namespace}/kustomization.yaml /tmp/${current_namespace}/kustomization_07-052-01.yaml
    • In the transformers field add the line “- site-config/cas-manage-workers-cas-shared-default.yaml” using the yq tool:

      [[ $(grep -c "site-config/cas-manage-workers-cas-shared-default.yaml" ~/project/deploy/${current_namespace}/kustomization.yaml) == 0 ]] && \
      yq4 eval -i '.transformers += ["site-config/cas-manage-workers-cas-shared-default.yaml"]' ~/project/deploy/${current_namespace}/kustomization.yaml
    • Alternatively, you can update the ~/project/deploy/gelcorp/kustomization.yaml file using your favorite text editor:

      [...]
      transformers:
        [... previous transformers items ...]
        - site-config/cas-manage-workers-cas-shared-default.yaml
      [...]
  4. Verify that the update is included in kustomization.yaml.

    cat ~/project/deploy/${current_namespace}/kustomization.yaml

    Search for - site-config/cas-manage-workers-cas-shared-default.yaml in the transformers field of the ~/project/deploy/gelcorp/kustomization.yaml file.
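    If you prefer not to scan the whole file, a simple grep confirms the entry is present (it prints the matching line and its line number).

    grep -n "cas-manage-workers-cas-shared-default.yaml" ~/project/deploy/${current_namespace}/kustomization.yaml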

    Click here to see the output

    ---
    namespace: gelcorp
    resources:
      - sas-bases/base
      # GEL Specifics to create CA secret for OpenSSL Issuer
      - site-config/security/gel-openssl-ca
      - sas-bases/overlays/network/networking.k8s.io # Using networking.k8s.io API since 2021.1.6
      - site-config/security/openssl-generated-ingress-certificate.yaml # Default to OpenSSL Issuer in 2021.2.6
      - sas-bases/overlays/cas-server
      - sas-bases/overlays/crunchydata/postgres-operator # New Stable 2022.10
      - sas-bases/overlays/postgres/platform-postgres # New Stable 2022.10
      - sas-bases/overlays/internal-elasticsearch # New Stable 2020.1.3
      - sas-bases/overlays/update-checker # added update checker
      ## disable CAS autoresources to keep things simpler
      #- sas-bases/overlays/cas-server/auto-resources                                        # CAS-related
      #- sas-bases/overlays/crunchydata_pgadmin                                              # Deploy the sas-crunchy-data-pgadmin container - remove 2022.10
      - site-config/sas-prepull/add-prepull-cr-crb.yaml
      - sas-bases/overlays/cas-server/state-transfer # Enable state transfer for the cas-shared-default CAS server - new PVC sas-cas-transfer-data
      - site-config/sas-microanalytic-score/astores/resources.yaml
      - site-config/gelcontent_pvc.yaml
      - site-config/cas-shared-gelcorp
    configurations:
      - sas-bases/overlays/required/kustomizeconfig.yaml
    transformers:
      - sas-bases/overlays/internal-elasticsearch/sysctl-transformer.yaml # New Stable 2020.1.3
      - sas-bases/overlays/startup/ordered-startup-transformer.yaml
      - site-config/cas-enable-host.yaml
      - sas-bases/overlays/required/transformers.yaml
      - site-config/mirror.yaml
      #- site-config/daily_update_check.yaml      # change the frequency of the update-check
      #- sas-bases/overlays/cas-server/auto-resources/remove-resources.yaml    # CAS-related
      ## temporarily removed to alleviate RACE issues
      - sas-bases/overlays/internal-elasticsearch/internal-elasticsearch-transformer.yaml # New Stable 2020.1.3
      - sas-bases/overlays/sas-programming-environment/enable-admin-script-access.yaml # To enable admin scripts
      #- sas-bases/overlays/scaling/zero-scale/phase-0-transformer.yaml
      #- sas-bases/overlays/scaling/zero-scale/phase-1-transformer.yaml
      - sas-bases/overlays/cas-server/state-transfer/support-state-transfer.yaml # Enable state transfer for the cas-shared-default CAS server - enable and mount new PVC
      - site-config/change-check-interval.yaml
      - sas-bases/overlays/sas-microanalytic-score/astores/astores-transformer.yaml
      - site-config/sas-pyconfig/change-configuration.yaml
      - site-config/sas-pyconfig/change-limits.yaml
      - site-config/cas-add-nfs-mount.yaml
      - site-config/cas-add-allowlist-paths.yaml
      - site-config/cas-modify-user.yaml
      - site-config/cas-manage-casdiskcache-shared-gelcorp.yaml
      - site-config/cas-manage-workers-cas-shared-default.yaml
    components:
      - sas-bases/components/crunchydata/internal-platform-postgres # New Stable 2022.10
      - sas-bases/components/security/core/base/full-stack-tls
      - sas-bases/components/security/network/networking.k8s.io/ingress/nginx.ingress.kubernetes.io/full-stack-tls
    patches:
      - path: site-config/storageclass.yaml
        target:
          kind: PersistentVolumeClaim
          annotationSelector: sas.com/component-name in (sas-backup-job,sas-data-quality-services,sas-commonfiles,sas-cas-operator,sas-pyconfig)
      - path: site-config/cas-gelcontent-mount-pvc.yaml
        target:
          group: viya.sas.com
          kind: CASDeployment
          name: .*
          version: v1alpha1
      - path: site-config/compute-server-add-nfs-mount.yaml
        target:
          labelSelector: sas.com/template-intent=sas-launcher
          version: v1
          kind: PodTemplate
      - path: site-config/compute-server-annotate-podtempate.yaml
        target:
          name: sas-compute-job-config
          version: v1
          kind: PodTemplate
    secretGenerator:
      - name: sas-consul-config
        behavior: merge
        files:
          - SITEDEFAULT_CONF=site-config/sitedefault.yaml
      - name: sas-image-pull-secrets
        behavior: replace
        type: kubernetes.io/dockerconfigjson
        files:
          - .dockerconfigjson=site-config/crcache-image-pull-secrets.json
    configMapGenerator:
      - name: ingress-input
        behavior: merge
        literals:
          - INGRESS_HOST=gelcorp.pdcesx03145.race.sas.com
      - name: sas-shared-config
        behavior: merge
        literals:
          - SAS_SERVICES_URL=https://gelcorp.pdcesx03145.race.sas.com
      # # This is to fix an issue that only appears in very slow environments.
      # # Do not do this at a customer site
      - name: sas-go-config
        behavior: merge
        literals:
          - SAS_BOOTSTRAP_HTTP_CLIENT_TIMEOUT_REQUEST='15m'
      - name: input
        behavior: merge
        literals:
          - IMAGE_REGISTRY=crcache-race-sas-cary.unx.sas.com
  5. Normally, at this step you would back up the current manifest.yaml file and then run the sas-orchestration deploy command. But because you will also add a backup controller to the cas-shared-default server, these two steps are deferred so that all topology modifications are applied in one deploy.

Note: to add or remove CAS workers for the cas-shared-default server, you only have to modify the ~/project/deploy/gelcorp/site-config/cas-manage-workers-cas-shared-default.yaml file, as sketched below.
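For illustration only (do not run this during the workshop): scaling the cas-shared-default server to, say, three workers later on would simply reuse the sed pattern from earlier in this exercise, followed by another sas-orchestration deploy and a CAS server restart.

_numberOfWorkers=3
sed -i "/value:/{n;s/.*/       ${_numberOfWorkers}/}" ~/project/deploy/${current_namespace}/site-config/cas-manage-workers-cas-shared-default.yaml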

Add a backup controller to the MPP server

  1. View the cas-manage-backup.yaml file

    cat ~/project/deploy/${current_namespace}/sas-bases/examples/cas/configure/cas-manage-backup.yaml

    Click here to see the cas-manage-backup.yaml content

    ---
    apiVersion: builtin
    kind: PatchTransformer
    metadata:
      name: cas-manage-backup
    patch: |-
       - op: replace
         path: /spec/backupControllers
         value:
           1
    target:
      group: viya.sas.com
      kind: CASDeployment
      # Uncomment this to apply to all CAS servers:
      name: .*
      # Uncomment this to apply to one particular named CAS server:
      #name: {{ NAME-OF-SERVER }}
      # Uncomment this to apply to the default CAS server:
      #labelSelector: "sas.com/cas-server-default"
      version: v1alpha1
  2. Create the cas-manage-backup-cas-shared-default.yaml file in the site-config directory to add a backup controller to cas-shared-default server.

    1. Copy the cas-manage-backup.yaml manifest in the project site-config directory

      cp -p ~/project/deploy/${current_namespace}/sas-bases/examples/cas/configure/cas-manage-backup.yaml ~/project/deploy/${current_namespace}/site-config/cas-manage-backup-cas-shared-default.yaml
      chmod 664 ~/project/deploy/${current_namespace}/site-config/cas-manage-backup-cas-shared-default.yaml
    2. Change the Target filtering in the cas-manage-backup-cas-shared-default.yaml file

      Only the cas-shared-default server has to be modified. By default, the provided manifest targets all CAS servers, so you must modify it to apply the topology change only to the cas-shared-default server.

      sed -i 's/name: \.\*/\#name: \.\*/' ~/project/deploy/${current_namespace}/site-config/cas-manage-backup-cas-shared-default.yaml
      sed -i 's/\#labelSelector: /labelSelector: /g' ~/project/deploy/${current_namespace}/site-config/cas-manage-backup-cas-shared-default.yaml

    Click here to see the cas-manage-backup-cas-shared-default.yaml content

    cat ~/project/deploy/${current_namespace}/site-config/cas-manage-backup-cas-shared-default.yaml
    # This block of code is for specifying adding a backup controller in an MPP
    # deployment. Do not use this block for SMP deployments.
    ---
    apiVersion: builtin
    kind: PatchTransformer
    metadata:
      name: cas-manage-backup
    patch: |-
       - op: replace
         path: /spec/backupControllers
         value:
           1
    target:
      group: viya.sas.com
      kind: CASDeployment
      # Uncomment this to apply to all CAS servers:
      #name: .*
      # Uncomment this to apply to one particular named CAS server:
      #name: {{ NAME-OF-SERVER }}
      # Uncomment this to apply to the default CAS server:
      labelSelector: "sas.com/cas-server-default"
      version: v1alpha1
  3. Modify ~/project/deploy/gelcorp/kustomization.yaml to reference the cas server manifest.

    • Backup the current kustomization.yaml file.

      cp -p ~/project/deploy/${current_namespace}/kustomization.yaml /tmp/${current_namespace}/kustomization_07-052-02.yaml
    • In the transformers field add the line “- site-config/cas-manage-backup-cas-shared-default.yaml” using the yq tool:

      [[ $(grep -c "site-config/cas-manage-backup-cas-shared-default.yaml" ~/project/deploy/${current_namespace}/kustomization.yaml) == 0 ]] && \
      yq4 eval -i '.transformers += ["site-config/cas-manage-backup-cas-shared-default.yaml"]' ~/project/deploy/${current_namespace}/kustomization.yaml
    • Alternatively, you can update the ~/project/deploy/gelcorp/kustomization.yaml file using your favorite text editor:

      [...]
      transformers:
        [... previous transformers items ...]
        - site-config/cas-manage-backup-cas-shared-default.yaml
      [...]
  4. Verify that the modification is in place.

    cat ~/project/deploy/${current_namespace}/kustomization.yaml

    Search for - site-config/cas-manage-backup-cas-shared-default.yaml in the transformers field of the ~/project/deploy/gelcorp/kustomization.yaml file.

    Click here to see the output

    ---
    namespace: gelcorp
    resources:
      - sas-bases/base
      # GEL Specifics to create CA secret for OpenSSL Issuer
      - site-config/security/gel-openssl-ca
      - sas-bases/overlays/network/networking.k8s.io # Using networking.k8s.io API since 2021.1.6
      - site-config/security/openssl-generated-ingress-certificate.yaml # Default to OpenSSL Issuer in 2021.2.6
      - sas-bases/overlays/cas-server
      - sas-bases/overlays/crunchydata/postgres-operator # New Stable 2022.10
      - sas-bases/overlays/postgres/platform-postgres # New Stable 2022.10
      - sas-bases/overlays/internal-elasticsearch # New Stable 2020.1.3
      - sas-bases/overlays/update-checker # added update checker
      ## disable CAS autoresources to keep things simpler
      #- sas-bases/overlays/cas-server/auto-resources                                        # CAS-related
      #- sas-bases/overlays/crunchydata_pgadmin                                              # Deploy the sas-crunchy-data-pgadmin container - remove 2022.10
      - site-config/sas-prepull/add-prepull-cr-crb.yaml
      - sas-bases/overlays/cas-server/state-transfer # Enable state transfer for the cas-shared-default CAS server - new PVC sas-cas-transfer-data
      - site-config/sas-microanalytic-score/astores/resources.yaml
      - site-config/gelcontent_pvc.yaml
      - site-config/cas-shared-gelcorp
    configurations:
      - sas-bases/overlays/required/kustomizeconfig.yaml
    transformers:
      - sas-bases/overlays/internal-elasticsearch/sysctl-transformer.yaml # New Stable 2020.1.3
      - sas-bases/overlays/startup/ordered-startup-transformer.yaml
      - site-config/cas-enable-host.yaml
      - sas-bases/overlays/required/transformers.yaml
      - site-config/mirror.yaml
      #- site-config/daily_update_check.yaml      # change the frequency of the update-check
      #- sas-bases/overlays/cas-server/auto-resources/remove-resources.yaml    # CAS-related
      ## temporarily removed to alleviate RACE issues
      - sas-bases/overlays/internal-elasticsearch/internal-elasticsearch-transformer.yaml # New Stable 2020.1.3
      - sas-bases/overlays/sas-programming-environment/enable-admin-script-access.yaml # To enable admin scripts
      #- sas-bases/overlays/scaling/zero-scale/phase-0-transformer.yaml
      #- sas-bases/overlays/scaling/zero-scale/phase-1-transformer.yaml
      - sas-bases/overlays/cas-server/state-transfer/support-state-transfer.yaml # Enable state transfer for the cas-shared-default CAS server - enable and mount new PVC
      - site-config/change-check-interval.yaml
      - sas-bases/overlays/sas-microanalytic-score/astores/astores-transformer.yaml
      - site-config/sas-pyconfig/change-configuration.yaml
      - site-config/sas-pyconfig/change-limits.yaml
      - site-config/cas-add-nfs-mount.yaml
      - site-config/cas-add-allowlist-paths.yaml
      - site-config/cas-modify-user.yaml
      - site-config/cas-manage-casdiskcache-shared-gelcorp.yaml
      - site-config/cas-manage-workers-cas-shared-default.yaml
      - site-config/cas-manage-backup-cas-shared-default.yaml
    components:
      - sas-bases/components/crunchydata/internal-platform-postgres # New Stable 2022.10
      - sas-bases/components/security/core/base/full-stack-tls
      - sas-bases/components/security/network/networking.k8s.io/ingress/nginx.ingress.kubernetes.io/full-stack-tls
    patches:
      - path: site-config/storageclass.yaml
        target:
          kind: PersistentVolumeClaim
          annotationSelector: sas.com/component-name in (sas-backup-job,sas-data-quality-services,sas-commonfiles,sas-cas-operator,sas-pyconfig)
      - path: site-config/cas-gelcontent-mount-pvc.yaml
        target:
          group: viya.sas.com
          kind: CASDeployment
          name: .*
          version: v1alpha1
      - path: site-config/compute-server-add-nfs-mount.yaml
        target:
          labelSelector: sas.com/template-intent=sas-launcher
          version: v1
          kind: PodTemplate
      - path: site-config/compute-server-annotate-podtempate.yaml
        target:
          name: sas-compute-job-config
          version: v1
          kind: PodTemplate
    secretGenerator:
      - name: sas-consul-config
        behavior: merge
        files:
          - SITEDEFAULT_CONF=site-config/sitedefault.yaml
      - name: sas-image-pull-secrets
        behavior: replace
        type: kubernetes.io/dockerconfigjson
        files:
          - .dockerconfigjson=site-config/crcache-image-pull-secrets.json
    configMapGenerator:
      - name: ingress-input
        behavior: merge
        literals:
          - INGRESS_HOST=gelcorp.pdcesx03145.race.sas.com
      - name: sas-shared-config
        behavior: merge
        literals:
          - SAS_SERVICES_URL=https://gelcorp.pdcesx03145.race.sas.com
      # # This is to fix an issue that only appears in very slow environments.
      # # Do not do this at a customer site
      - name: sas-go-config
        behavior: merge
        literals:
          - SAS_BOOTSTRAP_HTTP_CLIENT_TIMEOUT_REQUEST='15m'
      - name: input
        behavior: merge
        literals:
          - IMAGE_REGISTRY=crcache-race-sas-cary.unx.sas.com

Note: to add or remove the CAS backup controller for the cas-shared-default server, you only have to modify the ~/project/deploy/gelcorp/site-config/cas-manage-backup-cas-shared-default.yaml file, as sketched below.
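Likewise, for illustration only (do not run this during the workshop): removing the backup controller later would mean setting the patched value back to 0 with the same sed pattern, then rerunning the sas-orchestration deploy and restarting the CAS server.

sed -i "/value:/{n;s/.*/       0/}" ~/project/deploy/${current_namespace}/site-config/cas-manage-backup-cas-shared-default.yaml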

Apply the topology modifications to the cas-shared-default server

  1. Keep a copy of the current manifest.yaml file.

    cp -p /tmp/${current_namespace}/deploy_work/deploy/manifest.yaml /tmp/${current_namespace}/manifest_07-052-02.yaml
  2. Run the sas-orchestration deploy command.

    cd ~/project/deploy
    rm -rf /tmp/${current_namespace}/deploy_work/*
    source ~/project/deploy/.${current_namespace}_vars
    
    docker run --rm \
               -v ${PWD}/license:/license \
               -v ${PWD}/${current_namespace}:/${current_namespace} \
               -v ${HOME}/.kube/config_portable:/kube/config \
               -v /tmp/${current_namespace}/deploy_work:/work \
               -e KUBECONFIG=/kube/config \
               --user $(id -u):$(id -g) \
           sas-orchestration \
              deploy \
                 --namespace ${current_namespace} \
                 --deployment-data /license/SASViyaV4_${_order}_certs.zip \
                 --license /license/SASViyaV4_${_order}_license.jwt \
                 --user-content /${current_namespace} \
                 --cadence-name ${_cadenceName} \
                 --cadence-version ${_cadenceVersion} \
                 --image-registry ${_viyaMirrorReg}

    When the deploy command completes successfully, the final message should say The deploy command completed successfully, as shown in the log snippet below.

    The deploy command started
    
    [...]
    
    The deploy command completed successfully

    If the sas-orchestration deploy command fails, check out the steps in 99_Additional_Topics/03_Troubleshoot_SAS_Orchestration_Deploy to help you troubleshoot the problem.

  3. It may take several more minutes for the cas-shared-default server to fully initialize. The following command will notify you when the CAS Server is ready.

    kubectl wait pods \
                 --selector="casoperator.sas.com/server==default" \
                 --for condition=ready \
                 --timeout 15m

    You should see these messages in the output.

    pod/sas-cas-server-default-backup condition met
    pod/sas-cas-server-default-controller condition met
    pod/sas-cas-server-default-worker-0 condition met
    pod/sas-cas-server-default-worker-1 condition met

    The expected cas-shared-default server pods are running, but because you changed its topology from SMP to MPP, you must now restart the CAS server to fully implement the new topology.

  4. Restart the cas-shared-default server so that it is aware of the new CAS backup controller.

    Since state transfer is enabled for the cas-shared-default server, you now have two choices for restarting the CAS server.

    • Choice 1: initiate the state transfer

      All loaded tables and active CAS sessions will be kept.

      The casoperator.sas.com/instance-index label for all pods of the CAS server will be incremented by 1.

      kubectl patch casdeployment \
                    default \
                    --type='json' -p='[{"op": "replace", "path": "/spec/startStateTransfer", "value":true}]'
      sleep 60s
      kubectl wait pods \
                   --selector="casoperator.sas.com/server==default" \
                   --for condition=ready \
                   --timeout 15m
    • Choice 2: delete the CAS server pods

      All loaded tables and active CAS sessions will be lost.

      The casoperator.sas.com/instance-index label for all pods of the CAS server will be reset to 0.

      kubectl delete pod \
                     --selector="casoperator.sas.com/server==default"
      sleep 60s
      kubectl wait pods \
                   --selector="casoperator.sas.com/server==default" \
                   --for condition=ready \
                   --timeout 15m
  5. Access some of your CAS server metadata using the kubectl CLI, and confirm that the cas-shared-default server is now MPP with a backup controller and two workers.

    kubectl describe pods \
                     --selector="casoperator.sas.com/server==default" \
       | grep " casoperator." \
       | awk -F"/" '{print $2}' \
       | sed '/cas-cfg-mode=/i\ '

    Note that the cas-shared-default server is now MPP.

    Click here to see the output

    cas-cfg-mode=mpp
    cas-env-consul-name=cas-shared-default
    controller-active=0
    controller-index=1
    node-type=controller
    server=default
    service-name=backup
    
    cas-cfg-mode=mpp
    cas-env-consul-name=cas-shared-default
    controller-active=1
    controller-index=0
    node-type=controller
    server=default
    service-name=primary
    
    cas-cfg-mode=mpp
    cas-env-consul-name=cas-shared-default
    node-type=worker
    server=default
    service-name=worker
    worker-index=0
    
    cas-cfg-mode=mpp
    cas-env-consul-name=cas-shared-default
    node-type=worker
    server=default
    service-name=worker
    worker-index=1
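    A more compact, optional way to see the same topology at a glance is to list the pods with the relevant labels shown as columns.

    kubectl get pods \
            --selector="casoperator.sas.com/server==default" \
            --label-columns casoperator.sas.com/node-type,casoperator.sas.com/service-name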

Validate the cas-shared-default server topology changes

  1. Using SAS Environment Manager

    1. Open SAS Environment Manager, log in as geladm, and assume the SASAdministrators membership.

      gellow_urls | grep "SAS Environment Manager"
    2. Navigate to the Servers page, right-click the cas-shared-default server, and click Configuration. Then navigate to the Nodes tab.

      You can now see the current cas-shared-default server configuration: MPP with two controllers (primary, and secondary/backup), and two workers.

      07_052_SASEnvironmentManager_Monitor_Default_CASServer_0003
  2. Using OpenLens

    1. Open OpenLens and connect to your GEL Kubernetes cluster.

    2. Navigate to Workloads –> Pods and then filter on

      • namespace: gelcorp
      • sas-cas-server-default
      07_052_Lens_Monitor_Default_CASServer_0001

      You can see all of the cas-shared-default server pods, four in total: one primary controller, one backup controller, and two workers.

Lessons learned

  • By default, the provided cas-shared-default server manifests (sas-bases/overlays/cas-server/) are configured for an SMP CAS server.
  • It is easy to convert the cas-shared-default server from SMP (only a single controller by default) to MPP.
  • When the cas-shared-default server is MPP, it is easy to modify the number of CAS workers.
  • It is easy to add or remove a backup controller in an existing MPP server.
  • In-memory data is lost each time the CAS server is restarted if CAS state transfer is not enabled and used for the CAS server.

(For your info only) Rollback the cas-shared-default server from MPP CAS server to SMP

FOR YOUR INFORMATION ONLY - DO NOT PROCESS THE INSTRUCTION BELOW DURING THIS WORKSHOP

Click here to see the required steps
  1. Remove the CAS server workers and the secondary/backup controller: two methods

    1. Keep a copy of the current kustomization.yaml file, then remove the site-config/cas-manage-workers-cas-shared-default.yaml and site-config/cas-manage-backup-cas-shared-default.yaml references from the transformers field of the kustomization.yaml file.

    or

    1. Keep a copy of the current site-config/cas-manage-workers-cas-shared-default.yaml and site-config/cas-manage-backup-cas-shared-default.yaml files (if they exist), and then modify them:

      • Set the workers value to 0 in site-config/cas-manage-workers-cas-shared-default.yaml (remove all workers)
      • Set the backupControllers value to 0 in site-config/cas-manage-backup-cas-shared-default.yaml (remove the secondary/backup controller)

      Note: this strategy does not require modifying the kustomization.yaml file.

  2. Keep a copy of the current manifest.yaml manifest.

  3. Run the sas-orchestration deploy command.

  4. Restart the cas-shared-default server so that it is aware of the new topology (mainly because the backup controller was removed).

  5. The cas-shared-default server will be back to SMP.


SAS Viya Administration Operations
Lesson 07, Section 5 Exercise: Configure CAS for External Access

Access CAS server from outside the Viya deployment namespace

In this exercise you will enable both the binary and HTTP services for the cas-shared-gelcorp server so that this CAS server can be accessed from outside the Viya deployment.

You can look at this GEL blog for more information about accessing CAS from outside its SAS Viya deployment namespace.

Table of contents

Set the namespace

gel_setCurrentNamespace gelcorp

Access the cas-shared-gelcorp server using the default HTTP ingress

In this step of the hands-on, you will test the default HTTP ingress connection to the cas-shared-gelcorp server by listing the CAS server nodes.

You will do it from:

  • The Windows machine (sas-client) using the Postman application.
  • The Linux machine through a MobaXterm session using curl.
  1. On Windows, use Postman query to access the cas-shared-gelcorp server

    Using the Postman application, test the cas-shared-gelcorp server HTTP ingress access.

    1. Open Postman on your sas-client (Windows) machine

      07_061_Postman_Open_0000
    2. Update the RACE environment {{racemachine}} current variable value

      A Postman RACE environment was created for you with some variables that will be used by the Postman queries.

      The {{racemachine}} variable values (initial and current) were set to machineName.

      07_061_Postman_Query_Gelcorp_CASServer_HTTPIngress_0000

      You have to replace the {{racemachine}} variable's current value with the short name of your sasnode01 machine. You can find it in the prompt of a MobaXterm terminal, or by running this command in MobaXterm.

      echo "The required RACE machine name: $(hostname)"

      Copy this returned value to the CURRENT VALUE of the {{racemachine}} variable.

      07_061_Postman_Query_Gelcorp_CASServer_HTTPIngress_0001

      Then save the modified RACE Postman environment.

    3. Run the Get NodeNames query from the HTTP Ingress Postman collection

      Postman collections were created for you to query the cas-shared-gelcorp server.

      At this step, open the HTTP ingress collection and ensure that the RACE Postman environment is selected.

      07_061_Postman_Query_Gelcorp_CASServer_HTTPIngress_0002

      Then just send the query to get the result.

      07_061_Postman_Query_Gelcorp_CASServer_HTTPIngress_0003
  2. On Linux, use a curl query to access the cas-shared-gelcorp server

    From a MobaXterm session, run the command below to test the cas-shared-gelcorp server HTTP ingress using curl.

    curl --user geladm:lnxsas https://gelcorp.$(hostname -f)/cas-shared-gelcorp-http/cas/nodeNames
    [
     "controller.sas-cas-server-shared-gelcorp.gelcorp.svc.cluster.local",
     "worker-1.sas-cas-server-shared-gelcorp.gelcorp.svc.cluster.local",
     "worker-0.sas-cas-server-shared-gelcorp.gelcorp.svc.cluster.local",
     "worker-3.sas-cas-server-shared-gelcorp.gelcorp.svc.cluster.local",
     "worker-2.sas-cas-server-shared-gelcorp.gelcorp.svc.cluster.local"
    ]
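    If jq happens to be installed on sasnode01 (an assumption, it is not required for this exercise), you can count the returned node names; with one controller and four workers you should get 5.

    curl -s --user geladm:lnxsas https://gelcorp.$(hostname -f)/cas-shared-gelcorp-http/cas/nodeNames | jq 'length'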

Enable both binary and HTTP services for the cas-shared-gelcorp server

To enable the binary and HTTP services, SAS provides a patchTransformer manifest as part of the SAS Viya deployment assets. You can find this manifest in $deploy/sas-bases/examples/cas/configure/cas-enable-external-services.yaml. By default, a single patchTransformer manifest manages both the binary and HTTP services for all CAS servers in a SAS Viya deployment (target option: name: .*).

In this hands-on, because we want to enable both the binary and HTTP services for the cas-shared-gelcorp server only, you will copy and then modify the provided cas-enable-external-services.yaml manifest.

  1. Copy the provided cas-enable-external-services.yaml patchTransformer manifest in the site-config directory

    cp -p ~/project/deploy/${current_namespace}/sas-bases/examples/cas/configure/cas-enable-external-services.yaml ~/project/deploy/${current_namespace}/site-config/cas-enable-external-services_shared-gelcorp.yaml

    Click here to see the cas-enable-external-services_shared-gelcorp.yaml content

    cat ~/project/deploy/${current_namespace}/site-config/cas-enable-external-services_shared-gelcorp.yaml
    ---
    apiVersion: builtin
    kind: PatchTransformer
    metadata:
      name: cas-enable-external-services
    patch: |-
      # After you set publishBinaryService to true, apply
      # the manifest, you can view the Service with
      # `kubectl get svc sas-cas-server-default-bin`
      - op: add
        path: /spec/publishBinaryService
        value: true
      # After you set publishHTTPService to true, apply
      # the manifest, you can view the Service with
      # `kubectl get svc sas-cas-server-default-http`
      #- op: add
      #  path: /spec/publishHTTPService
      #  value: true
      # By default, the services are added as NodePorts.
      # To configure them as LoadBalancers, uncomment the following
      # service template and optionally, set source ranges.
      #
      # Note: Setting the service template to LoadBalancer
      #       affects all CAS services, including the publishDCServices
      #       and publishEPCSService if those are set for SAS/ACCESS and
      #       Data Connectors.
      # - op: add
      #   path: /spec/serviceTemplate
      #   value:
      #     spec:
      #       type: LoadBalancer
      #       loadBalancerSourceRanges:
      #       - 192.168.0.0/16
      #       - 10.0.0.0/8
      #
      # Note: Some cloud providers may require additional settings
      # in the service template. For example, adding the following
      # annotation lets you set the load balancer timeout on AWS:
      #
      # - op: add
      #   path: /spec/serviceTemplate
      #   value:
      #     spec:
      #       type: LoadBalancer
      #       loadBalancerSourceRanges:
      #       - 192.168.0.0/16
      #       - 10.0.0.0/8
      #     metadata:
      #       annotations:
      #         service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: "300"
      #
      # Consult your cloud provider's documentation for more information.
    target:
      group: viya.sas.com
      kind: CASDeployment
      # Uncomment this to apply to all CAS servers:
      name: .*
      # Uncomment this to apply to one particular named CAS server:
      #name: {{ NAME-OF-SERVER }}
      # Uncomment this to apply to the default CAS server:
      #labelSelector: "sas.com/cas-server-default"
      version: v1alpha1
  2. Modify the site-config/cas-enable-external-services_shared-gelcorp.yaml patchTransformer manifest to apply it only on the cas-shared-gelcorp server

    1. Enable the HTTP service

      You have to uncomment the lines that set the HTTP service (publishHTTPService).

      Note that the lines that set the binary service (publishBinaryService) are not commented by default.

      • Use these sed commands to modify the ~/project/deploy/gelcorp/site-config/cas-enable-external-services_shared-gelcorp.yaml manifest:

        sed -i 's/\#- op: add/- op: add/' ~/project/deploy/${current_namespace}/site-config/cas-enable-external-services_shared-gelcorp.yaml
        sed -i "s/\#  path: \/spec\/publishHTTPService/  path: \/spec\/publishHTTPService/g" ~/project/deploy/${current_namespace}/site-config/cas-enable-external-services_shared-gelcorp.yaml
        sed -i "s/\#  value: true/  value: true/g" ~/project/deploy/${current_namespace}/site-config/cas-enable-external-services_shared-gelcorp.yaml
      • Alternatively, you can update the ~/project/deploy/gelcorp/site-config/cas-enable-external-services_shared-gelcorp.yaml manifest using your favorite text editor:

        ---
        apiVersion: builtin
        kind: PatchTransformer
        metadata:
          name: cas-enable-external-services
        patch: |-
          # After you set publishBinaryService to true, apply
          # the manifest, you can view the Service with
          # `kubectl get svc sas-cas-server-default-bin`
          - op: add
            path: /spec/publishBinaryService
            value: true
          # After you set publishHTTPService to true, apply
          # the manifest, you can view the Service with
          # `kubectl get svc sas-cas-server-default-http`
          - op: add
            path: /spec/publishHTTPService
            value: true
          # By default, the services are added as NodePorts.
        [...]
    2. Change the Target filtering in the cas-enable-external-services_shared-gelcorp.yaml patchTransformer manifest

      Only the cas-shared-gelcorp server has to be modified. By default, the provided manifest targets all CAS servers, so you must modify it so that the binary and HTTP services are enabled only for the cas-shared-gelcorp server.

      • Use these sed commands to modify the ~/project/deploy/gelcorp/site-config/cas-enable-external-services_shared-gelcorp.yaml manifest:

        sed -i 's/name: \.\*/\#name: \.\*/' ~/project/deploy/${current_namespace}/site-config/cas-enable-external-services_shared-gelcorp.yaml
        sed -i "s/\#name: {{ NAME-OF-SERVER }}/name: shared-${current_namespace}/g" ~/project/deploy/${current_namespace}/site-config/cas-enable-external-services_shared-gelcorp.yaml
    3. Look at the modifications you made in the cas-enable-external-services_shared-gelcorp.yaml manifest

      cat ~/project/deploy/${current_namespace}/site-config/cas-enable-external-services_shared-gelcorp.yaml
      • Whether you used the sed commands or your favorite text editor to update the ~/project/deploy/gelcorp/site-config/cas-enable-external-services_shared-gelcorp.yaml manifest, its target section should now look like this:

        [...]
          # Consult your cloud provider's documentation for more information.
        target:
          group: viya.sas.com
          kind: CASDeployment
          # Uncomment this to apply to all CAS servers:
          #name: .*
          # Uncomment this to apply to one particular named CAS server:
          name: shared-gelcorp
          # Uncomment this to apply to the default CAS server:
          #labelSelector: "sas.com/cas-server-default"
          version: v1alpha1

      Click here to see the full cas-enable-external-services_shared-gelcorp.yaml content

      ---
      apiVersion: builtin
      kind: PatchTransformer
      metadata:
        name: cas-enable-external-services
      patch: |-
        # After you set publishBinaryService to true, apply
        # the manifest, you can view the Service with
        # `kubectl get svc sas-cas-server-default-bin`
        - op: add
          path: /spec/publishBinaryService
          value: true
        # After you set publishHTTPService to true, apply
        # the manifest, you can view the Service with
        # `kubectl get svc sas-cas-server-default-http`
        - op: add
          path: /spec/publishHTTPService
          value: true
        # By default, the services are added as NodePorts.
        # To configure them as LoadBalancers, uncomment the following
        # service template and optionally, set source ranges.
        #
        # Note: Setting the service template to LoadBalancer
        #       affects all CAS services, including the publishDCServices
        #       and publishEPCSService if those are set for SAS/ACCESS and
        #       Data Connectors.
        # - op: add
        #   path: /spec/serviceTemplate
        #   value:
        #     spec:
        #       type: LoadBalancer
        #       loadBalancerSourceRanges:
        #       - 192.168.0.0/16
        #       - 10.0.0.0/8
        #
        # Note: Some cloud providers may require additional settings
        # in the service template. For example, adding the following
        # annotation lets you set the load balancer timeout on AWS:
        #
        # - op: add
        #   path: /spec/serviceTemplate
        #   value:
        #     spec:
        #       type: LoadBalancer
        #       loadBalancerSourceRanges:
        #       - 192.168.0.0/16
        #       - 10.0.0.0/8
        #     metadata:
        #       annotations:
        #         service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: "300"
        #
        # Consult your cloud provider's documentation for more information.
      target:
        group: viya.sas.com
        kind: CASDeployment
        # Uncomment this to apply to all CAS servers:
        #name: .*
        # Uncomment this to apply to one particular named CAS server:
        name: shared-gelcorp
        # Uncomment this to apply to the default CAS server:
        #labelSelector: "sas.com/cas-server-default"
        version: v1alpha1
  3. Modify ~/project/deploy/gelcorp/kustomization.yaml to reference the patchTransformer manifest.

    • Backup the current kustomization.yaml file.

      cp -p ~/project/deploy/${current_namespace}/kustomization.yaml /tmp/${current_namespace}/kustomization_07-081-01.yaml
    • In the transformers field, add the line “- site-config/cas-enable-external-services_shared-gelcorp.yaml” using the yq tool:

      [[ $(grep -c "site-config/cas-enable-external-services_shared-gelcorp.yaml" ~/project/deploy/${current_namespace}/kustomization.yaml) == 0 ]] && \
      yq4 eval -i ".transformers += [\"site-config/cas-enable-external-services_shared-gelcorp.yaml\"]" ~/project/deploy/${current_namespace}/kustomization.yaml
    • Alternatively, you can update the ~/project/deploy/gelcorp/kustomization.yaml file using your favorite text editor:

      [...]
      transformers:
      [... previous transformers items ...]
      - site-config/cas-enable-external-services_shared-gelcorp.yaml
      [...]

    Verify that the modification is in place.

    cat ~/project/deploy/${current_namespace}/kustomization.yaml

    Search for - site-config/cas-enable-external-services_shared-gelcorp.yaml in the transformers field of the kustomization.yaml file.
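
    If you prefer not to scan the whole file, a quick grep (an optional check) prints the line number where the transformer reference was added:

    grep -n "cas-enable-external-services_shared-gelcorp.yaml" ~/project/deploy/${current_namespace}/kustomization.yaml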

    Click here to see the output

    ---
    namespace: gelcorp
    resources:
      - sas-bases/base
      # GEL Specifics to create CA secret for OpenSSL Issuer
      - site-config/security/gel-openssl-ca
      - sas-bases/overlays/network/networking.k8s.io # Using networking.k8s.io API since 2021.1.6
      - site-config/security/openssl-generated-ingress-certificate.yaml # Default to OpenSSL Issuer in 2021.2.6
      - sas-bases/overlays/cas-server
      - sas-bases/overlays/crunchydata/postgres-operator # New Stable 2022.10
      - sas-bases/overlays/postgres/platform-postgres # New Stable 2022.10
      - sas-bases/overlays/internal-elasticsearch # New Stable 2020.1.3
      - sas-bases/overlays/update-checker # added update checker
      ## disable CAS autoresources to keep things simpler
      #- sas-bases/overlays/cas-server/auto-resources                                        # CAS-related
      #- sas-bases/overlays/crunchydata_pgadmin                                              # Deploy the sas-crunchy-data-pgadmin container - remove 2022.10
      - site-config/sas-prepull/add-prepull-cr-crb.yaml
      - sas-bases/overlays/cas-server/state-transfer # Enable state transfer for the cas-shared-default CAS server - new PVC sas-cas-transfer-data
      - site-config/sas-microanalytic-score/astores/resources.yaml
      - site-config/gelcontent_pvc.yaml
      - site-config/cas-shared-gelcorp
    configurations:
      - sas-bases/overlays/required/kustomizeconfig.yaml
    transformers:
      - sas-bases/overlays/internal-elasticsearch/sysctl-transformer.yaml # New Stable 2020.1.3
      - sas-bases/overlays/startup/ordered-startup-transformer.yaml
      - site-config/cas-enable-host.yaml
      - sas-bases/overlays/required/transformers.yaml
      - site-config/mirror.yaml
      #- site-config/daily_update_check.yaml      # change the frequency of the update-check
      #- sas-bases/overlays/cas-server/auto-resources/remove-resources.yaml    # CAS-related
      ## temporarily removed to alleviate RACE issues
      - sas-bases/overlays/internal-elasticsearch/internal-elasticsearch-transformer.yaml # New Stable 2020.1.3
      - sas-bases/overlays/sas-programming-environment/enable-admin-script-access.yaml # To enable admin scripts
      #- sas-bases/overlays/scaling/zero-scale/phase-0-transformer.yaml
      #- sas-bases/overlays/scaling/zero-scale/phase-1-transformer.yaml
      - sas-bases/overlays/cas-server/state-transfer/support-state-transfer.yaml # Enable state transfer for the cas-shared-default CAS server - enable and mount new PVC
      - site-config/change-check-interval.yaml
      - sas-bases/overlays/sas-microanalytic-score/astores/astores-transformer.yaml
      - site-config/sas-pyconfig/change-configuration.yaml
      - site-config/sas-pyconfig/change-limits.yaml
      - site-config/cas-add-nfs-mount.yaml
      - site-config/cas-add-allowlist-paths.yaml
      - site-config/cas-modify-user.yaml
      - site-config/cas-manage-casdiskcache-shared-gelcorp.yaml
      - site-config/cas-manage-workers-cas-shared-default.yaml
      - site-config/cas-manage-backup-cas-shared-default.yaml
      - site-config/cas-enable-external-services_shared-gelcorp.yaml
    components:
      - sas-bases/components/crunchydata/internal-platform-postgres # New Stable 2022.10
      - sas-bases/components/security/core/base/full-stack-tls
      - sas-bases/components/security/network/networking.k8s.io/ingress/nginx.ingress.kubernetes.io/full-stack-tls
    patches:
      - path: site-config/storageclass.yaml
        target:
          kind: PersistentVolumeClaim
          annotationSelector: sas.com/component-name in (sas-backup-job,sas-data-quality-services,sas-commonfiles,sas-cas-operator,sas-pyconfig)
      - path: site-config/cas-gelcontent-mount-pvc.yaml
        target:
          group: viya.sas.com
          kind: CASDeployment
          name: .*
          version: v1alpha1
      - path: site-config/compute-server-add-nfs-mount.yaml
        target:
          labelSelector: sas.com/template-intent=sas-launcher
          version: v1
          kind: PodTemplate
      - path: site-config/compute-server-annotate-podtempate.yaml
        target:
          name: sas-compute-job-config
          version: v1
          kind: PodTemplate
    secretGenerator:
      - name: sas-consul-config
        behavior: merge
        files:
          - SITEDEFAULT_CONF=site-config/sitedefault.yaml
      - name: sas-image-pull-secrets
        behavior: replace
        type: kubernetes.io/dockerconfigjson
        files:
          - .dockerconfigjson=site-config/crcache-image-pull-secrets.json
    configMapGenerator:
      - name: ingress-input
        behavior: merge
        literals:
          - INGRESS_HOST=gelcorp.pdcesx03145.race.sas.com
      - name: sas-shared-config
        behavior: merge
        literals:
          - SAS_SERVICES_URL=https://gelcorp.pdcesx03145.race.sas.com
      # # This is to fix an issue that only appears in very slow environments.
      # # Do not do this at a customer site
      - name: sas-go-config
        behavior: merge
        literals:
          - SAS_BOOTSTRAP_HTTP_CLIENT_TIMEOUT_REQUEST='15m'
      - name: input
        behavior: merge
        literals:
          - IMAGE_REGISTRY=crcache-race-sas-cary.unx.sas.com
  4. Now let’s rebuild and apply the Viya deployment manifest.

    1. Keep a copy of the current manifest.yaml file.

      cp -p /tmp/${current_namespace}/deploy_work/deploy/manifest.yaml /tmp/${current_namespace}/manifest_07-081-01.yaml
    2. Generate the SAS Deployment Custom Resource

      cd ~/project/deploy
      rm -rf /tmp/${current_namespace}/deploy_work/*
      source ~/project/deploy/.${current_namespace}_vars
      
      docker run --rm \
                 -v ${PWD}/license:/license \
                 -v ${PWD}/${current_namespace}:/${current_namespace} \
                 -v ${HOME}/.kube/config_portable:/kube/config \
                 -v /tmp/${current_namespace}/deploy_work:/work \
                 -e KUBECONFIG=/kube/config \
                 --user $(id -u):$(id -g) \
             sas-orchestration \
                deploy \
                   --namespace ${current_namespace} \
                   --deployment-data /license/SASViyaV4_${_order}_certs.zip \
                   --license /license/SASViyaV4_${_order}_license.jwt \
                   --user-content /${current_namespace} \
                   --cadence-name ${_cadenceName} \
                   --cadence-version ${_cadenceVersion} \
                   --image-registry ${_viyaMirrorReg}

      When the deploy command completes successfully, the final message should say The deploy command completed successfully as shown in the log snippet below.

      The deploy command started
      
      [...]
      
      The deploy command completed successfully

      If the sas-orchestration deploy command fails, check out the steps in 99_Additional_Topics/03_Troubleshoot_SAS_Orchestration_Deploy to help you troubleshoot the problem.

  5. Restart the cas-shared-gelcorp server using the state transfer capability

    Because state transfer is enabled for the cas-shared-gelcorp server, you now have two choices for restarting the CAS server.

    • Choice 1: initiate the state transfer

      All loaded tables and active CAS sessions will be kept.

      The casoperator.sas.com/instance-index label for all pods of the CAS server will be incremented by 1.

      kubectl patch casdeployment \
                    shared-gelcorp \
                    --type='json' \
                    -p='[{"op": "replace", "path": "/spec/startStateTransfer", "value":true}]'
      sleep 60s
      kubectl wait pods \
                   --selector="casoperator.sas.com/server==shared-gelcorp" \
                   --for condition=ready \
                   --timeout 15m
    • Choice 2: delete the CAS server pods

      All loaded tables and active CAS sessions will be lost.

      The casoperator.sas.com/instance-index label for all pods of the CAS server will be reset to 0.

      kubectl delete pod \
                     --selector="casoperator.sas.com/server==shared-gelcorp"
      sleep 60s
      kubectl wait pods \
                   --selector="casoperator.sas.com/server==shared-gelcorp" \
                   --for condition=ready \
                   --timeout 15m

    You should see these messages in the output.

    pod/sas-cas-server-shared-gelcorp-controller condition met
    pod/sas-cas-server-shared-gelcorp-worker-0 condition met
    pod/sas-cas-server-shared-gelcorp-worker-1 condition met
    pod/sas-cas-server-shared-gelcorp-worker-2 condition met
    pod/sas-cas-server-shared-gelcorp-worker-4 condition met
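
    As a quick check of the instance-index behavior described above (incremented by a state transfer, reset to 0 by a pod deletion), you can display the label as an extra column; -L is standard kubectl syntax for label columns:

    kubectl get pods \
                --selector="casoperator.sas.com/server=shared-gelcorp" \
                -L casoperator.sas.com/instance-index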
  6. Find the information regarding the newly enabled binary and HTTP services for the cas-shared-gelcorp server.

    kubectl get services \
                --selector "casoperator.sas.com/server=shared-gelcorp" \
       | grep -E "NAME | NodePort | LoadBalancer "
    NAME                                   TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                                     AGE
    sas-cas-server-shared-gelcorp-bin      NodePort    10.43.154.32    <none>        5570:21761/TCP                              7m59s
    sas-cas-server-shared-gelcorp-http     NodePort    10.43.141.106   <none>        8777:11115/TCP,80:27041/TCP,443:24107/TCP   7m59s

Access the cas-shared-gelcorp server using the binary service

Use the SAS 9.4 display manager to access the cas-shared-gelcorp server

  1. Find the required connection information:

    1. the host

      echo "The required RACE machine name: $(hostname)"
    2. the port

      The information you retrieved above regarding the binary service looks like this:

      NAME                                   TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                                     AGE
      sas-cas-server-shared-gelcorp-bin      NodePort    10.43.154.32    <none>        5570:21761/TCP                              7m59s

      The required port is the NodePort mapped to CAS port 5570, that is, the value after the colon in “5570:21761/TCP” (21761 in this example).

      The commands below will provide you with the required value.

      _binaryServicePort=$(kubectl get services \
            --selector "casoperator.sas.com/server=shared-gelcorp" \
         | grep "shared-gelcorp-bin" \
         | awk '{printf $5}' \
         | awk -F ":" '{printf $2}' \
         | awk -F "/" '{printf $1}')
      
      echo "The required binary service port number: ${_binaryServicePort}"
  2. Open the SAS 9.4 display manager

    07_061_SASDisplayManager_Query_Gelcorp_CASServer_BinaryService_0000
  3. Open the CASServer_TestExternalAccess.sas program and modify it using the connection information (host and port) you gathered above.

    The provided CASServer_TestExternalAccess.sas program must be modified to set the required host and port parameters.

    07_061_SASDisplayManager_Query_Gelcorp_CASServer_BinaryService_0001

    Replace the template values with the values you found in the steps above.

    • Replace <machinename> with the host short name
    • Replace <port> with the binary service port number
    07_061_SASDisplayManager_Query_Gelcorp_CASServer_BinaryService_0002
  4. Execute it to validate the connection to the cas-shared-gelcorp server using its binary service.

    Now submit the SAS code by pressing the F3 key.

    07_061_SASDisplayManager_Query_Gelcorp_CASServer_BinaryService_0003

    You can see in the Log window that it was possible to connect to the cas-shared-gelcorp server through its binary service from a SAS 9.4 client.

    07_061_SASDisplayManager_Query_Gelcorp_CASServer_BinaryService_0004

    If you are more curious, you can look at the CAS session details and the mounted CAS libraries content.

Access the cas-shared-gelcorp server using the HTTP service

In this step, you will test the HTTP service connection to the cas-shared-gelcorp server by listing the CAS server nodes.

You will do it from:

  • The Windows machine (sas-client) using the Postman application.
  • The Linux machine through a MobaXterm session using curl.
  1. On Windows, use a Postman query to access the cas-shared-gelcorp server through its HTTP service

    Using the Postman application, test the cas-shared-gelcorp server HTTP service access.

    1. Open Postman on your sas-client (Windows) machine

    2. Update the current value of the {{HTTPServicePort}} variable in the RACE environment

      A Postman RACE environment was created for you with some variables that will be used by the Postman queries.

      The {{HTTPServicePort}} variable values (initial and current) were set to servicePort.

      07_061_Postman_Query_Gelcorp_CASServer_HTTPService_0000

      Replace the {{HTTPServicePort}} variable's current value with the value returned by running the command below in MobaXterm.

      The information you retrieved above regarding the HTTP service looks like this:

      NAME                                   TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                                     AGE
      sas-cas-server-shared-gelcorp-http     NodePort    10.43.141.106   <none>        8777:11115/TCP,80:27041/TCP,443:24107/TCP   7m59s

      The required port is the NodePort mapped to port 443, that is, the value after 443: in “8777:11115/TCP,80:27041/TCP,443:24107/TCP” (24107 in this example).

      The commands below will provide you with the required value.

      _HTTPServicePort=$(kubectl get services \
            --selector "casoperator.sas.com/server=shared-gelcorp" \
         | grep "shared-gelcorp-http" \
         | awk '{printf $5}' \
         | awk -F "," '{printf $3}' \
         | awk -F ":" '{printf $2}' \
         | awk -F "/" '{printf $1}')
      
      echo "The required HTTP service port number: ${_HTTPServicePort}"

      Copy this returned value to the CURRENT VALUE of the {{HTTPServicePort}} variable.

      07_061_Postman_Query_Gelcorp_CASServer_HTTPService_0001

      Then save the modified RACE Postman environment.

    3. Run the Get NodeNames query from the HTTP Service Postman collection

      Postman collections were created for you to query the cas-shared-gelcorp server.

      At this step, open the HTTP Service collection and ensure that the RACE Postman environment is selected.

      07_061_Postman_Query_Gelcorp_CASServer_HTTPService_0002

      Then just send the query to get the result.

      07_061_Postman_Query_Gelcorp_CASServer_HTTPService_0003
  2. On Linux, use a curl query to access the cas-shared-gelcorp server

    From a MobaXterm session, run the command below to test the cas-shared-gelcorp server HTTP service using curl.

    _HTTPServicePort=$(kubectl get services \
          --selector "casoperator.sas.com/server=shared-gelcorp" \
       | grep "shared-gelcorp-http" \
       | awk '{printf $5}' \
       | awk -F "," '{printf $3}' \
       | awk -F ":" '{printf $2}' \
       | awk -F "/" '{printf $1}')
    
    echo ${_HTTPServicePort}
    
    curl --user geladm:lnxsas https://gelcorp.$(hostname -f):${_HTTPServicePort}/cas/nodeNames
    [
     "controller.sas-cas-server-shared-gelcorp.gelcorp.svc.cluster.local",
     "worker-1.sas-cas-server-shared-gelcorp.gelcorp.svc.cluster.local",
     "worker-0.sas-cas-server-shared-gelcorp.gelcorp.svc.cluster.local",
     "worker-3.sas-cas-server-shared-gelcorp.gelcorp.svc.cluster.local",
     "worker-2.sas-cas-server-shared-gelcorp.gelcorp.svc.cluster.local"
    ]

Lessons learned

  • By default in each SAS Viya deployment, the HTTP ingress is enabled for all CAS servers. This ingress allows access to a CAS server from outside the SAS Viya deployment using the CAS REST API only. It does not allow SAS clients to access the CAS server directly over the binary protocol.

  • The SAS Viya administrator can enable two services to allow access to a CAS server from outside the SAS Viya deployment: the binary and HTTP services. These services are Kubernetes NodePort services by default, but can be configured as LoadBalancer services (refer to the documentation for those settings).

    The binary and HTTP services can be enabled individually or simultaneously.

  • Enabling the binary and HTTP services requires regenerating the SASDeployment custom resource (or the site.yaml file), applying it, and then restarting the CAS servers.


SAS Viya Administration Operations
Lesson 07, Section 6 Exercise: Remove a CAS Server

Remove a CAS Server (non cas-shared-default) - OPTIONAL

In this exercise you will remove the cas-shared-gelcorp server from your Viya deployment.

Table of contents

Set the namespace

gel_setCurrentNamespace gelcorp

Remove the cas-shared-gelcorp server

  1. Delete the cas-shared-gelcorp server CASDeployment

    1. List current CASDeployments

      kubectl get casdeployments

      You should see…

      NAME             AGE
      default          10d
      shared-gelcorp   3h25m
    2. Delete the cas-shared-gelcorp server CASDeployment

      kubectl delete casdeployments \
                     shared-gelcorp

      You should see…

      casdeployment.viya.sas.com "shared-gelcorp" deleted
    3. Validate that the cas-shared-gelcorp server CASDeployment was deleted

      kubectl get casdeployments

      You should see…

      NAME      AGE
      default   10d
  2. If you do not plan to reuse the shared-gelcorp server, you can delete the cas-shared-gelcorp server manifests

    rm -rf ~/project/deploy/${current_namespace}/site-config/cas-shared-gelcorp
    ls -al ~/project/deploy/${current_namespace}/site-config/*shared-gelcorp*
  3. Using your favorite text editor, remove (or comment out) the cas-shared-gelcorp server manifest references from the kustomization.yaml file inside the current project directory.

    • Backup the current kustomization.yaml file.

      cp -p ~/project/deploy/${current_namespace}/kustomization.yaml /tmp/${current_namespace}/kustomization_07-061-01.yaml
    • Remove (or comment out) these references from the kustomization.yaml file:

      • In resources section: - site-config/cas-shared-gelcorp
      • In transformers section: - site-config/cas-manage-topology-shared-gelcorp.yaml
      [[ $(grep -c "site-config/cas-shared-gelcorp" ~/project/deploy/${current_namespace}/kustomization.yaml) == 1 ]] && \
      _itemIndex=$(yq4 eval '.resources.[] | select(. == "site-config/cas-shared-gelcorp") | path | .[1]' ~/project/deploy/${current_namespace}/kustomization.yaml) && \
      yq4 eval -i 'del(.resources.['${_itemIndex}'])' ~/project/deploy/${current_namespace}/kustomization.yaml
  4. Check that the update was applied

    yq4 eval '.resources.[] | select(. == "*shared-gelcorp*")' ~/project/deploy/${current_namespace}/kustomization.yaml

    You should see no references to these CAS Server manifests in the kustomization.yaml on the current project directory.
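
    As a broader, optional check, grep the whole kustomization.yaml for any remaining shared-gelcorp references (including any left in the transformers section):

    grep -n "shared-gelcorp" ~/project/deploy/${current_namespace}/kustomization.yaml || echo "No shared-gelcorp references left"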

  5. Keep a copy of the current manifest.yaml file.

    cp -p /tmp/${current_namespace}/deploy_work/deploy/manifest.yaml /tmp/${current_namespace}/manifest_07-061-01.yaml
  6. Run the sas-orchestration deploy command.

    cd ~/project/deploy
    rm -rf /tmp/${current_namespace}/deploy_work/*
    source ~/project/deploy/.${current_namespace}_vars
    
    docker run --rm \
               -v ${PWD}/license:/license \
               -v ${PWD}/${current_namespace}:/${current_namespace} \
               -v ${HOME}/.kube/config_portable:/kube/config \
               -v /tmp/${current_namespace}/deploy_work:/work \
               -e KUBECONFIG=/kube/config \
               --user $(id -u):$(id -g) \
           sas-orchestration \
              deploy \
                 --namespace ${current_namespace} \
                 --deployment-data /license/SASViyaV4_${_order}_certs.zip \
                 --license /license/SASViyaV4_${_order}_license.jwt \
                 --user-content /${current_namespace} \
                 --cadence-name ${_cadenceName} \
                 --cadence-version ${_cadenceVersion} \
                 --image-registry ${_viyaMirrorReg}

    When the deploy command completes successfully, the final message should say The deploy command completed successfully as shown in the log snippet below.

    The deploy command started
    
    [...]
    
    The deploy command completed successfully

    If the sas-orchestration deploy command fails, check out the steps in 99_Additional_Topics/03_Troubleshoot_SAS_Orchestration_Deploy to help you troubleshoot the problem.

  7. Look at the existing CASDeployment custom resources

    kubectl get casdeployment

    You should see…

    NAME             AGE
    default          3h8m

    You can now look at the status of all CAS server pods by running this command.

    kubectl get pods \
                --selector="app.kubernetes.io/managed-by==sas-cas-operator" \
                -o wide

    You should see something like…

    NAME                                       READY   STATUS    RESTARTS   AGE     IP            NODE        NOMINATED NODE   READINESS GATES
    sas-cas-server-default-controller          3/3     Running   0          3d13h   10.42.2.74    intnode03   <none>           <none>

    Only the cas-shared-default server is now started and ready to be used. No more cas-shared-gelcorp server.

Lessons learned

  • It is easy to remove a CAS Server from your Viya deployment.

    1. Deleting the CAS server's CASDeployment custom resource is a proven practice that guarantees the CAS server becomes inactive immediately. (Otherwise, the CAS server stays active until the new SASDeployment custom resource is fully applied.)

    2. Removing the CAS server manifest references from the kustomization.yaml file is mandatory.

    3. Deleting the CAS server manifests from the site-config directory is not mandatory. These manifests can be preserved, especially if you plan to re-onboard the CAS server in the near future.

    4. Running the sas-orchestration deploy command is mandatory.

  • Never remove the cas-shared-default server since it is required for the Viya deployment to be fully functional.


Lesson 08

SAS Viya Administration Operations
Lesson 08, Section 0 Exercise: Start and Stop SAS Viya

Starting and Stopping SAS Viya

In this exercise we will walk through the process for stopping and automatically restarting individual Viya pods, as well as completely stopping and then starting the entire Viya deployment. We will also look at the commands that can be run to monitor the progress of the shutdown and startup sequences and validate the readiness of the deployment.

Table of contents

Set the namespace, the sas-viya CLI profile, and authenticate

  1. Set the current namespace and log on.

    gel_setCurrentNamespace gelcorp
    /opt/pyviyatools/loginviauthinfo.py

Restart a pod

In this step, we will try restarting one individual pod, but the same process can be used to restart a set of pods or all pods in the namespace.

  1. View the list of pods in your namespace.

    kubectl get pods
  2. Delete a single pod. For example, kill the sas-transfer pod.

    kubectl delete pod -l app=sas-transfer
  3. Try viewing the list of pods again, this time filtering using the label for the sas-transfer pod only.

    kubectl get pods -l app=sas-transfer
    NAME                            READY   STATUS    RESTARTS   AGE
    sas-transfer-59bc4c9966-mwnvm   1/1     Running   0          77s

    Note that the transfer pod appears to be running, but the AGE column has been updated. The pod is newer than the other pods because it was automatically restarted when we deleted it.

Remember that Kubernetes uses ReplicaSets to maintain a stable set of running pods. If a pod dies (or is killed), a new, identical replica is automatically started to achieve the desired declared state.
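
You can observe this ReplicaSet behavior directly (an optional check; this assumes the ReplicaSet carries the same app=sas-transfer label as its pods, which is the usual case):

  kubectl get replicaset -l app=sas-transfer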

NOTE: If intending to restart all pods in the namespace, remember to delete the pods and not the namespace. Deleting the namespace will delete all resources (pods, secrets, services, deployments, replicasets, and so on), and nothing will be automatically restarted. The namespace would then need to be redeployed by building and applying the manifest file.

To restart all pods (do not run in the workshop environment), run: kubectl delete pods --all

Stop the entire environment

It may sometimes be necessary to completely stop the SAS Viya deployment (all pods, jobs, etc.) without automatically restarting. For example, it may be necessary to stop in order to perform maintenance on the cluster nodes.

A Kubernetes cronjob is provided to perform start and stop operations whilst observing dependencies.
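
You can list these lifecycle cronjobs before using them (an optional check; the grep is just a convenience filter):

  kubectl get cronjobs | grep -E "sas-stop-all|sas-start-all"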

  1. Create a job to run an ad-hoc stop operation using the included sas-stop-all cronjob.

    kubectl create job sas-stop-all-`date +%s` --from cronjobs/sas-stop-all -n gelcorp

    Expected output:

    job.batch/sas-stop-all-1637717983 created
  2. Follow the pod log to check the status of the stop operation

    kubectl logs --follow  \
       $(kubectl get po --no-headers  -l "job-name=$(kubectl get job |grep sas-stop-all |awk '{print $1}')" | awk '{print $1}') | gel_log

    Note that the log output indicates that the stop operation performs tasks such as stopping the operators, suspending jobs, and scaling deployments to zero replicas.

    The stop operation is finished when the log displays the message ‘The lifecycle run command completed successfully’.

  3. View the pods that are still running after the stop operation has completed.

    kubectl get pods

    Which components are still running?

    The remaining pods are pods from previously executed jobs (note that 0/1 containers are ready for these pods, and their status is ‘Completed’). The exception is the Prometheus Pushgateway, which is a monitoring component; it continues to run in the namespace and is not included in the stop or start lifecycle operations. If any Compute sessions were running (for example, if any users were logged in to SAS Studio), the pods for those sessions will also be running. Log out of those sessions (or delete the pods) to terminate them.
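
    If you want to narrow the listing to only the pods that are actually still running (an optional sketch using a standard kubectl field selector), you can run:

    kubectl get pods --field-selector=status.phase=Running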

Start the environment

  1. Create a job to immediately run an ad-hoc start operation using the included sas-start-all cronjob.

    kubectl create job sas-start-all-`date +%s` --from cronjobs/sas-start-all -n gelcorp

    Expected output:

    job.batch/sas-start-all-1637719104 created
  2. Follow the pod log to monitor the status of the start operation.

    kubectl logs --follow \
       $(kubectl get po --no-headers  -l "job-name=$(kubectl get job |grep sas-start-all |awk '{print $1}')" | awk '{print $1}') | gel_log

    Note that the log output indicates that the start operation performs tasks such as starting the operators, resuming jobs, and scaling deployments back up.

    Note: Messages like the following may appear in the log:

    JSON path '{.spec.replicas}' in resource 'apps/v1' 'Deployment' 'sas-data-profiles': replicas is not found

    This indicates the start operation is attempting (but failing) to find the previous replica count from the deployment spec. In this case, the pod is started with a default of 1 replica. These messages can be safely ignored.
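
    If you want to confirm the replica count a deployment ended up with after startup (an optional check; sas-data-profiles is simply the Deployment named in the example message above), query its spec directly:

    kubectl get deployment sas-data-profiles -o jsonpath='{.spec.replicas}{"\n"}'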

    The start operation is finished when the log displays the message: The lifecycle run command completed successfully.

  3. Although the start operation has finished executing, pods are still starting in the background. View the status of the pods in the Viya namespace.

    kubectl get pods

    Note that many pods are still starting (with 0/1 containers ready). Continue to the next step to monitor the startup process.

Monitoring startup progress

There are several ways to monitor the startup process. This is one example, but you can choose from many applications or operating system tools to view the progress of startup and shut down procedures.

  1. Try monitoring the starting pods by using sas-readiness with tmux. First, open a new MobaXterm tab and run the following to define a tmux session to watch the pods.

    SessName=gelcorp_watch
    NS=gelcorp
    
    tmux new -s $SessName -d
    tmux send-keys \
      -t $SessName "watch 'kubectl get pods -o wide -n ${NS} | grep 0/ | grep -v Completed ' "  C-m
    tmux split-window -v -t $SessName
    
    tmux send-keys \
      -t $SessName "watch -n 5 \"kubectl -n ${NS} logs -l app=sas-readiness |gel_log | tail -n 1  \""  C-m
    tmux split-window -v -t $SessName
    
    tmux send-keys \
      -t $SessName "kubectl wait -n ${NS} --for=condition=ready pod -l app=sas-readiness  --timeout=2700s"  C-m
  2. Attach to the tmux session.

     tmux a -t ${SessName}

    Watch the pods, ensuring they disappear from the top pane as they return to Running state. The centre and bottom panes show the output of the readiness checks. Ensure they are completed to indicate the deployment is ready for use again.

  3. When all pods are started (after approximately 15 minutes), detach from the tmux session by pressing Ctrl + b, then d.

  4. Execute the gel_ReadyViya4 function as a final validation test. This function has been created to provide an easy way to query readiness and stability of the Viya deployment.

    gel_ReadyViya4 -n gelcorp -r 60 -rs 10

    In the command above, the -r flag specifies the number of minutes to wait for the first ready message, and the -rs flag defines the sensitivity (the deployment is considered ‘ready’ even if up to this number of endpoints are non-responsive).

    In output, look for the following messages:

    NOTE: All checks passed. Marking as ready. The first recorded failure was 16m3s ago.
    NOTE: Readiness detected based on parameters used.

Note: In a customer environment, you can use either of the above approaches to monitor/check readiness of your deployment.


SAS Viya Administration Operations
Lesson 08, Section 1 Exercise: Apply a Patch Update

Applying Patch Updates

The Deployment Operator can be used to update your Viya deployment automatically. In addition to updating the Viya software version, it can also be used to update licenses, add or remove products, and switch cadences. In this exercise, you will check for and apply any patches that may be available for the deployed version of SAS Viya using the Deployment Operator.

Table of contents

Prerequisite steps

The output of the command that queries the sas-deployment ConfigMap also displays information about the cadence name, version and release.

  1. Run the command to check the release number of the SAS software running in your environment:

    kubectl -n gelcorp  get cm -o yaml | grep 'CADENCE' | head -8

    Expected output should be similar to this except for the SAS_CADENCE_RELEASE:

    SAS_BASE_CADENCE_NAME: stable
    SAS_BASE_CADENCE_VERSION: "2024.03"
    SAS_CADENCE_DISPLAY_NAME: Long-Term Support 2024.03
    SAS_CADENCE_DISPLAY_SHORT_NAME: Long-Term Support
    SAS_CADENCE_DISPLAY_VERSION: "2024.03"
    SAS_CADENCE_NAME: lts
    SAS_CADENCE_RELEASE: "20240930.1727729121985"
    SAS_CADENCE_VERSION: "2024.03"

    As shown in the output, this version of LTS 2024.03 is based on the Stable 2024.03 version, indicated by SAS_BASE_CADENCE_NAME and SAS_BASE_CADENCE_VERSION.

    Also note the value of SAS_CADENCE_RELEASE, which indicates the specific release number (the most granular level). This is the ‘patch level’ that has been applied.
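
    If you only need the release value on its own (for example, to compare it before and after the update), a narrower variation of the command above works:

    kubectl -n gelcorp get cm -o yaml | grep 'SAS_CADENCE_RELEASE:' | head -1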

  2. Next, check if there are any patches available for the deployed version using the Update Checker.

    The Update Checker is a job that runs on a schedule (as a Kubernetes cronjob), but it can be run on-demand as an ad-hoc job.

    Create the ad-hoc job from the Kubernetes cronjob.

    kubectl create job --from=cronjob/sas-update-checker update-checker-manual

    You should see:

     job.batch/update-checker-manual created
  3. View the job’s pod log to see the Update Checker report output:

    kubectl logs -f $(kubectl get pods | grep update-checker-manual | awk '{print $1}' ) | gel_log

    Note: The output of the report provides an alternative way to check the cadence, version and release of SAS software you have deployed.

    The output may indicate that patches are available, but there may not be any, depending on the specific release of SAS Viya you are running and whether any newer releases were available for your deployed version at the time you ran the Update Checker.

    If patches are available, the report output includes a line announcing a new release for your deployed version (a similar line indicates whether an update is available for your deployed cadence):

    New release available for deployed version Long-Term Support 2024.03: Long-Term Support 2024.03 20240930.1727729121985.

    Additional detail is also displayed for releases available for individual products.

    If there are not any new patches available, the following will be displayed in the output:

    No new release available for deployed version Long-Term Support 2024.03.

    If there are no patches available for your deployment, you may skip this exercise and only return to complete the subsequent tasks after the update checker job shows that patches are available. You will need to delete the update-checker-manual job and recreate/re-run it. Depending on when patches ship, there may be a delay of several days before patches are available for your environment.

  4. Delete the update-checker job and its associated pod.

    kubectl delete job update-checker-manual
  5. The recommended practice is to download the latest Deployment Assets for the target release and review the enclosed README.md files for any product-specific pre-update tasks that may apply. While there are unlikely to be many (if any) manual steps when applying patches, updated software introduced in a patch may require manual steps to be performed (for example, updates to kustomization.yaml).

    In this exercise, in order to demonstrate the patch update process in a simple way, you will use the existing Deployment Assets that were used for the initial deployment of the environment.

    Click here to view the typical process for downloading new assets (not required for this exercise)

    • Backup the existing $deploy directory.
    • Download and copy new deployment assets to the $deploy directory
    • Delete the sas-bases directory with rm -rf sas-bases
    • Extract the new assets
    • Review README files in sas-bases for applicable product-specific configuration changes that need to be performed

Applying the latest patch release

Updating your software (to a new release/patch or version) with the Deployment Operator requires that you specify in the SASDeployment CR file the release that you would like to apply. You can also apply the latest available patch release (instead of a specific release) by simply inserting a blank release number for the cadenceRelease parameter in the CR file as per the below instructions.
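
If you do want to target the specific release that matches a set of deployment assets instead, you can read it from the assets with a command like the sketch below (this assumes the assets are extracted under ~/project/deploy/gelcorp, as in this workshop). For this exercise, the release is deliberately left blank so that the latest available patch release is applied.

  yq4 eval '.spec.release' ~/project/deploy/gelcorp/sas-bases/.orchestration/cadence.yaml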

Deploy

  1. Perform the update by running the sas-orchestration deploy task. Note the value of the --cadence-release parameter; in a customer environment, the value for this property should be set to the release number matching your deployment assets (which can be obtained from $deploy/sas-bases/.orchestration/cadence.yaml). For this exercise, we have set it to a value of "", which will result in the latest available release being applied.

    cd ~/project/deploy
    rm -rf /tmp/${current_namespace}/deploy_work/*
    source ~/project/deploy/.${current_namespace}_vars
    
    docker run --rm \
            -v ${PWD}/license:/license \
            -v ${PWD}/${current_namespace}:/${current_namespace} \
            -v ${HOME}/.kube/config_portable:/kube/config \
            -v /tmp/${current_namespace}/deploy_work:/work \
            -e KUBECONFIG=/kube/config \
            --user $(id -u):$(id -g) \
         sas-orchestration \
            deploy \
               --namespace ${current_namespace} \
               --deployment-data /license/SASViyaV4_${_order}_certs.zip \
               --license /license/SASViyaV4_${_order}_license.jwt \
               --user-content /${current_namespace} \
               --cadence-name ${_cadenceName} \
               --cadence-version ${_cadenceVersion} \
               --cadence-release "" \
               --image-registry ${_viyaMirrorReg}

    The deployment may take around 10 minutes to complete. When it is done, you will see the following in the terminal:

    Applying manifests complete
    The deploy command completed successfully

Validation

  1. Verify the release number of SAS Viya software you are now running by submitting the following command:

    kubectl -n gelcorp  get cm -o yaml | grep 'CADENCE' | head -8

    The output should indicate that you are now running a later SAS_CADENCE_RELEASE of Viya software than you were running when you executed this command at the beginning of the exercise.

    SAS_BASE_CADENCE_NAME: stable
    SAS_BASE_CADENCE_VERSION: "2024.03"
    SAS_CADENCE_DISPLAY_NAME: Long-Term Support 2024.03
    SAS_CADENCE_DISPLAY_SHORT_NAME: Long-Term Support
    SAS_CADENCE_DISPLAY_VERSION: "2024.03"
    SAS_CADENCE_NAME: lts
    SAS_CADENCE_RELEASE: "20240930.1727729121985"
    SAS_CADENCE_VERSION: "2024.03"

    Note: Another way to verify the release of software you are now running is to re-run an ad-hoc Update Checker job.

  2. Execute the gel_ReadyViya4 function as an additional check of readiness and stability after the update.

    gel_ReadyViya4 -n gelcorp -r 60 -rs 10

    In the output, look for the following messages:

    NOTE: All checks passed. Marking as ready. The first recorded failure was 3m8s ago.
    NOTE: Readiness detected based on parameters used.

Cleanup

IMPORTANT: This step must be run to avoid issues with later exercises.

  1. Execute the following command as cloud-user in your MobaXterm terminal session:

    ~/PSGEL260-sas-viya-4.0.1-administration/scripts/gel_tools/gel_getSaveSASViyaDeploymentCadence.sh

SAS Viya Administration Operations
Lesson 08, Section 2 Exercise: Update to a New Version

Updating SAS Software

In this task, you will change the cadence of your Viya deployment from LTS to Stable using the sas-orchestration deploy task.

Table of contents

Switch Cadence

  1. You can view the deployed version information by running:

    kubectl get cm -o yaml | grep CADENCE | head -8

    You should see output similar to this:

    SAS_BASE_CADENCE_NAME: stable
    SAS_BASE_CADENCE_VERSION: "2024.03"
    SAS_CADENCE_DISPLAY_NAME: Long-Term Support 2024.03
    SAS_CADENCE_DISPLAY_SHORT_NAME: Long-Term Support
    SAS_CADENCE_DISPLAY_VERSION: "2024.03"
    SAS_CADENCE_NAME: lts
    SAS_CADENCE_RELEASE: "20240930.1727729121985"
    SAS_CADENCE_VERSION: "2024.03"

    As shown in the output, you are running LTS version 2024.03 as shown in the SAS_CADENCE_DISPLAY_NAME field. Also note that this version is based on the Stable 2024.03 version, indicated by SAS_BASE_CADENCE_NAME and SAS_BASE_CADENCE_VERSION.

  2. There are some restrictions to consider when switching cadence. When moving from LTS to Stable, it is a requirement that the target Stable version is the same as or newer than the source LTS version.

    From the source Long-Term Support version deployed in the workshop environment (LTS 2024.03), you can perform a single update to a Stable version that is the same as or newer than the source LTS version. In the following task, the cadence will be switched to Stable 2024.06.

  3. Retrieve new deployment assets. Prior to an update, deployment assets for the target version must be downloaded and the enclosed README.md files should be reviewed for any product-specific pre-upgrade tasks (for example, tasks that require updates to the kustomization.yaml file).

Typically, new assets for the target version must be downloaded from my.sas.com or using the viya4-orders-cli. In the workshop environment, the new assets have already been downloaded.

Copy the new assets into your $deploy directory.

  • Backup the existing $deploy directory.

    cp -pr ~/project/deploy/gelcorp ~/project/deploy/gelcorp_07-031
  • Copy the new deployment assets to the $deploy directory.

    cp -p /mnt/workshop_files/workshop_content/updating-data/SASViyaV4_9CV11D_stable_2024.03_*_deploymentAssets_*.tgz ~/project/deploy/gelcorp
  • Delete sas-bases, and extract the new assets.

    cd ~/project/deploy/gelcorp
    rm -rf sas-bases
    tar xfv ~/project/deploy/gelcorp/SASViyaV4_9CV11D_stable_2024.03_*
  1. At this point in the process, any relevant pre-upgrade tasks would typically be performed. These may include tasks such as:

    • Downloading the correct version of the sas-orchestration image or updating the SAS Deployment Operator
    • Product-specific configuration tasks as outlined in README files in the $deploy directory
    • Changes to configure your environment to use a mirror registry
    • Regenerating CAS Servers (for multi-tenant environments and deployments with multiple CAS servers)
    • Pausing SingleStore (if deployed)
    • Before Deployment steps documented in the Deployment Notes for your target version, as well as all applicable interim versions, excluding those that are marked as not applicable to deployments that were deployed with sas-orchestration (these will be automatically performed by the sas-orchestration deploy task).

    IMPORTANT: Note that the version of deployment assets used does not impact the version of the software that is downloaded as part of the update. Software updates are downloaded directly from the image registry for the cadence version specified in the deploy command, regardless of the version of deployment assets being used. However, product-specific configuration changes can be specific to a particular version of Viya (e.g. changes to TLS that were delivered in 2020.1.3 required additions/deletions from kustomization.yaml). As such, it is recommended to download the latest deployment assets and to carefully review the README files for configuration tasks and manual steps that may be applicable.

    For this particular upgrade of the workshop environment from LTS 2024.03 to Stable 2024.06, no pre-upgrade tasks are necessary (no relevant tasks in the Deployment Notes for Stable 2024.06).

Deploy

  1. Before you begin the deployment, you must copy the value that is shown for release in the $deploy/sas-bases/.orchestration/cadence.yaml file from the deployment assets. The following command will store the value in a variable you can refer to in the next step.

    cadenceReleaseNum=$(yq r sas-bases/.orchestration/cadence.yaml  spec.release)
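
    As an optional sanity check, echo the variable to confirm that a release number was captured:

    echo "Target cadence release: ${cadenceReleaseNum}"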
  2. Run the sas-orchestration deploy command to start the upgrade, ensuring you insert the appropriate values for the --cadence-name and --cadence-version parameters. Remember to also insert the target release number (matching your deployment assets) as the value for the --cadence-release flag.

    cd ~/project/deploy
    rm -rf /tmp/${current_namespace}/deploy_work/*
    source ~/project/deploy/.${current_namespace}_vars
    
    docker run --rm \
            -v ${PWD}/license:/license \
            -v ${PWD}/${current_namespace}:/${current_namespace} \
            -v ${HOME}/.kube/config_portable:/kube/config \
            -v /tmp/${current_namespace}/deploy_work:/work \
            -e KUBECONFIG=/kube/config \
            --user $(id -u):$(id -g) \
         sas-orchestration \
            deploy \
               --namespace ${current_namespace} \
               --deployment-data /license/SASViyaV4_${_order}_certs.zip \
               --license /license/SASViyaV4_${_order}_license.jwt \
               --user-content /${current_namespace} \
               --cadence-name "stable" \
               --cadence-version "2024.06" \
               --cadence-release ${cadenceReleaseNum} \
               --image-registry ${_viyaMirrorReg}

    When the deployment completes, you will see the following in the terminal:

    Applying manifests complete
    The deploy command completed successfully

    WARNING: If you get an error regarding the sas-orchestration tool like the following:

    Error: Orchestration version is '1.93.2'; expected orchestration version is '1.97.4-20230503.1683146435603'

    you must update the sas-orchestration tool before you can successfully execute the sas-orchestration deploy command above.

    To update the sas-orchestration tool, you must execute these tasks:

    • In your environment, go to the SAS Viya Platform deployment asset directory (in this workshop: /home/cloud-user/project/deploy/gelcorp/sas-bases)

    • Navigate to this directory: examples/kubernetes-tools/

    • Look at the Prerequisites section of the README.md file

    • Then, after extracting the required sas-orchestration tool image version, run the two docker CLI commands below (pull and tag):

      1. Extract the required image version of the sas-orchestration tool:

        _requiredVersion=$(grep "docker pull" ~/project/deploy/gelcorp/sas-bases/examples/kubernetes-tools/README.md | awk -F ":" '{print $2}')
      2. Load/Pull the required image.

        Note: In the docker commands below, we change the sas-orchestration tool docker image repository, since we use a mirror repository in this workshop.

        docker pull crcache-race-sas-cary.unx.sas.com/viya-4-x64_oci_linux_2-docker/sas-orchestration:${_requiredVersion}
      3. Tag the new image. This is required to use the sas-orchestration tool without having to pass the image version each time (like an alias).

        docker tag crcache-race-sas-cary.unx.sas.com/viya-4-x64_oci_linux_2-docker/sas-orchestration:${_requiredVersion} sas-orchestration

        The new version of the sas-orchestration tool is now available in your environment. You can re-execute the sas-orchestration deploy command above.

Post-update Tasks

When the update has completed successfully, once again refer to the Deployment Notes and perform any documented After Deployment Commands for interim and target version. For this workshop, there are no manual post-update tasks to perform.

Validation

  1. Verify that the cadence has been switched in the output of the previous command. You can also verify the cadence version of SAS Viya you are now running by submitting the following command:

    kubectl -n gelcorp  get cm -o yaml | grep 'CADENCE' | head -6

    The output should indicate that you are now running a later SAS_CADENCE_VERSION of Viya software than you were running when you executed this command at the beginning of the exercise.

    SAS_CADENCE_DISPLAY_NAME: Stable 2024.06
    SAS_CADENCE_DISPLAY_SHORT_NAME: Stable
    SAS_CADENCE_DISPLAY_VERSION: "2024.06"
    SAS_CADENCE_NAME: stable
    SAS_CADENCE_RELEASE: "20240612.1702438953990"
    SAS_CADENCE_VERSION: "2024.06"

    Note: Now that you have switched to the Stable cadence, there is no SAS_BASE_CADENCE_NAME field. Only LTS versions are based on earlier Stable cadence versions.

  2. Execute the gel_ReadyViya4 function as an additional check of readiness and stability after the update.

    gel_ReadyViya4 -n gelcorp -r 60 -rs 10

    In the output, look for the following messages:

    NOTE: All checks passed. Marking as ready. The first recorded failure was 14m29s ago.
    NOTE: Readiness detected based on parameters used.

    Another way to verify the release of software you are now running is to re-run an ad-hoc Update Checker job (delete the previously run job if necessary).

Cleanup

IMPORTANT: This step must be run to avoid issues with later exercises.

  1. Execute the following command as cloud-user in your MobaXterm terminal session:

    ~/PSGEL260-sas-viya-4.0.1-administration/scripts/gel_tools/gel_getSaveSASViyaDeploymentCadence.sh

SAS Viya Administration Operations
Lesson 08, Section 3 Exercise: Update a License

Updating Licenses

In this exercise, we will renew the SAS license. Considering SAS Viya’s cadence lifecycle and the fact that deployment assets include a license, the license will always be current on a Stable cadence deployment as long as the deployment complies with the support policy (must be no more than four months/versions old). Licenses only need to be renewed on deployments running LTS cadence that are not updated at least once per year.

Table of contents

Set the namespace, the sas-viya CLI profile, and authenticate

  1. Set current namespace and log on.

    gel_setCurrentNamespace gelcorp
    /opt/pyviyatools/loginviauthinfo.py

Review existing license

First, review the existing license file and product expiry dates.

  1. Get the URL for SAS Environment Manager and then click the link in the terminal window.

    gellow_urls | grep "SAS Environment Manager"
  2. Log on as geladm:lnxsas and navigate to the Licenses area and review the list of products and expiration dates.

Apply new license

Renewal licenses are typically available from the customer’s My SAS portal, on the Orders page. For this hands-on activity, the renewal license has already been obtained from the portal and saved as renewal-license.jwt on the sasnode01 server in the cluster.

  1. Copy the renewal license file to the existing license directory (in the $deploy directory).

    cp /mnt/workshop_files/workshop_content/updating-data/renewal-license.jwt \
       /home/cloud-user/project/deploy/license/

    Deploy

  2. Run the sas-orchestration deploy command to start the license update, ensuring you pass in the path to the new license for the --license parameter.

    cd ~/project/deploy
    rm -rf /tmp/${current_namespace}/deploy_work/*
    source ~/project/deploy/.${current_namespace}_vars
    
    docker run --rm \
            -v ${PWD}/license:/license \
            -v ${PWD}/${current_namespace}:/${current_namespace} \
            -v ${HOME}/.kube/config_portable:/kube/config \
            -v /tmp/${current_namespace}/deploy_work:/work \
            -e KUBECONFIG=/kube/config \
            --user $(id -u):$(id -g) \
         sas-orchestration \
            deploy \
               --namespace ${current_namespace} \
               --deployment-data /license/SASViyaV4_${_order}_certs.zip \
               --license /license/renewal-license.jwt \
               --user-content /${current_namespace} \
               --cadence-name ${_cadenceName} \
               --cadence-version ${_cadenceVersion} \
               --cadence-release "" \
               --image-registry ${_viyaMirrorReg}

Validation

  1. Use the sas-viya CLI’s licenses plugin to verify the license has been updated.

    sas_viya --output text licenses products list

    Verify that the expiry dates have been extended.

    Product Name                         Product ID   Status    Max CPU Count   Expiration Date (UTC)   Grace Period End (UTC)   Warning Period End (UTC)
    Base SAS                             0            current   No CPU limit    2026-11-11              2026-12-26               2027-02-09
    SAS/STAT                             1            current   No CPU limit    2026-11-11              2026-12-26               2027-02-09
    SAS/GRAPH                            2            current   No CPU limit    2026-11-11              2026-12-26               2027-02-09
    Enterprise Miner Server              50           current   No CPU limit    2026-11-11              2026-12-26               2027-02-09
    SAS/Secure                           94           current   No CPU limit    2026-11-11              2026-12-26               2027-02-09
    Cloud Analytic Services SAS Client   1000         current   No CPU limit    2026-11-11              2026-12-26               2027-02-09
    ...


SAS Viya Administration Operations
Lesson 08, Section 5 Exercise: Relocate SASWORK

Relocate SAS Programming Run-Time Temporary Files

In this exercise, we will relocate temporary files created for launched compute, connect, and batch sessions from an emptyDir in the pod to a hostPath on the node at /viyavolume. We will demonstrate the change with a SAS Batch program, but the change also affects SAS Compute and launched SAS Connect sessions. It does not affect spawned SAS Connect sessions.

In this hands-on exercise

Open a second terminal session

  1. In MobaXterm, you might have one SSH connection to sasnode01 as cloud-user open already. If not, open one now.

  2. Then, open a second SSH connection to sasnode01 as cloud-user so that you have two open at the same time. We will use both SSH connections in this exercise.

    Two tabs to sasnode01 open in MobaXterm

Submit a batch job which creates a SASWORK dataset

  1. In MobaXterm, in your first SSH connection to sasnode01 as cloud-user, run the bash commands below, all at once:

    tee /shared/gelcontent/gelcorp/shared/code/create_data_and_sleep.sas > /dev/null << EOF
    data dummy_data;
      do i=1 to 11;
        j=ranuni(1234);
        output;
      end;
    run;
    
    /* Keep the session open for five minutes (5 x 60 seconds) */
    data _null_;
        call sleep(5,60);
    run;
    EOF
    
    ls -al /shared/gelcontent/gelcorp/shared/code/create_data_and_sleep.sas
    cat /shared/gelcontent/gelcorp/shared/code/create_data_and_sleep.sas
  2. Then paste and run these commands all at once, to submit create_data_and_sleep.sas to run as a batch job:

    _sasprogram=/gelcontent/gelcorp/shared/code/create_data_and_sleep.sas
    
    # Run the SAS program as a batch job
    gel_sas_viya batch jobs submit-pgm --rem-pgm-path ${_sasprogram} --context default --watchoutput --waitnoresults --results-dir /tmp

    Note: This program creates a temporary dataset, then sleeps for five minutes.

    Watch the output from the command above until you see SAS log output begin to appear, but do not wait for the program to finish running!

    Leave this terminal tab open with the program still running.

Find that dataset in SASWORK on a node’s host filesystem

  1. In your second connection to sasnode01, while the SAS program create_data_and_sleep.sas is still running in your first connection, run this ansible command:

    ansible 'sasnode*' -b -m shell -a 'find /var/lib /viyavolume -type f -name "dummy_data.sas7bdat"  -print | xargs --no-run-if-empty sudo ls -al'

    Example results when create_data_and_sleep.sas is running

    Note: In the example results below, dummy_data.sas7bdat was on sasnode05. Our workshop cluster is configured so that batch jobs can run on any node.

    sasnode05 | CHANGED | rc=0 >>
    -rw-r--r-- 1 geladm sasadmins 131072 Sep 19 13:18 /var/lib/kubelet/pods/63a7c2c3-4c3e-4eff-8d8b-1ace89ac3498/volumes/kubernetes.io~empty-dir/viya/tmp/batch/default/SAS_workC299000001D9_sas-batch-server-8aa7518e-450c-4fce-bb2f-69da17eeba16-23/dummy_data.sas7bdatfind: ‘/viyavolume’: No such file or directory
    
    sasnode03 | CHANGED | rc=0 >>
    find: ‘/viyavolume’: No such file or directory
    
    sasnode02 | CHANGED | rc=0 >>
    find: ‘/viyavolume’: No such file or directory
    
    sasnode04 | CHANGED | rc=0 >>
    find: ‘/viyavolume’: No such file or directory
    
    sasnode01 | CHANGED | rc=0 >>
    find: ‘/viyavolume’: No such file or directory


    The dummy_data.sas7bdat file is at a path something like:

    /var/lib/kubelet/pods/63a7c2c3-4c3e-4eff-8d8b-1ace89ac3498/volumes/kubernetes.io~empty-dir/viya/tmp/batch/default/SAS_workC299000001D9_sas-batch-server-8aa7518e-450c-4fce-bb2f-69da17eeba16-23/dummy_data.sas7bdat

    This path begins with /var/lib/kubelet/pods/<guid>/volumes/kubernetes.io~empty-dir/viya, which is where our RKE Kubernetes deployment has created the host path for the pod’s emptyDir named viya. It is under Kubernetes’ control, and we are not really supposed to be doing anything with this directory.

    Below that path, the SAS_work directory is at …/tmp/batch/default/SAS_work<session_id>_<sas-launcher-pod-name>.
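
    If you want to confirm which pod owns that emptyDir, you can map the <guid> portion of the path (it is the pod's UID) back to a pod name. A minimal sketch; substitute the UID from your own output:

    # Map a pod UID from the host path back to a pod name
    kubectl get pods -o custom-columns='UID:.metadata.uid,NAME:.metadata.name' | grep 63a7c2c3-4c3e-4eff-8d8b-1ace89ac3498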

Create a /viyavolume directory on each node

Next we will create a directory on each node in our cluster, which we can mount into launched programming run-time pods as the ‘viya’ volume. This volume is where temporary files are created by processes running inside the SAS programming run-time container within those pods. If we mount our own hostpath volume into the pods, it will be used instead of the default emptyDir.

Note: In our RACE environment we do not have a more performant or larger volume available. So we will just create a directory on the boot disk, at the root of each node’s filesystem, as /viyavolume. In a production cluster, you would mount larger or more performant storage to the relevant nodes in your cluster, at whatever path you choose, and that path would be what you specify in the .path property of the ‘viya’ volume definition in the PodTemplate overlay created in the next step, where you see /viyavolume below.

  1. Run this in MobaXterm, at the sasnode01 shell prompt, to create a directory on each node in our cluster to be the host path for viya volumes in any pods that have one and run on that node:

    # Create volumes on each node for programming run-time temporary files
    ansible 'sasnode*' -b  -m 'shell' -a 'mkdir -p /viyavolume/; chmod 1777 /viyavolume/'
    ansible 'sasnode*' -m 'shell' -a 'echo "List /viyavolume/ directory content"; ls -al /viyavolume/'

Relocate the viya volume in programming run-time pods to use /viyavolume

  1. Run this to create a patchTransformer manifest that removes the default emptyDir viya volume from the SAS Programming Run-time PodTemplates and adds a viya volume pointing to the new /viyavolume host path (an optional check of the resulting file follows the code below).

    # Relocate viya volume - sas-launcher-jobs
    tee ~/project/deploy/${current_namespace}/site-config/change-viya-volume-storage-class.yaml > /dev/null <<EOF
    apiVersion: builtin
    kind: PatchTransformer
    metadata:
      name: delete-viya-volume
    patch: |-
      apiVersion: v1
      kind: PodTemplate
      metadata:
        name: change-viya-volume-storage-class
      template:
        spec:
          volumes:
            - \$patch: delete
              name: viya
    target:
      kind: PodTemplate
      labelSelector: "sas.com/template-intent=sas-launcher"
    ---
    apiVersion: builtin
    kind: PatchTransformer
    metadata:
      name: add-viya-volume
    patch: |-
      - op: add
        path: /template/spec/volumes/-
        value:
          name: viya
          hostPath:
            path: /viyavolume
    target:
      kind: PodTemplate
      labelSelector: "sas.com/template-intent=sas-launcher"
    EOF
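
    Optionally, confirm the file was written as you expect, mirroring the check we did for create_data_and_sleep.sas earlier:

    ls -al ~/project/deploy/${current_namespace}/site-config/change-viya-volume-storage-class.yaml
    cat ~/project/deploy/${current_namespace}/site-config/change-viya-volume-storage-class.yaml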
  2. Make a copy of your kustomization.yaml file, so we can see the effect of adding a line to it in the next step.

    cp -p ~/project/deploy/gelcorp/kustomization.yaml ~/project/deploy/gelcorp/kustomization-08-071-01.yaml
  3. Update your kustomization.yaml to reference this PatchTransformer:

    Modify ~/project/deploy/gelcorp/kustomization.yaml to reference site-config/change-viya-volume-storage-class.yaml. The change-viya-volume-storage-class.yaml needs to be referenced before sas-bases/overlays/required/transformers.yaml:

    [[ $(grep -c "site-config/change-viya-volume-storage-class.yaml" ~/project/deploy/${current_namespace}/kustomization.yaml) == 0 ]] && \
    sed -i '/sas-bases\/overlays\/required\/transformers.yaml/i \ \ \- site-config\/change-viya-volume-storage-class.yaml' ~/project/deploy/${current_namespace}/kustomization.yaml

    Alternatively, you could have manually edited the transformers section to add the reference as shown below.

    transformers:
      ...
      - site-config/change-viya-volume-storage-class.yaml
      - sas-bases/overlays/required/transformers.yaml
      ...
  4. Run the following command to view the change the previous command made to your kustomization.yaml. The changes are in green in the right column.

    icdiff -W ~/project/deploy/gelcorp/kustomization-08-071-01.yaml ~/project/deploy/gelcorp/kustomization.yaml

    You should see change-viya-volume-storage-class.yaml in green on the right, before transformers.yaml:

    icdiff output showing new line before transformers.yaml
  5. Delete the kustomization-08-071-01.yaml file, so that it is not inadvertently included in gelcorp-sasdeployment.yaml in the next step.

    rm  ~/project/deploy/gelcorp/kustomization-08-071-01.yaml

Build and Apply using SAS-Orchestration Deploy

  1. Keep a copy of the current manifest.yaml file.

    cp -p /tmp/${current_namespace}/deploy_work/deploy/manifest.yaml /tmp/${current_namespace}/manifest_08-071-01.yaml
  2. Run the sas-orchestration deploy command.

    cd ~/project/deploy
    rm -rf /tmp/${current_namespace}/deploy_work/*
    source ~/project/deploy/.${current_namespace}_vars
    
    docker run --rm \
              -v ${PWD}/license:/license \
              -v ${PWD}/${current_namespace}:/${current_namespace} \
              -v ${HOME}/.kube/config_portable:/kube/config \
              -v /tmp/${current_namespace}/deploy_work:/work \
              -e KUBECONFIG=/kube/config \
              --user $(id -u):$(id -g) \
           sas-orchestration \
             deploy \
                --namespace ${current_namespace} \
                --deployment-data /license/SASViyaV4_${_order}_certs.zip \
                --license /license/SASViyaV4_${_order}_license.jwt \
                --user-content /${current_namespace} \
                --cadence-name ${_cadenceName} \
                --cadence-version ${_cadenceVersion} \
                --image-registry ${_viyaMirrorReg}

    When the deploy command completes successfully, the final message should say ‘The deploy command completed successfully’ as shown in the log snippet below.

    The deploy command started
    
    [...]
    
    The deploy command completed successfully

    If the sas-orchestration deploy command fails, check out the steps in 99_Additional_Topics/03_Troubleshoot_SAS_Orchestration_Deploy to help you troubleshoot any problems.
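
    Optionally, before re-running the batch job, you can check that the launcher PodTemplates in the cluster now define the viya volume as a hostPath rather than an emptyDir. A rough sketch; you should see hostPath entries with path: /viyavolume near each viya volume name:

    kubectl get podtemplates -l sas.com/template-intent=sas-launcher -o yaml | grep -E 'hostPath|path: /viyavolume|name: viya'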

Submit the SAS batch job which creates a dataset in SASWORK again

  1. In your first SSH connection to sasnode01 in MobaXterm, submit create_data_and_sleep.sas as a batch job:

    _sasprogram=/gelcontent/gelcorp/shared/code/create_data_and_sleep.sas
    
    # Run the SAS program as a batch job
    gel_sas_viya batch jobs submit-pgm --rem-pgm-path ${_sasprogram} --context default --watchoutput --waitnoresults --results-dir /tmp

    Watch the output from the command above until you see SAS log output begin to appear, but do not wait for the program to finish running!

    Leave this terminal tab open with the program still running.

Find dataset in SASWORK again

  1. In your second connection to sasnode01, while the SAS program create_data_and_sleep.sas is still running in your first connection, run this ansible command:

    ansible 'sasnode*' -b -m shell -a 'find /var/lib /viyavolume -type f -name "dummy_data.sas7bdat"  -print | xargs --no-run-if-empty sudo ls -al'

    Example results when create_data_and_sleep.sas is running

    Note: In the example results below, dummy_data.sas7bdat was on sasnode05 again. Our workshop cluster is configured so that batch jobs can run on any node.

    sasnode05 | CHANGED | rc=0 >>
    -rw-r--r-- 1 geladm sasadmins 131072 Sep 23 07:39 /viyavolume/tmp/batch/default/SAS_workD3C2000001DB_sas-batch-server-6a512e8a-be11-4739-9cb7-a5bb821d97b6-27/dummy_data.sas7bdat
    
    sasnode03 | CHANGED | rc=0 >>
    
    
    sasnode04 | CHANGED | rc=0 >>
    
    
    sasnode02 | CHANGED | rc=0 >>
    
    
    sasnode01 | CHANGED | rc=0 >>


    The dummy_data.sas7bdat file is at a path something like:

    /viyavolume/tmp/batch/default/SAS_workD3C2000001DB_sas-batch-server-6a512e8a-be11-4739-9cb7-a5bb821d97b6-27/dummy_data.sas7bdat

    This path begins with /viyavolume, which is the host path we asked the SAS programming run-time pods to use for the temporary volume named viya. It is under our control.

    Below that path, the SAS_work directory is at …/tmp/batch/default/SAS_work<session_id>_<sas-launcher-pod-name>, as it was before.

OPTIONAL: See left-over temporary files and directories

When the SAS Programming Run-time finishes and exits normally (that is, it does not crash), it does a reasonable job of deleting the temporary files created by the job, but it does leave some of the directory structure behind.

  1. Wait for the batch job to finish running.

See /viyavolume directory structure left by the SAS batch jobs we ran

  1. Run this command in either of your SSH connections to sasnode01. It shows directory structure and files (if there are any) left over after recent batch jobs and other launched programming run-time sessions have run and ended:

    ansible 'sasnode*' -b -m shell -a 'tree /viyavolume/'

    Expected results:

    sasnode05 | CHANGED | rc=0 >>
    /viyavolume/
    ├── log
    │   ├── batch
    │   │   └── default
    │   ├── compsrv
    │   │   └── default
    │   └── connectserver
    │       └── default
    ├── run
    │   ├── batch
    │   │   └── default
    │   │       └── uid4000
    │   ├── compsrv
    │   │   └── default
    │   └── connectserver
    │       └── default
    ├── spool
    │   ├── batch
    │   │   └── default
    │   ├── compsrv
    │   │   └── default
    │   └── connectserver
    │       └── default
    └── tmp
        ├── batch
        │   └── default
        ├── compsrv
        │   └── default
        └── connectserver
            └── default
    
    29 directories, 0 files
    
    sasnode03 | CHANGED | rc=0 >>
    /viyavolume/
    
    0 directories, 0 files
    
    sasnode02 | CHANGED | rc=0 >>
    /viyavolume/
    
    0 directories, 0 files
    
    sasnode04 | CHANGED | rc=0 >>
    /viyavolume/
    
    0 directories, 0 files
    
    sasnode01 | CHANGED | rc=0 >>
    /viyavolume/
    
    0 directories, 0 files
    

    As you can see in the results (even if they do not exactly match the example above), a launched SAS programming run-time pod creates a whole directory structure for its temporary files, whether it will use all of them or not:

    • At the top level of the viya volume in the pods, it creates subdirectories called log, run, spool, and tmp.

    • Below each of these are batch, compsrv, and connectserver subdirectories.

    • Below each of those is a default directory.

      Note: I believe the default directories are named for the compute/connect/batch context; we used the default batch context to run our program.

    Most of those directories are empty, except /viyavolume/run/batch/default, which contains an extra subdirectory named uid4000. We ran our batch job as user geladm, whose POSIX uid in this environment is 4000.
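
    If you want to confirm that ownership, you can list the left-over run directory with numeric IDs on the nodes. A quick sketch; nodes that never ran a batch job will simply return nothing:

    # Show numeric uid/gid ownership of the left-over batch run directories
    ansible 'sasnode*' -b -m shell -a 'ls -aln /viyavolume/run/batch/default/ 2>/dev/null || true'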

Exercise for the reader: See /viyavolume directory structure while a SAS Studio session is running, and after it finishes

  1. Start a SAS Studio session, and then re-run the ansible command above to see what files are present in the /viyavolume directory on the pod’s host node while a compute session is running.

    Question: Are there files present in the temporary directory that you were not expecting to see?

  2. Sign out of SAS Studio. Then re-run the same ansible command again, to see what is left behind after a compute session.

Exercise for the reader: See /viyavolume directory structure left behind by a crashed SAS programming run-time session

  1. By now you know everything you need to try deliberately crashing a compute or batch session in a SAS Compute or SAS Batch pod, and then to find out what files it leaves in the /viyavolume directory on the pod’s host node.

    Hint:

    %macro oops;
      %abort abend;
    %mend;
    %oops;
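
    If you would rather drive the crash as a batch job than from SAS Studio, one approach is to reuse the same pattern from earlier in this exercise. This is only a sketch; the program name oops.sas and its contents are examples, not part of the workshop materials:

    # Write a SAS program that deliberately abends, then submit it as a batch job
    tee /shared/gelcontent/gelcorp/shared/code/oops.sas > /dev/null << EOF
    data dummy_data;
      do i=1 to 11;
        j=ranuni(1234);
        output;
      end;
    run;

    %macro oops;
      %abort abend;
    %mend;
    %oops;
    EOF

    gel_sas_viya batch jobs submit-pgm --rem-pgm-path /gelcontent/gelcorp/shared/code/oops.sas --context default --watchoutput --waitnoresults --results-dir /tmp

    # After the job abends, inspect what it left behind
    ansible 'sasnode*' -b -m shell -a 'tree /viyavolume/'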

SAS Viya Administration Operations
Lesson 08, Section 5 Exercise: Relocate CAS_DISK_CACHE

Configure a new location for CAS_DISK_CACHE

In this exercise you will reconfigure the cas-shared-gelcorp server to relocate its CAS_DISK_CACHE from its default location in an emptyDir volume to a hostPath volume location that you will create on each Kubernetes node that you expect to host CAS server pods.


Set the namespace

gel_setCurrentNamespace gelcorp

Identify the current location of CAS_DISK_CACHE

Before you make any changes let’s examine the CAS server configuration to verify where CAS_DISK_CACHE is currently located.

  1. Open SAS Environment Manager, log in as geladm, and assume the SASAdministrators membership.

    gellow_urls | grep "SAS Environment Manager"
    1. Navigate to the Servers page.

    2. Right-click on cas-shared-gelcorp and select the Configuration option.

    3. Navigate to the Nodes tab

    4. Choose the Controller node, right-click on it, and select the Runtime Environment option.

    5. Scroll through the Environment Variable table until you find the CAS_DISK_CACHE variable. You should see this:

      08_072_CASServer_CAS_DISK_CACHE_0000
  2. Let's try using kubectl to see if we can obtain the same information. Normally we would expect to find this information in a CASENV_CAS_DISK_CACHE environment variable on the CAS controller, so let's use kubectl to display and filter candidate variable values. Notice that the command below runs env in the sas-cas-server container of the sas-cas-server-shared-gelcorp-controller pod.

    _CASControllerPodName=$(kubectl get pod \
          --selector "casoperator.sas.com/server==shared-gelcorp,casoperator.sas.com/node-type==controller,casoperator.sas.com/controller-index==0" \
          --no-headers \
       | awk '{printf $1}')
    
    echo ${_CASControllerPodName}
    
    kubectl exec -it ${_CASControllerPodName} \
                 -c sas-cas-server \
                 -- env \
       | grep "CAS" \
       | grep -v "SAS_"

    You should see something like…

    CASCFG_DQSETUPLOC=QKB CI 33
    CASCFG_HOSTKNOWNBY=controller.sas-cas-server-shared-gelcorp.gelcorp
    CASENV_CAS_VIRTUAL_HOST=controller.sas-cas-server-shared-gelcorp.gelcorp
    CASKEY=ce087d3dbd1a5a39abe248a8e9b6a36d488e101550e016531b05b49b46c7687c
    CASCFG_DQLOCALE=ENUSA
    CASCFG_INITIALBACKUPS=1
    CAS_POD_NAME=sas-cas-server-shared-gelcorp-controller
    CASCFG_MODE=mpp
    CASCONTROLLERHOST=controller.sas-cas-server-shared-gelcorp.gelcorp
    CASCFG_INITIALWORKERS=2
    CASBACKUPHOST=backup.sas-cas-server-shared-gelcorp.gelcorp
    CASCLOUDNATIVE=1
    CASENV_CAS_VIRTUAL_PATH=/cas-shared-gelcorp-http
    CAS_CLIENT_SSL_CA_LIST=/security/trustedcerts.pem
    CASENV_CONSUL_NAME=cas-shared-gelcorp
    CASENV_CAS_K8S_SERVICE_NAME=sas-cas-server-shared-gelcorp-client
    CASENV_CASDEPLOYMENT_SPEC_ALLOWLIST_APPEND=/cas/data/caslibs:/gelcontent:/mnt/gelcontent/
    CASENV_CASDATADIR=/cas/data
    CASENV_CASPERMSTORE=/cas/permstore
    CASCFG_GCPORT=5571
    CASENV_CAS_VIRTUAL_PROTO=http
    CASENV_CAS_VIRTUAL_PORT=8777
    CASENV_CAS_LICENSE=/cas/license/license.sas

    Remember that we are looking for the CASENV_CAS_DISK_CACHE variable, but it appears that CASENV_CAS_DISK_CACHE is not defined. That is because CAS_DISK_CACHE is still using the default emptyDir volume, which the CAS server locates at the default path /cas/cache.
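
    If you want to see which of the CAS controller pod's volumes are emptyDir volumes (one of which backs the default /cas/cache location), you can list them with a jsonpath query. This is just a sketch, reusing the pod name captured above; volumes that are not emptyDir print an empty value:

    kubectl get pod ${_CASControllerPodName} \
            -o jsonpath='{range .spec.volumes[*]}{.name}{"\t"}{.emptyDir}{"\n"}{end}'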

Reconfigure CAS to relocate CAS_DISK_CACHE

To relocate CAS_DISK_CACHE you will need to

  • Create the directories for the CAS_DISK_CACHE on each Kubernetes node used for CAS pods
  • Create a patchTransformer manifest to
    • mount the directories to the CAS pods
    • configure CAS to use the new CAS_DISK_CACHE location
  • Add the patchTransformer manifest to kustomization.yaml
  • Rebuild and apply your new SASDeployment custom resource to implement the changes.
  1. Create the new directories for CAS_DISK_CACHE on each Kubernetes cluster node. In this case, we are going to use Ansible to create the directories on all of the nodes.

    _casInstance=shared-gelcorp
    ansible 'sasnode*' \
            -b \
            -m 'shell' \
            -a "mkdir -p /casdiskcache/${_casInstance};
                chmod 777 /casdiskcache;
                chmod 777 /casdiskcache/${_casInstance};"
    ansible 'sasnode*' \
            -b \
            -m 'shell' \
            -a "mkdir -p /casdiskcache/${_casInstance}/cdc01;
                mkdir -p /casdiskcache/${_casInstance}/cdc02;
                mkdir -p /casdiskcache/${_casInstance}/cdc03;
                mkdir -p /casdiskcache/${_casInstance}/cdc04;
                chmod 1777 /casdiskcache/${_casInstance}/*;"
    ansible 'sasnode*' \
            -b \
            -m 'shell' \
            -a "ls -al /casdiskcache/${_casInstance};"
  2. Now that the directories exist, create a patchTransformer manifest to mount the directories to CAS pods and to configure CAS to set the value for the CASENV_CAS_DISK_CACHE environment variable so CAS will use the new directories.

    tee ~/project/deploy/${current_namespace}/site-config/cas-manage-casdiskcache-${_casInstance}.yaml  > /dev/null << EOF
    # This patchTransformer file is created for the ${_casInstance} CAS server only
    ---
    # This block of code creates the CAS server mount point - node path /casdiskcache/${_casInstance} mounted to container path /casdiskcache
    apiVersion: builtin
    kind: PatchTransformer
    metadata:
      name: cas-add-host-mount-casdiskcache-${_casInstance}
    patch: |-
        - op: add
          path: /spec/controllerTemplate/spec/volumes/-
          value:
            name: casdiskcache
            hostPath:
              path: /casdiskcache/${_casInstance}
        - op: add
          path: /spec/controllerTemplate/spec/containers/0/volumeMounts/-
          value:
            name: casdiskcache
            mountPath: /casdiskcache
    target:
      group: viya.sas.com
      kind: CASDeployment
    # Target filtering, choose/uncomment one of these options:
    #    To filter the default CAS server (cas-shared-default) only:
      #labelSelector: "sas.com/cas-server-default"
    #    To filter another CAS server (casdeployments):
      #name: <CASInstanceName>
      name: ${_casInstance}
    #    To filter all CAS servers:
      #name: .*
      version: v1alpha1
    
    ---
    # This block of code is for adding environment variables for the CAS server.
    apiVersion: builtin
    kind: PatchTransformer
    metadata:
      name: cas-add-environment-variables-casdiskcache-${_casInstance}
    patch: |-
        - op: add
          path: /spec/controllerTemplate/spec/containers/0/env/-
          value:
            name: CASENV_CAS_DISK_CACHE
            value: "/casdiskcache/cdc01:/casdiskcache/cdc02:/casdiskcache/cdc03:/casdiskcache/cdc04"
    target:
      group: viya.sas.com
      kind: CASDeployment
    # Target filtering, choose/uncomment one of these options:
    #    To filter the default CAS server (cas-shared-default) only:
      #labelSelector: "sas.com/cas-server-default"
    #    To filter another CAS server (casdeployments):
      #name: <CASInstanceName>
      name: ${_casInstance}
    #    To filter all CAS servers:
      #name: .*
      version: v1alpha1
    EOF
  3. Now add a reference to cas-manage-casdiskcache-shared-gelcorp.yaml in the kustomization.yaml file.

    • Backup the current kustomization.yaml file.

      cp -p ~/project/deploy/${current_namespace}/kustomization.yaml /tmp/${current_namespace}/kustomization_05-042-01.yaml
    • Use this yq command to add a reference to the cas-manage-casdiskcache-shared-gelcorp.yaml manifest in the transformers field of the Viya deployment kustomization.yaml file. While the command may look complicated, it is simply adding the reference after making sure that the reference does not already exist.

      [[ $(grep -c "site-config/cas-manage-casdiskcache-${_casInstance}.yaml" ~/project/deploy/${current_namespace}/kustomization.yaml) == 0 ]] && \
      yq4 eval -i '.transformers += ["site-config/cas-manage-casdiskcache-'${_casInstance}'.yaml"]' ~/project/deploy/${current_namespace}/kustomization.yaml
    • Alternatively, you can update the Viya deployment kustomization.yaml file using your favorite text editor:

      [...]
      transformers:
        [... previous transformers items ...]
        - site-config/cas-manage-casdiskcache-shared-gelcorp.yaml
      [...]
  4. Check that the update is in place.

    cat ~/project/deploy/${current_namespace}/kustomization.yaml

    Make sure that site-config/cas-manage-casdiskcache-shared-gelcorp.yaml exists in the transformers field of the Viya deployment kustomization.yaml file.

    The full kustomization.yaml should look something like this:

    ---
    namespace: gelcorp
    resources:
      - sas-bases/base
      # GEL Specifics to create CA secret for OpenSSL Issuer
      - site-config/security/gel-openssl-ca
      - sas-bases/overlays/network/networking.k8s.io # Using networking.k8s.io API since 2021.1.6
      - site-config/security/openssl-generated-ingress-certificate.yaml # Default to OpenSSL Issuer in 2021.2.6
      - sas-bases/overlays/cas-server
      - sas-bases/overlays/crunchydata/postgres-operator # New Stable 2022.10
      - sas-bases/overlays/postgres/platform-postgres # New Stable 2022.10
      - sas-bases/overlays/internal-elasticsearch # New Stable 2020.1.3
      - sas-bases/overlays/update-checker # added update checker
      ## disable CAS autoresources to keep things simpler
      #- sas-bases/overlays/cas-server/auto-resources                                        # CAS-related
      #- sas-bases/overlays/crunchydata_pgadmin                                              # Deploy the sas-crunchy-data-pgadmin container - remove 2022.10
      - site-config/sas-prepull/add-prepull-cr-crb.yaml
      - sas-bases/overlays/cas-server/state-transfer # Enable state transfer for the cas-shared-default CAS server - new PVC sas-cas-transfer-data
      - site-config/sas-microanalytic-score/astores/resources.yaml
      - site-config/gelcontent_pvc.yaml
      - site-config/cas-shared-gelcorp
    configurations:
      - sas-bases/overlays/required/kustomizeconfig.yaml
    transformers:
      - sas-bases/overlays/internal-elasticsearch/sysctl-transformer.yaml # New Stable 2020.1.3
      - sas-bases/overlays/startup/ordered-startup-transformer.yaml
      - site-config/cas-enable-host.yaml
      - sas-bases/overlays/required/transformers.yaml
      - site-config/mirror.yaml
      #- site-config/daily_update_check.yaml      # change the frequency of the update-check
      #- sas-bases/overlays/cas-server/auto-resources/remove-resources.yaml    # CAS-related
      ## temporarily removed to alleviate RACE issues
      - sas-bases/overlays/internal-elasticsearch/internal-elasticsearch-transformer.yaml # New Stable 2020.1.3
      - sas-bases/overlays/sas-programming-environment/enable-admin-script-access.yaml # To enable admin scripts
      #- sas-bases/overlays/scaling/zero-scale/phase-0-transformer.yaml
      #- sas-bases/overlays/scaling/zero-scale/phase-1-transformer.yaml
      - sas-bases/overlays/cas-server/state-transfer/support-state-transfer.yaml # Enable state transfer for the cas-shared-default CAS server - enable and mount new PVC
      - site-config/change-check-interval.yaml
      - sas-bases/overlays/sas-microanalytic-score/astores/astores-transformer.yaml
      - site-config/sas-pyconfig/change-configuration.yaml
      - site-config/sas-pyconfig/change-limits.yaml
      - site-config/cas-add-nfs-mount.yaml
      - site-config/cas-add-allowlist-paths.yaml
      - site-config/cas-modify-user.yaml
      - site-config/cas-manage-casdiskcache-shared-gelcorp.yaml
    components:
      - sas-bases/components/crunchydata/internal-platform-postgres # New Stable 2022.10
      - sas-bases/components/security/core/base/full-stack-tls
      - sas-bases/components/security/network/networking.k8s.io/ingress/nginx.ingress.kubernetes.io/full-stack-tls
    patches:
      - path: site-config/storageclass.yaml
        target:
          kind: PersistentVolumeClaim
          annotationSelector: sas.com/component-name in (sas-backup-job,sas-data-quality-services,sas-commonfiles,sas-cas-operator,sas-pyconfig)
      - path: site-config/cas-gelcontent-mount-pvc.yaml
        target:
          group: viya.sas.com
          kind: CASDeployment
          name: .*
          version: v1alpha1
      - path: site-config/compute-server-add-nfs-mount.yaml
        target:
          labelSelector: sas.com/template-intent=sas-launcher
          version: v1
          kind: PodTemplate
      - path: site-config/compute-server-annotate-podtempate.yaml
        target:
          name: sas-compute-job-config
          version: v1
          kind: PodTemplate
    secretGenerator:
      - name: sas-consul-config
        behavior: merge
        files:
          - SITEDEFAULT_CONF=site-config/sitedefault.yaml
      - name: sas-image-pull-secrets
        behavior: replace
        type: kubernetes.io/dockerconfigjson
        files:
          - .dockerconfigjson=site-config/crcache-image-pull-secrets.json
    configMapGenerator:
      - name: ingress-input
        behavior: merge
        literals:
          - INGRESS_HOST=gelcorp.pdcesx03145.race.sas.com
      - name: sas-shared-config
        behavior: merge
        literals:
          - SAS_SERVICES_URL=https://gelcorp.pdcesx03145.race.sas.com
      # # This is to fix an issue that only appears in very slow environments.
      # # Do not do this at a customer site
      - name: sas-go-config
        behavior: merge
        literals:
          - SAS_BOOTSTRAP_HTTP_CLIENT_TIMEOUT_REQUEST='15m'
      - name: input
        behavior: merge
        literals:
          - IMAGE_REGISTRY=crcache-race-sas-cary.unx.sas.com
  5. Now let’s rebuild and apply the Viya deployment manifest to apply the new CAS_DISK_CACHE setting.

    1. Keep a copy of the current manifest.yaml file.

      cp -p /tmp/${current_namespace}/deploy_work/deploy/manifest.yaml /tmp/${current_namespace}/manifest_05-042-01.yaml
    2. Run the sas-orchestration deploy command.

      cd ~/project/deploy
      rm -rf /tmp/${current_namespace}/deploy_work/*
      source ~/project/deploy/.${current_namespace}_vars
      
      docker run --rm \
                 -v ${PWD}/license:/license \
                 -v ${PWD}/${current_namespace}:/${current_namespace} \
                 -v ${HOME}/.kube/config_portable:/kube/config \
                 -v /tmp/${current_namespace}/deploy_work:/work \
                 -e KUBECONFIG=/kube/config \
                 --user $(id -u):$(id -g) \
             sas-orchestration \
                deploy \
                   --namespace ${current_namespace} \
                   --deployment-data /license/SASViyaV4_${_order}_certs.zip \
                   --license /license/SASViyaV4_${_order}_license.jwt \
                   --user-content /${current_namespace} \
                   --cadence-name ${_cadenceName} \
                   --cadence-version ${_cadenceVersion} \
                   --image-registry ${_viyaMirrorReg}

      When the deploy command completes successfully the final message should say The deploy command completed successfully as shown in the log snippet below.

      The deploy command started
      
      [...]
      
      The deploy command completed successfully

      If the sas-orchestration deploy command fails, check out the steps in 99_Additional_Topics/03_Troubleshoot_SAS_Orchestration_Deploy to help you troubleshoot any problems.

  6. Restart the cas-shared-gelcorp server so that it is aware of the new CAS_DISK_CACHE configuration.

    Since we enabled state transfer for the cas-shared-gelcorp server, you now have two choices for restarting the CAS server.

    • Choice 1: initiate the state transfer

      All loaded tables and active CAS sessions will be kept.

      The casoperator.sas.com/instance-index label for all pods of the CAS server will be incremented by 1.

      kubectl patch casdeployment shared-gelcorp \
                    --type='json' \
                    -p='[{"op": "replace", "path": "/spec/startStateTransfer", "value":true}]'
    • Choice 2: delete the CAS server pods

      All loaded tables and active CAS sessions will be lost.

      The casoperator.sas.com/instance-index label for all pods of the CAS server will be reset to 0.

      kubectl delete pod \
                     --selector="casoperator.sas.com/server==shared-gelcorp"

    Quickly switch over to OpenLens and watch what happens to the CAS pods.

    1. If you switch over to OpenLens fast enough you may be able to see the cas-shared-gelcorp pods terminate.

      08_072_Lens_Monitor_Gelcorp_CASServer_0000
    2. Then you should see all cas-shared-gelcorp pods restart.

      08_072_Lens_Monitor_Gelcorp_CASServer_0001
      08_072_Lens_Monitor_Gelcorp_CASServer_0002
    3. When all pods are Running and the containers of all cas-shared-gelcorp pods show green, the server is ready to be used.

      08_072_Lens_Monitor_Gelcorp_CASServer_0003
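
    If you prefer the command line to OpenLens, a rough alternative is to watch the CAS pods and their casoperator.sas.com/instance-index label change as they restart (press Ctrl+C to stop watching):

    kubectl get pods \
            --selector="casoperator.sas.com/server==shared-gelcorp" \
            -L casoperator.sas.com/instance-index \
            --watch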

    As one last validation step, run the following command to make sure the CAS server is ready.

    kubectl wait pods \
                 --selector="casoperator.sas.com/server==shared-gelcorp" \
                 --for condition=ready --timeout 15m

    You should see these messages in the output.

    pod/sas-cas-server-shared-gelcorp-backup condition met
    pod/sas-cas-server-shared-gelcorp-controller condition met
    pod/sas-cas-server-shared-gelcorp-worker-0 condition met
    pod/sas-cas-server-shared-gelcorp-worker-1 condition met

    The cas-shared-gelcorp server is now reconfigured to use the new CAS_DISK_CACHE location.

  7. Let’s validate the new CAS_DISK_CACHE settings by repeating the steps you did at the start of this exercise.

    1. Open SAS Environment Manager, log in as geladm, and assume the SASAdministrators membership.

      gellow_urls | grep "SAS Environment Manager"
      1. Navigate to the Servers page.

      2. Right-click on cas-shared-gelcorp and select the Configuration option.

      3. Navigate to the Nodes tab

      4. Choose the Controller node, right-click on it, and select the Runtime Environment option.

      5. Scroll through the Environment Variable table until you find the CAS_DISK_CACHE variable. You should see that CAS_DISK_CACHE is pointing to the new hostpath location.

        08_072_CASServer_CAS_DISK_CACHE_0001
    2. Let’s try using kubectl again to see if we can obtain the CAS_DISK_CACHE information. Remember that earlier the CASENV_CAS_DISK_CACHE environment variable was undefined on the CAS controller.

      _CASControllerPodName=$(kubectl get pod \
            --selector "casoperator.sas.com/server==shared-gelcorp,casoperator.sas.com/node-type==controller,casoperator.sas.com/controller-index==0" \
            --no-headers \
         | awk '{printf $1}')
      
      echo ${_CASControllerPodName}
      
      kubectl exec -it ${_CASControllerPodName} \
                   -c sas-cas-server \
                   -- env \
         | grep "CAS" \
         | grep -v "SAS_"

      Do you see the CASENV_CAS_DISK_CACHE variable this time?

      You should see something like this.

      CASCONTROLLERHOST=controller.sas-cas-server-shared-gelcorp.gelcorp
      CASENV_CAS_DISK_CACHE=/casdiskcache/cdc01:/casdiskcache/cdc02:/casdiskcache/cdc03:/casdiskcache/cdc04
      CASENV_CAS_VIRTUAL_HOST=controller.sas-cas-server-shared-gelcorp.gelcorp
      CASBACKUPHOST=backup.sas-cas-server-shared-gelcorp.gelcorp
      CASENV_CAS_K8S_SERVICE_NAME=sas-cas-server-shared-gelcorp-client
      CASENV_CONSUL_NAME=cas-shared-gelcorp
      CASCFG_HOSTKNOWNBY=controller.sas-cas-server-shared-gelcorp.gelcorp
      CASENV_CAS_VIRTUAL_PATH=/cas-shared-gelcorp-http
      CASCFG_DQSETUPLOC=QKB CI 33
      CASKEY=ce087d3dbd1a5a39abe248a8e9b6a36d488e101550e016531b05b49b46c7687c
      CASCFG_INITIALBACKUPS=1
      CASCFG_DQLOCALE=ENUSA
      CASCFG_INITIALWORKERS=2
      CAS_CLIENT_SSL_CA_LIST=/security/trustedcerts.pem
      CAS_POD_NAME=sas-cas-server-shared-gelcorp-controller
      CASCFG_MODE=mpp
      CASENV_CASDEPLOYMENT_SPEC_ALLOWLIST_APPEND=/cas/data/caslibs:/gelcontent:/mnt/gelcontent/
      CASCLOUDNATIVE=1
      CASENV_CASDATADIR=/cas/data
      CASENV_CASPERMSTORE=/cas/permstore
      CASCFG_GCPORT=5571
      CASENV_CAS_VIRTUAL_PROTO=http
      CASENV_CAS_VIRTUAL_PORT=8777
      CASENV_CAS_LICENSE=/cas/license/license.sas
    3. Extra credit: Look at the contents of the CAS_DISK_CACHE on the Kubernetes nodes' file systems.

      ansible 'sasnode*' \
              -b \
              -m 'shell' \
              -a 'lsof -nP -c cas 2>/dev/null \
                     | grep "(deleted)" \
                     | grep -E "casdiskcache|sasnode"' \
         | grep -v "| FAILED |" \
         | grep -v "non-zero return code"

      This allows you to see the blocks of data loaded into the cas-shared-gelcorp server. Remember that the HR tables were automatically reloaded because of the session zero settings you configured earlier.


      You should see something like…

      sasnode03 | CHANGED | rc=0 >>
      cas       23335  sas   24r   REG    8,3   801896 168188636 /casdiskcache/cdc03/casmap_1487_48FEF3E0_0x7f3816287418_801896 (deleted)
      cas       23335  sas   27r   REG    8,3   253016 172597847 /casdiskcache/cdc04/casmap_1487_48FEF4CB_0x7f3816287418_253016 (deleted)
      cas       23335  sas   28r   REG    8,3     3504 164078333 /casdiskcache/cdc02/casmap_1487_48FF1F0F_0x7f380fdf72a8_3504 (deleted)
      cas       23335  sas   30r   REG    8,3  5042208 168188637 /casdiskcache/cdc03/casmap_1487_48FF20BE_0x7f380fcf1008_5042208 (deleted)
      cas       23335  sas   32r   REG    8,3  5052576 172602757 /casdiskcache/cdc04/casmap_1487_48FF2135_0x7f380fdf72a8_5052576 (deleted)
      sasnode02 | CHANGED | rc=0 >>
      cas        4201  sas   24r      REG    8,3   804200 163935601 /casdiskcache/cdc04/casmap_1490_48FEF3E1_0x7f2d7cc38418_804200 (deleted)
      cas        4201  sas   27r      REG    8,3   253016 152216596 /casdiskcache/cdc01/casmap_1490_48FEF4CC_0x7f2d7cc38418_253016 (deleted)
      cas        4201  sas   28r      REG    8,3     3504 163935611 /casdiskcache/cdc04/casmap_1490_48FF1EFE_0x7f2d7cc38008_3504 (deleted)
      cas        4201  sas   30r      REG    8,3  5052576 152216598 /casdiskcache/cdc01/casmap_1490_48FF20AE_0x7f2d7cc38008_5052576 (deleted)
      cas        4201  sas   32r      REG    8,3  5042208 155530074 /casdiskcache/cdc02/casmap_1490_48FF2157_0x7f2d768232a8_5042208 (deleted)
      sasnode04 | CHANGED | rc=0 >>
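
      Because these cache files are unlinked (deleted) while the CAS process still holds them open, du on the cache directories reports almost nothing, while df on the node's filesystem still accounts for the space in use. A quick way to see both, as a rough sketch:

      # du misses the unlinked cache files; df still counts the space they occupy
      ansible 'sasnode*' -b -m shell -a 'du -sh /casdiskcache/shared-gelcorp; df -h /casdiskcache'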

Lesson learned

To relocate CAS_DISK_CACHE from its default emptyDir volume to a hostpath volume you will need to:

  • Create directories for the CAS_DISK_CACHE on each Kubernetes node used for CAS pods
  • Create a patchTransformer manifest to
    • mount the directories to the CAS pods
    • configure CAS to use the new CAS_DISK_CACHE location
  • Add the patchTransformer manifest to kustomization.yaml
  • Rebuild and apply your new SASDeployment custom resource to implement the changes
  • Restart the CAS server

Lesson 09

SAS Viya Administration Operations
Lesson 09, Section 1 Exercise: Configure Compute CPU and Memory

Configure Compute Memory Limits

In its default configuration, the SAS Viya compute pod template specifies that its main container should be limited to using 2GB of memory. SAS Viya’s default value for the MEMSIZE SAS system option is also 2GB.

In this exercise, we will see what happens when you increase MEMSIZE above the compute pod’s memory limit, and then run SAS code which tries to use more memory than that limit: although SAS tries to prevent us from doing so, it is not always successful, and we can cause the compute pod to be killed by Kubernetes.

We will then increase the compute pod’s main container’s memory limit, so that the same SAS code runs successfully.

In this hands-on exercise

Start a Compute Session in SAS Studio

  1. In Chrome, open SAS Studio. If you need the URL, run this and Ctrl + click the link it outputs:

    gellow_urls | grep "SAS Studio"
  2. Log in to SAS Studio as:

    • username: Delilah
    • password: lnxsas
  3. When SAS Studio opens, wait until you have a compute session running under the “SAS Studio compute context”.

See the current value of the MEMSIZE SAS system option

As the documentation explains, the MEMSIZE system option specifies the limit on the total amount of virtual memory that can be used by a SAS session.

  1. Open a new SAS Program tab.

  2. Copy the following into the new SAS Program tab, and run it:

    proc options option=memsize; run;

    Expected SAS log output:

    80       proc options option=memsize; run;
        SAS (r) Proprietary Software Release V.04.00  TS1M0
     MEMSIZE=2147483648
                       Specifies the limit on the amount of virtual memory that can be used during a SAS session.

    Here, we see that MEMSIZE=2147483648 bytes, which is 2 Gigabytes (= 2 * (1024 ^ 3) bytes), the default value in SAS Viya.

    So, SAS should prevent your session from using more than 2 GB of memory. Let's try using more to see what happens.

Create several large datasets and try to load them into memory

  1. In SAS Studio, still signed in as Delilah, still in a compute session under the “SAS Studio compute context”, copy the following code into the new SAS Program tab, and run it:

    %let libraryname=work;
    %let datasetname=bigtable;
    %let rows=671089;
    
    %macro generate(n_rows,n_num_cols,n_char_cols,outdata=test,seed=0);
        data &outdata;
            array nums[&n_num_cols];
            array chars[&n_char_cols] $;
            temp = "abcdefghijklmnopqrstuvwxyz";
            do i=1 to &n_rows;
                do j=1 to &n_num_cols;
                    nums[j] = ranuni(&seed);
                end;
                do j=1 to &n_char_cols;
                    chars[j] = substr(temp,ceil(ranuni(&seed)*18),8);
                end;
                output;
            end;
            drop i j temp;
        run;
    %mend;
    
    %generate(&rows.,100,100,outdata=&datasetname);
    %generate(&rows.,100,100,outdata=&datasetname.2);
    %generate(&rows.,100,100,outdata=&datasetname.3);
    
    PROC SQL ;
    TITLE 'Filesize for &datasetname Data Set' ;
    SELECT libname,
            memname,
            memtype,
            FILESIZE FORMAT=SIZEKMG.,
            FILESIZE FORMAT=SIZEK.
        FROM DICTIONARY.TABLES
        WHERE libname = upper("&libraryname")
            AND memname CONTAINS upper("&datasetname")
            AND memtype = "DATA" ;
    QUIT ;
    
    * Load datasets into memory;
    sasfile &libraryname..&datasetname load;
    sasfile &libraryname..&datasetname.2 load;
    sasfile &libraryname..&datasetname.3 load;

    Expected results - each of the datasets created is about 1GB:

    Large datasets
  2. Switch to the Log tab on the right hand panel in SAS Studio, and look for the log output from running the three sasfile ... load; statements at the end of the program.

    When the compute server tries to run the three sasfile &libraryname..&datasetname load; statements at the end of that program, you should see a NOTE, a WARNING and an ERROR in the SAS program log, something like this:

    123  * Load datasets into memory;
    124  sasfile &libraryname..&datasetname load;
    NOTE: The file WORK.BIGTABLE.DATA has been loaded into memory by the SASFILE statement.
    125  sasfile &libraryname..&datasetname.2 load;
    WARNING: Only 3950 of 8286 pages of WORK.BIGTABLE2.DATA can be loaded into memory by the SASFILE statement.
    126  sasfile &libraryname..&datasetname.3 load;
    ERROR: File WORK.BIGTABLE3.DATA is damaged. I/O processing did not complete.
    127

    When the compute server tried to load the three datasets into memory, it loaded the first one successfully. It could only load part of the second dataset, because that dataset is larger than the compute server's remaining unallocated memory below the MEMSIZE limit, and it failed to load the third dataset entirely, possibly because it had run out of available memory.

    Tip: To find the .sas7bdat files for the three datasets on your cluster, run this in MobaXterm in an SSH session to sasnode01 as cloud-user; it searches each node for the files:

    ansible 'sasnode*' -b -m shell -a 'find /var/lib -type f -name "bigtable*.sas7bdat"  -print | xargs --no-run-if-empty sudo ls -al'

    Example output is below. The command found all three bigtable dataset files on sasnode02, but they might be on a different node when you run this. It depends where SAS Workload Management decided to start the compute server pod:

    sasnode02 | CHANGED | rc=0 >>
    -rw-r--r-- 1 1283006467 1283006467 1086193664 Jul 31 14:10 /var/lib/kubelet/pods/2e7392e7-4012-48f7-b5ae-7889e7837dcb/volumes/kubernetes.io~empty-dir/viya/tmp/compsrv/default/313c5741-1776-4f36-9a61-2eda79bc23de/SAS_workC168000001A2_sas-compute-server-c0a20423-be04-4455-b754-b8b5fc0dd1f1-33/bigtable2.sas7bdat
    -rw-r--r-- 1 1283006467 1283006467 1086193664 Jul 31 14:10 /var/lib/kubelet/pods/2e7392e7-4012-48f7-b5ae-7889e7837dcb/volumes/kubernetes.io~empty-dir/viya/tmp/compsrv/default/313c5741-1776-4f36-9a61-2eda79bc23de/SAS_workC168000001A2_sas-compute-server-c0a20423-be04-4455-b754-b8b5fc0dd1f1-33/bigtable3.sas7bdat
    -rw-r--r-- 1 1283006467 1283006467 1086193664 Jul 31 14:10 /var/lib/kubelet/pods/2e7392e7-4012-48f7-b5ae-7889e7837dcb/volumes/kubernetes.io~empty-dir/viya/tmp/compsrv/default/313c5741-1776-4f36-9a61-2eda79bc23de/SAS_workC168000001A2_sas-compute-server-c0a20423-be04-4455-b754-b8b5fc0dd1f1-33/bigtable.sas7bdat
    
    sasnode05 | CHANGED | rc=0 >>
    
    
    sasnode04 | CHANGED | rc=0 >>
    
    
    sasnode01 | CHANGED | rc=0 >>
    
    
    sasnode03 | CHANGED | rc=0 >>

Try to increase memsize in a running compute server

  1. Still in SAS Studio as Delilah, choose New > SAS Program from the menu to open another SAS program window.

  2. In the second SAS Program window, try running this SAS statement:

    options memsize=4G;

    Note: Possible result: sometimes, the log might contain about four error messages like this:

    ERROR: XOB failure detected.  Aborted during the COMPILATION phase.

    If you see these errors, they indicate that the compute server failed to run SAS Studio’s preamble code. If this happens, your compute server is not in a healthy state. You can fix that by starting a new compute server, and inside it a new compute session, as follows:

    1. Still in SAS Studio as Delilah, choose Options > Reset SAS session, and in the Reset Session prompt, click Reset.

    2. Wait while the new compute session is started in a new compute server. This may take 30 seconds or so.

    3. Try running the same SAS options statement in your second SAS program tab again:

    options memsize=4G;

    The expected result is a warning, saying you are not allowed to change the memsize in a compute server after it has finished starting up:

    80   options memsize=4G;
                -------
                30
    WARNING 30-12: SAS option MEMSIZE is valid only at startup of the SAS System. The SAS option is ignored.

    Because MEMSIZE can only be changed during compute server initialization, by default only SAS Administrators can change MEMSIZE for a compute server (for example, by setting it in a compute context). This is a good thing.

Create a compute context with increased memsize

We will create a compute context with a memsize of 4GB, so that the compute server can load the roughly 3GB of datasets our program creates into memory with some 'room to spare', because we know that some of its total memory capacity will already be in use for other things.

  1. In Firefox, open SAS Environment Manager. If you need the URL, run this in your MobaXterm session connected to sasnode01 as cloud-user:

    gellow_urls | grep "SAS Environment Manager"

    Click and drag your mouse pointer to select the SAS Environment Manager URL to the clipboard, then paste it into the address bar in Firefox.

  2. In Firefox, log in to SAS Environment Manager as:

    • username: geladm
    • password: lnxsas
  3. Navigate to the Contexts page in SAS Environment Manager.

  4. In the Contexts page, from the View menu select Compute contexts.

  5. Right-click the SAS Studio compute context, and select Copy from the popup menu.

  6. Name the copy of the compute context “SAS Studio compute context with memsize 4G”, and on the Advanced tab, paste this into the box labelled “Enter each SAS option on a new line:”

    -memsize 4G

    The Advanced tab of the new compute context dialog should look like this:

    New SAS Studio compute context 4G Memory - Advanced tab
  7. Click Save to save the new compute context. You should see it in the list of compute contexts.

Create several large datasets and try to load them into memory in a compute context with memsize = 4G

  1. Switch back to Chrome.

  2. In Chrome, still in SAS Studio, still signed in as Delilah, click the Reload the page button, or press F5 to reload the web page. Click Reload if prompted in a popup dialog. This will start a new SAS Studio session, but you will still be signed in as Delilah.

    Note: The previous compute server and its pod are not terminated right away. They will eventually time out and be terminated.

  3. When the SAS Studio page has reloaded, and a new compute session has finished starting, click on the compute context menu, and choose ‘SAS Studio compute context with memsize 4G’ from the dropdown list:

    Choose the new compute context with 4G memory

    In the Change Compute Context popup, click Change to confirm.

  4. When the new compute session has started under “SAS Studio compute context with memsize 4G”, run the same proc options statement you ran earlier to see the new value of memsize:

    proc options option=memsize; run;

    Expected SAS log output:

    80   proc options option=memsize; run;
        SAS (r) Proprietary Software Release V.04.00  TS1M0
    MEMSIZE=4294967296
                    Specifies the limit on the amount of virtual memory that can be used during a SAS session.

    Here, we see that MEMSIZE=4294967296 bytes, which is 4 Gigabytes (= 4 * (1024 ^ 3) bytes). This shows that the SAS option you added to the new compute context worked. As far as SAS is concerned, this should be enough memory to load about 3GB of data.

  5. Copy the following code into the new SAS Program tab, and run it. This is the same code you ran earlier:

    %let libraryname=work;
    %let datasetname=bigtable;
    %let rows=671089;
    
    %macro generate(n_rows,n_num_cols,n_char_cols,outdata=test,seed=0);
        data &outdata;
            array nums[&n_num_cols];
            array chars[&n_char_cols] $;
            temp = "abcdefghijklmnopqrstuvwxyz";
            do i=1 to &n_rows;
                do j=1 to &n_num_cols;
                    nums[j] = ranuni(&seed);
                end;
                do j=1 to &n_char_cols;
                    chars[j] = substr(temp,ceil(ranuni(&seed)*18),8);
                end;
                output;
            end;
            drop i j temp;
        run;
    %mend;
    
    %generate(&rows.,100,100,outdata=&datasetname);
    %generate(&rows.,100,100,outdata=&datasetname.2);
    %generate(&rows.,100,100,outdata=&datasetname.3);
    
    PROC SQL ;
    TITLE 'Filesize for &datasetname Data Set' ;
    SELECT libname,
            memname,
            memtype,
            FILESIZE FORMAT=SIZEKMG.,
            FILESIZE FORMAT=SIZEK.
        FROM DICTIONARY.TABLES
        WHERE libname = upper("&libraryname")
            AND memname CONTAINS upper("&datasetname")
            AND memtype = "DATA" ;
    QUIT ;
    
    * Load datasets into memory;
    sasfile &libraryname..&datasetname load;
    sasfile &libraryname..&datasetname.2 load;
    sasfile &libraryname..&datasetname.3 load;

    Expected results - each of the datasets created is about 1GB:

    Large datasets

    However, what happens next seems to vary between two similarly likely alternatives. Sometimes, the program completes with error messages like this, and the compute session continues to run in a sas-compute-server pod which remains running:

    119  * Load datasets into memory;
    120  sasfile &libraryname..&datasetname load;
    NOTE: The file WORK.BIGTABLE.DATA has been loaded into memory by the SASFILE statement.
    121  sasfile &libraryname..&datasetname.2 load;
    ERROR: File WORK.BIGTABLE2.DATA is damaged. I/O processing did not complete.
    122  sasfile &libraryname..&datasetname.3 load;
    ERROR: File WORK.BIGTABLE3.DATA is damaged. I/O processing did not complete.
    123

    On other occasions, the program might not finish before you see this error message in SAS Studio:

    SAS Session Problem Detected

    If you see SAS program log messages like those above and your SAS compute session continues to work, try simply running the program again. In our experience, it usually does not take many attempts at running the program above in a compute context with MEMSIZE 4G for the compute container to use more than 2G of memory, and for Kubernetes’ OOM killer to kill the sas-compute-server pod. It happens reasonably often on the first attempt.

    If, or hopefully when, you see the error dialog in SAS Studio saying SAS Session Problem Detected, Reset the session and wait for the new compute session to start.

See the memory limits and usage for the sas-programming-environment container

  1. In your MobaXterm session connected to sasnode01 as cloud-user, run this to view the resources requested for the sas-programming-environment container in Delilah’s compute server pod:

    MY_POD=`kubectl get pods --no-headers -l launcher.sas.com/username=Delilah | awk '{print $1}'`
    kubectl get pod $MY_POD -o json | jq -r '.spec.containers[] | select (.name=="sas-programming-environment") | .name, .resources '

    Note: The first command of the two above gets a list of pods launched on behalf of the user Delilah, and puts the first value of the output line, which is the pod name, in a variable called MY_POD. The second command then gets a description of that pod in JSON format, and uses jq to select the pod spec's sas-programming-environment container, and then pretty-print the container's name and resources section for easier reading.

    Expected output :

    sas-programming-environment
    {
      "limits": {
        "cpu": "2",
        "memory": "2Gi"
      },
      "requests": {
        "cpu": "50m",
        "memory": "300M"
      }
    }
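
    If you prefer a single command, a jsonpath filter should return the same resources section (a sketch; the output format differs slightly from the jq version above):

    kubectl get pod $MY_POD -o jsonpath='{.spec.containers[?(@.name=="sas-programming-environment")].resources}'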
  2. Run the following command to find out what resources the pod is actually using at the moment in time when you run the command:

    MY_POD=`kubectl get pods --no-headers -l launcher.sas.com/username=Delilah | awk '{print $1}'`
    kubectl top pod $MY_POD

    Since at this point in the exercise, your SAS Studio session was recently reset, this compute pod is freshly started and is not likely to be using much memory.

  3. Your turn. Use what you have learned in this exercise so far to run the SAS program above, and watch the amount of memory your compute server uses while it runs. One way to do that from the command line is sketched after the tip below.

    Tip: You may also use OpenLens for this, but it appears OpenLens does not always report the amount of memory in use correctly - it sometimes reports twice the memory use that the kubectl command above reports.
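
    One command-line option is sketched below; it assumes the standard watch utility is available on sasnode01 (press Ctrl+C to stop watching):

    MY_POD=`kubectl get pods --no-headers -l launcher.sas.com/username=Delilah | awk '{print $1}'`
    # Refresh the pod's resource usage every 10 seconds while the SAS program runs
    watch -n 10 kubectl top pod ${MY_POD}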

Increase the sas-compute-job-config Memory limit

  1. Create a PodTemplate overlay, which defines a higher memory limit for the Compute Server specific sas-compute-job-config PodTemplate.

    # Set Compute Server Memory requests and limits
    tee ~/project/deploy/$current_namespace/site-config/compute-memory-limits.yaml > /dev/null <<EOF
    ###################################################################################
    # Kustomize patch configuration to set the default and max
    # memory limits to 4000M, roughly double the default of 2Gi.
    #
    # This PatchTransformer will target only the compute server podTemplate,
    # with name=sas-compute-job-config. This does not include the batch, connect,
    # or general-purpose programming run-time podTemplates.
    #
    # We left a commented-out alternative target in the file, which would select
    # all launched podTemplates except for the sas-cas-pod-template.
    ###############################################################################
    ---
    apiVersion: builtin
    kind: PatchTransformer
    metadata:
      name: compute-memory-limits
    patch: |-
      - op: add
        path: /metadata/annotations/launcher.sas.com~1default-memory-limit
        value: 4000M
      - op: add
        path: /metadata/annotations/launcher.sas.com~1max-memory-limit
        value: 4000M
    target:
      kind: PodTemplate
      # labelSelector: sas.com/template-intent=sas-launcher,workload.sas.com/class=compute
      name: sas-compute-job-config
    EOF
  2. Make a copy of your kustomization.yaml file, so we can see the effect of adding two new lines to it in the next step.

    cp -p ~/project/deploy/gelcorp/kustomization.yaml ~/project/deploy/gelcorp/kustomization-09-011-01.yaml
  3. Use a script to update your kustomization.yaml to reference the overlay:

    # Insert reference to memory limits patch transformer, if it is not already in kustomization.yaml
     [[ $(grep -c "site-config/compute-memory-limits.yaml" ~/project/deploy/${current_namespace}/kustomization.yaml) == 0 ]] && \
    yq4 eval -i '.transformers += ["site-config/compute-memory-limits.yaml"]' ~/project/deploy/${current_namespace}/kustomization.yaml
  4. Run the following command to view the change this yq4 command made to your kustomization.yaml. The changes are in green in the right column.

    icdiff ~/project/deploy/gelcorp/kustomization-09-011-01.yaml ~/project/deploy/gelcorp/kustomization.yaml
  5. Delete the kustomization-09-011-01.yaml file, so that it is not inadvertently included in gelcorp-sasdeployment.yaml in the next step.

    rm  ~/project/deploy/gelcorp/kustomization-09-011-01.yaml

Build and Apply using SAS-Orchestration Deploy

  1. Keep a copy of the current manifest.yaml file.

    cp -p /tmp/${current_namespace}/deploy_work/deploy/manifest.yaml /tmp/${current_namespace}/manifest_09-011-01.yaml
  2. Run the sas-orchestration deploy command.

    cd ~/project/deploy
    rm -rf /tmp/${current_namespace}/deploy_work/*
    source ~/project/deploy/.${current_namespace}_vars
    
    docker run --rm \
               -v ${PWD}/license:/license \
               -v ${PWD}/${current_namespace}:/${current_namespace} \
               -v ${HOME}/.kube/config_portable:/kube/config \
               -v /tmp/${current_namespace}/deploy_work:/work \
               -e KUBECONFIG=/kube/config \
               --user $(id -u):$(id -g) \
           sas-orchestration \
              deploy \
                 --namespace ${current_namespace} \
                 --deployment-data /license/SASViyaV4_${_order}_certs.zip \
                 --license /license/SASViyaV4_${_order}_license.jwt \
                 --user-content /${current_namespace} \
                 --cadence-name ${_cadenceName} \
                 --cadence-version ${_cadenceVersion} \
                 --image-registry ${_viyaMirrorReg}

    When the deploy command completes successfully the final message should say The deploy command completed successfully as shown in the log snippet below.

    The deploy command started
    
    [...]
    
    The deploy command completed successfully

    If the sas-orchestration deploy command fails, check out the steps in 99_Additional_Topics/03_Troubleshoot_SAS_Orchestration_Deploy to help you troubleshoot any problems.
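
    Optionally, before returning to SAS Studio, confirm that the new annotations are present on the sas-compute-job-config PodTemplate in the cluster. A quick sketch; both annotations should report 4000M:

    kubectl get podtemplate sas-compute-job-config -o yaml | grep -E 'launcher.sas.com/(default|max)-memory-limit'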

Try to load large datasets into memory in a compute server with a memory limit of 4G

  1. Switch back to Chrome. Sign in to SAS Studio again as Delilah if your session has timed out.

  2. In Chrome, in SAS Studio as Delilah, make sure you have a compute session started under the “SAS Studio compute context with memsize 4G”. You may have to reset your SAS session.

  3. Then copy the following code into a new SAS Program tab, and run it. This is the same code you ran earlier:

    %let libraryname=work;
    %let datasetname=bigtable;
    %let rows=671089;
    
    %macro generate(n_rows,n_num_cols,n_char_cols,outdata=test,seed=0);
        data &outdata;
            array nums[&n_num_cols];
            array chars[&n_char_cols] $;
            temp = "abcdefghijklmnopqrstuvwxyz";
            do i=1 to &n_rows;
                do j=1 to &n_num_cols;
                    nums[j] = ranuni(&seed);
                end;
                do j=1 to &n_char_cols;
                    chars[j] = substr(temp,ceil(ranuni(&seed)*18),8);
                end;
                output;
            end;
            drop i j temp;
        run;
    %mend;
    
    %generate(&rows.,100,100,outdata=&datasetname);
    %generate(&rows.,100,100,outdata=&datasetname.2);
    %generate(&rows.,100,100,outdata=&datasetname.3);
    
    PROC SQL ;
    TITLE "Filesize for &datasetname Data Set" ;
    SELECT libname,
            memname,
            memtype,
            FILESIZE FORMAT=SIZEKMG.,
            FILESIZE FORMAT=SIZEK.
        FROM DICTIONARY.TABLES
        WHERE libname = upper("&libraryname")
            AND memname CONTAINS upper("&datasetname")
            AND memtype = "DATA" ;
    QUIT ;
    
    * Load datasets into memory;
    sasfile &libraryname..&datasetname load;
    sasfile &libraryname..&datasetname.2 load;
    sasfile &libraryname..&datasetname.3 load;

    Expected results - each of the datasets created is about 1GB (the results table is the same as when we saw it earlier).

    This time, all three large datasets should be loaded into memory successfully, and the corresponding log messages look like this:

    119  * Load datasets into memory;
    120  sasfile &libraryname..&datasetname load;
    NOTE: The file WORK.BIGTABLE.DATA has been loaded into memory by the SASFILE statement.
    121  sasfile &libraryname..&datasetname.2 load;
    NOTE: The file WORK.BIGTABLE2.DATA has been loaded into memory by the SASFILE statement.
    122  sasfile &libraryname..&datasetname.3 load;
    NOTE: The file WORK.BIGTABLE3.DATA has been loaded into memory by the SASFILE statement.

    Your SAS compute session should continue to run without issue.

See the new memory limits and usage for the sas-programming-environment container

  1. In your MobaXterm session connected to sasnode01 as cloud-user, run this to view the resources requested for the sas-programming-environment container in Delilah’s compute server pod:

    MY_POD=`kubectl get pods --no-headers -l launcher.sas.com/username=Delilah | awk '{print $1}'`
    kubectl get pod $MY_POD -o json | jq -r '.spec.containers[] | select (.name=="sas-programming-environment") | .name, .resources '

    Note: The first of the two commands above gets the list of pods launched on behalf of the user Delilah and puts the first value of the output line, which is the pod name, in a variable called MY_POD. The second command then gets a description of that pod in JSON format and uses jq to select the pod spec’s sas-programming-environment container and pretty-print the container’s name and resources section for easier reading.

    Expected output - notice that the memory limit is now 4G, instead of 2G as we saw earlier:

    sas-programming-environment
    {
      "limits": {
        "cpu": "2",
        "memory": "4G"
      },
      "requests": {
        "cpu": "50m",
        "memory": "300M"
      }
    }
  2. Run the following command to find out what resources the pod is actually using at the moment in time when you run the command:

    MY_POD=`kubectl get pods --no-headers -l launcher.sas.com/username=Delilah | awk '{print $1}'`
    kubectl top pod $MY_POD

    Example output:

    NAME                                                         CPU(cores)   MEMORY(bytes)
    sas-compute-server-388f7b0a-07c2-4672-903f-d01a8af37ede-38   1m           3488Mi

    The compute server pod is using much more than 2 GB of memory, and it should not be killed by Kubernetes as long as it does not exceed the new, higher limit of 4 GB that we set.

    You now know how to change both the MEMSIZE option and the compute server memory limit to similar values, so that the SAS compute server can successfully run SAS programs that require more than 2 GB of memory. The example program used in this exercise is contrived, but it demonstrates the issues you are likely to see when a SAS program requires more memory than the limits allowed in SAS and Kubernetes, and how those limits can be adjusted to enable the program to run.
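
As an aside, the jq-based inspection in step 1 above can also be done with kubectl's built-in JSONPath support if jq is not available. This is a sketch, reusing the same MY_POD variable and container name from the exercise:

    # Capture Delilah's compute server pod name, then print the container's resources via JSONPath
    MY_POD=`kubectl get pods --no-headers -l launcher.sas.com/username=Delilah | awk '{print $1}'`
    kubectl get pod $MY_POD -o jsonpath='{.spec.containers[?(@.name=="sas-programming-environment")].resources}'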


Lesson 10

SAS Viya Administration Operations
Lesson 10, Section 0 Exercise: Defining Alerts

In this exercise, we will create a new alert for Prometheus AlertManager by defining a PrometheusRule. The alert will be configured to send an alert notification when metric values meet the condition specified in the rule.

View metrics in the Prometheus Expression Browser

In this step, we will explore the Prometheus UI and examine the metrics that can be queried. PromQL expressions are used to query the metrics collected by Prometheus. These form the basis for alert conditions.

  1. Log on to the Prometheus UI. The URL can be retrieved by running:

    gellow_urls | grep "Prometheus"
  2. On the Graph page, PromQL queries can be entered in the Expression box. Metrics can be selected from the Metrics Explorer by clicking the globe icon next to the Execute button.
    Select (or find with auto-complete) the container_memory_usage_bytes metric and click Execute.

  3. The metric data value is displayed in bytes. View in GB by changing the expression to:

    container_memory_usage_bytes / (1024 * 1024 * 1024)

    Review the results in the table.

  4. Filter the results again by modifying the query to display SAS Viya containers (i.e. in pods with names beginning with ‘sas-’) from the gelcorp namespace only.

    container_memory_usage_bytes{container!~"POD",namespace="gelcorp",pod=~"sas-.+"} / (1024 * 1024 * 1024)
  5. Click the Graph button to view the time-series chart. Use the displayed information to answer the following:

    • Which container is using the most memory?
    • Which pod is it in?
  6. How could the query be modified to display results for intnode03 only?

    View the answer

    container_memory_usage_bytes{namespace="gelcorp",pod=~"sas-.+",node="intnode03"} / (1024 * 1024 * 1024)


  7. Run the new query from the previous answer. Which container is using the most memory now?
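
The same queries can also be issued against the Prometheus HTTP API, which is useful when you want to script checks rather than use the Expression Browser. The following is a minimal sketch; it assumes the Prometheus base URL from gellow_urls has been placed in a PROM_URL variable and that curl and jq are available on sasnode01:

    # Set PROM_URL to the Prometheus URL printed by gellow_urls before running this
    curl -sk "${PROM_URL}/api/v1/query" \
         --data-urlencode 'query=container_memory_usage_bytes{namespace="gelcorp",pod=~"sas-.+"} / (1024 * 1024 * 1024)' \
         | jq '.data.result | length'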

Create a rule

An alert condition can be specified as a PromQL query in a PrometheusRule definition. In this step, define a PrometheusRule that creates an alert rule to trigger when the amount of memory consumed by containers inside SAS Viya pods in the gelcorp namespace is more than 5% of the total available memory on a Kubernetes cluster node. IMPORTANT: Note that a threshold of 5% is unusually low; this is intentional for this demonstration, to ensure the alert fires.

Using PromQL, this condition can be expressed as:

    ((sum by (node) (container_memory_usage_bytes{namespace="gelcorp",pod=~"sas-.+"})) / (sum by (node) (kube_node_status_capacity{resource="memory"})) * 100) > 5

Now create the PrometheusRule to set up the alert.

  1. Create a YAML file that defines a new PrometheusRule containing the query (in the expr element).

    tee ~/PrometheusRule.yaml > /dev/null << EOF
    apiVersion: monitoring.coreos.com/v1
    kind: PrometheusRule
    metadata:
      labels:
        prometheus: prometheus-operator-prometheus
        role: alert-rules
      name: prometheus-viya-rules
      namespace: v4mmon
    spec:
      groups:
      - name: custom-viya-alerts
        rules:
        - alert: ViyaMemoryUsage
          annotations:
            description: Total SAS Viya namespace container memory usage is more than 5% of total memory capacity.
            summary: SAS Viya container high memory usage
            runbook: https://gelgitlab.race.sas.com/GEL/workshops/PSGEL260-sas-viya-4.0.1-administration/-/blob/master/04_Observability/images/runbook.md
          expr: ((sum by (node) (container_memory_usage_bytes{namespace="gelcorp",pod=~"sas-.+"})) / (sum by (node) (kube_node_status_capacity{resource="memory"})) * 100) > 5
          labels:
            severity: critical
    EOF
  2. Apply the rule to the namespace.

    kubectl create --filename ~/PrometheusRule.yaml -n v4mmon

    View the output

    prometheusrule.monitoring.coreos.com/prometheus-viya-rules created
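
Optionally, you can confirm that the rule object was created and contains the expected alert before moving on. A minimal sketch (Prometheus can take up to a minute to reload the rule):

    # Retrieve the PrometheusRule and check that the ViyaMemoryUsage alert is defined
    kubectl -n v4mmon get prometheusrule prometheus-viya-rules -o yaml | grep -A1 'alert: ViyaMemoryUsage'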

Define a routing tree

Firing alerts can trigger alert notifications that let nominated people know an alert condition has been met. Routing, the process that defines who gets notified and how, is specified in the Alertmanager configuration in the form of a routing tree. A routing tree defines receivers (the people or channels to whom alert notifications are delivered) and routes (the conditions that determine which receiver specific alert notifications are sent to).

Define a routing tree to send all firing alerts to a receiver called viya-admins-email-alert, and configure this receiver to send alert notifications to the cloud-user’s email address.

  1. Create a YAML file containing the necessary configuration to define the routing of the alert (when it fires) to cloud-user@localhost.com.

    tee ~/alertmanager.yaml > /dev/null << EOF
    global:
      smtp_smarthost: $(hostname -f):1025
      smtp_from: 'alertmanager@gelcorp.com'
      smtp_require_tls: false
      resolve_timeout: 5m
    route:
      receiver: viya-admins-email-alert
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 12h
    receivers:
    - name: viya-admins-email-alert
      email_configs:
      - to: cloud-user@localhost.com
        headers:
          Subject: 'Prometheus AM Alert Triggered'
          send_resolved: true
        require_tls: false
    EOF

    The values defined in the global section of this file contain connection information for the default local mail server. The route section does not contain any child routes, nor does it perform any label matching or filtering. This means that all alerts are routed to the single viya-admins-email-alert receiver.

  2. The Prometheus AlertManager configuration is stored as a secret (alertmanager-v4m-alertmanager) in the v4mmon namespace. For the configuration to be updated with the contents of the YAML file, the secret must be updated.

    First, encode the YAML with base64 encoding. Run the command below to store the encoded string in a variable.

    encodedamcfg=$(cat ~/alertmanager.yaml | base64 -w0)
  3. The resulting encoded string must now be added (in the alertmanager.yaml element) to a new YAML file, alertmanager-secret.yaml, as shown below, in order to update the alertmanager-v4m-alertmanager secret.

    The command below inserts the encoded value of the encodedamcfg variable into the new YAML file.

    tee ~/alertmanager-secret.yaml > /dev/null << EOF
    apiVersion: v1
    data:
      alertmanager.yaml: $(echo $encodedamcfg)
    kind: Secret
    metadata:
      name: alertmanager-v4m-alertmanager
      namespace: v4mmon
    type: Opaque
    EOF
  4. Update the secret.

    kubectl apply -f ~/alertmanager-secret.yaml -n v4mmon

    View the output

    Warning: resource secrets/alertmanager-v4m-alertmanager is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
    secret/alertmanager-v4m-alertmanager configured


  5. Check that the configuration has been updated using the amtool CLI deployed in the Alertmanager pod.

    # get alertmanager url
    amurl=$(gellow_urls | grep  "Alert Manager"|awk '{print $6}')
    
    # check config
    kubectl -n v4mmon exec -it alertmanager-v4m-alertmanager-0 -- amtool --alertmanager.url=$amurl config show

    The new configuration may take a minute to take effect. When it does, the output will appear as follows, with the new receiver defined:

    global:
      resolve_timeout: 5m
      http_config:
        follow_redirects: true
      smtp_from: alertmanager@gelcorp.com
      smtp_hello: localhost
      smtp_smarthost: pdcesx02109.race.sas.com:1025
      smtp_require_tls: false
      pagerduty_url: https://events.pagerduty.com/v2/enqueue
      opsgenie_api_url: https://api.opsgenie.com/
      wechat_api_url: https://qyapi.weixin.qq.com/cgi-bin/
      victorops_api_url: https://alert.victorops.com/integrations/generic/20131114/alert/
      telegram_api_url: https://api.telegram.org
    route:
      receiver: viya-admins-email-alert
      continue: false
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 12h
    receivers:
    - name: viya-admins-email-alert
      email_configs:
      - send_resolved: false
        to: cloud-user@localhost.com
        from: alertmanager@gelcorp.com
        hello: localhost
        smarthost: pdcesx02109.race.sas.com:1025
        headers:
          From: alertmanager@gelcorp.com
          Send_resolved: "true"
          Subject: Prometheus AM Alert Triggered
          To: cloud-user@localhost.com
        html: '{{ template "email.default.html" . }}'
        require_tls: false
    templates: []
  6. Test the new route using amtool by simulating an alert being triggered.

    kubectl -n v4mmon exec -it alertmanager-v4m-alertmanager-0 -- \
       amtool --alertmanager.url=$amurl config routes test -v severity=high

    Expected output:

    viya-admins-email-alert

    Since you now only have one route in the Alertmanager configuration, all alerts, regardless of label values, will be sent to the sole receiver when they begin firing.
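
You can also list the alerts that Alertmanager currently knows about from the command line, before checking email. This sketch reuses the $amurl variable captured earlier and the amtool CLI in the Alertmanager pod:

    # Query Alertmanager for currently active alerts
    kubectl -n v4mmon exec -it alertmanager-v4m-alertmanager-0 -- \
       amtool --alertmanager.url=$amurl alert query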

Manage firing alerts

Check to see if the alert correctly fires, and that AlertManager sends an email notification when it does. Note that because the threshold for the alert condition was set so low (5%), it will be firing immediately.

In the workshop environment, an email client (Evolution) has been installed and configured to receive emails sent to cloud-user@localhost.com.

  1. Open Evolution by launching it directly from MobaXterm on sasnode01.

    evolution

    Verify that an alert notification email has been sent, and that the ViyaMemoryUsage alert appears in the list of triggered alerts.

    Note that the alert notification email displays all firing alerts, because we only defined one route and one receiver in the routing tree, and they become the defaults for all alerts.

  2. Click the link to View in AlertManager at the top of the notification email. This will open the AlertManager UI in your browser.

  3. Expand the list of “Not grouped” alerts and find the ViyaMemoryUsage alert.

    Why are there multiple alerts firing for this rule? (Hint: click Info to view additional details.)

  4. Silence the alert firing for intnode01 by clicking the Silence button. Set a 48-hour silence for the alert.

    • Enter your name in the Creator field.
    • Remove the “node=intnode01” matcher (to silence the alert for any node for which it is firing) by clicking the trashcan icon.
    • Enter a comment in the Comment field.

    Click Create.

    On the silence confirmation page, note that 5 alerts have been silenced (one for each node).

  5. Head back to the Alerts page and verify the silence has taken effect (the alert is no longer firing).

  6. Close the browser tab and the Evolution mail client.
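
If you want to double-check the silence you created in step 4 from the command line, amtool can list active silences as well. A sketch, reusing $amurl:

    # List active silences in Alertmanager
    kubectl -n v4mmon exec -it alertmanager-v4m-alertmanager-0 -- \
       amtool --alertmanager.url=$amurl silence query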


Lesson 11

SAS Viya Administration Operations
Lesson 11, Section 0 Exercise: Troubleshoot Issues

If you have not completed the rest of the course, please follow the instructions in 01_Introduction/01_901_Fast-Forward_Instructions to run the exercise solutions for Chapters 2 and 3.

Issue 1

Symptoms and Description: Users report that their VA reports are not working, and power users indicate they cannot access their data in CAS.

Problem

The Problem

  1. Create the problem

    bash -c "/home/cloud-user/PSGEL260-sas-viya-4.0.1-administration/10_Troubleshooting/scripts/issue001_create.sh"
  2. Open SAS Drive using the generated link and logon as geladm:lnxsas

    gellow_urls | grep "SAS Drive"
  3. Navigate to / Products / SAS Visual Analytics / Samples and open the 'Retail Insights' report.

  4. What happens?

Troubleshooting

I can fix it!

  1. Develop a strategy for how you will fix the problem, then try and implement your fix.

Ask me some questions to guide me through the problem identification and resolution

Click here to be asked some guiding questions.
  1. Does the error indicate what the next step should be?
  2. Is the CAS Server running?
  3. What is the status of CAS related pods?
  4. Can you view the logs of the CAS Server?
  5. Can you view details of the CAS controller pod?
  6. Can you restart the CAS Server?
If you have identified the problem, you can move on to the Fix it section.

Guide me through the process

Click here to get a guided troubleshooting process.
  1. Logon to Environment Manager as geladm:lnxsas. Select Servers. What do you see in relation to the CAS Server?

  2. What Viya services could be the problem? Use kubectl to get the pods that are managed by the CAS operator.

    kubectl get pod --selector='app.kubernetes.io/instance=default'

    Expected output:

    NAME                                READY   STATUS    RESTARTS   AGE
    sas-cas-server-default-controller   0/3     Pending   0          8m8s

  3. Looks like the CAS controller is Pending. A Pending pod is waiting to be scheduled on a node, or for at least one of its containers to initialize. Perform a describe on the CAS controller. Review the events section of the output.

    kubectl describe pod --selector='app.kubernetes.io/instance=default'

    Expected output:

    Events:
    Type     Reason            Age    From               Message
    ----     ------            ----   ----               -------
    Warning  FailedScheduling  3m48s  default-scheduler  0/5 nodes are available: 5 Insufficient cpu. preemption: 0/5 nodes are available: 5 No preemption victims found for incoming pod.
    Warning  FailedScheduling  105s   default-scheduler  0/5 nodes are available: 5 Insufficient cpu. preemption: 0/5 nodes are available: 5 No preemption victims found for incoming pod.
  4. Can you determine the issue from the message? The pod cannot find a node with enough CPU to start the CAS controller. The Pending status usually means that Kubernetes cannot find a place to start the pod because of resource constraints: disk, memory, or CPU.

  5. Have I run out of CPU on my nodes? It doesn't look like it, but in real life this could be the problem.

    kubectl top nodes

    Expected output:

    NAME        CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
    intnode01   1830m        22%    39795Mi         62%
    intnode02   714m         8%     33957Mi         52%
    intnode03   1133m        14%    24021Mi         37%
    intnode04   1585m        19%    28259Mi         44%
    intnode05   2342m        29%    20552Mi         32%
  6. How much CPU is CAS asking for from Kubernetes? The command below shows the requests setting for each container in the CAS pod. Kubernetes will look for a node that can meet the sum of the CPU and memory requests defined for the pod.

    kubectl describe pod --selector='app.kubernetes.io/managed-by=sas-cas-operator' | grep "Requests:" -A3 -B7
  7. The CAS container is requesting too much CPU. To fix it, you would have to adjust the requests settings for the CAS server, or make sure you have nodes available that can satisfy the CPU request. This is obviously a problem we created for you. Please proceed to the Fix it section.
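
If you want a more compact view of what each CAS container is requesting, a rough jq one-liner such as the following can help. This is a sketch, not part of the exercise, using the same label selector as above:

    # Summarize CPU and memory requests for each container in the CAS server pod
    kubectl get pod --selector='app.kubernetes.io/managed-by=sas-cas-operator' -o json \
      | jq -r '.items[].spec.containers[] | "\(.name): cpu=\(.resources.requests.cpu // "none") memory=\(.resources.requests.memory // "none")"'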

Fix it

  1. Run the following script to fix the problem.

    bash -c "/home/cloud-user/PSGEL260-sas-viya-4.0.1-administration/10_Troubleshooting/scripts/issue001_fix.sh"

Issue 2

The Problem

Symptoms and Description: Users report they cannot run a program in SAS Studio

Problem
  1. Create the problem

    bash -c "/home/cloud-user/PSGEL260-sas-viya-4.0.1-administration/10_Troubleshooting/scripts/issue002_create.sh"
  2. Open SAS Studio using the generated link and logon as geladm:lnxsas

    gellow_urls | grep "SAS Studio"
  3. Can you submit any SAS Code?

Troubleshooting

I can fix it!

  1. Develop a strategy for how you will fix the problem, then try and implement your fix.

Ask me some questions to guide me through the problem identification and resolution

Click here to be asked some guiding questions.
  1. Can you logon to SAS Studio as an administrator and run a SAS Program? What happens?
  2. Is there a message that helps you understand what is wrong?
  3. What log can you check to see what is going on?
  4. Does the message in the log help you?
  5. Where will you fix the problem?

Guide me through the process

Click here to get a guided troubleshooting process.
  1. Logon to SAS Studio as geladm:lnxsas.

  2. In a terminal window on sasnode01, find the launcher pod owned by geladm.

    kubectl get pod -l launcher.sas.com/requested-by-client=sas.studio,launcher.sas.com/username=geladm
  3. View the log from the pod. Is there any useful information?

    kubectl logs -l launcher.sas.com/requested-by-client=sas.studio,launcher.sas.com/username=geladm | klog

    Looks like a SAS Session cannot start.

    Defaulted container "sas-programming-environment" out of: sas-programming-environment, sas-certframe (init), sas-config-init (init)
    ERROR 2023-05-17 15:36:29.217 +0000 [compsrv] - ERROR: (SASXKRIN): KERNEL RESOURCE INITIALIZATION FAILED.
    ERROR 2023-05-17 15:36:29.217 +0000 [compsrv] - ERROR: Unable to initialize the SAS kernel.
    INFO  2023-05-17 15:36:29.253 +0000 [compsrv] - Request  [00000002] >> GET /compute/sessions/6c000a0a-fcfd-40ba-9e55-4cac56ae6dd8-ses0000/state
    INFO  2023-05-17 15:36:29.253 +0000 [compsrv] - Response [00000002] << HTTP/1.1 200 OK
    INFO  2023-05-17 15:36:29.408 +0000 [compsrv] - Request  [00000003] >> POST /compute/sessions/6c000a0a-fcfd-40ba-9e55-4cac56ae6dd8-ses0000/jobs
    ERROR 2023-05-17 15:36:29.409 +0000 [compsrv] - The session requested is currently in a failed or stopped state.
    INFO  2023-05-17 15:36:29.409 +0000 [compsrv] - Response [00000003] << HTTP/1.1 400 Bad Request
    INFO  2023-05-17 15:36:29.410 +0000 [compsrv] - Header   [00000003] << Content-Type: application/vnd.sas.error+json;version=2;charset=utf-8
    INFO  2023-05-17 15:36:29.410 +0000 [compsrv] - Header   [00000003] << Content-Length: 412
    INFO  2023-05-17 15:36:29.410 +0000 [compsrv] - Data     [00000003] << {"details":["ERROR: Unrecognized SAS option name YOUBROKEIT.","ERROR: (SASXKRIN): KERNEL RESOURCE INITIALIZATION FAILED.","ERROR: Unable to initialize the SAS kernel."],"errorCode":5113,"errors":[],"httpStatusCode":400,"id":"","links":[],"message":"The session requested is currently in a failed or stopped state.","remediation":"Correct the errors in the session request, and create a new session.","version":2}
  4. A key piece of information from the log is "ERROR: Unrecognized SAS option name YOUBROKEIT." SAS options are set in the SAS config or SAS autoexec. In SAS Viya, these files are modified with SAS Environment Manager. (See this blog post.)

  5. Sign in to SAS Environment Manager as geladm:lnxsas.

    1. In the vertical navigation bar, select Configuration
    2. Using the View: drop-down list, choose Definitions and select sas.compute.server
    3. Click the edit button next to Compute service:configuration_options
    4. Edit the configuration to fix the problem.
  6. Test the fix by logging out and logging in again to SAS Studio.
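
To confirm the offending option from the command line before (or after) fixing the configuration, you can grep the launcher pod log directly. A minimal sketch, reusing the label selector from step 2:

    # Search Delilah's... rather, geladm's launcher pod log for the unrecognized option error
    kubectl logs -l launcher.sas.com/requested-by-client=sas.studio,launcher.sas.com/username=geladm \
      | grep -i "Unrecognized SAS option"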

Fix it

  1. Run the following script to fix the problem OR if you fixed it yourself skip this step.

    bash -x "/home/cloud-user/PSGEL260-sas-viya-4.0.1-administration/10_Troubleshooting/scripts/issue002_fix.sh"

Issue 3

Symptoms and Description: The Viya administrator is trying to make a configuration change in the environment. The process is failing. Can you help?

Problem

The Problem

  1. Create the problem

    bash -c "/home/cloud-user/PSGEL260-sas-viya-4.0.1-administration/10_Troubleshooting/scripts/issue003_create.sh"
  2. Run the orchestrate deploy command and review the output.

    cd ~/project/deploy
    rm -rf /tmp/${current_namespace}/deploy_work/*
    source ~/project/deploy/.${current_namespace}_vars
    
    docker run --rm \
            -v ${PWD}/license:/license \
            -v ${PWD}/${current_namespace}:/${current_namespace} \
            -v ${HOME}/.kube/config_portable:/kube/config \
            -v /tmp/${current_namespace}/deploy_work:/work \
            -e KUBECONFIG=/kube/config \
            --user $(id -u):$(id -g) \
            sas-orchestration \
            deploy \
               --namespace ${current_namespace} \
               --deployment-data /license/SASViyaV4_${_order}_certs.zip \
               --license /license/SASViyaV4_${_order}_license.jwt \
               --user-content /${current_namespace} \
               --cadence-name ${_cadenceName} \
               --cadence-version ${_cadenceVersion} \
               --image-registry ${_viyaMirrorReg}
  3. What happens?

Troubleshooting

I can fix it!

  1. Develop a strategy for how you will fix the problem, then try and implement your fix.

Ask me some questions to guide me through the problem identification and resolution

Click here to be asked some guiding questions.
  1. Can you use the sas-orchestration deploy command to apply the change?
  2. What message is returned?
  3. Can you manually build a kubernetes manifest ?

Guide me through the process

Click here to get a guided troubleshooting process.
  1. Run the orchestrate deploy command and review the output.

    cd ~/project/deploy
    rm -rf /tmp/${current_namespace}/deploy_work/*
    source ~/project/deploy/.${current_namespace}_vars
    
    docker run --rm \
            -v ${PWD}/license:/license \
            -v ${PWD}/${current_namespace}:/${current_namespace} \
            -v ${HOME}/.kube/config_portable:/kube/config \
            -v /tmp/${current_namespace}/deploy_work:/work \
            -e KUBECONFIG=/kube/config \
            --user $(id -u):$(id -g) \
            sas-orchestration \
            deploy \
               --namespace ${current_namespace} \
               --deployment-data /license/SASViyaV4_${_order}_certs.zip \
               --license /license/SASViyaV4_${_order}_license.jwt \
               --user-content /${current_namespace} \
               --cadence-name ${_cadenceName} \
               --cadence-version ${_cadenceVersion} \
               --image-registry ${_viyaMirrorReg}
  2. The output from the sas-orchestration deploy command notes "Error accumulating resources". This usually means there is a problem in the kustomization.yaml file that prevents the manifest from being built. If you look closely at the message, you will see the text overlays/cas-servers: no such file or directory, get: invalid source string: sas-bases/overlays/cas-servers.

  3. TIP: A simple way to test whether your kustomization.yaml has errors is to use kustomize to do a manual build and see if it is successful. Make sure you output the manifests to a temporary location outside of your project directory.

    cd ~/project/deploy/gelcorp
    kustomize build -o /tmp/site.yaml

    Expected output:

    Error: accumulating resources: accumulateFile "accumulating resources from 'sas-bases/overlays/cas-servers': evalsymlink failure on '/home/cloud-user/project/deploy/gelcorp/sas-bases/overlays/cas-servers' : lstat /home/cloud-user/project/deploy/gelcorp/sas-bases/overlays/cas-servers: no such file or directory", loader.New "Error loading sas-bases/overlays/cas-servers with git: url lacks host: sas-bases/overlays/cas-servers, dir: evalsymlink failure on '/home/cloud-user/project/deploy/gelcorp/sas-bases/overlays/cas-servers' : lstat /home/cloud-user/project/deploy/gelcorp/sas-bases/overlays/cas-servers: no such file or directory, get: invalid source string: sas-bases/overlays/cas-servers"
  4. The paths in the console output from the sas-orchestration deploy command are valid inside the running docker container. The paths in the message we get when we do a manual build with kustomize are the actual locations of the files inside our project directory. From the message we can see there is a problem with the reference to /home/cloud-user/project/deploy/gelcorp/sas-bases/overlays/cas-servers. In this case it is a typo: the reference should be cas-server.

  5. To fix the problem, edit the kustomization.yaml file and repeat step 1 to run the sas-orchestration deploy command. It should now work.
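
A quick way to locate the typo before editing is to grep the kustomization file for the invalid reference and compare it against what actually exists under sas-bases. A minimal sketch:

    # Find the bad reference and list the actual overlay directory names
    grep -n "cas-servers" ~/project/deploy/gelcorp/kustomization.yaml
    ls ~/project/deploy/gelcorp/sas-bases/overlays/ | grep cas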

Fix it

  1. Run the following script to fix the problem.

    bash -c "/home/cloud-user/PSGEL260-sas-viya-4.0.1-administration/10_Troubleshooting/scripts/issue003_fix.sh"

Issue 4

Symptoms and Description: SAS jobs submitted in batch are failing to complete successfully. Can you help?

Problem

The Problem

  1. Create the problem

    bash -c "/home/cloud-user/PSGEL260-sas-viya-4.0.1-administration/10_Troubleshooting/scripts/issue004_create.sh"
  2. Try running a batch job.

    sas-viya batch jobs submit-pgm --pgm /home/cloud-user/PSGEL260-sas-viya-4.0.1-administration/files/code/doWork1mins.sas -c default
  3. What happens?

Troubleshooting

I can fix it!

  1. Develop a strategy for how you will fix the problem, then try and implement your fix.

Ask me some questions to guide me through the problem identification and resolution

Click here to be asked some guiding questions.
  1. Do interactive jobs work? Can you run code in SAS Studio? What queue do these interactive jobs use?
  2. At what point do the jobs fail?
  3. Can you run a job using a different context or queue? Are there any differences?
  4. Does the status of the job in Jobs tab of SAS Environment Manager’s Workload Orchestrator area provide any clues?

If you have identified the problem, you can move on to the Fix it section.

Guide me through the process

Click here to get a guided troubleshooting process.
  1. Check to see if interactive jobs work. First, get the URL for SAS Studio.

    gellow_urls | grep "SAS Studio"
  2. Log on as geladm:lnxsas. Note that you successfully establish a connection to the SAS Studio compute context.

  3. Try running some code:

    data work.large_cars;
    set sashelp.cars;
    do i = 1 to 1000; /* Replicate the dataset 1000 times */
        output;
    end;
    run;
    
    proc sort data=work.large_cars out=work.sorted_cars;
        by make model type;
    run;

    After a while, perhaps even before you can execute the code, an error is displayed.

    The issue therefore seems to be affecting interactive jobs as well as batch jobs.

  4. Remember that all SAS Compute workloads are submitted as jobs to SAS Workload Orchestrator queues. Check which queue the SAS Studio compute context is using by going to SAS Environment Manager using the Manage Environment link from the navigation menu.

  5. Click on Contexts and select Compute contexts from the dropdown box. Click on the SAS Studio compute context to view its properties.

    Note that there is no value for SAS Workload Orchestrator queue, which tells us that this context will send jobs to the default queue.

  6. Investigate the failing batch jobs further. Note that the failing jobs are submitted using the default context, but fail after several seconds. The command does not specify a queue, which means they too are added to the default queue.

    If you created an adhoc queue in an earlier exercise, try submitting jobs to that queue.

    sas-viya batch jobs submit-pgm --pgm /home/cloud-user/PSGEL260-sas-viya-4.0.1-administration/files/code/doWork1mins.sas -c default -q adhoc

    Do you see the same behaviour?

  7. Compare the default queue with queues created earlier using the CLI.

    sas-viya workload-orchestrator queues list

    Is there anything in the queue configurations that may explain the behaviour you are seeing?

  8. Return to SAS Environment Manager and navigate to the Workload Orchestrator area's Jobs page (or use the CLI's workload-orchestrator plugin to view the jobs).

    The failed jobs have a state of KILLED-LIMIT. This gives us a clue as to why the jobs are being terminated.

    Did you see anything about limits in the default queue configuration?

  9. Click on one of the failed job IDs to view more information. Click on the Limits page.

    Note that the current value for the maxClockTime resource is greater than the defined maximum value of 10.

    This reflects the limit defined in the queue configuration for the default queue seen earlier.

    ...
    "queues": [
        {
            "isDefaultQueue": true,
            "limits": [
                {
                    "name": "maxClockTime",
                    "value": 10
                }
            ],
            "maxJobs": -1,
            "maxJobsPerHost": -1,
            "maxJobsPerUser": -1,
            "name": "default",
            "priority": 10,
            "scalingMinJobs": -1,
            "scalingMinSecs": -1,
            "tenant": "uaa",
            "willRestartJobs": false
        }
    ],

    With this limit defined, all jobs in the queue will fail after 10 seconds.

  10. Fix the problem by removing the limit. In SAS Environment Manager’s Workload Orchestrator area, click on the Configuration tab, and then click Queues.

  11. Expand the default queue, and scroll to the bottom to view the defined limit. Click the trashcan icon to delete the maxClockTime limit (or increase it).

  12. Click the Save button to apply the change.

  13. Try submitting another batch job to validate.

    sas-viya batch jobs submit-pgm --pgm /home/cloud-user/PSGEL260-sas-viya-4.0.1-administration/files/code/doWork1mins.sas -c default

    This time, the job will finish executing successfully.
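
If you prefer the command line, the same limit can be inspected by piping the queue listing from step 7 through jq. This is a sketch and assumes the command returns the JSON structure shown above:

    # Show the limits defined on the default queue only
    sas-viya workload-orchestrator queues list | jq '.queues[] | select(.name=="default") | .limits'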

Fix it

  1. Run the following script to fix the problem.

    bash -c "/home/cloud-user/PSGEL260-sas-viya-4.0.1-administration/10_Troubleshooting/scripts/issue004_fix.sh"

Issue 5

Symptoms and Description: CAS Management Service is not available from SAS Environment Manager

Problem

The Problem

  1. Create the problem

    bash -x "/home/cloud-user/PSGEL260-sas-viya-4.0.1-administration/10_Troubleshooting/scripts/issue005_create.sh"

Troubleshooting

I can fix it!

  1. Develop a strategy for how you will fix the problem, then try and implement your fix.

Ask me some questions to guide me through the problem identification and resolution

Click here to be asked some guiding questions.
  1. Does the error indicate what the next step should be?
  2. What is the status of the CAS control pod?
  3. What is the status of CAS related pods?
  4. Can you view the logs of the CAS Server?
  5. Can you view details of the CAS controller pod?
  6. Can you restart the CAS Server?
If you have identified the problem, you can move on to the Fix it section.

Guide me through the process

Click here to get a guided troubleshooting process.
  1. Logon to Environment Manager as geladm:lnxsas. Select Servers. What do you see in relation to the CAS Server?

  2. The pod that services the CAS Management service is sas-cas-control. Check the status of the CAS control pod.

    kubectl get pods -l app=sas-cas-control

    Expected output:

    NAME                               READY   STATUS    RESTARTS   AGE
    sas-cas-control-69c657fd7c-lsdng   0/1     Running   0          136m

  3. Looks like the pod is Running but its container is not reporting Ready. Do a describe of the CAS control pod and review the events section. This shows the pod is not ready, but not why.

    kubectl describe pod -l app=sas-cas-control

    Expected output:

    Warning  Unhealthy  3m9s (x317 over 86m)  kubelet  Readiness probe failed: HTTP probe failed with statuscode: 503

  4. The next step would be to look at the log of CAS control. Now we have more information: it appears there is no ready CAS server.

    kubectl logs -l app=sas-cas-control | gel_log

    Expected output:

    INFO 2024-09-05T16:58:56.764673+00:00 [sas-cas-control]- no ready CAS servers and no shutdown CAS servers found, so cas-control is not ready
    INFO 2024-09-05T16:59:13.448135+00:00 [sas-cas-control]- no ready CAS servers, so cas-control is not ready
    INFO 2024-09-05T16:59:13.448172+00:00 [sas-cas-control]- checking for shutdown CAS servers
    INFO 2024-09-05T16:59:13.520353+00:00 [sas-cas-control]- no ready CAS servers and no shutdown CAS servers found, so cas-control is not ready
    INFO 2024-09-05T16:59:33.433562+00:00 [sas-cas-control]- no ready CAS servers, so cas-control is not ready
    INFO 2024-09-05T16:59:33.433598+00:00 [sas-cas-control]- checking for shutdown CAS servers

  5. Use kubectl to get the pods that are managed by the CAS operator. Notice that the status of sas-cas-server-default-controller shows Init:0/2. In Kubernetes, the status Init:0/2 indicates that the pod has two init containers and neither of them has completed successfully yet. (Init containers perform startup tasks that must complete before the main application containers start.) The output here indicates there is a problem starting the CAS server.

    kubectl get pod --selector='app.kubernetes.io/instance=default'

    Expected output:

    NAME                                READY   STATUS     RESTARTS   AGE
    sas-cas-server-default-controller   0/3     Init:0/2   0          93m

  6. For more information perform a describe on the CAS controller. Review the events section of the output.

    kubectl describe pod --selector='app.kubernetes.io/instance=default'

    Expected output:

    Events:
    Type     Reason       Age                 From     Message
    ----     ------       ----                ----     -------
    Warning  FailedMount  88s (x48 over 82m)  kubelet  MountVolume.SetUp failed for volume "sas-viya-gelcorp-volume" : mount failed: exit status 32
    Mounting command: mount
    Mounting arguments: -t nfs mynfs.sco.com:/shared/gelcontent /var/lib/kubelet/pods/1ef1388d-2913-4e00-ac29-25c665d7abf6/volumes/kubernetes.io~nfs/sas-viya-gelcorp-volume
    Output: mount.nfs: Failed to resolve server mynfs.sco.com: Name or service not known

  7. Can you determine the issue from the message? It looks like a mount command is failing. For further debugging, we could check the event log.

    kubectl get events | grep sas-cas

    Expected output:

    47s   Warning   Unhealthy     pod/sas-cas-control-69c657fd7c-lsdng     Readiness probe failed: HTTP probe failed with statuscode: 503
    58s   Warning   FailedMount   pod/sas-cas-server-default-controller    MountVolume.SetUp failed for volume "sas-viya-gelcorp-volume" : mount failed: exit status 32...

  8. The message from the kubectl describe provides the best clue to the issue : mount.nfs: Failed to resolve server mynfs.sco.com: Name or service not known. The mount of the shared storage inside the POD is failing because the POD cannot access mynfs.sco.com. The events command indicates that the name of the volume that failed is “sas-viya-gelcorp-volume”.
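
To confirm the root cause independently of Kubernetes, you can check whether the NFS host name resolves from sasnode01 at all. A minimal sketch:

    # If the name does not resolve, the NFS mount can never succeed
    getent hosts mynfs.sco.com || echo "mynfs.sco.com does not resolve"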

Fix it

  1. The problem was created by entering an unknown host as the hostname of the NFS server. We can fix it by restoring the correct hostname. Run the following script to fix the problem.

    bash -x "/home/cloud-user/PSGEL260-sas-viya-4.0.1-administration/10_Troubleshooting/scripts/issue005_fix.sh"