v5

Install

Your Onyxia instance, today

Oneliner

TLDR. Here is how you can get an Onyxia instance running in a matter of seconds.

helm repo add onyxia https://inseefrlab.github.io/onyxia

cat << EOF > ./onyxia-values.yaml
ingress:
  enabled: true
  hosts:
    - host: datalab.my-domain.net
EOF

helm install onyxia onyxia/onyxia -f onyxia-values.yaml

With this, you will obtain an instance operating in a degraded mode, which lacks features such as authentication, S3 explorer, secret management, etc. However, you will still have the capability to launch services from the catalog.

Step by step installation guide

In this section, we will set up Onyxia from the ground up, along with all the associated technologies. This includes MinIO for S3, Keycloak for OIDC, and Vault for managing secrets.

Provision a Kubernetes cluster

First you'll need a Kubernetes cluster. If you have one already you can skip this section.

Hashicorp maintains great tutorials for terraforming Kubernetes clusters on AWS, GCP or Azure.

Pick one of the three and follow the guide.

You can stop after the configure kubectl section.

Ingress controller

Deploy an ingress controller on your cluster:

The following command is for AWS.

For GCP use this command.

For Azure use this command.

kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.2.0/deploy/static/provider/aws/deploy.yaml

DNS

Let's assume you own the domain name my-domain.net. For the rest of the guide, replace my-domain.net with a domain you actually own.

Now you need the external address of your cluster. Run the command

kubectl get services -n ingress-nginx

and write down the External IP assigned to the LoadBalancer.

Depending on your cloud provider, it can be an IPv4 address, an IPv6 address or a domain name. On AWS, for example, it will be a domain name like xxx.elb.eu-west-1.amazonaws.com.

If you see <pending>, wait a few seconds and try again.

Once you have the address, create the following DNS records:

onyxia.my-domain.net CNAME xxx.elb.eu-west-1.amazonaws.com. 
*.lab.my-domain.net  CNAME xxx.elb.eu-west-1.amazonaws.com.

If the address you got is an IPv4 address (x.x.x.x), create an A record instead of a CNAME.

If the address you got is an IPv6 address (y:y:y:y:y:y:y:y), create an AAAA record.
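
The choice between CNAME, A and AAAA can be sketched as a small shell helper. This is only an illustration: record_type_for and print_dns_records are hypothetical names, not part of Onyxia, and the check is a rough classification rather than a full address validator.

```shell
# Hypothetical helper: pick the DNS record type that matches the
# external address reported for the LoadBalancer.
record_type_for() {
  case "$1" in
    *:*)        echo "AAAA" ;;   # contains a colon: treated as IPv6
    *[!0-9.]*)  echo "CNAME" ;;  # anything besides digits and dots: a hostname
    *.*.*.*)    echo "A" ;;      # digits and dots only: treated as IPv4
    *)          echo "CNAME" ;;
  esac
}

# Print the two records to create.
# $1 = your domain, $2 = the external address of the cluster
print_dns_records() {
  record_type="$(record_type_for "$2")"
  printf '%s %s %s\n' "onyxia.$1" "$record_type" "$2"
  printf '%s %s %s\n' "*.lab.$1" "$record_type" "$2"
}
```

For example, print_dns_records my-domain.net 203.0.113.10 prints two A records, while passing the AWS domain name prints two CNAME records.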

https://onyxia.my-domain.net will be the URL of your Onyxia instance. The URLs of the services created by Onyxia will look like: https://<something>.lab.my-domain.net

You can customise "onyxia" and "lab" to your liking, for example you could choose datalab.my-domain.net and *.kub.my-domain.net.

SSL

In this section we will obtain a TLS certificate issued by Let's Encrypt using the certbot command-line tool, then get our ingress controller to use it.

If you are already familiar with certbot, you're probably used to running it on a remote host via SSH. In this case you are expected to run it on your own machine; we'll use the DNS challenge instead of the HTTP challenge.

brew install certbot # On Mac; look up how to install certbot for your OS

# Because we need a wildcard certificate, we have to complete the DNS challenge.
sudo certbot certonly --manual --preferred-challenges dns

# When asked for the domains you wish to obtain a certificate for, enter:
#   onyxia.my-domain.net *.lab.my-domain.net

The obtained certificate needs to be renewed every three months.

To avoid having to remember to re-run the certbot command periodically, you can set up cert-manager and configure a DNS01 challenge provider on your cluster, but that's out of scope for Onyxia.

You may need to delegate your DNS servers to one of the supported DNS service providers.
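
As long as renewal is manual, it can help to check how much time the certificate has left. The sketch below relies on the -checkend option of openssl x509; cert_valid_for_days is a hypothetical helper name, and it assumes openssl is installed on your machine.

```shell
# Hypothetical helper: succeeds (exit code 0) when the certificate
# stored at $1 will still be valid $2 days from now.
cert_valid_for_days() {
  openssl x509 -checkend "$(( $2 * 86400 ))" -noout -in "$1" > /dev/null
}
```

For example, cert_valid_for_days /etc/letsencrypt/live/onyxia.my-domain.net/fullchain.pem 30 || echo "time to re-run certbot". Reading files under /etc/letsencrypt usually requires sudo.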

Now we want to create a Kubernetes secret containing our newly obtained certificate:

DOMAIN=my-domain.net
sudo kubectl create secret tls onyxia-tls \
    -n ingress-nginx \
    --key /etc/letsencrypt/live/onyxia.$DOMAIN/privkey.pem \
    --cert /etc/letsencrypt/live/onyxia.$DOMAIN/fullchain.pem

Lastly, we want to tell our ingress controller to use this TLS certificate. To do so, run:

kubectl edit deployment ingress-nginx-controller -n ingress-nginx

This command will open your configured text editor. Go to the args list of the controller container (around line 56) and add:

      - --default-ssl-certificate=ingress-nginx/onyxia-tls
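
If you would rather not edit the deployment by hand, the same argument can in principle be appended with a JSON patch. This is a sketch, not the method used in this guide: append_default_certificate is a hypothetical helper name, and it assumes the controller is the first container of the deployment (index 0), which you should check on your cluster.

```shell
# Hypothetical helper: append the default certificate flag to the
# args of the first container of the ingress-nginx controller.
append_default_certificate() {
  kubectl patch deployment ingress-nginx-controller \
    --namespace ingress-nginx \
    --type=json \
    --patch '[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--default-ssl-certificate=ingress-nginx/onyxia-tls"}]'
}
```

Calling append_default_certificate triggers a rollout of the controller, just like saving the file in kubectl edit does.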

If you are on a Mac or Windows computer you can install Docker Desktop, then enable Kubernetes.

On Linux, you can use Kind instead.

Port Forwarding

You'll need to forward TCP ports 80 and 443 to your local machine. This is done from the administration panel of your home internet box. If you're on a corporate network, you are out of luck, I'm afraid.

DNS

Let's assume you own the domain name my-domain.net. For the rest of the guide, replace my-domain.net with a domain you actually own.

Get the routable IP of your internet box and create the following DNS records:

onyxia.my-domain.net A <YOUR_IP>
*.lab.my-domain.net  A <YOUR_IP>

If you have a DDNS domain, you can create CNAME records instead. Example:

onyxia.my-domain.net CNAME jhon-doe-home.ddns.net.
*.lab.my-domain.net  CNAME jhon-doe-home.ddns.net.

https://onyxia.my-domain.net will be the URL of your Onyxia instance.

The URLs of the services created by Onyxia will look like: https://xxx.lab.my-domain.net

You can customise "onyxia" and "lab" to your liking, for example you could choose datalab.my-domain.net and *.kub.my-domain.net.

SSL

In this section we will obtain a TLS certificate issued by Let's Encrypt using the certbot command-line tool.

brew install certbot # On Mac; look up how to install certbot for your OS

# Because we need a wildcard certificate, we have to complete the DNS challenge.
sudo certbot certonly --manual --preferred-challenges dns

# When asked for the domains you wish to obtain a certificate for, enter:
#   onyxia.my-domain.net *.lab.my-domain.net

The obtained certificate needs to be renewed every three months.

You may need to delegate your DNS servers to one of the supported DNS service providers.

Now we want to create a Kubernetes secret containing our newly obtained certificate:

kubectl create namespace ingress-nginx
DOMAIN=my-domain.net
sudo kubectl create secret tls onyxia-tls \
    -n ingress-nginx \
    --key /etc/letsencrypt/live/onyxia.$DOMAIN/privkey.pem \
    --cert /etc/letsencrypt/live/onyxia.$DOMAIN/fullchain.pem

Ingress controller

We'll install ingress-nginx in our cluster.

cat << EOF > ./ingress-nginx-values.yaml
controller:
  extraArgs:
    default-ssl-certificate: "ingress-nginx/onyxia-tls"
EOF

helm install ingress-nginx ingress-nginx \
    --repo https://kubernetes.github.io/ingress-nginx \
    --namespace ingress-nginx \
    -f ./ingress-nginx-values.yaml

Installing Onyxia using helm

In this section we assume that:

You have a Kubernetes cluster and kubectl configured
onyxia.my-domain.net and *.lab.my-domain.net are pointing to your cluster's external address, my-domain.net being a domain that you own. You can customise "onyxia" and "lab" to your liking, for example you could choose datalab.my-domain.net and *.kub.my-domain.net.
You have an ingress controller configured with a default TLS certificate for *.lab.my-domain.net and onyxia.my-domain.net.

As of today the default service catalog will only work with ingress-nginx.

This will be addressed in the near future.

Throughout this guide we act as if everything were instantaneous. In reality, if you are testing on a small cluster, you will need to wait several minutes after running helm install for the services to be ready.

Use kubectl get pods to see if your pods are up and ready.
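
Rather than polling by hand, you can let kubectl block until the pods are ready. A minimal sketch, where wait_for_pods is a hypothetical wrapper name and the default namespace and 10 minute timeout are assumptions to adapt:

```shell
# Hypothetical wrapper: wait until every pod of a namespace is Ready,
# or fail once the timeout is reached.
# $1 = namespace (defaults to "default"), $2 = timeout (defaults to 600s)
wait_for_pods() {
  kubectl wait --for=condition=Ready pods --all \
    --namespace "${1:-default}" --timeout="${2:-600s}"
}
```

Run wait_for_pods right after a helm install. Note that pods belonging to completed jobs never become Ready, so this may time out on namespaces that contain some.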

(Optional) Make sure that your cluster is ready for Onyxia

To make sure that your Kubernetes cluster is correctly configured let's deploy a test web app on it before deploying Onyxia.

DOMAIN=my-domain.net

cat << EOF > ./test-spa-values.yaml
ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: nginx
  hosts:
    - host: test-spa.lab.$DOMAIN
EOF

helm repo add etalab https://etalab.github.io/helm-charts
helm install test-spa etalab/keycloakify-demo-app -f test-spa-values.yaml
echo "Navigate to https://test-spa.lab.$DOMAIN, see the Hello World"
helm uninstall test-spa

helm repo add onyxia https://inseefrlab.github.io/onyxia

DOMAIN=my-domain.net

cat << EOF > ./onyxia-values.yaml
ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: nginx
  hosts:
    - host: onyxia.$DOMAIN
api:
  regions: 
    [
       {
          "id":"demo",
          "name":"Demo",
          "description":"This is a demo region, feel free to try Onyxia !",
          "services":{
             "type":"KUBERNETES",
             "singleNamespace":true,
             "namespacePrefix":"user-",
             "usernamePrefix":"oidc-",
             "groupNamespacePrefix":"projet-",
             "groupPrefix":"oidc-",
             "authenticationMode":"serviceAccount",
             "expose":{
                "domain":"lab.$DOMAIN"
             },
             "monitoring":{
                "URLPattern":"todo"
             },
             "initScript":"https://inseefrlab.github.io/onyxia/onyxia-init.sh"
          }
       }
    ]
EOF

helm install onyxia onyxia/onyxia -f onyxia-values.yaml

You can now access https://onyxia.my-domain.net and start services. Congratulations! 🥳

You have the ability to customize the user interface (UI) of Onyxia through the provision of specific environment variables to the UI. For details on the available options, please consult the 'UI Customization' section of this file.

If you are unsure about how to supply these variables, refer to the later section of this guide where we discuss how to provide the KEYCLOAK_* parameters. You'll then be able to add your UI-related parameters alongside them.

Enabling user authentication

At the moment there is no authentication process: everyone can access our platform and start services.

Let's set up Keycloak to enable users to create an account and log in to our Onyxia.

Notes if you already have a Keycloak server

If you already have a Keycloak server, it is up to you to pick from this guide what is relevant to you.

The main takeaway is that you probably want to load the Onyxia custom Keycloak theme and enable -Dkeycloak.profile=preview, in order to enforce that usernames are well formatted and to define an accept list of email domains allowed to create an account on your Onyxia instance.

You probably want to enable Terms and Conditions as a required action.

With that out of the way, note that you can configure onyxia-web to integrate with your existing Keycloak server; you just need to set some dedicated environment variables in the values.yaml of the onyxia helm chart. Example:

web:
  env:
    # Available env are documented here: https://github.com/InseeFrLab/onyxia-web/blob/main/.env
    KEYCLOAK_URL: https://auth.lab.my-domain.net/auth
    KEYCLOAK_CLIENT_ID: onyxia
    KEYCLOAK_REALM: datalab
    JWT_EMAIL_CLAIM: email
    JWT_FAMILY_NAME_CLAIM: family_name
    JWT_FIRST_NAME_CLAIM: given_name
    JWT_USERNAME_CLAIM: preferred_username
    JWT_LOCALE_CLAIM: locale

For deploying our Keycloak we use codecentric's helm chart.

helm repo add codecentric https://codecentric.github.io/helm-charts

DOMAIN=my-domain.net
POSTGRESQL_PASSWORD=xxxxx #Replace by a strong password, you will never need it.
# Credentials for logging to https://auth.lab.$DOMAIN/auth
KEYCLOAK_USER=admin
KEYCLOAK_PASSWORD=yyyyyy 

cat << EOF > ./keycloak-values.yaml
image:
  # We use the legacy variant of the image until codecentric updates its helm chart
  tag: "19.0.3-legacy"
replicas: 1
extraInitContainers: |
  - name: realm-ext-provider
    image: curlimages/curl
    imagePullPolicy: IfNotPresent
    command:
      - sh
    args:
      - -c
      - |
        # There is a custom theme published alongside every onyxia-web release
        # The version of the Keycloak theme and the version of onyxia-web don't need 
        # to match but you should update the theme from time to time.  
        # https://github.com/InseeFrLab/onyxia-web/releases
        curl -L -f -S -o /extensions/onyxia.jar https://github.com/InseeFrLab/onyxia-web/releases/download/v2.29.4/keycloak-theme.jar
    volumeMounts:
      - name: extensions
        mountPath: /extensions
extraVolumeMounts: |
  - name: extensions
    mountPath: /opt/jboss/keycloak/standalone/deployments
extraVolumes: |
  - name: extensions
    emptyDir: {}
extraEnv: |
  - name: KEYCLOAK_USER
    value: $KEYCLOAK_USER
  - name: KEYCLOAK_PASSWORD
    value: $KEYCLOAK_PASSWORD
  - name: JGROUPS_DISCOVERY_PROTOCOL
    value: kubernetes.KUBE_PING
  - name: KUBERNETES_NAMESPACE
    valueFrom:
     fieldRef:
       apiVersion: v1
       fieldPath: metadata.namespace
  - name: KEYCLOAK_STATISTICS
    value: "true"
  - name: CACHE_OWNERS_COUNT
    value: "2"
  - name: CACHE_OWNERS_AUTH_SESSIONS_COUNT
    value: "2"
  - name: PROXY_ADDRESS_FORWARDING
    value: "true"
  - name: JAVA_OPTS
    value: >-
      -Dkeycloak.profile=preview -XX:+UseContainerSupport -XX:MaxRAMPercentage=50.0 -Djava.net.preferIPv4Stack=true -Djava.awt.headless=true 
ingress:
  enabled: true
  servicePort: http
  annotations:
    kubernetes.io/ingress.class: nginx
    ## Resolve HTTP 502 error using ingress-nginx:
    ## See https://www.ibm.com/support/pages/502-error-ingress-keycloak-response
    nginx.ingress.kubernetes.io/proxy-buffer-size: 128k
  rules:
    - host: "auth.lab.$DOMAIN"
      paths:
        - path: /
          pathType: Prefix
  tls:
    - hosts:
        - auth.lab.$DOMAIN
postgresql:
  postgresqlPassword: $POSTGRESQL_PASSWORD
EOF

helm install keycloak codecentric/keycloak -f keycloak-values.yaml

You can now open the administration console at https://auth.lab.my-domain.net and log in using the credentials you defined with KEYCLOAK_USER and KEYCLOAK_PASSWORD.

Create a realm called "datalab" (or any other name), then go to Realm settings:
1. On the tab General
  1. User Profile Enabled: On
2. On the tab Login
  1. User registration: On
  2. Forgot password: On
  3. Remember me: On
3. On the tab Email, we give an example with AWS SES. If you don't have an SMTP server at hand, you can skip this by going to Authentication (on the left panel) -> Tab Required Actions -> Uncheck "set as default action" for Verify Email. Be aware that with email verification disabled, anyone will be able to sign up to your service.
  1. From: [email protected]
  2. Host: email-smtp.us-east-2.amazonaws.com
  3. Port: 465
  4. Authentication: enabled
  5. Username: **************
  6. Password: ***************************************
  7. When clicking "save" you'll be asked for a test email. You have to provide one that corresponds to a pre-existing user, or you will get a silent error and the credentials won't be saved.
4. On the tab Themes
  1. Login theme: onyxia-web (you can also select the login theme on a per client basis)
  2. Email theme: onyxia-web
5. On the tab Localization
  1. Internationalization: Enabled
  2. Supported locales: <Select the languages you wish to support>
Create a client called "onyxia"
1. Root URL: https://onyxia.my-domain.net/
2. Valid redirect URIs: https://onyxia.my-domain.net/*
3. Web origins: *
4. Login theme: onyxia-web
In Authentication (on the left panel) -> Tab Required Actions, enable Terms and Conditions and set it as default action.

Now you want to ensure that the usernames chosen by your users comply with Onyxia's requirement (only alphanumerical characters) and define a list of email domains allowed to register to your service.

Go to Realm Settings (on the left panel) -> Tab User Profile -> JSON Editor. This tab shows up only if User Profile is enabled in the General tab, and you can enable User Profile only if you have started Keycloak with -Dkeycloak.profile=preview.

Now you can edit the file as suggested in the following diff snippet. Be mindful that this example only allows @gmail.com and @hotmail.com email addresses to register; you will want to edit that.

{
  "attributes": [
    {
      "name": "username",
      "displayName": "${username}",
      "validations": {
        "length": {
          "min": 3,
          "max": 255
        },
+       "pattern": {
+         "error-message": "${alphanumericalCharsOnly}",
+         "pattern": "^[a-zA-Z0-9]*$"
+       },
        "username-prohibited-characters": {}
      }
    },
    {
      "name": "email",
      "displayName": "${email}",
      "validations": {
        "email": {},
+       "pattern": {
+         "pattern": "^[^@]+@([^.]+\\.)*((gmail\\.com)|(hotmail\\.com))$"
+       },
        "length": {
          "max": 255
        }
      }
    },
    {
      "name": "firstName",
      "displayName": "${firstName}",
      "required": {
        "roles": [
          "user"
        ]
      },
      "permissions": {
        "view": [
          "admin",
          "user"
        ],
        "edit": [
          "admin",
          "user"
        ]
      },
      "validations": {
        "length": {
          "max": 255
        },
        "person-name-prohibited-characters": {}
      }
    },
    {
      "name": "lastName",
      "displayName": "${lastName}",
      "required": {
        "roles": [
          "user"
        ]
      },
      "permissions": {
        "view": [
          "admin",
          "user"
        ],
        "edit": [
          "admin",
          "user"
        ]
      },
      "validations": {
        "length": {
          "max": 255
        },
        "person-name-prohibited-characters": {}
      }
    }
  ]
}
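
Before saving the profile, you can try the two patterns from the snippet above on a few sample values. This is a rough local check, assuming grep -E behaves like Keycloak's Java regex engine for these simple patterns; matches is a hypothetical helper name.

```shell
# Patterns copied from the User Profile snippet (JSON escaping removed).
USERNAME_PATTERN='^[a-zA-Z0-9]*$'
EMAIL_PATTERN='^[^@]+@([^.]+\.)*((gmail\.com)|(hotmail\.com))$'

# Hypothetical helper: succeeds when the value $2 matches the pattern $1.
matches() {
  printf '%s\n' "$2" | grep -Eq "$1"
}

matches "$USERNAME_PATTERN" "jdoe42"        && echo "jdoe42: accepted"
matches "$USERNAME_PATTERN" "j.doe"         || echo "j.doe: rejected"
matches "$EMAIL_PATTERN" "jdoe@gmail.com"   && echo "jdoe@gmail.com: accepted"
matches "$EMAIL_PATTERN" "jdoe@example.com" || echo "jdoe@example.com: rejected"
```

Note that the username pattern alone also accepts an empty string; it is the separate length validation (min 3) that rules it out.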

Now that our Keycloak server is fully configured, we just need to update our Onyxia deployment to let it know about it.

Update the onyxia-values.yaml file that you created previously; don't forget to replace all the occurrences of my-domain.net with your actual domain.

Don't forget either to replace the terms of service of the sspcloud with your own. CORS should be enabled on those .md links (Access-Control-Allow-Origin: *).

+serviceAccount:
+  clusterAdmin: true
 ingress:
   enabled: true
   annotations:
     kubernetes.io/ingress.class: nginx
   hosts:
     - host: onyxia.my-domain.net
 web:
+  env:
+    TERMS_OF_SERVICES: |
+      { 
+        "en": "https://www.sspcloud.fr/tos_en.md", 
+        "fr": "https://www.sspcloud.fr/tos_fr.md" 
+      }
 api:
   env:
+    authentication.mode: openidconnect
+    oidc.issuer-uri: "https://auth.lab.my-domain.net/auth/realms/datalab"
+    oidc.clientID: "onyxia"
+    oidc.audience: "onyxia"
   regions:
     [
        {
           "id":"demo",
           "name":"Demo",
           "description":"This is a demo region, feel free to try Onyxia !",
           "services":{
              "type":"KUBERNETES",
-             "singleNamespace": true,
+             "singleNamespace": false,
              "namespacePrefix":"user-",
              "usernamePrefix":"oidc-",
              "groupNamespacePrefix":"projet-",
              "groupPrefix":"oidc-",
              "authenticationMode":"serviceAccount",
              "expose":{
                 "domain":"lab.my-domain.net"
              },
              "monitoring":{
                 "URLPattern":"todo"
              },
              "initScript":"https://inseefrlab.github.io/onyxia/onyxia-init.sh"
           }
        }
     ]

Now that you have updated onyxia-values.yaml, upgrade the release so that Onyxia picks up the new configuration.

helm upgrade onyxia onyxia/onyxia -f onyxia-values.yaml

Now your users should be able to create an account, log in, and start services in their own Kubernetes namespace.
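
If login does not work, a first thing to check is that the issuer URI given in oidc.issuer-uri is reachable and serves its OIDC discovery document. A small sketch, where oidc_discovery_url is a hypothetical helper name:

```shell
# Hypothetical helper: build the standard OIDC discovery URL
# from an issuer URI (a trailing slash is tolerated).
# $1 = issuer URI, as set in oidc.issuer-uri
oidc_discovery_url() {
  printf '%s/.well-known/openid-configuration\n' "${1%/}"
}
```

For example, curl -fsS "$(oidc_discovery_url https://auth.lab.my-domain.net/auth/realms/datalab)" should return a JSON document whose issuer field equals the URI you configured.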

S3 Storage

Onyxia-web uses the AWS Security Token Service (STS) API to obtain tokens and give users access to storage features. Any S3 storage compatible with this API is supported. Here we use MinIO, which is compatible with the Amazon S3 storage service, and we show how to integrate it with Keycloak.

Create a Keycloak client for accessing MinIO

Before configuring MinIO, let's create a new Keycloak client in the existing "datalab" realm.

Create a client called "minio".

Client ID: minio
Client Protocol: openid-connect
Root URL: https://minio.lab.my-domain.net/

Complete the content of client "minio" with the following values.

Access Type: confidential
Valid Redirect URIs (two values are required): https://minio.lab.my-domain.net/* and https://minio-console.lab.my-domain.net/*
Web origins: *

Save the content; a new tab called Credentials should appear. Navigate to the Credentials tab and copy the secret value for the next section.

Navigate to Mappers tab and create a protocol Mapper.

Name: policy
Mapper Type: Hardcoded claim

Complete the content of Mapper "policy" with the following values.

Token Claim Name: policy
Claim value: stsonly
Add to ID token: on
Add to access token: on
Add to userinfo: on

Install MinIO

We recommend following the MinIO documentation for this installation; OIDC authentication must be enabled. We will use the official Helm chart in this tutorial. All Helm configuration values can be found within this link.

Replace COPY_SECRET_FROM_KEYCLOAK_MINIO_CLIENT with the secret value defined in the "minio" Keycloak client (see previous section).

helm repo add minio https://charts.min.io/
 
DOMAIN=my-domain.net

cat << EOF > ./minio-values.yaml
## replicas: 16
ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: nginx
  path: /
  hosts:
    - minio.lab.$DOMAIN
  tls:
    - hosts:
        - minio.lab.$DOMAIN
consoleIngress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: nginx
  path: /
  hosts:
    - minio-console.lab.$DOMAIN
  tls:
    - hosts:
        - minio-console.lab.$DOMAIN
environment:
  MINIO_BROWSER_REDIRECT_URL: https://minio-console.lab.$DOMAIN
oidc:
  enabled: true
  configUrl: "https://auth.lab.$DOMAIN/auth/realms/datalab/.well-known/openid-configuration"
  clientId: "minio"
  claimName: "policy"
  scopes: "openid,profile,email"
  redirectUri: "https://minio-console.lab.$DOMAIN/oauth_callback"
  claimPrefix: ""
  comment: ""
  clientSecret: COPY_SECRET_FROM_KEYCLOAK_MINIO_CLIENT
policies:
  - name: stsonly
    statements:
      - resources:
          - 'arn:aws:s3:::oidc-\${jwt:preferred_username}'
          - 'arn:aws:s3:::oidc-\${jwt:preferred_username}/*'
        actions:
          - "s3:*"
EOF

helm install minio minio/minio -f minio-values.yaml

MinIO is now deployed and is accessible on the console url.

By default, there are 16 MinIO containers running. If this number is too large for your Kubernetes cluster, you can limit it by configuring the 'replicas' key.
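
A word of caution about the values files of this guide: they are written through an unquoted heredoc (cat << EOF), which is what lets the shell substitute $DOMAIN. The same substitution applies to any other ${...} sequence, such as the MinIO policy variable ${jwt:preferred_username}, unless its dollar sign is escaped. The following reproduction, which uses a throwaway file rather than minio-values.yaml, illustrates the difference:

```shell
DOMAIN=my-domain.net
demo_file="$(mktemp)"

# $DOMAIN is expanded by the shell; the escaped \${...} is written literally.
cat << EOF > "$demo_file"
host: minio.lab.$DOMAIN
resource: 'arn:aws:s3:::oidc-\${jwt:preferred_username}'
EOF

cat "$demo_file"
```

To check your real file, run grep preferred_username minio-values.yaml and make sure the ${jwt:preferred_username} placeholder is still there.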

Create a Keycloak Client for Onyxia/MinIO

Before configuring the Onyxia region to create tokens, we should go back to Keycloak and create a new client that enables onyxia-web to request tokens for MinIO. This client is a little more complex than the others if you want to manage durations (here 7 days). It must also carry a claim named policy with the value stsonly, to match our MinIO deployment.

From the "datalab" realm, create a client called "onyxia-minio".

Client ID: onyxia-minio
Client Protocol: openid-connect
Root URL: https://onyxia.my-domain.net/

Complete the content of client "onyxia-minio" with the following values.

Access Type: public
Valid Redirect URIs: https://onyxia.my-domain.net/*
Web origins: *
Advanced Settings
1. Access Token Lifespan: 7 days
2. Client Session Idle: 7 days
3. Client Session Max: 7 days

Save the content, navigate to the Mappers tab and create two protocol mappers.

Create the first Mapper called "policy".

Token Name: policy
Mapper Type: Hardcoded claim
Token Claim Name: policy
Claim value: stsonly
Add to ID token: on
Add to access token: on
Add to userinfo: on

Create the second Mapper called "audience-minio".

Token Name: audience-minio
Mapper Type: Audience
Included Custom Audience: minio
Add to ID token: on
Add to access token: on

Update Onyxia

S3 storage is configured inside a region in the Onyxia API. Several options describe the storage and tell Onyxia-web how to generate the tokens: the Keycloak parameters used to access the storage API, the duration of the STS tokens, the bucket name (a standard prefix plus a claim of the user's JWT that yields a unique identifier), and whether Onyxia-web should try to create this bucket silently. There are also options for projects. Check on GitHub all the options available for the version you need.

 serviceAccount:
   clusterAdmin: true
 ingress:
   enabled: true
   annotations:
     kubernetes.io/ingress.class: nginx
   hosts:
     - host: onyxia.my-domain.net
 web:
  env:
    KEYCLOAK_REALM: datalab
    KEYCLOAK_URL: https://auth.lab.my-domain.net/auth
    TERMS_OF_SERVICES: |
      { "en": "https://www.sspcloud.fr/tos_en.md", "fr": "https://www.sspcloud.fr/tos_fr.md" }
 api:
   env:
    authentication.mode: openidconnect
    keycloak.realm: datalab
    keycloak.auth-server-url: https://auth.lab.my-domain.net/auth
   regions:
     [
        {
           "id":"demo",
           "name":"Demo",
           "description":"This is a demo region, feel free to try Onyxia !",
           "services":{
              "type":"KUBERNETES",
              "singleNamespace": false,
              "namespacePrefix":"user-",
              "usernamePrefix":"oidc-",
              "groupNamespacePrefix":"projet-",
              "groupPrefix":"oidc-",
              "authenticationMode":"admin",
              "expose":{
                 "domain":"lab.my-domain.net"
              },
              "monitoring":{
                 "URLPattern":"todo"
              },
              "cloudshell":{
                 "catalogId":"inseefrlab-helm-charts-datascience",
                 "packageName":"cloudshell"
              },
              "initScript":"https://inseefrlab.github.io/onyxia/onyxia-init.sh"
           },
           "data":{
              "S3":{
-                "URL":"todo",
+                "type": "minio",
+                "URL": "https://minio.lab.my-domain.net",
+                "region": "us-east-1",
+                "bucketPrefix": "oidc-",
+                "groupBucketPrefix": "projet-",
+                "bucketClaim": "preferred_username",
+                "defaultDurationSeconds": 86400,
+                "keycloakParams":
+                {
+                      "URL": "https://auth.lab.my-domain.net/auth",
+                      "realm": "datalab",
+                      "clientId": "onyxia-minio",
+                },
+                "acceptBucketCreation": true,
                 "monitoring":{
                    "URLPattern":"minio"
                 }
              }
           },
           "auth":{
              "type":"openidconnect"
           },
           "location":{
              "lat":48.8164,
              "long":2.3174,
              "name":"Montrouge (France)"
           }
        }
     ]

helm upgrade onyxia onyxia/onyxia -f onyxia-values.yaml
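
To get a feel for what happens behind the scenes, the token exchange can be reproduced by hand with the STS AssumeRoleWithWebIdentity action that MinIO implements. This is a hedged sketch, not the code Onyxia runs: sts_request_url is a hypothetical helper name, and the access token is assumed to have been issued by the "onyxia-minio" client.

```shell
# Hypothetical helper: build an AssumeRoleWithWebIdentity request URL.
# $1 = MinIO endpoint, $2 = OIDC access token,
# $3 = requested lifetime in seconds (defaults to 86400, as in the region)
sts_request_url() {
  printf '%s/?Action=AssumeRoleWithWebIdentity&Version=2011-06-15&DurationSeconds=%s&WebIdentityToken=%s\n' \
    "${1%/}" "${3:-86400}" "$2"
}
```

With a token in $ACCESS_TOKEN, curl -fsS -X POST "$(sts_request_url https://minio.lab.my-domain.net "$ACCESS_TOKEN")" should return an XML document containing temporary credentials (access key, secret key and session token).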

Vault

Onyxia-web uses Vault as storage for two kinds of secrets:
1. secrets or information generated by Onyxia to store various values (UI preferences, for example)
2. user secrets

Vault must be configured with the JWT or OIDC authentication method.

As Vault needs to be initialized with a master key, it can't be directly configured with all parameters such as OIDC or access policies and roles. So, as a first step, we create a Vault in dev mode (do not use this in production; initialize it with one of the recommended configurations instead: Shamir, GCP, another Vault).

helm repo add hashicorp https://helm.releases.hashicorp.com
 
DOMAIN=my-domain.net

cat << EOF > ./vault-values.yaml
server:
  dev:
    enabled: true
    # Set VAULT_DEV_ROOT_TOKEN_ID value
    devRootToken: "root"
  ingress:
    enabled: true
    annotations:
      kubernetes.io/ingress.class: nginx
    hosts:
      - host: "vault.lab.$DOMAIN"
    tls:
      - hosts:
          - vault.lab.$DOMAIN
EOF

helm install vault hashicorp/vault -f vault-values.yaml

Create a client called "vault"

Root URL: https://vault.lab.my-domain.net/
Valid redirect URIs: https://vault.lab.my-domain.net/*
Web origins: *
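
As a starting point for the JWT authentication method mentioned above, here is a minimal sketch using the Vault CLI. It is an assumption about how you might begin, not the complete Onyxia configuration (policies and roles are not covered). enable_vault_jwt_auth is a hypothetical helper name, and VAULT_ADDR and VAULT_TOKEN are assumed to be exported (for the dev mode deployment above, https://vault.lab.my-domain.net and root).

```shell
# Hypothetical helper: enable the JWT auth method and point it at the
# Keycloak realm so that Vault can validate the tokens it receives.
# $1 = OIDC issuer URI of the Keycloak realm
enable_vault_jwt_auth() {
  vault auth enable jwt &&
    vault write auth/jwt/config oidc_discovery_url="$1"
}
```

For example: enable_vault_jwt_auth https://auth.lab.my-domain.net/auth/realms/datalab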

TODO; Refer to the legacy documentation.

User guide

Using Onyxia (as a data scientist)

Start a service

The following documents Onyxia when configured with the default service catalogs:

This collection of charts helps users launch many IDEs with various binary stacks (Python, R), with or without GPU support. Docker images are built here and help us provide a homogeneous stack.

This collection of charts helps users launch many database systems. Most of them are based on bitnami/charts.

This collection of charts helps users start automation tools for their data science activity.

The Onyxia user experience may be very different from one service catalog to another.

The catalog defines what options are available through Onyxia.

Users can edit various parameters. Onyxia makes some assertions based on the chart's values schema and the configuration of the instance. For example, some identity tokens can be injected by default (because Onyxia connects users to many APIs).

After launching a service, notes are shown to the user. They can be retrieved later through the README button. Chart administrators should explain how to connect to the services (URL, account) and what happens on deletion.

File browser

Users can manage their files on S3. S3 has no support for renaming, so don't be surprised. Onyxia is educational: any action performed in the S3 browser of the UI is echoed as a CLI command in a console.

Users can perform the following S3 actions:

download files
upload files
delete files

Of course, our default catalogs contain all the necessary tools to connect to S3.

Our advice is to never download files to your container but to load the data directly into memory.

Secret browser

Users can manage their secrets on Vault. There is also a CLI console.

Onyxia only uses a key-value v2 secret engine in Vault. Users can store secrets there and inject them into their services, if the Helm chart is configured for it.

Of course, our default catalogs contain all the necessary tools to connect to Vault.

Catalog of services

Every Onyxia instance may or may not have its own catalog. There are three default catalogs:

This collection of charts helps users launch many IDEs with various binary stacks (Python, R), with or without GPU support. Docker images are built and help us provide a homogeneous stack.

This collection of charts helps users launch many database systems. Most of them are based on bitnami/charts.

This collection of charts helps users start automation tools for their data science activity.

You can always find the source of the catalog by clicking on the "contribute to the... " link.

If you take , it has only one catalog, .

The available catalogs in a given Onyxia instance are configured at install time, example with datalab.sspcloud.fr:

In order to contribute you have to be familiar with Helm, and to be familiar with Helm you need to be familiar with Kubernetes.

In Onyxia we use the values.schema.json file to know what options should be displayed to the user and what default values Onyxia should inject.

Let's consider a sample of the values.schema.json of the InseeFrLab/helm-charts-datascience's Jupyter chart:

And it translates into this:

Note the "git.name", "git.email" and "git.token": they make it possible to pre-fill the fields.

If the user took the time to fill in their profile information, Onyxia knows the user's Git username, email and personal access token.

is defined the structure of the context that you can use in the overwriteDefaultWith field:

You can also concatenate string values using syntax.

Defining region scoped resources limit

You probably want to be able to define a limit to the amount of resources a user can request when launching a service.

It's possible to do it at the catalog level, but it's best to let the person deploying Onyxia define boundaries for their deployment regions.

This is the purpose of the x-onyxia param useRegionSliderConfig.

You now have all the relevant information to submit PRs on the existing catalogs or even to create your own.

Remember that a helm chart repository is nothing more than a GitHub repo with a special to publish the charts on GitHub Pages.

If you are looking for a repo to start from have a look at , it has a directory where you can put the icons of your services.

Security consideration

Information about security considerations

1. Autolaunch Feature

The autolaunch feature empowers you to create HTTP links that automatically deploy an environment. This is an invaluable tool for initiating trainings effortlessly. However, exercise caution while using it as it could pose a security risk to the user. Consider disabling this feature if it doesn't suit your requirements or if security is a primary concern.

2. Group Feature

Onyxia is primarily designed to allocate resources such as a namespace and an S3 bucket to an individual user for work purposes. Additionally, it incorporates a feature that allows multiple users to share access to the same resources within a project. While this can be extremely beneficial for collaboration, be aware that it might be exploited by a malicious user within the group to leverage the privileges of another project member. Always monitor shared resources and maintain proper user access control to prevent such security breaches.

Migration guides

v4 -> v5

The primary breaking change in this release pertains to Keycloak configuration. With this update, you're no longer limited to using Keycloak; any OIDC-compliant identity provider is now supported. To accommodate this new feature, you'll need to make some adjustments to the configuration of your Onyxia instance.

You don't need to specify the issuerURI in multiple locations as we have done here. If you're using just one identity server (a single Keycloak server, for example), you can set the issuerURI solely in api->env->oidc.issuer-uri.

Migrating to the new helm repo

Previously, the Helm chart of Onyxia was hosted on the inseefrlab/helm-charts repo and has now been moved to inseefrlab/onyxia. As a result you would now install Onyxia like this:

-helm repo add inseefrlab https://inseefrlab.github.io/helm-charts
+helm repo add onyxia https://inseefrlab.github.io/onyxia

-helm install onyxia inseefrlab/helm-charts
+helm install onyxia onyxia/onyxia

In the following we assume the current version of Onyxia is 4.1.4, but you are encouraged to use the latest version instead. See releases.

If you use ArgoCD for deploying onyxia:

apps/onyxia/Chart.yaml

 apiVersion: v2
 name: onyxia
 version: 1.0.0
 dependencies:
   - name: onyxia
-    version: 4.1.0
+    version: 4.1.4
-    repository: https://inseefrlab.github.io/helm-charts/
+    repository: https://inseefrlab.github.io/onyxia/

You no longer need to manually manage the versions of onyxia-web and onyxia-api. Now, if you want to update Onyxia, you just update the chart version number.

helm repo add onyxia https://inseefrlab.github.io/onyxia

DOMAIN=my-domain.net

cat << EOF > ./onyxia-values.yaml
# ...
web:
  image:
-   tag: 2.29.4
api:
  image:
-   tag: v0.32   
# ...
EOF

helm install onyxia onyxia/onyxia -f onyxia-values.yaml

For the Keycloak theme, the version is now synchronized with the Onyxia version.

helm repo add codecentric https://codecentric.github.io/helm-charts

cat << EOF > ./keycloak-values.yaml
# ... See https://docs.onyxia.sh/#enabling-user-authentication
extraInitContainers: |
  - name: realm-ext-provider
    image: curlimages/curl
    imagePullPolicy: IfNotPresent
    command:
      - sh
    args:
      - -c
      - |
-       curl -L -f -S -o /extensions/onyxia.jar https://github.com/InseeFrLab/onyxia/releases/download/v2.29.4/keycloak-theme.jar
+       curl -L -f -S -o /extensions/onyxia.jar https://github.com/InseeFrLab/onyxia/releases/download/v4.1.4/keycloak-theme.jar
    volumeMounts:
      - name: extensions
        mountPath: /extensions
extraVolumeMounts: |
  - name: extensions
    mountPath: /opt/jboss/keycloak/standalone/deployments
extraVolumes: |
  - name: extensions
    emptyDir: {}
# ...
EOF

helm install keycloak codecentric/keycloak -f keycloak-values.yaml

Also note that the theme will now appear as "onyxia" in the dropdown. Previously it was "onyxia-web".

Technical doc

Willing to submit PRs on the Onyxia codebase?

The Web Application

The TypeScript App that runs in the browser.

This is the documentation for InseeFrLab/onyxia -> web/.

git clone https://github.com/InseeFrLab/onyxia
cd onyxia/web

yarn install
# Set up the environment variables to tell the app to connect to the sspcloud
# Fill in your own values to run the web app against your Onyxia API.
cp .env.local.sample .env.local

# To start the app locally
yarn start

Technical stack

Technologies at play in Onyxia-web

To find your way in Onyxia, the best approach is to start by getting a surface-level understanding of the libraries that are leveraged in the project.

Modules marked by 🐔 are our own.

Typescript

We are fully committed to keeping everything type safe. If you are a seasoned developer but not fully comfortable with TypeScript yet, a good way to get quickly up to speed is to go through the What's new section of the official website.

You can skip anything related to classes; we don't do OOP in the project.

tsafe 🐔

We also rely heavily on tsafe. It's a collection of utilities that help write cleaner TypeScript code. It is crucial to understand at least assert, id, Equals and symToStr to be able to contribute to the codebase.

TS-CI 🐔

Whenever we see an opportunity for it, we publish chunks of the code we write for Onyxia-web as standalone NPM modules. It helps keep the complexity in check. We use TS-CI as a starter for everything we publish on NPM.

For working on what the end user 👁

Anything contained in the src/ui directory.

Onyxia-UI 🐔

The UI toolkit used in the project. You can find the setup of onyxia-UI in onyxia-web here: src/ui/theme.tsx.

MUI integration

Onyxia-UI is fully compatible with MUI.

Onyxia-UI offers a library of reusable components, but you can also use MUI components in the project; their appearance will automatically be adapted to blend in with the theme.

🎨 Palettes

We currently offer built-in support for four color palettes:

France: datalab.sspcloud.fr?THEME_ID=france
Ultraviolet: datalab.sspcloud.fr?THEME_ID=ultraviolet
Verdant: datalab.sspcloud.fr?THEME_ID=verdant
Onyxia (default): datalab.sspcloud.fr?THEME_ID=onyxia

You can also provide your own palette.

🔡 Fonts

The fonts are loaded in the public/index.html. It's important to keep it that way for Keycloakify.

🔡 Linking onyxia-ui in onyxia-web

To release a new version of Onyxia-UI, you just need to bump the package.json's version and push. The CI will automatically publish a new version on NPM.

If you want to test some changes made to onyxia-ui in onyxia-web before releasing a new version of onyxia-ui to NPM, you can link onyxia-ui locally in onyxia-web.

cd ~/github
git clone https://github.com/InseeFrLab/onyxia
cd onyxia/web
yarn install

cd ~/github/onyxia #This is just a suggestion, clone wherever you see fit.
git clone https://github.com/InseeFrLab/onyxia-ui ui
cd ui
yarn install
yarn build
yarn link-in-web
npx tsc -w

# Open a new terminal
cd ~/github/onyxia/web
yarn start

Now you can make changes in ~/github/onyxia/ui/ and see the live updates.

If you want to install/update some dependencies, you must remove the node_modules, do your updates, then link again.

tss-react 🐔

The library we use for styling.

Rules of thumb when it comes to styling:

Every component should accept an optional className prop; it should always overwrite the internal styles.
A component should not size or position itself. It should always be the responsibility of the parent component to do it. In other words, you should never have height, width, top, left, right, bottom or margin in the root styles of your components.
You should never have a color or a dimension hardcoded elsewhere than in the theme configuration. Use theme.spacing() (ex1, ex2, ex3) and theme.colors.useCases.xxx.

screen-scaler 🐔

Onyxia is mostly used on desktop computer screens. It's not worth the effort to create a fully fledged responsive design for the UI. screen-scaler enables us to design for a single canonical screen size. The library takes charge of scaling/shrinking the image depending on the real size of the screen. It also asks the user to rotate the screen when the app is rendered in portrait mode.
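The idea can be illustrated with a small, self-contained sketch. It is not screen-scaler's API: getScalingDecision, its parameters and the ScalingDecision type are hypothetical, and the "height greater than width" test for portrait mode is an assumption made for the example.

```typescript
// What to do for a given real screen size (hypothetical type for this sketch).
type ScalingDecision =
    | { kind: "ask to rotate" }
    | { kind: "scale"; zoomFactor: number };

function getScalingDecision(params: {
    canonicalWidth: number; // width the UI was designed for
    realWidth: number;
    realHeight: number;
}): ScalingDecision {
    const { canonicalWidth, realWidth, realHeight } = params;

    // Portrait mode: the design only targets landscape screens.
    if (realHeight > realWidth) {
        return { kind: "ask to rotate" };
    }

    // Lay the page out as if it were canonicalWidth pixels wide,
    // then shrink or enlarge the result to fit the real screen.
    return { kind: "scale", zoomFactor: realWidth / canonicalWidth };
}

console.log(getScalingDecision({ canonicalWidth: 1920, realWidth: 960, realHeight: 540 }));
console.log(getScalingDecision({ canonicalWidth: 1920, realWidth: 600, realHeight: 800 }));
```

The benefit of this design choice is that components can be sized for one reference screen instead of carrying breakpoints for every device.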

Storybook

It enables us to test the graphical components in isolation. See sources.

To launch Storybook locally run the following command:

yarn storybook

cra-envs 🐔

We need to be able to do:

docker run --env OIDC_URL="https://url-of-our-keycloak.fr/auth" inseefrlab/onyxia-web

Then, somehow, access OIDC_URL in the code like process.env["OIDC_URL"].

In theory it shouldn't be possible: onyxia-web is an SPA, it is just static JS/CSS/HTML. If we want to bundle values in the code, we would have to recompile. But this is where cra-envs comes into play.

It enables running onyxia-web against a specific infrastructure while keeping the app Docker image generic.

Check out the helm chart:

  web:
    replicaCount: 2
    env:
      MINIO_URL: https://minio.lab.sspcloud.fr
      VAULT_URL: https://vault.lab.sspcloud.fr
      OIDC_URL: https://auth.lab.sspcloud.fr/auth
      OIDC_REALM: sspcloud
      TITLE: SSP Cloud

All the accepted environment variables are defined here: .env. They are all prefixed with REACT_APP_ to be compatible with create-react-app. Default values are defined in this file.
Only in development (yarn start), .env.local is also loaded and has priority over .env.
Then, in the code, the variable can be accessed like this.

Please try not to access the environment variables too liberally throughout the code. In principle they should only be accessed here. We try to keep things pure as much as possible.
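As an illustration of reading the configuration in a single place, here is a minimal sketch. It does not use the cra-envs API: resolveEnv and defaultEnv are hypothetical, the variable names are borrowed from the helm values shown earlier, and the default values are placeholders.

```typescript
// Build-time defaults (placeholder values chosen for this example).
const defaultEnv = {
    OIDC_URL: "",
    OIDC_REALM: "",
    TITLE: "Onyxia",
};

type EnvName = keyof typeof defaultEnv;
type Env = Record<EnvName, string>;

// Merge the values injected when the container starts over the defaults.
// Unknown names are ignored and empty values do not override a default.
function resolveEnv(injected: Partial<Env>): Env {
    const resolved: Env = { ...defaultEnv };
    for (const envName of Object.keys(defaultEnv) as EnvName[]) {
        const value = injected[envName];
        if (value !== undefined && value !== "") {
            resolved[envName] = value;
        }
    }
    return resolved;
}

// The one module-level object the rest of the code would import.
const env = resolveEnv({
    OIDC_URL: "https://auth.lab.sspcloud.fr/auth",
    OIDC_REALM: "sspcloud",
});

console.log(env.OIDC_REALM, env.TITLE);
```

Keeping the lookup in one module means the rest of the code receives plain values as parameters and stays easy to test.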

powerhooks 🐔

It's a collection of general-purpose React hooks. Let's document the few use cases you absolutely need to understand:

Avoiding useless re-render of Components

For the sake of performance we enforce that every component be wrapped into React.memo(). This way a component only re-renders if one of its props has changed.

However, if you use inline functions or useCallback for callback props, your components will re-render every time anyway:

We always use useConstCallback for callback props, and useCallbackFactory for callback props in lists.
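The reason inline callbacks defeat React.memo() can be shown without React. The sketch below reimplements, in simplified form, the shallow prop comparison that React.memo() performs by default; arePropsEqual and the two render functions are names made up for this example and are not part of React or powerhooks.

```typescript
type Props = Record<string, unknown>;

// Simplified shallow comparison: same keys, and each value identical by reference.
function arePropsEqual(prev: Props, next: Props): boolean {
    const prevKeys = Object.keys(prev);
    if (prevKeys.length !== Object.keys(next).length) {
        return false;
    }
    return prevKeys.every(
        key =>
            Object.prototype.hasOwnProperty.call(next, key) &&
            Object.is(prev[key], next[key])
    );
}

// Parent "render" creating the callback inline: a new function on every call.
function renderWithInlineCallback(): Props {
    return { label: "Delete", onClick: () => console.log("clicked") };
}

// Parent "render" reusing a callback that was created once.
const stableOnClick = () => console.log("clicked");

function renderWithStableCallback(): Props {
    return { label: "Delete", onClick: stableOnClick };
}

// false: the two onClick functions are different references, so the child re-renders.
console.log(arePropsEqual(renderWithInlineCallback(), renderWithInlineCallback()));

// true: the same reference is passed each time, so the memoized child is skipped.
console.log(arePropsEqual(renderWithStableCallback(), renderWithStableCallback()));
```

A callback whose reference never changes between renders is what lets the memoized child skip re-rendering.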

Measuring Components

It is very handy to be able to get the height and the width of components dynamically. It saves us from hardcoding dimensions when we don’t need to. For that we use useDomRect.

Keycloakify 🐔

It's a build tool that makes it possible to implement the login and register pages that users see when they are redirected to Keycloak for authentication.

If the app is being run on Keycloak, the kcContext isn't undefined, which means that we should render the login/register pages.

If you want to test, uncomment this line and run yarn start. You can also test the login pages in a local keycloak container by running yarn keycloak. All the instructions will be printed on the console.

The keycloak-theme.jar file is automatically built and uploaded as a GitHub release asset by the CI.

type-routes

The library we use for routing. It's like react-router but type safe.

i18nifty 🐔

For internationalization and translation.

create-react-app

We plan to move to Vite once Keycloakify supports it.

The project is a non-ejected create-react-app using typescript template (you can find here the template repo that was used as a base for this project).

We use react-app-rewired instead of the default react-scripts to be able to use custom Webpack plugins without having to eject the App. The custom Webpack plugins that we use are defined here: /config-overrides.json. Currently the only one we use is circular-dependency-plugins.

For working on 🧠 of the App

Anything contained in the src/core directory.

redux-clean-architecture 🐔

The framework used to implement a strict separation of concerns between the UI and the Core and high modularity of the code.

EVT 🐔

EVT is an event management library (like RxJS is).

A lot of what we do is powered under the hood by EVT. You don't need to know EVT to work on onyxia-web; however, to demystify the parts of the code that involve it, here are the key ideas to take away:

If we need to perform particular actions when a value gets changed, we use StatefullEvt.
We use Ctx to detach event handlers when we no longer need them. (See line 108 on this playground)
In React, we use the useEvt hook to work with DOM events.

Architecture

Main rules

src/ui contains the React application, it's the UI of the app.
src/core contains the 🧠 of the app.
- Nothing in the src/core directory should relate to React. A concept like react hooks for example is out of scope for the src/core directory.
- src/core should never import anything from src/ui, even types.
- It should be possible for example to port onyxia-web to Vue.js or React Native without changing anything to the src/core directory.
- The goal of src/core is to expose an API that serves the UI.
- The API exposed should be reactive. We should not expose to the UI functions that return promises; instead, the functions we expose should update states and the UI should react to these state updates.

Architecture

Whenever we need to interact with the infrastructure we define a port in src/core/port. A port is only a type definition. In our case the infrastructure is: the Keycloak server, the Vault server, the Minio server and a Kubernetes API (Onyxia-API).
In src/core/adapters are the implementations of the ports. For each port we should have at least two implementations, a dummy and a real one. This enables the app to still run, albeit in degraded mode, if one piece of the infrastructure is missing. Say we don’t have a Vault server: we should still be able to launch containers.
In src/core/usecases we expose APIs for the UI to consume.
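The port/adapter split described above can be sketched as follows. This is an illustration only, not code from the repository: the SecretsManager port, the in-memory dummy adapter, the rememberToken use case and the "git/token" path are all hypothetical.

```typescript
// Port: only a type definition, no implementation.
type SecretsManager = {
    get: (params: { path: string }) => Promise<string | undefined>;
    put: (params: { path: string; value: string }) => Promise<void>;
};

// Dummy adapter: keeps secrets in memory so the app can run without a Vault server.
function createDummySecretsManager(): SecretsManager {
    const secrets = new Map<string, string>();
    return {
        get: async ({ path }) => secrets.get(path),
        put: async ({ path, value }) => {
            secrets.set(path, value);
        },
    };
}

// Use case: depends only on the port, never on a concrete adapter.
async function rememberToken(params: {
    secretsManager: SecretsManager;
    token: string;
}): Promise<void> {
    const { secretsManager, token } = params;
    await secretsManager.put({ path: "git/token", value: token });
}

const dummySecretsManager = createDummySecretsManager();

rememberToken({ secretsManager: dummySecretsManager, token: "abc123" })
    .then(() => dummySecretsManager.get({ path: "git/token" }))
    .then(token => console.log(token));
```

A real adapter would implement the same SecretsManager type by talking to the Vault server, and the use case would not need to change.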

In practice

Let's say we want to create a new page in onyxia-web where users can type in a repo name and get the current number of stars the repo has on GitHub.

UPDATE: This video remains relevant, but please note that the clean architecture setup has been considerably improved in the latest releases. A dedicated repo has been created to explain it in detail.

The main takeaway is that app has been renamed ui and lib has been renamed core.

You might wonder why some values, instead of being redux state, are returned by thunk functions.

For example, it might seem more natural to do:

const { isUserLoggedIn } = useCoreState(state => state.userAuthentication);

Instead of what we actually do, which is:

const { userAuthenticationThunks } = useThunks();
const isUserLoggedIn = userAuthenticationThunks.getIsUserLoggedIn();

However, the rule is to never store as redux state values that are not susceptible to change. Redux states are values that we observe; any redux state change should trigger a re-render of the React components that use them. Conversely, there is no need to observe a value that will never change. We can get it once and never again, get it in a callback or wherever.

But, you may object, users do login and logout, isUserLoggedIn is not a constant!

Actually, from the standpoint of the web app, it is. When a user who isn't authenticated clicks on the login button, they are redirected away. When they return to the app, everything is reloaded from scratch.

Now let's say we want the search to be restricted to a given GitHub organization. (Example: InseeFrLab.) The GitHub organization should be specified as an environment variable by the person in charge of deploying Onyxia. e.g.:

  web:
    env:
      MINIO_URL: https://minio.lab.sspcloud.fr
      VAULT_URL: https://vault.lab.sspcloud.fr
      OIDC_URL: https://auth.lab.sspcloud.fr/auth
      OIDC_REALM: sspcloud
      TITLE: SSP Cloud
      ORG_NAME: InseeFrLab #<==========

If no ORG_NAME is provided by the administrator, the app should always show 999 stars for any repo name queried.
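The requirement above maps naturally onto choosing between a real and a dummy adapter at startup. The following is a sketch under assumptions, not the code shown in the video: GithubApi, selectGithubApi and the injected fetchStarCount function are hypothetical names, and the actual HTTP call to GitHub is left to the caller.

```typescript
// Port describing what the core needs from GitHub.
type GithubApi = {
    getRepoStarCount: (params: { repoName: string }) => Promise<number>;
};

// Dummy adapter: always answers 999, whatever the repo name.
function createDummyGithubApi(): GithubApi {
    return { getRepoStarCount: async () => 999 };
}

// Real adapter: restricts the lookup to the configured organization.
// fetchStarCount stands for the HTTP call and is injected by the caller.
function createGithubApi(params: {
    orgName: string;
    fetchStarCount: (repoFullName: string) => Promise<number>;
}): GithubApi {
    const { orgName, fetchStarCount } = params;
    return {
        getRepoStarCount: ({ repoName }) => fetchStarCount(`${orgName}/${repoName}`),
    };
}

// Selection made once at startup from the ORG_NAME environment variable.
function selectGithubApi(params: {
    orgName: string | undefined;
    fetchStarCount: (repoFullName: string) => Promise<number>;
}): GithubApi {
    const { orgName, fetchStarCount } = params;
    return orgName === undefined || orgName === ""
        ? createDummyGithubApi()
        : createGithubApi({ orgName, fetchStarCount });
}

const githubApi = selectGithubApi({
    orgName: undefined,
    fetchStarCount: async () => 0,
});

githubApi.getRepoStarCount({ repoName: "onyxia" }).then(count => console.log(count));
```

Because the choice is made once from configuration, the rest of the core only ever sees the GithubApi type.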

Another example: Recording user's GitLab token

Currently users can save their GitHub Personal access token in their Onyxia account but not yet their GitLab token. Let's see how we would implement that.

How to deal with project switching

The easy action to take when the user selects another project is to simply reload the page (window.location.reload()). We want to avoid doing this to enable what we call "hot project swiping":

To implement this behavior you have to leverage the evtAction middleware from clean-redux. It enables registering functions to be run when certain actions are dispatched.

Unlike the other video, the following one is voiced. Find the relevant code here.

The REST API

The backend REST API in Java

This is the documentation for InseeFrLab/onyxia -> api/.

It's the part of the App that runs in the clusters. It handles the things that can't be done directly from the frontend.

Roadmap

Onyxia Project Core Team Future Developments Roadmap

Data Explorer Evolution

Transforming the existing file browser into a comprehensive data explorer is a central aspect of our development roadmap. This enhancement aims to provide data scientists with immediate access to the initial rows of various file formats (including but not limited to parquet, CSV, and JSON) directly via the Onyxia-web interface. This will effectively integrate a basic SQL engine, DuckDB Wasm, for swift data access and manipulation.

Accessibility Improvement

Enhancing accessibility is a vital and immediate priority for the Onyxia project. Currently, the platform lacks certain accessibility features which we plan to implement in the immediate future.

Expanded S3 Management Options

As it stands, the Onyxia project does not have the capacity to set quotas for S3 buckets or create custom policies. These responsibilities are currently delegated to other administrators. However, we are in the process of developing an S3 operator that will simplify the process of onboarding onto S3, thereby reducing dependency on external administrators.

Install

Your Onyxia instance, today

Oneliner

TLDR. Here is how you can get an Onyxia instance running in a matter of seconds.

helm repo add onyxia https://inseefrlab.github.io/onyxia

cat << EOF > ./onyxia-values.yaml
ingress:
  enabled: true
  hosts:
    - host: datalab.my-domain.net
EOF

helm install onyxia onyxia/onyxia -f onyxia-values.yaml

Step by step installation guide

In this section, we will set up Onyxia from the ground up, along with all the associated technologies. This includes MinIO for S3, Keycloak for OIDC, and Vault for managing secrets.

Provision a Kubernetes cluster

First you'll need a Kubernetes cluster. If you have one already you can skip this section.

Hashicorp maintains great tutorials for terraforming Kubernetes clusters on AWS, GCP or Azure.

Pick one of the three and follow the guide.

You can stop after the configure kubectl section.

Ingress controller

Deploy an ingress controller on your cluster:

The following command is for AWS.

For GCP use this command.

For Azure use this command.

kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.2.0/deploy/static/provider/aws/deploy.yaml

DNS

Let's assume you own the domain name my-domain.net, for the rest of the guide you should replace my-domain.net by a domain you actually own.

Now you need to get the external address of your cluster, run the command

kubectl get services -n ingress-nginx

and write down the External IP assigned to the LoadBalancer.

Depending on the cloud provider you are using it can be an IPv4, an IPv6 or a domain. On AWS for example, it will be a domain like xxx.elb.eu-west-1.amazonaws.com.

If you see <pending>, wait a few seconds and try again.

Once you have the address, create the following DNS records:

onyxia.my-domain.net CNAME xxx.elb.eu-west-1.amazonaws.com. 
*.lab.my-domain.net  CNAME xxx.elb.eu-west-1.amazonaws.com.

If the address you got was an IPv4 (x.x.x.x), create an A record instead of a CNAME.

If the address you got was an IPv6 (y:y:y:y:y:y:y:y), create an AAAA record.

https://onyxia.my-domain.net will be the URL for your instance of Onyxia. The URLs of the services created by Onyxia are going to look like: https://<something>.lab.my-domain.net

You can customise "onyxia" and "lab" to your liking, for example you could choose datalab.my-domain.net and *.kub.my-domain.net.
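The choice of record type can be summarized by a small helper. This is a sketch with a hypothetical function name (getDnsRecordType) and deliberately coarse checks; it is only meant to mirror the three cases described above.

```typescript
type DnsRecordType = "A" | "AAAA" | "CNAME";

function getDnsRecordType(externalAddress: string): DnsRecordType {
    // IPv4: four dot-separated decimal numbers between 0 and 255.
    const parts = externalAddress.split(".");
    const isIpv4 =
        parts.length === 4 &&
        parts.every(part => /^\d{1,3}$/.test(part) && Number(part) <= 255);
    if (isIpv4) {
        return "A";
    }

    // IPv6 (coarse check): hexadecimal groups separated by colons.
    const isIpv6 =
        externalAddress.indexOf(":") !== -1 && /^[0-9a-fA-F:.]+$/.test(externalAddress);
    if (isIpv6) {
        return "AAAA";
    }

    // Anything else is treated as a domain name, as on AWS.
    return "CNAME";
}

for (const address of ["203.0.113.7", "2001:db8::1", "xxx.elb.eu-west-1.amazonaws.com"]) {
    console.log(address, getDnsRecordType(address));
}
```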

SSL

In this section we will obtain a TLS certificate issued by LetsEncrypt using the certbot command-line tool, then get our ingress controller to use it.

brew install certbot #On Mac, lookup how to install certbot for your OS

# Because we need a wildcard certificate we have to complete the DNS challenge.
sudo certbot certonly --manual --preferred-challenges dns

# When asked for the domains you wish to obtain a certificate for, enter:
#   onyxia.my-domain.net *.lab.my-domain.net

The obtained certificate needs to be renewed every three months.

You may need to delegate your DNS servers to one of the supported DNS service providers.

Now we want to create a Kubernetes secret containing our newly obtained certificate:

DOMAIN=my-domain.net
sudo kubectl create secret tls onyxia-tls \
    -n ingress-nginx \
    --key /etc/letsencrypt/live/onyxia.$DOMAIN/privkey.pem \
    --cert /etc/letsencrypt/live/onyxia.$DOMAIN/fullchain.pem

Lastly, we want to tell our ingress controller to use this TLS certificate, to do so run:

kubectl edit deployment ingress-nginx-controller -n ingress-nginx

This command will open your configured text editor, go to line 56 and add:

      - --default-ssl-certificate=ingress-nginx/onyxia-tls

If you are on a Mac or Windows computer you can install Docker Desktop then enable Kubernetes.

Docker desktop isn't available on Linux, you can use Kind instead.

Port Forwarding

DNS

Let's assume you own the domain name my-domain.net, for the rest of the guide you should replace my-domain.net by a domain you actually own.

Get your internet box routable IP and create the following DNS records:

onyxia.my-domain.net A <YOUR_IP>
*.lab.my-domain.net  A <YOUR_IP>

If you have a DDNS domain you can create CNAME records instead, for example:

onyxia.my-domain.net CNAME jhon-doe-home.ddns.net.
*.lab.my-domain.net  CNAME jhon-doe-home.ddns.net.

https://onyxia.my-domain.net will be the URL for your instance of Onyxia.

The URLs of the services created by Onyxia are going to look like: https://xxx.lab.my-domain.net

You can customise "onyxia" and "lab" to your liking, for example you could choose datalab.my-domain.net and *.kub.my-domain.net.

SSL

In this section we will obtain a TLS certificate issued by LetsEncrypt using the certbot command-line tool.

brew install certbot #On Mac, lookup how to install certbot for your OS

# Because we need a wildcard certificate we have to complete the DNS challenge.
sudo certbot certonly --manual --preferred-challenges dns

# When asked for the domains you wish to obtain a certificate for, enter:
#   onyxia.my-domain.net *.lab.my-domain.net

The obtained certificate needs to be renewed every three months.

You may need to delegate your DNS servers to one of the supported DNS service providers.

Now we want to create a Kubernetes secret containing our newly obtained certificate:

kubectl create namespace ingress-nginx
DOMAIN=my-domain.net
sudo kubectl create secret tls onyxia-tls \
    -n ingress-nginx \
    --key /etc/letsencrypt/live/onyxia.$DOMAIN/privkey.pem \
    --cert /etc/letsencrypt/live/onyxia.$DOMAIN/fullchain.pem

Ingress controller

We'll install ingress-nginx in our cluster.

cat << EOF > ./ingress-nginx-values.yaml
controller:
  extraArgs:
    default-ssl-certificate: "ingress-nginx/onyxia-tls"
EOF

helm install ingress-nginx ingress-nginx \
    --repo https://kubernetes.github.io/ingress-nginx \
    --namespace ingress-nginx \
    -f ./ingress-nginx-values.yaml

Installing Onyxia using helm

In this section we assume that:

You have a Kubernetes cluster and kubectl configured
onyxia.my-domain.net and *.lab.my-domain.net are pointing to your cluster's external address, my-domain.net being a domain that you own. You can customise "onyxia" and "lab" to your liking, for example you could choose datalab.my-domain.net and *.kub.my-domain.net.
You have an ingress controller configured with a default TLS certificate for *.lab.my-domain.net and onyxia.my-domain.net.

As of today the default service catalog will only work with ingress-nginx.

This will be addressed in the near future.

Use kubectl get pods to see if your pods are up and ready.

(Optional) Make sure that your cluster is ready for Onyxia

To make sure that your Kubernetes cluster is correctly configured let's deploy a test web app on it before deploying Onyxia.

DOMAIN=my-domain.net

cat << EOF > ./test-spa-values.yaml
ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: nginx
  hosts:
    - host: test-spa.lab.$DOMAIN
EOF

helm repo add etalab https://etalab.github.io/helm-charts
helm install test-spa etalab/keycloakify-demo-app -f test-spa-values.yaml
echo "Navigate to https://test-spa.lab.$DOMAIN, see the Hello World"
helm uninstall test-spa

helm repo add onyxia https://inseefrlab.github.io/onyxia

DOMAIN=my-domain.net

cat << EOF > ./onyxia-values.yaml
ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: nginx
  hosts:
    - host: onyxia.$DOMAIN
api:
  regions: 
    [
       {
          "id":"demo",
          "name":"Demo",
          "description":"This is a demo region, feel free to try Onyxia !",
          "services":{
             "type":"KUBERNETES",
             "singleNamespace":true,
             "namespacePrefix":"user-",
             "usernamePrefix":"oidc-",
             "groupNamespacePrefix":"projet-",
             "groupPrefix":"oidc-",
             "authenticationMode":"serviceAccount",
             "expose":{
                "domain":"lab.$DOMAIN"
             },
             "monitoring":{
                "URLPattern":"todo"
             },
             "initScript":"https://inseefrlab.github.io/onyxia/onyxia-init.sh"
          }
       }
    ]
EOF

helm install onyxia onyxia/onyxia -f onyxia-values.yaml

You can now access https://onyxia.my-domain.net and start services. Congratulations! 🥳

Enabling user authentication

At the moment there is no authentication process: everyone can access our platform and start services.

Let's set up Keycloak to enable users to create accounts and log in to our Onyxia.

Notes if you already have a Keycloak server

If you already have a Keycloak server it is up to you to pick from this guide what is relevant to you.

You probably want to enable Terms and Conditions as required actions.

 web:
  env:
    # Available env are documented here: https://github.com/InseeFrLab/onyxia-web/blob/main/.env
    KEYCLOAK_URL: https://auth.lab.my-domain.net/auth
    KEYCLOAK_CLIENT_ID: onyxia
    KEYCLOAK_REALM: datalab
    JWT_EMAIL_CLAIM: email
    JWT_FAMILY_NAME_CLAIM: family_name
    JWT_FIRST_NAME_CLAIM: given_name
    JWT_USERNAME_CLAIM: preferred_username
    JWT_LOCALE_CLAIM: locale

For deploying our Keycloak we use codecentric's helm chart.

helm repo add codecentric https://codecentric.github.io/helm-charts

DOMAIN=my-domain.net
POSTGRESQL_PASSWORD=xxxxx #Replace by a strong password, you will never need it.
# Credentials for logging to https://auth.lab.$DOMAIN/auth
KEYCLOAK_USER=admin
KEYCLOAK_PASSWORD=yyyyyy 

cat << EOF > ./keycloak-values.yaml
image:
  # We use the legacy variant of the image until codecentric updates its helm chart
  tag: "19.0.3-legacy"
replicas: 1
extraInitContainers: |
  - name: realm-ext-provider
    image: curlimages/curl
    imagePullPolicy: IfNotPresent
    command:
      - sh
    args:
      - -c
      - |
        # There is a custom theme published alongside every onyxia-web release
        # The version of the Keycloak theme and the version of onyxia-web don't need 
        # to match but you should update the theme from time to time.  
        # https://github.com/InseeFrLab/onyxia-web/releases
        curl -L -f -S -o /extensions/onyxia.jar https://github.com/InseeFrLab/onyxia-web/releases/download/v2.29.4/keycloak-theme.jar
    volumeMounts:
      - name: extensions
        mountPath: /extensions
extraVolumeMounts: |
  - name: extensions
    mountPath: /opt/jboss/keycloak/standalone/deployments
extraVolumes: |
  - name: extensions
    emptyDir: {}
extraEnv: |
  - name: KEYCLOAK_USER
    value: $KEYCLOAK_USER
  - name: KEYCLOAK_PASSWORD
    value: $KEYCLOAK_PASSWORD
  - name: JGROUPS_DISCOVERY_PROTOCOL
    value: kubernetes.KUBE_PING
  - name: KUBERNETES_NAMESPACE
    valueFrom:
     fieldRef:
       apiVersion: v1
       fieldPath: metadata.namespace
  - name: KEYCLOAK_STATISTICS
    value: "true"
  - name: CACHE_OWNERS_COUNT
    value: "2"
  - name: CACHE_OWNERS_AUTH_SESSIONS_COUNT
    value: "2"
  - name: PROXY_ADDRESS_FORWARDING
    value: "true"
  - name: JAVA_OPTS
    value: >-
      -Dkeycloak.profile=preview -XX:+UseContainerSupport -XX:MaxRAMPercentage=50.0 -Djava.net.preferIPv4Stack=true -Djava.awt.headless=true 
ingress:
  enabled: true
  servicePort: http
  annotations:
    kubernetes.io/ingress.class: nginx
    ## Resolve HTTP 502 error using ingress-nginx:
    ## See https://www.ibm.com/support/pages/502-error-ingress-keycloak-response
    nginx.ingress.kubernetes.io/proxy-buffer-size: 128k
  rules:
    - host: "auth.lab.$DOMAIN"
      paths:
        - path: /
          pathType: Prefix
  tls:
    - hosts:
        - auth.lab.$DOMAIN
postgresql:
  postgresqlPassword: $POSTGRESQL_PASSWORD
EOF

helm install keycloak codecentric/keycloak -f keycloak-values.yaml

You can now open the administration console at https://auth.lab.my-domain.net and log in using the credentials you have defined with KEYCLOAK_USER and KEYCLOAK_PASSWORD.

Create a realm called "datalab" (or something else), go to Realm settings
1. On the tab General
  1. User Profile Enabled: On
2. On the tab login
  1. User registration: On
  2. Forgot password: On
  3. Remember me: On
3. On the tab email, we give an example with AWS SES. If you don't have an SMTP server at hand you can skip this by going to Authentication (on the left panel) -> Tab Required Actions -> Uncheck "set as default action" Verify Email. Be aware that with email verification disabled, anyone will be able to sign up to your service.
  1. From: [email protected]
  2. Host: email-smtp.us-east-2.amazonaws.com
  3. Port: 465
  4. Authentication: enabled
  5. Username: **************
  6. Password: ***************************************
  7. When clicking "save" you'll be asked for a test email; you have to provide one that corresponds to a pre-existing user or you will get a silent error and the credentials won't be saved.
4. On the tab Themes
  1. Login theme: onyxia-web (you can also select the login theme on a per client basis)
  2. Email theme: onyxia-web
5. On the tab Localization
  1. Internationalization: Enabled
  2. Supported locales: <Select the languages you wish to support>
Create a client called "onyxia"
1. Root URL: https://onyxia.my-domain.net/
2. Valid redirect URIs: https://onyxia.my-domain.net/*
3. Web origins: *
4. Login theme: onyxia-web
In Authentication (on the left panel) -> Tab Required Actions enable and set as default action Terms and Conditions.

Now you want to ensure that the usernames chosen by your users comply with Onyxia's requirement (only alphanumerical characters) and define a list of email domains allowed to register to your service.

Now you can edit the file as suggested in the following DIFF snippet. Be mindful that in this example we only allow emails @gmail.com and @hotmail.com to register; you will want to edit that.

{
  "attributes": [
    {
      "name": "username",
      "displayName": "${username}",
      "validations": {
        "length": {
          "min": 3,
          "max": 255
        },
+       "pattern": {
+         "error-message": "${alphanumericalCharsOnly}",
+         "pattern": "^[a-zA-Z0-9]*$"
+       },
        "username-prohibited-characters": {}
      }
    },
    {
      "name": "email",
      "displayName": "${email}",
      "validations": {
        "email": {},
+       "pattern": {
+         "pattern": "^[^@]+@([^.]+\\.)*((gmail\\.com)|(hotmail\\.com))$"
+       },
        "length": {
          "max": 255
        }
      }
    },
    {
      "name": "firstName",
      "displayName": "${firstName}",
      "required": {
        "roles": [
          "user"
        ]
      },
      "permissions": {
        "view": [
          "admin",
          "user"
        ],
        "edit": [
          "admin",
          "user"
        ]
      },
      "validations": {
        "length": {
          "max": 255
        },
        "person-name-prohibited-characters": {}
      }
    },
    {
      "name": "lastName",
      "displayName": "${lastName}",
      "required": {
        "roles": [
          "user"
        ]
      },
      "permissions": {
        "view": [
          "admin",
          "user"
        ],
        "edit": [
          "admin",
          "user"
        ]
      },
      "validations": {
        "length": {
          "max": 255
        },
        "person-name-prohibited-characters": {}
      }
    }
  ]
}
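
To get a feel for what the two patterns above accept, here is a minimal sketch that replays them with grep -E. It assumes that POSIX extended regular expressions behave like Keycloak's validator for these simple patterns; the helper name and the sample usernames and addresses are made up for illustration.

```shell
# Patterns copied from the user profile JSON above (JSON escaping removed).
USERNAME_PATTERN='^[a-zA-Z0-9]*$'
EMAIL_PATTERN='^[^@]+@([^.]+\.)*((gmail\.com)|(hotmail\.com))$'

# matches PATTERN VALUE: succeed when VALUE matches the extended regex PATTERN.
matches() {
  printf '%s\n' "$2" | grep -Eq "$1"
}

# Hypothetical usernames: only letters and digits should pass.
for candidate in alice42 john.doe; do
  if matches "$USERNAME_PATTERN" "$candidate"; then
    echo "username $candidate: accepted"
  else
    echo "username $candidate: rejected"
  fi
done

# Hypothetical addresses: subdomains of the allowed domains pass as well.
for candidate in alice@gmail.com bob@mail.hotmail.com carol@example.org; do
  if matches "$EMAIL_PATTERN" "$candidate"; then
    echo "email $candidate: accepted"
  else
    echo "email $candidate: rejected"
  fi
done
```

The username pattern on its own also matches an empty string; it is the separate length validation (min 3) that rules out usernames that are too short.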

Now that our Keycloak server is fully configured, we just need to update our Onyxia deployment to let it know about it.

Update the onyxia-values.yaml file that you created previously. Don't forget to replace all occurrences of my-domain.net with your actual domain.

Also remember to replace the SSPCloud terms of service with your own. CORS should be enabled on those .md links (Access-Control-Allow-Origin: *).

+serviceAccount:
+  clusterAdmin: true
 ingress:
   enabled: true
   annotations:
     kubernetes.io/ingress.class: nginx
   hosts:
     - host: onyxia.my-domain.net
 web:
+  env:
+    TERMS_OF_SERVICES: |
+      { 
+        "en": "https://www.sspcloud.fr/tos_en.md", 
+        "fr": "https://www.sspcloud.fr/tos_fr.md" 
+      }
 api:
   env:
+    authentication.mode: openidconnect
+    oidc.issuer-uri: "https://auth.lab.my-domain.net/auth/realms/datalab"
+    oidc.clientID: "onyxia"
+    oidc.audience: "onyxia"
   regions:
     [
        {
           "id":"demo",
           "name":"Demo",
           "description":"This is a demo region, feel free to try Onyxia !",
           "services":{
              "type":"KUBERNETES",
-             "singleNamespace": true,
+             "singleNamespace": false,
              "namespacePrefix":"user-",
              "usernamePrefix":"oidc-",
              "groupNamespacePrefix":"projet-",
              "groupPrefix":"oidc-",
              "authenticationMode":"serviceAccount",
              "expose":{
                 "domain":"lab.my-domain.net"
              },
              "monitoring":{
                 "URLPattern":"todo"
              },
              "initScript":"https://inseefrlab.github.io/onyxia/onyxia-init.sh"
           }
        }
     ]
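
Before applying the configuration, it can be worth checking that your terms of service links really send the CORS header mentioned above. The following is a minimal sketch: the function name is hypothetical, and the sample headers fed to it are made up so that the snippet runs without network access.

```shell
# Read HTTP response headers on stdin; succeed if they allow any origin.
allows_any_origin() {
  tr -d '\r' | grep -Eiq '^access-control-allow-origin:[[:space:]]*\*[[:space:]]*$'
}

# Made-up sample response, standing in for the output of "curl -sI".
if printf 'HTTP/1.1 200 OK\r\nAccess-Control-Allow-Origin: *\r\n\r\n' | allows_any_origin; then
  echo "CORS header present"
else
  echo "CORS header missing"
fi

# Against a live link (needs network access):
#   curl -sI -H "Origin: https://onyxia.my-domain.net" https://www.sspcloud.fr/tos_en.md | allows_any_origin
```

Sending an Origin header with curl is a precaution, since some servers only emit CORS headers on cross-origin requests.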

Now that you have updated onyxia-values.yaml, upgrade the Onyxia release so that it restarts with the new configuration.

helm upgrade onyxia inseefrlab/onyxia -f onyxia-values.yaml

Your users should now be able to create an account, log in, and start services in their own Kubernetes namespace.
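
If login does not work, a first thing to check is that the issuer URI given in oidc.issuer-uri is reachable. The sketch below builds the standard OpenID Connect discovery URL for that issuer; the function name is hypothetical, and the commented curl call assumes network access and a compact JSON response.

```shell
# Build the OpenID Connect discovery URL for an issuer (a trailing slash is tolerated).
discovery_url() {
  printf '%s/.well-known/openid-configuration\n' "${1%/}"
}

ISSUER="https://auth.lab.my-domain.net/auth/realms/datalab"
echo "Discovery document: $(discovery_url "$ISSUER")"

# With network access, the "issuer" field returned should equal $ISSUER:
#   curl -fsS "$(discovery_url "$ISSUER")" | grep -o '"issuer":"[^"]*"'
```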

S3 Storage

Create a Keycloak client for accessing MinIO

Before configuring MinIO, let's create a new Keycloak client in the existing "datalab" realm.

Create a client called "minio".

Client ID: minio
Client Protocol: openid-connect
Root URL: https://minio.lab.my-domain.net/

Complete the content of client "minio" with the following values.

Access Type: confidential
Valid Redirect URIs (two values are required): https://minio.lab.my-domain.net/* and https://minio-console.lab.my-domain.net/*
Web origins: *

Save the content; a new tab called Credentials should appear. Navigate to the Credentials tab and copy the secret value for the next section.

Navigate to Mappers tab and create a protocol Mapper.

Name: policy
Mapper Type: Hardcoded claim

Complete the content of Mapper "policy" with the following values.

Token Claim Name: policy
Claim value: stsonly
Add to ID token: on
Add to access token: on
Add to userinfo: on

Install MinIO

Replace COPY_SECRET_FROM_KEYCLOAK_MINIO_CLIENT with the secret value defined in the "minio" Keycloak client (see previous section).

helm repo add minio https://charts.min.io/
 
DOMAIN=my-domain.net

cat << EOF > ./minio-values.yaml
## replicas: 16
ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: nginx
  path: /
  hosts:
    - minio.lab.$DOMAIN
  tls:
    - hosts:
        - minio.lab.$DOMAIN
consoleIngress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: nginx
  path: /
  hosts:
    - minio-console.lab.$DOMAIN
  tls:
    - hosts:
        - minio-console.lab.$DOMAIN
environment:
  MINIO_BROWSER_REDIRECT_URL: https://minio-console.lab.$DOMAIN
oidc:
  enabled: true
  configUrl: "https://auth.lab.$DOMAIN/auth/realms/datalab/.well-known/openid-configuration"
  clientId: "minio"
  claimName: "policy"
  scopes: "openid,profile,email"
  redirectUri: "https://minio-console.lab.$DOMAIN/oauth_callback"
  claimPrefix: ""
  comment: ""
  clientSecret: COPY_SECRET_FROM_KEYCLOAK_MINIO_CLIENT
policies:
  - name: stsonly
    statements:
      - resources:
          # The backslash stops the shell from expanding the MinIO policy variable in this heredoc.
          - 'arn:aws:s3:::oidc-\${jwt:preferred_username}'
          - 'arn:aws:s3:::oidc-\${jwt:preferred_username}/*'
        actions:
          - "s3:*"
EOF

helm install minio minio/minio -f minio-values.yaml

MinIO is now deployed and accessible at the console URL (https://minio-console.lab.my-domain.net).

By default, there are 16 MinIO containers running. If this number is too large for your Kubernetes cluster, you can limit it by configuring the 'replicas' key.
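
The values file above is written through an unquoted heredoc, so the shell substitutes $DOMAIN but would also try to expand any other ${...} expression unless its dollar sign is escaped. The following minimal sketch shows the behaviour on a hypothetical scratch file (it does not touch minio-values.yaml):

```shell
DOMAIN=my-domain.net
demo_file=$(mktemp)   # hypothetical scratch file, used only for this demonstration

# $DOMAIN is expanded by the shell; the escaped \${...} is written literally.
cat << EOF > "$demo_file"
host: minio.lab.$DOMAIN
resource: 'arn:aws:s3:::oidc-\${jwt:preferred_username}'
EOF

cat "$demo_file"
```

Running grep -n 'jwt:' minio-values.yaml is a quick way to confirm that the policy resources in the generated file still contain the literal ${jwt:preferred_username} placeholder.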

Create a Keycloak Client for Onyxia/MinIO

From the "datalab" realm, create a client called "onyxia-minio".

Client ID: onyxia-minio
Client Protocol: openid-connect
Root URL: https://onyxia.my-domain.net/

Complete the content of client "onyxia-minio" with the following values.

Access Type: public
Valid Redirect URIs: https://onyxia.my-domain.net/*
Web origins: *
Advanced Settings
1. Access Token Lifespan: 7 days
2. Client Session Idle: 7 days
3. Client Session Max: 7 days

Save the content and navigate to Mappers tab and create two protocol Mappers.

Create the first Mapper called "policy".

Name: policy
Mapper Type: Hardcoded claim
Token Claim Name: policy
Claim value: stsonly
Add to ID token: on
Add to access token: on
Add to userinfo: on

Create the second Mapper called "audience-minio".

Name: audience-minio
Mapper Type: Audience
Included Custom Audience: minio
Add to ID token: on
Add to access token: on

Update Onyxia

Update onyxia-values.yaml as shown in the following diff so that Onyxia knows about MinIO, then upgrade the release.

 serviceAccount:
   clusterAdmin: true
 ingress:
   enabled: true
   annotations:
     kubernetes.io/ingress.class: nginx
   hosts:
     - host: onyxia.my-domain.net
 web:
   env:
     KEYCLOAK_REALM: datalab
     KEYCLOAK_URL: https://auth.lab.my-domain.net/auth
     TERMS_OF_SERVICES: |
       { "en": "https://www.sspcloud.fr/tos_en.md", "fr": "https://www.sspcloud.fr/tos_fr.md" }
 api:
   env:
     authentication.mode: openidconnect
     keycloak.realm: datalab
     keycloak.auth-server-url: https://auth.lab.my-domain.net/auth
   regions:
     [
        {
           "id":"demo",
           "name":"Demo",
           "description":"This is a demo region, feel free to try Onyxia !",
           "services":{
              "type":"KUBERNETES",
              "singleNamespace": false,
              "namespacePrefix":"user-",
              "usernamePrefix":"oidc-",
              "groupNamespacePrefix":"projet-",
              "groupPrefix":"oidc-",
              "authenticationMode":"admin",
              "expose":{
                 "domain":"lab.my-domain.net"
              },
              "monitoring":{
                 "URLPattern":"todo"
              },
              "cloudshell":{
                 "catalogId":"inseefrlab-helm-charts-datascience",
                 "packageName":"cloudshell"
              },
              "initScript":"https://inseefrlab.github.io/onyxia/onyxia-init.sh"
           },
           "data":{
              "S3":{
-                "URL":"todo",
+                "type": "minio",
+                "URL": "https://minio.lab.my-domain.net",
+                "region": "us-east-1",
+                "bucketPrefix": "oidc-",
+                "groupBucketPrefix": "projet-",
+                "bucketClaim": "preferred_username",
+                "defaultDurationSeconds": 86400,
+                "keycloakParams":
+                {
+                      "URL": "https://auth.lab.my-domain.net/auth",
+                      "realm": "datalab",
+                      "clientId": "onyxia-minio"
+                },
+                "acceptBucketCreation": true,
                 "monitoring":{
                    "URLPattern":"minio"
                 }
              }
           },
           "auth":{
              "type":"openidconnect"
           },
           "location":{
              "lat":48.8164,
              "long":2.3174,
              "name":"Montrouge (France)"
           }
        }
     ]

helm upgrade onyxia inseefrlab/onyxia -f onyxia-values.yaml
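
The bucketPrefix and bucketClaim settings have to line up with the "stsonly" policy defined in minio-values.yaml. The sketch below spells out that relationship, assuming Onyxia names a user's personal bucket by concatenating bucketPrefix with the value of the preferred_username claim; the function names and the username "alice" are made up.

```shell
BUCKET_PREFIX="oidc-"   # value of "bucketPrefix" in the region configuration

# Print the bucket name expected for a given preferred_username.
user_bucket() {
  printf '%s%s\n' "$BUCKET_PREFIX" "$1"
}

# Print the two resource ARNs that the "stsonly" policy grants for that user.
user_policy_resources() {
  bucket=$(user_bucket "$1")
  printf 'arn:aws:s3:::%s\narn:aws:s3:::%s/*\n' "$bucket" "$bucket"
}

user_policy_resources alice   # "alice" is a made-up username
```

If you change bucketPrefix in the region configuration, the prefix in the MinIO policy resources presumably needs the same change, otherwise users would be sent to a bucket they are not allowed to access.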

Vault

helm repo add hashicorp https://helm.releases.hashicorp.com
 
DOMAIN=my-domain.net

cat << EOF > ./vault-values.yaml
server:
  dev:
    enabled: true
    # Set VAULT_DEV_ROOT_TOKEN_ID value
    devRootToken: "root"
  ingress:
    enabled: true
    annotations:
      kubernetes.io/ingress.class: nginx
    hosts:
      - host: "vault.lab.$DOMAIN"
    tls:
      - hosts:
          - vault.lab.$DOMAIN
EOF

helm install vault hashicorp/vault -f vault-values.yaml

Create a client called "vault"

Root URL: https://vault.lab.my-domain.net/
Valid redirect URIs: https://vault.lab.my-domain.net/*
Web origins: *

TODO: refer to the legacy documentation.