Multi-cluster multi-primary Istio on AWS EKS

Recently I was setting up Istio in a multi-cluster configuration on EKS, following the Install Multi-Primary on different networks guide. Everything seemed to work (no errors in the logs) until I reached the verification step, where requests never crossed over to the other cluster: in CLUSTER1 I always got a response from Hello version: v1, instance: helloworld-v1-86f77cd7bd-cpxhv, while in CLUSTER2 always from Hello version: v2, instance: helloworld-v2-758dd55874-6x4t8.

I also implemented the workaround described below as a shell script running in a CronJob; check it out here.

This turns out to be a known problem with EKS: EKS load balancers expose hostnames instead of IP addresses, and Istio does not support hostnames as cross-network gateway addresses.

Workaround: manually resolve the IP addresses and add them to the istio ConfigMap (namespace: istio-system).

1. Figure out the eastwestgateway’s hostname

➜  kubectl get service -n istio-system --context=${CLUSTER1} istio-eastwestgateway -o yaml

At the bottom, look for the status section, which will look something like this:

...
status:
  loadBalancer:
    ingress:
    - hostname: a5e21e07fd1a64a518ab6c02b4dfb9f5-826145575.us-west-2.elb.amazonaws.com

Also take note of the topology.istio.io/network label, e.g. network1. We will use it when configuring the ConfigMap in the other cluster.
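
If you only need those two values, jsonpath queries along these lines should print them directly (a sketch; the first expression assumes a single load balancer ingress entry, and the second relies on kubectl's escaping of dots in label keys):

➜  kubectl get service -n istio-system --context=${CLUSTER1} istio-eastwestgateway -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'
➜  kubectl get service -n istio-system --context=${CLUSTER1} istio-eastwestgateway -o jsonpath="{.metadata.labels['topology\.istio\.io/network']}"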

2. Figure out the corresponding IP address

➜  dig +short a5e21e07fd1a64a518ab6c02b4dfb9f5-826145575.us-west-2.elb.amazonaws.com
1.2.3.4
1.2.3.5

3. Update the istio ConfigMap in the other cluster

First get the existing ConfigMap:

➜  kubectl get configmaps -n istio-system istio -o yaml --context=${CLUSTER2} > istio_configmap_cluster2.yaml

Look for data.meshNetworks, e.g.:

apiVersion: v1
data:
  mesh: |-
    defaultConfig:
      discoveryAddress: istiod.istio-system.svc:15012
      meshId: mesh
      proxyMetadata: {}
      tracing:
        zipkin:
          address: zipkin.istio-system:9411
    enablePrometheusMerge: true
    rootNamespace: istio-system
    trustDomain: cluster.local
  meshNetworks: 'networks: {}'
kind: ConfigMap
...

Extend data.meshNetworks with the information from the previous steps:

apiVersion: v1
data:
  mesh: |-
    defaultConfig:
      discoveryAddress: istiod.istio-system.svc:15012
      meshId: mesh
      proxyMetadata: {}
      tracing:
        zipkin:
          address: zipkin.istio-system:9411
    enablePrometheusMerge: true
    rootNamespace: istio-system
    trustDomain: cluster.local
  meshNetworks: |-
    networks:
      network1:
        endpoints:
          - fromRegistry: cluster1
        gateways:
          - address: 1.2.3.4
            port: 15443
          - address: 1.2.3.5
            port: 15443
kind: ConfigMap
...

Save the file and apply the changes:

➜  kubectl apply --context=${CLUSTER2} -f istio_configmap_cluster2.yaml
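
Alternatively, if you prefer not to round-trip the exported file, the same change can be applied as a merge patch on just the meshNetworks key. This is a sketch, assuming your kubectl version accepts the -p patch as YAML (it is converted to JSON internally); the addresses are the ones resolved in step 2:

➜  kubectl patch configmap istio -n istio-system --context=${CLUSTER2} --type merge -p '
data:
  meshNetworks: |-
    networks:
      network1:
        endpoints:
          - fromRegistry: cluster1
        gateways:
          - address: 1.2.3.4
            port: 15443
          - address: 1.2.3.5
            port: 15443
'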

4. Repeat the same steps for the other cluster

Warnings

  • According to this comment, we should list all networks in all clusters (including the cluster’s own network)
  • Hardcoding the IPs like this will break if the ELB gets new IPs. In the thread there is a great discussion about whether and when this can happen
    • One possible solution is a CronJob that periodically re-resolves the hostname and updates the IPs; a sketch follows below
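
To make that last point concrete, here is a minimal sketch of the update logic; the full version is the CronJob script mentioned at the top of the post. It assumes it runs somewhere with kubectl access to both contexts (an in-cluster CronJob would use a ServiceAccount and a kubeconfig for the remote cluster instead), and the network/registry values are the examples from the steps above:

#!/usr/bin/env bash
set -euo pipefail

REMOTE_CONTEXT="${CLUSTER1}"   # cluster that owns the east-west gateway
LOCAL_CONTEXT="${CLUSTER2}"    # cluster whose istio ConfigMap we patch
REMOTE_NETWORK="network1"      # value of the topology.istio.io/network label
REMOTE_REGISTRY="cluster1"     # cluster name used at install time

# 1. Hostname of the remote east-west gateway
host=$(kubectl get service istio-eastwestgateway -n istio-system \
  --context="${REMOTE_CONTEXT}" \
  -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')

# 2. Current IPs behind that hostname
ips=$(dig +short "${host}")
if [ -z "${ips}" ]; then
  echo "could not resolve ${host}" >&2
  exit 1
fi

# 3. Rebuild data.meshNetworks with one gateway entry per IP
patch="data:
  meshNetworks: |-
    networks:
      ${REMOTE_NETWORK}:
        endpoints:
          - fromRegistry: ${REMOTE_REGISTRY}
        gateways:
"
while read -r ip; do
  patch+="          - address: ${ip}
            port: 15443
"
done <<< "${ips}"

# 4. Apply it as a merge patch (kubectl should accept the YAML form of -p),
#    leaving the rest of the ConfigMap untouched
kubectl patch configmap istio -n istio-system --context="${LOCAL_CONTEXT}" \
  --type merge -p "${patch}"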