Bridge Channel send: no opened channel on Kubernetes with HAproxy

I inherited a setup on a bare-metal Kubernetes cluster that I have to work with. After some updates and redeploys, I am running into the dreaded “only 2 participants can hear/see each other” issue, along with bad video quality.
The setup looks like this:

bare-metal Kubernetes cluster
MetalLB
nginx-ingress with cert-manager
HAproxy for jitsi load balancing only
UDP ports are being allowed/forwarded by the firewall
NodePort for each jvb shard with different UDP port (HAproxy is aware)
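
To illustrate the last point, each jvb instance is reachable through its own NodePort service pinned to a distinct UDP port. A single one of them looks roughly like this (a trimmed-down sketch; the selector labels are assumptions on my part, the names and ports match the service list further down):

apiVersion: v1
kind: Service
metadata:
  name: shard-0-jvb-0
  namespace: jitsi
spec:
  type: NodePort
  selector:
    app: jvb                                            # assumed label on the jvb pods
    statefulset.kubernetes.io/pod-name: shard-0-jvb-0   # assumed: pins the service to exactly one pod
  ports:
  - name: media
    protocol: UDP
    port: 30300
    targetPort: 30300
    nodePort: 30300   # forwarded by the firewall, incremented per jvb instance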

This is the error in the console:

2021-12-23T10:37:24.084Z [modules/RTC/BridgeChannel.js] <_send>:  Bridge Channel send: no opened channel. Logger.js:154:22
    r Logger.js:154
    _send BridgeChannel.js:423
    sendMessage BridgeChannel.js:199
    sendChannelMessage RTC.js:878
    sendEndpointMessage JitsiConference.js:2838
    sendMessage JitsiConference.js:2891
    e2eping JitsiConference.js:415
    sendRequest e2eping.js:92
2021-12-23T10:37:24.085Z [JitsiConference.js] <7273/jc.prototype._init/this.e2eping<>:  Failed to send E2E ping request or response. undefined Logger.js:154:22
Firefox can’t establish a connection to the server at wss://meet.my.public.fqdn.de/colibri-ws//cb1f3f534297507a/37011691?pwd=6lf5mmns2msvv8m4iv5n0c1p66. BridgeChannel.js:83:19
2021-12-23T10:37:24.166Z [modules/RTC/BridgeChannel.js] <7273/_handleChannel/e.onclose>:  Channel closed by server Logger.js:154:22
2021-12-23T10:37:24.166Z [modules/RTC/BridgeChannel.js] <7273/_handleChannel/e.onclose>:  Channel closed: 1006 Logger.js:154:22
GET wss://meet.my.public.fqdn.de/colibri-ws//cb1f3f534297507a/37011691?pwd=6lf5mmns2msvv8m4iv5n0c1p66
[HTTP/1.1 502 Bad Gateway 30181ms]

The web service is showing these errors in parallel:

10.5.48.105 - - [23/Dec/2021:11:51:13 +0100] "GET / HTTP/1.1" 200 15930 "-" "kube-probe/1.19"
2021/12/23 11:51:20 [error] 259#259: *9068 cb1f3f534297507a could not be resolved (110: Operation timed out), client: 10.42.3.41, server: _, request: "GET /colibri-ws//cb1f3f534297507a/37011691?pwd=6lf5mmns2msvv8m4iv5n0c1p66 HTTP/1.1", host: "meet.my.public.fqdn.de"
10.42.3.41 - - [23/Dec/2021:11:51:20 +0100] "GET /colibri-ws//cb1f3f534297507a/37011691?pwd=6lf5mmns2msvv8m4iv5n0c1p66 HTTP/1.1" 502 161 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:95.0) Gecko/20100101 Firefox/95.0"
10.42.3.41 - - [23/Dec/2021:11:51:21 +0100] "POST /http-bind?room=ashamedlowsalarmindeed HTTP/1.1" 200 298 "-" "jitsi-meet/146 CFNetwork/1327.0.4 Darwin/21.2.0"
10.5.48.105 - - [23/Dec/2021:11:51:23 +0100] "GET / HTTP/1.1" 200 15930 "-" "kube-probe/1.19"
10.42.3.59 - - [23/Dec/2021:11:51:24 +0100] "GET /sounds/outgoingRinging.wav HTTP/1.1" 200 132344 "-" "python-requests/2.23.0"
10.42.3.59 - - [23/Dec/2021:11:51:29 +0100] "GET /xmpp-websocket?room=pdkm1-action HTTP/1.1" 200 163 "https://meet.my.public.fqdn.de/pdkm1-action" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36"
2021/12/23 11:51:29 [error] 259#259: *9074 64946b208f9e623f could not be resolved (110: Operation timed out), client: 10.42.3.41, server: _, request: "GET /colibri-ws//64946b208f9e623f/6254ce95?pwd=1g0lbpbi1r48q8q5ree300mk7c HTTP/1.1", host: "meet.my.public.fqdn.de"
10.42.3.41 - - [23/Dec/2021:11:51:29 +0100] "GET /colibri-ws//64946b208f9e623f/6254ce95?pwd=1g0lbpbi1r48q8q5ree300mk7c HTTP/1.1" 502 564 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36 Edg/96.0.1054.62"
10.42.3.59 - - [23/Dec/2021:11:51:31 +0100] "POST /http-bind?room=ashamedlowsalarmindeed HTTP/1.1" 200 298 "-" "jitsi-meet/146 CFNetwork/1327.0.4 Darwin/21.2.0"
2021/12/23 11:51:31 [error] 259#259: *9079 64946b208f9e623f could not be resolved (110: Operation timed out), client: 10.42.3.59, server: _, request: "GET /colibri-ws//64946b208f9e623f/a425d923?pwd=6s9qlmc5fqu75csdj5ujtbgshi HTTP/1.1", host: "meet.my.public.fqdn.de"
10.42.3.59 - - [23/Dec/2021:11:51:31 +0100] "GET /colibri-ws//64946b208f9e623f/a425d923?pwd=6s9qlmc5fqu75csdj5ujtbgshi HTTP/1.1" 502 564 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36"
10.5.48.105 - - [23/Dec/2021:11:51:33 +0100] "GET / HTTP/1.1" 200 15930 "-" "kube-probe/1.19"
10.42.3.41 - - [23/Dec/2021:11:51:41 +0100] "POST /http-bind?room=ashamedlowsalarmindeed HTTP/1.1" 200 298 "-" "jitsi-meet/146 CFNetwork/1327.0.4 Darwin/21.2.0"
10.5.48.105 - - [23/Dec/2021:11:51:43 +0100] "GET / HTTP/1.1" 200 15930 "-" "kube-probe/1.19"
2021/12/23 11:51:44 [error] 259#259: *9090 cb1f3f534297507a could not be resolved (110: Operation timed out), client: 10.42.3.59, server: _, request: "GET /colibri-ws//cb1f3f534297507a/6d82aa65?pwd=6th4350a9ksugr021bkpmcpi3t HTTP/1.1", host: "meet.my.public.fqdn.de"

This is from the jvb deployment manifest:

      - args:
        - "30300"
        - /init
        command:
        - /entrypoint/entrypoint.sh
        env:
        - name: PUBLIC_URL
          value: "https://meet.my.public.fqdn.de"
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: XMPP_SERVER
          value: shard-0-prosody
        - name: DOCKER_HOST_ADDRESS
          valueFrom:
            fieldRef:
              fieldPath: status.hostIP
        - name: XMPP_DOMAIN
          value: meet.jitsi
        - name: XMPP_AUTH_DOMAIN
          value: auth.meet.jitsi
        - name: XMPP_INTERNAL_MUC_DOMAIN
          value: internal-muc.meet.jitsi
        - name: JVB_STUN_SERVERS
          value: stun.l.google.com:19302,stun1.l.google.com:19302,stun2.l.google.com:19302
        - name: JICOFO_AUTH_USER
          value: focus
        - name: JVB_TCP_HARVESTER_DISABLED
          value: "true"
        - name: JVB_ENABLE_APIS
          value: colibri,rest
        - name: JVB_AUTH_USER
          value: jvb
        - name: JVB_AUTH_PASSWORD
          valueFrom:
            secretKeyRef:
              key: JVB_AUTH_PASSWORD
              name: jitsi-config
        - name: JICOFO_AUTH_PASSWORD
          valueFrom:
            secretKeyRef:
              key: JICOFO_AUTH_PASSWORD
              name: jitsi-config
        - name: JVB_BREWERY_MUC
          value: jvbbrewery
        - name: TZ
          value: Europe/Berlin
        image: jitsi/jvb:stable-6726-1
        imagePullPolicy: Always
        lifecycle:
          preStop:
            exec:
              command:
              - bash
              - /shutdown/graceful_shutdown.sh
              - -t 3
        name: jvb
        readinessProbe:
          httpGet:
            path: /about/health
            port: 8080
          initialDelaySeconds: 10
        resources:
          limits:
            cpu: "8"
            memory: 4000Mi
          requests:
            cpu: "4"
            memory: 2000Mi
        volumeMounts:
        - mountPath: /entrypoint
          name: jvb-entrypoint
        - mountPath: /shutdown
          name: jvb-shutdown
      terminationGracePeriodSeconds: 2147483647
      volumes:
      - configMap:
          defaultMode: 484
          name: jvb-entrypoint
        name: jvb-entrypoint
      - configMap:
          defaultMode: 484
          name: jvb-shutdown
        name: jvb-shutdown

This is from the prosody deployment manifest:

  containers:
      - env:
        - name: PUBLIC_URL
          value: "https://meet.my.public.fqdn.de"
        - name: ENABLE_XMPP_WEBSOCKET
          value: "1"
        - name: XMPP_DOMAIN
          value: meet.jitsi
        - name: XMPP_AUTH_DOMAIN
          value: auth.meet.jitsi
        - name: XMPP_MUC_DOMAIN
          value: muc.meet.jitsi
        - name: XMPP_INTERNAL_MUC_DOMAIN
          value: internal-muc.meet.jitsi
        - name: JICOFO_COMPONENT_SECRET
          valueFrom:
            secretKeyRef:
              key: JICOFO_COMPONENT_SECRET
              name: jitsi-config
        - name: JVB_AUTH_USER
          value: jvb
        - name: JVB_AUTH_PASSWORD
          valueFrom:
            secretKeyRef:
              key: JVB_AUTH_PASSWORD
              name: jitsi-config
        - name: JICOFO_AUTH_USER
          value: focus
        - name: JICOFO_AUTH_PASSWORD
          valueFrom:
            secretKeyRef:
              key: JICOFO_AUTH_PASSWORD
              name: jitsi-config
        - name: TZ
          value: Europe/Berlin
        - name: JVB_TCP_HARVESTER_DISABLED
          value: "true"
        - name: GLOBAL_MODULES
          value: prometheus,measure_stanza_counts,measure_client_presence
        - name: GLOBAL_CONFIG
          value: statistics = "internal";\nstatistics_interval = 15;
        image: jitsi/prosody:stable-6726-1
        imagePullPolicy: Always
        name: prosody
        ports:
        - containerPort: 5280
          name: metrics
        readinessProbe:
          exec:
            command:
            - prosodyctl
            - --config
            - /config/prosody.cfg.lua
            - status
        resources:
          limits:
            cpu: 300m
            memory: 300Mi
          requests:
            cpu: 300m
            memory: 300Mi
        volumeMounts:
        - mountPath: /prosody-plugins-custom/mod_prometheus.lua
          name: prosody
          subPath: mod_prometheus.lua
        - mountPath: /usr/lib/prosody/modules/mod_measure_stanza_counts.lua
          name: prosody
          subPath: mod_measure_stanza_counts.lua
        - mountPath: /usr/lib/prosody/modules/mod_measure_client_presence.lua
          name: prosody
          subPath: mod_measure_client_presence.lua
      volumes:
      - configMap:
          items:
          - key: mod_prometheus.lua
            path: mod_prometheus.lua
          - key: mod_measure_stanza_counts.lua
            path: mod_measure_stanza_counts.lua
          - key: mod_measure_client_presence.lua
            path: mod_measure_client_presence.lua
          name: prosody
        name: prosody

This is from the web service deployment manifest:

containers:
      - env:
        - name: PUBLIC_URL
          value: "https://meet.my.public.fqdn.de"
        - name: DISABLE_HTTPS
          value: "1"
        - name: HTTP_PORT
          value: "80"
        - name: XMPP_SERVER
          value: shard-0-prosody
        - name: JICOFO_AUTH_USER
          value: focus
        - name: XMPP_DOMAIN
          value: meet.jitsi
        - name: XMPP_AUTH_DOMAIN
          value: auth.meet.jitsi
        - name: XMPP_INTERNAL_MUC_DOMAIN
          value: internal-muc.meet.jitsi
        - name: XMPP_BOSH_URL_BASE
          value: http://shard-0-prosody:5280
        - name: XMPP_MUC_DOMAIN
          value: muc.meet.jitsi
        - name: TZ
          value: Europe/Berlin
        - name: JVB_TCP_HARVESTER_DISABLED
          value: "true"
        image: jitsi/web:stable-6726-1
        imagePullPolicy: Always
        name: web
        readinessProbe:
          httpGet:
            port: 80
        resources:
          limits:
            cpu: 1000m
            memory: 300Mi
          requests:
            cpu: 500m
            memory: 300Mi
        volumeMounts:
        - mountPath: /usr/share/jitsi-meet/static/welcomePageAdditionalContent.html
          name: web
          subPath: welcomePageAdditionalContent.html
        - mountPath: /usr/share/jitsi-meet/plugin.head.html
          name: web
          subPath: plugin.head.html
        - mountPath: /defaults/config.js
          name: web
          subPath: config.js
        - mountPath: /defaults/interface_config.js
          name: web
          subPath: interface_config.js
      volumes:
      - configMap:
          items:
          - key: welcomePageAdditionalContent.html
            path: welcomePageAdditionalContent.html
          - key: plugin.head.html
            path: plugin.head.html
          - key: config.js
            path: config.js
          - key: interface_config.js
            path: interface_config.js
          name: web
        name: web

This is the list of all services that have been created in the jitsi namespace:

NAME              TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
haproxy           ClusterIP   10.43.82.91     <none>        80/TCP                       280d
haproxy-0         ClusterIP   10.43.177.210   <none>        1024/TCP                     280d
haproxy-1         ClusterIP   10.43.182.197   <none>        1024/TCP                     280d
shard-0-jvb-0     NodePort    10.43.118.86    <none>        30300:30300/UDP              105d
shard-0-jvb-1     NodePort    10.43.197.21    <none>        30301:30301/UDP              105d
shard-0-jvb-10    NodePort    10.43.189.56    <none>        30310:30310/UDP              105d
shard-0-jvb-11    NodePort    10.43.156.59    <none>        30311:30311/UDP              105d
shard-0-jvb-2     NodePort    10.43.247.228   <none>        30302:30302/UDP              105d
shard-0-jvb-3     NodePort    10.43.174.59    <none>        30303:30303/UDP              105d
shard-0-jvb-4     NodePort    10.43.161.184   <none>        30304:30304/UDP              105d
shard-0-jvb-5     NodePort    10.43.8.118     <none>        30305:30305/UDP              105d
shard-0-jvb-6     NodePort    10.43.240.138   <none>        30306:30306/UDP              105d
shard-0-jvb-7     NodePort    10.43.222.110   <none>        30307:30307/UDP              105d
shard-0-jvb-8     NodePort    10.43.212.242   <none>        30308:30308/UDP              105d
shard-0-jvb-9     NodePort    10.43.75.201    <none>        30309:30309/UDP              105d
shard-0-prosody   ClusterIP   10.43.45.63     <none>        5222/TCP,5280/TCP,5347/TCP   280d
web               ClusterIP   None            <none>        80/TCP                       280d

This is the ingress object for the service:

NAME              CLASS    HOSTS                         ADDRESS        PORTS     AGE
haproxy-ingress   <none>   meet.my.public.fqdn.de        1xx.5x.3x.1x   80, 443   280d

And finally, the HAproxy configuration ConfigMap:

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: haproxy-config
  namespace: jitsi
data:
  haproxy.cfg: |
    global
      # log to stdout
      log stdout format raw local0 info
      # enable stats socket for dynamic configuration and status retrieval
      stats socket ipv4@127.0.0.1:9999 level admin
      stats socket /var/run/hapee-lb.sock mode 666 level admin
      stats timeout 2m

    defaults
      log               global
      option            httplog
      retries           3
      maxconn           2000
      timeout connect   5s
      timeout client    50s
      timeout server    50s

    resolvers kube-dns
      # kubernetes DNS is defined in resolv.conf
      parse-resolv-conf
      hold valid 10s

    frontend http_in
      bind *:80
      mode http
      option forwardfor
      option http-keep-alive
      default_backend jitsi-meet

    # expose statistics in Prometheus format
    frontend stats
      mode http
      bind *:9090
      option http-use-htx
      http-request use-service prometheus-exporter if { path /metrics }
      stats enable
      stats uri /stats
      stats refresh 10s

    peers mypeers
      log stdout format raw local0 info
      peer "${HOSTNAME}" "${MY_POD_IP}:1024"
      peer "${OTHER_HOSTNAME}" "${OTHER_IP}:1024"

    backend jitsi-meet
      balance roundrobin
      mode http
      option forwardfor
      http-reuse safe
      http-request set-header Room %[urlp(room)]
      acl room_found urlp(room) -m found
      stick-table type string len 128 size 2k expire 1d peers mypeers
      stick on hdr(Room) if room_found
      # _http._tcp.web.jitsi.svc.cluster.local:80 is a SRV DNS record
      # A records don't work here because their order might change between calls and would result in different
      # shard IDs for each peered HAproxy
      server-template shard 0-5 _http._tcp.web.jitsi.svc.cluster.local:80 check resolvers kube-dns init-addr none
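
The SRV record used in that server-template line works because the web service is headless (clusterIP: None, as shown in the service list above) and its port is named http, so the lookup returns one entry per web pod. For reference, a minimal sketch of that service would look like this (the selector label is an assumption on my part, the rest matches my setup):

apiVersion: v1
kind: Service
metadata:
  name: web
  namespace: jitsi
spec:
  clusterIP: None      # headless: DNS returns one record per ready pod
  selector:
    app: web           # assumed label on the jitsi/web pods
  ports:
  - name: http         # the port name is what produces the _http._tcp SRV record
    protocol: TCP
    port: 80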

Apologies if this is quite extensive; I am unsure which parts are the most relevant. I am also very new to Jitsi in general and would highly appreciate any help with this.
Thanks to the community and all helping members.

I could not find an exact answer to this in the documentation, but this request does not look right to me:

 [error] 259#259: *284 f28bfec0f6deefaf could not be resolved (110: Operation timed out), client: 10.42.3.59, server: _, request: "GET /colibri-ws//f28bfec0f6deefaf/597d42e4?pwd=346hll9te44busbbupt3d1paa6 HTTP/1.1", host: "meet.my.public.fqdn.de"

I think this part is either missing the jvb server ID (which could also be an IP address), or an extra forward slash is being inserted. What could be causing this, if this is indeed not how it is supposed to look?

Thanks.

Yes, it is missing the jvb pod IP.
For example, it is supposed to be GET /colibri-ws/10.0.0.1/f28bfec0f6deefaf. Check the value of WS_SERVER_ID in the jvb config.

Thanks for your reply, @metadata. I added the environment variable to the statefulSet definition for the jvb shards, referencing its own name, since the service name is identical to the instance name. I assume this is a valid definition for WS_SERVER_ID?

        - name: JVB_WS_SERVER_ID
          valueFrom:
            fieldRef:
              fieldPath: metadata.name   # <-- resolves to something like shard-0-jvb-x (x = jvb instance number)

This service name now appears in the web server log output like this:

10.42.3.59 - - [24/Dec/2021:11:35:12 +0100] "GET /colibri-ws/shard-0-jvb-10/8454bcbec746d63a/db5e7c9d?pwd=5q5gd4lp6iptgl6ngq1f9tnk0l HTTP/1.1" 502 564 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36"

Unfortunately, the error behaviour persists and there is still something not working in the connection, as you can see from the next log entry:

2021/12/24 11:35:43 [error] 259#259: *9784 shard-0-jvb-10 could not be resolved (110: Operation timed out), client: 10.42.3.41, server: _, request: "GET /colibri-ws/shard-0-jvb-10/8454bcbec746d63a/db5e7c9d?pwd=5q5gd4lp6iptgl6ngq1f9tnk0l HTTP/1.1", host: "meet.my.public.fqdn.de"

The console output is still showing the no opened channel error, while the logs on the jvb instance don't provide any information about the issue:

INFO: Nomination confirmed for pair: 18x.5x.3x.xx:30310/udp/srflx -> 7x.2x.1x.xx:61996/udp/prflx (stream-db5e7c9d.RTP).
Dec 24, 2021 11:34:11 AM org.jitsi.utils.logging2.LoggerImpl log
INFO: Selected pair for stream stream-db5e7c9d.RTP: 185.58.36.75:30310/udp/srflx -> 77.21.136.55:61996/udp/prflx (stream-db5e7c9d.RTP)
Dec 24, 2021 11:34:11 AM org.jitsi.utils.logging2.LoggerImpl log
INFO: CheckList of stream stream-db5e7c9d is COMPLETED
Dec 24, 2021 11:34:11 AM org.jitsi.utils.logging2.LoggerImpl log
INFO: ICE state changed from Running to Completed.
Dec 24, 2021 11:34:11 AM org.jitsi.utils.logging2.LoggerImpl log
INFO: ICE state changed old=Running new=Completed
Dec 24, 2021 11:34:11 AM org.jitsi.utils.logging2.LoggerImpl log
INFO: ICE connected

Any ideas?

Thanks, again and happy holidays.

I think it is unable to resolve it to the pod IP. If you are not assigning any env variable, there is a fallback, JVB_WS_SERVER_ID_FALLBACK. Check here

After changing the variable to reference status.podIP instead of metadata.name, the error message shard-0-jvb-10 could not be resolved (110: Operation timed out) is now gone from the web server log.
More importantly, the console output is no longer showing the Bridge Channel send: no opened channel error message.
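
For anyone finding this later, the working variant of the variable simply injects the pod IP via the downward API (same statefulSet as before, only this entry changed):

        - name: JVB_WS_SERVER_ID
          valueFrom:
            fieldRef:
              fieldPath: status.podIP   # the pod IP needs no DNS lookup by the web pod's nginx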

Thanks for the hint, although I don't exactly understand why a resolvable service name exposing the expected UDP port does not work while the IP address of the pod itself does.

root@shard-0-jvb-0:/# host shard-0-jvb-0
shard-0-jvb-0.jitsi.svc.cluster.local has address 10.43.118.86
kubectl get svc -n jitsi shard-0-jvb-0
NAME            TYPE       CLUSTER-IP     EXTERNAL-IP   PORT(S)           AGE
shard-0-jvb-0   NodePort   10.43.118.86   <none>        30300:30300/UDP   106d

If the web server is connecting to the jvb pod directly, what is the point of having the service exposed at all? Also, the video quality still has not improved; do you have any idea how to deal with that?
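
My own (untested) guess for the first question: the per-jvb services only expose the media UDP port, while the colibri websocket is plain TCP (port 9090 by default, as far as I can tell), so even if the name resolved for the web pod's nginx, the service as currently defined would have nothing to proxy to. Something like the following would presumably have to be added to each per-pod service for the service name to be usable at all (only the ports section of the sketch from my first post is shown; the extra port number and name are assumptions on my part):

  ports:
  - name: media
    protocol: UDP
    port: 30300
    targetPort: 30300
    nodePort: 30300
  - name: colibri-ws    # assumed: TCP port the jvb websocket server listens on (9090 by default)
    protocol: TCP
    port: 9090
    targetPort: 9090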

Anyway, thanks for your help, really appreciate it.

This saved my life! Thanks for the update. I would never have figured out that JVB_WS_SERVER_ID needs to be a DNS name or an IP address for it to work correctly.