How to set up Grafana dashboards to monitor Jitsi, my comprehensive tutorial for the beginner

Those who have read my earlier post about setting up two servers with Jitsi and Jibri (How-to to setup integrated Jitsi and Jibri for dummies, my comprehensive tutorial for the beginner) already know my setup. Now I will show how I set up a working dashboard to monitor the Jitsi Meet server, based on the REST API provided by Jitsi. This requires installing Grafana (dashboards), InfluxDB (time-series database) and Telegraf (data collector).

In this example I decided to host my dashboards (all three services) on my Jibri server, since that server is not used very often. In the examples I will connect based on IP addresses; carefully substitute these with your own IPv4 addresses:
IPv4 of my Jitsi server: 116.203.231.172
IPv4 of my Grafana server: 116.203.20.99

We will briefly go through following steps for this installation:
Server “grafana” (116.203.20.99)
Step 1: Install database (InfluxDB) to host the data for dashboards
Step 2: Install Grafana to display jitsi stats in dashboards
Step 3: Install service (telegraf) to collect stats and write to database (InfluxDB)

Server: “jitsi” (116.203.231.172)
Step 4: Adapt Jitsi configuration to expose stats

In the end, I ended up with this dashboard showing stats during a call with 4 participants:

(screenshot of the finished dashboard)

This dashboard contains ALL fields from the Colibri REST API response; it will require tweaking and tuning.

Server "grafana"

Step 1: Install InfluxDB

apt update && apt install -y gnupg2 curl wget
wget -qO- https://repos.influxdata.com/influxdb.key | sudo apt-key add -
echo "deb https://repos.influxdata.com/debian buster stable" | sudo tee /etc/apt/sources.list.d/influxdb.list
apt update && apt install influxdb -y
systemctl enable --now influxdb
systemctl status influxdb
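
To verify that InfluxDB is up, you can optionally query its /ping endpoint; a 204 response means the service is healthy:

curl -sI http://localhost:8086/ping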

If you run a firewall (e.g. ufw) on this server, open the ports for InfluxDB and the Grafana web server:

ufw allow 8086/tcp
ufw allow 3000/tcp

Step 2: Install Grafana to display stats dashboards

apt install -y software-properties-common
curl https://packages.grafana.com/gpg.key | sudo apt-key add -
add-apt-repository "deb https://packages.grafana.com/oss/deb stable main"
apt update && apt install grafana -y
systemctl enable --now grafana-server
systemctl status grafana-server
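
Optionally, check that Grafana answers before opening it in the browser; its /api/health endpoint should report the database as ok:

curl -s http://localhost:3000/api/health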

Step 3: Install & configure telegraf

wget -qO- https://repos.influxdata.com/influxdb.key | sudo apt-key add -
echo "deb https://repos.influxdata.com/debian buster stable" | sudo tee /etc/apt/sources.list.d/influxdb.list
apt update && apt install telegraf -y
mv /etc/telegraf/telegraf.conf /etc/telegraf/telegraf.conf.original

nano /etc/telegraf/telegraf.conf

Enter the following contents in telegraf.conf:

[global_tags]

###############################################################################
#                                  GLOBAL                                     #
###############################################################################

[agent]
    interval = "10s"
    debug = false
    hostname = "jitsi_host"
    round_interval = true
    flush_interval = "10s"
    flush_jitter = "0s"
    collection_jitter = "0s"
    metric_batch_size = 1000
    metric_buffer_limit = 10000
    quiet = false
    logfile = ""
    omit_hostname = false

nano /etc/telegraf/telegraf.d/jitsi.conf

Enter the following contents in jitsi.conf:

###############################################################################
#                                  INPUTS                                     #
###############################################################################

[[inputs.http]]
    name_override = "jitsi_stats"
    urls = [
      "http://116.203.231.172:8080/colibri/stats"
    ]

    data_format = "json"

###############################################################################
#                                  OUTPUTS                                    #
###############################################################################

[[outputs.influxdb]]
    urls = ["http://localhost:8086"]
    database = "jitsi"
    timeout = "0s"
    retention_policy = ""

We enable start on boot and start Telegraf now on server “grafana”:

systemctl enable --now telegraf
systemctl status telegraf

(Mind: We will not create a database as Telegraf will create our database if it does not find one)
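
Once Telegraf has started and flushed its first batch, you can check that the database was created; if the influx CLI is available on the grafana server:

influx -execute 'SHOW DATABASES'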

Server: "jitsi"

Step 4: Adapt Jitsi configuration to expose stats

nano /etc/jitsi/videobridge/config

Make sure the JVB options enable the REST API:

JVB_OPTS="--apis=rest,xmpp"

and

nano /etc/jitsi/videobridge/sip-communicator.properties

Here we configure colibri statistics:

org.jitsi.videobridge.ENABLE_STATISTICS=true
org.jitsi.videobridge.STATISTICS_TRANSPORT=muc,colibri

service jitsi-videobridge2 restart

Check output in the terminal on the jitsi server:

curl -v http://127.0.0.1:8080/colibri/stats

Response: {"inactive_endpoints":0,"inactive_conferences":0,"total_ice_succeeded_relayed":0,"total_loss_degraded_participant_seconds":0,"bit_rate_download":0,"muc_clients_connected":1,"total_participants":0,"total_packets_received":0,"rtt_aggregate":0.0,"packet_rate_upload":0,"p2p_conferences":0,"total_loss_limited_participant_seconds":0,"octo_send_bitrate":0,"total_dominant_speaker_changes":0,"receive_only_endpoints":0,"total_colibri_web_socket_messages_received":0,"octo_receive_bitrate":0,"loss_rate_upload":0.0,"version":"2.1.169-ga28eb88e","total_ice_succeeded":0,"total_colibri_web_socket_messages_sent":0,"total_bytes_sent_octo":0,"total_data_channel_messages_received":0,"loss_rate_download":0.0,"total_conference_seconds":0,"bit_rate_upload":0,"total_conferences_completed":0,"octo_conferences":0,"num_eps_no_msg_transport_after_delay":0,"endpoints_sending_video":0,"packet_rate_download":0,"muc_clients_configured":1,"conference_sizes":[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],"total_packets_sent_octo":0,"conferences_by_video_senders":[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],"videostreams":0,"jitter_aggregate":0.0,"total_ice_succeeded_tcp":0,"octo_endpoints":0,"current_timestamp":"2020-04-17 23:14:38.468","total_packets_dropped_octo":0,"conferences":0,"participants":0,"largest_conference":0,"total_packets_sent":0,"total_data_channel_messages_sent":0,"total_bytes_received_octo":0,"octo_send_packet_rate":0,"conferences_by_audio_senders":[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],"total_conferences_created":0,"total_ice_failed":0,"threads":37,"videochannels":0,"total_packets_received_octo":0,"graceful_shutdown":false,"octo_receive_packet_rate":0,"total_bytes_received":0,"rtp_loss":0.0,"total_loss_controlled_participant_seconds":0,"total_partially_failed_conferences":0,"endpoints_sending_audio":0,"total_bytes_sent":0,"mucs_configured":1,"total_failed_conferences":0,"mucs_joined":1}

With this response we see that the REST API responds and can be used for our purpose!
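
The response is easier to read pretty-printed; if you have jq installed (apt install -y jq), pipe the output through it:

curl -s http://127.0.0.1:8080/colibri/stats | jq .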

To make sure that the Colibri REST endpoint can be accessed by Telegraf, open port 8080 in the firewall:

ufw allow 8080/tcp
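
Since the stats endpoint requires no authentication, you may prefer to open port 8080 only to the Grafana server rather than to the world; ufw supports source-restricted rules (substitute your own Grafana IP):

ufw allow from 116.203.20.99 to any port 8080 proto tcp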

Configure dashboards in Grafana

Open Grafana in the browser (http://116.203.20.99:3000) and set up your admin account.

Add datasource.
We will add an InfluxDB datasource and set it to be the default.

  • Name: InfluxDB
  • Default: On
  • HTTP URL: http://localhost:8086
  • HTTP Access: Server (default)
  • Database: jitsi

Build a Dashboard.

  • New Panel: Choose Visualisation
  • Type: Stat
  • Go to “Queries”
    –> Follow the settings in the screenshot (not reproduced here) to get a first panel display; an example query is sketched below.
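
As the screenshot is not reproduced here, a minimal example query may help. Assuming the jitsi_stats measurement name from the Telegraf config above, the raw query for a Stat panel showing the current participant count could look like this ($timeFilter is Grafana's built-in time-range macro for InfluxDB):

SELECT last("participants") FROM "jitsi_stats" WHERE $timeFilter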

A collection of all fields from the Jitsi REST API response is available in the attached JSON for convenience.

Happy tweaking and tuning…

Cheers, Igor

Jitsi Monitor-1587310814489.json.txt (125.2 KB)


Very well covered, Igor! Thank you for sharing.

Does this support multiple videobridges as shown with Jitsi Scalable installation?

Hi @poysama… As I don’t have such a setup, I’d have to make an educated guess…

The configuration is made in /etc/jitsi/videobridge/sip-communicator.properties, so that leads me to believe all videobridges connected to the MUC will be exposed by the REST API. However, I would assume there is no distinction between the individual bridges, but rather a sum over all bridges in each field.

Maybe a developer knows, otherwise you’d have to run some tests to compare the colibri rest responses with single/multiple bridge configuration.

There is already a good dashboard available here: https://grafana.com/grafana/dashboards/11969


for multi-videobridge setups you can use my adapted design: jitsimeet-multiple-videobridges-1587632543977.json.txt (40.9 KB)


Hi @Woodworker_Life, @flyinghuman and all contributors, thanks for your support! :slight_smile:
I’ve got a setup like

  • server1: jitsi-meet (full install), with a local user required for creating meetings
  • server2: jitsi-videobridge2 and the grafana server as per this thread.
    All up and running so far, load balancing works fine.

My issue right now is that the videobridge data from server 2 does not appear in the graphs.
E.g. the meetings held on server 2 do not appear in the stats, though Colibri is enabled in /etc/jitsi/videobridge/sip-communicator.properties.
True or false: data from both videobridges should be collected on server 1, because that is where Jicofo runs, which ultimately generates the JSON data feed?

Thanks for your enlightenment,
HP.

This is great, thanks! Any idea why my CPU and MEM usage show ‘No Data’, however? Is this data pulled from the Jitsi REST API or somewhere else? I am using flyinghuman’s layout temporarily.

Hi @gbeirn,
I think the answer is found here: Proposed videobridge stats removal
And yes, I found the same issue yesterday, too.
Not sure what system monitoring I should choose though, statsd or something else.
Any clues anyone?

Thanks @hpr, good find! Somehow my searching didn’t turn up that thread. I have usually used Prometheus in the past, which I will likely use for this as well.


One could try the following:

Set up a Telegraf instance on the jitsi server and create /etc/telegraf/telegraf.d/system.conf with:

###############################################################################
#                                  INPUTS                                     #
###############################################################################

[[inputs.cpu]]
    ## Whether to report per-cpu stats or not
    percpu = true
    ## Whether to report total system cpu stats or not
    totalcpu = true

###############################################################################
#                                  OUTPUTS                                    #
###############################################################################

[[outputs.influxdb]]
    urls = ["http://[grafana-server-IP]:8086"]
    database = "jitsi"
    timeout = "0s"
    retention_policy = ""

Then restart telegraf: service telegraf restart

If the outputs write to the same InfluxDB database (in this case jitsi), you should be able to pick up the CPU stats quite easily (either 'percpu' or 'totalcpu' or both)…
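
Since the question was about CPU and MEM: memory can be collected the same way. Telegraf ships a mem input plugin that needs no further options, so one could append to the same system.conf:

[[inputs.mem]]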


Hey @Woodworker_Life,

Thanks, this is it!
Sleek, lean, and does the job.

Hi. This is assuming that the collector is configured on each JVB instance, right? Because this doesn’t seem to scale well, given that you can release or expand JVB instances on the fly. I am looking for a collector that collects all JVB data from the single MCU they are connected to.

Sorry, I slightly doubt this can work. CPU (and mem) are data whose source is each individual server, so I’d argue that you need to set up Telegraf on each of your VBs. But of course you can send the data to the one server running Grafana.
Not 100% sure and untested, but …

Hello,

at this step, an error is displayed

Though Grafana is supposed to create the missing database, any idea what I missed?


hi @ashledombos: Telegraf should create the database… (Grafana does not create the database, it only reads from it). I’d restart Telegraf and see if the database gets created after the restart. If not, then Telegraf can’t access the database; check from there… :wink:


Hi, simply install Telegraf on each bridge and send the metrics to the same InfluxDB that the Grafana server has access to.


Thank you, indeed I just had to restart the service.

Thanks, I hadn’t thought of that. Currently running it right now. Do I retain the “jitsi_stats” name_override for each Telegraf instance? Would that be aggregated in InfluxDB or would it overwrite it?

Just to confirm that I get this right: Same database but different tables (e.g. jvb2, jvb3, …), right?