Jigasi high CPU usage with few calls

I don’t expect the thread count to go down for the same scenario. Throwing more CPU power at it will help, but you need to do the math: more CPU per machine, or more VMs in the autoscaling pool.

Yes, I have to find the sweet spot between CPU power and the number of calls each instance can handle vs pool sizing.

The main issue is cost :slight_smile:

Do you have any feedback/clues about how the new ARM64 Graviton2 CPUs that AWS offers perform? (And is ARM64 supported by Jitsi/Jigasi?)
I was wondering if that CPU architecture would help with media-related operations.

No idea about those instances, but ARM will not work. Off the top of my head, jitsi-srtp uses native bindings to OpenSSL, and ARM is not one of the supported architectures, so it will fall back to the Java implementation for those operations, which I doubt will do better …

Ok thanks for the feedback, bummer!

@damencho, while doing tests to determine the best instance type/size, I think I found an issue that leads to Jigasi misbehaving and ultimately failing to handle calls.

I was doing tests on a C5.XL instance, first with 80 simultaneous calls, then with 100.
I was measuring CPU usage, system load and memory consumption.

80 simultaneous calls seems to be the maximum on this instance size; at 100 calls the pressure is too high and there are unexpected call failures.

But while doing my tests, I noticed on my two Jigasi instances that the number of threads grows during each run and shrinks when the run is finished, yet there seems to be a “thread leak”, since at the end I have many more threads than I started with.

For example, I ran 10 tests, each consisting of 80 SIP calls per instance.
Each SIP call = 1 conference room with 1 user (so 2 users in the same conference room, since the 2 instances each place 80 SIP calls).
Here are the thread counts (from the /about/stats route) that I got after each run (a polling sketch follows the list):

  • “threads”:718
  • “threads”:2231
  • “threads”:2380
  • “threads”:2929
  • “threads”:3615
  • “threads”:4049
  • “threads”:4499
  • “threads”:5047
  • “threads”:6010
  • “threads”:7054
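For anyone who wants to reproduce these measurements, a minimal poller could look like the sketch below. It assumes Jigasi’s REST API is enabled and listening on localhost:8788 (adjust the URL to your setup) and just pulls the threads field out of the /about/stats response:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Polls Jigasi's /about/stats endpoint and prints the "threads" value over time.
// The URL below is an assumption; point it at wherever your REST API listens.
public class JigasiThreadWatcher {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8788/about/stats"))
                .GET()
                .build();
        while (true) {
            String body = client.send(request, HttpResponse.BodyHandlers.ofString()).body();
            // Crude extraction of the "threads" field; a JSON library would be cleaner.
            int i = body.indexOf("\"threads\":");
            String threads = (i >= 0) ? body.substring(i + 10).split("[,}]")[0].trim() : "n/a";
            System.out.println(System.currentTimeMillis() + " threads=" + threads);
            Thread.sleep(30_000); // poll every 30 seconds
        }
    }
}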

I got the same behavior on both EC2 instances.
I noticed that after the 4th run, performance got worse with each run, with more and more failed calls.

Then I stopped the tests and went out for lunch, and when I came back here’s what I found:
With 0 active calls, the first EC2 instance is sitting at ~7200 threads and a 1.6-1.9 load average; Jigasi is using about 30-40% of each CPU core while doing nothing.
On the second EC2 instance, Jigasi has disconnected from Asterisk and will not handle any incoming calls; it is sitting at ~7400 threads and using only a small share of CPU (~3% of each CPU core).

There are no messages in the Jigasi logs despite the idle CPU consumption.
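In case it helps the investigation: a jstack <pid> thread dump of the Jigasi process should show which pool those ~7000 idle threads belong to. Below is a rough helper sketch (plain JDK, nothing Jigasi-specific assumed) that groups the thread names from such a dump by name pattern:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;

// Counts thread names from a `jstack <pid>` dump, grouped by name pattern,
// so the dominant (possibly leaking) pool stands out.
// Usage: java ThreadDumpSummary dump.txt
public class ThreadDumpSummary {
    public static void main(String[] args) throws IOException {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : Files.readAllLines(Path.of(args[0]))) {
            // Thread headers in a HotSpot dump look like: "pool-12-thread-3" #345 daemon prio=5 ...
            if (line.startsWith("\"") && line.indexOf('"', 1) > 0) {
                String name = line.substring(1, line.indexOf('"', 1));
                // Collapse digit runs so e.g. pool-12-thread-3 and pool-7-thread-1 group together.
                String pattern = name.replaceAll("\\d+", "N");
                counts.merge(pattern, 1, Integer::sum);
            }
        }
        counts.forEach((pattern, n) -> System.out.println(n + "\t" + pattern));
    }
}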

That’s unexpected; any help is welcome! Don’t hesitate to ask if you need more details :slight_smile:

Bump, still can’t figure out what is going on :face_with_raised_eyebrow:

I created a GitHub issue for the thread leak for further investigation,
here: Jigasi thread leak issue? · Issue #341 · jitsi/jigasi · GitHub

Some feedback for future readers:

The thread leak issue has been identified and fixed in release jigasi_1.1-182-gd1a2e18 (at the time of writing, it is located in the unstable repository).

Thumbs up :+1: to @damencho!

As for this thread: since Jigasi is not crashing anymore, I will do more load testing and report back the results.

Hi @cyril.r ,

In a previous message you said that you were able to use Jigasi in translator mode with your Asterisk IPBX.

Which version of Asterisk did you use for this configuration?
If you are doing audio mixing on your Asterisk, have you noticed any audio issues (the only related topic on the Asterisk community forum reports audio errors when mixing multistream audio)?
Could you share your Asterisk configuration with the community?

Regards,
Damien.

Hello Damien,

We used Asterisk 16 for our testing. Our use case is really simple: there are very few people per room (but a high number of rooms), so we didn’t have audio mixing issues (or we didn’t notice them).

The configuration is quite simple:

[jitsi]
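; auth credentials, referenced by outbound_auth on the endpoint section below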
type = auth
auth_type=userpass
username = redacted
password = redacted

[jitsi]
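; AOR: allows up to 2 registered contacts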
type = aor
max_contacts=2
;qualify_frequency = 30

[jitsi]
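; endpoint: all codecs disallowed except opus8, inbound calls enter the stasis-test dialplan context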
type = endpoint
context = stasis-test
disallow = all
allow = opus8
rtp_timeout = 60
direct_media = no
aors = jitsi
trust_id_outbound=yes
trust_id_inbound=no
outbound_auth=jitsi
rtp_symmetric=yes

Regards

Hi @cyril.r ,

I’ve recently made some tests with the same configuration, Opus + translator mode on Jigasi.
It reduces the CPU usage on Jigasi by a factor of 2 in our test scenario, but at the same time the CPU consumption on the Asterisk side increases by a factor of 10.
Have you measured the CPU usage of your Asterisk instances?

Regards,
Damien