Jibri - Chrome not reachable with two or more users in meeting. Recording is not stopped


#1

Hi,

This may seem like a repeated issue, but it’s not. It is a completely new issue and I have found no solution while looking around the web.

Jibri is unable to stop the recording when more than one user is connected.
The recording works fine when I try to record with only one user.

When I try to record the meeting with multiple users, it starts recording normally.
But when I press the Stop Recording button, the recording icon doesn’t go away and the “recording is stopped” prompt doesn’t appear.

Logs show this error continuously and repeatedly:

2019-01-10 12:29:33.754 SEVERE: [48] org.jitsi.jibri.selenium.JibriSelenium.invoke() Error while running call status checks: org.openqa.selenium.WebDriverException: chrome not reachable
(Session info: chrome=71.0.3578.98)
(Driver info: chromedriver=2.45.615279 (12b89733300bd268cff3b78fc76cb8f3a7cc44e5),platform=Linux 4.4.0-141-generic x86_64) (WARNING: The server did not provide any stacktrace information)
Command duration or timeout: 0 milliseconds
Build info: version: 'unknown', revision: 'unknown', time: 'unknown'
System info: host: 'ip-172-31-51-126', ip: '172.31.51.126', os.name: 'Linux', os.arch: 'amd64', os.version: '4.4.0-141-generic', java.version: '1.8.0_191'
Driver info: org.openqa.selenium.chrome.ChromeDriver
Capabilities {acceptInsecureCerts: false, acceptSslCerts: false, applicationCacheEnabled: false, browserConnectionEnabled: false, browserName: chrome, chrome: {chromedriverVersion: 2.45.615279 (12b89733300bd2…, userDataDir: /tmp/.org.chromium.Chromium…}, cssSelectorsEnabled: true, databaseEnabled: false, goog:chromeOptions: {debuggerAddress: localhost:33374}, handlesAlerts: true, hasTouchScreen: false, javascriptEnabled: true, locationContextEnabled: true, mobileEmulationEnabled: false, nativeEvents: true, networkConnectionEnabled: false, pageLoadStrategy: normal, platform: LINUX, platformName: LINUX, proxy: Proxy(), rotatable: false, setWindowRect: true, strictFileInteractability: false, takesHeapSnapshot: true, takesScreenshot: true, timeouts: {implicit: 0, pageLoad: 300000, script: 30000}, unexpectedAlertBehaviour: ignore, unhandledPromptBehavior: ignore, version: 71.0.3578.98, webStorageEnabled: true}
Session ID: dbda0c6e9a7d0aa3cc587f662ba55755 with stack:
sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
java.lang.reflect.Constructor.newInstance(Constructor.java:423)
org.openqa.selenium.remote.ErrorHandler.createThrowable(ErrorHandler.java:214)
org.openqa.selenium.remote.ErrorHandler.throwIfResponseFailed(ErrorHandler.java:166)
org.openqa.selenium.remote.http.JsonHttpResponseCodec.reconstructValue(JsonHttpResponseCodec.java:40)
org.openqa.selenium.remote.http.AbstractHttpResponseCodec.decode(AbstractHttpResponseCodec.java:80)
org.openqa.selenium.remote.http.AbstractHttpResponseCodec.decode(AbstractHttpResponseCodec.java:44)
org.openqa.selenium.remote.HttpCommandExecutor.execute(HttpCommandExecutor.java:164)
org.openqa.selenium.remote.service.DriverCommandExecutor.execute(DriverCommandExecutor.java:83)
org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:601)
org.openqa.selenium.remote.RemoteWebDriver.executeScript(RemoteWebDriver.java:537)
org.jitsi.jibri.selenium.pageobjects.CallPage.getNumParticipants(CallPage.kt:69)
org.jitsi.jibri.selenium.JibriSelenium$EmptyCallStatusCheck.run(JibriSelenium.kt:288)
org.jitsi.jibri.selenium.JibriSelenium$startRecurringCallStatusChecks$1.invoke(JibriSelenium.kt:167)
org.jitsi.jibri.selenium.JibriSelenium$startRecurringCallStatusChecks$1.invoke(JibriSelenium.kt:100)
org.jitsi.jibri.util.extensions.SchedulerExecutorServiceExtsKt$sam$java_lang_Runnable$0.run(SchedulerExecutorServiceExts.kt)
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)

Full log here…
jibri-log.log (71.0 KB)

I am running both Jitsi-meet and Jibri on an AWS EC2 instance with Ubuntu 16.04 server.
Jitsi-meet version: 1.0.3472-1
Jibri version: 7.1.70-1 and 6.8.68-1

Jitsi-meet side works completely fine with no issues.
This happens with both of these versions of Jibri, so I don’t think it is an issue with Jibri itself.
I have configured them as per the instructions, and this issue doesn’t happen in my local setup.

I am new to AWS, and due to some issues on my local setup, I am trying to make it work on AWS.

Please help me look into it, and if anyone else has also faced this, please let me know how to fix it.
Thanks in advance.

Regards,
Dhruvin


#2

Are you sure this is running 7.1.70? Those logs are definitely not from Jibri version 7.1.70, is it possible an old version is still running?


#3

Hi @bbaldino,
Thanks for your input.

Those logs are from 6.8.68-1, but the same error occurs on 7.1.70-1.
I had Jibri recording working with 6.8.68-1 in my local setup, so I tried downgrading to that version to check, but the error still persists.

Regards,
Dhruvin


#4

Can you please attach logs from 7.1.70-1 reproducing this issue?


#5

Hi @bbaldino,

Sorry for the late reply, we had holidays here.
Here are the logs from Jibri 7.1.70-1:
jibri-7.log (24.4 KB)

Thanks and Regards,
Dhruvin


#6

Ok, I think I see what’s going on here:

  1. When trying to stop the service, we find that chrome/webdriver has had an error (org.openqa.selenium.NoSuchSessionException: invalid session id, which looks like it happened because of a page crash?). The periodic call checks happen to run while we’re in the middle of stopping and see the error as well. They fire a state transition which sends an error response and tries to stop the service again (in another thread), but this will be blocked as there’s already a thread inside of stopService in JibriManager.
  2. What’s unclear is why the current thread running stop does not continue. We see it fail to get the participants but proceed anyway:
    2019-01-16 09:47:37.931 INFO: [26] org.jitsi.jibri.service.impl.FileRecordingJibriService.stop() Participants in this recording: []
    but then we get no other logs from that thread. The next thing we try and do is write the metadata file.
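
The blocking described in step 1 can be illustrated with a small standalone sketch. It assumes stopService is guarded by an ordinary monitor lock, which the thread does not confirm; the class StopContentionDemo and its stall flag are hypothetical and are not Jibri code. While the first caller is stuck inside the method, a second caller cannot enter it.

```java
import java.util.concurrent.CountDownLatch;

public class StopContentionDemo {
    private final CountDownLatch entered = new CountDownLatch(1);
    private final CountDownLatch release = new CountDownLatch(1);

    // Hypothetical stand-in for a stop method guarded by a monitor lock.
    synchronized void stopService(boolean stall) {
        if (!stall) {
            return;
        }
        entered.countDown();              // signal that the first caller is inside
        try {
            release.await();              // simulate a stop that does not finish on its own
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Returns true if the second caller was observed BLOCKED on the monitor. */
    public static boolean secondStopBlocks() {
        StopContentionDemo manager = new StopContentionDemo();
        Thread first = new Thread(() -> manager.stopService(true), "first-stop");
        Thread second = new Thread(() -> manager.stopService(false), "second-stop");
        boolean blocked = false;
        try {
            first.start();
            manager.entered.await();      // wait until the first caller holds the lock
            second.start();
            long deadline = System.nanoTime() + 5_000_000_000L;
            while (!blocked && System.nanoTime() < deadline) {
                blocked = second.getState() == Thread.State.BLOCKED;
                if (!blocked) {
                    Thread.sleep(10);
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            manager.release.countDown();  // always let the first caller finish
        }
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return blocked;
    }

    public static void main(String[] args) {
        System.out.println("second stop blocked: " + secondStopBlocks());
    }
}
```

If the first thread never leaves the method, the second stop attempt waits indefinitely, which would be consistent with the recording never being reported as stopped.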

The only things I can think of that would cause that thread to stop without printing anything are: either the thread is actually blocked, somehow, on the IO, or an exception is thrown. We do catch Exception there, but maybe it was a Throwable that isn’t an Exception? And, since this is running inside of the xmpp handler thread, maybe they’re catching all Throwable and not printing anything?
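
As a minimal sketch of the Exception-versus-Throwable possibility, the following standalone Java example shows a java.lang.Error slipping past an inner catch (Exception) and being swallowed by an outer catch (Throwable). The class CatchScopeDemo and the method runGuarded are hypothetical names for illustration only; the outer handler merely stands in for whatever a calling framework might do, and none of this is taken from Jibri or the xmpp library.

```java
public class CatchScopeDemo {
    /** Runs the task and reports which handler, if any, saw the failure. */
    public static String runGuarded(Runnable task) {
        try {
            try {
                task.run();
                return "completed";
            } catch (Exception e) {
                // Handles checked and unchecked exceptions, but not java.lang.Error.
                return "caught by catch (Exception)";
            }
        } catch (Throwable t) {
            // Stand-in for an outer handler that swallows everything without logging.
            return "escaped to outer catch (Throwable)";
        }
    }

    public static void main(String[] args) {
        // A RuntimeException is an Exception, so the inner handler sees it.
        System.out.println(runGuarded(() -> {
            throw new IllegalStateException("io failed");
        }));
        // An Error is a Throwable but not an Exception, so it skips the inner handler.
        System.out.println(runGuarded(() -> {
            throw new NoClassDefFoundError("some.Missing");
        }));
    }
}
```

In this sketch the second call never reaches the inner handler, so any logging placed there would not run, which matches the symptom of a thread going quiet without an error message.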

There is definitely a bug here, a failure we could be handling better. I filed https://github.com/jitsi/jibri/issues/179 but we haven’t been seeing this in production and I’m in the middle of some other work, so I’m not sure when I’ll be able to spend time on it. One thing to look into would be whether you can figure out what’s going on with the page crashing. If you can sort that out then you shouldn’t hit this bug.


#7

I just quickly opened https://github.com/jitsi/jibri/pull/180. Will see if we can get that in and a release done, maybe that will give us some more information.


#8

Hi @bbaldino,

I also faced this issue; it occurred when I was running Jibri on a t2.micro instance on EC2. When I upgraded the instance to t3.xlarge, the error didn’t occur and the recording was successful with audio.

Also, if I set up a local VM with the same config as the t3.xlarge, I don’t get audio, but it works fine on AWS. Any idea why?


#9

Sorry, not sure what that would be off the top of my head.