[jitsi-dev] Implementing a "Speakerphone" Button


#1

Hello,

My company is going to try replacing nearly all of our phones with softphones, and we've chosen Jitsi for this. The only feature we still need is a speakerphone button. The idea is that a call could initially be answered via headset, later transferred to a mic/speaker setup by pushing this button, and possibly moved back to the headset again. The behavior would mimic most business handsets.

I've started implementing this. I've added the speakerphone button to the GUI, along with two additional combo boxes in the audio device setup (Speakerphone Audio In and Speakerphone Audio Out). Setting these options and saving them to the configuration file also works. The trouble I'm having is getting the devices to actually switch when the button is selected.

I'm having a hard time getting my head around the architecture of the media layers. Initially I thought the proper place to switch both the input and output device would be MediaAwareCall, but I found that the MediaAwareCallPeer instances only contain data regarding input devices. I did go ahead and try setting the capture device like this:

    for (MediaAwareCallPeer<?,?,?> peer : getCallPeersVector())
    {
        MediaStream audio = peer.getMediaHandler().getStream(MediaType.AUDIO);
        if (speaker)
        {
            // TODO: SPEAKERPHONE FIX THIS!
            audio.stop();
            audio.setDevice(mediaService.getSpeakerCaptureDevice());
            audio.start();
        }
        else
        {
            // TODO: SPEAKERPHONE set it back to the regular audio device
            audio.stop();
            audio.setDevice(getDefaultDevice(MediaType.AUDIO));
            audio.start();
        }
    }

However, all call audio goes silent when the speakerphone button is toggled. I also noticed that the level indicator for the output device hangs until the speakerphone button is toggled off, at which point it appears to resume normal operation; however, there's still no sound.

I have also noticed that PortAudio gets registered as a custom renderer for JMF in the DeviceConfiguration code. Is this the only place where the output configuration is ever manipulated? Is it necessary to create a new PortAudio instance to change the output device?
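For reference, the intended toggle can be modeled outside Jitsi. This is only a minimal sketch under stated assumptions: AudioStreamModel and the four device names are hypothetical stand-ins, not Jitsi's MediaStream API. It shows the goal of switching capture and playback together for every peer while preserving each stream's running state.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of a speakerphone toggle that switches capture and playback together. */
public class SpeakerphoneToggle {
    // Hypothetical device names, used only for illustration.
    static final String HEADSET_IN = "headset-mic";
    static final String HEADSET_OUT = "headset";
    static final String SPEAKER_IN = "speaker-mic";
    static final String SPEAKER_OUT = "speakers";

    /** Hypothetical stand-in for one peer's audio stream (not Jitsi's MediaStream). */
    static class AudioStreamModel {
        String captureDevice;
        String playbackDevice;
        boolean started = true;

        AudioStreamModel(String capture, String playback) {
            captureDevice = capture;
            playbackDevice = playback;
        }

        /** Swaps both devices and restores the previous running state. */
        void switchDevices(String capture, String playback) {
            boolean wasStarted = started;
            started = false;            // stop before touching the devices
            captureDevice = capture;
            playbackDevice = playback;
            started = wasStarted;       // resume only if it was running before
        }
    }

    private final List<AudioStreamModel> streams = new ArrayList<>();
    private boolean speaker;

    void addStream(AudioStreamModel stream) {
        streams.add(stream);
    }

    boolean isSpeaker() {
        return speaker;
    }

    /** Flips the speakerphone state and applies it to every peer's stream. */
    void toggle() {
        speaker = !speaker;
        String in = speaker ? SPEAKER_IN : HEADSET_IN;
        String out = speaker ? SPEAKER_OUT : HEADSET_OUT;
        for (AudioStreamModel stream : streams) {
            stream.switchDevices(in, out);
        }
    }

    public static void main(String[] args) {
        SpeakerphoneToggle call = new SpeakerphoneToggle();
        AudioStreamModel peer = new AudioStreamModel(HEADSET_IN, HEADSET_OUT);
        call.addStream(peer);
        call.toggle();
        System.out.println(peer.captureDevice + " / " + peer.playbackDevice);
    }
}
```

In the real code base the playback side would have to be switched through whatever component owns the renderer, which is exactly the part that is unclear here.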

Anyway, I'm at a loss as to how to proceed and I'm afraid I'm going to walk all over the architecture if I hack any further without guidance (I may have already...). So, is there anyone out there that could help me understand what's going on with the media devices and where would be the proper place to implement this kind of functionality?

Thanks,
Aaron Stover


#2

Is there any additional info I could provide to help anyone answer this? I apologize if my questions are too vague. I've continued to play with the audio streams but am still confused about how exactly they work. I've read the page on the opensource architecture website, but it doesn't discuss incoming streams. I'm going to keep hammering at this, but if anyone could provide even some high-level architectural guidance on streams and on what conceptually needs to happen to get speakerphone-type behavior, it would be much appreciated.

A very specific question I have right now: why does the output audio stop when I close or stop the input audio stream?

Thanks,
Aaron


#3

Hey Aaron,

The easiest way to implement this would be by using the same code that
currently handles the device selectors in the audio and video
configuration tabs. Currently, however, it doesn't handle dynamic device
changes so you'd have to implement this yourself.
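One possible shape for that missing dynamic handling, sketched under assumptions: the class DeviceSelection and the property name "captureDevice" below are hypothetical and are not taken from the Jitsi code base. The sketch uses the standard java.beans.PropertyChangeSupport so that anything holding an active stream can react when the selected device changes.

```java
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;

/** Sketch: a device selection that notifies listeners when it changes. */
public class DeviceSelection {
    // Hypothetical property name, chosen for this example only.
    static final String PROP_CAPTURE = "captureDevice";

    private final PropertyChangeSupport changes = new PropertyChangeSupport(this);
    private String captureDevice;

    DeviceSelection(String initialDevice) {
        this.captureDevice = initialDevice;
    }

    void addListener(PropertyChangeListener listener) {
        changes.addPropertyChangeListener(listener);
    }

    String getCaptureDevice() {
        return captureDevice;
    }

    /** Updates the selection; listeners fire only if the value really changed. */
    void setCaptureDevice(String newDevice) {
        String oldDevice = captureDevice;
        captureDevice = newDevice;
        changes.firePropertyChange(PROP_CAPTURE, oldDevice, newDevice);
    }

    public static void main(String[] args) {
        DeviceSelection selection = new DeviceSelection("headset-mic");
        // A running call would switch its stream's device in this callback.
        selection.addListener(event ->
            System.out.println(event.getOldValue() + " -> " + event.getNewValue()));
        selection.setCaptureDevice("speaker-mic");
    }
}
```

A speakerphone button would then only need to call the setter; the streams themselves would do the switching in their listeners.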

Hope this helps,
Emil

--
Emil Ivov, Ph.D.
Project Lead, Jitsi
67000 Strasbourg, France
emcho@jitsi.org
http://jitsi.org
PHONE: +33.1.77.62.43.30
FAX: +33.1.77.62.47.31


#4

Hi Emil,

Thanks for the feedback; it has helped. I had assumed that setDevice() was already written to handle dynamic device changes. The following comments and code from MediaStreamImpl.setDevice(MediaDevice) had me confused:

             if(oldValue != null)
             {
                 // transfer the rendering session objects (JMF player, ...)
                 // to the new MediaDeviceSession. So we do not have a
                 // reinitialization of the receive stream if we just change our
                 // device (switch from camera to desktop streaming).
                 deviceSession = abstractMediaDevice.createSession(oldValue);
             }

I'll take a look at what happens when this operation is attempted with audio capture devices.

Again, thanks for the info.

Aaron


#5

I've continued to work on this, and after a crash course in JMF I think things are starting to get clearer.

I found that in the nested class AudioMixerMediaDevice$AudioMixerMediaDeviceSession, the constructor is written as follows:

        /**
         * Initializes a new <tt>AudioMixingMediaDeviceSession</tt> which is to
         * represent the <tt>MediaDeviceSession</tt> of this <tt>AudioMixer</tt>
         * with its <tt>MediaDevice</tt>
         */
        public AudioMixerMediaDeviceSession()
        {
            super(AudioMixerMediaDevice.this);
        }

The code does not seem to match the comment, and it doesn't make much sense to me, since the session ends up referring back to the mixer device that created it (a circular reference). I changed it to pass the actual media device to the superclass constructor rather than the mixing device:

        /**
         * Initializes a new <tt>AudioMixingMediaDeviceSession</tt> which is to
         * represent the <tt>MediaDeviceSession</tt> of this <tt>AudioMixer</tt>
         * with its <tt>MediaDevice</tt>
         */
        public AudioMixerMediaDeviceSession()
        {
            super(AudioMixerMediaDevice.this.device);
        }

Now the behavior I'm seeing from AudioStreamImpl.setDevice() seems much more reasonable: playback of the receive stream no longer stops.

Does this change make sense to others?
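To make the difference concrete, here is a stripped-down model of the two constructor variants. It is a sketch only: Device, Session, MixerDevice and MixerSession are simplified hypothetical classes, not the real Jitsi types, and the boolean flag exists just to show both variants side by side.

```java
/** Sketch: what the session's device is under each constructor variant. */
public class MixerSessionSketch {
    /** Simplified stand-in for a media device. */
    static class Device {
        final String name;

        Device(String name) {
            this.name = name;
        }
    }

    /** Simplified stand-in for a device session; remembers its device. */
    static class Session {
        private final Device device;

        Session(Device device) {
            this.device = device;
        }

        Device getDevice() {
            return device;
        }
    }

    /** Simplified mixer that wraps a real capture device. */
    static class MixerDevice extends Device {
        final Device device; // the wrapped (actual) device

        MixerDevice(Device wrapped) {
            super("mixer(" + wrapped.name + ")");
            this.device = wrapped;
        }

        /** Inner session, as in the nested-class layout described above. */
        class MixerSession extends Session {
            MixerSession(boolean passWrappedDevice) {
                // false: original variant, the session points back at the mixer
                // true:  changed variant, the session points at the wrapped device
                super(passWrappedDevice ? MixerDevice.this.device : MixerDevice.this);
            }
        }
    }

    public static void main(String[] args) {
        MixerDevice mixer = new MixerDevice(new Device("headset-mic"));
        System.out.println(mixer.new MixerSession(false).getDevice().name);
        System.out.println(mixer.new MixerSession(true).getDevice().name);
    }
}
```

Whether the original variant is intentional (for example, so that the session captures through the mixer) is something only someone familiar with the mixer design can confirm.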

Thanks,
Aaron


#6

This software looks great, but the first thing I noticed was that there is
no way to switch to a speakerphone. Is this on the horizon somewhere?

Thanks

-Dave-

--
Dave Fouts
Maverick Computer Enterprises
970-639-9840


#7

Hey David,

I am not sure I follow. Jitsi allows you to play audio on any of your
devices.

Isn't this what you mean by "speakerphone"?

Emil

--
https://jitsi.org