Phone call integration (SIP) with OpenVidu

OpenVidu provides a powerful infrastructure for holding virtual meetings over WebRTC. But a complete user experience should not leave out users who cannot connect through WebRTC because of connectivity or device-support issues, so it is necessary to support and integrate legacy but universal media such as telephone calls.

Context

InterviewOpps allows its clients to conduct on-demand and live video interviews. On-demand video interviews let candidates record video/audio answers to pre-recorded questions, while live interviews let people talk in a video-conference format. After the call, the interviewer and their colleagues are given tools for collaboratively rating the interview.

InterviewOpps uses OpenVidu technology for both On-Demand and Live Interviews. While On-Demand Interviews allow candidates to take their time, Live Interviews have to happen synchronously, so any technical difficulty on either the interviewer or the interviewee side wastes the time of at least two people. And with WebRTC these kinds of difficulties are not uncommon: users might be behind a firewall, have a slow Internet connection, or have a microphone that is not working or not configured. InterviewOpps found that in these situations users are quick to abandon the platform and switch to a phone call or FaceTime.

The solution InterviewOpps chose was to let users fall back to a plain old telephone for audio, so that they do not have to leave the platform. The way it works is that users dial a phone number, enter an interview code followed by the hash (#) key, and are then in the interview. If they also have video coming from another device, that video is combined with the audio coming through the phone system.
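The dial-in flow described above (digits, then the hash key) can be sketched as a small key collector. This is only an illustration, not the InterviewOpps implementation: the class name, the lookup table of codes, the maximum code length and the use of "*" to restart the entry are all assumptions made for the example.

```python
from typing import Dict, List, Optional


class InterviewCodeCollector:
    """Accumulates keypad presses until the caller presses '#'."""

    def __init__(self, max_digits: int = 12) -> None:
        self.max_digits = max_digits  # hypothetical upper bound on code length
        self._digits: List[str] = []

    def feed(self, key: str) -> Optional[str]:
        """Consume one key press; return the code once '#' terminates it."""
        if key == "#":
            code = "".join(self._digits)
            self._digits.clear()
            return code or None  # a bare '#' carries no code
        if key == "*":
            self._digits.clear()  # assumption: '*' restarts the entry
            return None
        if len(key) == 1 and key.isdigit() and len(self._digits) < self.max_digits:
            self._digits.append(key)
        return None


def resolve_interview(keys: str, interviews: Dict[str, str]) -> Optional[str]:
    """Map a key sequence to a session id using a hypothetical code table."""
    collector = InterviewCodeCollector()
    for key in keys:
        code = collector.feed(key)
        if code is not None:
            return interviews.get(code)
    return None
```

For example, `resolve_interview("4821#", {"4821": "session-a"})` returns `"session-a"`, while an unknown code or a bare "#" returns `None`, at which point a real system would typically prompt the caller again.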

The Project

The InterviewOpps project is to create an interview platform based on OpenVidu, extended to allow telephone access. Within this project, we at Naeva Tec have built an application component that integrates phone calls into an OpenVidu session, covering both incoming calls and calls made from the application.
Incoming calls are handled by associating the session with a SIP address that maps to a telephone number (or extension, depending on the telephone provider) that any telephone can call.
In turn, outgoing calls to any phone number are made through the API that we have developed at Naeva Tec.

SIP: integrating with telephony networks

The connection with the basic telephony network is made through a SIP access provider. Both the development and the final client deployment use Twilio Programmable Voice, but any SIP provider can be used with the solution. In fact, it is advisable to implement this solution with a PBX (Private Branch eXchange: a company's telephone network, which supports different communication channels such as VoIP, ISDN and analog lines). Although it is feasible to implement this functionality within OpenVidu, it is simpler and more reasonable to integrate with a PBX.

Our solution

We implemented our solution as an integration component. This component exposes a REST API that manages SIP support for an OpenVidu session, makes outgoing calls and establishes telephone calls in a way similar to how OpenVidu itself operates, plus a webhook through which the application receives events (incoming calls, calls hung up from the phone, etc.). Once a call is established, the audio coming from the telephone (and the video, if the phone supports it) becomes a new publisher in the session, to which the rest of the participants can subscribe. In the other direction, the media of the other publishers in the session is redirected to the phone. The audio streams are mixed into a single stream. If the phone supports video, the REST API lets the application decide which video publisher is forwarded to the phone.
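To make the webhook side more concrete, here is a minimal sketch of how an application might dispatch the events it receives. The text does not specify the payload format, so the JSON shape, the event names (`incomingCall`, `callHangup`) and the field names (`callId`, `sessionId`) are purely hypothetical.

```python
import json
from typing import Callable, Dict

Handler = Callable[[dict], str]


def on_incoming_call(event: dict) -> str:
    # Hypothetical reaction: let the caller into the requested session.
    return f"accept call {event['callId']} into session {event['sessionId']}"


def on_hangup(event: dict) -> str:
    # Hypothetical reaction: the phone's publisher leaves the session.
    return f"remove phone publisher for call {event['callId']}"


# Event names are assumptions, not the component's real vocabulary.
HANDLERS: Dict[str, Handler] = {
    "incomingCall": on_incoming_call,
    "callHangup": on_hangup,
}


def handle_webhook(body: str) -> str:
    """Parse a webhook body and route it to the matching handler."""
    event = json.loads(body)
    handler = HANDLERS.get(event.get("event"))
    if handler is None:
        return "ignored"  # unknown events are skipped rather than rejected
    return handler(event)
```

A table of handlers keeps the application code independent of how many event types the component eventually emits. Unknown events are simply ignored here.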

In this component we use the same publisher / subscriber model that OpenVidu proposes. By default the phone subscribes to all the other publishers in the session, but the REST API gives the application the flexibility to decide which audio streams the phone subscribes to (all subscribed audio streams are mixed before being sent to the phone) and which single video stream (no composition is performed) is forwarded to the phone if it supports video.
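The selection rules in this paragraph can be sketched as a small data model. The class and field names are illustrative only, and one detail is an assumption: the text does not say which video is forwarded when the application has not chosen one, so this sketch forwards none.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Publisher:
    stream_id: str
    has_audio: bool = True
    has_video: bool = True


@dataclass
class PhoneSubscription:
    phone_stream_id: str
    audio_ids: Optional[Set[str]] = None  # None means "all other publishers"
    video_id: Optional[str] = None        # at most one video is forwarded

    def _others(self, publishers: List[Publisher]) -> List[Publisher]:
        # The phone never subscribes to its own published stream.
        return [p for p in publishers if p.stream_id != self.phone_stream_id]

    def audio_to_mix(self, publishers: List[Publisher]) -> List[str]:
        """Stream ids whose audio is mixed and sent to the phone."""
        others = [p for p in self._others(publishers) if p.has_audio]
        if self.audio_ids is None:
            return [p.stream_id for p in others]
        return [p.stream_id for p in others if p.stream_id in self.audio_ids]

    def video_to_forward(self, publishers: List[Publisher],
                         phone_supports_video: bool) -> Optional[str]:
        """The single video stream forwarded to the phone, if any."""
        if not phone_supports_video or self.video_id is None:
            return None  # assumption: nothing is forwarded until chosen
        available = {p.stream_id for p in self._others(publishers) if p.has_video}
        return self.video_id if self.video_id in available else None
```

With publishers `alice`, `bob` and the phone itself, a fresh `PhoneSubscription` mixes the audio of `alice` and `bob`. Setting `audio_ids = {"bob"}` narrows the mix to `bob` alone.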

Internally, this component integrates with OpenVidu to subscribe to media and to publish the phone's audio to the OpenVidu session. Kurento is used for the integration with the SIP telephony network (it can be the same Kurento server that OpenVidu uses). Kurento provides almost all the elements needed to integrate with a SIP service: RTP protocol support, audio mixing and video switching (in case the SIP terminal supports video).
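As a reminder of what audio mixing means at the sample level, the sketch below sums 16-bit PCM frames and clamps the result. It is a conceptual illustration only and says nothing about how Kurento implements its mixer. It assumes all frames are already decoded, share the same sample rate and contain the same number of samples.

```python
from array import array
from typing import Sequence

INT16_MIN, INT16_MAX = -32768, 32767


def mix_frames(frames: Sequence[array]) -> array:
    """Sum equal-length 16-bit PCM frames sample by sample, clamping to int16."""
    if not frames:
        return array("h")
    length = len(frames[0])
    if any(len(f) != length for f in frames):
        raise ValueError("all frames must have the same number of samples")
    mixed = array("h", [0] * length)
    for i in range(length):
        total = sum(f[i] for f in frames)
        # Saturate instead of wrapping around, which would sound like a click.
        mixed[i] = max(INT16_MIN, min(INT16_MAX, total))
    return mixed
```

Mixing `[1000, -2000, 30000]` with `[500, -500, 10000]` gives `[1500, -2500, 32767]`: the last sample would overflow, so it is clamped to the int16 maximum.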

Conclusion and lessons learned

Kurento and OpenVidu provide solid RTP / SRTP support, which is the basis for integrating with a SIP network. In practice, however, SIP telephony networks use RTP / SRTP in ways that Kurento's RTP implementation is poorly adapted to. Specifically, we found the following shortcomings in the RTP support available in Kurento:

  • Lack of support for the interim SDP responses that the SIP protocol carries in 183 responses
  • SDP offers and answers from the SIP network usually do not include the SSRCs associated with the media flows, whereas Kurento needs them to identify a media session
  • They leave the SSRCs out precisely because, when media switching happens in the PBX, the common practice is to change the SSRCs and even the time base as well. Kurento does not support this type of switching.
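The SSRC-related points above can be illustrated with a short sketch. It is not code from the Kurento component: it only shows how to check whether an SDP body declares SSRCs through `a=ssrc:` attributes (RFC 5576), and how to notice an SSRC change by reading the fixed RTP header (RFC 3550). The function and class names are made up for the example, and media sections are keyed by media type for simplicity.

```python
import struct
from typing import Dict, List, Optional


def ssrcs_by_media(sdp: str) -> Dict[str, List[int]]:
    """Return the SSRCs declared via 'a=ssrc:' for each media type."""
    result: Dict[str, List[int]] = {}
    current: Optional[str] = None
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("m="):
            current = line[2:].split()[0]  # media type: audio, video, ...
            result.setdefault(current, [])
        elif line.startswith("a=ssrc:") and current is not None:
            ssrc = int(line[len("a=ssrc:"):].split()[0])
            if ssrc not in result[current]:
                result[current].append(ssrc)
    return result


def rtp_ssrc(packet: bytes) -> int:
    """Extract the SSRC from the fixed 12-byte RTP header."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed RTP header")
    if packet[0] >> 6 != 2:
        raise ValueError("not an RTP version 2 packet")
    return struct.unpack_from("!I", packet, 8)[0]  # bytes 8..11, network order


class SsrcTracker:
    """Learns the SSRC from the first packet and reports later changes."""

    def __init__(self) -> None:
        self.current: Optional[int] = None

    def observe(self, packet: bytes) -> bool:
        """Return True when this packet's SSRC differs from the previous one."""
        ssrc = rtp_ssrc(packet)
        changed = self.current is not None and ssrc != self.current
        self.current = ssrc
        return changed
```

An SDP answer from a typical SIP trunk yields an empty SSRC list for its audio section, so the receiver has to learn the SSRC from the first RTP packet. When `observe` returns `True`, the receiver should treat it as a source switch and resynchronise, since the sequence numbers and timestamps may jump as well.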

To cover these essential functionalities we have created a Kurento component, the SipRtpEndPoint, and contributed it to the Kurento community. It is available as a PR in the Kurento GitHub repository, where it is currently under review, and we hope to see it as a Kurento module soon.