How to create a WebRTC video call app with Node.js

A basic understanding of Node.js and JavaScript will be helpful.

Pusher is perfect for instantaneously distributing messages amongst people and devices. This is exactly why Pusher is a great choice for signaling in WebRTC, the act of introducing two devices in realtime so they can make their own peer-to-peer connection.

WebRTC (Web Real-Time Communications) is a technology which enables web applications and sites to capture and optionally stream audio and/or video media, and to exchange arbitrary data between browsers without requiring an intermediary. The set of standards that comprises WebRTC makes it possible to share data and perform teleconferencing peer-to-peer, without requiring that the user install plug-ins or any other third-party software.

In this tutorial, we will build a video call app that allows you to make calls, accept and also reject calls.
Making your own video call application using WebRTC is simple thanks to the Pusher API.

[Image: webrtc-video-call-preview]

Prerequisites

A basic understanding of Node.js and client-side JavaScript is required for this tutorial.

Setting up a Pusher account and app

Pusher is a hosted service that makes it super-easy to add realtime data and functionality to web and mobile applications.

Pusher acts as a realtime layer between your servers and clients. It maintains persistent connections to the clients, over WebSocket where possible and falling back to HTTP-based connectivity otherwise, so that as soon as your servers have new data to push to the clients, they can do so via Pusher.

If you do not already have one, head over to Pusher and create a free account.
We will register a new app on the dashboard. The only compulsory options are the app name and cluster. A cluster represents the physical location of the Pusher server that will handle your app’s requests. Also, copy out your App ID, Key, and Secret from the App Keys section, as we will need them later on.

Setting up the project

Let’s create a new node project by running:

    #create directory
    mkdir pusher-webrtc
    #move into the new directory
    cd pusher-webrtc
    #initialize a node project
    npm init -y

Next, let’s move ahead by installing the required libraries:

    npm install body-parser express pusher --save

In the command above, we have installed three libraries which are:

Express: fast, unopinionated, minimalist web framework for Node.js.
Body-parser: parse incoming request bodies in a middleware before your handlers, available under the req.body property.
Pusher: the official Node.js library for Pusher.

Setting up the entry point

Create a file called index.js in the root folder and paste in:

    const express = require('express');
    const bodyParser = require('body-parser');
    const Pusher = require('pusher');
    const app = express();
    
    
    // Body parser middleware
    app.use(bodyParser.json());
    app.use(bodyParser.urlencoded({ extended: true }));
    
    // Create an instance of Pusher
    const pusher = new Pusher({
        appId: 'XXX-API-ID',
        key: 'XXX-API-KEY',
        secret: 'XXX-API-SECRET',
        cluster: 'XXX-API-CLUSTER',
        encrypted: true
    });
    
    app.get('/', (req, res) => {
        return res.sendFile(__dirname + '/index.html');
    });
    
    //listen on the app
    app.listen(3000, () => {
        return console.log('Server is up on 3000')
    });

In the code block above, we have added the required libraries, used the body-parser middleware, and started an instance of Pusher, passing in the app id, key, secret, and cluster.

Next, we defined the base route, in which we serve an index.html file (which we will create later on).

Finally, we set the app to listen on port 3000.
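
Hardcoding the credentials in index.js is fine for a demo, but you may prefer to keep them out of source control. Here is a minimal sketch, assuming the credentials are supplied through environment variables. The loadPusherConfig helper and the variable names (PUSHER_APP_ID and so on) are assumptions made for illustration and are not part of the Pusher library:

```javascript
// Hypothetical helper: builds the options object for `new Pusher(...)`
// from environment variables instead of hardcoded strings.
// The variable names below are assumptions, not a Pusher convention.
function loadPusherConfig(env) {
  const required = {
    appId: "PUSHER_APP_ID",
    key: "PUSHER_APP_KEY",
    secret: "PUSHER_APP_SECRET",
    cluster: "PUSHER_APP_CLUSTER"
  };
  // Same `encrypted` option as the instance created in index.js
  const config = { encrypted: true };
  const missing = [];
  for (const [option, variable] of Object.entries(required)) {
    if (env[variable]) {
      config[option] = env[variable];
    } else {
      missing.push(variable);
    }
  }
  // Fail early with a readable message rather than at the first request
  if (missing.length > 0) {
    throw new Error("Missing environment variables: " + missing.join(", "));
  }
  return config;
}
```

In index.js, this could then be used as new Pusher(loadPusherConfig(process.env)).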

Setting up the authentication route

Since we are building a video call app, it will be nice to know who’s online at the moment. Pusher’s presence channels keep a record of the members online, so we will use presence channels as opposed to the usual public channels.

Pusher’s presence channel subscriptions must be authenticated. Hence, we will have an authentication route. Add the route below to your index.js file:

    // get authentication for the channel
    app.post("/pusher/auth", (req, res) => {
      const socketId = req.body.socket_id;
      const channel = req.body.channel_name;
      var presenceData = {
        user_id:
          Math.random()
            .toString(36)
            .slice(2) + Date.now()
      };
      const auth = pusher.authenticate(socketId, channel, presenceData);
      res.send(auth);
    });

In the code above, we defined a new route at /pusher/auth which calls the usual pusher.authenticate method with one additional parameter: the details of the user trying to access the channel. This parameter is expected to be an object with two keys, user_id and user_info, of which user_info is optional.

Note: In the example above, I am just passing a random unique id to each user. In a real-world application, you might need to pass in the user id from the database or other authentication methods as used in your app.
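
As a rough sketch of what that could look like, the helper below builds the presence data from a user record. The buildPresenceData function and the shape of the user object (id and name) are hypothetical; your own user store will differ:

```javascript
// Hypothetical helper: builds the presence data passed to pusher.authenticate.
// `user` is assumed to be a record from your own user store.
function buildPresenceData(user) {
  if (!user || user.id === undefined || user.id === null) {
    throw new Error("A user id is required for presence channels");
  }
  return {
    // The id is coerced to a string so it can be compared consistently
    user_id: String(user.id),
    // user_info is optional and is shared with the other channel members
    user_info: { name: user.name || "Anonymous" }
  };
}

// Example with a made-up user record
const presenceData = buildPresenceData({ id: 42, name: "Ada" });
console.log(presenceData.user_id); // "42"
```

The result would take the place of the presenceData object in the /pusher/auth route.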

Creating the `index.html` file

Remember while we were creating the entry point, we served a file called index.html in the base route, which we were yet to create? Next, we will create a new file called index.html in the root folder, and add:

    <!DOCTYPE html>
    <html>
    
    <head>
        <title>WebRTC Audio/Video-Chat</title>
    </head>
    
    <body>
        <div id="app">
            <span id="myid"> </span>
            <video id="selfview" autoplay muted></video>
            <video id="remoteview" autoplay></video>
            <button id="endCall" style="display: none;" onclick="endCurrentCall()">End Call </button>
            <div id="list">
                <ul id="users">
    
                </ul>
            </div>
        </div>
    </body>
    
    </html>

In the code block above, we have a basic HTML setup with one span element which holds the ID of the current user, two video elements for the caller and the receiver, a button to end the current call with an onclick attribute of endCurrentCall() (which we will define soon), and finally a ul element which holds the list of all users.

Displaying online users

To make video calls, we need to be able to see online users, which was the reason we opted for presence channels. Just before the body closing tag, paste in:

    <script src="https://js.pusher.com/4.1/pusher.min.js"></script>
    <script>
    var pusher = new Pusher("XXX-API-KEY", {
      cluster: "XXX-API-CLUSTER",
      encrypted: true,
      authEndpoint: "pusher/auth"
    });
    var usersOnline,
      id,
      users = [],
      sessionDesc,
      currentcaller,
      room,
      caller,
      localUserMedia;
    const channel = pusher.subscribe("presence-videocall");
    
    channel.bind("pusher:subscription_succeeded", members => {
      //set the member count
      usersOnline = members.count;
      id = channel.members.me.id;
      document.getElementById("myid").innerHTML = ` My caller id is : ` + id;
      members.each(member => {
        if (member.id != channel.members.me.id) {
          users.push(member.id);
        }
      });
    
      render();
    });
    
    channel.bind("pusher:member_added", member => {
      users.push(member.id);
      render();
    });
    
    channel.bind("pusher:member_removed", member => {
      // remove the member from the list, if present
      var index = users.indexOf(member.id);
      if (index !== -1) {
        users.splice(index, 1);
      }
      if (member.id == room) {
        endCall();
      }
      render();
    });
    
    function render() {
      var list = "";
      users.forEach(function(user) {
        list +=
          `<li>` +
          user +
          ` <input type="button" style="float:right;"  value="Call" onclick="callUser('` +
          user +
          `')" id="makeCall" /></li>`;
      });
      document.getElementById("users").innerHTML = list;
    }
    </script>

Here, we have required the official client library for Pusher. Next, we start a new Pusher instance, passing in our app key, and also the authentication route we had created earlier.

We go on to define initial variables which we will use in the code:

usersOnline: the count of users online
id: the ID of the current user
users: an array that holds the IDs of all other online users
sessionDesc: the SDP offer being sent. SDP refers to the session description of the peer connection provided by WebRTC. (You will see more of this as we move on)
room: the identifier of the call currently in progress.
caller: the peer connection object of the person calling/receiving a call.
localUserMedia: a reference to the local audio and video stream being transmitted from the caller.

Next, we subscribe to a presence channel called presence-videocall. Subscribing triggers a request to our authentication route, and once it succeeds Pusher fires the pusher:subscription_succeeded event with a members object. We bind to that event to get the user count and the current user’s ID, and to append every member apart from the current user to the users array. We then call the render function, defined further down, which displays the online users.

Also, we bind to two more events which are: pusher:member_added and pusher:member_removed in which we add new members and delete logged out members from the array respectively.
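
The list bookkeeping in these handlers can also be written as two small pure functions, which makes the edge cases easier to see. This is only a sketch: addMember and removeMember are hypothetical helpers, not part of the Pusher client library:

```javascript
// Returns a new array with `memberId` added, unless it is already present
// or belongs to the current user (`myId`).
function addMember(users, memberId, myId) {
  if (memberId === myId || users.includes(memberId)) {
    return users;
  }
  return users.concat(memberId);
}

// Returns a new array without `memberId`. An id that is not in the list
// leaves the list unchanged.
function removeMember(users, memberId) {
  return users.filter(function(id) {
    return id !== memberId;
  });
}
```

The pusher:member_added handler could then do users = addMember(users, member.id, id), and the pusher:member_removed handler users = removeMember(users, member.id).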

Finally, we define the render function which loops through all users and then appends them to the ul element as li tags with call buttons which have an onclick attribute of callUser which we will create soon.
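
The random IDs used in this tutorial are safe to splice into markup, but IDs that come from a database might contain quotes or angle brackets. Below is a hedged sketch of a variant that escapes each ID and stores it in a data attribute rather than inside an onclick string. Both escapeHtml and buildUserList are hypothetical helpers:

```javascript
// Escapes characters that have a special meaning in HTML.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Builds the <li> markup for the online users. The id is stored in a
// data attribute instead of being spliced into an onclick string.
function buildUserList(users) {
  return users
    .map(function(user) {
      var safe = escapeHtml(user);
      return (
        "<li>" +
        safe +
        ' <input type="button" style="float:right;" value="Call" data-user="' +
        safe +
        '" /></li>'
      );
    })
    .join("");
}
```

With this approach, render would assign buildUserList(users) to the innerHTML of the ul element, and a single click listener on that element could read event.target.dataset.user and pass it to callUser.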

Integrating WebRTC into the app

Now we are all set, we can use Pusher to take care of signaling within the video call. First, let’s get the video call started. Paste the following after the render function in the index.html file:

    //To iron out browser implementation anomalies like prefixes
    GetRTCPeerConnection();
    GetRTCSessionDescription();
    GetRTCIceCandidate();
    //prepare the caller to use peerconnection
    prepareCaller();
    function GetRTCIceCandidate() {
      window.RTCIceCandidate =
        window.RTCIceCandidate ||
        window.webkitRTCIceCandidate ||
        window.mozRTCIceCandidate ||
        window.msRTCIceCandidate;
    
      return window.RTCIceCandidate;
    }
    
    function GetRTCPeerConnection() {
      window.RTCPeerConnection =
        window.RTCPeerConnection ||
        window.webkitRTCPeerConnection ||
        window.mozRTCPeerConnection ||
        window.msRTCPeerConnection;
      return window.RTCPeerConnection;
    }
    
    function GetRTCSessionDescription() {
      window.RTCSessionDescription =
        window.RTCSessionDescription ||
        window.webkitRTCSessionDescription ||
        window.mozRTCSessionDescription ||
        window.msRTCSessionDescription;
      return window.RTCSessionDescription;
    }
    function prepareCaller() {
      //Initializing a peer connection
      caller = new window.RTCPeerConnection();
      //Listen for ICE Candidates and send them to remote peers
      caller.onicecandidate = function(evt) {
        if (!evt.candidate) return;
        console.log("onicecandidate called");
        onIceCandidate(caller, evt);
      };
      //onaddstream handler to receive remote feed and show in remoteview video element
      caller.onaddstream = function(evt) {
        console.log("onaddstream called");
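        var remoteView = document.getElementById("remoteview");
        if ("srcObject" in remoteView) {
          // Modern browsers attach a MediaStream directly
          remoteView.srcObject = evt.stream;
        } else {
          // Older browsers need an object URL for the stream
          remoteView.src = window.URL.createObjectURL(evt.stream);
        }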
      };
    }

In the code block above, we called functions which we defined just after calling them. The first three functions, GetRTCPeerConnection(), GetRTCSessionDescription() and GetRTCIceCandidate(), iron out browser implementation anomalies for RTCPeerConnection, RTCSessionDescription and RTCIceCandidate, such as the vendor prefixes used by WebKit or Mozilla Gecko browsers. You may wonder what these interfaces are.

The RTCPeerConnection interface represents a WebRTC connection between the local computer and a remote peer. It provides methods to connect to a remote peer, maintain and monitor the connection, and close the connection once it’s no longer needed.

The RTCSessionDescription interface describes one end of a connection or potential connection and how it’s configured. Each RTCSessionDescription comprises a description type indicating which part of the offer/answer negotiation process it describes and of the SDP descriptor of the session.

The RTCIceCandidate interface is part of the WebRTC API. It represents a candidate Interactive Connectivity Establishment (ICE) configuration which may be used to establish an RTCPeerConnection.
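
The three Get... functions all follow the same pattern: try the standard name first, then each vendor prefix. As a sketch, that pattern could be folded into one helper. resolvePrefixed is a hypothetical name, and the prefix list simply mirrors the ones used in the tutorial code:

```javascript
// Returns the first implementation of `name` found on `root`, trying the
// unprefixed name first and then the webkit, moz and ms prefixes.
function resolvePrefixed(root, name) {
  var prefixes = ["", "webkit", "moz", "ms"];
  for (var i = 0; i < prefixes.length; i++) {
    var candidate = root[prefixes[i] + name];
    if (candidate) {
      return candidate;
    }
  }
  // No implementation is available under any known name
  return undefined;
}
```

In the browser, it would be used as window.RTCPeerConnection = resolvePrefixed(window, "RTCPeerConnection"), and likewise for the other two interfaces.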

Remember we also called the prepareCaller function? This function assigns a new RTCPeerConnection instance to the predefined caller variable and attaches handlers for its onicecandidate and onaddstream events. When an ICE candidate is found, we call the onIceCandidate function, which we will define soon. When a new stream is added, we attach it to the remoteview video element, i.e. this is the second party’s video.
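
prepareCaller creates the RTCPeerConnection without any configuration, which tends to be enough when both peers are on the same network. Across networks, peers usually need a STUN (and sometimes TURN) server to find a route to each other. A minimal sketch of building that configuration follows. buildPeerConfig is a hypothetical helper, and the STUN URL is only an example that you would replace with your own server:

```javascript
// Builds an RTCConfiguration object from a list of STUN/TURN URLs.
function buildPeerConfig(urls) {
  return {
    iceServers: urls.map(function(url) {
      return { urls: url };
    })
  };
}

// Example: a public STUN server (an assumption; substitute your own)
var peerConfig = buildPeerConfig(["stun:stun.l.google.com:19302"]);
console.log(JSON.stringify(peerConfig));
// {"iceServers":[{"urls":"stun:stun.l.google.com:19302"}]}
```

Inside prepareCaller, the object would be passed to the constructor: caller = new window.RTCPeerConnection(peerConfig).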

Defining the onIceCandidate function and using the candidate

Let’s look at what our onIceCandidate function would look like. Paste the following into the script part of your index.html file:

    //Send the ICE Candidate to the remote peer
    function onIceCandidate(peer, evt) {
        if (evt.candidate) {
            channel.trigger("client-candidate", {
                "candidate": evt.candidate,
                "room": room
            });
        }
    }
    
    channel.bind("client-candidate", function(msg) {
        if (msg.room == room) {
            console.log("candidate received");
            caller.addIceCandidate(new RTCIceCandidate(msg.candidate));
        }
    });

In this function, we trigger a client event to the other party, informing them that a new ICE candidate is available. This function is called whenever the local ICE agent needs to deliver a message to the other peer through the signaling server (in this case, Pusher). This lets the ICE agent negotiate with the remote peer without the browser itself needing to know any specifics about the technology used for signaling; you can use whatever messaging technology you choose to send the ICE candidate to the remote peer.

On the other end, we bind to the client-candidate event and add the received candidate to the current RTCPeerConnection.
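
One thing to watch for: candidates can arrive before the receiving side has applied the remote description, and addIceCandidate rejects in that state. A common workaround is to buffer early candidates and apply them later. The sketch below shows the idea. createCandidateQueue is a hypothetical helper that is not used in the tutorial code, and peer stands for the RTCPeerConnection stored in caller:

```javascript
// Buffers ICE candidates that arrive before the remote description is set.
// `peer` is anything with an addIceCandidate method, such as the
// RTCPeerConnection stored in `caller`.
function createCandidateQueue(peer) {
  var pending = [];
  var ready = false;
  return {
    // Call this from the "client-candidate" handler.
    add: function(candidate) {
      if (ready) {
        peer.addIceCandidate(candidate);
      } else {
        pending.push(candidate);
      }
    },
    // Call this once setRemoteDescription has resolved.
    flush: function() {
      ready = true;
      pending.forEach(function(candidate) {
        peer.addIceCandidate(candidate);
      });
      pending = [];
    },
    // Number of candidates still waiting to be applied.
    size: function() {
      return pending.length;
    }
  };
}
```

The client-candidate handler would call add(new RTCIceCandidate(msg.candidate)), and the code that sets the remote description would call flush() afterwards.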

Calling a user

Calling a user using WebRTC is simple. First, we get the caller’s stream, then create an offer for the peer being called. Here, we use Pusher to signal to the other peer that an incoming call is waiting.

In the code below, notice that we trigger client events directly from the browser, rather than making a POST request to the server and having the server trigger an event that we have bound to.

We do this because we do not need to store this information on the server. Unless you need to, I recommend that you use client events. However, for client events to work, you need to enable them on your Pusher app’s dashboard.
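
Client events also come with naming rules: the event name must start with client-, and they can only be triggered on private or presence channels. As a sketch, a thin wrapper could check both before sending. triggerClientEvent is a hypothetical helper, and the tutorial code below calls channel.trigger directly:

```javascript
// Hypothetical wrapper around channel.trigger that checks the naming
// rules for client events before sending.
function triggerClientEvent(channel, eventName, data) {
  var channelOk =
    channel.name.indexOf("private-") === 0 ||
    channel.name.indexOf("presence-") === 0;
  if (!channelOk) {
    throw new Error("Client events need a private or presence channel");
  }
  if (eventName.indexOf("client-") !== 0) {
    throw new Error('Client event names must start with "client-"');
  }
  // Hand over to the real channel object once the checks pass
  return channel.trigger(eventName, data);
}
```

A call such as triggerClientEvent(channel, "client-candidate", { room: room }) would then fail loudly on a mistyped event name instead of silently not being delivered.
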
Paste the following in the script section of your index.html file:

    function getCam() {
      //Get local audio/video feed and show it in selfview video element
      return navigator.mediaDevices.getUserMedia({
        video: true,
        audio: true
      });
    }
    //Create and send offer to remote peer on button click
    function callUser(user) {
      getCam()
        .then(stream => {
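          var selfView = document.getElementById("selfview");
          if ("srcObject" in selfView) {
            // Modern browsers attach a MediaStream directly
            selfView.srcObject = stream;
          } else {
            // Older browsers need an object URL for the stream
            selfView.src = window.URL.createObjectURL(stream);
          }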
          toggleEndCallButton();
          caller.addStream(stream);
          localUserMedia = stream;
          caller.createOffer().then(function(desc) {
            caller.setLocalDescription(new RTCSessionDescription(desc));
            channel.trigger("client-sdp", {
              sdp: desc,
              room: user,
              from: id
            });
            room = user;
          });
        })
        .catch(error => {
          console.log("an error occurred", error);
        });
    }
    function toggleEndCallButton() {
      if (document.getElementById("endCall").style.display == "block") {
        document.getElementById("endCall").style.display = "none";
      } else {
        document.getElementById("endCall").style.display = "block";
      }
    }

Most of this code follows the steps described above. Notice, however, the extra function called toggleEndCallButton. It toggles the visibility of the end call button, so you can end an active call.

Also, note that we triggered a client event. This event uses Pusher to notify the recipient of an incoming call. Here, instead of generating a unique room ID for the two users, we use the recipient’s ID as the room ID. In your own app, you can use any unique identifier for the room.
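
One possible way to get such an identifier, shown here only as a sketch, is to derive it from both user IDs so that each side computes the same value. makeRoomId is a hypothetical helper. Note that the tutorial code matches incoming calls by comparing the room with the recipient’s ID, so adopting this would also mean sending the recipient’s ID in a separate field:

```javascript
// Derives the same room id on both sides of a call by sorting the two
// user ids, so the result does not depend on who placed the call.
function makeRoomId(userA, userB) {
  return [String(userA), String(userB)].sort().join(":");
}

console.log(makeRoomId("bob", "alice")); // "alice:bob"
console.log(makeRoomId("alice", "bob")); // "alice:bob"
```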

Receiving a call

Receiving a call is easy. First, the recipient needs to be notified that he has a call. Remember we emitted a client event earlier on while making the call? Now we need to bind and listen to it.

    channel.bind("client-sdp", function(msg) {
        if(msg.room == id){
            var answer = confirm("You have a call from: " + msg.from + ". Would you like to answer?");
            if(!answer){
                return channel.trigger("client-reject", {"room": msg.room, "rejected":id});
            }
            room = msg.room;
            getCam()
            .then(stream => {
                localUserMedia = stream;
                toggleEndCallButton();
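                var selfView = document.getElementById("selfview");
                if ("srcObject" in selfView) {
                    // Modern browsers attach a MediaStream directly
                    selfView.srcObject = stream;
                } else {
                    // Older browsers need an object URL for the stream
                    selfView.src = window.URL.createObjectURL(stream);
                }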
                caller.addStream(stream);
                var sessionDesc = new RTCSessionDescription(msg.sdp);
                caller.setRemoteDescription(sessionDesc);
                caller.createAnswer().then(function(sdp) {
                    caller.setLocalDescription(new RTCSessionDescription(sdp));
                    channel.trigger("client-answer", {
                        "sdp": sdp,
                        "room": room
                    });
                });
    
            })
            .catch(error => {
                console.log('an error occurred', error);
            })
        }
    });
    channel.bind("client-answer", function(answer) {
      if (answer.room == room) {
        console.log("answer received");
        caller.setRemoteDescription(new RTCSessionDescription(answer.sdp));
      }
    });
    
    channel.bind("client-reject", function(answer) {
      if (answer.room == room) {
        console.log("Call declined");
        alert("call to " + answer.rejected + " was politely declined");
        endCall();
      }
    });
    
    function endCall() {
      room = undefined;
      caller.close();
      for (let track of localUserMedia.getTracks()) {
        track.stop();
      }
      prepareCaller();
      toggleEndCallButton();
    }

In the code above, we bind to the client-sdp event which was emitted when we made the call. Next, we check that the room is equal to the ID of the receiver (remember we used the receiver’s ID as the room), so that the wrong person is not alerted. We then present a confirm box, prompting the user to accept or reject the call. If the user rejects, we trigger a client-reject event, passing in the room of the call that was rejected.

If the call isn’t rejected, we get the recipient’s webcam, then set the reference to the stream (so we can stop the webcam while ending the call). We add the stream to the video output and the current RTCPeerConnection instance.

We set the remote description as the description of the sdp sent by the caller. Finally, we create an answer and then send the answer to the caller.

If the answer is not received, the call is not connected. When the answer is received, the caller sets its remote description to the SDP from the receiver.
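
The offer, answer and reject messages effectively move each client through a handful of call states. The tutorial code tracks this implicitly through the room variable, but it can help to see the flow written out as a small state table. This is a sketch under assumptions: the state names and the local events (call, accept, reject, hangup) are made up for illustration, while the client-* events are the ones used in this tutorial:

```javascript
// Possible call states and the events that move between them.
var transitions = {
  idle: { "call": "calling", "client-sdp": "ringing" },
  calling: { "client-answer": "connected", "client-reject": "idle" },
  ringing: { "accept": "connected", "reject": "idle" },
  connected: { "client-endcall": "idle", "hangup": "idle" }
};

// Returns the next state, or the current one if the event does not apply.
function nextCallState(state, event) {
  var next = transitions[state] && transitions[state][event];
  return next || state;
}
```

For example, nextCallState("connected", "client-sdp") stays in the connected state, which is one way to ignore a second incoming offer while a call is already in progress.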

Notice the listener for the client-reject event calls the endCall function (this is because when a call is rejected, we want to end everything about the call). The endCall function resets the room, closes the peer connection, stops the media streaming, prepares the caller to make/receive new calls, then finally toggles the end call button off.
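
endCall assumes that localUserMedia already holds a stream. If it could ever run before the camera has been opened, the loop over getTracks() would throw. A defensive variant of the track-stopping step might look like the sketch below. stopTracks is a hypothetical helper and not part of the tutorial code:

```javascript
// Stops every track of a media stream and returns how many were stopped.
// A missing stream is treated as "nothing to stop".
function stopTracks(stream) {
  if (!stream || typeof stream.getTracks !== "function") {
    return 0;
  }
  var tracks = stream.getTracks();
  tracks.forEach(function(track) {
    track.stop();
  });
  return tracks.length;
}
```

Inside endCall, the for loop could then be replaced by stopTracks(localUserMedia).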

Wrapping it all up

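At the end of the whole episode, here is what our JavaScript code looks like. It also includes the endCurrentCall function used by the End Call button, and a listener for the client-endcall event it triggers: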

    var pusher = new Pusher("XXX-API-KEY", {
      cluster: "XXX-API-CLUSTER",
      encrypted: true,
      authEndpoint: "pusher/auth"
    });
    var usersOnline,
      id,
      users = [],
      sessionDesc,
      currentcaller,
      room,
      caller,
      localUserMedia;
    const channel = pusher.subscribe("presence-videocall");
    
    channel.bind("pusher:subscription_succeeded", members => {
      //set the member count
      usersOnline = members.count;
      id = channel.members.me.id;
      document.getElementById("myid").innerHTML = ` My caller id is : ` + id;
      members.each(member => {
        if (member.id != channel.members.me.id) {
          users.push(member.id);
        }
      });
    
      render();
    });
    
    channel.bind("pusher:member_added", member => {
      users.push(member.id);
      render();
    });
    
    channel.bind("pusher:member_removed", member => {
      // remove the member from the list, if present
      var index = users.indexOf(member.id);
      if (index !== -1) {
        users.splice(index, 1);
      }
      if (member.id == room) {
        endCall();
      }
      render();
    });
    
    function render() {
      var list = "";
      users.forEach(function(user) {
        list +=
          `<li>` +
          user +
          ` <input type="button" style="float:right;"  value="Call" onclick="callUser('` +
          user +
          `')" id="makeCall" /></li>`;
      });
      document.getElementById("users").innerHTML = list;
    }
    
    //To iron out browser implementation anomalies like prefixes
    GetRTCPeerConnection();
    GetRTCSessionDescription();
    GetRTCIceCandidate();
    prepareCaller();
    function prepareCaller() {
      //Initializing a peer connection
      caller = new window.RTCPeerConnection();
      //Listen for ICE Candidates and send them to remote peers
      caller.onicecandidate = function(evt) {
        if (!evt.candidate) return;
        console.log("onicecandidate called");
        onIceCandidate(caller, evt);
      };
      //onaddstream handler to receive remote feed and show in remoteview video element
      caller.onaddstream = function(evt) {
        console.log("onaddstream called");
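        var remoteView = document.getElementById("remoteview");
        if ("srcObject" in remoteView) {
          // Modern browsers attach a MediaStream directly
          remoteView.srcObject = evt.stream;
        } else {
          // Older browsers need an object URL for the stream
          remoteView.src = window.URL.createObjectURL(evt.stream);
        }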
      };
    }
    function getCam() {
      //Get local audio/video feed and show it in selfview video element
      return navigator.mediaDevices.getUserMedia({
        video: true,
        audio: true
      });
    }
    
    function GetRTCIceCandidate() {
      window.RTCIceCandidate =
        window.RTCIceCandidate ||
        window.webkitRTCIceCandidate ||
        window.mozRTCIceCandidate ||
        window.msRTCIceCandidate;
    
      return window.RTCIceCandidate;
    }
    
    function GetRTCPeerConnection() {
      window.RTCPeerConnection =
        window.RTCPeerConnection ||
        window.webkitRTCPeerConnection ||
        window.mozRTCPeerConnection ||
        window.msRTCPeerConnection;
      return window.RTCPeerConnection;
    }
    
    function GetRTCSessionDescription() {
      window.RTCSessionDescription =
        window.RTCSessionDescription ||
        window.webkitRTCSessionDescription ||
        window.mozRTCSessionDescription ||
        window.msRTCSessionDescription;
      return window.RTCSessionDescription;
    }
    
    //Create and send offer to remote peer on button click
    function callUser(user) {
      getCam()
        .then(stream => {
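          var selfView = document.getElementById("selfview");
          if ("srcObject" in selfView) {
            // Modern browsers attach a MediaStream directly
            selfView.srcObject = stream;
          } else {
            // Older browsers need an object URL for the stream
            selfView.src = window.URL.createObjectURL(stream);
          }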
          toggleEndCallButton();
          caller.addStream(stream);
          localUserMedia = stream;
          caller.createOffer().then(function(desc) {
            caller.setLocalDescription(new RTCSessionDescription(desc));
            channel.trigger("client-sdp", {
              sdp: desc,
              room: user,
              from: id
            });
            room = user;
          });
        })
        .catch(error => {
          console.log("an error occurred", error);
        });
    }
    
    function endCall() {
      room = undefined;
      caller.close();
      for (let track of localUserMedia.getTracks()) {
        track.stop();
      }
      prepareCaller();
      toggleEndCallButton();
    }
    
    function endCurrentCall() {
      channel.trigger("client-endcall", {
        room: room
      });
    
      endCall();
    }
    
    //Send the ICE Candidate to the remote peer
    function onIceCandidate(peer, evt) {
      if (evt.candidate) {
        channel.trigger("client-candidate", {
          candidate: evt.candidate,
          room: room
        });
      }
    }
    
    function toggleEndCallButton() {
      if (document.getElementById("endCall").style.display == "block") {
        document.getElementById("endCall").style.display = "none";
      } else {
        document.getElementById("endCall").style.display = "block";
      }
    }
    
    //Listening for the candidate message from a peer sent from onicecandidate handler
    channel.bind("client-candidate", function(msg) {
      if (msg.room == room) {
        console.log("candidate received");
        caller.addIceCandidate(new RTCIceCandidate(msg.candidate));
      }
    });
    
    //Listening for Session Description Protocol message with session details from remote peer
    channel.bind("client-sdp", function(msg) {
      if (msg.room == id) {
        console.log("sdp received");
        var answer = confirm(
          "You have a call from: " + msg.from + ". Would you like to answer?"
        );
        if (!answer) {
          return channel.trigger("client-reject", { room: msg.room, rejected: id });
        }
        room = msg.room;
        getCam()
          .then(stream => {
            localUserMedia = stream;
            toggleEndCallButton();
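            var selfView = document.getElementById("selfview");
            if ("srcObject" in selfView) {
              // Modern browsers attach a MediaStream directly
              selfView.srcObject = stream;
            } else {
              // Older browsers need an object URL for the stream
              selfView.src = window.URL.createObjectURL(stream);
            }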
            caller.addStream(stream);
            var sessionDesc = new RTCSessionDescription(msg.sdp);
            caller.setRemoteDescription(sessionDesc);
            caller.createAnswer().then(function(sdp) {
              caller.setLocalDescription(new RTCSessionDescription(sdp));
              channel.trigger("client-answer", {
                sdp: sdp,
                room: room
              });
            });
          })
          .catch(error => {
            console.log("an error occurred", error);
          });
      }
    });
    
    //Listening for answer to offer sent to remote peer
    channel.bind("client-answer", function(answer) {
      if (answer.room == room) {
        console.log("answer received");
        caller.setRemoteDescription(new RTCSessionDescription(answer.sdp));
      }
    });
    
    channel.bind("client-reject", function(answer) {
      if (answer.room == room) {
        console.log("Call declined");
        alert("call to " + answer.rejected + " was politely declined");
        endCall();
      }
    });
    
    channel.bind("client-endcall", function(answer) {
      if (answer.room == room) {
        console.log("Call Ended");
        endCall();
      }
    });

Next, let’s start our app by running:

    node index.js

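Finally, navigate to http://localhost:3000 to try the app out. Open the page in two browser tabs (or on two devices) so that each one appears in the other’s list of users.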
Below is an image of what we have built:

[Image: webrtc-video-call-preview]

Conclusion

In this tutorial, you learned how to put together your own WebRTC chat application using Pusher as a signaling server. We covered setting up a WebRTC connection using simple JavaScript.
From here you can take things further and explore more complex call applications by adding in better video security, notifications that a user is on another call, group video calls, and more!

The code base for this tutorial is hosted in a public GitHub repository. Play around with the code.

JavaScript Node.js Online Presence

29 March 2018

by Samuel Ogundipe