End to End JavaScript Video Streaming

Video streaming has been around since the early days of the World Wide Web in the mid-1990s. Over time, various technologies have been used to realise video streaming on the web: it all started with the integration of Java applets in browsers, which in technological terms often meant that individual images were loaded from a web server at a defined interval. The technologies that emerged in the early 2000s, such as Adobe Flash and Microsoft Silverlight, together with the fact that more and more households had data connections, meant that frame rates increased and the resolution of the video images improved. The type of communication changed, too: away from data polling and towards push transmission.

Now, if you want to make a simple livestream available, you can use various free services. These vary in terms of bandwidth, quality, the format on offer and the size and number of advertising banners shown.

For our open day for applicants, ADVENTALK, we wanted to incorporate a live stream into our information page in order to give potential new colleagues an insight into our daily work. We needed a special page format for the stream and weren’t too keen on advertising either. None of the freely available or low-cost alternatives met our needs, so we opted to develop our own solution.

Before ADVENTALK, this construction moved through denkwerk and provided the pictures for our livestream.

One thing became clear after analysing our requirements: we didn’t need 24 or 16 frames per second; 2 to 4 are more than sufficient. What is more, the stream was to be fairly mobile and easy to administer, and would ideally use resources that we already had anyway. Time was also a consideration: the realisation of the stream was to consume as few resources as possible.

In our case, selecting the technology was then fairly simple: access the video data, send it to a server that turns it into JPEG data, and have a browser request and display that data – only JavaScript would allow this from end to end.

In order to access the webcam’s video data, we used the getUserMedia API that is available in Opera, Chrome and Firefox. It enables access to the audio and video peripherals connected to the computer. In our case, we only needed a video image. The client-side code executed in Chrome for this scenario is as follows:

<video id="sourcevid" autoplay></video>
<script>
  var videoStream = document.getElementById('sourcevid');

  // output the video data in the source video element
  var successCallback = function (srm) {
    videoStream.src = window.webkitURL.createObjectURL(srm);
  };

  // log error
  var errorCallback = function (error) {
    console.log('error: ', error);
  };

  // grab the incoming device data
  window.navigator.webkitGetUserMedia({video: true}, successCallback, errorCallback);
</script>

In Chrome, these APIs currently only exist in a prefixed version. In Firefox and Opera, the code looks like this:

<video id="sourcevid" autoplay></video>
<script>
  var videoStream = document.getElementById('sourcevid');

  // output the video data in the source video element
  var successCallback = function (stream) {
    videoStream.src = stream;
  };

  // log error
  var errorCallback = function (error) {
    console.log('error: ', error);
  };

  // grab the incoming device data
  window.navigator.getUserMedia({video: true}, successCallback, errorCallback);
</script>
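
If you don’t want to maintain two versions of the call, a small feature-detection shim can pick whichever variant the browser exposes (this is only a sketch and not part of our original code; it reuses the successCallback and errorCallback functions from the snippets above, and the success callback still has to handle the browser-specific way of attaching the stream to the video element):

<script>
  // pick whichever getUserMedia variant the current browser provides
  navigator.getUserMedia = navigator.getUserMedia ||
                           navigator.webkitGetUserMedia ||
                           navigator.mozGetUserMedia;

  if (navigator.getUserMedia) {
    navigator.getUserMedia({video: true}, successCallback, errorCallback);
  } else {
    console.log('getUserMedia is not supported in this browser');
  }
</script>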

This setup already allows us to show the video image in the browser.
In order to now send the image to the server (in a form that the latter can understand), we can use a canvas element to convert the video image into the JPEG or PNG format:

<video id="sourcevid" autoplay></video>
<canvas width="640" height="480" id="output"></canvas>
<script>
var convertVideoToJpg = function (stream, canvasElement, ctx) {
  // draw the current video frame onto the canvas and read it back as a Base64 data URL
  ctx.drawImage(stream, 0, 0);
  var picture = canvasElement.toDataURL('image/jpeg');
};

var init = function () {
  var videoStream = document.getElementById('sourcevid');
  var canvas = document.getElementById('output');
  var ctx = canvas.getContext('2d');

  // output the video data in the source video element
  var successCallback = function (srm) {
    videoStream.src = window.webkitURL.createObjectURL(srm);
  };

  // log error
  var errorCallback = function (error) {
    console.log('error: ', error);
  };

  // grab the incoming device data
  window.navigator.webkitGetUserMedia({video: true}, successCallback, errorCallback);

  // convert the video image every 500ms
  setInterval(function () {
    convertVideoToJpg(videoStream, canvas, ctx);
  }, 500);
}

window.onload = init;
</script>
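
One small tweak that is not part of the snippet above: for ‘image/jpeg’, toDataURL accepts an optional second argument between 0 and 1 that controls the compression quality, so you can trade image quality against the amount of data produced per frame:

// a lower quality value means smaller frames; 0.7 is a plausible middle ground
var picture = canvasElement.toDataURL('image/jpeg', 0.7);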

In order to now send the data to the server, we use WebSocket technology, or more precisely the Socket.io framework:

<video id="sourcevid" autoplay></video>
<canvas width="640" height="480" id="output"></canvas>
<script src="http://remote.server.url.io:8080/socket.io/socket.io.js"></script>
<script>
var socket = io.connect('http://remote.server.url.io:8080/');

var convertVideoToJpgAndSendToServer = function (stream, canvasElement, ctx) {
  ctx.drawImage(stream, 0, 0);
  var picture = canvasElement.toDataURL('image/jpeg');
  // emit the Base64-encoded JPEG to the server via the 'vs-stream' event
  socket.emit('vs-stream', {
    picture: picture
  });
};

var init = function () {
  var videoStream = document.getElementById('sourcevid');
  var canvas = document.getElementById('output');
  var ctx = canvas.getContext('2d');

  // output the video data in the source video element
  var successCallback = function (srm) {
    videoStream.src = window.webkitURL.createObjectURL(srm);
  };

  // log error
  var errorCallback = function (error) {
    console.log('error: ', error);
  };

  // grab the incoming device data
  window.navigator.webkitGetUserMedia({video: true}, successCallback, errorCallback);

  // send the video data every 500ms
  setInterval(function () {
    convertVideoToJpgAndSendToServer(videoStream, canvas, ctx);
  }, 500);
}

window.onload = init;
</script>

This code snippet now sends a Base64-encoded JPEG as text to our server over a WebSocket connection at an interval of 500 milliseconds.
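
To get a rough feel for the bandwidth this consumes, you can log the length of the data URL before emitting it (a small sketch, not part of the original code; the actual numbers depend on the camera image and the browser’s default JPEG quality):

// the Base64 data URL is roughly a third larger than the binary JPEG it encodes
var logFrameSize = function (picture) {
  var kilobytes = Math.round(picture.length / 1024);
  console.log('frame: ~' + kilobytes + ' KB, i.e. ~' + (kilobytes * 2) + ' KB/s at 2 frames per second');
};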

The server code is similarly simple. Thanks to the large Node.js standard library, the only external dependencies we need are the Socket.io WebSocket framework already mentioned above and the Express web server abstraction:

// load and configure socket.io & express
var express = require('express');
var app = express();
var server = require('http').createServer(app);
var io = require('socket.io').listen(server);

// defines the port the server is running on:
// either the first command-line argument (node server.js port)
// or port 8080 if none is given
var port = process.argv[2] || 8080;

// holds the Base64 text of the last received image
var lastImage = '';

// returns the jpg image resource if a URL like
// /image/any_random_valid_resource_string.jpg is requested
app.get('/image/*.jpg', function (req, res) {
  res.set('Content-Type', 'image/jpeg');
  // convert the base64 text into a string that the node Buffer object understands
  // and send the composed binary image data to the client
  res.send(new Buffer(lastImage.replace(/^data:image\/jpeg;base64,/,""), 'base64'));
});

// get our little server up & running
server.listen(port, function () {
  console.log('Server running @ http://localhost:' + port);
});

// get our stream up and running
io.sockets.on('connection', function (socket) {
  // if socket data is received on the 'vs-stream' event,
  // write the contents to the global 'lastImage' variable
  socket.on('vs-stream', function (data) {
    if (data.picture !== '') lastImage = data.picture;
  });
});

The code itself is fairly self-explanatory: whenever an image is sent from the streaming client to the server via WebSockets, its text is cached in the ‘lastImage’ variable. If a client now accesses a URL such as ‘http://remote.server.url.io:8080/image/any_valid_string.jpg’, it receives the last image as a binary JPEG.

The only thing missing for livestreaming success is the code responsible for continuously displaying and switching the individual images in the output browser. Our conversion into JPEG data means that every browser that can display images can show the pictures. So we only need a small piece of JavaScript that changes the image on the client at the interval we defined (500ms).

<img id="stream" width="974" height="400" src="http://remote.server.url.io:8080/lastxxx.jpg"/>
<script>
  var image = document.getElementById('stream');
  setInterval(function () {
    // the random number in the file name only acts as a cache buster, forcing the browser to refetch the image
    image.setAttribute('src', 'http://remote.server.url.io:8080/image/last' + Math.floor(Math.random()*111) + '.jpg');
  }, 500);
</script>

We don’t need anything other than this code snippet. At an interval of 500ms, the ‘src’ attribute of the image tag is changed, prompting the browser to download and display the newly generated image from the server. This creates the illusion of a moving image.
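
On slower connections, swapping the ‘src’ attribute directly can cause a visible flicker while the new frame is still loading. One possible refinement (a sketch, not part of our original setup) is to preload the next frame into an offscreen Image object and only swap the visible image once the download has finished:

<img id="stream" width="974" height="400" src="http://remote.server.url.io:8080/image/lastxxx.jpg"/>
<script>
  var image = document.getElementById('stream');
  setInterval(function () {
    // load the next frame offscreen first ...
    var next = new Image();
    next.onload = function () {
      // ... and only swap the visible image once it has fully arrived
      image.setAttribute('src', next.src);
    };
    next.src = 'http://remote.server.url.io:8080/image/last' + Math.floor(Math.random()*111) + '.jpg';
  }, 500);
</script>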

You can get an idea of what the stream looks like in the video below.

The lack of an audio stream does, of course, mean that this is not the sort of livestream that can be used for the transmission of sporting events or concerts. Using WebSockets with binary data, the example can certainly be improved and, depending on the end user’s connection, up to 10 frames a second should be achievable. There is, however, no doubt that this is not a model for “premium content” – browser manufacturers have agreed on WebRTC for this purpose. You can find more information on WebRTC and video/audio in browsers here.
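
For the curious, the sending side of such a binary variant could look roughly like this (an outline only, not something we built: it assumes the same socket, canvas and ctx variables as in the sender snippet above, a browser that supports canvas.toBlob, and a Socket.io version from 1.0 onwards, which can transport binary payloads; the ‘vs-stream-binary’ event name is invented for this example):

// send the raw JPEG bytes instead of a Base64 data URL
var convertVideoToBlobAndSendToServer = function (stream, canvasElement, ctx) {
  ctx.drawImage(stream, 0, 0);
  canvasElement.toBlob(function (blob) {
    // the Blob is roughly a quarter smaller than its Base64 representation
    socket.emit('vs-stream-binary', blob);
  }, 'image/jpeg');
};

On the server, Socket.io would then hand the payload over as a Node.js Buffer, which could be cached in ‘lastImage’ and passed straight to res.send() without the Base64 decoding step.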