Canvas filled three ways: JS, WebAssembly and WebGL

There are, roughly speaking, three ways to update an HTML5 canvas: plain old JavaScript, WebGL and WebAssembly. While JS and WebGL have their own ways of actually writing to the canvas, WebAssembly requires us to copy the resulting memory buffer to the canvas in JS. This might sound like a slow solution, but it still beats plain JS, which gets bogged down even with a moderately complicated draw loop.

WebGL is really fast, but gets rather complicated if your effect has a lot of state that has to be passed and updated between frames, such as particles. It’s still possible, either by handling the state in JS and passing it to WebGL, or by handling it purely in WebGL and storing it in a texture.

Let’s create a simple effect in all three ways and see how they work.

JavaScript

Updating the canvas in JS is quite simple, even if we do it pixel by pixel. Here’s an example that colors every pixel red:

const width = 640;
const height = 360;

// Get 2d drawing context
const ctx = document.getElementById('c').getContext('2d');
// Get copy of actual imagedata (for the whole canvas area)
const imageData = ctx.getImageData(0, 0, width, height);
// Create a buffer that's the same size as our canvas image data
const buf = new ArrayBuffer(imageData.data.length);
// 'Live' 8 bit clamped view to our buffer, we'll use this for writing to the canvas
const buf8 = new Uint8ClampedArray(buf);
// 'Live' 32 bit view into our buffer, we'll use this for drawing
const buf32 = new Uint32Array(buf);
// Opaque red packed into a 32-bit uint. On little-endian machines the bytes land in memory as R, G, B, A
const red = (255 << 24) | 255;
// Loop through all the pixels.
for (let y = 0; y < height; y += 1) {
  const yw = y * width;
  for (let x = 0; x < width; x += 1) {
    buf32[yw + x] = red;
  }
}
// Update imageData and put it to our drawing context
imageData.data.set(buf8);
ctx.putImageData(imageData, 0, 0);

Now let’s create our effect proper and see how it fares:

const canvas = document.getElementById('c');
const RAD = 2 * Math.PI;
const BLADES = 3;
const CYCLE_WIDTH = 100;
const BLADES_T_CYCLE_WIDTH = BLADES * CYCLE_WIDTH;
const height = canvas.height;
const width = canvas.width;
const ch = height / 2;
const cw = width / 2;
const maxDistance = Math.sqrt((ch * ch) + (cw * cw));
// Disabling alpha seems to give a slight boost. Image data still includes alpha though.
// (antialias and depth are WebGL context attributes and are ignored by a 2d context.)
const ctx = canvas.getContext('2d', { alpha: false });

const imageData = ctx.getImageData(0, 0, width, height);

// Create a buffer that's the same size as our image
const buf = new ArrayBuffer(imageData.data.length);
// 'Live' 8 bit clamped view to our array, we'll use this for writing to the canvas
const buf8 = new Uint8ClampedArray(buf);
// 'Live' 32 bit view into our array, we'll use this for drawing
const buf32 = new Uint32Array(buf);

const render = (timestamp) => {
  // Flooring this makes things a whole lot faster in both FF and Chrome
  // 2000 added to timestamp out of pure laziness... without it we get some weird visuals in the beginning
  const scaledTimestamp = Math.floor((timestamp / 10.0) + 2000.0);
  for (let y = 0; y < height; y += 1) {
    const dy = ch - y;
    const dysq = dy * dy;
    const yw = y * width;
    for (let x = 0; x < width; x += 1) {
      const dx = cw - x;
      const dxsq = dx * dx;
      const angle = Math.atan2(dx, dy) / RAD;
      // Arbitrary mangle of the distance, just something that looks pleasant
      const asbs = dxsq + dysq;
      const distanceFromCenter = Math.sqrt(asbs);
      const scaledDistance = (asbs / 400.0) + distanceFromCenter;
      const lerp =
        1.0 - (
          (Math.abs(
            (scaledDistance - scaledTimestamp) + (angle * BLADES_T_CYCLE_WIDTH)
          ) %
          CYCLE_WIDTH) /
          CYCLE_WIDTH);
      // Fade R more slowly
      const absoluteDistanceRatioGB = 1.0 - (distanceFromCenter / maxDistance);
      const absoluteDistanceRatioR = (absoluteDistanceRatioGB * 0.8) + 0.2;
      // Don't round these, it makes things slower
      const fadeB = 50.0 * lerp * absoluteDistanceRatioGB;
      const fadeR =
        240.0 * lerp * absoluteDistanceRatioR * (1.0 + lerp) * 0.5;
      const fadeG = 120.0 * lerp * lerp * lerp * absoluteDistanceRatioGB;
      buf32[yw + x] =
        (255 << 24) |   // A
        (fadeB << 16) | // B
        (fadeG << 8) |  // G
        fadeR;          // R
    }
  }
  // Write our data back to the canvas
  imageData.data.set(buf8);
  ctx.putImageData(imageData, 0, 0);
  window.requestAnimationFrame(render);
};

window.requestAnimationFrame(render);

Note that this could be optimized a lot: all the distance and angle calculations are static between frames, so we could easily precalculate them. But that’s really not the point of this exercise; we just want something heavy enough to show the differences between the three methods.
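As a rough illustration of the precalculation mentioned here, the sketch below moves everything that depends only on pixel position into lookup tables that are built once. The names `buildTables` and `lerpAt` and the table layout are assumptions made for this example, not part of the original demo.

```javascript
// Hypothetical helper: precompute the per-pixel terms that never change between frames.
const buildTables = (width, height, bladesTimesCycleWidth = 300) => {
  const cw = width / 2;
  const ch = height / 2;
  const maxDistance = Math.sqrt((ch * ch) + (cw * cw));
  const count = width * height;
  // phase = scaledDistance + angle * BLADES_T_CYCLE_WIDTH (the time-independent part of lerp)
  const phase = new Float64Array(count);
  // ratio = 1 - distanceFromCenter / maxDistance (absoluteDistanceRatioGB in the demo)
  const ratio = new Float64Array(count);
  for (let y = 0; y < height; y += 1) {
    const dy = ch - y;
    for (let x = 0; x < width; x += 1) {
      const dx = cw - x;
      const asbs = (dx * dx) + (dy * dy);
      const distance = Math.sqrt(asbs);
      const angle = Math.atan2(dx, dy) / (2 * Math.PI);
      const i = (y * width) + x;
      phase[i] = (asbs / 400.0) + distance + (angle * bladesTimesCycleWidth);
      ratio[i] = 1.0 - (distance / maxDistance);
    }
  }
  return { phase, ratio };
};

// Per frame, only the time-dependent part is left to compute.
const lerpAt = (tables, i, scaledTimestamp, cycleWidth = 100) =>
  1.0 - ((Math.abs(tables.phase[i] - scaledTimestamp) % cycleWidth) / cycleWidth);
```

The render loop would then read `tables.phase[i]` and `tables.ratio[i]` instead of calling `Math.sqrt` and `Math.atan2` for every pixel, at the cost of two extra arrays the size of the canvas.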

JS can’t render the effect at 60 FPS even at 800×400. On my laptop I get around 45 FPS. Note that just updating the canvas with two prefilled arrays leaves the FPS short of 60.
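Numbers like these can be gathered with a small counter fed by the `requestAnimationFrame` timestamps. This is only a sketch of one way to measure it; `createFpsMeter` is a name invented for this example, and the article does not say how its own figures were obtained.

```javascript
// Hypothetical helper: average FPS over the last `windowSize` frames.
const createFpsMeter = (windowSize = 60) => {
  const stamps = [];
  // Call once per frame with the requestAnimationFrame timestamp (milliseconds).
  return (timestamp) => {
    stamps.push(timestamp);
    if (stamps.length > windowSize + 1) {
      stamps.shift();
    }
    if (stamps.length < 2) {
      return 0;
    }
    const elapsed = stamps[stamps.length - 1] - stamps[0];
    return elapsed > 0 ? ((stamps.length - 1) * 1000) / elapsed : 0;
  };
};
```

Inside `render`, something like `const fps = meter(timestamp);` would give a running average that smooths out single slow frames.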

WebAssembly

Let’s look at an MVP that colors every pixel red again:

#define HEIGHT 400
#define WIDTH 800
#include <emscripten.h>

int data[WIDTH * HEIGHT];
int red = (255 << 24) | 255;

int* EMSCRIPTEN_KEEPALIVE render() {
   for (int y = 0; y < HEIGHT; y++) {
     int yw = y * WIDTH;
     for (int x = 0; x < WIDTH; x++) {
       data[yw + x] = red;
     }
   }
   return &data[0];
}

// Must match WIDTH and HEIGHT compiled into the module
const width = 800;
const height = 400;

// Contains the actual webassembly
const base64data = 'AGFzbQEAAAABBQFgAAF/AhIBA2VudgZtZW1vcnkCAYACgAIDAgEABwsBB19yZW5kZXIAAApJAUcBA38DQCAAQaAGbCECQQAhAQNAIAEgAmpBAnRBgAhqQf+BgHg2AgAgAUEBaiIBQaAGRw0ACyAAQQFqIgBBkANHDQALQYAICw==';

const decode = (b64) => {
  const str = window.atob(b64);
  const array = new Uint8Array(str.length);
  for (let i = 0; i < str.length; i += 1) {
    array[i] = str.charCodeAt(i);
  }
  return array.buffer;
};

const memSize = 256;
const memory = new WebAssembly.Memory({ initial: memSize, maximum: memSize });

const instance = new WebAssembly.Instance(
  new WebAssembly.Module(new Uint8Array(decode(base64data))),
  { env: { memory } }
);

// Get 2d drawing context
const ctx = document.getElementById('c').getContext('2d');
const pointer = instance.exports._render();
const data = new Uint8ClampedArray(memory.buffer, pointer, width * height * 4);
const img = new ImageData(data, width, height);
ctx.putImageData(img, 0, 0);

We have to pass a memory location to JS so that we know where to copy the data from. Other than that, the code is really similar to the JS version.
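To make the pointer handling more concrete, here is a small sketch that needs no compiled module. It fakes what the WebAssembly side does by writing one 32-bit pixel into a `WebAssembly.Memory` at a byte offset, then reads it back through a `Uint8ClampedArray` view, the same kind of view the code above hands to `ImageData`. The offset 1024 and the `writePixel` helper are assumptions for illustration only.

```javascript
// One 64 KiB page is plenty for this illustration.
const memory = new WebAssembly.Memory({ initial: 1, maximum: 1 });
// Stand-in for the pointer a real module would return (hypothetical value).
const pointer = 1024;

// Hypothetical stand-in for the module: store one packed pixel.
// WebAssembly memory is always little-endian, hence the explicit `true`.
const writePixel = (index, abgr) => {
  new DataView(memory.buffer).setUint32(pointer + (index * 4), abgr, true);
};

// 0xAABBGGRR: opaque red. `>>> 0` keeps the value an unsigned 32-bit integer.
const red = ((255 << 24) | 255) >>> 0;
writePixel(0, red);

// A view, not a copy: it sees whatever is written into the memory.
const pixels = new Uint8ClampedArray(memory.buffer, pointer, 2 * 4);
console.log(Array.from(pixels.slice(0, 4))); // bytes in R, G, B, A order
```

Because the view is live, the render loop only has to call `putImageData` after each frame; no explicit copy from the module's memory into a separate JS array is required.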

And now the full effect:

#define PI 3.14159265358979323846
#define RAD 6.283185307179586
#define COEFF_1 0.7853981633974483
#define COEFF_2 2.356194490192345
#define BLADES 3
#define CYCLE_WIDTH 100
#define BLADES_T_CYCLE_WIDTH 300
#include <math.h>
#include <stdlib.h>
#include <emscripten.h>

int height;
int width;
int pixelCount;
int ch;
int cw;
double maxDistance;
/*
We'll cheat a bit and just allocate loads of memory
so we don't have to implement malloc
*/
int data[2000000];

int* EMSCRIPTEN_KEEPALIVE init(int cWidth, int cHeight) {
  width = cWidth;
  height = cHeight;
  pixelCount = width * height;
  ch = height >> 1;
  cw = width >> 1;
  maxDistance = sqrt(ch * ch + cw * cw);
  // data = malloc(pixelCount * sizeof(int));
  return &data[0];
}

double customAtan2(int y, int x) {
  double abs_y = abs(y) + 1e-10;
  double angle;
  if (x >= 0) {
    double r = (x - abs_y) / (x + abs_y);
    angle = 0.1963 * r * r * r - 0.9817 * r + COEFF_1;
  } else {
    double r = (x + abs_y) / (abs_y - x);
    angle = 0.1963 * r * r * r - 0.9817 * r + COEFF_2;
  }
  return y < 0 ? -angle : angle;
}

// Using the 'native' fmod would require us to provide the module with asm2wasm...
double customFmod(double a, double b)
{
  return (a - b * floor(a / b));
}

void EMSCRIPTEN_KEEPALIVE render(double timestamp) {
  int scaledTimestamp = floor(timestamp / 10.0 + 2000.0);
  for (int y = 0; y < height; y++) {
    int dy = ch - y;
    int dysq = dy * dy;
    int yw = y * width;
    for (int x = 0; x < width; x++) {
      int dx = cw - x;
      int dxsq = dx * dx;
      double angle = customAtan2(dx, dy) / RAD;
      // Arbitrary mangle of the distance, just something that looks pleasant
      int asbs = dxsq + dysq;
      double distanceFromCenter = sqrt(asbs);
      double scaledDistance = asbs / 400.0 + distanceFromCenter;
      double lerp = 1.0 - (customFmod(fabs(scaledDistance - scaledTimestamp + angle * BLADES_T_CYCLE_WIDTH), CYCLE_WIDTH)) / CYCLE_WIDTH;
      // Fade R more slowly
      double absoluteDistanceRatioGB = 1.0 - distanceFromCenter / maxDistance;
      double absoluteDistanceRatioR = absoluteDistanceRatioGB * 0.8 + 0.2;
      int fadeB = round(50.0 * lerp * absoluteDistanceRatioGB);
      int fadeR = round(240.0 * lerp * absoluteDistanceRatioR * (1.0 + lerp) / 2.0);
      int fadeG = round(120.0 * lerp * lerp * lerp * absoluteDistanceRatioGB);
      data[yw + x] =
        (255 << 24) |   // A
        (fadeB << 16) | // B
        (fadeG << 8) |  // G
        fadeR;          // R
    }
  }
}

const canvas = document.getElementById('c');

// Contains the actual webassembly
const base64data = 'AGFzbQEAAAABFgRgAn9/AX9gAn9/AXxgAXwAYAF8AXwCEgEDZW52Bm1lbW9yeQIBgAKAAgMFBAMCAQAHEwIFX2luaXQAAwdfcmVuZGVyAAEKowUEKQAgAEQAAAAAAADgP6CcIABEAAAAAAAA4D+hmyAARAAAAAAAAAAAZhsLogMCDH8DfEGMrOgDKAIAIgZBAEoEQEGQrOgDKAIAIQdBiKzoAygCACIEQQBKIQhBlKzoAygCACEJIABEAAAAAAAAJECjRAAAAAAAQJ9AoJyqtyEOQYCs6AMrAwAhDwNAIAcgA2siBSAFbCEKIAQgA2whCyAIBEBBACEBA0AgCSABayICIAJsIApqtyIAnyENRAAAAAAAAPA/IAIgBRACRBgtRFT7IRlAo0QAAAAAAMByQKIgAEQAAAAAAAB5QKMgDaAgDqGgmSIAIABEAAAAAAAAWUCjnEQAAAAAAABZQKKhRAAAAAAAAFlAo6EiAEQAAAAAAABJQKJEAAAAAAAA8D8gDSAPo6EiDaIQAKohAiAARAAAAAAAAPA/oCAARAAAAAAAAG5AoiANRJqZmZmZmek/okSamZmZmZnJP6CiokQAAAAAAADgP6IQAKohDCABIAtqQQJ0QYAIaiAAIAAgAEQAAAAAAABeQKKioiANohAAqkEIdCACQRB0ciAMckGAgIB4cjYCACABQQFqIgEgBEcNAAsLIANBAWoiAyAGSA0ACwsLhQEBA3wgAEEAIABrIABBf0obt0S7vdfZ33zbPaAhAiABtyEDIAFBf0oEfEQYLURU+yHpPyEEIAMgAqEgAiADoKMFRNIhM3982QJAIQQgAiADoCACIAOhowsiAiACIAJE4zYawFsgyT+ioqIgAkRgdk8eFmrvP6KhIASgIgKaIAIgAEEASBsLTABBiKzoAyAANgIAQYys6AMgATYCAEGQrOgDIAFBAXUiATYCAEGUrOgDIABBAXUiADYCAEGArOgDIAEgAWwgACAAbGq3nzkDAEGACAs=';
const decode = (b64) => {
  const str = window.atob(b64);
  const array = new Uint8Array(str.length);
  for (let i = 0; i < str.length; i += 1) {
    array[i] = str.charCodeAt(i);
  }
  return array.buffer;
};
const memSize = 256;
const memory = new WebAssembly.Memory({ initial: memSize, maximum: memSize });

const instance = new WebAssembly.Instance(
  new WebAssembly.Module(new Uint8Array(decode(base64data))),
  { env: { memory } }
);
const height = canvas.height;
const width = canvas.width;
// Disabling alpha seems to give a slight boost. Image data still includes alpha though.
const ctx = canvas.getContext('2d', { alpha: false });
if (!ctx) {
  throw new Error('Your browser does not support canvas');
}

const pointer = instance.exports._init(width, height);
const data = new Uint8ClampedArray(memory.buffer, pointer, width * height * 4);
const img = new ImageData(data, width, height);

const render = (timestamp) => {
  instance.exports._render(timestamp);
  ctx.putImageData(img, 0, 0);
  window.requestAnimationFrame(render);
};
window.requestAnimationFrame(render);

Getting emcc to produce WASM with a sane amount of stuff imported from JavaScript takes some time. The short of it: compile with emcc -Os and check the WAT for required imports with something like wasm2wat.
You can take a look at the project for further details.
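Besides reading the WAT, the JavaScript API can list what a compiled module expects via `WebAssembly.Module.imports`. The sketch below runs it on a tiny hand-written module that only imports `env.memory`; those bytes are a stand-in made up for this example, not the output of emcc.

```javascript
// Hand-assembled stand-in module: header plus a single import (env.memory, 1 page minimum).
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, // magic: "\0asm"
  0x01, 0x00, 0x00, 0x00, // version 1
  0x02, 0x0f,             // import section, 15 bytes long
  0x01,                   // one import
  0x03, 0x65, 0x6e, 0x76, // module name: "env"
  0x06, 0x6d, 0x65, 0x6d, 0x6f, 0x72, 0x79, // field name: "memory"
  0x02, 0x00, 0x01        // kind: memory, no maximum, initial 1 page
]);

const wasmModule = new WebAssembly.Module(bytes);
// Each entry describes one thing the import object must provide.
const required = WebAssembly.Module.imports(wasmModule);
console.log(required);
```

Anything listed there that is not supplied in the import object makes instantiation fail, so a short list (ideally just the memory) is a quick sign that the build is as lean as intended.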

You’ll notice that we’re using a custom atan2 approximation: while Emscripten happily produced WebAssembly with atan2 imported from the C libs, the result was as slow as the JS version. We also use a custom fmod, as emcc doesn’t seem to provide a native version of that. All floating point values are doubles: we don’t really need the accuracy, but it runs faster and creates a smaller WASM file.

We’re not doing dynamic memory allocation for the image data, as it doesn’t really affect the render speed and I can’t be bothered.

The JS code instantiates the WebAssembly module, sets up basic canvas stuff and copies the data over to the canvas in the render loop.

The performance is better than with plain JS, but I’m still getting only about 55 FPS on my laptop. For some reason there’s quite a bit of fluctuation in the FPS, at least in Firefox.

WebGL

This time our MVP has a lot more boilerplate:

// Get WebGL drawing context
const canvas = document.getElementById('c');
const gl = canvas.getContext('webgl');
gl.viewport(0, 0, gl.drawingBufferWidth, gl.drawingBufferHeight);
const buffer = gl.createBuffer();
gl.bindBuffer(gl.ARRAY_BUFFER, buffer);
const vertexCount = 6;
const vertexLocations = [
  // X, Y
  -1.0, -1.0,
   1.0, -1.0,
  -1.0,  1.0,
  -1.0,  1.0,
   1.0, -1.0,
   1.0,  1.0
];

gl.bufferData(
  gl.ARRAY_BUFFER,
  new Float32Array(vertexLocations),
  gl.STATIC_DRAW
);

const program = gl.createProgram();
const buildShader = (type, source) => {
  const shader = gl.createShader(type);
  gl.shaderSource(shader, source);
  gl.compileShader(shader);
  gl.attachShader(program, shader);
};
buildShader(
  gl.VERTEX_SHADER,
  `
attribute vec2 a_position;
void main() {
  gl_Position = vec4(a_position, 0.0, 1.0);
}`
);

buildShader(
  gl.FRAGMENT_SHADER,
  `
void main() {
  gl_FragColor = vec4(1.0, 0.0, 0.0, 1.0);
}`
);
gl.linkProgram(program);
gl.useProgram(program);
const positionLocation = gl.getAttribLocation(program, 'a_position');
gl.enableVertexAttribArray(positionLocation);
const fieldCount = vertexLocations.length / vertexCount;
gl.vertexAttribPointer(
  positionLocation,
  fieldCount,
  gl.FLOAT,
  false, // no normalization; WebGL has no gl.FALSE constant
  fieldCount * Float32Array.BYTES_PER_ELEMENT,
  0
);
// Draw
gl.drawArrays(gl.TRIANGLES, 0, vertexCount);

We have to create a rectangle to draw on and compile two shaders. The full effect additionally updates a time uniform on each frame.
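The `buildShader` helper above never checks whether compilation succeeded, which makes GLSL typos hard to spot. A slightly more defensive variant might look like the sketch below; `compileShaderChecked` is a name invented here, while the `gl` calls are the standard WebGL ones.

```javascript
// Hypothetical variant of buildShader that reports compile errors.
// `gl` is a WebGLRenderingContext, `program` a WebGLProgram.
const compileShaderChecked = (gl, program, type, source) => {
  const shader = gl.createShader(type);
  gl.shaderSource(shader, source);
  gl.compileShader(shader);
  if (!gl.getShaderParameter(shader, gl.COMPILE_STATUS)) {
    const log = gl.getShaderInfoLog(shader);
    gl.deleteShader(shader);
    throw new Error(`Shader failed to compile: ${log}`);
  }
  gl.attachShader(program, shader);
  return shader;
};
```

It takes `gl` and `program` as parameters instead of closing over them, which keeps the helper easy to reuse; it could be called as `compileShaderChecked(gl, program, gl.FRAGMENT_SHADER, source)`.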

The boilerplate is the only drawback though: our full effect really doesn’t complicate the code much.

const canvas = document.getElementById('c');
const height = canvas.height;
const width = canvas.width;
// Get WebGL drawing context
const gl = canvas.getContext('webgl');
gl.viewport(0, 0, gl.drawingBufferWidth, gl.drawingBufferHeight);
const buffer = gl.createBuffer();
gl.bindBuffer(gl.ARRAY_BUFFER, buffer);
const vertexCount = 6;
const vertexLocations = [
  // X, Y
  -1.0, -1.0,
   1.0, -1.0,
  -1.0,  1.0,
  -1.0,  1.0,
   1.0, -1.0,
   1.0,  1.0
];

gl.bufferData(
  gl.ARRAY_BUFFER,
  new Float32Array(vertexLocations),
  gl.STATIC_DRAW
);

const program = gl.createProgram();
const buildShader = (type, source) => {
  const shader = gl.createShader(type);
  gl.shaderSource(shader, source);
  gl.compileShader(shader);
  gl.attachShader(program, shader);
  return shader;
};

const vertexShader = buildShader(
  gl.VERTEX_SHADER,
  `
attribute vec2 a_position;
void main() {
  gl_Position = vec4(a_position, 0.0, 1.0);
}`
);

const fragmentShader = buildShader(
  gl.FRAGMENT_SHADER,
  `
#define M_RAD ${Math.PI * 2}
#define CYCLE_WIDTH 100.0
#define BLADES 3.0
#define BLADES_T_CYCLE_WIDTH 300.0
// Mobiles need this
precision highp float;
uniform float timestamp;
void main() {
  vec2 center = vec2(${width / 2}, ${height / 2});
  float maxDistance = length(vec2(center.x, center.y));
  float scaledTimestamp = floor(timestamp / 10.0 + 2000.0);
  vec2 d = center.xy - gl_FragCoord.xy;
  vec2 dsq = pow(d, vec2(2.0));
  // Flipped axis to counteract flipped Y
  float angle = atan(d.y, d.x) / M_RAD;
  // Arbitrary mangle of the distance, just something that looks pleasant
  float asbs = dsq.x + dsq.y;
  float distanceFromCenter = sqrt(asbs);
  float scaledDistance = (asbs / 400.0) + distanceFromCenter;
  float lerp = 1.0 - mod(abs(scaledDistance - scaledTimestamp + angle * BLADES_T_CYCLE_WIDTH), CYCLE_WIDTH) / CYCLE_WIDTH;
  // Fade R more slowly
  float absoluteDistanceRatioGB = (1.0 - distanceFromCenter / maxDistance);
  float absoluteDistanceRatioR = absoluteDistanceRatioGB * 0.8 + 0.2;

  float fadeB = (50.0 / 255.0) * lerp * absoluteDistanceRatioGB;
  float fadeR = (240.0 / 255.0) * lerp * absoluteDistanceRatioR * (1.0 + lerp) / 2.0;
  float fadeG = (120.0 / 255.0) * lerp * lerp * lerp * absoluteDistanceRatioGB;
  gl_FragColor = vec4(fadeR, fadeG, fadeB, 1.0);
}`
);

gl.linkProgram(program);
gl.useProgram(program);
// Detach and delete shaders as they're no longer needed
gl.detachShader(program, vertexShader);
gl.detachShader(program, fragmentShader);
gl.deleteShader(vertexShader);
gl.deleteShader(fragmentShader);
// Add attribute pointer to our vertex locations
const positionLocation = gl.getAttribLocation(program, 'a_position');
gl.enableVertexAttribArray(positionLocation);
const fieldCount = vertexLocations.length / vertexCount;
gl.vertexAttribPointer(
  positionLocation,
  fieldCount,
  gl.FLOAT,
  false, // no normalization; WebGL has no gl.FALSE constant
  fieldCount * Float32Array.BYTES_PER_ELEMENT,
  0
);

const timestampId = gl.getUniformLocation(program, 'timestamp');

const render = (timestamp) => {
  // Update timestamp
  gl.uniform1f(timestampId, timestamp);
  // Draw
  gl.drawArrays(gl.TRIANGLES, 0, vertexCount);
  window.requestAnimationFrame(render);
};

window.requestAnimationFrame(render);

The only out-of-the-ordinary thing here is that we request high precision floats, as mobile devices produce garbled results if we don’t. GLSL is strongly typed and only likes to do some operations with floats, so pretty much everything here is a float. Many of the operations can be done on vectors, so the code is more compact and easier to read. WebGL’s Y axis is inverted compared to the 2D canvas, so as a lazy solution we flip the arguments to atan (GLSL’s counterpart of atan2).

WebGL is really fast. I get 60 FPS even on my phone.

Summa summarum

While JavaScript offers the best developer experience and the widest support, its performance leaves a lot to be desired.

WebAssembly is still somewhat painful to get running, but should get better over time. While it is a bit behind WebGL in performance, it does allow using existing C/C++ codebases which might come in handy.

WebGL’s performance is the best of the bunch, and would pull even further ahead if our effect were more complicated. In some cases combining WebAssembly and WebGL might be a good idea.

I’d recommend going with WebGL if at all possible; writing it was simple enough once the boilerplate was done.

Timo Mikkolainen
