Canvas filled three ways: JS, WebAssembly and WebGL
Roughly speaking, there are three ways to update an HTML5 canvas: plain old JavaScript, WebGL and WebAssembly. JS and WebGL have their own ways of writing to the canvas, whereas WebAssembly requires us to copy the resulting memory buffer to the canvas in JS. This might sound like a slow solution, but it still beats plain JS, which gets bogged down even with a moderately complicated draw loop.
WebGL is really fast, but gets rather complicated if your effect has a lot of state that has to be passed and updated between frames, such as particles (it’s still possible by either handling the state in JS and passing it to WebGL, or even handling it purely in WebGL by storing it in a texture).
Let’s create a simple effect in all three ways and see how they work.
JavaScript
Updating the canvas in JS is quite simple, even if we do it pixel by pixel. Here’s an example that colors every pixel red:
    const width = 640;
    const height = 360;

    // Get 2d drawing context
    const ctx = document.getElementById('c').getContext('2d');
    // Get copy of actual imagedata (for the whole canvas area)
    const imageData = ctx.getImageData(0, 0, width, height);
    // Create a buffer that's the same size as our canvas image data
    const buf = new ArrayBuffer(imageData.data.length);
    // 'Live' 8 bit clamped view to our buffer, we'll use this for writing to the canvas
    const buf8 = new Uint8ClampedArray(buf);
    // 'Live' 32 bit view into our buffer, we'll use this for drawing
    const buf32 = new Uint32Array(buf);
    // RGBA stored in a 32bit uint
    const red = (255 << 24) | 255;

    // Loop through all the pixels.
    for (let y = 0; y < height; y += 1) {
      const yw = y * width;
      for (let x = 0; x < width; x += 1) {
        buf32[yw + x] = red;
      }
    }

    // Update imageData and put it to our drawing context
    imageData.data.set(buf8);
    ctx.putImageData(imageData, 0, 0);
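The `red` constant works because the 32-bit view writes all four channels of a pixel at once. Here is a minimal sketch of that packing, assuming a little-endian machine (which is what practically every browser runs on). `packRGBA` and `isLittleEndian` are illustrative helpers, not part of the original code:

```javascript
// Hypothetical helper: pack four 0-255 channels into one 32-bit pixel.
// On little-endian hardware the lowest byte lands first in memory,
// so R goes in the low bits and A in the high bits.
const packRGBA = (r, g, b, a) => ((a << 24) | (b << 16) | (g << 8) | r) >>> 0;

// Hypothetical helper: detect byte order by writing a 32-bit value
// and reading back its first byte.
const isLittleEndian = () => new Uint8Array(new Uint32Array([1]).buffer)[0] === 1;

const buf = new ArrayBuffer(4);
const buf8 = new Uint8ClampedArray(buf);
const buf32 = new Uint32Array(buf);

buf32[0] = packRGBA(255, 0, 0, 255);

// On a little-endian machine buf8 now holds the bytes in canvas order: R, G, B, A.
console.log(isLittleEndian(), Array.from(buf8));
```

On a big-endian machine the channel order would have to be reversed, which is why the check is worth having if you need to support exotic hardware.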
    const canvas = document.getElementById('c');
    const RAD = 2 * Math.PI;
    const BLADES = 3;
    const CYCLE_WIDTH = 100;
    const BLADES_T_CYCLE_WIDTH = BLADES * CYCLE_WIDTH;
    const height = canvas.height;
    const width = canvas.width;
    const ch = height / 2;
    const cw = width / 2;
    const maxDistance = Math.sqrt((ch * ch) + (cw * cw));

    // Disabling alpha seems to give a slight boost. Image data still includes alpha though.
    const ctx = canvas.getContext('2d', { alpha: false, antialias: false, depth: false });
    const imageData = ctx.getImageData(0, 0, width, height);
    // Create a buffer that's the same size as our image
    const buf = new ArrayBuffer(imageData.data.length);
    // 'Live' 8 bit clamped view to our array, we'll use this for writing to the canvas
    const buf8 = new Uint8ClampedArray(buf);
    // 'Live' 32 bit view into our array, we'll use this for drawing
    const buf32 = new Uint32Array(buf);

    const render = (timestamp) => {
      // Flooring this makes things a whole lot faster in both FF and Chrome
      // 2000 added to timestamp out of pure laziness... without it we get some weird visuals in the beginning
      const scaledTimestamp = Math.floor((timestamp / 10.0) + 2000.0);
      for (let y = 0; y < height; y += 1) {
        const dy = ch - y;
        const dysq = dy * dy;
        const yw = y * width;
        for (let x = 0; x < width; x += 1) {
          const dx = cw - x;
          const dxsq = dx * dx;
          const angle = Math.atan2(dx, dy) / RAD;
          // Arbitrary mangle of the distance, just something that looks pleasant
          const asbs = dxsq + dysq;
          const distanceFromCenter = Math.sqrt(asbs);
          const scaledDistance = (asbs / 400.0) + distanceFromCenter;
          const lerp = 1.0 - ((Math.abs((scaledDistance - scaledTimestamp) + (angle * BLADES_T_CYCLE_WIDTH)) % CYCLE_WIDTH) / CYCLE_WIDTH);
          // Fade R more slowly
          const absoluteDistanceRatioGB = 1.0 - (distanceFromCenter / maxDistance);
          const absoluteDistanceRatioR = (absoluteDistanceRatioGB * 0.8) + 0.2;
          // Don't round these, it makes things slower
          const fadeB = 50.0 * lerp * absoluteDistanceRatioGB;
          const fadeR = 240.0 * lerp * absoluteDistanceRatioR * (1.0 + lerp) * 0.5;
          const fadeG = 120.0 * lerp * lerp * lerp * absoluteDistanceRatioGB;
          buf32[yw + x] =
            (255 << 24) | // A
            (fadeB << 16) | // B
            (fadeG << 8) | // G
            fadeR; // R
        }
      }
      // Write our data back to the canvas
      imageData.data.set(buf8);
      ctx.putImageData(imageData, 0, 0);
      window.requestAnimationFrame(render);
    };

    window.requestAnimationFrame(render);
Note that this could be optimized a lot. All the distance calculations are static between frames, so we could easily precalculate them, for example. But that is really not the point of this exercise: we just want something that’s heavy enough to show differences between the three methods.
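For the curious, the precalculation mentioned here could look roughly like the sketch below. `buildStaticTables` is a hypothetical helper that is not used in any of the demos; it simply moves the per-pixel distance and angle out of the render loop and into lookup tables:

```javascript
// Hypothetical helper: compute the per-pixel values that never change
// between frames once, so the render loop only has to look them up.
const buildStaticTables = (width, height) => {
  const ch = height / 2;
  const cw = width / 2;
  const distance = new Float64Array(width * height);
  const angle = new Float64Array(width * height);
  for (let y = 0; y < height; y += 1) {
    const dy = ch - y;
    const yw = y * width;
    for (let x = 0; x < width; x += 1) {
      const dx = cw - x;
      distance[yw + x] = Math.sqrt((dx * dx) + (dy * dy));
      // Same argument order and scaling as in the render loop
      angle[yw + x] = Math.atan2(dx, dy) / (2 * Math.PI);
    }
  }
  return { distance, angle };
};

const tables = buildStaticTables(640, 360);
// Inside render(), distanceFromCenter would become tables.distance[yw + x]
// and angle would become tables.angle[yw + x].
console.log(tables.distance.length);
```

This trades a few megabytes of memory for one `sqrt` and one `atan2` less per pixel per frame.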
JS can’t render the effect at 60 FPS even at 800×400. On my laptop I get around 45 FPS. Note that even just updating the canvas from two prefilled arrays falls short of 60 FPS.
WebAssembly
Let’s look at an MVP that colors every pixel red again:
    const width = 640;
    const height = 360;

    // Contains the actual webassembly
    const base64data = 'AGFzbQEAAAABBQFgAAF/AhIBA2VudgZtZW1vcnkCAYACgAIDAgEABwsBB19yZW5kZXIAAApJAUcBA38DQCAAQaAGbCECQQAhAQNAIAEgAmpBAnRBgAhqQf+BgHg2AgAgAUEBaiIBQaAGRw0ACyAAQQFqIgBBkANHDQALQYAICw==';

    const decode = (b64) => {
      const str = window.atob(b64);
      const array = new Uint8Array(str.length);
      for (let i = 0; i < str.length; i += 1) {
        array[i] = str.charCodeAt(i);
      }
      return array.buffer;
    };

    const memSize = 256;
    const memory = new WebAssembly.Memory({ initial: memSize, maximum: memSize });
    const instance = new WebAssembly.Instance(
      new WebAssembly.Module(new Uint8Array(decode(base64data))),
      { env: { memory } }
    );

    // Get 2d drawing context
    const ctx = document.getElementById('c').getContext('2d');
    const pointer = instance.exports._render();
    const data = new Uint8ClampedArray(memory.buffer, pointer, width * height * 4);
    const img = new ImageData(data, width, height);
    ctx.putImageData(img, 0, 0);
    #define PI 3.14159265358979323846
    #define RAD 6.283185307179586
    #define COEFF_1 0.7853981633974483
    #define COEFF_2 2.356194490192345
    #define BLADES 3
    #define CYCLE_WIDTH 100
    #define BLADES_T_CYCLE_WIDTH 300

    #include <math.h>
    #include <stdlib.h>
    #include <emscripten.h>

    int height;
    int width;
    int pixelCount;
    int ch;
    int cw;
    double maxDistance;

    /*
    We'll cheat a bit and just allocate loads of memory
    so we don't have to implement malloc
    */
    int data[2000000];

    int* EMSCRIPTEN_KEEPALIVE init(int cWidth, int cHeight) {
      width = cWidth;
      height = cHeight;
      pixelCount = width * height;
      ch = height >> 1;
      cw = width >> 1;
      maxDistance = sqrt(ch * ch + cw * cw);
      // data = malloc(pixelCount * sizeof(int));
      return &data[0];
    }

    double customAtan2(int y, int x) {
      double abs_y = abs(y) + 1e-10;
      double angle;
      if (x >= 0) {
        double r = (x - abs_y) / (x + abs_y);
        angle = 0.1963 * r * r * r - 0.9817 * r + COEFF_1;
      } else {
        double r = (x + abs_y) / (abs_y - x);
        angle = 0.1963 * r * r * r - 0.9817 * r + COEFF_2;
      }
      return y < 0 ? -angle : angle;
    }

    // Using the 'native' fmod would require us to provide the module with asm2wasm...
    double customFmod(double a, double b) {
      return (a - b * floor(a / b));
    }

    void EMSCRIPTEN_KEEPALIVE render(double timestamp) {
      int scaledTimestamp = floor(timestamp / 10.0 + 2000.0);
      for (int y = 0; y < height; y++) {
        int dy = ch - y;
        int dysq = dy * dy;
        int yw = y * width;
        for (int x = 0; x < width; x++) {
          int dx = cw - x;
          int dxsq = dx * dx;
          double angle = customAtan2(dx, dy) / RAD;
          // Arbitrary mangle of the distance, just something that looks pleasant
          int asbs = dxsq + dysq;
          double distanceFromCenter = sqrt(asbs);
          double scaledDistance = asbs / 400.0 + distanceFromCenter;
          double lerp = 1.0 - (customFmod(fabs(scaledDistance - scaledTimestamp + angle * BLADES_T_CYCLE_WIDTH), CYCLE_WIDTH)) / CYCLE_WIDTH;
          // Fade R more slowly
          double absoluteDistanceRatioGB = 1.0 - distanceFromCenter / maxDistance;
          double absoluteDistanceRatioR = absoluteDistanceRatioGB * 0.8 + 0.2;
          int fadeB = round(50.0 * lerp * absoluteDistanceRatioGB);
          int fadeR = round(240.0 * lerp * absoluteDistanceRatioR * (1.0 + lerp) / 2.0);
          int fadeG = round(120.0 * lerp * lerp * lerp * absoluteDistanceRatioGB);
          data[yw + x] =
            (255 << 24) | // A
            (fadeB << 16) | // B
            (fadeG << 8) | // G
            fadeR; // R
        }
      }
    }
    const canvas = document.getElementById('c');

    // Contains the actual webassembly
    const base64data = 'AGFzbQEAAAABFgRgAn9/AX9gAn9/AXxgAXwAYAF8AXwCEgEDZW52Bm1lbW9yeQIBgAKAAgMFBAMCAQAHEwIFX2luaXQAAwdfcmVuZGVyAAEKowUEKQAgAEQAAAAAAADgP6CcIABEAAAAAAAA4D+hmyAARAAAAAAAAAAAZhsLogMCDH8DfEGMrOgDKAIAIgZBAEoEQEGQrOgDKAIAIQdBiKzoAygCACIEQQBKIQhBlKzoAygCACEJIABEAAAAAAAAJECjRAAAAAAAQJ9AoJyqtyEOQYCs6AMrAwAhDwNAIAcgA2siBSAFbCEKIAQgA2whCyAIBEBBACEBA0AgCSABayICIAJsIApqtyIAnyENRAAAAAAAAPA/IAIgBRACRBgtRFT7IRlAo0QAAAAAAMByQKIgAEQAAAAAAAB5QKMgDaAgDqGgmSIAIABEAAAAAAAAWUCjnEQAAAAAAABZQKKhRAAAAAAAAFlAo6EiAEQAAAAAAABJQKJEAAAAAAAA8D8gDSAPo6EiDaIQAKohAiAARAAAAAAAAPA/oCAARAAAAAAAAG5AoiANRJqZmZmZmek/okSamZmZmZnJP6CiokQAAAAAAADgP6IQAKohDCABIAtqQQJ0QYAIaiAAIAAgAEQAAAAAAABeQKKioiANohAAqkEIdCACQRB0ciAMckGAgIB4cjYCACABQQFqIgEgBEcNAAsLIANBAWoiAyAGSA0ACwsLhQEBA3wgAEEAIABrIABBf0obt0S7vdfZ33zbPaAhAiABtyEDIAFBf0oEfEQYLURU+yHpPyEEIAMgAqEgAiADoKMFRNIhM3982QJAIQQgAiADoCACIAOhowsiAiACIAJE4zYawFsgyT+ioqIgAkRgdk8eFmrvP6KhIASgIgKaIAIgAEEASBsLTABBiKzoAyAANgIAQYys6AMgATYCAEGQrOgDIAFBAXUiATYCAEGUrOgDIABBAXUiADYCAEGArOgDIAEgAWwgACAAbGq3nzkDAEGACAs=';

    const decode = (b64) => {
      const str = window.atob(b64);
      const array = new Uint8Array(str.length);
      for (let i = 0; i < str.length; i += 1) {
        array[i] = str.charCodeAt(i);
      }
      return array.buffer;
    };

    const memSize = 256;
    const memory = new WebAssembly.Memory({ initial: memSize, maximum: memSize });
    const instance = new WebAssembly.Instance(
      new WebAssembly.Module(new Uint8Array(decode(base64data))),
      { env: { memory } }
    );

    const height = canvas.height;
    const width = canvas.width;

    // Disabling alpha seems to give a slight boost. Image data still includes alpha though.
    const ctx = canvas.getContext('2d', { alpha: false, antialias: false, depth: false });
    if (!ctx) {
      throw 'Your browser does not support canvas';
    }

    const pointer = instance.exports._init(width, height);
    const data = new Uint8ClampedArray(memory.buffer, pointer, width * height * 4);
    const img = new ImageData(data, width, height);

    const render = (timestamp) => {
      instance.exports._render(timestamp);
      ctx.putImageData(img, 0, 0);
      window.requestAnimationFrame(render);
    };

    window.requestAnimationFrame(render);
Getting emcc to produce WASM with a sane amount of stuff imported from JavaScript takes some time. The short of it: compile with emcc -Os and check the WAT for required imports with something like wasm2wat.
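Besides reading the WAT, the imports a compiled module expects can also be listed from JavaScript with the standard `WebAssembly.Module.imports` and `WebAssembly.Module.exports` functions. Here is a small sketch that inspects the red-fill module from the minimal example in this post; the variable names are just for illustration:

```javascript
// The red-fill module from the minimal example in this post
const base64data = 'AGFzbQEAAAABBQFgAAF/AhIBA2VudgZtZW1vcnkCAYACgAIDAgEABwsBB19yZW5kZXIAAApJAUcBA38DQCAAQaAGbCECQQAhAQNAIAEgAmpBAnRBgAhqQf+BgHg2AgAgAUEBaiIBQaAGRw0ACyAAQQFqIgBBkANHDQALQYAICw==';

// Decode base64 into bytes (atob is available in browsers and in recent Node.js)
const wasmBytes = Uint8Array.from(atob(base64data), (c) => c.charCodeAt(0));
const wasmModule = new WebAssembly.Module(wasmBytes);

// Everything the module expects us to provide at instantiation time
const wasmImports = WebAssembly.Module.imports(wasmModule);
// Everything the module makes available to JS
const wasmExports = WebAssembly.Module.exports(wasmModule);

// For this module: a single import (env.memory) and a single export (_render)
for (const imp of wasmImports) {
  console.log(`import ${imp.module}.${imp.name} (${imp.kind})`);
}
for (const exp of wasmExports) {
  console.log(`export ${exp.name} (${exp.kind})`);
}
```

If the import list contains anything beyond the memory you pass in yourself, the module will refuse to instantiate until you provide it, so a short list is what you want to see.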
You can take a look at the project for further details.
You’ll notice that we’re using a custom atan2 approximation: while Emscripten happily produced WebAssembly with atan2 imported from the C libs, the result was as slow as the JS version. We also use a custom fmod, as emcc doesn’t seem to provide a native version of that. All floating point values are doubles: we don’t really need the accuracy, but it runs faster and creates a smaller WASM file.
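To get a feel for how much accuracy the approximation gives up, here is a JS port of `customAtan2` from the C source, compared against `Math.atan2`. The port and the `maxAtan2Error` helper are only a sketch for experimenting and are not part of the demos:

```javascript
const COEFF_1 = Math.PI / 4;
const COEFF_2 = (3 * Math.PI) / 4;

// JS port of customAtan2 from the C source
const customAtan2 = (y, x) => {
  const absY = Math.abs(y) + 1e-10; // avoids division by zero
  let angle;
  if (x >= 0) {
    const r = (x - absY) / (x + absY);
    angle = (0.1963 * r * r * r) - (0.9817 * r) + COEFF_1;
  } else {
    const r = (x + absY) / (absY - x);
    angle = (0.1963 * r * r * r) - (0.9817 * r) + COEFF_2;
  }
  return y < 0 ? -angle : angle;
};

// Hypothetical check: sample points around the unit circle and
// record the worst absolute error in radians.
const maxAtan2Error = (samples) => {
  let worst = 0;
  for (let i = 0; i < samples; i += 1) {
    const t = (((i + 0.5) / samples) * 2 * Math.PI) - Math.PI;
    const y = Math.sin(t);
    const x = Math.cos(t);
    worst = Math.max(worst, Math.abs(customAtan2(y, x) - Math.atan2(y, x)));
  }
  return worst;
};

console.log(maxAtan2Error(3600));
```

The worst-case error should come out at roughly a hundredth of a radian, which is invisible in an effect like this. Note that the approximation is not meaningful at the exact origin, which is why the check samples a circle around it.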
We’re not doing dynamic memory allocation for the image data, as it doesn’t really affect the render speed and I can’t be bothered.
The JS code instantiates the WebAssembly module, sets up basic canvas stuff and copies the data over to the canvas in the render loop.
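The reason `data` and `img` can be created once, outside the render loop, is that a typed array built on `memory.buffer` is a live view rather than a copy. The sketch below illustrates this without a compiled module; `fakeRender`, the 2×2 size and the offset of 1024 are stand-ins chosen for the example:

```javascript
// One 64 KiB page is plenty for this illustration
const memory = new WebAssembly.Memory({ initial: 1, maximum: 1 });

// Assumed values for illustration: a 2x2 image stored at byte offset 1024
const pointer = 1024;
const width = 2;
const height = 2;

// The view we would wrap in an ImageData and hand to putImageData
const pixels = new Uint8ClampedArray(memory.buffer, pointer, width * height * 4);

// Stand-in for the module's render(): write pixels straight into linear memory
const fakeRender = (color) => {
  new Uint32Array(memory.buffer, pointer, width * height).fill(color);
};

fakeRender(0xFF0000FF); // opaque red on little-endian hardware
console.log(Array.from(pixels.subarray(0, 4)));

fakeRender(0xFF00FF00); // opaque green, and the same view reflects the change
console.log(Array.from(pixels.subarray(0, 4)));
```

This only holds as long as the memory never grows, since growing detaches the old buffer. Setting `maximum` equal to `initial`, as the demo code does, rules that out.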
The performance is better than with plain JS, but I’m still getting only about 55 FPS on my laptop. For some reason there’s quite a bit of fluctuation in the FPS, at least in Firefox.
The only out-of-the-ordinary thing here is that we request high precision floats, as mobile devices produce garbled results if we don’t. WebGL is strongly typed and only likes to do some operations with floats, so pretty much everything here is a float. Many of the operations can be done on vectors, so the code is more compact and easier to read. WebGL’s Y axis is inverted, so as a lazy solution we flip the arguments to atan2.
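Requesting high precision is done with a `precision highp float;` declaration at the top of the fragment shader. Not every device is guaranteed to support `highp` in fragment shaders, so a cautious setup can ask first. The sketch below uses the standard `getShaderPrecisionFormat` call; `pickFragmentPrecision` and `withPrecision` are hypothetical helpers, not what the demo itself does:

```javascript
// Hypothetical helper: pick the best float precision the fragment shader supports.
// getShaderPrecisionFormat reports a precision of 0 when a format is unsupported.
const pickFragmentPrecision = (gl) => {
  const high = gl.getShaderPrecisionFormat(gl.FRAGMENT_SHADER, gl.HIGH_FLOAT);
  return high && high.precision > 0 ? 'highp' : 'mediump';
};

// Hypothetical helper: prepend the precision declaration to a shader body
const withPrecision = (gl, shaderBody) =>
  `precision ${pickFragmentPrecision(gl)} float;\n${shaderBody}`;

// Usage in a browser (assumes a canvas element with id 'c', as elsewhere in this post):
// const gl = document.getElementById('c').getContext('webgl');
// const source = withPrecision(gl, 'void main() { gl_FragColor = vec4(1.0, 0.0, 0.0, 1.0); }');
```

Falling back to `mediump` keeps the shader compiling on such devices, although an effect that relies on large intermediate values may then show the garbling mentioned above.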
WebGL
WebGL is really fast. I get 60 FPS even on my phone.
Summa summarum
While JavaScript offers the best developer experience and the widest support, its performance leaves a lot to be desired.
WebAssembly is still somewhat painful to get running, but should get better over time. While it is a bit behind WebGL in performance, it does allow using existing C/C++ codebases, which might come in handy.
WebGL’s performance is the best of the bunch, and it would pull even further ahead if our effect were more complicated.
In some cases combining WebAssembly and WebGL might be a good idea.
I’d recommend going with WebGL if at all possible: writing it was simple enough once the boilerplate was done.