Making Audio Reactive Visuals

uber1

There are a few different ways to sync visuals to music:

  • Manual – live controlling visuals with keyboard or midi controls.
  • Sequencing – pre-analyzing the music and scripting an animation as a list of timecoded events.
  • Midi Input –  if you have access to the music’s midi data, this can be a great way to drive visuals.
  • Audio Reactive - code driven visuals that automatically adapt to a live audio input.

Here I want to talk about the last method. This can be useful for “Hands Free VJing”, allowing you to sit back and have the visuals automatically sync, or in a video game where you want some part of the visuals to react to the soundtrack.

The demos below work in Chrome using Three.js and the Web Audio API, but the same principals apply if you are using Processing, OpenFrameworks or some other framework.

Audio Analysis

audio

To sync to an audio input, we need to analyse the audio stream in realtime. There are 4 main pieces of data we can extract:

  • Volume - the thicker bar on the right hand side
  • Waveform - the jagged white line
  • Levels - the bar chart of frequency amplitudes, from bass on the left to treble on the right.
  • Beat Detection - the volume bar flashes white when a beat is detected. The white line above the volume bar indicates the beat threshold.

To see what these look like, view the Audio Analysis Demo. Drag and drop an MP3 file to play it, or switch to the mic input with the control panel at right.

Volume

The volume is the current global amplitude or loudness of the track. Volume data can be used raw or eased over time to give a smoother value:

smoothedVolume += (volume  - smoothedVolume) * 0.1;

Simple volume tracking can be enough to give a nice synced feel. In the Paradolia demo, the volume is used to determine the brightness of the lights in the scene. Beat detection is also used to trigger the material textures switching out.

Waveform

The waveform is the shape of the sound wave as it flies through the air and hits your ear. With the Web Audio API, use this call to get the waveform as an array of numbers between 0 and 256, where 128 indicates silence:

analyser.getByteTimeDomainData(timeByteData);

The Loop Waveform Visualizer draws the waveform data into circles that expand from the middle of the screen. The volume is also used to give a little bounce on the height of the waveform.

Levels

The levels are an array of amplitudes for each frequency range. They can be visualized as a bar chart or a 1980′s graphic equalizer. Using the WebAudio API this call will get the levels as an array of numbers between 0 to 256, where 0 indicates silence.

analyser.getByteFrequencyData(freqByteData);

In the ÜberViz demo the levels data sets the thickness of the colored strips. The smoothed volume is used to determine the size of the central white shape. The time period of the stripes movement is set to the BPM of the song. Beat detection is used to transition the camera angle.  On each transition I use the Bad TV shader to do a little warping (thanks to @active_theory for the suggestion).

uber2

Beat Detection

Reliable beat detection is hard. An audio waveform is a complex shape formed by multiple sounds overlapping, so it can be hard to pick out the beat. A beat can be defined as a “brutal variation in sound energy“, meaning a beat is when the volume goes up quickly in relation to the previous value. You can do beat detection on the global volume, or by focussing on specific frequencies (e.g. to separate the bass drum from the hi-hats).

In the Audio Analysis demo we use a Simple Beat Detection Algorithm with the following logic:

  1. Track a threshold volume level.
  2. If the current volume exceeds the threshold then you have a beat. Set the new threshold to the current volume.
  3. Reduce the threshold over time, using the Decay Rate.
  4. Wait for the Hold Time before detecting for the next beat. This can help reduce false positives.

In the demo, you can play with the ‘Beat Hold’ and ‘Beat Decay’ values to try to lock onto certain beats. This type of beat detection is good for finding less frequent ‘transition points’, depending on the delay and decay values used.

Beat detection results are heavily dependent on the track you choose. To get good results you want a track with a high dynamic range (from loud to quiet) and a simple structure. I find that Dubstep in particular is hard to beat detect, since it is typically uses lots of compression (making the whole song equally loud) and has complex drum breaks.

For professional live VJing or video music production, it’s often best to combine automatic audio-reactivity with live ‘knob twiddling’ or sequencing to produce the most interesting visuals.

Happy Visualizing!

Intro to Pixel Shaders in Three.js

8231596784_dc2f2d935d_b

I recently started playing with shaders in three.js and I wanted to share some of what I’ve discovered so far. Shaders are the ‘secret sauce’ of modern graphics programming and understanding them gives you a lot of extra graphical fire-power.

For me the big obstacle to learning shaders was the lack of documentation or simple examples, so hopefully this post will be useful to others starting out. This post will focus on using pixel shaders to add post-processing effects to Three.js scenes. This post assumes you already know the basics of using Three.js.

What is a Shader?

A Shader is a piece of code that runs directly on the GPU. Most modern devices have powerful GPUs designed to handle graphics effects without taxing the CPU. This means you get a lot of graphical power essentially for free.

The big conceptual shift when considering shaders is that they run in parallel. Instead of looping sequentially through each pixel one-by-one, shaders are applied to each pixel simultaneously, thus taking advantage of the parallel architecture of the GPU.

There are 2 main types of shaders – vertex shaders and pixel shaders.

  • Vertex Shaders generate or modify 3D geometry by manipulating its vertices. A good example is this fireball where the vertex positions of a sphere geometry are deformed by perlin noise.
  • Pixel Shaders (or ‘Fragment Shaders’) modify or draw the pixels in a scene. They are used to render a 3D scene into pixels (rasterization), and also typically used to add lighting and other effects to a 3D scene.

There are 2 different kinds of pixel shaders -

  • Shaders that draw an image or texture directly. These allows you to draw the kind of  abstract patterns seen on glsl.heroku.com. These types of shaders can be loaded into a THREE.ShaderMaterial to give cool textures to 3D objects like this example.
  • Shaders that modify another image or texture. These allow you to do post-processing on an existing texture, for example to add a glow or blur to a 3D scene. This second type of shader is what we will be talking about for the remainder of this post.

Pixel Shaders in Three.js

Three.js has an effects manager called EffectsComposer and many useful shaders built in. This code is not compiled into the main Three.js file, rather it is maintained separately in 2 folders in the three.js root folder:

  • /examples/js/postprocessing - contains the main EffectsComposer() class, and a number of ShaderPasses.
  • /examples/js/shaders – contains multiple individual shaders.

Unfortunately these shaders are not very well documented, so you need to dig in and test them out yourself.

Preview some of the three.js built-in shaders with this demo.

preview

Applying Shaders in Three.js

Applying a shader is pretty straight-forward. This example applies a dot screen and RGB shift effect to a simple 3D scene:

To use shaders that come with three.js, first we need to include the required shader JS files. Then in the scene initialization we set up the effect chain:

// postprocessing
composer = new THREE.EffectComposer( renderer );
composer.addPass( new THREE.RenderPass( scene, camera ) );

var dotScreenEffect = new THREE.ShaderPass( THREE.DotScreenShader );
dotScreenEffect.uniforms[ 'scale' ].value = 4;
composer.addPass( dotScreenEffect );

var rgbEffect = new THREE.ShaderPass( THREE.RGBShiftShader );
rgbEffect.uniforms[ 'amount' ].value = 0.0015;
rgbEffect.renderToScreen = true;
composer.addPass( rgbEffect );

First we create an EffectComposer() instance. The effect composer is used to chain together multiple shader passes by calling addPass(). Each Shader Pass applies a different effect to the scene. Order is important as each pass effects the output of the pass before. The first pass is typically the RenderPass(), which renders the 3D scene into the effect chain.

To create a shader pass we either create a ShaderPass() passing in a shader from the ‘shaders’ folder, or we can use some of the pre-built passes from the ‘postprocessing’ folder, such as BloomPass. Each Shader has a number of uniforms which are the input parameters to the shader and define the appearance of the pass. A uniform can be updated every frame, however it remains uniform across all the pixels in the pass. Browse which uniforms are available by viewing the shader JS file.

The last pass in the composer chain needs to be set to to renderToScreen. Then in the render loop, instead of calling renderer.render() you call composer.render()

That’s all you need to apply existing effects. If you want to build your own effects, continue.

GLSL Syntax

WebGL shaders are written in GLSL. The best intro to GLSL syntax I found is at Toby Schachman’s Pixel Shaders interactive tutorial. Go thru this quick tutorial first and you should get a lightbulb appearing over your head. Next take a look at the examples in his example gallery. You can live edit the code to see changes.

GLSL is written in C so get out your ‘Kernighan and Ritchie’ :). Luckily a little code goes a long way so you won’t need to write anything too verbose. The main WebGL language docs are the GLSL ES Reference Pages, which lists all the available functions.  GLSL Data Types are described here. Usually Googling ‘GLSL’ and your query will give you good results.

Some GLSL Notes:

  • Floats always need a number after the decimal point so 1 is written as 1.0
  • GLSL has many useful utility functions built in such as mix() for linear interpolation and clamp() to constrain a value.
  • GLSL allows access to components of vectors using the letters x,y,z,w and r,g,b,a. So for a 2D coordinate vec2 you can use pos.x, pos.y. For a vec4 color you can use col.r, col.g,col.b,col.a
  • Most GLSL functions can handle with multiple input types e.g. float, vec2, vec3 and vec4.
  • Debugging GLSL is notoriously difficult, however Chrome’s JS console will provide pretty good error messaging and will indicate which line of the shader is causing a problem.

Brightness Shader Example

For the first example we will walk through a super simple brightness shader. Slide the slider to change the brightness of the 3D scene.

Shader code can be included in the main JS file or maintained in separate JS files. In this case the shader code is in it’s own file. We can break apart the shader code into 3 sections, the uniforms, the vertex shader and the fragment shader. For this example we can skip the vertex shader since this section remains unchanged for pixel shaders. Three.js shaders require a vertex and fragment shader even if you are only modifying one.

UNIFORMS
The “uniforms” section lists all the inputs from the main JS. Uniforms can change every frame, but remain the same between all processed pixels

uniforms: {
"tDiffuse": { type: "t", value: null },
"amount": { type: "f", value: 0.5 }
},
  • tDiffuse is the texture from the previous shader. This name is always the same for three.js shaders. Type ‘t’ is a texture – essentially a 2D bitmap. tDiffuse is always passed in from the previous shader in the effect chain.
  • amount is a custom uniform defined for this shader. Passed in from the main JS. Type ‘f’ is a float.

FRAGMENT SHADER
The fragmentShader (pixel shader) is where the actual pixel processing occurs. First we define the variables, then we define the main() code loop.

fragmentShader: [

"uniform sampler2D tDiffuse;",
"uniform float amount;",
"varying vec2 vUv;",

"void main() {",
"vec4 color = texture2D(tDiffuse, vUv);",
"gl_FragColor = color*amount;",
"}"

].join("\n")

Here you will notice one of the quirks of shaders in three.js. The shader code is written as a list of strings that are concatenated. This due to the fact that there is no agreed way to load and parse separate GLSL files. It’s not great but you get used to it pretty quick.

  • uniform variables are passed in from main JS. The uniforms listed here must match the uniforms in the uniforms section at the top of the file.
  • varying variables vary for each pixel that is processed. vUv is a 2D vector that contains the UV coordinates of the pixel being processed. UV coords go from 0 to 1. This value is always called vUv and is passed in automatically by three.js

The main() function is the code that runs on each pixel.

  • Line 8 gets the color of this pixel from the passed in texture (tDiffuse) and the coord of this pixel (vUv). vec4 colors are in RGBA format with values from 0 to 1, so (1.0,0.0,0.0,0.5) would be red at 50% opacity.
  • Line 9 sets the gl_FragColor. gl_FragColor is always the output of a pixel shader. This is where you define the color of each output pixel. In this case we simply multiply the actual pixel color by the amount to create a simple brightness effect.

Mirror Shader Example

In addition to modifying the colors of each pixel, you can also copy pixels from one area to another. As in this example Mirror Shader.

For example to copy the left hand side of the screen to the right you can do this:

"uniform sampler2D tDiffuse;",
"varying vec2 vUv;",

"void main() {",
"vec2 p = vUv;",
"if (p.x > 0.5) p.x = 1.0 - p.x;",
"vec4 color = texture2D(tDiffuse, p);",
"gl_FragColor = color;",
"}"

This code checks the x position of each pixel it is run on (p.x). If it’s greater than 0.5 then the pixel is on the right hand side of the screen. In this case, instead of getting the gl_FragColor in the normal way, it gets the pixel color of the pixel at position 1.0 – p.x which is the opposite side of the screen.

More Shaders!

Here’s some examples of some more advanced shaders that I built recently. View the source to see how they work.

BAD TV SHADER:

Simulates a bad TV via horizontal distortion and vertical roll, using Ashima WebGL Noise. Click to randomize uniforms.

badtv 

DOT MATRIX SHADER:

Renders a texture as a grid of dots. The demo then applies a glow pass by blurring and compositing the scene via an Additive Blend Shader.

dotmatrix

PAREIDOLIA:

An audio reactive animation. Post-processing via a combination of mirror, dotscreen shader and RGB shift shaders:

para

Of course this is just scratching the surface of what you can do with shaders, but hopefully it will be enough to get some people started. If you made it this far, congratulations! Let me know how you get on in the comments.

WebCamMesh Demo

See my Experiment on ChromeExperiments.com
WebCamMesh is a HTML5 demo that projects webcam video onto a WebGL 3D Mesh. It creates a ‘fake’ 3D depth map by mapping pixel brightness to mesh vertex Z positions. Perlin noise is used to create the ripple effect by modifying the Z positions based on a 2D noise field. CSS3 filters are used to add contrast and saturation effects.

Running the Demo

Use mouse move to tilt and scroll wheel to zoom. The 3D effect works better if the foreground elements are brighter than the background, so try it in a dark room. Run the demo.

The demo requires a WebGL capable machine and WebCam support. Currently that means Chrome and Opera. On my MacBook Pro I get about 30 FPS with Chrome. If the demo craps out, try resizing your browser down and reloading the page.

[UPDATE] - I added Opera support to the demo. Note that Opera does not support CSS filters so you won’t get the contrast and saturation effects. You may also need to enable WebGL for Opera.

Built With:

To Do:

  • Add Audio reactivity to the mesh from MP3 Web audio API
  • Add Snap shot button. Problem is there is no way to save out the pixels with the CSS filters applied, so I need to re-do the filters with shaders.
  • Add grid resolution slider

Generated Images

I Killed Krauss!

This cartoon manages to be quite profound in just 3 panels. From Tom the Dancing Bug by Ruben Bolling. I can imagine a Charlie Kaufman-esque sci-fi movie based on this concept.

Are Ideas Really Worthless?

http://www.flickr.com/photos/sadi-junior/

It’s received wisdom among startup types that: “Ideas are worthless, it’s the execution that counts”. It’s become so much of a cliche that the great pop-sci middle-brow heavy-weight Malcolm Gladwell has just written a New Yorker article about it.

Now obviously there is some truth to this. It’s easy to say – “Let’s build a website exactly like Tumblr, but for recipes!” And then do nothing about it. This happens to me multiple times a day.

However there is a counter argument: without the right idea, any amount of execution is worthless.

I have personally seen it many times. Brilliant technologists and designers who build a startup around an idea that just doesn’t make much sense. It either doesn’t fulfill a need, or it cannot be explained in one sentence, or it is just too similar to ideas many other startups are executing on at the same time.

So yes, the execution is incredibly important – but so is finding the right idea to execute on. Building a technically or aesthetically wonderful product is a waste of time if nobody wants to use it.

The hard part is figuring out which are the good ideas and which are the bad. Doing this requires a combination of intuition, research and general knowledge of the world around you.

Flickr founder Caterina Fake voiced a similar opinion a while ago that resonated with me:

“Much more important than working hard is knowing how to find the right thing to work on. Paying attention to what is going on in the world. Seeing patterns. Being able to read what people want.”

Loop Waveform Visualizer

See my Experiment on ChromeExperiments.com
Chrome’s new Web Audio API allows us to do some pretty amazing audio stuff directly in the browser. Specifically the RealtimeAnalyserNode Interface provides real-time frequency and time-domain audio analysis which allows you to display ‘graphic equalizer’ style level bar charts and waveforms of an audio source. Chrome’s drag-and-drop file handling also allows you to drag and play MP3 files from the desktop. This gives a lot of potential to build very cool realtime audio visualizations in the browser, which was previously only possible using Processing or similar.

The Loop Waveform Visualizer uses a combination of level and waveform data to produce a circular audio visualization of any MP3. Use the mouse to tilt and the mousewheel to zoom.

To run this, you need a WebGL capable machine and the latest Chrome. Also be aware that it won’t look as good when running under Windows, since Chrome’s WebGL implementation on Windows does not suport line thickness (among other issues). It works better if you use a track that has a high dynamic range (meaning the volume of the track changes a lot over time).

The current time slice is rendered in the center, then displaced outwards over time. The level determines the brightness, thickness and Z scale of the loops. The Z displacement gives a nice ‘bounce to the beat’ effect. The waveform shape is drawn into the loop which means you can almost ‘see’ the sound. As with most visualizations, there was a lot of parameter tweaking to give a nice feel. I’m very happy with the performance of this piece – on my box it’s stays pretty solid at 60FPS. This is partially due to the fact that no new 3D objects are created over time. The 160 loops are created on initialization, then have their geometry modified each frame.

I have some plans to improve this piece, including adding post-processing effects, volume sensitivity controls, auto camera movement etc. Let me know if you have any more suggestions for improvement in the comments.

Creative Commons audio sample is from “Screw Base” by Beytah. Built with Three.js. Source is accessible from the demo URL.