Computers have reached the point where heavy video and graphics processing is quite affordable. This opens up the visual dimension to the musician in a whole new way. Motion and color tracking using video processing allows all sorts of interactions that previously would have been expensive and difficult to implement. By extracting data from live video streams, a wide range of gestures can be captured and mapped however the instrument designer sees fit.
Video is a noisy medium, so lighting and postprocessing can be important in obtaining usable data values. Most current video cameras also run at slow frame rates and are therefore not low latency. With NTSC (60 Hz field rate) or PAL (50 Hz field rate) capture devices, there is a built-in delay of roughly 16 or 20 ms (one field period) from action to detection, not including processing time. Most webcams run at a frame rate of 30 Hz, which is quite slow for use as a musical controller, but webcams using USB 2.0 or IEEE-1394 ("FireWire") can provide much higher frame rates to reduce this latency, some as high as 120 Hz.
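As a quick check on where those figures come from, the floor on capture latency is simply one field or frame period, i.e. 1000/rate milliseconds; a small sketch (plain Python, no camera needed; the chosen rates are just illustrative):

    # Minimum capture latency is one field/frame period (1000 / rate ms).
    # This ignores processing time, driver buffering, and display latency.
    for rate_hz in (50, 60, 30, 120):  # PAL fields, NTSC fields, typical webcam, fast USB 2.0/FireWire camera
        print(f"{rate_hz:3d} Hz -> {1000.0 / rate_hz:5.1f} ms per frame")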
There are three key methods of tracking gestures with video: color, motion, and shape. The most common is motion detection. With Gem, you can use [pix_movement] in combination with [pix_blob]; PDP provides [pdp_mgrid], which is grid-based motion tracking; GridFlow provides motion detection by subtracting the previous frame from the current frame using [@-]. For color and shape tracking, PDP provides [pdp_ctrack] and [pdp_shape] respectively. Another option is to process the visual data in outside software and feed the resulting data into Pd. The reacTable [Kaltenbrunner et al.(2004)Kaltenbrunner, Geiger, and Jordà] takes that approach, using OSC to communicate between the two pieces of software.
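To make the frame-differencing and external-software ideas concrete, here is a minimal sketch, not part of any of the toolkits above, written in Python with OpenCV, numpy, and python-osc (all assumptions): it subtracts the previous frame from the current one, measures the amount and rough position of motion, and sends the values to Pd over OSC.

    # Frame-differencing motion tracker that sends its results to Pd over OSC.
    # Assumptions: OpenCV (cv2), numpy, and python-osc are installed, a camera
    # is available at index 0, and Pd is listening for OSC on port 9000.
    import cv2
    import numpy as np
    from pythonosc.udp_client import SimpleUDPClient

    client = SimpleUDPClient("127.0.0.1", 9000)  # host and port are placeholders
    cap = cv2.VideoCapture(0)
    prev = None

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        if prev is not None:
            diff = cv2.absdiff(gray, prev)                              # current frame minus previous frame
            _, mask = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)   # suppress sensor noise
            amount = float(np.count_nonzero(mask)) / mask.size          # amount of motion, 0..1
            m = cv2.moments(mask, binaryImage=True)
            if m["m00"] > 0:                                            # centroid of the moving region
                cx = m["m10"] / m["m00"] / mask.shape[1]                # normalized to 0..1
                cy = m["m01"] / m["m00"] / mask.shape[0]
                client.send_message("/motion", [amount, cx, cy])
        prev = gray

On the Pd side, the resulting /motion messages can be received with OSC objects such as [dumpOSC 9000] and [OSCroute /motion], which output the motion amount and normalized centroid as a list ready to be mapped to synthesis parameters.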