Collaborative Streaming Performance
Guy van Belle, 14/03/2006

This tutorial provides a practitioner's perspective on ways of doing real-time collaborative projects over the web, such as live audio sessions between multiple performers in different countries. It explains some freely-available tools for achieving this, along with descriptions of examples from the author's own practice, and presents some skeletal projects that can be adapted for other people to use in their own work.

Introduction

First of all, the status of 'technological artist', 'media artist' or 'sound artist' that I seem to have implies that the following tutorial will not be a very technical one, but rather one written from the point of view of the creative user. Secondly, since I have been creating the examples mostly in the Max/MSP/Jitter programming language on Apple's OS X (BSD) platform, I will mainly be commenting from that setup, but I hope to give some extended references for use under Linux and PD, which I have used on several occasions. Thirdly, I will describe some setups that tend to obliterate personal creative efforts in favor of collaborative ones. Hence the main focus is on creativity and communication, not on broadcasting virtuoso exposures.

In a practical sense we had to start thinking about streaming in a stupid, funny way: all accidents have something unfortunate but at the same time disruptively innovative in them. Back in the early 1990s the internet was slower than slow compared to today, but still faster than running to another computer, even one in the same building. Being part of a computer band meant (and still means) that the first concern is what possibilities for synchronisation exist. You can then use them or ignore them, but you can never escape them.

There are at least 2 ways to look at this problem. The first possibility is to use a cable between 2 computers, pour the entire audio/visual files into one end and hope they come out the other end. Then we get what is on one computer copied to the other. But first of all, copying the data will introduce a time lag or delay, also called latency. Secondly, ethernet and other cables don't allow us to do that without interrupting the stream, so mistakes, hiccups and complete failures will occur. Thirdly, the copy will not give us much of the information we want to use for synchronizing: we will only know the start, end and duration, and maybe when a new frame is coming in if we deal with video.

To make a much better interface for streaming we'd rather have some precise, lightweight numbers sent across. For that purpose, 10 years ago CNMAT made Open Sound Control (OSC) available.

1 - Open Sound Control

urls:
http://www.cnmat.berkeley.edu/OpenSoundControl
http://www.cnmat.berkeley.edu/OpenSoundControl/OSC-spec.html

"OpenSound Control ("OSC") is a protocol for communication among computers, sound synthesizers, and other multimedia devices that is optimized for modern networking technology and has been used in many application areas."

OSC allows you, from within a variety of applications, to build your own messages for sending data to a variety of applications on another computer. All you need is an ethernet cable to connect the 2 computers, or a connection to a network. Between the 2 computers you decide what form the messages will take. For instance, when you want to tell the other computer that you started a specific movie that plays at 3/4 speed, you invent the message 'start movie 2 0.75'. The other computer will of course have to parse out the data and decide how to use it, but at least you are getting it across in a fast and economical way. Let's say you play a specific sound file based on that information: when the message comes in you use 'movie 2' to select your sound file, and you change 0.75 into an appropriate speed for it. Basically it is a DIY and very open and free way to send and receive messages between specific computers.

Now, since OSC messages are usually sent over UDP (a transport protocol that, unlike TCP, works without handshaking), it is possible to send messages to computers anywhere on the internet, just by specifying the IP address you are sending to. You have to specify different port numbers for sending and receiving, but once you have experimented with this you can continuously share data. And of course you can build up a complex system of interactions between several computers from within a variety of software applications. The great thing is that for us, in the middle of the 90s, it meant OSC could replace the MIDI protocol with its very fixed and general formatting, and include more datatypes apart from integers: strings, floats, etc. And that opened up a completely new world for interaction.
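To make the 'start movie 2 0.75' idea concrete, here is a minimal sketch in Python (not the Max/MSP patches we actually used) of how such a message could be packed following the OSC 1.0 layout and sent as one UDP datagram. The address '/movie/start', the IP address and the port are assumptions invented for illustration; in practice you would agree on them with the person at the other end.

```python
import socket
import struct


def osc_pad(data: bytes) -> bytes:
    """Null-terminate and pad to a multiple of 4 bytes, as OSC strings require."""
    data += b"\x00"
    return data + b"\x00" * (-len(data) % 4)


def osc_message(address: str, *args) -> bytes:
    """Encode an OSC message carrying int, float and string arguments."""
    type_tags = ","
    payload = b""
    for arg in args:
        if isinstance(arg, int):
            type_tags += "i"
            payload += struct.pack(">i", arg)   # 32-bit big-endian integer
        elif isinstance(arg, float):
            type_tags += "f"
            payload += struct.pack(">f", arg)   # 32-bit big-endian float
        elif isinstance(arg, str):
            type_tags += "s"
            payload += osc_pad(arg.encode("ascii"))
        else:
            raise TypeError(f"unsupported OSC argument: {arg!r}")
    return osc_pad(address.encode("ascii")) + osc_pad(type_tags.encode("ascii")) + payload


def send_osc(packet: bytes, host: str, port: int) -> None:
    """Send one OSC packet as a single UDP datagram (no handshake, no reply)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(packet, (host, port))


# "Movie 2 started at 3/4 speed"; the address pattern is a hypothetical choice.
packet = osc_message("/movie/start", 2, 0.75)
# send_osc(packet, "192.0.2.10", 7000)  # hypothetical receiver address and port
```

The whole message is 28 bytes, which is why this kind of control data travels well even over slow connections.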

Example 1 - Berlin 10 comp piece

For a performance in Berlin in 2001, we used OSC as an interface to allow non-local musicians to interact with the 10 computers running a similar Max/MSP patch. The patch contained nothing more than 8 oscillators and some filters. The idea was to let 10 (local and connected) musicians experiment with 'beat frequencies'. Beat frequencies are a 'perceptual' phenomenon that occurs when 2 sound waves with slightly different frequencies (or pitches) reach your ear. The effect is a rhythmical 'beating' of the sound (due to the sound becoming louder and softer). The setup was a round table with 10 internet-connected laptops, each having 2 active speakers. Five local musicians were changing the frequencies by running around the table and trying to 'find' a beating sound. Five other musicians located at their homes in Rotterdam, Amsterdam, Brussels, Tokyo, and Reykjavik could access any of the computers and imitate what they were doing at home with the same patch. On purpose, the connected and local artists interfered with each other's playing.

This was conveniently done with OSC, by modifying one patch so that the connected musicians could send and the computers in Berlin could receive. The only thing I had to add was a small patch that received a number from 1 to 10 for selecting a local computer. From the original patch I made 2 versions: one sent out the parameters that were modified, and one received these parameters on the selected computer. The patch was so simple that I could use the same setup in Dresden for a workshop with a class of school children about sound phenomena.
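For readers who want to see the beating in numbers, here is a small Python sketch (not part of the original patch, and the two frequencies are arbitrary examples). It relies on the standard identity sin(a) + sin(b) = 2 sin((a+b)/2) cos((a-b)/2): two equally loud sine tones add up to a tone at the average frequency whose loudness swells and fades |f1 - f2| times per second.

```python
import math


def beat_frequency(f1: float, f2: float) -> float:
    """Number of loud/soft cycles per second heard when f1 and f2 sound together."""
    return abs(f1 - f2)


def mixed_sample(f1: float, f2: float, t: float) -> float:
    """Sum of two equally loud sine tones at time t (in seconds)."""
    return math.sin(2 * math.pi * f1 * t) + math.sin(2 * math.pi * f2 * t)


def loudness_envelope(f1: float, f2: float, t: float) -> float:
    """Slowly varying amplitude of the mix: |2 cos(pi (f1 - f2) t)|."""
    return abs(2.0 * math.cos(math.pi * (f1 - f2) * t))


# Two oscillators tuned 3 Hz apart beat three times per second.
print(beat_frequency(440.0, 443.0))
```

The closer the two oscillators are tuned, the slower the beating, which is what the musicians were hunting for around the table.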

About beat frequencies: http://hyperphysics.phy-astr.gsu.edu/hbase/sound/beat.html

Example 2 - Bratislava workshop

Trying to avoid workshops on specific tools (I am not a salesman), we rather try to work on collaborative ideas and concepts for generating a creative output. For a workshop in Bratislava in 2004, we connected several sound applications over a local network, on different operating systems. The nice thing is that you can distribute the data over the whole local network if you send them to the broadcast address (255.255.255.255) instead of to a specific IP address. This means you can run a kind of self-made proxy for your computer band.
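As an illustration of sending to 255.255.255.255, here is a minimal Python sketch using plain UDP sockets. It is not the setup from the workshop; the port number is a hypothetical choice, and whether such packets actually get through depends on the operating system and on how the local network is configured.

```python
import socket

BROADCAST_ADDRESS = "255.255.255.255"  # reaches every host on the local segment
PORT = 7000                            # hypothetical port; all participants must agree on it


def make_broadcast_socket() -> socket.socket:
    """Create a UDP socket that is allowed to send to a broadcast address."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    return sock


def broadcast(sock: socket.socket, payload: bytes, port: int = PORT) -> None:
    """Send one datagram to everybody listening on the given port."""
    sock.sendto(payload, (BROADCAST_ADDRESS, port))


def make_listener(port: int = PORT) -> socket.socket:
    """Create a UDP socket that receives whatever is sent to the given port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    return sock
```

Each participant would run a listener, and whoever has something to say calls `broadcast(...)` with an encoded message; nobody needs to know anybody else's address.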

Basically we ended up with four different networks that were parsed out and distributed over my computer:

So, all data generated by the 10 participants were reformatted into the 4 different new formats and sent to the broadcast address. Then we agreed that everyone in turn would take over the general clock for sequencing/synchronizing. As this was sent over OSC, the speed of the music played by one musician dynamically changed the speed of all connected sequencers accordingly. In the evening we played an excitingly diverse set thanks to that simple principle based on a connection.

As you can see from the examples, the power of OSC is that it is light and fit for slow networks, and therefore fast and fit for synchronisation! You can build your own communication as you go along and only have to agree on a format for parsing it out on the other end. It also means that in a joint setup you can experiment with several topologies (from peer-to-peer, over one-to-all, to all-to-all and everything in between). You can send it into any application that understands OSC. Apart from standalone programs you can really invent whole setups when you use one of the languages that are supported, like PHP, Perl, Max, C, SuperCollider, Smalltalk... Finally, it is worthwhile to look into the sensors and i/o boards that support OSC.
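The 'parsing it out on the other end' can be sketched in a few lines of Python. This is a minimal decoder for the int, float and string types of OSC 1.0 only, not a complete implementation; the example packet and its address '/movie/start' are invented for illustration.

```python
import struct


def read_padded_string(data: bytes, offset: int):
    """Read a null-terminated string and skip to the next 4-byte boundary."""
    end = data.index(b"\x00", offset)
    text = data[offset:end].decode("ascii")
    next_offset = (end + 4) & ~3  # first multiple of 4 after the terminating null
    return text, next_offset


def parse_osc_message(data: bytes):
    """Return (address, [arguments]) for a message with 'i', 'f' and 's' arguments."""
    address, offset = read_padded_string(data, 0)
    type_tags, offset = read_padded_string(data, offset)
    if not type_tags.startswith(","):
        raise ValueError("missing OSC type tag string")
    args = []
    for tag in type_tags[1:]:
        if tag == "i":
            args.append(struct.unpack_from(">i", data, offset)[0])
            offset += 4
        elif tag == "f":
            args.append(struct.unpack_from(">f", data, offset)[0])
            offset += 4
        elif tag == "s":
            text, offset = read_padded_string(data, offset)
            args.append(text)
        else:
            raise ValueError(f"unsupported OSC type tag: {tag!r}")
    return address, args


# A hand-built example packet: address, type tags ",if", then the int 2 and the float 0.75.
example = b"/movie/start\x00\x00\x00\x00" + b",if\x00" + b"\x00\x00\x00\x02" + b"\x3f\x40\x00\x00"
address, args = parse_osc_message(example)
```

What the receiver does with `address` and `args` is entirely up to the two sides, which is exactly the openness described above.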

Negative points could be that not all applications seem to understand the 'openness' and why one would want a dynamic specification of the parameters (situated at user level). They make their own implementation of it, fixing it as a new MIDI, and so they lose an important advantage of OSC. A second point is the network itself, which has become more complex than needed with its millions of firewalls and protectionist measures. That can really be a pain. That is why over the years people have been developing OSC server applications that do nothing but receive the OSC packets/data, and can be accessed easily from anywhere on ports that are open anyway (like port 80 for browsers). One of them is located at http://share.dj/share/event_info.php?eventID=28 and was made for a connected performance event. And finally, for really streaming rich/heavy data like audio and video we still have to come up with something different.

We have included the patch max_2_reaktor which we used to connect both programs.


max_2_reaktor patch

2 - QuickTime/Darwin Streaming Server

urls:
http://developer.apple.com/darwin/projects/streaming
http://www.apple.com/quicktime/broadcaster/

But of course we want to work with more than just control data. We want to stream the full video and audio! And the good news is that the internet as a physical network has improved a lot for that in 10 years' time: better stability, more bandwidth, better compression, etc. But since we tend to work with open source and freeware tools as much as possible, there is a drawback: for the time being there is not really a good solution for streaming video. The closest thing we know of is maybe the Darwin Streaming Server by Apple. Though they maintain it is an open source project (it is free and the server runs on Linux, Windows and OS X systems), there will always be a lot of debate about this kind of appropriated software, also developed by other computer businesses. But again, it was the first technology in that direction and we have been using it in a lot of projects since 2000. [We are currently testing the Theora streams, but more about that later...]

Now, technically, if you are working on the OS X platform and you know of a QuickTime server anywhere, the setup is quite easy. Download the QuickTime Broadcaster application from the location mentioned above, connect any FireWire or USB camera to your computer (make sure you have the right drivers) and you are basically set up. For testing, and also when working under good circumstances (light is critical), I am using an iSight camera. Apart from the good quality, we are able to control many of the image settings and parameters from within Max/MSP/Jitter. Also, it gives you an uncompressed input and so it is very fast in response! (When you hook up a DV camera over a FireWire cable you will see that the latency of the image becomes too long for working with movement and motion tracking.)

And a tip: try to keep the camera as stable as possible; the quality will improve tremendously compared to shaky handheld reportage!

As we explained before, there is a big difference between streaming some numbers for 'controlling' applications and processes remotely and sending rich data streams across. Don't forget that audio, to be of reasonably appreciable quality, needs at least 22,050 numbers (samples) per second, regardless of the bandwidth of your internet connection, whose limits may also cause breaks in the continuity of the 'stream' of these data. With images it becomes even worse... So we have to come up with solutions for that.
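A quick back-of-the-envelope calculation in Python shows how much heavier such streams are than control data. The figures are illustrative assumptions, not the settings we used: mono 16-bit audio at 22,050 samples per second (half the CD rate), and a small uncompressed 320x240 image at 24 bits per pixel and 15 frames per second.

```python
def audio_bitrate(sample_rate: int, bits_per_sample: int, channels: int) -> int:
    """Uncompressed audio data rate in bits per second."""
    return sample_rate * bits_per_sample * channels


def video_bitrate(width: int, height: int, bits_per_pixel: int, frames_per_second: int) -> int:
    """Uncompressed video data rate in bits per second."""
    return width * height * bits_per_pixel * frames_per_second


audio = audio_bitrate(22050, 16, 1)      # 352,800 bits per second
video = video_bitrate(320, 240, 24, 15)  # 27,648,000 bits per second

print(f"audio: {audio / 1000:.1f} kbit/s, video: {video / 1_000_000:.2f} Mbit/s")
```

Compare that with a control message of a few dozen bytes sent now and then, and it is clear why compression becomes unavoidable for audio and especially for video.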

Apart from the general setup for configuring the stream you will have to use compression to reduce the amount of data going over the cables. Lately we have been using MPEG-4 compression for image and sound, but H.264, Cinepak or any other compression will do as well. Do check the CPU load that the QT Broadcaster is showing and make sure it does not go over 50% if you still want to use your computer for other things. Now, to make the stream easy, these are the settings we often use, but feel free to experiment till you get a setup that satisfies you, for instance:

The configuration for the network and server can be as follows. We will give you 2 free options, but it is better to look for or set up your own local free server and let us know; it would be cool to make a list of free open servers:

or

The server will automatically create an .sdp file. It is better to test whether it comes through OK by starting up the QuickTime Player and, when you choose 'Open URL', copying the location information into the pop-up...

You can save the streaming file to your local disk by ticking the 'Record to disk' option, but actually you are now ready to stream, so hit the Broadcast button, wait for its preroll and there you go!

Further below you can read what rtsp:// really means, but it is not a protocol that a browser understands. So the solution is to make a so-called reference movie. This is a movie in the right official format, but instead of containing the images and sounds it only has the reference to an sdp file, and since it can handle the rtsp protocol it will take care of the stream. The reference movie is easy to make. When you check your stream in the QuickTime Player (described above), you simply select 'save as' from the file menu and indicate in the options that it is a reference movie. You can save this .mov file on your own machine, send it to friends in an email, store it on a server and ... script it into a webpage!

Example 1 - anatomic connected performances

Between 2003-04 we were involved in setting up weekly experiments with performances in Amsterdam for Waag Society, under the name of Anatomic. Parts of it are described in the book 'Connected! LiveArt' by Sher Doruff (ed.), who initiated this project, though the accounts are quite tendentious and contain lots of mistakes about time and space. But in the last section there are some interesting 'recipes' for streaming by Arjen Keesmaat and Jan-Kees van Kampen, both wonderful ex-Anatomic collaborators. With them and almost 100 participants over 2 years we examined most of the technologies mentioned here, but with a specific purpose: how can you use them in a connected and artistic setup? Since we had a set of iBooks and some DVD cameras from an educational program, we could simply use a computer to monitor and make the QuickTime streams. Waag Society gave and is still giving free access to their Darwin server (running on a Linux machine).

For a public performance in October 2003, we set up in the Theatrum Anatomicum and generated 1 QuickTime stream (audio and video), mixed by the 10 participants in this audiovisual concert. There were other connected bands playing at the same time in Brussels, New York, Sofia, Brno and Tokyo. The only difference from a local setup was that each would generate one stream on the server. Any participant anywhere could select one of these streams, consider it an input and manipulate it, adding sound and visuals if necessary. Since the participants in one location would output a mixed stream, a loop was created between the several locations. We used a PHP interface in a web page that automatically detected whether there were active streams on the Darwin server. And so, in addition to the local output, 2 other streams could be displayed at the same time.

Example 2 - art's anniversary streaming event

In January 2004 we were contributing to an initiative set up by Milos Vojtechovsky, participating himself in the EBU Ars Acustica Special Evening.

"Art's Birthday Party is a celebration in memory of Robert Filliou who declared, on January 17 1963, that Art had been born exactly 1,000,000 years ago when someone dropped a dry sponge into a pail of water. 10 years later he celebrated Art's 1,000,010th birthday in the Neue Galerie, Aachen."

"After Filliou's death in 1987 some artists began to celebrate Art's Birthday with mail-art, fax and slow scan tv events in the spirit of his concept of "The Eternal Network" or "La Fete permanente". The Birthday parties took place in different cities across the world and artists were asked to bring birthday presents for Art. -- works that could be shared over the network."

For the 2004 edition we set up Society of Algorithm, by mxHz and Akihiro Kubota. We both chose to work with Max/MSP/Jitter to explore the recently invented scanned synthesis by Max Mathews, one of computer music's pioneers. Scanned synthesis is based on an elaborate physical model of strings. So we set up activation of 2 strings on each side (Brussels and Tokyo).

When string A is activated it writes its data (16 tables of 255 numbers) into a small rgb video matrix that is sent as a broadcast over the net to the other side. There it is read out again, not only as a streaming movie but also deciphered into numbers, activating string B there, and then vice versa from B to A over the network. The result is a video-audio delay due to the latency of the network. Also, by sending the parameters as abstract visual information in a stream, the compressions and rates change the individual settings. After a while small images were also transmitted and mixed into the stream, resulting in a changing sound, which again changed the images, etc. We will explore these topics further with www.societyofalgorithm.org: synaesthetics, matrix feedback, invention and use of historical and contemporary algorithms, etc ...
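To give an idea of how 16 tables of 255 numbers can travel as an image, here is a Python sketch of one possible layout. It is an assumption made for illustration, not the layout of the actual Jitter patch: each table becomes one row of 85 pixels, three consecutive values filling the r, g and b of a pixel, and the values are assumed to be already scaled to integers between 0 and 255.

```python
TABLES = 16          # number of tables, as described above
TABLE_LENGTH = 255   # numbers per table
CHANNELS = 3         # r, g, b
PIXELS_PER_ROW = TABLE_LENGTH // CHANNELS  # 85 pixels carry one table


def tables_to_rgb(tables):
    """Pack each table into one row of (r, g, b) pixels (hypothetical layout)."""
    matrix = []
    for table in tables:
        if len(table) != TABLE_LENGTH:
            raise ValueError("each table must hold exactly 255 numbers")
        row = [tuple(table[i:i + CHANNELS]) for i in range(0, TABLE_LENGTH, CHANNELS)]
        matrix.append(row)
    return matrix


def rgb_to_tables(matrix):
    """Read the pixel rows back into flat tables of numbers."""
    return [[value for pixel in row for value in pixel] for row in matrix]


# Some made-up table data, just to show the round trip.
tables = [[(t * 17 + i) % 256 for i in range(TABLE_LENGTH)] for t in range(TABLES)]
matrix = tables_to_rgb(tables)
recovered = rgb_to_tables(matrix)
```

In this sketch the round trip is exact, but once the 16 x 85 image goes through a lossy video codec the pixel values shift a little, so the numbers that arrive are no longer the numbers that were sent. That is the effect described above, where compression and rates change the individual settings.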

In addition for people who wonder what rtsp and sdp mean, the following explanation:

RTSP is short for Real Time Streaming Protocol. RTSP uses RTP (Real-Time Transport Protocol) to format packets of multimedia content and is designed to efficiently broadcast audio-visual data to large groups. RTSP grew out of work done by Columbia University, Netscape and RealNetworks.

SDP is short for Session Description Protocol, a protocol that defines a text-based format for describing streaming media sessions. SDP is not a transport protocol like RTSP, but a method of describing the details of the transmission. For example, an SDP file contains information about the format, timing and authorship of the transmission, name and purpose of the session, any media, protocols or codec formats, the version number, contact information and broadcast times.
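Since an SDP description is just text with one 'letter=value' field per line, it is easy to inspect. The Python sketch below splits such a description into its session part and its media parts. The sample text is invented for illustration and is not the output of any particular server; a real .sdp file will contain different addresses, ports and codec details.

```python
SAMPLE_SDP = """\
v=0
o=- 0 0 IN IP4 127.0.0.1
s=Test broadcast
c=IN IP4 0.0.0.0
t=0 0
m=audio 0 RTP/AVP 96
a=rtpmap:96 mpeg4-generic/22050/1
m=video 0 RTP/AVP 97
a=rtpmap:97 H264/90000
"""


def parse_sdp(text: str):
    """Split an SDP description into session-level fields and a list of media sections."""
    session = {}
    media = []
    target = session
    for raw in text.splitlines():
        line = raw.strip()
        if len(line) < 2 or line[1] != "=":
            continue  # skip blank or malformed lines
        key, value = line[0], line[2:]
        if key == "m":
            # every "m=" line starts a new media section (audio, video, ...)
            target = {}
            media.append(target)
        target.setdefault(key, []).append(value)
    return session, media


session, media = parse_sdp(SAMPLE_SDP)
print(session["s"][0], "with", len(media), "media streams")
```

The fields before the first 'm=' line describe the session as a whole (version, origin, name, timing); everything after an 'm=' line belongs to that one audio or video stream.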

We have included the patch "sndscansynth", which Akihiro Kubota and mxhz.org were developing together for data feedback over OSC; together we work as http://societyofalgorithm.org. (You will have to download the scanned synthesis objects from http://www.jmcouturier.com/download.html first, but lots of fun afterwards!)


sndscansynth patch

3 - Icecast

urls:
http://www.icecast.org
http://www.vorbis.com

There are not so many solutions for creating a network with audio-only streams. But again, one of the more reliable ones is called Icecast. In order to stream in the 2 most popular formats, MP3 and Ogg Vorbis (its open source competitor), you need an Icecast 2 server. When you are only interested in broadcasting, get an encoder from the list at http://www.icecast.org/3rdparty.php, and connect to a server somewhere/anywhere. There are 2 unfortunate things: first of all, I don't know of any server without password protection, and secondly, encoders like Nicecast are commercial. And of course any application on your computer that understands MP3 files will be able to open m3u files (the streaming counterpart). The same goes for ogg files, of course.

But we got more interested in using an Icecast server over the last years, when we tried to set up experimental topologies for creating the 2-way-radio project. The idea is to break the broadcasting philosophy (and also the technology resulting from that, ah, take for example podcasting... brrr) by setting up 'for every receiver a possible transmitter' (Hans Magnus Enzensberger) and as such create a network of collaborative and interconnected generative music.

Example - 2WR (2-way-radios)

For the participation in the Pixelache festival, 14th April 2005, the connected line-up was: mxHz.org (at FAMU Praha), Akihiro Kubota (in Tokyo at home), Andrey Savitsky (in Minsk at home), goto10 (somewhere in Rotterdam, Amsterdam, London, ...).

For the encoding we were using the objects for PD and Max by Olaf Matthes (http://www.akustische-kunst.org). These allow you to encode from within the languages, without having to start up 3rd party encoders. The plan was as follows: each of the participating locations sends one stream up to an Icecast 2 server as MP3 or Ogg Vorbis. Each of the participants can select from the streams that are then on the server, bring one into his/her own program setup, analyze it, change it, add other sound to it, visualize it, leave it as it is, and send it back up in her/his streaming connection. As such, dependent and independent loops are created that are changed by each iteration of the down/up transfer. In the performance space at FAMU Prague each of the 4 streams was placed on a speaker in one corner of the room, which made it easier for the audience to follow the movement of sounds over the different locations.

For the participation in Radio Days at De Appel, Amsterdam, 23rd April 2005, we worked in a similar line-up. The initial text was written and read by Bojan Fajfric while he was talking to his father over the phone to Beograd. This time, since there were no external partners, we made an internal routing of different filters and spectral and time variations on this initial input, and sent it across the room on 4 speakers.

We included some test patches we were using for testing connections up to and down from the Icecast 2 server; feel free to experiment with them! Again, first download the ogg and mp3 cast and amp objects from Olaf's site at http://www.akustische-kunst.org and read the helpfiles carefully, they document all possible aspects wonderfully! oggNO_oknoserver (broadcasting an ogg file, and bringing it back home...)


oggcast patch

Downloadable files for this tutorial

examples: collabstream.tgz