Producing Participatory Media
Class 5 - July 15
Topics:
Streaming generally consists of an audio and/or video source fed into an encoder
and transmitted to a server, which then feeds it out to clients. There are, of course,
variations on this, such as streaming text (where the source is not audio
or video) and multicast (where a server may not be needed), but this is the general
model.
For more information visit:
Streaming media - Wikipedia
Streaming Network Architectures
Multicast
Multicast is great in theory but rarely workable in practice. It allows
for the serving of a single stream that is replicated throughout the network
and any client that chooses may "tune in". The best part is that a
server need not have a large amount of bandwidth to support a large number of
clients (as opposed to the unicast model). Multicast is also great for keeping
clients in synch with each other. Multicast is meant for live media and is
not suited for on-demand content. The biggest problem is that the network cannot
support a great number of multicast sources, and network administrators often
will not enable multicast on their networks due to bandwidth and security concerns.
For more information visit:
Multicast - Wikipedia
MBone, the Multicast Backbone, was a network that connected multicast-enabled networks on the Internet. It is now considered obsolete.
Internet2 is a network open to research
institutions that allows for much higher bandwidth and often has generally available
multicast streams.
Unicast
Unicast offers a one-to-one relationship between the server and each
client receiving the stream. This is the model most in use today but causes
problems when content is popular and the media server does not have enough bandwidth
to support all of the client connections. Unicast is well suited
for delivery of media on-demand but can have issues when synchronization of
live content is a concern.
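The bandwidth difference between the two models is simple multiplication. Here is a rough sketch of it; the audience size and stream bitrate are made-up numbers for illustration, and real networks add protocol overhead that is ignored here.

```python
def unicast_server_bandwidth(clients: int, stream_kbps: int) -> int:
    """Unicast: the server sends one full copy of the stream per client."""
    return clients * stream_kbps


def multicast_server_bandwidth(clients: int, stream_kbps: int) -> int:
    """Multicast: the server sends one copy and the network replicates it."""
    return stream_kbps if clients > 0 else 0


viewers = 1000       # hypothetical audience size
bitrate_kbps = 300   # hypothetical stream bitrate

unicast_kbps = unicast_server_bandwidth(viewers, bitrate_kbps)
multicast_kbps = multicast_server_bandwidth(viewers, bitrate_kbps)

print(unicast_kbps)    # 300000
print(multicast_kbps)  # 300
```

With these numbers a unicast server needs roughly 300 Mbps of outbound bandwidth, while a multicast source needs only the 300 kbps of a single stream.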
For more information visit:
Unicast
Streaming
Content Delivery Networks (CDN)
Content Delivery Networks grew out of the need for a better quality of service
for end-to-end streaming than could be achieved with unicast streaming alone.
CDNs are generally commercial companies that have developed a network of proxy
(for live streams) and caching (for on-demand media) servers. If you utilize
a CDN for live streaming you will send your live stream to a server at the CDN
which will then intelligently route your stream to their proxy servers
based on demand. Utilizing a CDN for on-demand content requires some means to
flag content to be cached on the CDN network. Based upon popularity, the
CDN will replicate your content on their caching servers, which then serve the
content to clients. Generally CDNs have proxy/caching servers at major ISPs
to be as close as possible to individual clients, bypassing the major points
of congestion on the internet. In a sense, CDNs are a hybrid of multicast and
unicast based networks.
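To make the on-demand caching idea concrete, here is a toy sketch of a caching server that copies a file locally once it has been requested often enough. The class name, the threshold of two requests, and the file name are all invented for illustration; real CDNs use far more sophisticated (and proprietary) replication rules.

```python
from collections import Counter


class EdgeServer:
    """Toy caching server: replicates a file locally once it proves popular."""

    def __init__(self, origin: dict, popularity_threshold: int = 2):
        self.origin = origin                   # stands in for your own media server
        self.threshold = popularity_threshold  # hypothetical replication rule
        self.requests = Counter()
        self.cache = {}
        self.origin_fetches = 0

    def get(self, name: str) -> bytes:
        self.requests[name] += 1
        if name in self.cache:
            return self.cache[name]            # served from the edge cache
        self.origin_fetches += 1
        data = self.origin[name]               # served from the origin
        if self.requests[name] >= self.threshold:
            self.cache[name] = data            # popular enough: replicate it
        return data


origin = {"lecture.mov": b"...media bytes..."}
edge = EdgeServer(origin)
for _ in range(5):
    edge.get("lecture.mov")

print(edge.origin_fetches)  # 2
```

Only the first two of the five requests reach the origin; the rest are answered by the cache close to the client.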
For more information visit:
Akamai
Content Delivery Network - Wikipedia
Coral Content Delivery Network
Streaming Protocols
TCP - Transmission Control Protocol
Not great for streaming audio and video, as it does not tolerate packet loss:
lost packets must be resent, which adds delay. It is more important for live streaming
audio and video to be in "real time" than to continuously fall behind
trying to get each and every bit through the network. Using TCP is sometimes unavoidable,
such as when firewalls are involved or other transmission methods are considered
insecure.
HTTP - Hypertext Transfer Protocol
Meant for delivery of web pages and associated data (images, text and so on).
HTTP utilizes TCP for delivery of information and adds a set
of request methods specific to web requests (such as when a form is posted or
submitted). Nevertheless, some streaming servers (particularly MP3 audio streams)
use HTTP to deliver their media. The biggest benefit is that it works well even
with the most aggressive firewall setups.
UDP - User Datagram Protocol
Much better for delivery of live streaming audio and video. It does not have
the delivery requirements of TCP, and therefore packets can be lost or damaged
without causing the server to resend. This allows for more efficient transfer
of data over unstable networks and allows clients to stay in synch with each
other while watching live streams. The biggest drawback to using UDP is that
many network administrators feel that it is insecure and often block UDP data on
their networks.
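A minimal sketch of the "send and forget" behaviour, using Python's standard socket module on the loopback interface. The three "frames" are placeholder strings, not real media data. Notice there is no connection setup and no acknowledgement: if a datagram were lost, the receiver would simply never see it.

```python
import socket

# Receiver: bind to a free port chosen by the operating system.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)
address = receiver.getsockname()

# Sender: no connection, no acknowledgement, no retransmission.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
for sequence in range(3):
    sender.sendto(f"frame {sequence}".encode(), address)

received = []
try:
    while len(received) < 3:
        packet, _ = receiver.recvfrom(2048)
        received.append(packet.decode())
except socket.timeout:
    pass  # a lost datagram is simply skipped, never resent
finally:
    sender.close()
    receiver.close()

print(received)  # on loopback this is normally all three frames
```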
RTP - Real-time Transport Protocol
Typically runs over UDP as a transport mechanism but adds a companion control
protocol (RTCP) for communication between client and server, primarily for quality
of service information.
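Part of what makes quality of service reporting possible is that every RTP packet carries a sequence number and a timestamp, so the receiver can notice loss and jitter. Below is a sketch that packs and unpacks the 12-byte fixed RTP header (the layout defined in RFC 3550). The payload type 96 and the other field values are arbitrary examples, and the helper function names are my own.

```python
import struct

# 12-byte fixed header in network byte order:
# flags, marker + payload type, sequence number, timestamp, SSRC.
RTP_HEADER = struct.Struct("!BBHII")


def pack_rtp_header(payload_type, sequence, timestamp, ssrc, marker=False):
    first = 2 << 6  # version 2, no padding, no extension, no CSRC entries
    second = (int(marker) << 7) | (payload_type & 0x7F)
    return RTP_HEADER.pack(first, second, sequence & 0xFFFF,
                           timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)


def unpack_rtp_header(data):
    first, second, sequence, timestamp, ssrc = RTP_HEADER.unpack(data[:12])
    return {
        "version": first >> 6,
        "marker": bool(second >> 7),
        "payload_type": second & 0x7F,
        "sequence": sequence,    # gaps here reveal lost packets
        "timestamp": timestamp,  # used to play media back at the right time
        "ssrc": ssrc,            # identifies the stream source
    }


header = pack_rtp_header(payload_type=96, sequence=7, timestamp=90000, ssrc=0x1234)
fields = unpack_rtp_header(header)
print(fields)
```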
RTSP - Real Time Streaming Protocol
A control protocol for streaming. It doesn't actually deliver the data but acts
as the means to set up the transport and control the flow. Utilizes the application
or user selected data transport method (RTP, UDP, TCP, HTTP). Offers "VCR"
capabilities for streaming media (pause, seek, etc.).
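RTSP messages are plain text and look a lot like HTTP requests. The sketch below only builds the request text; it does not open a network connection. The host name and session id are hypothetical, and a real client would take the session id from the server's SETUP response.

```python
def rtsp_request(method, url, cseq, headers=None):
    """Build the text of an RTSP/1.0 request (nothing is sent anywhere)."""
    lines = [f"{method} {url} RTSP/1.0", f"CSeq: {cseq}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    return "\r\n".join(lines) + "\r\n\r\n"


url = "rtsp://media.example.com:554/stream.sdp"  # hypothetical stream URL

# Ask the server to describe the stream (the answer is typically an SDP description).
describe = rtsp_request("DESCRIBE", url, 1, {"Accept": "application/sdp"})

# One of the "VCR" controls: pause an established session.
pause = rtsp_request("PAUSE", url, 4, {"Session": "12345678"})

print(describe)
print(pause)
```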
For more information visit:
Wikipedia - TCP/IP
Codecs
Codec is short for compressor/decompressor (or coder/decoder). It is
the means to deliver or store high quality audio and video in less space and
using less bandwidth than would normally be required.
Video Capture and Compression Theory
A frame of video, essentially an image, at a resolution of 640 pixels
by 480 pixels with 24 bits per pixel (RGB, 256 values each of Red, Green
and Blue) would occupy 7,372,800 bits.
At 15 frames per second we are looking at 110,592,000 bits per second (or 13,824,000
bytes per second or 13,824 kilobytes per second or 13.824 megabytes per second).
This is way more than the average internet connection can carry. Broadband
connections (DSL and Cable Modems) typically allow for 1 Mbps (megabit per second),
which is more than 100 times fewer bits per second than uncompressed video. Hence
the need for compression.
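The arithmetic above can be checked in a few lines. The 1 Mbps broadband figure is the rough number used in these notes, not a measured value.

```python
width, height = 640, 480
bits_per_pixel = 24      # 8 bits each for red, green and blue
frames_per_second = 15

bits_per_frame = width * height * bits_per_pixel
bits_per_second = bits_per_frame * frames_per_second
bytes_per_second = bits_per_second // 8

broadband_bps = 1_000_000  # roughly 1 Mbps, as assumed in the notes
ratio = bits_per_second / broadband_bps

print(bits_per_frame)    # 7372800
print(bits_per_second)   # 110592000
print(bytes_per_second)  # 13824000
print(ratio)             # 110.592
```

So uncompressed video at this modest size and frame rate needs about 110 times the bandwidth of that connection.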
Video codecs typically employ some combination of still-frame encoding for key
frames (representing a full frame) and delta frames (storing only the difference
from a previous frame). Specific mathematical algorithms that video codecs
employ are: Discrete Cosine Transforms (DCT), Discrete Wavelet Transforms (DWT),
Fractals and various combinations of them all.
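Here is a toy illustration of the key frame / delta frame idea. It treats each "frame" as a short list of pixel values, stores a full key frame every few frames, and otherwise stores only the pixels that changed since the frame immediately before. This is a deliberately simplified, lossless sketch; real codecs also use motion estimation and transforms such as the DCT, none of which is shown here. The function names and the key frame interval are invented for the example.

```python
def encode(frames, key_interval=4):
    """Store a full key frame every key_interval frames, deltas otherwise."""
    encoded = []
    previous = None
    for index, frame in enumerate(frames):
        if index % key_interval == 0:
            encoded.append(("key", list(frame)))
        else:
            # Record only the (position, new value) pairs that changed.
            changes = [(i, value)
                       for i, (value, old) in enumerate(zip(frame, previous))
                       if value != old]
            encoded.append(("delta", changes))
        previous = frame
    return encoded


def decode(encoded):
    """Rebuild every frame by applying deltas on top of the last frame."""
    frames = []
    current = None
    for kind, payload in encoded:
        if kind == "key":
            current = list(payload)
        else:
            current = list(current)
            for i, value in payload:
                current[i] = value
        frames.append(current)
    return frames


frames = [[0, 0, 0, 0], [0, 9, 0, 0], [0, 9, 9, 0]]
encoded = encode(frames)
print(encoded)
print(decode(encoded) == frames)  # True
```

When little changes between frames, the delta entries are much smaller than full frames, which is where the savings come from.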
Audio Capture Theory
Audio is captured by sampling incoming sound waves, measuring their amplitude
(level) at regular intervals in time. The sampling rate is the number
of these samples taken per second; the higher the sampling rate, the more accurate
the representation of the wave. Sampling rate is expressed in kHz (kilohertz),
which is thousands of samples per second. The other factor in audio capture
is the number of bits per sample (quantization): 8-bit samples have 256 steps
to represent a given amplitude, whereas 16-bit samples have 65,536 steps.
The Nyquist Theorem states that the sampling
rate must be at least 2 times the highest frequency in a wave to represent it. The
higher the sampling rate, the higher the frequency that can be sampled and represented
accurately. This is why music uses a much higher sampling rate than voice:
it spans a much wider range of frequencies.
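A small sketch of what goes wrong above the Nyquist limit. At a sampling rate of 22,050 Hz the limit is 11,025 Hz. A 15,000 Hz tone, which is above the limit, produces exactly the same samples as a 7,050 Hz tone (22,050 minus 15,000), so once sampled the two cannot be told apart. The frequencies are picked purely for illustration.

```python
import math


def sample_cosine(frequency_hz, sample_rate_hz, count):
    """Return `count` samples of a cosine wave at the given sampling rate."""
    return [math.cos(2 * math.pi * frequency_hz * n / sample_rate_hz)
            for n in range(count)]


sample_rate = 22_050
nyquist_limit = sample_rate / 2  # 11025.0 Hz

inside = sample_cosine(7_050, sample_rate, 32)    # below the limit
outside = sample_cosine(15_000, sample_rate, 32)  # above the limit

# The too-high tone "aliases" onto the lower one: the samples match.
aliased = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(inside, outside))
print(nyquist_limit)  # 11025.0
print(aliased)        # True
```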
A sampling rate of 22.05 kHz, or 22,050 samples per second, at 16 bits
per sample yields 352,800 bits per second for a single (mono) channel (352.8 kilobits
or 44.1 kilobytes per second). This is much more feasible for transmission via
broadband connections but could still use some compression.
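The same kind of check works for the audio numbers. A single (mono) channel is assumed here, since that is what makes the figures in the notes come out.

```python
import math

sample_rate_hz = 22_050
bits_per_sample = 16
channels = 1  # assumption: mono

levels = 2 ** bits_per_sample  # amplitude steps available per sample
bits_per_second = sample_rate_hz * bits_per_sample * channels
kilobits_per_second = bits_per_second / 1000
kilobytes_per_second = bits_per_second / 8 / 1000

print(levels)                # 65536
print(bits_per_second)       # 352800
print(kilobits_per_second)   # 352.8
print(kilobytes_per_second)  # 44.1
```

Setting channels to 2 for stereo doubles every bitrate figure.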
Specific Codecs
There is no "Best" codec... Each codec has its
pluses and minuses and each is good for its particular use.
H.261
Early video conferencing codec (1990) - Video conferencing
requires fast encode/decode for low latency.
H.263
More recent video conferencing codec (1996)
Used in earlier versions of Flash (MX and version 7).
MJPEG - Motion JPEG
Individual frames of video encoded as JPEG. Good for analog capture
for non-linear video editing, every frame is in essence a keyframe and cuts
can be made anywhere.
MPEG-1
Targeted for CD-ROM speeds (1.5 Mbps)
MP3
Audio portion of MPEG-1: MPEG-1 Audio Layer III.
MPEG-2
High bitrate codec used for digital cable, broadcast and satellite
as well as DVD. High royalty rates as well.
MPEG-4
Range of bitrates, developed for a large variety of uses including multimedia
over unreliable networks such as the internet and mobile phones. Includes components
for interactivity and many different audio and video profiles.
H.264 - also known as MPEG-4 Part 10 or AVC (Advanced Video
Coding)
Part of the MPEG-4 standard, but a newer video codec that is much more scalable
than the earlier MPEG-4 video profiles have been. The next big thing.
AAC - Advanced Audio Coding
Audio portion of MPEG-4
DV
Tape format and "light" compression codec. Great for digital
video capture and editing. High bitrate.
Other Proprietary codecs worth mentioning
Sorenson Video 3
DivX
3ivx - MPEG-4
RealVideo 10
Microsoft Windows Media 8 (ASF)
Microsoft MPEG-4 v2 (Not really MPEG-4, Encodes to AVI, DivX stole it)
QDesign Music
On2's VP6 Codec - Used in Flash 8
Open Source Codecs worth mentioning
Ogg Vorbis and Ogg Theora
Dirac - Originally developed by the BBC
XviD
On2's VP3 (now part of the Ogg project)
For more information:
Codec Comparison
QuickTime Playback Compatibility Chart
H.264 Begins its Ascent
Flash Video | Optimizations and Tools
Live TV is dead, and we're noticing the smell
Formats
AVI - Audio Video Interleave
Developed by Microsoft for audio and video data utilizing VFW (Video
for Windows)
QuickTime
Apple's proprietary audio and video format. The basis for the MPEG-4
container.
http://www.apple.com/quicktime/
ASF - Advanced Systems Format
Microsoft's proprietary file format for Windows Media. Synchronized
Audio/Video data.
http://www.microsoft.com/windows/windowsmedia/format/asfspec.aspx
Players
QuickTime
Windows Media
RealPlayer
Flash Player
Several miscellaneous MP3 Players (stemming from WinAmp by Nullsoft)
MPlayer
VLC
Servers
QuickTime/Darwin Streaming Server
Windows Media Server
RealServer/Helix
Flash Media Server
Miscellaneous MP3 Servers (stemming from Shoutcast by Nullsoft (now owned by
AOL))
Encoders
There are a myriad of encoders that can be used for both streaming and compression. For now we will focus on QuickTime Broadcaster.
QuickTime Broadcaster
Mac only, free. Does live encoding as well as "save file to disk"
for on-demand delivery.
First you have to create a Reference Movie. Reference movies are generally movie files that only contain a link to the live stream. The easiest way to create a reference movie is to open the live stream URL in QuickTime Pro (the URL should take the form: rtsp://hostname:portnumber/stream.sdp) and to Save As a Reference Movie. This reference movie can then be uploaded to a standard webserver and used in an embedded player or as a direct link.
For a live stream, the server sends an SDP (Session Description Protocol)
description to the player, which then starts requesting the live stream. This is a
standard way of doing things in the MPEG-4/RTSP world.
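An SDP description is just lines of text in the form type=value. The sketch below parses a made-up description for a stream with one audio and one video track. The sample text is invented for illustration and is not actual output from QuickTime Broadcaster, and the parser handles only the handful of line types shown.

```python
SAMPLE_SDP = """\
v=0
o=- 0 0 IN IP4 127.0.0.1
s=Live Stream
c=IN IP4 0.0.0.0
t=0 0
m=audio 0 RTP/AVP 96
a=rtpmap:96 mpeg4-generic/22050/1
m=video 0 RTP/AVP 97
a=rtpmap:97 MP4V-ES/90000
"""


def parse_sdp(text):
    """Pull the session name and the media sections out of an SDP description."""
    session = {"name": None, "media": []}
    for line in text.splitlines():
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key == "s":
            session["name"] = value
        elif key == "m":
            # m=<media> <port> <transport> <format list>
            kind, port, transport, *formats = value.split()
            session["media"].append({"kind": kind, "port": int(port),
                                     "transport": transport,
                                     "formats": formats, "attributes": []})
        elif key == "a" and session["media"]:
            # Attribute lines after an m= line belong to that media section.
            session["media"][-1]["attributes"].append(value)
    return session


description = parse_sdp(SAMPLE_SDP)
print(description["name"])
print([m["kind"] for m in description["media"]])
```

The player reads this kind of description to learn which tracks exist and how they are encoded before it starts requesting them.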
For more information:
Reference Movies
QuickTime Pro
The QuickTime Swiss Army knife. Great for compressing files for on-demand
delivery and translating files for use on or off a QuickTime Streaming Server.
For use on a streaming server, you need to export as a hinted file.
For use on a web page it is a good idea to use fast start so that
the file can be played while downloading.
For more information:
QuickTime Broadcaster
Session Description Protocol RFC
QuickTime - Preparing Media for Streaming
QuickTime - Delivering Live Streaming Media
Streaming with Flash
Flash follows its own rules and doesn't abide by many of the standards we have talked about. In some cases this is good, in others not so great. On the plus side, it is very easy to get going; on the minus side, it doesn't offer some of the delivery flexibility of more standards-based streaming systems (such as streaming to consumer electronics devices like set-top boxes or mobile phones, although support for mobile phones is in the works). It is also easy to do server-side programming with, but that is a topic for another day.
Here are the steps to follow to quickly start streaming with Flash: