The video may be stored in MPEG-2 format (the standard for movie-quality video and audio), which is very high quality, but direct streaming of MPEG-2 video requires a variable rate of 4 to 10 Mbit/s. Such rates would be incompatible with many of the network connections, so the first inclination might be to transcode the video down to a rate that all, or at least a majority of, the network links can support.
The video compression method of choice for this might be H.263, which offers a wide range of rates and frame sizes and is widely supported. The result of the transcoding, however, would be a lowest common denominator encoding, so that even users with higher-rate network connections would be forced to accept the quality produced for the low-rate users.
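To make the single-rate limitation concrete, here is a minimal sketch in Python. The link names and capacities are hypothetical and chosen only for illustration; the only figure taken from the text is the 10 Mbit/s upper end of the MPEG-2 range.

```python
# Hypothetical link capacities in kbit/s (illustrative only, not from the text).
LINK_KBPS = {"lan": 100_000, "dsl": 1_500, "wireless": 384, "modem": 56}

# Upper end of the 4 to 10 Mbit/s range quoted for MPEG-2 streaming.
MPEG2_PEAK_KBPS = 10_000


def can_stream_directly(link_kbps, peak_kbps=MPEG2_PEAK_KBPS):
    """True if the link can carry the stream at its peak rate."""
    return link_kbps >= peak_kbps


def common_rate(links):
    """Rate that every link can carry: the minimum link capacity."""
    return min(links.values())


for name, kbps in LINK_KBPS.items():
    print(name, "direct MPEG-2:", can_stream_directly(kbps))

print("single transcoded rate (kbit/s):", common_rate(LINK_KBPS))
```

With one encoding for everyone, the slowest link sets the rate for all viewers, which is exactly why the users on the fast link gain nothing from their extra capacity.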
One approach to working around the lowest common denominator limitation would be to use a layered coder with multicasting. That is, you would choose a video coder that allows multiple compression rates that can be obtained by incrementally improving the base layer.
Coders that have this capability are sometimes said to be scalable. MPEG-2 has several scalability options, including signal-to-noise ratio (SNR), spatial, and frame rate scalability. One or more of these options could be used and combined with multicasting to create (say) three multicast groups.
The first group would be the baseline coded layer, and the other two would use scalability to create incremental improvements in output quality as users join the remaining two multicast groups. There might be another good reason to use multicast in this application. Specifically, if the desired number of viewers is large, unicast transmissions to each of them could flood various links in the network.
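The layered multicast idea can be sketched as follows. This is a simplified model under stated assumptions, not a description of any real MPEG-2 scalability profile: the three layer rates are hypothetical, each receiver joins groups in order (base layer first), and a layer is only useful if all lower layers are also received.

```python
# Hypothetical per-layer rates in kbit/s: base layer, then two enhancement layers.
LAYER_KBPS = [64, 256, 1000]


def layers_to_join(link_kbps, layer_rates=LAYER_KBPS):
    """Number of multicast groups a receiver can join, base layer first,
    without the cumulative rate exceeding its link capacity."""
    total = 0
    joined = 0
    for rate in layer_rates:
        if total + rate > link_kbps:
            break
        total += rate
        joined += 1
    return joined


def source_link_load(viewers, stream_kbps, multicast):
    """Load on the sender's outgoing link: one copy per viewer for unicast,
    a single copy for multicast regardless of audience size."""
    return stream_kbps if multicast else viewers * stream_kbps


print(layers_to_join(400))    # base + first enhancement fit in 400 kbit/s
print(source_link_load(1000, sum(LAYER_KBPS), multicast=False))
print(source_link_load(1000, sum(LAYER_KBPS), multicast=True))
```

Each receiver ends up with the best quality its own link allows, and the sender's load stays constant as the audience grows, which is the second reason for multicast given above.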
Another approach to establishing interoperability between networks would be to transcode at network gateways. There are three disadvantages usually cited for transcoding: (1) complexity, (2) delay, and (3) added distortion.
For video streaming, because it is one way, delay is not a serious concern. Complexity would be an issue at some network interfaces, and in those cases, the streamed video might not be transcoded, thus yielding degraded network performance and poor delivered video quality.
If complexity does not preclude transcoding, then the remaining issue is the distortion added during the transcoding process. Of course, transcoding to a lower rate will yield lower quality, smaller frame size, and/or slower frame rate, so it is key to add as little additional distortion as possible.
This implies that we would prefer not to decode completely back to video and then re-encode at a lower rate (notice that this has implications for complexity and delay, too). We would add less distortion if we could map the encoded stream, or at least the decoded parameters, directly into a lower-rate coded version. As you can imagine, there are numerous possible options corresponding to any given network
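The added-distortion concern discussed above can be illustrated with a toy experiment. This is only a sketch: a uniform scalar quantizer stands in for a video coder, and the step sizes and sample values are arbitrary assumptions. It compares coding the original directly at the coarse step with coding at a fine step first and then requantizing to the coarse step, as a transcoder working from already-coded data must do.

```python
import math


def quantize(x, step):
    """Uniform scalar quantizer: round x to the nearest multiple of step."""
    return step * math.floor(x / step + 0.5)


def mse(original, reconstructed):
    """Mean squared error between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)


FINE_STEP = 4      # assumed step of the stored, high-rate encoding
COARSE_STEP = 10   # assumed step of the low-rate target

# Evenly spaced test values covering one full period of both quantizers.
samples = [i * 0.01 for i in range(2000)]

direct = [quantize(x, COARSE_STEP) for x in samples]
tandem = [quantize(quantize(x, FINE_STEP), COARSE_STEP) for x in samples]

direct_mse = mse(samples, direct)
tandem_mse = mse(samples, tandem)
print("direct coarse coding MSE:", direct_mse)
print("fine then coarse (transcoded) MSE:", tandem_mse)
```

In this setup the transcoded path shows the larger error, because the second quantizer sees values already shifted by the first. For example, 5.5 is coded as 10 directly, but becomes 4 after the fine step and then 0 after the coarse step.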