The X video extension, often abbreviated as XVideo or Xv, is a video output mechanism for the X Window System. The protocol was designed by David Carver; the specification for version 2 of the protocol was written in July 1991.[1] Today it is mainly used to resize video content in the video controller hardware, either to enlarge a video or to show it in full-screen mode. Without XVideo, X would have to do this scaling on the main CPU, which requires considerable processing power and can slow down or degrade the video stream. Video controllers are designed specifically for this kind of computation and can do it much more cheaply. The X video extension can likewise have the video controller perform color space conversions and change the contrast, brightness, and hue of a displayed video stream.
In order for this to work, three things have to come together:
The video controller has to provide the required functions.
The device driver software for the video controller and the X display server program have to implement the XVideo interface.
The video playback software has to make use of this interface.
Most modern video controllers provide the functions required for XVideo; this feature is known as hardware scaling and YUV acceleration or sometimes as 2D hardware acceleration. The XFree86 X display server has implemented XVideo since version 4.0.2. To check whether a given X display server supports XVideo, one can use the utility xdpyinfo. To check whether the video controller provides the required functions and whether the X device driver implements XVideo for any of them, one can use the xvinfo program.
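The two checks described above can be scripted. The following is a minimal sketch rather than a definitive tool: the function names are hypothetical, and it assumes that xdpyinfo lists each extension name on its own line and that xvinfo introduces each adaptor with a line beginning "Adaptor #". Both layouts may differ between versions.

```python
import shutil
import subprocess


def has_xvideo(xdpyinfo_text):
    """Return True if xdpyinfo output lists the XVideo extension.

    Whole lines are compared, so "XVideo-MotionCompensation" on its own
    does not count as XVideo support.
    """
    return any(line.strip() == "XVideo" for line in xdpyinfo_text.splitlines())


def count_adaptors(xvinfo_text):
    """Count adaptors in xvinfo output (assumed 'Adaptor #N:' line format)."""
    return sum(
        1 for line in xvinfo_text.splitlines()
        if line.strip().startswith("Adaptor #")
    )


def run_tool(name):
    """Run an X utility and return its stdout, or None if it cannot be run."""
    if shutil.which(name) is None:
        return None
    result = subprocess.run([name], capture_output=True, text=True, timeout=10)
    return result.stdout if result.returncode == 0 else None
```

On a machine with a running X server, `has_xvideo(run_tool("xdpyinfo"))` and `count_adaptors(run_tool("xvinfo"))` would answer the two questions in turn, provided `run_tool` did not return None.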
Video playback programs that run under the X Window System, such as MPlayer, MythTV or xine, typically have an option to enable XVideo output. Enabling this option is advisable if the system's GPU hardware and device drivers support XVideo and more modern rendering systems such as OpenGL and VDPAU are unavailable; the speedup is very noticeable even on a fast CPU.
While the protocol itself has features for reading and writing of video streams from and to video adapters, in practice today only the functions XvPutImage and XvShmPutImage are used: the client program repeatedly prepares images and passes them on to the graphics hardware to be scaled, converted and displayed.
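To give a sense of the per-pixel work that the hardware takes over, the sketch below performs the same two steps, scaling and color space conversion, in software. It is an illustration under simplifying assumptions, not how any driver works: the function names are hypothetical, each pixel carries a full (Y, U, V) triple (real image formats typically subsample the chroma), scaling is nearest-neighbour, and the conversion uses full-range BT.601 coefficients.

```python
def clamp8(value):
    """Round to the nearest integer and clamp to the 8-bit range 0..255."""
    return max(0, min(255, int(round(value))))


def yuv_to_rgb(y, u, v):
    """Convert one full-range BT.601 YUV sample to 8-bit RGB."""
    r = y + 1.402 * (v - 128)
    g = y - 0.344136 * (u - 128) - 0.714136 * (v - 128)
    b = y + 1.772 * (u - 128)
    return clamp8(r), clamp8(g), clamp8(b)


def scale_nearest(pixels, dst_w, dst_h):
    """Scale a list-of-rows image to dst_w x dst_h by nearest neighbour."""
    src_h = len(pixels)
    src_w = len(pixels[0])
    return [
        [pixels[y * src_h // dst_h][x * src_w // dst_w] for x in range(dst_w)]
        for y in range(dst_h)
    ]


def present_frame(yuv_frame, dst_w, dst_h):
    """Scale a YUV frame, then convert every output pixel to RGB."""
    scaled = scale_nearest(yuv_frame, dst_w, dst_h)
    return [[yuv_to_rgb(*pixel) for pixel in row] for row in scaled]
```

The cost grows with the number of output pixels, and the work must be repeated for every frame, which is why doing it on the CPU becomes expensive for full-screen playback.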
Display
After video has been scaled and prepared for display on the video card, it must be displayed. There are a few possible ways to display accelerated video at this stage. Since full acceleration means that the video controller is responsible for scaling, converting, and drawing the video, the technique used depends entirely on what the video is being drawn onto.
The role of window manager support and compositing
Under X, how video is finally drawn depends largely on the X window manager in use. With properly installed drivers and supported GPU hardware, such as Intel, ATI, and nVidia chipsets, some window managers, called compositing window managers, allow windows to be processed separately and then rendered (or composited). All windows are first rendered to separate output buffers in memory and later combined to form a complete graphical interface. While in (video) memory, individual windows can be transformed separately, and accelerated video may be added at this stage using a texture filter, before the window is composited and drawn. XVideo can also be used to accelerate video playback during the drawing of windows using an OpenGL Framebuffer Object or pbuffer.
Metacity, an X window manager, uses compositing in this way. The compositing can also make use of 3D pipeline acceleration such as GLX_EXT_texture_from_pixmap. Among other things, this process allows many video outputs to share the same screen without interfering with each other. Other compositing window managers, such as Compiz, work the same way.
However, on a system with limited OpenGL acceleration, specifically one lacking an OpenGL Framebuffer Object or pbuffer, the use of an OpenGL environment such as Xgl makes Xv hardware acceleration impossible.
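The compositing model described in this section, in which each window is rendered to its own buffer and the buffers are combined afterwards, can be sketched in a few lines. This is a simplified illustration only: the `composite` function and the dictionary layout for a window are hypothetical, pixels are arbitrary values, and a real compositor would do this on the GPU with textures rather than nested loops.

```python
def composite(screen_w, screen_h, windows, background=0):
    """Combine per-window buffers into one frame, back to front.

    Each window is a dict with "x", "y" (top-left position on screen) and
    "buffer" (list of rows of pixels). Later windows are higher in the
    stacking order and overwrite earlier ones where they overlap.
    """
    frame = [[background] * screen_w for _ in range(screen_h)]
    for win in windows:
        for row, line in enumerate(win["buffer"]):
            for col, pixel in enumerate(line):
                x, y = win["x"] + col, win["y"] + row
                # Clip anything that falls outside the screen.
                if 0 <= x < screen_w and 0 <= y < screen_h:
                    frame[y][x] = pixel
    return frame
```

Because the video is written into its own buffer, the player never needs to know which parts of its window are covered; overlap is resolved only when the buffers are combined.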
The disadvantages of chroma keying
If the window manager does not directly support compositing, it is more difficult to isolate where the video stream should be rendered, because by the time it can be accelerated the output has already been turned into a single image. Usually the only way to do this is to employ a post-processed hardware overlay using chroma keying. After all of the windows have been drawn, the only pieces of information available are the size and position of the video window's canvas. A third piece of information is required to indicate which parts of the canvas are obscured by other windows and which are not. Therefore, the video player draws its canvas in a solid color (green, for example), and this color becomes a makeshift third dimension. When all windows have been drawn, windows covering the video player block out the green color. When the video stream is added to the output, the graphics card simply scans the coordinates of the canvas. Wherever it encounters green, it knows it has found a visible portion of the video window, and it draws only those portions of the video. This same process was also the only available option for rendering hardware-accelerated video under Microsoft Windows XP and earlier, since its window management features were so deeply embedded in the operating system that accelerating them would have been impossible.
If the window manager does not support compositing, post-processed hardware overlays using chroma keying as described above can make it impossible to produce proper screenshots of XVideo applications. They can also make it impossible to view this kind of playback on a secondary display when only one overlay is allowed at the hardware level.
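The chroma keying scheme in this section can be modelled as follows. This is a sketch under stated assumptions, not a description of any particular hardware: the name `overlay_scan_out` is hypothetical, the key color is pure green as in the example above, pixels are RGB tuples, and the video is assumed to have been scaled to the canvas size already.

```python
KEY = (0, 255, 0)  # key color; pure green, as in the example in the text


def overlay_scan_out(framebuffer, canvas, video):
    """Return what the display shows: the framebuffer plus the keyed overlay.

    framebuffer: rows of RGB tuples, already drawn by the window system
    canvas: (x, y, width, height) of the video window's canvas
    video: rows of RGB tuples, assumed already scaled to the canvas size
    """
    x0, y0, width, height = canvas
    shown = [row[:] for row in framebuffer]  # the framebuffer is not modified
    for y in range(height):
        for x in range(width):
            # Only pixels still showing the key color are visible canvas.
            if framebuffer[y0 + y][x0 + x] == KEY:
                shown[y0 + y][x0 + x] = video[y][x]
    return shown
```

The framebuffer itself is left untouched and still holds the key color where the video appears on screen. A screenshot that reads the framebuffer therefore captures a green rectangle instead of the video, which is the screenshot problem noted above.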