MPEG-4 Part 11, "Scene description and application engine", was published as ISO/IEC 14496-11 in 2005.[1] It is also known as BIFS, XMT, and MPEG-J.[2][3] It defines:
the coded representation of the spatio-temporal positioning of audio-visual objects as well as their behaviour in response to interaction (scene description);
the coded representation of synthetic two-dimensional (2D) or three-dimensional (3D) objects that can be manifested audibly or visually;
and a system-level description of an application engine (format, delivery, lifecycle, and behaviour of downloadable Java byte code applications). (The MPEG-J Graphics Framework eXtensions (GFX) is defined separately, in MPEG-4 Part 21, ISO/IEC 14496-21.[4])
Binary Format for Scenes (BIFS) is a binary format for two- or three-dimensional audiovisual content. It is based on VRML and is specified in Part 11 of the MPEG-4 standard.
BIFS is the MPEG-4 scene description protocol. It is used to compose MPEG-4 objects, to describe interaction with them, and to animate them.
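The three roles named above (composition, interaction, animation) can be illustrated with a toy scene graph. This is only a conceptual sketch: the `Node` class, its `add`, `on` and `fire` methods, and the `animate` helper are hypothetical names invented for illustration, and nothing here reflects the actual BIFS node set or its binary encoding.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Point = Tuple[float, float]


@dataclass
class Node:
    """Hypothetical scene node: a named object with a 2D position."""
    name: str
    position: Point = (0.0, 0.0)
    children: List["Node"] = field(default_factory=list)
    handlers: Dict[str, Callable[["Node"], None]] = field(default_factory=dict)

    def add(self, child: "Node") -> "Node":
        # Composition: place an object in the scene under this node.
        self.children.append(child)
        return child

    def on(self, event: str, handler: Callable[["Node"], None]) -> None:
        # Interaction: register behaviour to run when an event occurs.
        self.handlers[event] = handler

    def fire(self, event: str) -> None:
        # Deliver an event to this node; unknown events are ignored.
        handler = self.handlers.get(event)
        if handler is not None:
            handler(self)


def animate(node: Node, start: Point, end: Point, t: float) -> None:
    # Animation: linearly interpolate the position, with t clamped to [0, 1].
    t = max(0.0, min(1.0, t))
    node.position = (
        start[0] + (end[0] - start[0]) * t,
        start[1] + (end[1] - start[1]) * t,
    )


# Compose a scene, attach an interaction, and let it drive an animation.
scene = Node("scene")
logo = scene.add(Node("logo", position=(10.0, 20.0)))
logo.on("click", lambda n: animate(n, (10.0, 20.0), (110.0, 20.0), 0.5))
logo.fire("click")
print(logo.position)  # (60.0, 20.0)
```

In this sketch the scene is a tree of objects, an event handler stands in for interaction, and interpolation over a normalised time value stands in for animation.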
The XMT (Extensible MPEG-4 Textual format) framework accommodates substantial portions of SMIL, W3C Scalable Vector Graphics (SVG) and X3D (the successor to VRML). Such a representation can be played back directly by a SMIL or VRML player, or it can be binarised into a native MPEG-4 representation that an MPEG-4 player can play. A further bridge has been created with BiM (Binary MPEG format for XML).[6]