RTP payload formats
Multimedia information signaling
The Real-time Transport Protocol (RTP) specifies a general-purpose data format and network protocol for transmitting digital media streams on Internet Protocol (IP) networks. The details of media encoding, such as signal sampling rate, frame size and timing, are specified in an RTP payload format. The format parameters of the RTP payload are typically communicated between transmission endpoints with the Session Description Protocol (SDP), but other protocols, such as the Extensible Messaging and Presence Protocol (XMPP), may be used.
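As an illustration of how SDP carries these parameters, the following minimal sketch parses `a=rtpmap` attribute lines, whose general form is `a=rtpmap:<payload type> <encoding name>/<clock rate>[/<encoding parameters>]`. The names `RtpMap` and `parse_rtpmap` and the sample SDP lines are hypothetical and chosen only for this example; the sketch does not handle `a=fmtp` or other attributes.

```python
import re
from typing import NamedTuple, Optional


class RtpMap(NamedTuple):
    payload_type: int
    encoding_name: str
    clock_rate: int
    channels: Optional[int]  # None when the optional parameter is omitted


# a=rtpmap:<payload type> <encoding name>/<clock rate>[/<encoding parameters>]
_RTPMAP = re.compile(r"^a=rtpmap:(\d+) ([^/\s]+)/(\d+)(?:/(\d+))?\s*$")


def parse_rtpmap(line: str) -> RtpMap:
    """Parse one a=rtpmap line; raise ValueError if it does not match."""
    match = _RTPMAP.match(line)
    if match is None:
        raise ValueError(f"not an rtpmap attribute: {line!r}")
    pt, name, rate, channels = match.groups()
    return RtpMap(int(pt), name, int(rate), int(channels) if channels else None)


# Hypothetical session description fragment (illustrative values only).
sdp_lines = [
    "m=audio 49170 RTP/AVP 0 96",
    "a=rtpmap:0 PCMU/8000",
    "a=rtpmap:96 opus/48000/2",
]
rtpmaps = [parse_rtpmap(l) for l in sdp_lines if l.startswith("a=rtpmap:")]
for entry in rtpmaps:
    print(entry)
```

For audio streams the optional trailing parameter is the channel count; when it is absent the sketch reports `None` rather than assuming a default.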
Audio and video payload types
RFC 3551, entitled RTP Profile for Audio and Video Conferences with Minimal Control (RTP/AVP), specifies the technical parameters of payload formats for audio and video streams.
The standard also describes the process of registering new payload types with IANA; additional payload formats and payload types are defined in the following specifications:
RFC 3551, Standard 65, RTP Profile for Audio and Video Conferences with Minimal Control
RFC 4856, Media Type Registration of Payload Formats in the RTP Profile for Audio and Video Conferences
RFC 3190, RTP Payload Format for 12-bit DAT Audio and 20- and 24-bit Linear Sampled Audio
RFC 6184, RTP Payload Format for H.264 Video
RFC 3640, RTP Payload Format for Transport of MPEG-4 Elementary Streams
RFC 6416, RTP Payload Format for MPEG-4 Audio/Visual Streams
RFC 2250, RTP Payload Format for MPEG1/MPEG2 Video
RFC 7798, RTP Payload Format for High Efficiency Video Coding (HEVC)
RFC 2435, RTP Payload Format for JPEG-compressed Video
RFC 4587, RTP Payload Format for H.261 Video Streams
RFC 2658, RTP Payload Format for PureVoice(tm) Audio
RFC 4175, RTP Payload Format for Uncompressed Video
RFC 7587, RTP Payload Format for the Opus Speech and Audio Codec
RFC 9134, RTP Payload Format for JPEG XS
RFC 9607, RTP Payload Format for the Secure Communications Interoperability Protocol (SCIP) Codec
Payload identifiers 96–127 are used for payloads defined dynamically during a session. It is recommended to dynamically assign port numbers, although port numbers 5004 and 5005 have been registered for use of the profile when a dynamically assigned port is not required.
Applications should always support PCMU (payload type 0); previously, DVI4 (payload type 5) was also recommended, but this was removed in 2013 by RFC 7007.
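The numbering scheme described above can be sketched as a small classifier. This is a minimal illustration, not a complete registry: `STATIC_NAMES` is a hypothetical, deliberately partial mapping copied from a few rows of the table in this article, and the function name `classify_payload_type` is an assumption made for the example.

```python
# Partial, illustrative subset of the static assignments listed in this article.
STATIC_NAMES = {
    0: "PCMU", 3: "GSM", 4: "G723", 8: "PCMA", 9: "G722",
    18: "G729", 26: "JPEG", 31: "H261", 34: "H263",
}


def classify_payload_type(pt: int) -> str:
    """Return a coarse category for a 7-bit RTP payload type number."""
    if not 0 <= pt <= 127:
        raise ValueError("RTP payload type is a 7-bit value (0-127)")
    if 96 <= pt <= 127:
        # Bound to an encoding per session, for example via SDP.
        return "dynamic"
    if 72 <= pt <= 76:
        # Kept free so RTP cannot be confused with RTCP packet types 200-204.
        return "reserved"
    if pt in STATIC_NAMES:
        return f"static ({STATIC_NAMES[pt]})"
    return "not dynamic (see table)"


for pt in (0, 8, 72, 96, 127):
    print(pt, classify_payload_type(pt))
```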
| Payload type (PT) | Name | Type | No. of channels | Clock rate (Hz)[note 1] | Frame size (ms) | Default packet interval (ms) | Description | References |
|---|---|---|---|---|---|---|---|---|
| 0 | PCMU | audio | 1 | 8000 | any | 20 | ITU-T G.711 PCM μ-Law audio 64 kbit/s | RFC 3551 |
| 1 | reserved (previously FS-1016 CELP) | audio | 1 | 8000 | | | reserved, previously FS-1016 CELP audio 4.8 kbit/s | RFC 3551, previously RFC 1890 |
| 2 | reserved (previously G721 or G726-32) | audio | 1 | 8000 | | | reserved, previously ITU-T G.721 ADPCM audio 32 kbit/s or ITU-T G.726 audio 32 kbit/s | RFC 3551, previously RFC 1890 |
| 3 | GSM | audio | 1 | 8000 | 20 | 20 | European GSM Full Rate audio 13 kbit/s (GSM 06.10) | RFC 3551 |
| 4 | G723 | audio | 1 | 8000 | 30 | 30 | ITU-T G.723.1 audio | RFC 3551 |
| 5 | DVI4 | audio | 1 | 8000 | any | 20 | IMA ADPCM audio 32 kbit/s | RFC 3551 |
| 6 | DVI4 | audio | 1 | 16000 | any | 20 | IMA ADPCM audio 64 kbit/s | RFC 3551 |
| 7 | LPC | audio | 1 | 8000 | any | 20 | Experimental Linear Predictive Coding audio 5.6 kbit/s | RFC 3551 |
| 8 | PCMA | audio | 1 | 8000 | any | 20 | ITU-T G.711 PCM A-Law audio 64 kbit/s | RFC 3551 |
| 9 | G722 | audio | 1 | 8000[note 2] | any | 20 | ITU-T G.722 audio 64 kbit/s | RFC 3551, Page 14 |
| 10 | L16 | audio | 2 | 44100 | any | 20 | Linear PCM 16-bit Stereo audio 1411.2 kbit/s,[2][3][4] uncompressed | RFC 3551, Page 27 |
| 11 | L16 | audio | 1 | 44100 | any | 20 | Linear PCM 16-bit audio 705.6 kbit/s, uncompressed | RFC 3551, Page 27 |
| 12 | QCELP | audio | 1 | 8000 | 20 | 20 | Qualcomm Code Excited Linear Prediction | RFC 2658, RFC 3551 |
| 13 | CN | audio | 1 | 8000 | | | Comfort noise. Payload type used with audio codecs that do not support comfort noise as part of the codec itself, such as G.711, G.722.1, G.722, G.726, G.727, G.728, GSM 06.10, Siren, and RTAudio. | RFC 3389 |
| 14 | MPA | audio | 1, 2 | 90000 | 8–72 | | MPEG-1 or MPEG-2 audio only | RFC 3551, RFC 2250 |
| 15 | G728 | audio | 1 | 8000 | 2.5 | 20 | ITU-T G.728 audio 16 kbit/s | RFC 3551 |
| 16 | DVI4 | audio | 1 | 11025 | any | 20 | IMA ADPCM audio 44.1 kbit/s | RFC 3551 |
| 17 | DVI4 | audio | 1 | 22050 | any | 20 | IMA ADPCM audio 88.2 kbit/s | RFC 3551 |
| 18 | G729 | audio | 1 | 8000 | 10 | 20 | ITU-T G.729 and G.729a audio 8 kbit/s; Annex B is implied unless the annexb=no parameter is used | RFC 3551, Page 20; RFC 3555, Page 15 |
| 19 | reserved (previously CN) | audio | | | | | reserved, previously comfort noise | RFC 3551 |
| 25 | CELLB | video | | 90000 | | | Sun CellB video[5] | RFC 2029 |
| 26 | JPEG | video | | 90000 | | | JPEG video | RFC 2435 |
| 28 | nv | video | | 90000 | | | Xerox PARC's Network Video (nv)[6][7] | RFC 3551, Page 32 |
| 31 | H261 | video | | 90000 | | | ITU-T H.261 video | RFC 4587 |
| 32 | MPV | video | | 90000 | | | MPEG-1 and MPEG-2 video | RFC 2250 |
| 33 | MP2T | audio/video | | 90000 | | | MPEG-2 transport stream | RFC 2250 |
| 34 | H263 | video | | 90000 | | | H.263 video, first version (1996) | RFC 3551, RFC 2190 |
| 72–76 | reserved | | | | | | reserved because RTCP packet types 200–204 would otherwise be indistinguishable from RTP payload types 72–76 with the marker bit set | RFC 3550, RFC 3551 |
| 77–95 | unassigned | | | | | | note that RTCP packet type 207 (XR, Extended Reports) would be indistinguishable from RTP payload type 79 with the marker bit set | RFC 3551, RFC 3611 |
| dynamic | H263-1998 | video | | 90000 | | | H.263 video, second version (1998) | RFC 3551, RFC 4629, RFC 2190 |
| dynamic | H263-2000 | video | | 90000 | | | H.263 video, third version (2000) | RFC 4629 |
| dynamic (or profile) | H264 AVC | video | | 90000 | | | H.264 video (MPEG-4 Part 10) | RFC 6184, previously RFC 3984 |
| dynamic (or profile) | H264 SVC | video | | 90000 | | | H.264 video | RFC 6190 |
| dynamic (or profile) | H265 | video | | 90000 | | | H.265 video (HEVC) | RFC 7798 |
| dynamic (or profile) | theora | video | | 90000 | | | Theora video | draft-barbato-avt-rtp-theora |
| dynamic | iLBC | audio | 1 | 8000 | 20, 30 | 20, 30 | Internet low Bitrate Codec 13.33 or 15.2 kbit/s | RFC 3952 |
| dynamic | PCMA-WB | audio | 1 | 16000 | 5 | | ITU-T G.711.1 A-law | RFC 5391 |
| dynamic | PCMU-WB | audio | 1 | 16000 | 5 | | ITU-T G.711.1 μ-law | RFC 5391 |
| dynamic | G718 | audio | | 32000 (placeholder) | 20 | | ITU-T G.718 | draft-ietf-payload-rtp-g718 |
| dynamic | G719 | audio | (various) | 48000 | 20 | | ITU-T G.719 | RFC 5404 |
| dynamic | G7221 | audio | | 16000, 32000 | 20 | | ITU-T G.722.1 and G.722.1 Annex C | RFC 5577 |
| dynamic | G726-16 | audio | 1 | 8000 | any | 20 | ITU-T G.726 audio 16 kbit/s | RFC 3551 |
| dynamic | G726-24 | audio | 1 | 8000 | any | 20 | ITU-T G.726 audio 24 kbit/s | RFC 3551 |
| dynamic | G726-32 | audio | 1 | 8000 | any | 20 | ITU-T G.726 audio 32 kbit/s | RFC 3551 |
| dynamic | G726-40 | audio | 1 | 8000 | any | 20 | ITU-T G.726 audio 40 kbit/s | RFC 3551 |
| dynamic | G729D | audio | 1 | 8000 | 10 | 20 | ITU-T G.729 Annex D | RFC 3551 |
| dynamic | G729E | audio | 1 | 8000 | 10 | 20 | ITU-T G.729 Annex E | RFC 3551 |
| dynamic | G7291 | audio | | 16000 | 20 | | ITU-T G.729.1 | RFC 4749 |
| dynamic | GSM-EFR | audio | 1 | 8000 | 20 | 20 | ITU-T GSM-EFR (GSM 06.60) | RFC 3551 |
| dynamic | GSM-HR-08 | audio | 1 | 8000 | 20 | | ITU-T GSM-HR (GSM 06.20) | RFC 5993 |
| dynamic (or profile) | AMR | audio | (various) | 8000 | 20 | | Adaptive Multi-Rate audio | RFC 4867 |
| dynamic (or profile) | AMR-WB | audio | (various) | 16000 | 20 | | Adaptive Multi-Rate Wideband audio (ITU-T G.722.2) | RFC 4867 |
| dynamic (or profile) | AMR-WB+ | audio | 1, 2 or omit | 72000 | 13.3–40 | | Extended Adaptive Multi Rate – WideBand audio | RFC 4352 |
| dynamic (or profile) | vorbis | audio | (various) | (various) | | | Vorbis audio | RFC 5215 |
| dynamic (or profile) | opus | audio | 1, 2 | 48000[note 3] | 2.5–60 | 20 | Opus audio | RFC 7587 |
| dynamic (or profile) | speex | audio | 1 | 8000, 16000, 32000 | 20 | | Speex audio | RFC 5574 |
| dynamic | mpa-robust | audio | 1, 2 | 90000 | 24–72 | | Loss-Tolerant MP3 audio | RFC 5219 (previously RFC 3119) |
| dynamic (or profile) | MP4A-LATM | audio | | 90000 or others | | | MPEG-4 Audio (includes AAC) | RFC 6416 (previously RFC 3016) |
| dynamic (or profile) | MP4V-ES | video | | 90000 or others | | | MPEG-4 Visual | RFC 6416 (previously RFC 3016) |
| dynamic (or profile) | mpeg4-generic | audio/video | | 90000 or other | | | MPEG-4 Elementary Streams | RFC 3640 |
| dynamic | VP8 | video | | 90000 | | | VP8 video | RFC 7741 |
| dynamic | VP9 | video | | 90000 | | | VP9 video | draft-ietf-payload-vp9 |
| dynamic | AV1 | video | | 90000 | | | AV1 video | av1-rtp-spec |
| dynamic | L8 | audio | (various) | (various) | any | 20 | Linear PCM 8-bit audio with 128 offset | RFC 3551 Section 4.5.10 and Table 5 |
| dynamic | DAT12 | audio | (various) | (various) | any | 20 (by analogy with L16) | IEC 61119 12-bit nonlinear audio | RFC 3190 Section 3 |
| dynamic | L16 | audio | (various) | (various) | any | 20 | Linear PCM 16-bit audio | RFC 3551 Section 4.5.11, RFC 2586 |
| dynamic | L20 | audio | (various) | (various) | any | 20 (by analogy with L16) | Linear PCM 20-bit audio | RFC 3190 Section 4 |
| dynamic | L24 | audio | (various) | (various) | any | 20 (by analogy with L16) | Linear PCM 24-bit audio | RFC 3190 Section 4 |
| dynamic | raw | video | | 90000 | | | Uncompressed Video | RFC 4175 |
| dynamic | ac3 | audio | (various) | 32000, 44100, 48000 | | | Dolby AC-3 audio | RFC 4184 |
| dynamic | eac3 | audio | (various) | 32000, 44100, 48000 | | | Enhanced AC-3 audio | RFC 4598 |
| dynamic | t140 | text | | 1000 | | | Text over IP | RFC 4103 |
| dynamic | EVRC, EVRC0, EVRC1 | audio | | 8000 | | | EVRC audio | RFC 4788 |
| dynamic | EVRCB, EVRCB0, EVRCB1 | audio | | 8000 | | | EVRC-B audio | RFC 4788 |
| dynamic | EVRCWB, EVRCWB0, EVRCWB1 | audio | | 16000 | | | EVRC-WB audio | RFC 5188 |
| dynamic | jpeg2000 | video | | 90000 | | | JPEG 2000 video | RFC 5371 |
| dynamic | UEMCLIP | audio | | 8000, 16000 | | | UEMCLIP audio | RFC 5686 |
| dynamic | ATRAC3 | audio | | 44100 | | | ATRAC 3 audio | RFC 5584 |
| dynamic | ATRAC-X | audio | | 44100, 48000 | | | ATRAC 3+ audio | RFC 5584 |
| dynamic | ATRAC-ADVANCED-LOSSLESS | audio | | (various) | | | ATRAC Advanced Lossless audio | RFC 5584 |
| dynamic | DV | video | | 90000 | | | DV video | RFC 6469 (previously RFC 3189) |
| dynamic | BT656 | video | | | | | ITU-R BT.656 video | RFC 3555 |
| dynamic | BMPEG | video | | | | | Bundled MPEG-2 video | RFC 2343 |
| dynamic | SMPTE292M | video | | | | | SMPTE 292M video | RFC 3497 |
| dynamic | RED | audio | | | | | Redundant Audio Data | RFC 2198 |
| dynamic | VDVI | audio | | | | | Variable-rate DVI4 audio | RFC 3551 |
| dynamic | MP1S | video | | | | | MPEG-1 Systems Streams video | RFC 2250 |
| dynamic | MP2P | video | | | | | MPEG-2 Program Streams video | RFC 2250 |
| dynamic | tone | audio | | 8000 (default) | | | tone | RFC 4733 |
| dynamic | telephone-event | audio | | 8000 (default) | | | DTMF tone | RFC 4733 |
| dynamic | aptx | audio | 2–6 | (equal to sampling rate) | 4000 ÷ sample rate | 4[note 4] | aptX audio | RFC 7310 |
| dynamic | jxsv | video | | 90000 | | | JPEG XS video | RFC 9134 |
| dynamic | scip | audio/video | | 8000 or 90000 | | | SCIP | RFC 9607 |
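The reason payload types 72–76 are reserved can be checked with a little bit arithmetic. In the RTP fixed header the second octet holds the marker bit (most significant bit) followed by the 7-bit payload type, while in RTCP the second octet is the 8-bit packet type. The sketch below, whose function name `rtp_second_octet` is hypothetical, shows which RTP marker/payload-type combinations produce the same octet as the RTCP packet types mentioned in the table.

```python
def rtp_second_octet(marker: bool, payload_type: int) -> int:
    """Second octet of an RTP header: marker bit (MSB) plus 7-bit payload type."""
    if not 0 <= payload_type <= 127:
        raise ValueError("payload type must fit in 7 bits")
    return (0x80 if marker else 0x00) | payload_type


# Payload types 72-76 with the marker bit set map onto RTCP packet types 200-204.
collisions = {pt: rtp_second_octet(True, pt) for pt in range(72, 77)}
print(collisions)

# Payload type 79 with the marker bit set maps onto RTCP packet type 207 (XR).
print(rtp_second_octet(True, 79))
```

A receiver that demultiplexes RTP and RTCP arriving on the same port by inspecting this octet therefore cannot tell the two apart for these values, which is why the corresponding payload type numbers are avoided.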
Note 1. The "clock rate" is the rate at which the timestamp in the RTP header is incremented, which need not be the same as the codec's sampling rate. For instance, video codecs typically use a clock rate of 90000 so their frames can be more precisely aligned with the RTCP NTP timestamp, even though video sampling rates are typically in the range of 1–60 samples per second.
Note 2. Although the sampling rate for G.722 is 16000, its clock rate is 8000 to remain backwards compatible with RFC 1890, which incorrectly used this value.[1]
Note 3. Because Opus can change sampling rates dynamically, its clock rate is fixed at 48000, even when the codec will be operated at a lower sampling rate. The maxplaybackrate and sprop-maxcapturerate parameters in SDP can be used to indicate hints or preferences about the maximum sampling rate to encode or decode.
Note 4. For aptX, the packetization interval must be rounded down to the nearest packet interval that can contain an integer number of samples. So at sampling rates of 11025, 22050, or 44100, a packetization interval of 4 ms is rounded down to 3.99 ms.
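The arithmetic behind the clock-rate and aptX notes can be sketched as follows. Both helper names are hypothetical, and the aptX helper assumes that "integer number of samples" means rounding the sample count per packet down to a whole number, as the note describes; it is not taken from RFC 7310 itself.

```python
import math


def timestamp_increment(clock_rate_hz: int, packet_interval_ms: float) -> int:
    """RTP timestamp ticks that elapse during one packet interval."""
    return round(clock_rate_hz * packet_interval_ms / 1000)


def aptx_packet_interval_ms(sample_rate_hz: int, nominal_ms: float = 4.0) -> float:
    """Round the nominal interval down to a whole number of samples (assumption)."""
    samples = math.floor(sample_rate_hz * nominal_ms / 1000)
    return samples * 1000 / sample_rate_hz


# PCMU: 8000 Hz clock, 20 ms packets.
print(timestamp_increment(8000, 20))

# G.722: sampled at 16000 Hz, but the RTP clock still runs at 8000 Hz,
# so a 20 ms packet advances the timestamp by the same amount as PCMU.
print(timestamp_increment(8000, 20), "ticks for", 16000 * 20 // 1000, "samples")

# Video at 30 frames per second on the 90000 Hz clock.
print(timestamp_increment(90000, 1000 / 30))

# aptX at 44100 Hz: 4 ms would be 176.4 samples, so 176 samples are used.
print(round(aptx_packet_interval_ms(44100), 2))
```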
Text messaging payload
RFC 4103, RTP Payload Format for Text Conversation
MIDI payload