[---------Access Unit--------------]
[nal][nal][nal][nal][nal][nal][nal]
[-----I Slice-----------------------]
header (start code) = 0x00 0x00 0x01
NAL=XY .. .. .. .. ..
XY= [0._._._|_._._._]
0 = forbidden bit
[_._]= 2 bits, NRI (nal_ref_idc)
Indicates whether this NAL is referenced by another NAL. [0][0] means the NAL can be discarded, as its data is not used to reconstruct reference pictures.
_|_._._._= 5 bits unit_type
Unit Type :
0 reserved
1-23 NAL unit Single NAL unit packet:
1 NAL_SLICE
2 NAL_SLICE_DPA
3 NAL_SLICE_DPB
4 NAL_SLICE_DPC
5 H264_NAL_SLICE_IDR
6 NAL_SEI
7 NAL_SPS
8 NAL_PPS
9 NAL_AU_DELIMITER
10 NAL_SEQ_END
11 NAL_STREAM_END
12 NAL_FILLER_DATA
24 STAP-A Single-time aggregation packet 5.7.1
25 STAP-B Single-time aggregation packet 5.7.1
26 MTAP16 Multi-time aggregation packet 5.7.2
27 MTAP24 Multi-time aggregation packet 5.7.2
28 FU-A Fragmentation unit 5.8
29 FU-B Fragmentation unit 5.8
30-31 reserved
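The bit layout of the first NAL byte described above can be sketched in Python as follows. This is only a minimal illustration: the function name `parse_nal_header` and the `NAL_TYPE_NAMES` table are made up for this example, with the names copied from the list above.

```python
# Hypothetical helper, not part of any library: names follow the list above.
NAL_TYPE_NAMES = {
    1: "NAL_SLICE", 2: "NAL_SLICE_DPA", 3: "NAL_SLICE_DPB", 4: "NAL_SLICE_DPC",
    5: "H264_NAL_SLICE_IDR", 6: "NAL_SEI", 7: "NAL_SPS", 8: "NAL_PPS",
    9: "NAL_AU_DELIMITER", 10: "NAL_SEQ_END", 11: "NAL_STREAM_END",
    12: "NAL_FILLER_DATA", 24: "STAP-A", 25: "STAP-B", 26: "MTAP16",
    27: "MTAP24", 28: "FU-A", 29: "FU-B",
}


def parse_nal_header(byte):
    """Split the first byte of a NAL unit into its three fields."""
    forbidden = (byte >> 7) & 0x01   # 1 bit, must be 0
    nri = (byte >> 5) & 0x03         # 2 bits, nal_ref_idc
    unit_type = byte & 0x1F          # 5 bits, unit_type
    name = NAL_TYPE_NAMES.get(unit_type, "reserved/unspecified")
    return forbidden, nri, unit_type, name


if __name__ == "__main__":
    for b in (0x67, 0x68, 0x65, 0x41, 0x09):
        print(hex(b), parse_nal_header(b))
```

For instance, the byte 0x67 is 0110 0111 in binary, so the forbidden bit is 0, NRI is 11 and the type is 7 (NAL_SPS).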
In particular, the H.264 specification requires that the value of NRI SHALL be equal to 0 for all NAL units having nal_unit_type equal to 6, 9, 10, 11, or 12.
If the type is 5, the data belongs to an IDR picture (a special kind of I-frame) !!!
For NAL units having nal_unit_type equal to 7 or 8 (indicating a sequence parameter set or a picture parameter set, respectively), an H.264 encoder SHOULD set the value of NRI to 11 (in binary format).
For coded slice NAL units of a primary coded picture having nal_unit_type equal to 5 (indicating a coded slice belonging to an IDR picture), an H.264 encoder SHOULD set the value of NRI to 11 (in binary format).
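The NRI rules quoted above could be checked with something like the sketch below. The function `check_nri` and its return strings are invented for this illustration; only the type lists and the expected NRI values come from the text.

```python
# Hypothetical checker for the NRI rules quoted above.
NRI_MUST_BE_ZERO = {6, 9, 10, 11, 12}   # SHALL be 0
NRI_SHOULD_BE_THREE = {5, 7, 8}         # SHOULD be 11 in binary


def check_nri(header_byte):
    """Return 'ok', a warning, or a violation for one NAL header byte."""
    nri = (header_byte >> 5) & 0x03
    unit_type = header_byte & 0x1F
    if unit_type in NRI_MUST_BE_ZERO and nri != 0:
        return "violation: NRI must be 0 for type %d" % unit_type
    if unit_type in NRI_SHOULD_BE_THREE and nri != 3:
        return "warning: NRI should be 3 for type %d" % unit_type
    return "ok"


if __name__ == "__main__":
    for b in (0x67, 0x27, 0x26, 0x06):
        print(hex(b), check_nri(b))
```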
NAL_AU_DELIMITER = 9 is the Access Unit delimiter.
The next byte after 00 00 01 09 is XY:
X = [D.C.B.A]
A = 1 (stop bit)
DCB = primary_pic_type (3 bits):
000 => start of I frame ! (I slices only)
001 => start of P frame (P and I slices)
010 => start of B frame (B, P and I slices)
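As a rough sketch, the byte that follows the access unit delimiter header can be decoded by taking its top three bits. The helper name `aud_picture_kind` is hypothetical, and values other than the three listed above are simply reported as "other" here.

```python
def aud_picture_kind(aud_payload_byte):
    """Classify the byte that follows 00 00 01 09 (hypothetical helper)."""
    # The top three bits hold the picture type; the next bit is the stop bit.
    primary_pic_type = (aud_payload_byte >> 5) & 0x07
    kinds = {0: "I", 1: "P", 2: "B"}
    return kinds.get(primary_pic_type, "other")


if __name__ == "__main__":
    for b in (0x10, 0x30, 0x50):
        print(hex(b), aud_picture_kind(b))
```

With this layout, 00 00 01 09 10 starts an I frame, 00 00 01 09 30 a P frame and 00 00 01 09 50 a B frame.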
H264 bitstream analysis
-
- Posts: 65
- Joined: Thu Oct 03, 2013 5:54 pm
H264 for dummies
The binary stream is structured and divided into packets.
At the top level, the stream is a sequence of NAL packets.

The abbreviation NAL stands for Network Abstraction Layer.

The NAL type defines what data structure the current NAL packet carries.
It can be a slice, a parameter set, filler data and so on:
NAL types
Type Definition
0 Undefined
1 Slice layer without partitioning non IDR
2 Slice data partition A layer
3 Slice data partition B layer
4 Slice data partition C layer
5 Slice layer without partitioning IDR
6 Additional information (SEI)
7 Sequence parameter set
8 Picture parameter set
9 Access unit delimiter
10 End of sequence
11 End of stream
12 Filler data
13..23 Reserved
24..31 Undefined
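To see which NAL types a stream contains, it first has to be cut at the 00 00 01 start codes. The sketch below assumes an Annex B style byte stream (the kind written to a raw .h264 file); `split_nal_units` is a name made up for this example.

```python
import re

START_CODE = re.compile(b"\x00\x00\x01")


def split_nal_units(stream):
    """Return the NAL units found after each 00 00 01 start code."""
    starts = [m.end() for m in START_CODE.finditer(stream)]
    units = []
    for i, begin in enumerate(starts):
        # A unit ends where the next start code begins, or at end of stream.
        end = len(stream) if i + 1 == len(starts) else starts[i + 1] - 3
        # Trailing zero bytes belong to a 4-byte start code or to padding.
        units.append(stream[begin:end].rstrip(b"\x00"))
    return units


if __name__ == "__main__":
    demo = (b"\x00\x00\x00\x01\x67\xaa"
            b"\x00\x00\x01\x68\xbb"
            b"\x00\x00\x00\x01\x65\xcc")
    for unit in split_nal_units(demo):
        print("type", unit[0] & 0x1F, "length", len(unit))
```

The low five bits of the first byte of each unit are the type from the table above.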
The payload of a NAL packet is called the RBSP (Raw Byte Sequence Payload).
The RBSP carries the bits of the SODB (String Of Data Bits) in a specified order.
If the SODB is empty (zero bits in length), the RBSP is also empty.
Otherwise, the first byte of the RBSP contains the first eight bits of the SODB (most significant bit first, leftmost); the next byte contains the following eight bits, and so on, until fewer than eight SODB bits remain.
These are followed by a stop bit (1) and alignment bits (0) that fill the last byte.
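The packing rule above can be illustrated with a few lines of Python. This is only a sketch: `sodb_to_rbsp` is a hypothetical name, the SODB is passed as a string of '0' and '1' characters for readability, and the emulation prevention bytes that the NAL layer adds on top of the RBSP are ignored.

```python
def sodb_to_rbsp(sodb_bits):
    """Pack a SODB bit string into RBSP bytes (hypothetical helper)."""
    if not sodb_bits:
        return b""                       # empty SODB gives an empty RBSP
    bits = sodb_bits + "1"               # stop bit
    bits += "0" * (-len(bits) % 8)       # alignment bits up to a byte boundary
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    print(sodb_to_rbsp("101").hex())
    print(sodb_to_rbsp("11111111").hex())
```

Note that a SODB of exactly eight bits still needs a second byte, because the stop bit has to be stored somewhere.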

Every coded image consists of slices, which in turn are divided into macroblocks.
Most often, one coded image corresponds to a single slice, but one image can also have multiple slices.

The slices are divided into the following types:
Type Description
0 P-slice. Consists of P-macroblocks (each macroblock is predicted using one reference frame) and/or I-macroblocks.
1 B-slice. Consists of B-macroblocks (each macroblock is predicted using one or two reference frames) and/or I-macroblocks.
2 I-slice. Contains only I-macroblocks. Each macroblock is predicted from previously coded blocks of the same slice.
3 SP-slice. Consists of P and/or I-macroblocks and lets you switch between encoded streams.
4 SI-slice. Consists of a special type of macroblock (SI-macroblocks) and lets you switch between encoded streams.
5 P-slice.
6 B-slice.
7 I-slice.
8 SP-slice.
9 SI-slice.
It looks like the slice type table contains some redundant data.
But that is not the case: types 5 - 9 mean that all other slices of the current image are of the same type.
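In code, the two halves of the slice type table collapse into one lookup plus a flag. A minimal sketch, with `describe_slice_type` being a name invented for this example:

```python
SLICE_TYPE_NAMES = ["P", "B", "I", "SP", "SI"]


def describe_slice_type(slice_type):
    """Return (name, all_slices_same_type) for a slice type 0..9."""
    if not 0 <= slice_type <= 9:
        raise ValueError("slice type must be in 0..9")
    name = SLICE_TYPE_NAMES[slice_type % 5]
    # Types 5 - 9 promise that every slice of the image has this type.
    all_same = slice_type >= 5
    return name, all_same


if __name__ == "__main__":
    for t in (2, 7, 0, 5):
        print(t, describe_slice_type(t))
```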
The slice header contains the slice type, the type of macroblocks in the slice, and the frame number of the slice.
The header also carries the reference frame settings and the quantization parameters.
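The first fields of a slice header are stored as unsigned Exp-Golomb codes (a run of zero bits, a 1, then as many extra bits as there were zeros). The sketch below reads the first three of them: first_mb_in_slice, slice_type and pic_parameter_set_id. The `BitReader` class and `read_slice_header_start` are hypothetical names, and the input is assumed to be the bytes right after the NAL header byte with no emulation prevention bytes in the part being read.

```python
class BitReader:
    """Reads bits from a bytes object, most significant bit first."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # position in bits

    def read_bit(self):
        byte = self.data[self.pos >> 3]
        bit = (byte >> (7 - (self.pos & 7))) & 1
        self.pos += 1
        return bit

    def read_bits(self, n):
        value = 0
        for _ in range(n):
            value = (value << 1) | self.read_bit()
        return value

    def read_ue(self):
        # Unsigned Exp-Golomb: count leading zeros, then read that many bits.
        zeros = 0
        while self.read_bit() == 0:
            zeros += 1
        return (1 << zeros) - 1 + self.read_bits(zeros)


def read_slice_header_start(slice_payload):
    """Return the first three slice header fields (assumed layout)."""
    reader = BitReader(slice_payload)
    return {
        "first_mb_in_slice": reader.read_ue(),
        "slice_type": reader.read_ue(),
        "pic_parameter_set_id": reader.read_ue(),
    }


if __name__ == "__main__":
    print(read_slice_header_start(b"\x88\x80"))
```

The remaining header fields depend on values from the SPS and PPS, so they cannot be read without parsing those first.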
And finally the slice data: the macroblocks. This is where our pixels are hiding.
Macroblocks are the main carriers of information, because they contain the sets of luminance and chrominance components corresponding to individual pixels.
Video decoding ultimately comes down to finding and extracting macroblocks from the bit stream.
This is what a single macroblock looks like:

H264 memo
SPS Sequence Parameter Set
PPS Picture Parameter Set
IDR Instantaneous Decoding Refresh
An IDR frame is a special type of I-frame in H.264.
An IDR frame specifies that no frame after the IDR frame can reference any frame before it.
This makes seeking the H.264 file easier and more responsive in the player.
Every IDR frame is an I-frame, but not vice versa.
Analyze an H264 bitstream:
download h264bitstream:
http://sourceforge.net/projects/h264bitstream/files/latest/download
extract the h264 from the movie:
ffmpeg.exe -i "Old Faithful.mp4" -vcodec copy -bsf:v h264_mp4toannexb -an of.h264
The NAL units now reside in the file of.h264.
Run the h264_analyze command from the h264bitstream package on that file, for example: h264_analyze of.h264