ecodis :: Efficient Video Codecs
Video codecs which I worked on:
MPEG-I VVC / ITU-T H.266
Royalty-based broadcast & VR codec
In signal processing, data compression, source coding, or bit-rate reduction involves encoding information using fewer bits than the original representation. Compression can be either lossy or lossless. [...] The process of reducing the size of a data file is often referred to as data compression. In the context of data transmission, it is called source coding (encoding done at the source of the data before it is stored or transmitted) in opposition to channel coding.
Wikipedia page on data compression, 2017
Storing or transmitting contemporary ultra-high-definition (UHD) digital video content in uncompressed form is virtually impossible due to the extremely high data rates; only one second of HDR video with 3840×2160 pixels at 50 frames per second would fit onto a CD. Therefore, efficient lossy coding with very good visual quality even at very low data rates is even more important than in audio applications. This implies that a maximum of redundant and irrelevant information must be removed during the coding.
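The CD comparison can be checked with a few lines of arithmetic. The sketch below assumes 10-bit samples, 4:2:0 chroma subsampling, tightly packed storage, and a nominal CD capacity of 700 MB; none of these parameters is stated in the text above, and the function name is chosen for illustration only.

```python
def uncompressed_rate_bytes_per_second(width, height, fps, bit_depth,
                                       chroma_samples_per_pixel):
    """Raw video data rate in bytes per second.

    chroma_samples_per_pixel: 0.5 for 4:2:0, 1.0 for 4:2:2, 2.0 for 4:4:4.
    """
    # One luma sample per pixel plus the (subsampled) chroma samples.
    samples_per_frame = width * height * (1 + chroma_samples_per_pixel)
    bits_per_frame = samples_per_frame * bit_depth
    return bits_per_frame * fps / 8


# Assumption: nominal CD capacity of 700 MB (decimal megabytes).
CD_CAPACITY_BYTES = 700 * 1000 * 1000

# Assumption: 10-bit HDR video with 4:2:0 chroma subsampling.
rate = uncompressed_rate_bytes_per_second(3840, 2160, 50, 10, 0.5)
seconds_on_cd = CD_CAPACITY_BYTES / rate

print(f"{rate / 1e6:.1f} MB/s, {seconds_on_cd:.2f} s of video per CD")
```

Under these assumptions the raw data rate is 777.6 MB per second, so a CD holds roughly 0.9 seconds of video, in line with the claim above. With 4:4:4 sampling or padded 16-bit storage the figure would be even lower.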
On this page, the three most efficient newest-generation video coding standards are introduced. The first one, whose specification I am involved in, is still being developed.
MPEG-I Versatile Video Coding (VVC), to be standardized as ITU-T H.266
VVC, also known as H.266, is a flexible general-purpose video coding specification currently being standardized by ISO/IEC and ITU-T [ source]. Developed by the Joint Video Experts Team (JVET), the final VVC version is intended to exceed all existing standards (most notably, the three mentioned below) in compression performance at the same subjective reconstruction quality, with only a moderate increase in decoding workload.
Since «serious» work on the VVC standard has just begun (in 2018), I cannot present any comparisons between VVC and other state-of-the-art video codecs. However, I can report from the April 2018 JVET meeting in San Diego that some proposals submitted for standardization in February 2018 achieved notable compression efficiency gains near 40 % over VVC's predecessor (see below) at acceptable decoder complexities of 3–4 times that of the state of the art [ source]. Therefore, I expect the final version of the standard, due in late 2020, to provide bit-rate savings of at least one third at a decoding workload even closer to that of current video coding standards (factor of 2).
Update June 2019 A recently performed subjective test indicates that, for HD and UHD standard dynamic range content, even in its current unfinished state, VVC already achieves the same visual coding quality as its predecessor with 36–40 % lower bit-rate.
During the first stages of the VVC development, I proposed an encoder optimization algorithm improving the subjective video coding quality for some content and presented my work in Macau, Ljubljana, and Marrakech. At the Ljubljana meeting, I also suggested adding support for a new 10- and 12-bit packed YUV/RGB image and video storage format to the VVC code base [ report], a description of which is given below. Later during the standardization, I contributed to better in-loop filtering and the joint transform coding of the chroma components in color images and videos [ reports].
The working draft of the VVC reference encoder and decoder software is hosted here. As of May 2019, it achieves objective efficiency gains (in terms of Bjøntegaard delta rate) of 24 % for still pictures (decoder runtime factor of 1.8) and approximately 34 % for random-access videos (decoder runtime factor of 1.7) [ table]. Considering that we are halfway through VVC's standardization, I think this is a notable achievement.
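For readers unfamiliar with the Bjøntegaard delta rate, here is a minimal sketch of one common way to compute it: the logarithm of the bit-rate is modeled as a cubic polynomial of PSNR for each codec, both curves are integrated over their common PSNR range, and the average difference is converted back into a percentage. The sketch assumes exactly four rate points per codec (so the cubic passes through all of them) and uses invented rate and PSNR values, not VVC measurements. All function names are my own, and other implementations may use, for example, piecewise cubic interpolation instead.

```python
import math


def lagrange_eval(xs, ys, x):
    """Evaluate the polynomial through the points (xs, ys) at position x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def simpson(f, a, b):
    """Simpson's rule on [a, b]; exact for polynomials up to degree three."""
    return (b - a) / 6.0 * (f(a) + 4.0 * f((a + b) / 2.0) + f(b))


def bd_rate(rates_ref, psnr_ref, rates_test, psnr_test):
    """Average bit-rate difference in percent at equal PSNR.

    Negative values mean that the test codec needs less bit-rate.
    Assumes four (rate, PSNR) points per codec.
    """
    log_ref = [math.log10(r) for r in rates_ref]
    log_test = [math.log10(r) for r in rates_test]
    # Integrate only over the PSNR range covered by both curves.
    lo = max(min(psnr_ref), min(psnr_test))
    hi = min(max(psnr_ref), max(psnr_test))
    int_ref = simpson(lambda q: lagrange_eval(psnr_ref, log_ref, q), lo, hi)
    int_test = simpson(lambda q: lagrange_eval(psnr_test, log_test, q), lo, hi)
    avg_log_diff = (int_test - int_ref) / (hi - lo)
    return (10.0 ** avg_log_diff - 1.0) * 100.0


# Invented example data: the test codec needs 34 % less rate at every PSNR.
rates_ref = [1000.0, 2000.0, 4000.0, 8000.0]   # kbit/s, hypothetical
psnr_values = [32.0, 35.0, 38.0, 41.0]         # dB, hypothetical
rates_test = [0.66 * r for r in rates_ref]

saving = bd_rate(rates_ref, psnr_values, rates_test, psnr_values)
print(f"BD-rate: {saving:.1f} %")  # approximately -34 %
```

A BD-rate of about -34 % in this notation corresponds to what the text above reports as an efficiency gain of 34 %.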
MPEG-5 Essential Video Coding (EVC), a royalty-reduced 4K video codec
Around the April 2018 JVET meeting in San Diego, where the first draft of the VVC reference software and specification text was agreed upon (see above), the Moving Picture Experts Group (MPEG) decided to initiate work on a separate, royalty-reduced video coding solution, to be completed and standardized in early 2020 under the name MPEG-5 Essential Video Coding (EVC). The term «royalty-reduced» indicates that the licensing terms for the use of EVC in its Main profile configuration are intended to be transparent and reasonable (licensing conditions are expected to be published by patent holders within 2 years), while the use of EVC in its Baseline configuration will be royalty-free. More details and the motivation behind this approach are given here.
In November 2018, two solutions were submitted to MPEG in response to its Call for Proposals (CfP) on new video coding technology with «simplified coding structure and an accelerated development time of 12 months» [ source]. Two months later, at its January 2019 meeting, MPEG evaluated both proposals [ report] and selected the one by Samsung, Huawei, and Qualcomm (a description of which is provided here) as the starting point of the EVC standardization. The relevant EVC documents are hosted here (note that an MPEG username and password are required to access these pages):
In its first revision, MPEG-5 EVC roughly matches its predecessor, MPEG-H HEVC, in both objective (PSNR) and subjective (visual quality) performance when operated in the royalty-free Baseline mode. Its Main profile configuration, however, was verified to already deliver 24 % better coding efficiency than HEVC, which may increase by a few more percent until the end of EVC's development. As such, EVC is serious competition for VVC, and the next few years will show which of these two codecs will achieve a wider market adoption. I will update this section once the EVC codec has been finalized.
MPEG-H High Efficiency Video Coding, also standardized as ITU-T H.265
Using the High Efficiency Video Coding (HEVC) specification standardized in ISO/IEC 23008-2 and ITU-T H.265 currently is the most efficient way to compress moving pictures, especially high-resolution HD and UHD video. Developed mainly between 2010 and 2013, with some screen content and 3D coding extensions added after 2013, HEVC achieves an increase of about 50 % in perceptual compression efficiency over previous coding standards like H.264 [ source]. In other words, averaged across several coded video sequences, HEVC provides roughly the same subjective video quality as the older coding formats, and it does so using encoded bit-streams which are only half as large. This performance boost is what allowed, for the first time, the delivery of high-quality UHD video to consumers via broadcasting, streaming, and UHD Blu-Ray disc.
As of 2018, hardware-based HEVC decoding is supported by most TVs, set-top boxes, video players, computers, tablets, and even smartphones. The best freely and publicly available HEVC encoder is maintained by the x265 project team and is located here:
HEVC, as described above, is the predecessor of the VVC standard, and most of its underlying technology can still be found in the current draft of the VVC specification. In fact, all visual codecs discussed on this page use the exact same algorithmic building blocks which define a modern hybrid block transform video codec like HEVC. These are
a partitioner segmenting each component of the input into nonoverlapping blocks,
a prediction stage attempting intra- or inter-picture prediction of each input block,
a residual transform converting the prediction error into a spectral representation,
a quantizer mapping the residual transform values to a smaller set of coefficients,
an entropy coder applying lossless compression to the quantized coefficients, and
a few postfilters reducing blocking, denoising, and ringing artifacts upon decoding.
Some codecs also add encoder-side prefilters to complement the decoder's postfilters. Note that modern audio codecs employ the same elements and that, in both audio and video coding, a second prediction stage may be used before or after the quantizer.
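To make these building blocks more tangible, the following toy sketch chains a partitioner, a very crude prediction, a residual transform, and a quantizer for a single grayscale component. It is an illustration under strong assumptions and not how HEVC or VVC actually work: the block size is fixed at 4×4, the prediction is just the mean of the reconstructed block to the left (or mid-gray), the transform is a floating-point DCT-II, and the entropy coder and postfilters are omitted entirely. All names are hypothetical.

```python
import math

N = 4  # fixed block size (assumption; real codecs use many block sizes)


def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n matrix (rows are basis vectors)."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


D = dct_matrix(N)
DT = [list(col) for col in zip(*D)]  # transpose of D


def encode_block(block, prediction, step):
    """Prediction error -> 2-D transform -> uniform quantization."""
    residual = [[block[y][x] - prediction for x in range(N)] for y in range(N)]
    coeffs = matmul(matmul(D, residual), DT)
    return [[round(c / step) for c in row] for row in coeffs]


def decode_block(levels, prediction, step):
    """Dequantization -> inverse transform -> add prediction, clip to 8 bit."""
    coeffs = [[q * step for q in row] for row in levels]
    residual = matmul(matmul(DT, coeffs), D)
    return [[min(255, max(0, round(prediction + r))) for r in row]
            for row in residual]


def code_image(image, step):
    """Code an image whose width and height are multiples of N.

    Returns the quantized levels of all blocks and the reconstructed image.
    """
    height, width = len(image), len(image[0])
    recon = [[0] * width for _ in range(height)]
    all_levels = []
    for by in range(0, height, N):          # partitioner: nonoverlapping blocks
        for bx in range(0, width, N):
            block = [row[bx:bx + N] for row in image[by:by + N]]
            if bx >= N:                     # predict from reconstructed left block
                left = [recon[y][x] for y in range(by, by + N)
                        for x in range(bx - N, bx)]
                prediction = round(sum(left) / len(left))
            else:                           # no neighbor available: mid-gray
                prediction = 128
            levels = encode_block(block, prediction, step)
            rec_block = decode_block(levels, prediction, step)
            for y in range(N):
                recon[by + y][bx:bx + N] = rec_block[y]
            all_levels.append(levels)
    return all_levels, recon


# Small synthetic gradient image (4 rows, 8 columns), invented for this demo.
image = [[16 * x + 4 * y for x in range(8)] for y in range(4)]
levels, recon = code_image(image, step=8)
```

Note that the encoder predicts from the reconstructed neighbor rather than from the original samples, so that a decoder, which only has access to reconstructed data, forms exactly the same prediction. A larger quantizer step size yields coarser levels, which an entropy coder could represent with fewer bits, at the price of a larger reconstruction error.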
AOMedia AV1: A Freely Available Open-Source UHD Video Codec
VVC and HEVC, like other MPEG media codecs, are royalty-bearing, which means that licensing fees will be demanded for their use. The AV1 video codec jointly developed by the Alliance for Open Media (AOMedia) between 2015 and 2018, in contrast, is a general-purpose open-source coding format explicitly designed to be royalty-free. This was attempted by avoiding the use of still-patented third-party technology in the coding and decoding algorithms (the success of this effort, however, has, as of 2019, not yet been fully verified). The video compression capability of AV1 is realized primarily with coding techniques derived from VP9/VP10, Daala, and Thor [ source]. Inside a WebM container, audio compression support is added through the Opus codec [ source].
The IETF is expected to adopt AV1 as the Internet Video Coding (NetVC) standard in late 2018 [ source] alongside the Opus codec, which has already been standardized in RFC 6716 in 2012. I anticipate broad hardware decoding support for AV1 in late 2020. Note that software decoding is already provided, even on Windows [ source], and work on a BSD-licensed optimized decoder called dav1d has progressed as well. The current versions of the AV1 specification and software are available at these pages:
Surprisingly, the subjective performance of the AV1 codec in comparison to its latest competitors, VVC and HEVC, is relatively inconsistent. In some independent tests, AV1 matched the coding efficiency of HEVC [ source], while in others, the codec was objectively and subjectively inferior to HEVC [ source]. This can be attributed to the use of different encoder speed presets in the evaluations: for HEVC-like performance, a very slow AV1 encoder preset must be used [ source]. Since there is clear evidence that the precursor to VVC, known as Joint Exploration Model ( JEM), outperforms both AV1 and HEVC and also encodes faster than AV1 [ source 1, source 2], it is obvious that VVC will, indeed, become the most efficient video codec during the next few years.
Figure 1. Outcome of BBC's subjective comparison tests of HEVC (HM software encoder), AV1, and JEM as experimental ancestor of VVC. (Fig. copyright BBC, 2018, picture taken from R&D blog post)
We must not forget, however, that AV1 intends to remain royalty-free. Thus, its near-HEVC performance is impressive and top-notch in the free open-source codec world. Some online documentation of AV1's most interesting and innovative coding tools can be found here and here. An overview of all coding tools is given in this paper.
Update Nov. 2018 As this image indicates, the speed of the AV1 reference encoder has recently been increased by at least a factor of 60 without a significant reduction in coding efficiency [1.6 %, source], so it seems that a runtime and coding gain similar to that of the HEVC reference encoder will, indeed, become possible with AV1 soon. Still, the efficiency of the VVC reference software clearly remains out of reach for AV1. This shortcoming will be addressed by AV2, whose standardization will begin in a few years when support for AV1 has been widely deployed to consumer devices [ source].
Summary: More Coding Gain Possible but Hard to Implement Efficiently
The current VVC standardization (see above) indicates that further gains in image and video coding performance are still possible. However, given the order of magnitude increases in encoding runtime of both VVC and AV1, I feel that we are rapidly leaving the path of reasonable gain-efficiency tradeoff followed for so long: with next-generation standards, coding of a single 4K image on one processor takes up to half an hour, and the hardware requirements especially for moving-picture coding are substantial. For this reason, most of the often promising but experimental coding tools in JEM (like FRUC, for example) won't make it into VVC: their algorithmic complexity and/or fast memory demands are just too high for usable implementations in both hardware and software. It's true that encoding can now be performed highly parallelized in the cloud, but spending a year of aggregate computing power on one hour of UHD video is clearly not efficient and, so far, not environmentally friendly [ article]. Don't forget the countless coding-decoding iterations performed by the various participating companies during experiments towards a codec's standardization itself (more and more of which need to be run due to the vanishing potential for further coding gain)! And remember that the final HEVC/H.265 reference encoder was only three times slower than its AVC/H.264 ancestor [ source]!
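To put the «year of aggregate computing power» into perspective, the following back-of-the-envelope sketch converts it into an encoding time per frame and a slowdown relative to real time. The frame rate of 50 frames per second is an assumption carried over from the UHD example at the top of this page, and the function name is illustrative.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # one year of aggregate CPU time


def cpu_seconds_per_frame(cpu_seconds_total, video_seconds, fps):
    """Average CPU time spent per encoded frame."""
    return cpu_seconds_total / (video_seconds * fps)


VIDEO_SECONDS = 3600  # one hour of video
FPS = 50              # assumption: 50 frames per second

per_frame = cpu_seconds_per_frame(SECONDS_PER_YEAR, VIDEO_SECONDS, FPS)
slowdown = SECONDS_PER_YEAR / VIDEO_SECONDS  # CPU time relative to real time

print(f"{per_frame:.1f} CPU seconds per frame, {slowdown:.0f} x real time")
```

Under these assumptions, one CPU-year per hour of video corresponds to roughly three minutes of computation per frame, or 8760 times real time.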
Therefore, aside from working on speeding up the new-generation video encoders by at least an order of magnitude, I believe it's time to reconsider the current approach in video coding development. Some experts at, e.g., Netflix share this vision and call for innovation «beyond block-based hybrid video coding» as outlined above. If that means using more extreme measures like large 4D-DCTs or CNNs, I disagree since, in my view, the computational burden for a competitive level of performance will likely be even more problematic than in today's codecs. If, however, the idea is to refrain from squeezing 9% more coding gain (i.e., statistical redundancy) out of existing block-based schemes, and instead to further exploit the inaccuracies of human vision in the design of image and video codecs (using parametric tools as in audio, which we still hardly do), then I fully support that approach. In fact, with my current work I already do. I hope you do too.
HEIF/MIAF and AVIF: The Two Latest-Generation Still-Image Codecs
Still-image coding, like video coding, has come a very long way since the early days of T.81/JPEG and H.262/MPEG-2 about a quarter of a century ago. Recently, two additions to the long list of image coding specifications have emerged, namely, single-picture constrained variants of HEVC, called High-Efficiency Image Format ( HEIF), and of AV1, named AV1 Still Image File Format ( AVIF). An extension of the HEIF standard known as Multi-Image Application Format ( MIAF) is currently being finalized as well, and as if that weren't enough, the JPEG committee is also working on a novel still-picture coding standard, referred to as JPEG XL, to be completed in October 2019 [ source]. All of these contenders have in common that they provide efficient support for high-resolution, HDR, and wide color gamut (WCG) as well as (partially) transparent image content. I will update this page with further details on each coding specification and comparative demos once evaluation software for all of these coding standards has become publicly available. For now, I can recommend this interactive picture coding demo by Thomas Daede, which is a bit outdated though. See also here and here.
page last modified in July 2019, last changes: mention AV2, update MPEG-5 EVC, subjective VVC test