Video Multimethod Assessment Fusion (VMAF) is an objective full-reference video quality metric developed by Netflix in cooperation with the University of Southern California, the IPI/LS2N lab at Nantes Université, and the Laboratory for Image and Video Engineering (LIVE) at The University of Texas at Austin. It predicts subjective video quality based on a reference and distorted video sequence. The metric can be used to evaluate the quality of different video codecs, encoders, encoding settings, or transmission variants.

History

The metric is based on initial work from the group of Professor C.-C. Jay Kuo at the University of Southern California, which investigated fusing different video quality metrics using support vector machines (SVM). This led to the "FVQA (Fusion-based Video Quality Assessment) Index", which was shown to outperform existing image quality metrics on a subjective video quality database.

The method was further developed in cooperation with Netflix, using different subjective video datasets, including a Netflix-owned dataset ("NFLX"). Subsequently renamed "Video Multimethod Assessment Fusion", it was announced on the ''Netflix TechBlog'' in June 2016, and version 0.3.1 of the reference implementation was made available under a permissive open-source license.

In 2017, the metric was updated to support a custom model that includes an adaptation for cellular phone screen viewing, generating higher quality scores for the same input material. In 2018, a model that predicts the quality of up to 4K resolution content was released. The datasets on which these models were trained have not been made available to the public.

In 2021, a Technology and Engineering Emmy Award was awarded to Beamr, Netflix, University of Southern California, University of Nantes, SSIMWAVE, Disney, Google, Brightcove and ATEME for the Development of Open Perceptual Metrics for Video Encoding Optimization. It was the second time in 20 years that universities received an Emmy Award, and the first time a French university received one.

Components

VMAF uses existing image quality metrics and other features to predict video quality:
* Visual Information Fidelity (VIF): considers information fidelity loss at four different spatial scales
* Detail Loss Metric (DLM): measures loss of details, and impairments which distract viewer attention
* Mean Co-Located Pixel Difference (MCPD): measures temporal difference between frames on the luminance component

The above features are fused using an SVM-based regression to provide a single output score in the range of 0–100 per video frame, with 100 being quality identical to the reference video. These scores are then temporally pooled over the entire video sequence using the arithmetic mean to provide an overall differential mean opinion score (DMOS).

Because the training source code ("VMAF Development Kit", VDK) is publicly available, the fusion method can be re-trained and evaluated based on different video datasets and features.

Anti-noise signal-to-noise ratio (AN-SNR) was used in earlier versions of VMAF as a quality metric but was subsequently abandoned.
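
As a rough illustration of two steps described in this section, the following Python sketch computes a simplified mean co-located pixel difference between two luminance frames and pools per-frame scores with the arithmetic mean. It is not the reference implementation: the function names are hypothetical, the frames are plain nested lists rather than decoded video, the clamping of scores to the 0–100 range is an assumption, and the SVM-based fusion step is omitted entirely.

```python
from statistics import fmean
from typing import Sequence

# A luminance plane as rows of pixel values (hypothetical toy representation).
Frame = Sequence[Sequence[float]]


def mean_colocated_pixel_difference(prev: Frame, curr: Frame) -> float:
    """Mean absolute luma difference between pixels at the same position.

    Simplified sketch: the real feature may apply additional filtering.
    """
    diffs = [
        abs(c - p)
        for prev_row, curr_row in zip(prev, curr)
        for p, c in zip(prev_row, curr_row)
    ]
    return fmean(diffs)


def pool_frame_scores(frame_scores: Sequence[float]) -> float:
    """Arithmetic-mean temporal pooling of per-frame scores.

    Assumption: scores are clamped to the 0-100 range before pooling.
    """
    clamped = [max(0.0, min(100.0, s)) for s in frame_scores]
    return fmean(clamped)


# Toy data, not taken from any real video.
frame_a = [[0, 0], [0, 0]]
frame_b = [[2, 4], [6, 8]]
print(mean_colocated_pixel_difference(frame_a, frame_b))  # 5.0
print(pool_frame_scores([90.0, 110.0, -10.0, 80.0]))      # 67.5
```

Mean pooling treats every frame equally, so a short stretch of very poor frames has limited influence on the sequence-level score.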

Performance

An early version of VMAF has been shown to outperform other image and video quality metrics such as SSIM, PSNR-HVS and VQM-VFD on three of four datasets in terms of prediction accuracy, when compared to subjective ratings. Its performance has also been analyzed in another paper, which found that VMAF did not perform better than SSIM and MS-SSIM on a video dataset. In 2017, engineers from RealNetworks reported good reproducibility of Netflix's performance findings. In the MSU video quality metrics benchmark, where its various versions (including VMAF NEG) were tested, VMAF outperformed all other metrics on all compression standards (H.265, VP9, AV1, VVC).

VMAF scores can be artificially increased without improving perceived quality by applying various operations before or after distorting the video, sometimes without impacting the popular PSNR metric.
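
Prediction accuracy against subjective ratings is commonly summarized with statistics such as the Pearson correlation coefficient and the root-mean-square error between metric scores and subjective scores. The text above does not say which statistics the cited comparisons used, so the Python sketch below is only an assumed illustration; the function names and the sample numbers are hypothetical.

```python
from math import sqrt
from statistics import fmean
from typing import Sequence


def pearson_correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson linear correlation between two equally long score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = fmean(x), fmean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def rmse(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Root-mean-square error between predicted and subjective scores."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return sqrt(fmean([(p - a) ** 2 for p, a in zip(predicted, actual)]))


# Made-up numbers for illustration only, not measured data.
metric_scores = [35.0, 52.0, 70.0, 88.0]
subjective_scores = [30.0, 55.0, 68.0, 90.0]
print(pearson_correlation(metric_scores, subjective_scores))
print(rmse(metric_scores, subjective_scores))
```

A higher correlation and a lower error indicate closer agreement with the subjective ratings on the dataset in question; results on one dataset do not necessarily carry over to another.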

Software

A reference implementation written in C and Python ("VMAF Development Kit, VDK") is published as free software under the terms of the BSD+Patent license. Its source code and additional material are available on GitHub.
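
One way to compute VMAF in practice, not described above, is FFmpeg's `libvmaf` filter. The Python sketch below only assembles and optionally runs such a command. It assumes an FFmpeg build with libvmaf enabled; the helper names and file names are hypothetical, and paths containing characters that are special in FFmpeg filter strings (such as `:`) would need escaping that this sketch does not perform.

```python
import subprocess
from typing import List


def build_vmaf_command(
    distorted: str, reference: str, log_path: str = "vmaf.json"
) -> List[str]:
    """Build an ffmpeg command that scores `distorted` against `reference`.

    The libvmaf filter takes the distorted video as its first input and
    the reference as its second, and writes a JSON log to `log_path`.
    """
    filter_spec = f"libvmaf=log_path={log_path}:log_fmt=json"
    return [
        "ffmpeg",
        "-i", distorted,   # first input: distorted video
        "-i", reference,   # second input: reference video
        "-lavfi", filter_spec,
        "-f", "null", "-",  # discard the decoded output, keep only the log
    ]


def run_vmaf(distorted: str, reference: str, log_path: str = "vmaf.json") -> None:
    """Run the command; requires ffmpeg with libvmaf on the PATH (assumption)."""
    subprocess.run(build_vmaf_command(distorted, reference, log_path), check=True)


# Hypothetical file names; nothing is executed here.
print(" ".join(build_vmaf_command("distorted.mp4", "reference.mp4")))
```

Both inputs generally need the same resolution and frame alignment for the comparison to be meaningful, so a scaled or trimmed encode may have to be converted back before scoring.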


External links