Visual information fidelity

Visual information fidelity (VIF) is a full-reference image quality assessment index based on natural scene statistics and the notion of image information extracted by the human visual system.[1] It was developed by Hamid R. Sheikh and Alan Bovik at the Laboratory for Image and Video Engineering (LIVE) at the University of Texas at Austin in 2006. It is deployed at the core of the Netflix VMAF video quality monitoring system, which controls the picture quality of all encoded videos streamed by Netflix.

System model

Source model

A Gaussian scale mixture (GSM) is used to statistically model the wavelet coefficients of a steerable pyramid decomposition of an image.[2] The model is described below for a given subband of the multi-scale, multi-orientation decomposition and can be extended to other subbands similarly. Let the wavelet coefficients in a given subband be $\mathcal{C} = \{\bar{C}_i : i \in \mathcal{I}\}$, where $\mathcal{I}$ denotes the set of spatial indices across the subband and each $\bar{C}_i$ is an $M$-dimensional vector. The subband is partitioned into non-overlapping blocks of $M$ coefficients each, where each block corresponds to $\bar{C}_i$. According to the GSM model, $\mathcal{C} = \mathcal{S} \cdot \mathcal{U} = \{S_i \bar{U}_i : i \in \mathcal{I}\}$, where $S_i$ is a positive scalar and $\bar{U}_i$ is a Gaussian vector with mean zero and covariance $\mathbf{C}_U$. Further, the non-overlapping blocks are assumed to be independent of each other, and the random field $\mathcal{S}$ is assumed to be independent of $\mathcal{U}$.
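
As an illustration of the source model, the following minimal sketch draws blocks from a GSM with NumPy. The function name `sample_gsm_blocks`, the example covariance, and the log-normal distribution chosen for the scalars are assumptions made for this sketch only; the model itself requires only that each scalar be positive and independent of the Gaussian field.

```python
import numpy as np

def sample_gsm_blocks(n_blocks, cov_u, rng):
    """Draw n_blocks vectors C_i = S_i * U_i from a Gaussian scale mixture."""
    m = cov_u.shape[0]
    # U_i: zero-mean Gaussian vectors with covariance C_U, one row per block.
    u = rng.multivariate_normal(np.zeros(m), cov_u, size=n_blocks)
    # S_i: positive scalars, drawn independently of U_i. The log-normal law
    # is an assumption of this sketch, not part of the GSM definition.
    s = rng.lognormal(mean=0.0, sigma=0.5, size=n_blocks)
    c = s[:, None] * u
    return c, s

rng = np.random.default_rng(0)
cov_u = np.array([[1.0, 0.3], [0.3, 1.0]])  # hypothetical C_U for M = 2
c, s = sample_gsm_blocks(10_000, cov_u, rng)

# Conditioned on S_i = s_i, block C_i is Gaussian with covariance s_i^2 C_U,
# so the sample covariance of C_i / s_i should be close to C_U.
print(np.cov((c / s[:, None]).T))
```

Dividing each block by its scalar recovers samples of the underlying Gaussian field, which is why conditioning on the scalars makes the information terms of the VIF index tractable.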

Distortion model

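The distortion process is modeled using a combination of signal attenuation and additive noise in the wavelet domain. Mathematically, if $\mathcal{D} = \{\bar{D}_i : i \in \mathcal{I}\}$ denotes the random field from a given subband of the distorted image, $\mathcal{G} = \{g_i : i \in \mathcal{I}\}$ is a deterministic scalar field and $\mathcal{V} = \{\bar{V}_i : i \in \mathcal{I}\}$, where $\bar{V}_i$ is a zero-mean Gaussian vector with covariance $\mathbf{C}_V = \sigma_v^2 \mathbf{I}$, then

$$\mathcal{D} = \mathcal{G}\mathcal{C} + \mathcal{V}.$$

Further, $\mathcal{V}$ is modeled to be independent of $\mathcal{S}$ and $\mathcal{U}$.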
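
The distortion model can be sketched in a few lines of NumPy. The function name `distort_blocks`, the array layout (one row per block) and the example values are hypothetical; how the gains and the noise variance are estimated from a real image pair is not covered here.

```python
import numpy as np

def distort_blocks(c, g, sigma_v, rng):
    """Apply D_i = g_i * C_i + V_i to an (N, M) array of coefficient blocks."""
    g = np.asarray(g, dtype=float)
    # V_i: zero-mean white Gaussian noise with covariance sigma_v^2 * I.
    v = sigma_v * rng.standard_normal(c.shape)
    # Each block is scaled by its own deterministic gain g_i.
    return g[:, None] * c + v

rng = np.random.default_rng(1)
c = rng.standard_normal((4, 3))        # stand-in reference blocks (N = 4, M = 3)
g = np.array([1.0, 0.8, 0.5, 0.0])     # per-block attenuation
d = distort_blocks(c, g, sigma_v=0.1, rng=rng)
print(d)
```

With every gain equal to 1 and `sigma_v` equal to 0 the blocks pass through unchanged, while a gain of 0 leaves only noise in the corresponding block.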

HVS model

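The duality of human visual system (HVS) models and natural scene statistics (NSS) implies that several aspects of the HVS have already been accounted for in the source model. Here, the HVS is additionally modeled based on the hypothesis that uncertainty in the perception of visual signals limits the amount of information that can be extracted from the source and distorted images. This uncertainty can be modeled as visual noise in the HVS model. In particular, the HVS noise in a given subband of the wavelet decomposition is modeled as additive white Gaussian noise. Let $\mathcal{N} = \{\bar{N}_i : i \in \mathcal{I}\}$ and $\mathcal{N}' = \{\bar{N}'_i : i \in \mathcal{I}\}$ be random fields, where $\bar{N}_i$ and $\bar{N}'_i$ are zero-mean Gaussian vectors with covariances $\mathbf{C}_N$ and $\mathbf{C}'_N$. The expressions for the VIF index take both covariances to be $\sigma_n^2 \mathbf{I}$. Further, let $\mathcal{E}$ and $\mathcal{F}$ denote the visual signals at the output of the HVS for the reference and the distorted image, respectively. Mathematically, $\mathcal{E} = \mathcal{C} + \mathcal{N}$ and $\mathcal{F} = \mathcal{D} + \mathcal{N}'$. Note that $\mathcal{N}$ and $\mathcal{N}'$ are random fields that are independent of $\mathcal{S}$, $\mathcal{U}$ and $\mathcal{V}$.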

VIF index

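Let $\bar{C}^N = (\bar{C}_1, \bar{C}_2, \ldots, \bar{C}_N)$ denote the vector of all blocks from a given subband. Let $S^N$, $\bar{D}^N$, $\bar{E}^N$ and $\bar{F}^N$ be similarly defined. Let $s^N$ denote the maximum likelihood estimate of $S^N$ given $\bar{C}^N$ and $\mathbf{C}_U$. The amount of information extracted from the reference is obtained as

$$I(\bar{C}^N; \bar{E}^N \mid S^N = s^N) = \frac{1}{2} \sum_{i=1}^{N} \log_2 \left( \frac{\left| s_i^2 \mathbf{C}_U + \sigma_n^2 \mathbf{I} \right|}{\left| \sigma_n^2 \mathbf{I} \right|} \right),$$

while the amount of information extracted from the test image is given as

$$I(\bar{C}^N; \bar{F}^N \mid S^N = s^N) = \frac{1}{2} \sum_{i=1}^{N} \log_2 \left( \frac{\left| g_i^2 s_i^2 \mathbf{C}_U + (\sigma_v^2 + \sigma_n^2) \mathbf{I} \right|}{\left| (\sigma_v^2 + \sigma_n^2) \mathbf{I} \right|} \right).$$

Denoting the $N$ blocks in subband $j$ of the wavelet decomposition by $\bar{C}^{N,j}$, and similarly for the other variables, the VIF index is defined as

$$\mathrm{VIF} = \frac{\sum_{j \in \text{subbands}} I(\bar{C}^{N,j}; \bar{F}^{N,j} \mid S^{N,j} = s^{N,j})}{\sum_{j \in \text{subbands}} I(\bar{C}^{N,j}; \bar{E}^{N,j} \mid S^{N,j} = s^{N,j})}.$$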
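
The two information terms and their ratio can be evaluated numerically as in the sketch below. It is an illustration under stated assumptions, not a reference implementation: the function names `subband_information` and `vif`, the tuple layout used to describe a subband, and all numeric values are hypothetical, and the per-block estimates of the scalars, the gains and the distortion noise variance are taken as given inputs rather than estimated from images.

```python
import numpy as np

def subband_information(s, g, cov_u, sigma_v2, sigma_n2):
    """Return (reference information, test information) for one subband."""
    s = np.asarray(s, dtype=float)
    g = np.asarray(g, dtype=float)
    m = cov_u.shape[0]
    eye = np.eye(m)

    def half_sum_log2_det_ratio(scale, noise_var):
        # Stack of N matrices scale_i * C_U + noise_var * I, shape (N, M, M).
        mats = scale[:, None, None] * cov_u + noise_var * eye
        _, logdet = np.linalg.slogdet(mats)       # natural log of |.|
        logdet_noise = m * np.log(noise_var)      # log |noise_var * I|
        # Convert from natural log to log base 2 and apply the factor 1/2.
        return 0.5 * np.sum(logdet - logdet_noise) / np.log(2.0)

    info_ref = half_sum_log2_det_ratio(s**2, sigma_n2)
    info_test = half_sum_log2_det_ratio(g**2 * s**2, sigma_v2 + sigma_n2)
    return info_ref, info_test

def vif(subbands, sigma_n2):
    """Ratio of summed test information to summed reference information.

    `subbands` is a list of (s, g, cov_u, sigma_v2) tuples, one per subband.
    """
    num = 0.0
    den = 0.0
    for s, g, cov_u, sigma_v2 in subbands:
        info_ref, info_test = subband_information(s, g, cov_u, sigma_v2, sigma_n2)
        num += info_test
        den += info_ref
    return num / den

cov_u = np.array([[1.0, 0.3], [0.3, 1.0]])   # hypothetical C_U, M = 2
s = np.array([0.5, 1.0, 2.0])                # hypothetical estimates s_i
sigma_n2 = 0.1                               # arbitrary HVS noise variance

identical = [(s, np.ones(3), cov_u, 0.0)]      # g_i = 1, no added noise
degraded = [(s, np.full(3, 0.5), cov_u, 0.2)]  # attenuation plus noise

vif_identical = vif(identical, sigma_n2)
vif_degraded = vif(degraded, sigma_n2)
print(vif_identical)   # 1.0
print(vif_degraded)    # strictly between 0 and 1
```

When the test image equals the reference, the numerator and denominator coincide and the index is 1. Attenuation or added noise lowers the numerator, while gains above 1 without added noise raise it, so the formula permits values greater than 1.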

Performance

Spearman's rank-order correlation coefficient (SROCC) between the VIF index scores of distorted images on the LIVE Image Quality Assessment Database and the corresponding human opinion scores is reported to be 0.96.[citation needed]
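
For readers unfamiliar with the statistic, the sketch below computes SROCC as the Pearson correlation of the ranks of two score lists. The scores are invented for illustration and are not taken from the LIVE database; the helper names are likewise hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to the end of the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def srocc(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical scores for five distorted images (higher means better quality).
vif_scores = [0.95, 0.80, 0.62, 0.40, 0.15]
opinion_scores = [88, 75, 79, 41, 20]
print(srocc(vif_scores, opinion_scores))  # 0.9
```

Because only ranks enter the computation, SROCC measures how well the index preserves the ordering of images by perceived quality, regardless of any monotonic rescaling of the scores.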

References

  1. ^ [Citation text missing from source]
  2. ^ [Citation text missing from source]