Spatiotemporal Video Quality Assessment Method via Multiple Feature Mappings

Authors

  • Daniel Oppong Bediako, Institute of Image Processing and Pattern Recognition, Xi’an Jiaotong University, Xi’an, China
  • Yi Zhang, Institute of Image Processing and Pattern Recognition, Xi’an Jiaotong University, Xi’an, China
  • Xuanqin Mou, Institute of Image Processing and Pattern Recognition, Xi’an Jiaotong University, Xi’an, China

Keywords:

Full reference video quality assessment, MMSFD-STS, spatiotemporal slice images

Abstract

Advanced video quality assessment (VQA) methods aim to evaluate the perceptual quality of videos in many applications, but often at the cost of increased computational complexity. The difficulty stems from the complexity of the distorted videos, which are of significant concern in the communication industry, and from the two-fold (spatial and temporal) nature of video distortion. The findings of this study indicate that the information in spatiotemporal slice (STS) images is useful for measuring video distortion. This paper focuses on developing a full-reference VQA estimator that integrates several features of the spatiotemporal slices (STSs) of frames to achieve high-performance video quality prediction. We evaluate video quality on several VQA databases through the following steps: (1) we first arrange the reference and test video sequences into a spatiotemporal slice representation and compute a collection of spatiotemporal feature maps for each reference-test pair; these response features are then processed with the structural similarity (SSIM) index to form a local frame quality measure; (2) to further enhance the quality assessment, we combine the spatial feature maps with the spatiotemporal feature maps and propose a VQA model named multiple map similarity feature deviation (MMSFD-STS); (3) we apply a sequential pooling strategy to assemble the frame quality indices into the video quality score; and (4) extensive evaluations on video quality databases show that the proposed VQA algorithm achieves better or competitive performance compared with other state-of-the-art methods.
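As a rough illustration of the pipeline described above, the following Python sketch extracts horizontal STS images from a grayscale video, compares a simple gradient-magnitude feature map of each reference-test slice pair with SSIM, and pools the per-slice similarities by their standard deviation. This is a minimal sketch of the general STS-plus-feature-map idea under stated assumptions, not the authors' MMSFD-STS implementation: the Sobel gradient map, the horizontal-slice-only processing, and the plain deviation pooling are simplifying stand-ins, and the video is assumed to be a NumPy array of shape (frames, height, width).

import numpy as np
from scipy.ndimage import sobel
from skimage.metrics import structural_similarity as ssim

def horizontal_sts(video):
    """Extract horizontal spatiotemporal slice (STS) images.

    video: grayscale array of shape (T, H, W). Fixing one image row and
    stacking it over all T frames yields H slice images of shape (T, W).
    """
    return np.transpose(video, (1, 0, 2))

def gradient_magnitude(img):
    """Sobel gradient magnitude, used here as one simple example feature map."""
    img = img.astype(np.float64)
    return np.hypot(sobel(img, axis=0), sobel(img, axis=1))

def slice_similarity(ref_slice, dst_slice):
    """Compare the feature maps of one reference/test STS pair with SSIM."""
    ref_map = gradient_magnitude(ref_slice)
    dst_map = gradient_magnitude(dst_slice)
    data_range = float(max(ref_map.max(), dst_map.max())) or 1.0
    return ssim(ref_map, dst_map, data_range=data_range)

def video_quality(ref_video, dst_video):
    """Pool per-slice similarities; their standard deviation serves as the
    score (lower deviation means more uniform perceived quality)."""
    scores = [slice_similarity(r, d)
              for r, d in zip(horizontal_sts(ref_video), horizontal_sts(dst_video))]
    return float(np.std(scores))

In the full MMSFD-STS model, multiple spatial and spatiotemporal feature maps are combined, both vertical and horizontal slices are used, and the frame-level quality indices are pooled sequentially into the final score.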

List of Figures

1. Block diagram of the proposed method.

2. Slice images seen along different dimensions of the video: (a) standard video frames; (b) vertical STS images; (c) horizontal STS images.

3. Framework of the proposed multiple map similarity feature deviation (MMSFD-STS) model for spatiotemporal slices.

4. Scatter plots of the proposed method on the LIVE and EPFL-PoliMI databases: (a) subjective MOS scores vs. scores obtained by our method on the EPFL database; (b) subjective MOS scores vs. scores obtained by our method on the PoliMI database; (c) subjective MOS scores vs. scores obtained by our method on the LIVE database.

List of Tables

1. Comparison of results on the LIVE database by SROCC.

2. Comparison of results on the LIVE database by LCC.

3. Performance comparison on the EPFL-PoliMI video database.

4. Runtime comparison of the STS-based VQA algorithms.

Daniel Oppong Bediako received his Bachelor of Engineering (BEng) in Electrical Electronic Engineering from Accra Institute of Technology, Accra, Ghana, in 2013 and his MS degree in Information and Communication Engineering from Xi’an Jiaotong University, Xi’an, Shaanxi, China, in 2017. He is currently pursuing a Ph.D. in Information and Communication Engineering at Xi’an Jiaotong University. His research interests include image and video quality assessment.

Yi Zhang received the B.S. and M.S. degrees in electrical engineering from Northwestern Polytechnical University, Xi'an, China, in 2008 and 2011, respectively, and the Ph.D. degree in electrical engineering from Oklahoma State University, Stillwater, OK, USA, in 2015. From 2016 to 2018, he was a Postdoctoral Research Associate with the Department of Electrical and Electronic Engineering, Shizuoka University, Japan. He is currently a faculty member with the School of Electronic and Information Engineering, Xi'an Jiaotong University, China. His research interests include 2D/3D image processing, machine learning, pattern recognition, and computer vision.

Xuanqin Mou has been with the Institute of Image Processing and Pattern Recognition (IPPR), Electronic and Information Engineering School, Xi’an Jiaotong University, since 1987. He has been an associate professor since 1997 and a professor since 2002, and he is currently the director of IPPR. Prof. Mou has authored or coauthored more than 200 peer-reviewed journal or conference papers. He has been granted the Technology Academy Award for invention by the Ministry of Education of China and the Technology Academy Awards from the Government of Shaanxi Province, China.

Published

2022-01-30

How to Cite

Bediako, D. O., Zhang, Y., & Mou, X. (2022). Spatiotemporal Video Quality Assessment Method via Multiple Feature Mappings. International Journal of Sciences: Basic and Applied Research (IJSBAR), 61(1), 291–310. Retrieved from https://gssrr.org/index.php/JournalOfBasicAndApplied/article/view/13257

Section

Articles