Abstract

This talk focuses on the optimization of AV1, the newly finalized open-source, royalty-free video codec standard. We will present AV1 software encoding capabilities for deployment in various use cases, including VOD, live streaming, and real-time communications (RTC), using our Aurora1 AV1 encoder as an example. First, we will give an overview of this new codec standard and illustrate how it differs from its predecessors, e.g. H.264/AVC, H.265/HEVC, and VP9. In particular, we will highlight the Screen Content Coding (SCC) and Film Grain Synthesis (FGS) coding tools that are unique to AV1. We will describe the essential approaches behind these coding tools and elaborate on their contribution to, and influence on, the industry and end users. Second, we will present the superior performance of Aurora1 AV1 across these use cases, from VOD to live to RTC, each with different requirements for encoding speed, latency, and underlying computational resources. We will highlight the influence AV1 has had on the industry through its open-source, royalty-free nature and its wide support by browsers and by WebRTC, the most widely acknowledged open-source RTC platform. Third, we will focus on the use of machine learning and content-adaptive encoding approaches to optimize AV1 encoding. We will discuss the joint use of neural networks and traditional image processing algorithms to optimize an encoder so that it produces the fewest bits while delivering even better visual quality. Lastly, we will touch on several related topics, all essential to the final deployment of AV1 in real applications: decoder-complexity awareness in encoder optimization, Blind Video Quality Assessment (B-VQA), and the per-title encoding approach facilitated by machine learning.
As a bonus, as an entrepreneur who devoted myself to the startup Visionular at a late career stage, after years of working solely as a software engineer, I would love to share my entrepreneurship experiences with those who are interested.

Bio

Zoe Liu is the Co-Founder of Visionular, a tech startup headquartered in Silicon Valley, California, with close to 60 team members distributed across North America, Asia, and Europe. Visionular provides both on-premises and cloud-based video encoding, processing, and streaming solutions and services, and has served more than 60 commercial enterprise customers worldwide. Zoe graduated from Purdue in August 2004 with a Ph.D. earned in the VIPER lab under the supervision of Prof. Edward Delp. Her Ph.D. thesis was titled "Layered Scalable And Low Complexity Video Encoding: New Approaches And Theoretic Analysis". Zoe was previously a software engineer with the Google WebM team and has been a key contributor to the newly finalized royalty-free video codec standard AOM/AV1. She has devoted herself to video codec and real-time communications (RTC) technologies for 20+ years, and has contributed to several world-class RTC products, including Apple FaceTime and Google Glass Video Call. She was a 2018 Google I/O speaker. She has published 50 peer-reviewed international conference and journal papers, including an invited paper co-authored with Prof. Maggie Zhu of Purdue ECE, published in the prestigious IEEE journal Proceedings of the IEEE, addressing state-of-the-art technologies in AI+Video Codec.

Hosts

Professor Maggie Zhu, zhu0@purdue.edu, 765-496-0407, and Professor Edward Delp, ace@purdue.edu, 765-494-1740.

Communications, Networking, Signal & Image Processing