Twenty years of Frame Interpolation for Retiming in the Movies
Date & Time
Wednesday, November 11, 2020, 7:30 PM - 8:00 PM
Session Type
Technical session
Anil Kokaram
Frame interpolation is the process of synthesizing a new frame in between existing frames in an image sequence. It has emerged as an important module in motion picture effects. It is also a key component in television standards conversion, used for instance in high-frame-rate TV sets to upsample the frame rate of received signals, and to convert between worldwide television standards, e.g. EU and US formats (25 fps and 30 fps respectively). All of these processes manipulate time and space in a sequence of frames, so frame interpolation underlies every one of them.
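To make the 25 fps to 30 fps case concrete, the sketch below computes where each output frame falls between two source frames. It is an illustration only: the function name `interpolation_schedule` and the "left index plus phase" representation are assumptions made here, not something taken from the paper, and it uses the nominal rates quoted above.

```python
from fractions import Fraction
from math import floor


def interpolation_schedule(src_fps, dst_fps, n_out):
    """For each output frame, return (left source frame index, phase).

    The phase is the fractional position between the left source frame
    and the next one: 0 means "copy the source frame", anything else
    means a new in-between frame has to be synthesized.
    """
    schedule = []
    for k in range(n_out):
        # Time of output frame k, measured in source-frame units.
        pos = Fraction(k * src_fps, dst_fps)
        left = floor(pos)
        schedule.append((left, pos - left))
    return schedule


# Hypothetical example: convert 25 fps material to 30 fps.
schedule = interpolation_schedule(25, 30, 7)
for k, (left, phase) in enumerate(schedule):
    print(f"output {k}: between source {left} and {left + 1}, phase {phase}")
```

Under these assumptions only every sixth output frame coincides with a source frame; the other five have to be interpolated, which is why the quality of the interpolator matters so much in standards conversion.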
This paper marks 2019/20 as a technological milestone: 20 years after the movie "The Matrix" popularised motion-based effects (like slomo and BulletTime/Timeslice), we are starting to see rapid development of the technology once more, from a totally different perspective. Traditional slomo or view interpolation has been achieved by motion-based pixel pushing for a very long time. Two of the industry-leading software tools, Twixtor and Kronos/Nuke, have relied on modeling motion in some way since 2000. Since about 2016, however, Deep Learning strategies have been deployed. Deep Learning and AI are perceived as being vastly superior for image and video processing, but we find that for high-quality material it is only in the last year that Deep Learning has started to compete effectively with the traditional motion-based approaches.
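As a toy illustration of what "motion-based pixel pushing" means, the sketch below compares a plain cross-fade with a motion-compensated in-between frame on a one-dimensional "image". It assumes a single, already known global displacement `d`, which is a strong simplification: production tools estimate motion per pixel and must handle occlusion, and this is not the method used by Twixtor or Kronos. All names here are hypothetical.

```python
def blend(prev, nxt, t):
    """Cross-fade between two frames with no motion model."""
    return [(1 - t) * a + t * b for a, b in zip(prev, nxt)]


def motion_compensated(prev, nxt, t, d):
    """Interpolate at time t in [0, 1] along a known global displacement d.

    Assumption: every pixel moves d samples from prev to nxt, and t * d
    is an integer, so no sub-pixel resampling is needed.
    """
    n = len(prev)
    back = round(t * d)   # distance travelled since prev
    fwd = d - back        # distance still to travel to nxt
    out = []
    for x in range(n):
        xp = min(max(x - back, 0), n - 1)  # where this pixel came from
        xn = min(max(x + fwd, 0), n - 1)   # where this pixel is going
        out.append((1 - t) * prev[xp] + t * nxt[xn])
    return out


# A single bright "object" moves from index 2 to index 6.
prev = [0, 0, 1, 0, 0, 0, 0, 0, 0]
nxt = [0, 0, 0, 0, 0, 0, 1, 0, 0]

blended = blend(prev, nxt, 0.5)
mc = motion_compensated(prev, nxt, 0.5, d=4)
print("blend:", blended)
print("motion-compensated:", mc)
```

The cross-fade leaves two half-brightness ghosts at the old and new positions, while the motion-compensated frame places one full-brightness object halfway along its path. That difference is the basic reason retiming tools model motion rather than simply mixing frames.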
This paper provides a review of the technology used to create in-between frames and presents a probabilistic framework that generalizes frame interpolation algorithms using the concept of motion interpolation.
We explore what Deep Learning has to bring to this field and introduce ideas about hybrid schemes that build on the best of both worlds. Our results show where the traditional motion-based ideas remain more efficient and where the new techniques add value, e.g. in occluded areas. Unlike other work in this area, we attempt to benchmark both academic and industrial toolkits. We also discuss the impact of training data on the measurement of performance, and find that at high resolution the performance of the recent techniques suffers from poor motion handling.
Technical Depth of Presentation
Intermediate/Advanced. This paper is rather technical in nature but has a historical perspective dating to the origins of motion-estimation-based retiming in the early 2000s.
What Attendees will Benefit Most from this Presentation
Engineers, Technologists, Students, Managers
Take-Aways from this Presentation
1. Although it has been 20 years since this kind of technology was first demonstrated, the last 5 years have seen the birth of a disruption in the area.
2. Deep Learning is improving our ability to perform these kinds of tasks, but it is only now, in 2019/20, starting to compete with existing traditional ideas.
3. Be careful when using Deep Learning algorithms for this kind of task. They may not deliver the leap in performance you expect on high-resolution material.