Motion Compensated Frame Rate Upconversion

For many applications, video is transmitted over bandwidth-limited channels, and low frame rates combined with high compression ratios have to be used to meet the bit-rate constraints. A typical example of such an application is digital video broadcasting for mobile devices (e.g., DVB-H), where relatively low frame rates of 15 fps are common. As a lower frame rate also translates into a reduced bit rate, a frame rate of 15 fps allows operators to offer more channels in the same amount of spectrum than higher frame rates would. Unfortunately, the low frame rate also causes motion in the video sequence to appear uneven or jerky, as the human visual system starts to notice individual frames once the frame rate falls below a certain value.

Motion compensated frame rate upconversion (MCFRUC) can be used to increase the frame rate of a video sequence. As the upconversion algorithm takes motion into account, objects in the interpolated frames appear at the correct locations. An example for an upconversion ratio of 2, where the interpolated frame is temporally located midway between the original frames, is shown in Fig. 1. Due to the higher frame rate, motion appears smoother than in the original sequence, and watching the sequence is more enjoyable for the viewer.
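The timing of the upconverted stream for a given ratio is easy to write down. The small sketch below is only an illustration (the function name and signature are ours, not part of the method): original frames keep their presentation instants, and the interpolated frames are spaced evenly in each gap, which for a ratio of 2 is exactly the temporal midpoint.

```python
def output_timestamps(fps_in, ratio, n_frames):
    """Presentation times (in seconds) of the upconverted stream.

    Original frames keep their instants; each gap between two
    originals receives ratio-1 interpolated frames, spaced evenly
    (the temporal midpoint for ratio 2).
    """
    dt = 1.0 / (fps_in * ratio)
    # n originals plus (n-1)*(ratio-1) interpolated frames
    return [k * dt for k in range(n_frames * ratio - (ratio - 1))]
```

For 15 fps input and ratio 2, this yields timestamps at multiples of 1/30 s: every odd index is an interpolated frame sitting midway between two originals.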

Fig. 1: Original (left and right) and interpolated frame obtained by MCFRUC

In order to generate interpolated frames, an MCFRUC method estimates the motion between the original frames of the video sequence and uses motion compensation to obtain the interpolated frames. A problem is that motion estimation can fail in some cases, e.g., when objects move quickly or new objects appear in the sequence. The erroneous motion estimates cause upconversion artifacts, which can be annoying to the viewer and significantly reduce the perceived video quality.

We propose an MCFRUC method that uses multiple interpolation paths and a median filter in order to mitigate upconversion artifacts. The effect of the median filter is illustrated in Fig. 2 for a frame of the Foreman sequence. The frames (a), (b), and (c) are the inputs of the median filter; frame (d) is the output. As can be seen, frame (d) has significantly fewer artifacts than any of the input frames.

Fig. 2: Effect of the median filter
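The fusion step itself can be sketched as a pixel-wise median over the candidate frames (the candidates here stand in for the interpolation paths; how those paths are constructed is part of the method and is not shown):

```python
import numpy as np

def median_fuse(candidates):
    """Pixel-wise median across candidate interpolated frames.

    With three candidates, an artifact that appears in only one of
    them is voted out: at each pixel, the median keeps the value the
    other two candidates agree on.
    """
    return np.median(np.stack(candidates, axis=0), axis=0)
```

The appeal of the median is that it needs no explicit artifact detection: as long as a gross error does not occur in the majority of the interpolation paths at the same pixel, it cannot reach the output.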

Our method also uses an efficient motion estimation scheme and is intended for real-time operation on mobile devices, such as cell phones or media players. We are currently in the process of publishing our work; please check back in the near future for updates. An upconversion result for the Foreman sequence is shown in Fig. 3 (the video is about 14 MB, so please allow some time for it to load).

Fig. 3: Original (left, 15 fps) and upconverted sequence (right, 30 fps) obtained by our method