Using this we can eliminate large numbers of calls to predict intra,
and is also faster than most of the variance functions it replaces.
This is an equivalence transform so coding performance is unaffected.
Encoder speedup is approx 7% when var_tx, super_tx and ext_tx are all
enabled.
Change-Id: I0d4c83afc4a97a1826f3abd864bd68e41bb504fb