Used in video understanding; see Learning from Unlabeled Video.
More general: equivariant feature learning.
Learning invariant feature hierarchies
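
A short sketch of why equivariance is the more general notion. The symbols here (feature map f, input transformation T_g, feature-space map M_g) are notation assumed for illustration and do not come from the note itself.

```latex
% Equivariance: transforming the input by T_g changes the features
% in a predictable way, given by a map M_g acting on feature space.
f(T_g x) = M_g \, f(x) \quad \text{for all } g

% Invariance is the special case where M_g is the identity,
% so the features do not change at all under the transformation.
M_g = I \;\Longrightarrow\; f(T_g x) = f(x)
```

An invariant feature discards the transformation, while an equivariant feature keeps it in a structured form.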