This_and_That_VDM

This is the official implementation of the VDM part of This&That: Language-Gesture Controlled Video Generation for Robot Planning. Coming soon.