Abstract: Image and text matching plays a crucial role in bridging the cross-modal gap between vision and language, and has achieved great progress due to the deep learning. However, the existing ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results