Bridging communication gaps between hearing and hearing-impaired individuals is an important challenge in assistive ...
This repository provides a batch inference pipeline that uses TransVG for Track 1 (VG-RS) of the Multimodal Reasoning Competition. Given a set of image-question pairs, the model outputs the corresponding bounding ...
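
As a rough illustration of what batch inference over image-question pairs can look like, here is a minimal sketch. It is not this repository's code and does not call the TransVG API. The names `run_batch_inference`, `predict_batch`, and `dummy_predict_batch`, the record keys `image_path`, `question`, and `bbox`, and the `(x1, y1, x2, y2)` box layout are all assumptions made for the example. A real grounding model would take the place of the dummy predictor.

```python
from typing import Callable, Dict, Iterator, List, Sequence, Tuple

# Assumed box layout for this sketch: (x1, y1, x2, y2).
Box = Tuple[float, float, float, float]
Pair = Dict[str, str]  # hypothetical record: {"image_path": ..., "question": ...}


def batched(items: Sequence[Pair], size: int) -> Iterator[Sequence[Pair]]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_batch_inference(
    pairs: Sequence[Pair],
    predict_batch: Callable[[Sequence[Pair]], List[Box]],
    batch_size: int = 4,
) -> List[dict]:
    """Run a predictor over all pairs in batches and collect one box per pair."""
    results: List[dict] = []
    for chunk in batched(pairs, batch_size):
        boxes = predict_batch(chunk)
        # Guard against a predictor that drops or duplicates samples.
        if len(boxes) != len(chunk):
            raise ValueError("predictor returned a different number of boxes")
        for pair, box in zip(chunk, boxes):
            results.append(
                {
                    "image_path": pair["image_path"],
                    "question": pair["question"],
                    "bbox": [float(v) for v in box],
                }
            )
    return results


def dummy_predict_batch(chunk: Sequence[Pair]) -> List[Box]:
    """Placeholder predictor: returns a fixed box for every pair."""
    return [(0.0, 0.0, 1.0, 1.0) for _ in chunk]


if __name__ == "__main__":
    demo_pairs = [
        {"image_path": "img_%d.jpg" % i, "question": "Where is the object?"}
        for i in range(5)
    ]
    for record in run_batch_inference(demo_pairs, dummy_predict_batch, batch_size=2):
        print(record)
```

Passing the predictor in as a callable keeps the batching loop independent of any particular model, so the same loop can be exercised with a stub before a real checkpoint is loaded.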