An official implementation of "UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation"
Updated 2024-07-25 14:07:31 +03:00
Bidirectionally transformed strings
Updated 2024-01-12 00:12:08 +03:00
This project provides an official implementation of our recent work on real-time multi-object tracking in videos. Previous works perform object detection and tracking with two separate models, which makes them very slow. In contrast, we propose a one-stage solution that performs detection and tracking with a single network by elegantly solving the alignment problem. The resulting approach achieves groundbreaking results in both accuracy and speed: (1) it ranks first among all trackers on the MOT challenges; (2) it is significantly faster than previous state-of-the-art methods. In addition, it scales gracefully to handle a large number of objects.
Updated 2023-10-04 00:42:58 +03:00
This is an implementation of the AAAI'20 paper "Semantics-Aligned Representation Learning for Person Re-identification". We leverage dense semantics to address both the spatial misalignment and semantic misalignment challenges in person re-identification.
Updated 2023-06-12 21:21:23 +03:00
DeepSpeech-based forced alignment tool
Updated 2020-08-11 12:02:04 +03:00
Create a backlog of actionable work items in alignment with various processes based on proven practices.
Updated 2020-07-23 05:23:15 +03:00