The ORBIT dataset is a collection of videos of objects in clean and cluttered scenes, recorded on mobile phones by people who are blind or have low vision. The dataset is presented with a teachable object recognition benchmark task, which aims to drive few-shot learning on challenging real-world data.
microsoft
machine-learning
computer-vision
video
benchmark
dataset
classification
few-shot-learning
meta-learning
object-recognition
Updated 2024-08-13 03:27:45 +03:00
An official implementation of "UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation"
video
localization
segmentation
caption-task
coin
joint
msrvtt
multimodal-sentiment-analysis
multimodality
pretrain
pretraining
retrieval-task
video-language
video-text
video-text-retrieval
youcookii
alignment
caption
Updated 2024-07-25 14:07:31 +03:00
VideoX: a collection of video cross-modal models
Updated 2024-06-03 05:11:25 +03:00
This project provides an official implementation of our recent work on real-time multi-object tracking in videos. Previous works perform object detection and tracking with two separate models, which makes them very slow. In contrast, we propose a one-stage solution that performs detection and tracking with a single network by solving the alignment problem. The resulting approach achieves strong results in both accuracy and speed: (1) it ranks first among all the trackers on the MOT challenges; (2) it is significantly faster than previous state-of-the-art methods. In addition, it scales gracefully to handle a large number of objects.
Updated 2023-10-04 00:42:58 +03:00
A transductive approach for video object segmentation
Updated 2023-06-27 15:56:56 +03:00
The IntelligentEdgeHOL walks through the process of deploying an Azure IoT Edge module to an Nvidia Jetson Nano device to detect objects in YouTube videos, RTSP streams, or an attached webcam.
Updated 2023-03-28 19:44:29 +03:00
Office 365 Video XBlock for Open edX
Updated 2022-11-28 22:13:05 +03:00
GitHub repo for Azure Video Analyzer
Updated 2022-05-21 04:35:53 +03:00
Python library for transcoding popcorn code into flat video files
Updated 2019-03-29 21:16:31 +03:00
AirMozilla is the video broadcasting site for the Mozilla project
Updated 2018-05-21 13:41:00 +03:00
☎️ 📦 Snap to enable users to chat and make audio/video calls using Spreed.ME from within Nextcloud
Updated 2017-09-06 11:03:51 +03:00