This package features data-science-related tasks for developing new recognizers for Presidio. It is used to evaluate the entire system, as well as specific PII recognizers or PII detection models.
machine-learning
deep-learning
nlp
natural-language-processing
privacy
ner
transformers
pii
named-entity-recognition
spacy
flair
Updated 2024-11-22 01:17:04 +03:00
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
nlp
multimodal
beit
beit-3
deepnet
document-ai
foundation-models
kosmos
kosmos-1
layoutlm
layoutxlm
llm
minilm
mllm
pre-trained-model
textdiffuser
trocr
unilm
xlm-e
Updated 2024-11-09 13:45:59 +03:00
The ORBIT dataset is a collection of videos of objects in clean and cluttered scenes recorded by people who are blind/low-vision on a mobile phone. The dataset is presented with a teachable object recognition benchmark task which aims to drive few-shot learning on challenging real-world data.
microsoft
machine-learning
computer-vision
video
benchmark
dataset
classification
few-shot-learning
meta-learning
object-recognition
Updated 2024-08-13 03:27:45 +03:00
An official implementation of "UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation"
video
localization
segmentation
caption-task
coin
joint
msrvtt
multimodal-sentiment-analysis
multimodality
pretrain
pretraining
retrieval-task
video-language
video-text
video-text-retrieval
youcookii
alignment
caption
Updated 2024-07-25 14:07:31 +03:00
Decision task for marian-dev
Updated 2024-05-21 01:05:14 +03:00
Code to reproduce experiments in the paper "Task-Oriented Dialogue as Dataflow Synthesis" (TACL 2020).
Updated 2024-05-01 00:56:47 +03:00
A GA4GH Task Execution Service (TES) compatible implementation for Azure Compute
Updated 2023-09-29 18:05:54 +03:00
Multi-Task Deep Neural Networks for Natural Language Understanding
Updated 2023-06-13 00:28:35 +03:00
Scripts to parse arxiv documents for NLP tasks
Updated 2023-06-12 22:03:00 +03:00
Automatically extracting keyphrases that are salient to a document's meaning is an essential step toward semantic document understanding. An effective keyphrase extraction (KPE) system can benefit a wide range of natural language processing and information retrieval tasks. Recent neural methods formulate the task as a document-to-keyphrase sequence-to-sequence task. These seq2seq learning models have shown promising results compared to previous KPE systems. The recent progress in neural KPE is mostly observed in documents originating from the scientific domain. In real-world scenarios, most potential applications of KPE deal with diverse documents originating from sparse sources. These documents are unlikely to have the structure or prose quality of scientific papers. They often have a much more diverse document structure and reside in various domains whose contents target much wider audiences than scientists. To encourage the research community to develop powerful neural models for keyphrase extraction on open domains, we have created OpenKP: a dataset of over 150,000 documents with the most relevant keyphrases generated by expert annotation.
Updated 2023-06-12 21:21:58 +03:00
Various tasks for CI Automation team
Updated 2022-12-08 13:02:06 +03:00
Task and example code for the Malmo Collaborative AI Challenge
Updated 2022-09-23 14:13:36 +03:00
An efficient implementation of the popular sequence models for text generation, summarization, and translation tasks. https://arxiv.org/pdf/2106.04718.pdf
Updated 2022-08-17 19:38:06 +03:00
Unsupervised Domain Adaptation for Computer Vision Tasks
Updated 2022-04-21 12:34:36 +03:00
A complete end-to-end demonstration in which we collect training data in Unity and use that data to train a deep neural network to predict the pose of a cube. This model is then deployed in a simulated robotic pick-and-place task.
unity
machine-learning
deep-learning
computer-vision
robotics
manipulation
perception
physics-simulation
pose-estimation
robotics-simulation
simulation
ur3-robot-arm
ros
urdf
motion-planning
synthetic-data
trajectory-generation
tutorial
model-training
autonomy
Updated 2022-04-13 20:50:31 +03:00
A Celery task class whose execution is delayed until after the request finishes, using the request_started and request_finished signals from Django.
Updated 2022-03-10 16:29:24 +03:00
An automated and scalable approach to generating tasklets from a natural language task query and a website URL. Glider does not require any pre-training. Glider models tasklet extraction as a state space search, where agents can explore a website’s UI and get rewarded when making progress towards task completion. The reward is computed based on the agent’s navigation pattern and the similarity between its trajectory and the task query.
Updated 2021-09-03 06:52:47 +03:00
Multitask Multilingual Multimodal Pre-training
Updated 2021-05-13 09:56:36 +03:00
A task interface for Jupyter notebooks built on machine learning and big data
Updated 2020-04-10 04:23:31 +03:00
Code to generate the Reddit corpus for the DSTC 8 competition Multi-Domain End-to-End Track, Fast Adaptation Task
Updated 2019-08-26 17:00:17 +03:00
Distributed Task Queue (development branch)
Updated 2016-07-12 04:06:18 +03:00
INACTIVE - http://mzl.la/ghe-archive - Runner is a project that manages starting tasks in a defined order. If tasks fail, the chain can be retried or halted.
Updated 2015-10-29 00:13:56 +03:00
DEPRECATED - Taskboard
Updated 2012-05-25 17:37:06 +04:00