DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphical models and potential outcomes frameworks. A usage sketch follows the topic tags below.
machine-learning
data-science
causal-inference
causality
treatment-effects
bayesian-networks
causal-machine-learning
causal-models
do-calculus
graphical-models
python3
Updated 2024-11-22 19:51:17 +03:00
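A minimal sketch of the DoWhy workflow on synthetic data, assuming the usual model/identify/estimate steps; the variable names and the true effect of 2 are illustrative, not from the repository:

```python
import numpy as np
import pandas as pd
from dowhy import CausalModel

# Illustrative synthetic data: w confounds both treatment t and outcome y.
rng = np.random.default_rng(0)
w = rng.normal(size=1000)
t = (w + rng.normal(size=1000) > 0).astype(int)
y = 2 * t + w + rng.normal(size=1000)
df = pd.DataFrame({"w": w, "t": t, "y": y})

model = CausalModel(data=df, treatment="t", outcome="y", common_causes=["w"])
estimand = model.identify_effect()
estimate = model.estimate_effect(estimand, method_name="backdoor.linear_regression")
print(estimate.value)  # should be close to the true effect of 2
```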
Documentation and issues for Pylance
Updated 2024-11-22 04:53:59 +03:00
This package provides data-science tooling for developing new recognizers for Presidio. It is used to evaluate the entire system as well as individual PII recognizers or PII detection models (a sketch of the core Presidio analyzer follows the topic tags below).
machine-learning
deep-learning
nlp
natural-language-processing
privacy
ner
transformers
pii
named-entity-recognition
spacy
flair
Updated 2024-11-22 01:17:04 +03:00
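For context, a minimal sketch of the core Presidio analyzer whose recognizers this package evaluates; this uses the presidio-analyzer API rather than this repository's own evaluation classes, and the example text is made up:

```python
from presidio_analyzer import AnalyzerEngine

# Core Presidio PII detection; this repo evaluates recognizers like these.
analyzer = AnalyzerEngine()
results = analyzer.analyze(
    text="My name is David and my phone number is 212-555-1234",
    language="en",
)
for result in results:
    print(result.entity_type, result.start, result.end, result.score)
```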
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities (a TrOCR usage sketch follows the topic tags below)
nlp
multimodal
beit
beit-3
deepnet
document-ai
foundation-models
kosmos
kosmos-1
layoutlm
layoutxlm
llm
minilm
mllm
pre-trained-model
textdiffuser
trocr
unilm
xlm-e
Updated 2024-11-09 13:45:59 +03:00
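As one example from this collection, the TrOCR checkpoints can be loaded through Hugging Face Transformers; a minimal sketch, where the model name is a published Hub checkpoint and the image path is hypothetical:

```python
from PIL import Image
from transformers import TrOCRProcessor, VisionEncoderDecoderModel

# TrOCR: transformer-based OCR from the unilm family.
processor = TrOCRProcessor.from_pretrained("microsoft/trocr-base-handwritten")
model = VisionEncoderDecoderModel.from_pretrained("microsoft/trocr-base-handwritten")

image = Image.open("line_of_handwriting.png").convert("RGB")  # hypothetical input
pixel_values = processor(images=image, return_tensors="pt").pixel_values
generated_ids = model.generate(pixel_values)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
```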
Ongoing research on training transformer language models at scale, including BERT and GPT-2
Updated 2024-10-18 13:31:05 +03:00
A research project for natural language generation, containing the official implementations by the MSRA NLC team.
Updated 2024-07-25 14:27:07 +03:00
An official implementation of "UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation"
video
localization
segmentation
caption-task
coin
joint
msrvtt
multimodal-sentiment-analysis
multimodality
pretrain
pretraining
retrieval-task
video-language
video-text
video-text-retrieval
youcookii
alignment
caption
Updated 2024-07-25 14:07:31 +03:00
Code to reproduce experiments in the paper "Constrained Language Models Yield Few-Shot Semantic Parsers" (EMNLP 2021).
Updated 2024-05-31 20:48:22 +03:00
Azure Search Cognitive Skill to extract technical and business skills from text
Updated 2024-04-25 08:03:40 +03:00
This repository contains code and datasets related to entity/knowledge papers from the VERT (Versatile Entity Recognition & disambiguation Toolkit) project, by the Knowledge Computing group at Microsoft Research Asia (MSRA).
nlp
ml
ner
named-entity-recognition
entity-extraction
entity-linking
entity-resolution
grn
language-understanding
linkingpark
nlp-resources
unitrans
xl-ner
bertel
can-ner
cross-lingual-ner
entity-disambiguation
Updated 2024-03-16 09:53:11 +03:00
Grounded Language-Image Pre-training
Updated 2024-01-24 07:56:13 +03:00
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
Updated 2024-01-09 18:03:18 +03:00
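A minimal sketch of the loralib usage pattern from the paper: swap in a LoRA layer, freeze everything except the low-rank adapters, and checkpoint only the adapter weights. The dimensions and rank here are illustrative:

```python
import torch
import torch.nn as nn
import loralib as lora

# Replace a standard nn.Linear with a LoRA-augmented linear layer (rank r=16).
model = nn.Sequential(
    lora.Linear(768, 768, r=16),
    nn.ReLU(),
    nn.Linear(768, 2),  # plain head, left untouched by LoRA
)

# Freeze everything that is not a LoRA parameter (including the plain head here).
lora.mark_only_lora_as_trainable(model)

# ... train as usual ...

# Checkpoint only the (small) LoRA parameters.
torch.save(lora.lora_state_dict(model), "lora_ckpt.pt")
```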
Oscar and VinVL
Updated 2023-08-28 04:34:58 +03:00
A library & tools to evaluate predictive language models.
nlp
language-model
evaluation
evaluation-toolkit
language-model-evaluation
lm-challenge
next-word-prediction
ngram-model
prediction-model
research-tool
Updated 2023-08-09 18:22:29 +03:00
[NeurIPS 2021] COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining
deep-learning
natural-language-processing
representation-learning
transformers
language-model
natural-language-understanding
pretraining
contrastive-learning
pretrained-language-model
Updated 2023-07-25 17:21:55 +03:00
NLP DNN Toolkit - Building Your NLP DNN Models Like Playing Lego
deep-learning
pytorch
natural-language-processing
artificial-intelligence
dnn
question-answering
model-compression
text-classification
knowledge-distillation
text-matching
qna
sequence-labeling
Updated 2023-07-22 06:07:54 +03:00
🤗Transformers: State-of-the-art Natural Language Processing for Pytorch and TensorFlow 2.0.
Updated 2023-06-13 00:30:31 +03:00
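A minimal sketch of the library's high-level pipeline API; the default model is chosen by the library and downloaded on first use:

```python
from transformers import pipeline

# High-level inference API; downloads a default model on first use.
classifier = pipeline("sentiment-analysis")
print(classifier("This library makes NLP remarkably approachable."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```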
Multi-Task Deep Neural Networks for Natural Language Understanding
Updated 2023-06-13 00:28:35 +03:00
Python SDK for the Microsoft Language Understanding Intelligent Service API, part of Cognitive Services
Updated 2023-06-12 23:52:58 +03:00
Automatically extracting keyphrases that are salient to a document's meaning is an essential step in semantic document understanding, and an effective keyphrase extraction (KPE) system can benefit a wide range of natural language processing and information retrieval tasks. Recent neural methods formulate the task as a document-to-keyphrase sequence-to-sequence problem, and these seq2seq models have shown promising results compared to previous KPE systems. However, recent progress in neural KPE has mostly been observed on documents from the scientific domain. In real-world scenarios, most potential applications of KPE deal with diverse documents from sparse sources; such documents are unlikely to share the structure and polished prose of scientific papers, exhibit far more varied document structure, and span domains whose content targets much wider audiences than scientists. To encourage the research community to develop powerful neural models for keyphrase extraction on open-domain documents, we created OpenKP: a dataset of over 150,000 documents with the most relevant keyphrases generated by expert annotation.
Updated 2023-06-12 21:21:58 +03:00
The implementation of DeBERTa (a loading sketch follows the topic tags below)
representation-learning
self-attention
deeplearning
bert
transformer-encoder
language-model
natural-language-understanding
roberta
Updated 2023-03-25 13:22:59 +03:00
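DeBERTa checkpoints are also published on the Hugging Face Hub, so one way to load the model is through transformers rather than this repository's own code; a minimal sketch:

```python
from transformers import AutoModel, AutoTokenizer

# Load a pretrained DeBERTa encoder via Hugging Face Transformers.
tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-base")
model = AutoModel.from_pretrained("microsoft/deberta-base")

inputs = tokenizer("Disentangled attention in action.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, seq_len, hidden)
```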
Cookiecutter API for creating Custom Skills for Azure Search using Python and Docker
Updated 2022-11-28 22:10:04 +03:00
MASS: Masked Sequence to Sequence Pre-training for Language Generation
Updated 2022-11-28 22:09:50 +03:00
Self-Supervised Document-to-Document Similarity Ranking via Contextualized Language Models and Hierarchical Inference
Updated 2022-11-28 22:08:28 +03:00
Natural Language Processing Best Practices & Examples
machine-learning
deep-learning
nlp
natural-language-processing
best-practices
text-classification
natural-language-understanding
nlu
azure-ml
text
mlflow
pretrained-models
nli
natural-language
natural-language-inference
sota
transformer
Updated 2022-08-29 16:59:54 +03:00
In this project we develop new deep learning models that bootstrap language understanding for languages with no labeled data, using labeled data from other languages.
Updated 2022-08-17 18:39:41 +03:00
MPNet: Masked and Permuted Pre-training for Language Understanding https://arxiv.org/pdf/2004.09297.pdf
Updated 2021-09-11 12:42:41 +03:00
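The released MPNet encoder is available through Hugging Face Transformers; a minimal loading sketch:

```python
from transformers import AutoModel, AutoTokenizer

# Load the pretrained MPNet encoder released with the paper.
tokenizer = AutoTokenizer.from_pretrained("microsoft/mpnet-base")
model = AutoModel.from_pretrained("microsoft/mpnet-base")

inputs = tokenizer("Masked and permuted pre-training.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, seq_len, hidden)
```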
An automated and scalable approach to generating tasklets from a natural-language task query and a website URL. Glider requires no pre-training: it models tasklet extraction as a state-space search in which agents explore a website's UI and are rewarded for making progress toward task completion. The reward is computed from the agent's navigation pattern and the similarity between its trajectory and the task query.
Updated 2021-09-03 06:52:47 +03:00
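The description above amounts to a reward-guided search over UI states; the toy sketch below illustrates that idea with a generic best-first search loop. Every name in it (the states, the action generator, the reward function) is a hypothetical stand-in, not Glider's actual API:

```python
import heapq
import itertools
from typing import Callable, Iterable, List, Optional

def search_tasklet(
    start: str,
    actions: Callable[[str], Iterable[str]],  # successor UI states
    reward: Callable[[List[str]], float],     # scores a trajectory vs. the task query
    is_goal: Callable[[str], bool],           # task completed?
    max_expansions: int = 1000,
) -> Optional[List[str]]:
    """Toy best-first search: repeatedly expand the trajectory with the
    highest reward so far. Hypothetical illustration, not Glider's API."""
    counter = itertools.count()  # tie-breaker so heapq never compares trajectories
    frontier = [(-reward([start]), next(counter), [start])]
    for _ in range(max_expansions):
        if not frontier:
            return None
        _, _, trajectory = heapq.heappop(frontier)
        state = trajectory[-1]
        if is_goal(state):
            return trajectory  # the recovered tasklet
        for nxt in actions(state):
            new_traj = trajectory + [nxt]
            heapq.heappush(frontier, (-reward(new_traj), next(counter), new_traj))
    return None
```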