DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphical models and potential outcomes frameworks. A usage sketch follows the topic tags below.
machine-learning
data-science
causal-inference
causality
treatment-effects
bayesian-networks
causal-machine-learning
causal-models
do-calculus
graphical-models
python3
Updated 2024-11-22 19:51:17 +03:00
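A minimal sketch of the DoWhy workflow on synthetic data, assuming the usual model/identify/estimate steps; the variable names and the true effect of 2 are illustrative, not from the repository:

```python
import numpy as np
import pandas as pd
from dowhy import CausalModel

# Illustrative synthetic data: w confounds both treatment t and outcome y.
rng = np.random.default_rng(0)
w = rng.normal(size=1000)
t = (w + rng.normal(size=1000) > 0).astype(int)
y = 2 * t + w + rng.normal(size=1000)
df = pd.DataFrame({"w": w, "t": t, "y": y})

model = CausalModel(data=df, treatment="t", outcome="y", common_causes=["w"])
estimand = model.identify_effect()
estimate = model.estimate_effect(estimand, method_name="backdoor.linear_regression")
print(estimate.value)  # should be close to the true effect of 2
```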
Documentation and issues for Pylance
Updated 2024-11-22 04:53:59 +03:00
This package provides data-science tooling for developing new recognizers for Presidio. It is used to evaluate the entire system as well as individual PII recognizers or PII detection models (a sketch of the core Presidio analyzer follows the topic tags below).
machine-learning
deep-learning
nlp
natural-language-processing
privacy
ner
transformers
pii
named-entity-recognition
spacy
flair
Updated 2024-11-22 01:17:04 +03:00
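For context, a minimal sketch of the core Presidio analyzer whose recognizers this package evaluates; this uses the presidio-analyzer API rather than this repository's own evaluation classes, and the example text is made up:

```python
from presidio_analyzer import AnalyzerEngine

# Core Presidio PII detection; this repo evaluates recognizers like these.
analyzer = AnalyzerEngine()
results = analyzer.analyze(
    text="My name is David and my phone number is 212-555-1234",
    language="en",
)
for result in results:
    print(result.entity_type, result.start, result.end, result.score)
```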
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities (a TrOCR usage sketch follows the topic tags below)
nlp
multimodal
beit
beit-3
deepnet
document-ai
foundation-models
kosmos
kosmos-1
layoutlm
layoutxlm
llm
minilm
mllm
pre-trained-model
textdiffuser
trocr
unilm
xlm-e
Updated 2024-11-09 13:45:59 +03:00
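As one example from this collection, the TrOCR checkpoints can be loaded through Hugging Face Transformers; a minimal sketch, where the model name is a published Hub checkpoint and the image path is hypothetical:

```python
from PIL import Image
from transformers import TrOCRProcessor, VisionEncoderDecoderModel

# TrOCR: transformer-based OCR from the unilm family.
processor = TrOCRProcessor.from_pretrained("microsoft/trocr-base-handwritten")
model = VisionEncoderDecoderModel.from_pretrained("microsoft/trocr-base-handwritten")

image = Image.open("line_of_handwriting.png").convert("RGB")  # hypothetical input
pixel_values = processor(images=image, return_tensors="pt").pixel_values
generated_ids = model.generate(pixel_values)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
```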
Ongoing research on training transformer language models at scale, including BERT and GPT-2
Updated 2024-10-18 13:31:05 +03:00
A research project for natural language generation, containing the official implementations by the MSRA NLC team.
Updated 2024-07-25 14:27:07 +03:00
An official implementation of "UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation"
video
localization
segmentation
caption-task
coin
joint
msrvtt
multimodal-sentiment-analysis
multimodality
pretrain
pretraining
retrieval-task
video-language
video-text
video-text-retrieval
youcookii
alignment
caption
Updated 2024-07-25 14:07:31 +03:00
Code to reproduce experiments in the paper "Constrained Language Models Yield Few-Shot Semantic Parsers" (EMNLP 2021).
Updated 2024-05-31 20:48:22 +03:00
Azure Search Cognitive Skill to extract technical and business skills from text
Updated 2024-04-25 08:03:40 +03:00
This repository contains code and datasets related to entity/knowledge papers from the VERT (Versatile Entity Recognition & disambiguation Toolkit) project, by the Knowledge Computing group at Microsoft Research Asia (MSRA).
nlp
ml
ner
named-entity-recognition
entity-extraction
entity-linking
entity-resolution
grn
language-understanding
linkingpark
nlp-resources
unitrans
xl-ner
bertel
can-ner
cross-lingual-ner
entity-disambiguation
Updated 2024-03-16 09:53:11 +03:00
Grounded Language-Image Pre-training
Updated 2024-01-24 07:56:13 +03:00
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
Updated 2024-01-09 18:03:18 +03:00
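A minimal sketch of the loralib usage pattern from the paper: swap in a LoRA layer, freeze everything except the low-rank adapters, and checkpoint only the adapter weights. The dimensions and rank here are illustrative:

```python
import torch
import torch.nn as nn
import loralib as lora

# Replace a standard nn.Linear with a LoRA-augmented linear layer (rank r=16).
model = nn.Sequential(
    lora.Linear(768, 768, r=16),
    nn.ReLU(),
    nn.Linear(768, 2),  # plain head, left untouched by LoRA
)

# Freeze everything that is not a LoRA parameter (including the plain head here).
lora.mark_only_lora_as_trainable(model)

# ... train as usual ...

# Checkpoint only the (small) LoRA parameters.
torch.save(lora.lora_state_dict(model), "lora_ckpt.pt")
```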
Oscar and VinVL
Updated 2023-08-28 04:34:58 +03:00
A library & tools to evaluate predictive language models.
nlp
language-model
evaluation
evaluation-toolkit
language-model-evaluation
lm-challenge
next-word-prediction
ngram-model
prediction-model
research-tool
Updated 2023-08-09 18:22:29 +03:00
[NeurIPS 2021] COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining
deep-learning
natural-language-processing
representation-learning
transformers
language-model
natural-language-understanding
pretraining
contrastive-learning
pretrained-language-model
Updated 2023-07-25 17:21:55 +03:00
NLP DNN Toolkit - Building Your NLP DNN Models Like Playing Lego
deep-learning
pytorch
natural-language-processing
artificial-intelligence
dnn
question-answering
model-compression
text-classification
knowledge-distillation
text-matching
qna
sequence-labeling
Updated 2023-07-22 06:07:54 +03:00
🤗Transformers: State-of-the-art Natural Language Processing for Pytorch and TensorFlow 2.0.
Updated 2023-06-13 00:30:31 +03:00
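A minimal sketch of the library's high-level pipeline API; the default model is chosen by the library and downloaded on first use:

```python
from transformers import pipeline

# High-level inference API; downloads a default model on first use.
classifier = pipeline("sentiment-analysis")
print(classifier("This library makes NLP remarkably approachable."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```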
Multi-Task Deep Neural Networks for Natural Language Understanding
Updated 2023-06-13 00:28:35 +03:00
Python SDK for the Microsoft Language Understanding Intelligent Service API, part of Cognitive Services
Updated 2023-06-12 23:52:58 +03:00
Automatically extracting keyphrases that are salient to a document's meaning is an essential step in semantic document understanding, and an effective keyphrase extraction (KPE) system can benefit a wide range of natural language processing and information retrieval tasks. Recent neural methods formulate the task as a document-to-keyphrase sequence-to-sequence problem, and these seq2seq models have shown promising results compared to previous KPE systems. However, recent progress in neural KPE has mostly been observed on documents from the scientific domain. In real-world scenarios, most potential applications of KPE deal with diverse documents from sparse sources; such documents are unlikely to share the structure and polished prose of scientific papers, exhibit far more varied document structure, and span domains whose content targets much wider audiences than scientists. To encourage the research community to develop powerful neural models for keyphrase extraction on open-domain documents, we created OpenKP: a dataset of over 150,000 documents with the most relevant keyphrases generated by expert annotation.
Updated 2023-06-12 21:21:58 +03:00
The implementation of DeBERTa (a loading sketch follows the topic tags below)
representation-learning
self-attention
deeplearning
bert
transformer-encoder
language-model
natural-language-understanding
roberta
Updated 2023-03-25 13:22:59 +03:00
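DeBERTa checkpoints are also published on the Hugging Face Hub, so one way to load the model is through transformers rather than this repository's own code; a minimal sketch:

```python
from transformers import AutoModel, AutoTokenizer

# Load a pretrained DeBERTa encoder via Hugging Face Transformers.
tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-base")
model = AutoModel.from_pretrained("microsoft/deberta-base")

inputs = tokenizer("Disentangled attention in action.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, seq_len, hidden)
```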
Cookiecutter API for creating Custom Skills for Azure Search using Python and Docker
Updated 2022-11-28 22:10:04 +03:00
MASS: Masked Sequence to Sequence Pre-training for Language Generation
Updated 2022-11-28 22:09:50 +03:00
Self-Supervised Document-to-Document Similarity Ranking via Contextualized Language Models and Hierarchical Inference
Updated 2022-11-28 22:08:28 +03:00
Natural Language Processing Best Practices & Examples
machine-learning
deep-learning
nlp
natural-language-processing
best-practices
text-classification
natural-language-understanding
nlu
azure-ml
text
mlflow
pretrained-models
nli
natural-language
natural-language-inference
sota
transformer
Updated 2022-08-29 16:59:54 +03:00
In this project we develop new deep learning models that bootstrap language understanding for languages with no labeled data, using labeled data from other languages.
Updated 2022-08-17 18:39:41 +03:00
MPNet: Masked and Permuted Pre-training for Language Understanding https://arxiv.org/pdf/2004.09297.pdf
Updated 2021-09-11 12:42:41 +03:00
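The released MPNet encoder is available through Hugging Face Transformers; a minimal loading sketch:

```python
from transformers import AutoModel, AutoTokenizer

# Load the pretrained MPNet encoder released with the paper.
tokenizer = AutoTokenizer.from_pretrained("microsoft/mpnet-base")
model = AutoModel.from_pretrained("microsoft/mpnet-base")

inputs = tokenizer("Masked and permuted pre-training.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, seq_len, hidden)
```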
An automated and scalable approach to generating tasklets from a natural-language task query and a website URL. Glider requires no pre-training: it models tasklet extraction as a state-space search in which agents explore a website's UI and are rewarded for making progress toward task completion. The reward is computed from the agent's navigation pattern and the similarity between its trajectory and the task query.
Updated 2021-09-03 06:52:47 +03:00
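The description above amounts to a reward-guided search over UI states; the toy sketch below illustrates that idea with a generic best-first search loop. Every name in it (the states, the action generator, the reward function) is a hypothetical stand-in, not Glider's actual API:

```python
import heapq
import itertools
from typing import Callable, Iterable, List, Optional

def search_tasklet(
    start: str,
    actions: Callable[[str], Iterable[str]],  # successor UI states
    reward: Callable[[List[str]], float],     # scores a trajectory vs. the task query
    is_goal: Callable[[str], bool],           # task completed?
    max_expansions: int = 1000,
) -> Optional[List[str]]:
    """Toy best-first search: repeatedly expand the trajectory with the
    highest reward so far. Hypothetical illustration, not Glider's API."""
    counter = itertools.count()  # tie-breaker so heapq never compares trajectories
    frontier = [(-reward([start]), next(counter), [start])]
    for _ in range(max_expansions):
        if not frontier:
            return None
        _, _, trajectory = heapq.heappop(frontier)
        state = trajectory[-1]
        if is_goal(state):
            return trajectory  # the recovered tasklet
        for nxt in actions(state):
            new_traj = trajectory + [nxt]
            heapq.heappush(frontier, (-reward(new_traj), next(counter), new_traj))
    return None
```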