DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphical models and potential outcomes frameworks. (A minimal usage sketch follows this entry.)
Updated 2024-11-22 19:51:17 +03:00
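As a rough illustration of the workflow described above, here is a minimal sketch of DoWhy's model-identify-estimate steps. The toy data, column names, and the chosen estimator are assumptions for illustration, not prescribed by the library.

```python
import numpy as np
import pandas as pd
from dowhy import CausalModel

# Toy observational data: one common cause, a binary treatment, an outcome.
rng = np.random.default_rng(0)
w = rng.normal(size=1000)
t = (w + rng.normal(size=1000) > 0).astype(int)
y = 2 * t + w + rng.normal(size=1000)
df = pd.DataFrame({"w": w, "t": t, "y": y})

# Model the causal assumptions, identify the estimand, estimate the effect.
model = CausalModel(data=df, treatment="t", outcome="y", common_causes=["w"])
estimand = model.identify_effect()
estimate = model.estimate_effect(estimand, method_name="backdoor.linear_regression")
print(estimate.value)  # should be close to the true effect of 2
```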
Documentation and issues for Pylance
Updated 2024-11-22 04:53:59 +03:00
This package provides data-science-related tasks for developing new recognizers for Presidio. It is used to evaluate the entire system, as well as specific PII recognizers or PII detection models.
Updated 2024-11-22 01:17:04 +03:00
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Updated 2024-11-09 13:45:59 +03:00
Ongoing research training transformer language models at scale, including: BERT & GPT-2
Updated 2024-10-18 13:31:05 +03:00
A research project for natural language generation, containing the official implementations by MSRA NLC team.
Updated 2024-07-25 14:27:07 +03:00
An official implementation of "UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation"
Updated 2024-07-25 14:07:31 +03:00
Code to reproduce experiments in the paper "Constrained Language Models Yield Few-Shot Semantic Parsers" (EMNLP 2021).
Updated 2024-05-31 20:48:22 +03:00
Azure Search Cognitive Skill to extract technical and business skills from text
Updated 2024-04-25 08:03:40 +03:00
This repository contains code and datasets related to entity/knowledge papers from the VERT (Versatile Entity Recognition & disambiguation Toolkit) project, by the Knowledge Computing group at Microsoft Research Asia (MSRA).
Updated 2024-03-16 09:53:11 +03:00
Grounded Language-Image Pre-training
Updated 2024-01-24 07:56:13 +03:00
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models" (a minimal usage sketch follows this entry)
Updated 2024-01-09 18:03:18 +03:00
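A minimal sketch of adapting a model with loralib: replace an nn.Linear with lora.Linear, then freeze everything except the low-rank parameters. The layer sizes and rank are illustrative assumptions.

```python
import torch
import torch.nn as nn
import loralib as lora

model = nn.Sequential(
    lora.Linear(768, 768, r=8),  # LoRA-augmented projection (rank 8)
    nn.ReLU(),
    nn.Linear(768, 2),           # plain head; frozen by the call below
)

# Mark only the low-rank A/B matrices as trainable; everything else is frozen.
lora.mark_only_lora_as_trainable(model)

x = torch.randn(4, 768)
print(model(x).shape)  # torch.Size([4, 2])
```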
[NeurIPS 2021] COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining
Updated 2023-07-25 17:21:55 +03:00
🤗Transformers: State-of-the-art Natural Language Processing for Pytorch and TensorFlow 2.0. (A minimal usage sketch follows this entry.)
Updated 2023-06-13 00:30:31 +03:00
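A minimal sketch of the pipeline API, the library's highest-level entry point. No checkpoint is pinned here, so the task's default model is downloaded on first use.

```python
from transformers import pipeline

# Build a ready-to-use sentiment classifier from the default checkpoint.
classifier = pipeline("sentiment-analysis")
print(classifier("Transformers makes state-of-the-art NLP approachable."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```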
Multi-Task Deep Neural Networks for Natural Language Understanding
Updated 2023-06-13 00:28:35 +03:00
Python SDK for the Microsoft Language Understanding Intelligent Service API, part of Cognitive Services
Updated 2023-06-12 23:52:58 +03:00
Automatically extracting keyphrases that are salient to a document's meaning is an essential step toward semantic document understanding. An effective keyphrase extraction (KPE) system can benefit a wide range of natural language processing and information retrieval tasks. Recent neural methods formulate the task as a document-to-keyphrase sequence-to-sequence problem, and these seq2seq models have shown promising results compared to previous KPE systems. However, recent progress in neural KPE has mostly been observed on documents from the scientific domain. In real-world scenarios, most potential applications of KPE deal with diverse documents from sparse sources; these documents are unlikely to have the structure and polished prose of scientific papers, often exhibit far more varied document structures, and span domains whose content targets much wider audiences than scientists. To encourage the research community to develop powerful neural models for keyphrase extraction on open domains, we created OpenKP: a dataset of over 150,000 documents whose most relevant keyphrases were produced by expert annotation. (A minimal seq2seq sketch follows this entry.)
Updated 2023-06-12 21:21:58 +03:00
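A minimal sketch of the document-to-keyphrase seq2seq formulation described above, using a generic encoder-decoder checkpoint. The "t5-small" checkpoint and the prompt format are illustrative assumptions; this does not reproduce the OpenKP baselines.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

# Encode the document, then decode a short keyphrase-like sequence.
document = "Open-domain keyphrase extraction targets diverse web documents..."
inputs = tokenizer("summarize: " + document, return_tensors="pt",
                   truncation=True, max_length=512)
outputs = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```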
Cookiecutter API for creating Custom Skills for Azure Search using Python and Docker
Updated 2022-11-28 22:10:04 +03:00
MASS: Masked Sequence to Sequence Pre-training for Language Generation
Updated 2022-11-28 22:09:50 +03:00
Self-Supervised Document-to-Document Similarity Ranking via Contextualized Language Models and Hierarchical Inference
Updated 2022-11-28 22:08:28 +03:00
This project develops new deep learning models for bootstrapping language understanding in languages with no labeled data, using labeled data from other languages.
Updated 2022-08-17 18:39:41 +03:00
Updated 2022-08-04 22:59:22 +03:00
MPNet: Masked and Permuted Pre-training for Language Understanding https://arxiv.org/pdf/2004.09297.pdf
Updated 2021-09-11 12:42:41 +03:00
An automated and scalable approach to generating tasklets from a natural language task query and a website URL. Glider does not require any pre-training: it models tasklet extraction as a state-space search in which agents explore a website's UI and are rewarded for making progress toward task completion. The reward is computed from the agent's navigation pattern and the similarity between its trajectory and the task query. (A minimal search sketch follows this entry.)
Updated 2021-09-03 06:52:47 +03:00
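A minimal sketch of tasklet extraction framed as reward-guided state-space search, as the description above outlines. The state representation, the action generator, and the similarity-based reward are hypothetical stand-ins supplied by the caller, not Glider's actual implementation.

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Node:
    neg_reward: float                  # heapq is a min-heap, so negate
    trajectory: tuple = field(compare=False)

def search(start, actions, reward, is_done, budget=1000):
    """Best-first search: always expand the highest-reward trajectory."""
    frontier = [Node(-reward((start,)), (start,))]
    for _ in range(budget):
        if not frontier:
            break
        traj = heapq.heappop(frontier).trajectory
        if is_done(traj):
            return traj                # the tasklet: a UI action trajectory
        for nxt in actions(traj[-1]):  # expand successor UI states
            new = traj + (nxt,)
            heapq.heappush(frontier, Node(-reward(new), new))
    return None                        # budget exhausted without completion
```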