TorchGeo: datasets, samplers, transforms, and pre-trained models for geospatial data
Updated 2024-11-06 19:56:28 +03:00
Updated 2024-11-06 04:21:03 +03:00
Data access package for the SubseasonalClimateUSA dataset
Updated 2024-11-05 16:57:54 +03:00
Structured data files for topics covered by GitHub's Transparency Report
Updated 2024-09-30 19:42:54 +03:00
Microsoft Azure Traces
Updated 2024-09-28 20:49:03 +03:00
Perception toolkit for sim2real training and validation in Unity
machine-learning
computer-vision
deep-learning
detection
domain-randomization
object-detection
perception
pose-estimation
segmentation
synthetic-dataset-generation
Updated 2024-09-23 21:19:23 +03:00
Qlib is an AI-oriented quantitative investment platform that aims to realize the potential of, empower research on, and create value with AI technologies in quantitative investment, from exploring ideas to implementing them in production. Qlib supports diverse machine learning modeling paradigms, including supervised learning, market dynamics modeling, and reinforcement learning.
machine-learning
deep-learning
python
platform
research
finance
algorithmic-trading
auto-quant
fintech
investment
paper
quant
quant-dataset
quant-models
quantitative-finance
quantitative-trading
research-paper
stock-data
Updated 2024-09-12 18:44:27 +03:00
A high-performance, modern set of graph rendering components that enables users to visualize large graph datasets on the web.
Updated 2024-08-29 15:29:55 +03:00
A dataset of real DNA traces for benchmarking trace reconstruction algorithms
Updated 2024-08-13 21:17:32 +03:00
The ORBIT dataset is a collection of videos of objects in clean and cluttered scenes recorded by people who are blind/low-vision on a mobile phone. The dataset is presented with a teachable object recognition benchmark task which aims to drive few-shot learning on challenging real-world data.
microsoft
machine-learning
computer-vision
video
benchmark
dataset
classification
few-shot-learning
meta-learning
object-recognition
Updated 2024-08-13 03:27:45 +03:00
Notebooks and documentation for AI-for-Earth-managed datasets on Azure
Updated 2024-07-25 14:51:04 +03:00
Dataset of Government Open Source Policies
Updated 2024-07-06 05:15:10 +03:00
[ACL 2022] A hierarchical table dataset for question answering and data-to-text generation.
Updated 2024-04-09 04:30:40 +03:00
InnerEye dataset creation tool for the InnerEye-DeepLearning library. Transforms DICOM data into masks for training deep learning models.
Updated 2024-03-21 12:52:00 +03:00
This repository contains code and datasets related to entity/knowledge papers from the VERT (Versatile Entity Recognition & disambiguation Toolkit) project, by the Knowledge Computing group at Microsoft Research Asia (MSRA).
nlp
ml
ner
named-entity-recognition
entity-extraction
entity-linking
entity-resolution
grn
language-understanding
linkingpark
nlp-resources
unitrans
xl-ner
bertel
can-ner
cross-lingual-ner
entity-disambiguation
Updated 2024-03-16 09:53:11 +03:00
Unity's privacy-preserving human-centric synthetic data generator
unity
unity3d
deep-learning
computer-vision
pose-estimation
object-detection
synthetic-data
perception
synthetic-dataset-generation
billing-5160
synthetic-datasets
applied-ml-research
human-activity-recognition
human-centric-ml
human-pose-estimation
icml-2022
labeling
owner-machine-learning
synthetic-data-generation
transfer-learning
Updated 2024-03-05 04:05:37 +03:00
C# and F# language bindings and extensions to Apache Spark
csharp
fsharp
spark
dataset
bigdata
spark-streaming
streaming
apache-spark
rdd
dataframe
dstream
eventhubs
kafka-streaming
mapreduce
mobius
near-real-time
Updated 2024-01-30 22:45:57 +03:00
A Dataset of Python Challenges for AI Research
Updated 2023-12-21 00:10:56 +03:00
Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS evaluation metric.
Updated 2023-09-07 07:38:34 +03:00
The Microsoft Scalable Noisy Speech Dataset (MS-SNSD) is a noisy speech dataset that can scale to arbitrary sizes depending on the number of speakers, noise types, and Speech to Noise Ratio (SNR) levels desired.
Updated 2023-07-07 01:41:03 +03:00
Sepsis cohort from MIMIC dataset
Updated 2023-07-07 01:16:14 +03:00
Normalized Trend Filtering for Biomedical Datasets
Updated 2023-07-07 01:07:14 +03:00
Tools to compare metrics between datasets, accounting for population differences and invariant features.
Updated 2023-07-07 00:36:40 +03:00
SynthDet - An end-to-end object detection pipeline using synthetic data
machine-learning
deep-learning
computer-vision
pose-estimation
synthetic-data
synthetic-dataset-generation
detection
domain-randomization
object-detection
synthetic-dataset
Updated 2023-07-05 23:50:31 +03:00
Code Hunt is a serious educational game that has been played by over 140,000 students and enthusiasts over the past year, in the process collecting over 1.5M programs. We hope that researchers will embark on research into the data. Please fill out our quick survey to let us know how you are using the dataset and to get updates about new releases of Code Hunt data. See more on our Code Hunt Research page.
Updated 2023-06-27 16:09:36 +03:00
This repo contains a walkthrough of how to use R Server for HDInsight with large datasets such as Criteo.
Updated 2023-06-27 16:07:15 +03:00
This is the FER+ set of new label annotations for the FER emotion dataset.
Updated 2023-06-12 23:52:53 +03:00
This project was created to analyze, compare, and identify whale tails from the Kaggle competition dataset "Humpback Whale Identification Challenge". It is written in Python and uses the Keras API with a TensorFlow backend. The project implements both a Siamese network and a softmax classifier with center loss.
Updated 2023-06-12 22:29:42 +03:00
Dataset and code for three Web crawling-related papers from SIGIR 2019, NeurIPS 2019, and ICML 2020.
Updated 2023-06-12 21:21:59 +03:00
MS MARCO (Microsoft Machine Reading Comprehension) is a large-scale dataset focused on machine reading comprehension, question answering, and passage ranking. A variant of this task will be part of TREC and AFIRM 2019; for updates about TREC 2019, please follow this repository. Passage reranking task: given a query q and the 1000 most relevant passages P = p1, p2, p3, ..., p1000 as retrieved by BM25, a successful system is expected to rerank the most relevant passages as high as possible. Not all 1000 retrieved items have a human-labeled relevant passage. Evaluation is done using MRR.
Updated 2023-06-12 21:21:58 +03:00
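The MRR metric mentioned in the MS MARCO entry above (Mean Reciprocal Rank) can be sketched as follows. This is an illustrative implementation, not code from the MS MARCO repository; the function and variable names are hypothetical.

```python
def mean_reciprocal_rank(ranked_lists, relevant_ids):
    """Compute MRR over a set of queries.

    ranked_lists: one ranked list of passage ids per query.
    relevant_ids: one set of relevant passage ids per query.
    Each query contributes 1/rank of its first relevant passage
    (0 if no relevant passage appears in the ranking).
    """
    total = 0.0
    for ranking, relevant in zip(ranked_lists, relevant_ids):
        for rank, pid in enumerate(ranking, start=1):
            if pid in relevant:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(ranked_lists)

# Two queries: first relevant passage at rank 2 and rank 1.
score = mean_reciprocal_rank([["p3", "p1"], ["p7", "p9"]], [{"p1"}, {"p7"}])
print(score)  # (1/2 + 1/1) / 2 = 0.75
```

Rankings that bury the first relevant passage deeper are penalized harmonically, which is why the metric rewards systems that "rerank the most relevant passages as high as possible."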