Translation quality evaluation for Firefox Translations models
Updated 2023-10-24 00:14:07 +03:00
Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS evaluation metric.
Updated 2023-09-07 07:38:34 +03:00
A library & tools to evaluate predictive language models.
nlp
language-model
evaluation
evaluation-toolkit
language-model-evaluation
lm-challenge
next-word-prediction
ngram-model
prediction-model
research-tool
Updated 2023-08-09 18:22:29 +03:00
Truly Conversational Search is the next logical step in the journey to generate intelligent and useful AI. To understand what this may mean, researchers have voiced a continuous desire to study how people currently converse with search engines. Traditionally, producing such a comprehensive dataset has been difficult because those who hold this data (search engines) have a responsibility to protect their users' privacy and cannot share the data publicly in a way that upholds the trust users place in them. Given these two powerful forces, we believe we have a dataset and paradigm that meets both sets of needs: an artificial public dataset that approximates the true data, and the ability to evaluate model performance on real user behavior. Concretely, we released a public dataset generated by creating artificial sessions using embedding similarity, and we will test on the original data. To say this again: we are not releasing any private user data but are releasing what we believe to be a good representation of true user interactions.
Updated 2023-06-12 21:21:58 +03:00
Record-and-replay tools are indispensable for quality assurance of mobile applications. However, an empirical study of various existing tools in industrial settings concluded that none of the evaluated tools is sufficient for industrial applications. In this project, we present a record-and-replay tool called SARA that aims to bridge this gap and targets wide adoption.
Updated 2023-06-12 21:21:33 +03:00
Updated 2023-03-10 02:59:53 +03:00
Forecasting models, development, evaluation, and validation
Updated 2022-01-11 21:12:40 +03:00
FS-Mol is a few-shot learning dataset of molecules, containing molecular compounds with measurements of activity against a variety of protein targets. The dataset comes with a model evaluation benchmark that aims to drive few-shot learning research in the domain of molecules and graph-structured data.
Updated 2022-01-06 19:18:51 +03:00
A Python implementation of a CNTK Fast-RCNN evaluation client
Updated 2017-07-04 11:22:40 +03:00