Hong Lu
|
52cc16fb9b
|
Updated token classifier api.
|
2019-05-24 18:09:56 -04:00 |
Hong Lu
|
5258c9cd7e
|
Added some utility functions to the common script. Will be merged with common.py later.
|
2019-05-24 18:09:04 -04:00 |
Richin Jain
|
620f3ebe8c
|
Adding component governance tool to build pipeline.
|
2019-05-24 15:12:19 -04:00 |
miguelgfierro
|
3c1708a21d
|
readme update 📝
|
2019-05-24 18:32:28 +00:00 |
miguelgfierro
|
c8fc93d4b6
|
🐛
|
2019-05-24 17:57:53 +00:00 |
miguelgfierro
|
f0936bd9b1
|
added papermill
|
2019-05-24 16:54:17 +00:00 |
miguelgfierro
|
0f2fcd4f83
|
added new notebooks
|
2019-05-24 15:00:18 +00:00 |
miguelgfierro
|
f03f712cfa
|
added data integration tests with notebooks
|
2019-05-24 14:26:36 +00:00 |
miguelgfierro
|
03b3b387a6
|
refactoring tests
|
2019-05-24 14:06:22 +01:00 |
Said Bleik
|
2daeb1716e
|
Merge pull request #64 from microsoft/casey-gensen-noblank
Gensen noblank bugfix + Add preprocessing tests
|
2019-05-22 15:14:42 -04:00 |
Casey Hong
|
a1da16f391
|
use fixture directly
|
2019-05-22 12:42:01 -04:00 |
Casey Hong
|
1cd36ccff7
|
fix snli noblank bug and add preprocessing tests
|
2019-05-21 23:00:56 -04:00 |
Said Bleik
|
63e546ab3c
|
updated preprocessing, utils, classification
|
2019-05-21 16:45:23 -04:00 |
Hong Lu
|
2473e1a75c
|
Black auto formatting.
|
2019-05-20 18:53:57 -04:00 |
Hong Lu
|
3d1c1862d9
|
Removed old data utils script.
|
2019-05-20 14:08:39 -04:00 |
Hong Lu
|
4a41ec41e8
|
Added a constant file.
|
2019-05-20 14:00:12 -04:00 |
Hong Lu
|
1393c74fb3
|
Minor updates for data class updates.
|
2019-05-20 13:59:38 -04:00 |
Hong Lu
|
9919a7bd35
|
Removed InputFeature class. Use namedtuple instead of class for input data.
|
2019-05-20 13:58:54 -04:00 |
Hong Lu
|
e81138ad08
|
Changed optimizer and number of epochs configuration.
|
2019-05-20 13:58:16 -04:00 |
Said Bleik
|
49bb116474
|
update seq classifier
|
2019-05-17 10:04:46 -04:00 |
Hong Lu
|
eef85dea41
|
Consolidated all configuration classes into a single class.
|
2019-05-16 18:11:21 -04:00 |
Hong Lu
|
7ca29691ae
|
Consolidated some utility functions into BertTokenClassifier.
|
2019-05-16 18:10:47 -04:00 |
Hong Lu
|
d87dfbc2af
|
Minor edits and added docstring.
|
2019-05-16 18:10:14 -04:00 |
Hong Lu
|
2732da2717
|
Updated NER notebook with new BertTokenClassifier class.
|
2019-05-16 18:09:40 -04:00 |
Hong Lu
|
14543fbd52
|
Added yaml configuration file for NER example.
|
2019-05-16 18:08:50 -04:00 |
Miguel González-Fierro
|
9bd941d2f8
|
Merge pull request #61 from microsoft/abhiram-gensim-limit
Added option to limit number of word vectors for glove and word2vec
|
2019-05-16 13:04:41 +01:00 |
Abhiram E
|
ce6d783adf
|
Separated the asserts in tests
|
2019-05-15 10:51:56 -04:00 |
Miguel González-Fierro
|
7aa740606d
|
Merge pull request #59 from microsoft/issue_template
issue template
|
2019-05-15 15:17:51 +01:00 |
Abhiram E
|
52d720e9bf
|
Added option to limit number of word vectors for glove and word2vec
|
2019-05-15 00:22:37 -04:00 |
miguelgfierro
|
a5144f2626
|
issue template
|
2019-05-14 12:21:40 +01:00 |
Said Bleik
|
33da65e0a3
|
Merge pull request #58 from microsoft/janhavi-update-snliNB
[Fix] SNLI notebook and preprocess.py
|
2019-05-13 21:45:01 -04:00 |
Janhavi Mahajan
|
1ed2c4dc0a
|
feat(bug fix) updated snli notebook with to_lowercase_all() instead of to_lowercase() that expects a column name list. Fixed None object returning in to_lowercase when column name list is not passed
|
2019-05-13 18:14:31 -04:00 |
Said Bleik
|
e9c17a961e
|
update BERTSequenceClassifier and notebook
|
2019-05-13 15:18:21 -04:00 |
Said Bleik
|
7430e3b178
|
updated BERTSequenceClassifier + documentation
|
2019-05-13 14:38:54 -04:00 |
Said Bleik
|
7d2d74f975
|
BERTSequenceClassifier
|
2019-05-13 16:31:58 +00:00 |
Said Bleik
|
07ca05dd04
|
Merge pull request #47 from microsoft/maidap-sentence-similarity
Baseline model notebook and embeddings trainer notebook
|
2019-05-11 01:09:28 +00:00 |
Janhavi Mahajan
|
9338f40cdc
|
Merge pull request #57 from microsoft/janhavi-fix-preprocessing-file
Preprocess utils
|
2019-05-10 16:53:37 -04:00 |
Janhavi Mahajan
|
bb5764a56a
|
feat(code fix) rm_nltk_stop_words now expects sentences and stop_word column names
|
2019-05-10 16:50:34 -04:00 |
Janhavi Mahajan
|
197d771208
|
feat(code review comments) generalize nltk utils tokenize, remove_stop_words to more than 2 sentences
|
2019-05-10 16:27:48 -04:00 |
Janhavi Mahajan
|
6e3523810a
|
feat(code review) fix to_nltk_tokens, add to_lowercase_all and to_lowercase as per Said's comments
|
2019-05-10 16:27:48 -04:00 |
Courtney Cochrane
|
2058c77a2c
|
Changing structure
|
2019-05-10 10:04:13 -04:00 |
Courtney Cochrane
|
0812acf1be
|
Change based on folder structure adjustment
|
2019-05-10 10:00:57 -04:00 |
Courtney Cochrane
|
b1b5ec1b97
|
Add in timer to time each of the embedding trainers
|
2019-05-09 15:16:13 -04:00 |
Courtney Cochrane
|
17dee0d7a3
|
Add detail to data loading part, add links to original papers
|
2019-05-09 14:58:12 -04:00 |
Courtney Cochrane
|
12125a350d
|
add links to original model papers
|
2019-05-09 14:58:12 -04:00 |
Casey Hong
|
4517c0697c
|
persist clean_snli
|
2019-05-09 14:58:12 -04:00 |
Abhiram E
|
49595b8666
|
Moved urls to module constants for pretrained embedding utils.
|
2019-05-09 14:58:12 -04:00 |
Courtney Cochrane
|
bc2f51193c
|
revert Readme to latest branch changes
|
2019-05-09 14:58:12 -04:00 |
Courtney Cochrane
|
aca561829f
|
README edit
|
2019-05-09 14:58:12 -04:00 |
Courtney Cochrane
|
d1b99225a4
|
Integrate fastText embedding loader and small edits
|
2019-05-09 14:58:12 -04:00 |