Casey Hong
e5b12c6f32
resolve merge conflicts
2019-06-11 11:45:30 -04:00
Casey Hong
23d9635230
senteval local and azureml 📓
2019-06-06 10:57:05 -07:00
Abhiram E
f0db07fb3a
Minor change.
2019-06-06 10:20:57 -07:00
Abhiram E
5b1ed5f447
FastText loader - Code changes and unit tests.
...
1. Added methods to download, extract and load fastText vectors.
2. Added unit tests for the public method.
Other changes
1. Refactored files to add return types to docstrings.
2. Minor changes to path variables.
2019-06-06 10:20:57 -07:00
Abhiram E
2498dbaaa1
Minor changes
2019-06-06 10:18:13 -07:00
abeswara
008bfa2c57
Glove loader - Code changes and unit tests.
...
1. Added methods to download, extract and load GloVe vectors.
2. Added unit tests for the public methods.
Other changes
1. Made download and extract methods private.
2. Refactored Word2vec unit tests to exclude private methods.
2019-06-06 10:16:46 -07:00
abeswara
c9006c8b65
Word2vec loader - Code changes and unit tests.
...
1. Refactored word2vec loader to check for existing files before downloading or extracting.
2. Added unit tests for the load, download and extract functions.
2019-06-06 10:13:10 -07:00
abeswara
ae31e05a84
Word2vec loader - Code changes and unit tests.
...
1. Refactored word2vec loader to check for existing files before downloading or extracting.
2. Added unit tests for the load, download and extract functions.
2019-06-06 10:12:29 -07:00
Said Bleik
c391c0bba7
Merge pull request #84 from microsoft/abhiram-requests-fix
...
Using tqdm to show progress bar
2019-06-05 13:27:51 -04:00
Abhiram E
3ac927edfa
Using tqdm to show progress bar
2019-06-05 13:08:23 -04:00
Said Bleik
30edbfe28f
Merge pull request #82 from microsoft/abhiram-requests-fix
...
Changed url fetch from urlretrieve to requests
2019-06-04 16:51:53 -04:00
Said Bleik
9e716a60b3
Merge pull request #81 from microsoft/setup
...
Create conda generator and setup.md
2019-06-04 16:34:27 -04:00
Abhiram E
0e296b6291
Changed url fetch from urlretrieve to requests
2019-06-04 16:26:35 -04:00
miguelgfierro
6464d246e8
gitignore and conda file
2019-06-04 14:51:35 +01:00
miguelgfierro
0c24b8d7ab
🐛
2019-06-04 11:37:44 +01:00
miguelgfierro
c06fa8e170
updated setup 📝
2019-06-04 11:29:40 +01:00
miguelgfierro
8c239a61cb
updated setup 📝
2019-06-04 11:29:25 +01:00
miguelgfierro
eaf24ac5d5
conda file
2019-06-04 11:22:35 +01:00
miguelgfierro
af9f671645
update setup
2019-06-04 11:09:15 +01:00
Said Bleik
ba716d109a
Merge pull request #70 from microsoft/datasets
...
Datasets
2019-05-28 13:39:39 -04:00
Said Bleik
96b3015096
Merge pull request #72 from microsoft/abhiram-embedding-fix
...
Fix to limit the memory usage when using fasttext embedding loaders
2019-05-28 13:02:39 -04:00
Abhiram E
36d7411bec
Fix to limit the memory usage when using fasttext embedding loaders. Code changes to use the simpler version
2019-05-28 12:04:57 -04:00
miguelgfierro
7ffc3cb6f6
Merge branch 'datasets' of https://github.com/Microsoft/NLP into datasets
2019-05-28 16:06:52 +01:00
miguelgfierro
835492509b
minor 🐛 in readme
2019-05-28 15:57:56 +01:00
miguelgfierro
aee2197db4
add bigger tolerance
2019-05-28 13:58:05 +00:00
miguelgfierro
403457b3c3
refactor 💥
2019-05-28 13:38:02 +01:00
Said Bleik
2dc37f87eb
Merge pull request #67 from microsoft/test
...
Test
2019-05-24 23:34:03 -04:00
Said Bleik
4fa1aa8bcd
Merge pull request #66 from microsoft/rijai/componentgovernance
...
Adding component governance tool to build pipeline.
2019-05-24 23:31:16 -04:00
Richin Jain
620f3ebe8c
Adding component governance tool to build pipeline.
2019-05-24 15:12:19 -04:00
miguelgfierro
3c1708a21d
readme update 📝
2019-05-24 18:32:28 +00:00
miguelgfierro
c8fc93d4b6
🐛
2019-05-24 17:57:53 +00:00
miguelgfierro
f0936bd9b1
added papermill
2019-05-24 16:54:17 +00:00
miguelgfierro
0f2fcd4f83
added new notebooks
2019-05-24 15:00:18 +00:00
miguelgfierro
f03f712cfa
added data integration tests with notebooks
2019-05-24 14:26:36 +00:00
miguelgfierro
03b3b387a6
refactoring tests
2019-05-24 14:06:22 +01:00
Said Bleik
2daeb1716e
Merge pull request #64 from microsoft/casey-gensen-noblank
...
Gensen noblank bugfix + Add preprocessing tests
2019-05-22 15:14:42 -04:00
Casey Hong
a1da16f391
use fixture directly
2019-05-22 12:42:01 -04:00
Casey Hong
1cd36ccff7
fix snli noblank bug and add preprocessing tests
2019-05-21 23:00:56 -04:00
Miguel González-Fierro
9bd941d2f8
Merge pull request #61 from microsoft/abhiram-gensim-limit
...
Added option to limit number of word vectors for glove and word2vec
2019-05-16 13:04:41 +01:00
Abhiram E
ce6d783adf
Separated the asserts in tests
2019-05-15 10:51:56 -04:00
Miguel González-Fierro
7aa740606d
Merge pull request #59 from microsoft/issue_template
...
issue template
2019-05-15 15:17:51 +01:00
Abhiram E
52d720e9bf
Added option to limit number of word vectors for glove and word2vec
2019-05-15 00:22:37 -04:00
miguelgfierro
a5144f2626
issue template
2019-05-14 12:21:40 +01:00
Said Bleik
33da65e0a3
Merge pull request #58 from microsoft/janhavi-update-snliNB
...
[Fix] SNLI notebook and preprocess.py
2019-05-13 21:45:01 -04:00
Janhavi Mahajan
1ed2c4dc0a
feat(bug fix) updated snli notebook to use to_lowercase_all() instead of to_lowercase(), which expects a column name list. Fixed to_lowercase returning None when the column name list is not passed
2019-05-13 18:14:31 -04:00
Said Bleik
07ca05dd04
Merge pull request #47 from microsoft/maidap-sentence-similarity
...
Baseline model notebook and embeddings trainer notebook
2019-05-11 01:09:28 +00:00
Janhavi Mahajan
9338f40cdc
Merge pull request #57 from microsoft/janhavi-fix-preprocessing-file
...
Preprocess utils
2019-05-10 16:53:37 -04:00
Janhavi Mahajan
bb5764a56a
feat(code fix) rm_nltk_stop_words now expects sentences and stop_word column names
2019-05-10 16:50:34 -04:00
Janhavi Mahajan
197d771208
feat(code review comments) generalize nltk utils tokenize, remove_stop_words to more than 2 sentences
2019-05-10 16:27:48 -04:00
Janhavi Mahajan
6e3523810a
feat(code review) fix to_nltk_tokens, add to_lowercase_all and to_lowercase as per Said's comments
2019-05-10 16:27:48 -04:00