Benjamin Kilimnik
e0dfd585d8
update organizations and driver license providers, add religions
2022-11-03 04:47:50 -04:00
Benjamin Kilimnik
25927d47e8
update formatting
2022-11-02 13:48:40 -04:00
Benjamin Kilimnik
bb45a73834
update imports
2022-11-02 13:46:20 -04:00
Benjamin Kilimnik
3af26520ce
update group_tokens in plotter
2022-11-02 13:44:37 -04:00
Benjamin Kilimnik
abffbed7a9
update plots for false positives, false negatives
2022-11-02 13:41:28 -04:00
Benjamin Kilimnik
2b4e0652b4
update plotting
2022-11-01 14:02:07 -04:00
Omri Mendels
01a3873bc4
Merge pull request #52 from benkilimnik/update-flair
Update Flair Model Training
2022-10-26 10:15:27 +03:00
Benjamin Kilimnik
b5e6e66678
update docstrings
2022-10-26 02:21:39 -04:00
Benjamin Kilimnik
c06b9d3000
update remove_unsupported_entities
2022-10-26 02:18:31 -04:00
Benjamin Kilimnik
49bf2297e8
add plots
2022-10-26 02:17:30 -04:00
Benjamin Kilimnik
12a9fd07ad
add dataset loading utils
2022-10-26 02:02:44 -04:00
Benjamin Kilimnik
ddd189ff25
remove fast embeddings
2022-10-26 01:34:26 -04:00
Benjamin Kilimnik
f85210111a
format
2022-10-26 01:31:15 -04:00
Benjamin Kilimnik
83b4a16cfb
update flair model training
2022-10-26 01:26:18 -04:00
Benjamin Kilimnik
5164386e8b
add data loading, filtering utility functions
2022-10-26 01:05:15 -04:00
Omri Mendels
6146d2ef25
Merge pull request #48 from Robbie-Palmer/tokenize_with_any_model
Tokenize with any model
2022-08-04 19:08:56 +03:00
Robbie Palmer
b6d51ff1b2
Undo formatting changes
2022-08-04 16:11:53 +01:00
Omri Mendels
7f99e694c3
Merge pull request #49 from microsoft/bug/remove_old_spacy_model
Removed spacy model from requirements.txt
2022-08-03 18:27:08 +03:00
Omri Mendels
ac4c8cdc08
Update azure-pipelines.yml
2022-08-03 18:19:16 +03:00
Omri Mendels
0d8f77f679
Removed spacy model from requirements.txt
2022-08-03 18:15:21 +03:00
Robbie Palmer
34e28a3c72
Fix experiment_tracker.py bugs that prevent execution
Format presidio_pseudonymize.py
2022-08-02 09:35:42 +01:00
Robbie Palmer
b33868aa39
Enable any spacy model to be specified for tokenisation
Currently most functions that execute tokenisation use the default en_core_web_sm model
2022-08-02 09:34:22 +01:00
Omri Mendels
17bd4229e8
Merge pull request #47 from Robbie-Palmer/patch-1
Add Presidio packages required dependencies
2022-07-20 14:42:32 +03:00
Robbie Palmer
435cc04aef
Add Presidio packages to install_requires
The `presidio_anonymizer` package is used by `presidio_pseudonymize`
The `presidio_analyzer` package is used by `presidio_pseudonymize`, `scorers` and `models`
2022-07-20 12:24:24 +01:00
Omri Mendels
c49124672f
Merge pull request #45 from microsoft/data_generator_v2
Data generator v2
2022-05-09 15:03:41 +03:00
omri374
3d25530d12
bugfix
2022-05-06 20:02:56 +03:00
Omri Mendels
f92bb67cf4
Update README.md
2022-04-27 10:34:38 +03:00
omri374
7b51bdecbe
removed sklearn crfsuite
2022-02-28 11:27:03 +02:00
omri374
73e63043df
Merge branch 'data_generator_v2' of https://github.com/microsoft/presidio-research into data_generator_v2
2022-02-24 13:34:53 +02:00
omri374
f3ff907eba
data
2022-02-24 13:34:22 +02:00
Omri Mendels
4966be58cd
Update azure-pipelines.yml
2022-02-10 18:56:58 +02:00
omri374
03a8464a5b
more updates to dep versions
2022-01-24 15:21:45 +02:00
omri374
34feb10647
loosen numpy requirement
2022-01-24 13:33:22 +02:00
omri374
4b5165eb63
added spacy model to ci
2022-01-24 12:20:11 +02:00
omri374
5b36f39086
removed Corpus type for cases where flair is not installed
2022-01-24 12:15:18 +02:00
omri374
746c091dfd
updates to tests, crf and bug fixes
2022-01-24 12:00:12 +02:00
omri374
5655404a2c
black and flake8-ing the entire code
2022-01-20 00:04:18 +02:00
omri374
c9151c64c2
updates to fake name generator
2022-01-16 01:50:36 +02:00
omri374
f8d08bfa19
minor updates to fake name generator data and processing
2022-01-16 01:27:48 +02:00
omri374
0025339491
bug fix
2022-01-15 23:30:42 +02:00
omri374
ec7d7ac50a
updates to notebooks and some evaluation logic, experiment tracking
2022-01-15 00:42:14 +02:00
omri374
fa7e1d637a
more updates
2022-01-08 02:30:23 +02:00
omri374
f575731de7
new faker based generator and package updates
2021-12-26 17:37:45 +02:00
Omri Mendels
5ab101b82b
Merge pull request #43 from microsoft/omri/faker_record_generator
Generating fake data from existing records
2021-11-05 17:04:30 +02:00
omri374
820dcbc9d9
more pythonic return
2021-10-26 22:14:50 +03:00
omri374
36cef6f27a
added one more example
2021-10-26 22:09:39 +03:00
Omri Mendels
4231b14439
updates to docstring and comments
2021-10-26 21:49:35 +03:00
Omri Mendels
d056f667e1
Update Generate data.ipynb
2021-10-26 21:40:11 +03:00
omri374
5ba363dde3
Merge branch 'data_generator_v2' into omri/faker_record_generator
2021-10-26 21:38:57 +03:00
omri374
6126e9d3f0
Merge branch 'omri/faker_record_generator' of https://github.com/microsoft/presidio-research into omri/faker_record_generator
# Conflicts:
# notebooks/data generation/Generate data.ipynb
2021-10-26 21:37:49 +03:00