Commit graph

273 commits

Author SHA1 Message Date
Benjamin Kilimnik e0dfd585d8 update organizations and driver license providers, add religions 2022-11-03 04:47:50 -04:00
Benjamin Kilimnik 25927d47e8 update formatting 2022-11-02 13:48:40 -04:00
Benjamin Kilimnik bb45a73834 update imports 2022-11-02 13:46:20 -04:00
Benjamin Kilimnik 3af26520ce update group_tokens in plotter 2022-11-02 13:44:37 -04:00
Benjamin Kilimnik abffbed7a9 update plots for false positives, false negatives 2022-11-02 13:41:28 -04:00
Benjamin Kilimnik 2b4e0652b4 update plotting 2022-11-01 14:02:07 -04:00
Omri Mendels 01a3873bc4 Merge pull request #52 from benkilimnik/update-flair: Update Flair Model Training 2022-10-26 10:15:27 +03:00
Benjamin Kilimnik b5e6e66678 update docstrings 2022-10-26 02:21:39 -04:00
Benjamin Kilimnik c06b9d3000 update remove_unsupported_entities 2022-10-26 02:18:31 -04:00
Benjamin Kilimnik 49bf2297e8 add plots 2022-10-26 02:17:30 -04:00
Benjamin Kilimnik 12a9fd07ad add dataset loading utils 2022-10-26 02:02:44 -04:00
Benjamin Kilimnik ddd189ff25 remove fast embeddings 2022-10-26 01:34:26 -04:00
Benjamin Kilimnik f85210111a format 2022-10-26 01:31:15 -04:00
Benjamin Kilimnik 83b4a16cfb update flair model training 2022-10-26 01:26:18 -04:00
Benjamin Kilimnik 5164386e8b add data loading, filtering utility functions 2022-10-26 01:05:15 -04:00
Omri Mendels 6146d2ef25 Merge pull request #48 from Robbie-Palmer/tokenize_with_any_model: Tokenize with any model 2022-08-04 19:08:56 +03:00
Robbie Palmer b6d51ff1b2 Undo formatting changes 2022-08-04 16:11:53 +01:00
Omri Mendels 7f99e694c3 Merge pull request #49 from microsoft/bug/remove_old_spacy_model: Removed spacy model from requirements.txt 2022-08-03 18:27:08 +03:00
Omri Mendels ac4c8cdc08 Update azure-pipelines.yml 2022-08-03 18:19:16 +03:00
Omri Mendels 0d8f77f679 Removed spacy model from requirements.txt 2022-08-03 18:15:21 +03:00
Robbie Palmer 34e28a3c72 Fix experiment_tracker.py bugs that prevent execution; Format presidio_pseudonymize.py 2022-08-02 09:35:42 +01:00
Robbie Palmer b33868aa39 Enable any spacy model to be specified for tokenisation; Currently most functions that execute tokenisation use the default en_core_web_sm model 2022-08-02 09:34:22 +01:00
Omri Mendels 17bd4229e8 Merge pull request #47 from Robbie-Palmer/patch-1: Add Presidio packages required dependencies 2022-07-20 14:42:32 +03:00
Robbie Palmer 435cc04aef Add Presidio packages to install_requires; The `presidio_anonymizer` package is used by `presidio_pseudonymize`; The `presidio_analyzer` package is used by `presidio_pseudonymize`, `scorers` and `models` 2022-07-20 12:24:24 +01:00
Omri Mendels c49124672f Merge pull request #45 from microsoft/data_generator_v2: Data generator v2 2022-05-09 15:03:41 +03:00
omri374 3d25530d12 bugfix 2022-05-06 20:02:56 +03:00
Omri Mendels f92bb67cf4 Update README.md 2022-04-27 10:34:38 +03:00
omri374 7b51bdecbe removed sklearn crfsuite 2022-02-28 11:27:03 +02:00
omri374 73e63043df Merge branch 'data_generator_v2' of https://github.com/microsoft/presidio-research into data_generator_v2 2022-02-24 13:34:53 +02:00
omri374 f3ff907eba data 2022-02-24 13:34:22 +02:00
Omri Mendels 4966be58cd Update azure-pipelines.yml 2022-02-10 18:56:58 +02:00
omri374 03a8464a5b more updates to dep versions 2022-01-24 15:21:45 +02:00
omri374 34feb10647 loosen numpy requirement 2022-01-24 13:33:22 +02:00
omri374 4b5165eb63 added spacy model to ci 2022-01-24 12:20:11 +02:00
omri374 5b36f39086 removed Corpus type for cases where flair is not installed 2022-01-24 12:15:18 +02:00
omri374 746c091dfd updates to tests, crf and bug fixes 2022-01-24 12:00:12 +02:00
omri374 5655404a2c black and flake8-ing the entire code 2022-01-20 00:04:18 +02:00
omri374 c9151c64c2 updates to fake name generator 2022-01-16 01:50:36 +02:00
omri374 f8d08bfa19 minor updates to fake name generator data and processing 2022-01-16 01:27:48 +02:00
omri374 0025339491 bug fix 2022-01-15 23:30:42 +02:00
omri374 ec7d7ac50a updates to notebooks and some evaluation logic, experiment tracking 2022-01-15 00:42:14 +02:00
omri374 fa7e1d637a more updates 2022-01-08 02:30:23 +02:00
omri374 f575731de7 new faker based generator and package updates 2021-12-26 17:37:45 +02:00
Omri Mendels 5ab101b82b Merge pull request #43 from microsoft/omri/faker_record_generator: Generating fake data from existing records 2021-11-05 17:04:30 +02:00
omri374 820dcbc9d9 more pythonic return 2021-10-26 22:14:50 +03:00
omri374 36cef6f27a added one more example 2021-10-26 22:09:39 +03:00
Omri Mendels 4231b14439 updates to docstring and comments 2021-10-26 21:49:35 +03:00
Omri Mendels d056f667e1 Update Generate data.ipynb 2021-10-26 21:40:11 +03:00
omri374 5ba363dde3 Merge branch 'data_generator_v2' into omri/faker_record_generator 2021-10-26 21:38:57 +03:00
omri374 6126e9d3f0 Merge branch 'omri/faker_record_generator' of https://github.com/microsoft/presidio-research into omri/faker_record_generator (Conflicts: notebooks/data generation/Generate data.ipynb) 2021-10-26 21:37:49 +03:00