Commit Graph

312 Commits

Author SHA1 Message Date
Omri Mendels a21a17c2cb
Add a link to model classes to simplify configuration (#1472) 2024-11-15 21:34:40 +02:00
Omri Mendels d238da9008
Update community.md (#1469) 2024-11-15 21:33:59 +02:00
Sharon Hart ce63783fcc
Unlock numpy after dropping 3.8 (#1480) 2024-11-06 15:21:45 +02:00
Omri Mendels 33808c2837
Removed python 3.8 support (EOL) and added 3.12 (#1479) 2024-11-04 11:47:02 +02:00
Akshay Karle cc31bb6198
Add a link to HashiCorp vault operator resource (#1468) 2024-10-15 12:41:35 +03:00
Roel Fauconnier 71fa64df96
docs: clarify the docs on deploying presidio to k8s (#1453) 2024-10-14 20:14:53 +03:00
Omri Mendels 21361f9e82
Updates to the transformers conf docs and yaml file (#1467) 2024-10-13 13:33:16 +03:00
Hugo Hobson c54ce2b9cf
Add UK National Insurance Number Recognizer (#1446) 2024-09-22 12:59:26 +03:00
Mark West 1bf22edcda
Update installation.md (#1439) 2024-09-09 13:43:23 +03:00
Krish Dholakia cd7e5471cb
(docs) Use Presidio across Anthropic, Bedrock, VertexAI, Azure OpenAI, etc. w/ LiteLLM Proxy (#1421) 2024-07-24 22:40:38 +03:00
Ranjan Singh d85ba6e5a7
Typo fix added missing ":" after if condition (#1419) 2024-07-22 11:59:05 +03:00
Omri Mendels d46bacb20b
minor notebook changes (#1420) 2024-07-22 11:58:46 +03:00
Roey Ben Chaim c059131383
changing predefined recognizers to use the config file (#1393) 2024-07-17 16:14:21 +03:00
Omri Mendels ac38ccae3c
NLP engine sample + refresh on samples (#1388) 2024-07-10 16:04:47 +03:00
Andreas Eberle 97a7e42b38
Fix the entity filtering of the transformer_recognizer.py analzye function (#1403) 2024-07-09 22:19:20 +03:00
Sharon Hart 2be6de12dc
Fix ports in docs (#1408) 2024-07-04 13:59:03 +03:00
Sharon Hart 4752166f76
From Pipenv to Poetry (#1391)
* Add more tomls

* fix cli versioning

* Drop pipfile

* Drop pipfile

* Drop pipfile

* keep pipfile

* keep pipfile

* without pipfile

* typo

* typo

* typo

* typo

* typo

* typo

* typo

* typo

* readmes

* drop requires

* add requires

* w/o readme

* w/o readme

* w/o readme

* pdm run

* migrate to pdm

* migrate to pdm

* readme rename

* Fix versioning and docker building

* Fix versioning and docker building

* Fix versioning and docker building

* change build be

* change build be

* PDM_VENV_WITH_PIP

* source-includes

* source-includes

* source-includes

* compose logs

* compose logs

* skip hanging

* skip hanging

* skip hanging

* skip hanging

* readd test

* fix test

* remove test

* remove test

* docs and dockerfiles

* Try with poetry

* Try with poetry

* Try with poetry

* Try with poetry

* drop numpy

* version

* no-root

* fix version extraction

* docs and dockerfiles

* revert test

* comment hanging test

* try clause
2024-06-02 12:59:21 +03:00
Roey Ben Chaim 67d583709c
Feature/analyzer documentation (#1384) 2024-05-30 11:04:17 +03:00
Sharon Hart cb0184afa2
Add Ruff linter + Apply Ruff fix (#1379)
* Add Ruff linter + Apply Ruff fix

* Move up linting

* Move up linting

* Move up linting

* docs

* docs
2024-05-12 10:16:47 +03:00
Joshua Hamilton ff31243028
Align ports with documentation and postman collection. (#1375) 2024-05-08 11:33:54 +03:00
areyesfalcon e64d8ecf10
Spanish NIE (Foreigners ID card) recognizer (#1359) 2024-04-24 22:03:42 +03:00
Omri Mendels f29e112fd3
Update conf files location (#1358) 2024-04-18 10:42:18 +03:00
krinal joshi 41e02026f3
feat: Add new recognizer for IN_VOTER #1344 (#1345) 2024-04-16 07:47:43 +03:00
Hiten Mandaliya 5ea004d8ad
New Predefined Recognizer for Indian Passport #1350 (#1351) 2024-04-15 23:14:39 +03:00
honderr c7fa82518d
Added Finnish Personal Identity Code Recognizer. (#1349) 2024-04-09 17:57:58 +03:00
Milton db8ff82541
feat: Implement user-defined entity selection strategies in Presidio Structured (#1319) 2024-03-20 15:49:14 +02:00
Vijay Devane 733cca26cf
Adding Span Marker Recognizer Sample (#1321)
* Adding Span Marker Recognizer Sample

* Removing "O" label

* Added parameters definitions

* Added span marker sample in list of samples
2024-03-13 06:51:43 +02:00
Andreas Varotsis 1911a3d6be
Update spacy_stanza.md (#1325)
Added bugfix from https://github.com/microsoft/presidio/issues/1298
2024-03-07 19:30:43 +02:00
Milton d71c5fbf47
feat: Add Singapore UEN Recognizer (#1315) 2024-02-28 17:13:26 +02:00
Omri Mendels 59af84d131
Added tesseract to installation (#1312) 2024-02-26 09:24:50 +02:00
Omri Mendels 173b52726c
Bugfix in tutorial (#1310) 2024-02-23 18:34:09 +02:00
Devopam Mittra dee6562ab5
predefined pattern recognizer : IN_VEHICLE_REGISTRATION (#1288)
* IN_PAN pattern recognizer

Added India PAN (Permanent Account Number) recognizer

* refined IN_PAN regex

refined the regex for better recognition and enhanced the test cases accordingly

* Update recognizer_registry.py

Fixed lint error that was missed earlier.

* Fixed Lint errors

Added test cases , verification and context data

* Added more test cases in test_in_pan_recognizer.py

Added negative test cases per review comments.

* added IN_AADHAAR recognizer

* Update in_aadhaar_recognizer.py

linted code

* Update in_aadhaar_recognizer.py

update pattern recognizer value per suggestion in review

* added utility function class

added PresidioAnalyzerUtils class with generic functions. removed usage of stdnum

* Create test_analyzer_utils.py

added test cases for analyzer_utils.py in prescribed format

* Update test_recognizer_registry.py

added to the count of predefined recognizers

* added predefined recognizer : IN_VEHICLE_REGISTRATION

Added India specific predefined pattern recognizer for vehicle registration number

* review comments incorporated

reinstated python 3.9 compatibility, reorganized code

* review comments incorporated

Logic reverted from analyzer_utils to recognizer classfile

* added null/min vehicle number size

added min size check to avoid failures per review comment

* incorporated review comments

---------

Co-authored-by: Omri Mendels <omri374@users.noreply.github.com>
2024-02-21 09:47:29 +02:00
Omri Mendels 7f09c95b9d
added pseudonimyzation sample (#1296) 2024-02-13 13:55:44 +02:00
Omri Mendels ee84d70b0d
Added the option to add custom operators + pseudonymization sample (#1284) 2024-02-12 15:45:52 +02:00
Omri Mendels 91d8fc7a17
Added docs for structured (#1287) 2024-02-12 15:42:34 +02:00
Devopam Mittra 4008f36828
Add predefined_recognizer: IN_AADHAAR (#1256) 2024-01-30 21:24:11 +02:00
Omri Mendels 607f58c6c0
bug fix in tutorial (#1266) 2024-01-29 22:44:00 +02:00
Andy 3c7eb8909a
minor typo fix (#1264) 2024-01-24 22:30:32 +01:00
Jakob Serlier 966d17a5a5
Feature/presidio-structured (#1192)
* presidio-structured

changelog

Static analysis

docstrings, types

preliminary tests engine

static analysis

isort

Minor refactorings

Update README.md

Fix late binding issues and example

removal of old samples

Refactoring, adding example

pre-clean-break-commit

broken commit, fixing TabularConfigBuilder

Rename TabularConfig

pre-breaking replace commit

removal of some old experimental files

rename tabular to structured

restructuring presidio tabular - pre del commit

Add project TODOs

testing dump presidio tabular

* Add unit tests

* rename engine, add buildfile

* Update setup.py

* lint-build-test

* Update lint-build-test.yml

* Add packages to setup.py

* Update presidio-structured to alpha version

* Update Presidio structured README.md

* Add logging configuration to presidio-structured
module

* Refactor AnalysisBuilder constructor to accept an
optional AnalyzerEngine parameter

* Fix entity mapping in JsonAnalysisBuilder

* Drop type in docstring in analysis builder classes

* Refactor TabularAnalysisBuilder to use
BatchAnalyzerEngine for all columns

* Update data_reader.py with type hints for file
paths

* Update data_reader.py to include additional
keyword arguments in read() method

* Update Transformer to Processor term in
StructuredEngine

* Add PandasDataProcessor as default to StructuredEngine
init

* Move structured sample files to the docs

* Add Presidio Structured  Notebook to samples index

* Remove unnecessary imports in structured sample

* Update to processors in structured __init__ files

* Add explanation for structured table sample

* Delete unnecessary __init__s in structured test

* Fix bug in JsonAnalysisBuilder entity mapping

* pr comments, nits, minor tests

* README

* Add TabularAnalysisBuilder

* Some basic logging

* linting

* Fix typo in logger variable name

* Refactor analysis builder to include score
threshold

* Linting, continued

* Update Pipfile

* Refactor JsonAnalysisBuilder to support language
parameter

* Fix not camel case in TabularAnalysisBuilder

* Add score_threshold parameter to AnalysisBuilder

* Refactor JSON analysis builder to gain consistency

* Remove low score results in JsonAnalysisBuilder

* Add tests to json analysis  with score threshold

* Fix bug in JSON analysis to update map with
nested_mappings

* Fix bug in JSON analysis to take only entity types

* Fix typos in test anl json names and assert values

* Update build-structured.yml

* Create __init__.py

* Type hint fix python <3.10, loggger typo

* Update setup.py

* PR comments variety

* further pr comments

* readme, refactor score, refactor tabular analysis

* Update test_analysis_builder.py

* lint

---------

Co-authored-by: Omri Mendels <omri374@users.noreply.github.com>
Co-authored-by: Sharon Hart <sharonh.dev@gmail.com>
Co-authored-by: enrique.botia <enrique.botia@netzima.com>
2024-01-14 11:22:44 +02:00
Omri Mendels 2a8d3ec93b
Azure AI language recognizer (#1228) 2024-01-11 13:13:43 +02:00
Vijay Devane 175c170876
Update index.md (#1241)
Removed the extra symbol that causing error to load the site
2023-12-28 21:20:10 +02:00
Johannes Goslar a3facc8882
Fix spacy_stanza.md example (#1235) 2023-12-26 18:04:03 +02:00
Ari Roffe e7197daa35
Update adding_recognizers.md (#1232) 2023-12-17 18:23:07 +02:00
Ari Roffe 6c22e00e87
Update index.md (#1233)
Seems like https://github.com/microsoft/presidio/pull/1177 broke the link from home page to samples
2023-12-17 11:59:42 +02:00
Omri Mendels b24dbeda54
Updates to demo website with new NLP Engine (#1181) 2023-12-13 18:57:10 +02:00
Etienne 2b7f8b99b5
Change ner_model_configuration from list to map (#1222) 2023-11-30 19:02:27 +02:00
Omri Mendels 6c4cb5495e
Update index.md 2023-11-30 13:05:04 +02:00
Omri Mendels 883afe7a5d
Added a survey link (#1224) 2023-11-30 10:24:46 +02:00
Omri Mendels b756c174ca
Enable regex flags manipulation (#1193) 2023-10-26 12:07:14 +03:00
Omri Mendels 4aaa05fca8
New NlpEngine - docs (#1177) 2023-10-25 15:17:56 +03:00