Commit graph

93 commits

Author SHA1 Message Date
Omri Mendels e08f44b273
Fix #1442 (#1445) 2024-09-01 18:05:16 +03:00
Omri Mendels edd722d01b
version 2.2.355 (#1410)
* analyzer version

* Update pyproject.toml

* Update pyproject.toml

* Update CHANGELOG.md

* Update CHANGELOG.md

* Update CHANGELOG.md
2024-07-18 08:28:38 +03:00
Roey Ben Chaim c059131383
changing predefined recognizers to use the config file (#1393) 2024-07-17 16:14:21 +03:00
Sharon Hart 4752166f76
From Pipenv to Poetry (#1391)
* Add more tomls

* fix cli versioning

* Drop pipfile

* Drop pipfile

* Drop pipfile

* keep pipfile

* keep pipfile

* without pipfile

* typo

* typo

* typo

* typo

* typo

* typo

* typo

* typo

* readmes

* drop requires

* add requires

* w/o readme

* w/o readme

* w/o readme

* pdm run

* migrate to pdm

* migrate to pdm

* readme rename

* Fix versioning and docker building

* Fix versioning and docker building

* Fix versioning and docker building

* change build be

* change build be

* PDM_VENV_WITH_PIP

* source-includes

* source-includes

* source-includes

* compose logs

* compose logs

* skip hanging

* skip hanging

* skip hanging

* skip hanging

* readd test

* fix test

* remove test

* remove test

* docs and dockerfiles

* Try with poetry

* Try with poetry

* Try with poetry

* Try with poetry

* drop numpy

* version

* no-root

* fix version extraction

* docs and dockerfiles

* revert test

* comment hanging test

* try clause
2024-06-02 12:59:21 +03:00
Sharon Hart e65c89c687
Migrate Python Packaging to pyproject.toml (#1383)
* Migrate Python Packaging to pyproject.toml

* Migrate Python Packaging to pyproject.toml

* Add more tomls

* fix pipeline

* fix cli versioning

* Update pyproject.toml
2024-05-19 08:16:51 +03:00
Sharon Hart 2d92539fca
Fix N818, E721 (#1382)
* Add Ruff linter + Apply Ruff fix

* Move up linting

* Move up linting

* Move up linting

* docs

* Autoformatting, fix D rules

* isort skip

* Fix N818, E721

* drop comment
2024-05-12 15:05:41 +03:00
Sharon Hart 51dc5c6d9d
Auto-formatting, fix D rules (#1381)
* Add Ruff linter + Apply Ruff fix

* Move up linting

* Move up linting

* Move up linting

* docs

* Autoformatting, fix D rules

* skip isort files

* isort skip
2024-05-12 13:12:55 +03:00
Sharon Hart cb0184afa2
Add Ruff linter + Apply Ruff fix (#1379)
* Add Ruff linter + Apply Ruff fix

* Move up linting

* Move up linting

* Move up linting

* docs

* docs
2024-05-12 10:16:47 +03:00
RoeyBC 2805c86221
Loading analyzer engine & recognizer registry from configuration file (#1367) 2024-05-01 15:01:50 +03:00
Omri Mendels ffa29f8d65
Fixed wrong condition for dicom metadata (#1347)
* fixed wrong condition for dicom metadata

* added redacted file for tests
2024-03-29 16:10:02 +03:00
Aivaras Baranauskas b6cc0d76b3
Moved py.typed files to proper locations (#1271)
Co-authored-by: Omri Mendels <omri374@users.noreply.github.com>
2024-02-08 20:14:48 +02:00
Roberto Aguiar Lima cf5757c41c
Bugfix issue #1274 (#1275) 2024-02-08 14:05:13 +02:00
Omri Mendels 56fb84d353
Add missing dependencies for image-redactor (#1257) 2024-01-22 09:24:25 +02:00
Jenny Lee dfa1f2377f
fixes bug #1227 (#1231) 2023-12-26 14:38:25 +02:00
Sharon Hart d68c44b2ca
Change default score threshold in image redactor (#1210) 2023-11-12 13:21:27 +02:00
Md Ashhar ac56089dfe
Added py.typed file and modified the setup.py files (#1201) 2023-11-03 23:55:56 +02:00
Omri Mendels 22619f3198
Put org in ignore as it has many FPs (#1200) 2023-10-31 23:48:44 +02:00
Gord Lueck 818c80f978
Msft document intelligence ocr (#1184) 2023-10-18 21:37:58 +03:00
ayabel 6cf5f18cb4
added image processing class to preprocess the image before running OCR (#1166) 2023-10-04 14:42:41 +03:00
Nile Wilson 7400dc4b35
Updating verification engines and enable plotting of custom bboxes (#1164)
* Adding methods to enable plotting of custom bboxes

* Adding test for get_pii_bboxes

* Linting fixes and addition of test for add_custom_bboxes

* Adding use of custom bbox into DICOM verification engine

* Fixing tests for the DICOM verification engine

* Adding in ocr kwargs compatibility and updating tests

* Adding example notebook

* Linting fixes

* Remaining linting fix

---------

Co-authored-by: Omri Mendels <omri374@users.noreply.github.com>
2023-09-28 16:02:10 -04:00
Nile Wilson d8541e9733
Improve bbox processor (#1163)
* Adding method decorators

* Updating remove_bbox_padding and test

* Linting fix
2023-09-06 09:42:42 -04:00
Nile Wilson 4e8490ce0c
Updating verification engines to include latest updates to redactor engines (#1162)
* Enabling use of ad-hoc recognizers in verifier

* Adding support to standard image verification engine as well

* Linting fix

* Removing redundant init

* Removing unused import
2023-09-06 09:26:30 -04:00
Nile Wilson 93934a93b2
Adding examples of toggling metadata usage and saving bboxes (#1158) 2023-08-28 08:56:56 +03:00
Nile Wilson 1a12771c90
Improve process names method in DICOM image redactor (#1150)
* Adding method to augment words more thoroughly

* Adding unit test

* PR comments changes

* Linting fix

---------

Co-authored-by: Omri Mendels <omri374@users.noreply.github.com>
2023-08-23 17:21:34 -04:00
Nile Wilson 60e1f7db94
Enabling allow list approach with all image redaction (#1145)
* Enabling allow list approach

* Adding in empty list for allow_list in existing unit tests

* Added unit tests for newly introduced methods

* Adding unit test for allow list functionality

* Linting fixes

* Removing spaces in empty lines

* Fix integration test not accounting for empty space removal

* Updating notebook with more examples and adding ad_hoc_recognizers approach to standard image redactor engine as well

* Linting fixes

* Removing incomplete example code

* Fixing section header numbers

* Removing duplicate comment

---------

Co-authored-by: Omri Mendels <omri374@users.noreply.github.com>
2023-08-23 16:44:47 -04:00
Nile Wilson 994074bbb4
Changing test exception type check (#1148) 2023-08-23 15:20:57 +03:00
Nile Wilson 8e1bf802c4
Enable toggle of printing output location after redacting from file (#1144) 2023-08-23 12:13:39 +03:00
Sharon Hart 18090c60c6
install from pipfile (#1152) 2023-08-23 11:27:33 +03:00
Nile Wilson fa861aaddf
DICOM redactor improvement: Enable selection of redact approach (#1113)
* Adding in ability to select redact approach

* Adding redact_approach arg into tests so they pass correctly

* Adding test for _get_analyzer_results

* Linting fixes

* Additional linting fixes

* Making default approach the default

* Linting fix

* Replacing patch.object() with patch() in modified tests

* Fixed mocker patch for new _get_analyzer_results in redact_and_return_bbox

* Removing old assertions brought over from incorrect merge conflict resolution

* Replacing call_count == statements with assert_called_once etc

* Replacing in-test instantiation with passing in mock_engine

* Suggested change from PR comment. redact_approach replaced with use_metadata and ability to pass in ad_hoc_recognizers

* Linting fixes

* Additional linting fixes

* Remove output that is not checked

* Commenting out whole unit test file to see impact on build pipeline hangup

* Only commenting out get_analyzer_results tests

* Only commenting out happy path test for get_analyzer_results

* Changing ad_hoc_recognizers type check to raise only TypeError

* Removing type assertion for exception test

* Reintroduce the happy path test for get_analyzer_results

* Commenting out exception test and keeping in happy path test

* Removing constants from parameterize for exception test

* Adding argument= in call to get_analyzer_results in happy path
2023-08-21 10:09:18 -04:00
Nile Wilson e323fed020
DICOM redactor improvement: Enable return of redacted bboxes (#1111)
* Enable return of bboxes used to redact pixels

* Adding return_bboxes arg values into existing tests

* Adding test for return_bbox==True condition

* Adding test for _save_bbox_json()

* Making argument name more clear

* Creating separate method to return redacted image and bboxes

* Linting fix

* Removing Union return type

* Commenting out DICOM verification engine integration test to see if that is still the cause of unit test hangup

* Renaming test and removing redundancy in unit test for dicom image redactor

* Fixing duplication of call to a single file likely from main merges

* Removing extra cases for redact() test

* Changing mocked return type from None to an empty list

* Commenting out full unit test for redact to see effect on PR build hangup

* Reintroduce verify integration test and non-parameterized redact test

* Commenting out threshold and expected length test to see impact on PR build hang-up

* Undo comment out of image analyzer engine test

* Commenting out all unit tests for dicom image redactor engine

* Comment out unit test for redact()

* Fixing typing

* Commenting out exception test for redact_and_return_bbox

* Updated how exceptions are handled for redact_and_return_bbox, return all unit tests

* Adding IsADirectoryError exception type

* Commenting out happy path test for redact_and_return_bbox

* Commenting out compressed and icon_image_sequence DICOM test input images for redact_and_return_bbox happy path test

* Commenting out the type assertions in happy path test for redact_and_return_bbox

* Commenting out the call count assertions in happy path for redact_and_return_bbox

* Update type assertion and comment out all mocking and mocking assertions for happy path test for redact_and_return_bbox

* Commenting out all assertions in happy path test for redact_and_return_bbox

* Replacing mocker.patch with mocker.patch.object for all mocked methods in happy path test for redact_and_return_bbox

* Changing all mocker.patch.object calls into mocker.patch for happy path test for redact_and_return_bbox

* Reintroduce assertions for happy path test for redact_and_return_bbox

* Turning off assertions for call count again for happy path for redact_and_return_bbox

* Making assertion for returned bbox type even more explicit for happy path test for redact_and_return_bbox

* Turning off type assertions and turning on mock call count assertions for happy path test for redact_and_return_bbox

* Replacing call count assertions with assert_called_once

* Reintroducing type assertions and changing return_value to include some placeholder mock data instead of being empty dictionaries in list

* Comment out the image type assertion

* Turning on image type assertion and turning off bbox type assertions

* Removing assertion for dict

* Using isinstance instead of type ==

* Removing assertions for bbox type

---------

Co-authored-by: Omri Mendels <omri374@users.noreply.github.com>
2023-08-02 09:58:14 -04:00
Nile Wilson 67833d5be3
DICOM redactor improvement: Enabling compatibility with compressed images (#1105)
* Adding in methods for compression and modifying existing unit test for adding redact boxes

* Adding unit tests

* Linting fixes

* Fixing exception type

* Adding in python-gdcm dependency

* Adding python-gdcm to piplock

* Updating pipfile.lock

* Switching from gdcm license to python-gdcm license

* Adding in methods for compression and modifying existing unit test for adding redact boxes

* Adding unit tests

* Linting fixes

* Fixing exception type

* Adding in python-gdcm dependency

* Switching from gdcm license to python-gdcm license

* fix amend

* comment out new tests

* revert

* Incorporating bug fix that was merged in

* Linting fix

* Temporarily commenting out one integration test to see effect on build pipeline hangup

* Adding _strip_score back in

* Commenting out entire integration test for DICOM Image PII verify engine

* Trying alternate assertion for test_eval_dicom_correctly

* Removing all assertions in test_eval_dicom_correctly

* Explicitly including argument names and adding some temporary print statements

* Using deepcopy of passed-in mock DICOM verify results

* Using deepcopy for the passed-in example instance as well

* Renaming any variables that may have overlap with existing variables in the method being tested

* Testing effect of mocking verify_dicom_instance in eval_dicom_instance

* Mocking all calls except get_bboxes_from_ocr_results in test_eval_dicom_correctly

* Removing call to method and assertions

* Commenting out test_verify_correctly

* Adding test_verify_correctly back and reducing ambiguity as much as possible

* Removing ambiguity even more

* Changing _strip_score back to not returning but keeping input arg name change

* Commenting out assertions for test_verify_correctly

* Comment out test_verify_correctly assertions and full test_eval_dicom_correctly test

* Adding assertions back into test_verify_correctly

* Commenting out last assertion in test_verify_correctly

* Reformatting to avoid use of all in assert

* Using frozen set comparison for final assert in test_verify_correctly

* Removing final assertion from test_verify_correctly while keeping test_eval_dicom_correctly in

* Commenting out act and assert in test_verify_correctly while keeping test_eval_dicom_correctly in

* Using module-level mock variables and moving act call to less lines in test_verify_correctly

* Removing all assertions but keeping other sections for both tests

* Adding back in first assertions for each test

* Removing all assertions except first image type assertion in verify test

* Removing all assertions except image type assertion in test_eval_dicom_correctly

* Remove image type assertions but keep all other assertions in

* Remove image type assertions and all assertions for eval test

* Only keep final assertion in verify test

* Only keep the second assertion in verify and nothing else

* Only keep final assertion in eval test

* Enable all assertions and move helper methods above test methods

* Simplifying assertions

* Removing unused code now that we have simplified assertions

* Reverting back but simplifying asserts another way

* Removing eval integration test since unit test for that covers functionality

* Removing analyzer results assertion

* Removing unused import

---------

Co-authored-by: Sharon Hart <sharonh.dev@gmail.com>
2023-07-13 16:00:45 -04:00
Nile Wilson a1c5c309ea
DICOM redactor improvement: Preventing distortion when multiple sets of pixels are in one instance (#1109)
* Adding in handling for instances with icon image sequence data

* Linting fix

---------

Co-authored-by: Omri Mendels <omri374@users.noreply.github.com>
2023-07-10 10:00:28 -04:00
Sharon Hart a661037841
relock image redactor (#1117) 2023-07-09 11:26:55 +03:00
Nile Wilson 08e7b8909a
Small reordering of kwargs as prereq for allow list functionality (#1110) 2023-07-06 10:39:37 -04:00
Nile Wilson 23a89af2d7
DICOM redactor improvement: Adding exceptions for when DICOM file does not have pixel data (#1104)
* Adding exceptions for when DICOM file does not have pixel data

* Updating unit tests to accommodate new exception

* Fixing linting errors

* Adding in test image

* Fixing f-string
2023-07-06 09:18:50 -04:00
Nile Wilson 42f30bde24
DICOM redactor improvement: Enabling more photometric interpretations (#1103)
* Add improved check for greyscale

* Using uuid for temp file name rather than hard coded single string
2023-07-06 08:57:40 -04:00
Omri Mendels ced96b0f4d
hotfix for removing fixed dependency versions (#1096) 2023-06-21 10:25:34 +03:00
Sharon Hart d459d44ff5
Lock with 3.8, drop building on 3.7 for 3.11 (#1080) 2023-06-01 15:32:33 +03:00
dependabot[bot] a7ff547957
Bump flask from 2.2.3 to 2.3.2 in /presidio-image-redactor (#1066)
Bumps [flask](https://github.com/pallets/flask) from 2.2.3 to 2.3.2.
- [Release notes](https://github.com/pallets/flask/releases)
- [Changelog](https://github.com/pallets/flask/blob/main/CHANGES.rst)
- [Commits](https://github.com/pallets/flask/compare/2.2.3...2.3.2)

---
updated-dependencies:
- dependency-name: flask
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-05-02 13:48:34 +03:00
dependabot[bot] 1d50d41a08
Bump werkzeug from 2.2.2 to 2.2.3 in /presidio-image-redactor (#1032) 2023-02-28 14:30:27 +02:00
Nile Wilson 9671db36d3
Adding type check for image passed to redact (#1020) 2023-01-25 09:22:47 -05:00
Nile Wilson eecbdef23e
Fixing typo (photometric interpolation -> photometric interpretation) (#1021)
* Fixing typo

* Linting fix

Co-authored-by: Omri Mendels <omri374@users.noreply.github.com>
2023-01-25 09:08:54 -05:00
Omri Mendels 105d9455c5
Install transformers model into the docker image (#912) 2023-01-25 09:06:02 +02:00
Nile Wilson 5512a39ce7
Consolidating general bounding box operations (#1011) 2023-01-24 20:56:10 +02:00
Sharon Hart 11dfa64d63
Image Redactor - REST API to support web applications payload (#1009)
* Image redactor - REST API to support web application

* revert port

* lints

* Fix versioning, bump pillow and analyzer

* lower score

* Fix versioning, bump pillow and analyzer

* try to fix test

* add e2e test

Co-authored-by: sharon <sharon.hart@microsoft.com>
Co-authored-by: Omri Mendels <omri374@users.noreply.github.com>
2023-01-18 11:24:55 +02:00
Nile Wilson bce777bb72
Enable thresholding of OCR results (#1001) 2023-01-17 21:34:30 +02:00
Nile Wilson 1d07b03866
Make displaying DICOM verification image optional (#1000) 2023-01-05 17:09:15 +02:00
Nile Wilson b0430cabbb
Add evaluation code for DICOM de-identification (#979)
* Initial commit with old paths (will need to fix)

* Fixing paths

* Adding in notebook and updating documentation

* Adding in WIP notebook and docs

* Adding example notebook

* Clearing some output to reduce file size

* Linting fix v1

* Linting v2

* Add rounding to integration test to account for minor differences in real testing

* Fixing typo with wrong variable name

* Removing check for exact or rounded conf

* Updating logic for comparing ocr_results

* Minor fixes on DICOM verify code based on PR comments

* Changing results and gt format from dictionary to list of dictionaries

* Updating notebook

* Updating docstring format

* Linting fix

* Changing fixture scope for mock_engine

* Update docs/image-redactor/evaluating_dicom_redaction.md

Co-authored-by: Omri Mendels <omri374@users.noreply.github.com>

* Docstring updates

* Improving efficiency of _remove_duplicate_entities

* Add handling for divide by zero in recall and precision calculations

* Making precision and recall calculation functions public

* Moving common fixtures to conftest

* PR comments

Co-authored-by: Omri Mendels <omri374@users.noreply.github.com>
2023-01-05 08:01:46 -05:00
Nile Wilson 9dba056745
Updating DICOM redact box fill color selection (#987)
* Adding new function to get pixel array crop

* Adding unit tests

* Removing unused mocker

* Improve code efficiency of _get_array_corners

* Allow users to specify crop_ratio

* Clarifying and fixing docstrings

Co-authored-by: Omri Mendels <omri374@users.noreply.github.com>
2023-01-04 14:05:37 -05:00
Nile Wilson a29e1bb9f8
Enabling ImagePiiVerifyEngine to accept kwargs (#978)
* Enabling ImagePiiVerifyEngine to accept kwargs

* Updating changelog

Co-authored-by: Omri Mendels <omri374@users.noreply.github.com>
Co-authored-by: Sharon Hart <sharonh.dev@gmail.com>
2022-12-19 09:09:21 -05:00