This tool helps automatically generate grammatically valid synthetic code-mixed data using linguistic theories such as Equivalence Constraint Theory and Matrix Language Theory.
natural-language-processing
python3
code-switching
linguistics
synthetic-data-generation
code-mixing
data-generation
language-modeling
Updated 2024-07-31 00:01:52 +03:00
Genalog is an open-source, cross-platform Python package for generating synthetic document images with custom degradations and text alignment capabilities.
machine-learning
python
data-science
ner
synthetic-data
synthetic-data-generation
data-generation
ocr-recognition
synthetic-images
text-alignment
Updated 2023-07-20 18:03:32 +03:00