Optical-geometric features of medicinal packaging in automated image recognition problems

Ye.O. Datsok; O.V. Yakovleva

doi:10.31649/1681-7893-2026-51-1-130-138

Authors

Ye.O. Datsok Kharkiv National University of Radio Electronics https://orcid.org/0009-0008-5101-5217
O.V. Yakovleva Kharkiv National University of Radio Electronics https://orcid.org/0000-0002-6129-6146

DOI:

https://doi.org/10.31649/1681-7893-2026-51-1-130-138

Keywords:

automated image recognition, multimodal models, OCR, pharmaceutical packaging, computer vision, optical characteristics, image processing, multimodal analysis

Abstract

This paper analyzes the optical and geometric characteristics of pharmaceutical packaging as they affect AI-based image recognition. Medication packaging is treated as a complex object for automated image analysis: its geometry, reflective surfaces, small text, multilingual labeling, and illumination conditions all influence recognition quality. The limitations of classical OCR approaches on this type of packaging are examined, in particular text deformation on curved surfaces, glare artifacts, low contrast, and complex image structure. Practical recommendations for photographing packaging to improve recognition stability are also discussed. The findings indicate that the optical characteristics of an image significantly influence the effectiveness of AI-based analysis and should be taken into account when designing multimodal recognition systems.
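As an illustration of how glare and low contrast might be screened for before an image is passed to recognition, the following is a minimal sketch and not the authors' method. It assumes a grayscale image given as a list of rows with intensities in 0..255; the function names and the thresholds (`saturation_level`, `max_glare`, `min_contrast`) are hypothetical values chosen for the example only.

```python
from statistics import pstdev


def glare_fraction(gray, saturation_level=250):
    """Share of pixels at or above saturation_level (a rough proxy for glare)."""
    pixels = [p for row in gray for p in row]
    return sum(p >= saturation_level for p in pixels) / len(pixels)


def rms_contrast(gray):
    """Population standard deviation of intensities scaled to [0, 1]."""
    pixels = [p / 255 for row in gray for p in row]
    return pstdev(pixels)


def needs_retake(gray, max_glare=0.05, min_contrast=0.15):
    """Flag a photo whose glare share is too high or whose contrast is too low.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    return glare_fraction(gray) > max_glare or rms_contrast(gray) < min_contrast


if __name__ == "__main__":
    sample = [[0, 200], [200, 0]]  # tiny synthetic image, no saturated pixels
    print(glare_fraction(sample), rms_contrast(sample), needs_retake(sample))
```

A check of this kind could run on the device before upload, so that the user is asked to change the angle or lighting rather than the recognition stage receiving an unusable image.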

Author Biographies

Ye.O. Datsok, Kharkiv National University of Radio Electronics

Student

O.V. Yakovleva, Kharkiv National University of Radio Electronics

PhD, Associate Professor
