Balancing efficiency and accuracy: incremental learning as a key to big data processing

Authors

  • M.V. Talakh, Yuriy Fedkovych Chernivtsi National University
  • Yu.O. Ushenko, Yuriy Fedkovych Chernivtsi National University
  • O.V. Kaduk, Vinnytsia National Technical University
  • M.Yu. Maksymovych, Yuriy Fedkovych Chernivtsi National University

DOI:

https://doi.org/10.31649/1681-7893-2024-48-2-45-57

Keywords:

incremental learning, big data, machine learning, streaming data processing, concept drift, catastrophic forgetting, adaptive algorithms, online learning.

Abstract

The article provides a comprehensive overview of incremental learning in the context of big data processing. It covers the basic concepts, current approaches, and key aspects of incremental learning. The advantages of this approach for processing large volumes of data are analyzed, including efficient use of computing resources, real-time processing of streaming data, and adaptability to changes in the data. The main limitations and challenges are examined: catastrophic forgetting, the difficulty of balancing new and old knowledge, dependence on the order in which data arrive, and potential loss of accuracy. Specific problems are analyzed, including the handling of concept drift, imbalanced classes, and missing features. Applications of incremental learning in data analytics, robotics, autonomous driving, and activity recognition are discussed. Directions for future research are suggested to address the identified problems and improve the effectiveness of incremental learning for big data.
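As a rough illustration of the idea summarized above, the sketch below updates a model one example at a time instead of retraining on the full dataset. It is not taken from the article: the class `OnlineLogisticRegression`, the generator `synthetic_stream`, and the helper `prequential_accuracy` are hypothetical names, the data are synthetic, and plain stochastic gradient descent on a logistic model is only one simple way to learn incrementally.

```python
import math
import random


class OnlineLogisticRegression:
    """Binary classifier updated one example at a time (plain SGD)."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, x):
        z = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def learn_one(self, x, y):
        """Take a single gradient step on the log-loss for one (x, y) pair."""
        error = self.predict_proba(x) - y
        for i, xi in enumerate(x):
            self.weights[i] -= self.learning_rate * error * xi
        self.bias -= self.learning_rate * error


def synthetic_stream(n, seed=0):
    """Hypothetical data stream: the label is 1 when x0 + x1 > 0."""
    rng = random.Random(seed)
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        yield x, 1 if x[0] + x[1] > 0 else 0


def prequential_accuracy(model, stream):
    """Test-then-train: predict each example first, then learn from it."""
    correct = total = 0
    for x, y in stream:
        correct += int((model.predict_proba(x) >= 0.5) == bool(y))
        total += 1
        model.learn_one(x, y)
    return correct / total


model = OnlineLogisticRegression(n_features=2)
accuracy = prequential_accuracy(model, synthetic_stream(2000))
```

Memory use here depends only on the number of features, not on the length of the stream, which is the resource advantage the abstract refers to. The sketch makes no attempt to handle concept drift or catastrophic forgetting.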

Author Biographies

M.V. Talakh, Yuriy Fedkovych Chernivtsi National University

Ph.D., Assistant Professor, Computer Science Department

Yu.O. Ushenko, Yuriy Fedkovych Chernivtsi National University

D.Sc., Professor, Computer Science Department

O.V. Kaduk, Vinnytsia National Technical University

Ph.D., Associate Professor, Department of Computer Engineering

M.Yu. Maksymovych, Yuriy Fedkovych Chernivtsi National University

Master's student in Computer Science, Department of Computer Science

Published

2024-11-19

How to Cite

[1]
M. Talakh, Y. Ushenko, O. Kaduk, and M. Maksymovych, “Balancing efficiency and accuracy: incremental learning as a key to big data processing”, Опт-ел. інф-енерг. техн., vol. 48, no. 2, pp. 45–57, Nov. 2024.

Section

OptoElectronic/Digital Methods and Systems for Image/Signal Processing
