Balancing efficiency and accuracy: incremental learning as a key to big data processing
DOI:
https://doi.org/10.31649/1681-7893-2024-48-2-45-57
Keywords:
incremental learning, Big Data, machine learning, streaming data processing, conceptual drift, catastrophic forgetting, adaptive algorithms, online learning.
Abstract
The article provides a comprehensive overview of incremental learning in the context of big data processing. It covers the basic concepts, modern approaches, and key aspects of incremental learning. The advantages of this approach for processing large amounts of data are analyzed, including the efficient use of computing resources, the ability to process streaming data in real time, and adaptability to changes in the data. The main limitations and challenges are examined: the problem of "catastrophic forgetting", the difficulty of balancing new and old knowledge, dependence on the order of data arrival, and potential loss of accuracy. Specific problems are analyzed, including the handling of concept drift, imbalanced classes, and missing features. Applications of incremental learning in various fields, including data analytics, robotics, autonomous driving, and activity recognition, are discussed. Directions for future research are suggested to address the identified problems and improve the effectiveness of incremental learning in the context of big data.
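To make the trade-off described in the abstract concrete, here is a minimal sketch of incremental learning on a data stream. It is not taken from the article: the class `OnlineLogisticRegression`, the helpers `make_stream` and `prequential_accuracy`, the synthetic two-feature data, and the abrupt label flip that stands in for concept drift are all assumptions chosen for illustration. The model sees each sample once, updates in constant memory, and is scored with a test-then-train (prequential) protocol, so the accuracy history shows both the cost of drift and the adaptation that follows.

```python
import math
import random


class OnlineLogisticRegression:
    """Hypothetical minimal learner: one SGD step per incoming sample."""

    def __init__(self, n_features, learning_rate=0.5):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, x):
        z = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def predict(self, x):
        return 1 if self.predict_proba(x) >= 0.5 else 0

    def learn_one(self, x, y):
        # Gradient of the log-loss for a single sample; no past data is stored.
        error = self.predict_proba(x) - y
        for i, xi in enumerate(x):
            self.weights[i] -= self.learning_rate * error * xi
        self.bias -= self.learning_rate * error


def make_stream(n_samples, drift_at, seed=0):
    """Synthetic stream (assumption): the labelling rule flips at `drift_at`."""
    rng = random.Random(seed)
    for i in range(n_samples):
        x = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
        positive = x[0] > 0 if i < drift_at else x[0] < 0
        yield x, int(positive)


def prequential_accuracy(model, stream, window=200):
    """Test-then-train: predict each sample first, then learn from it."""
    history, correct, seen = [], 0, 0
    for x, y in stream:
        correct += int(model.predict(x) == y)
        model.learn_one(x, y)
        seen += 1
        if seen % window == 0:
            history.append(correct / window)
            correct = 0
    return history


model = OnlineLogisticRegression(n_features=2)
history = prequential_accuracy(model, make_stream(4000, drift_at=2000))
for index, accuracy in enumerate(history):
    print(f"samples {index * 200:4d}-{index * 200 + 199:4d}: accuracy {accuracy:.2f}")
```

In this sketch the window that starts at the drift point scores noticeably worse than the one before it, and later windows recover as the weights are overwritten. The same overwriting is what discards the old concept, which is a small-scale picture of the forgetting problem the abstract mentions: a plain online learner adapts quickly precisely because it retains nothing.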
License
Authors who publish in this journal agree to the following terms:
- Authors retain the authorship rights to their work and grant the journal the right of first publication of the work under a Creative Commons Attribution License, which allows others to freely distribute the published work provided they credit the authors of the original work and its first publication in this journal.
- Authors may enter into separate, additional agreements for the non-exclusive distribution of the work in the form in which it was published by this journal (for example, depositing it in an institutional repository or publishing it as part of a monograph), provided that the reference to its first publication in this journal is preserved.
- Journal policy permits and encourages authors to post the manuscript online (for example, in institutional repositories or on personal websites) both before submitting it to the editors and during editorial processing, since this fosters productive scholarly discussion and has a positive effect on how quickly and how often the published work is cited (see The Effect of Open Access).