Advances in Smart Systems Research

Publisher Future Technology Publications
Vol. 6 No. 2 CIMA 2017 Workshop Papers from the KES2017 Conference
Journal ISSN 2050-8662
 
Article Title An Empirical Study of Active Learning for Text Classification
Primary Author Stamatis Karlos, Technical Educational Institute of Western Greece
Other Author(s) Nikos Fazakis; Sotiris Kotsiantis; Kyriakos Sgarbas
Pages 1 - 15
Article ID k17is-220
Publication Date 05-Nov-17
Abstract

Text categorization, or more precisely text classification, has recently attracted the interest of many researchers, since the volume of documents generated daily is vast and in many situations handling them is infeasible without appropriate Machine Learning tools. Many real-life applications belong to this field, and much research has been devoted to them over the last two decades. However, conventional learning methods do not exploit uncategorized documents, which are abundant in many fields. Thus, new learning schemes are employed to boost the learning performance of supervised algorithms. Active Learning is a representative example: it incorporates both labeled and unlabeled data and combines human expert knowledge with the predictions obtained from supervised learners. In this work, four learners are compared under two different Active Learning approaches against random sampling, examining the efficacy of annotating the unlabeled documents selected by specific queries. Classification error is recorded on two publicly available datasets, highlighting the improved learning behavior obtained by using specific queries instead of random sampling when only a very small portion of the initial data is labeled.
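
The comparison described in the abstract can be illustrated with a minimal pool-based sketch. The abstract does not name the four learners or the two query strategies, so everything below is an assumption made for illustration: a tiny multinomial Naive Bayes learner, a least-confident (uncertainty) query, a random-sampling baseline, and hypothetical toy documents whose true labels stand in for the human annotator.

```python
import math
import random
from collections import Counter


def train_nb(docs, labels):
    """Fit a multinomial Naive Bayes model with Laplace smoothing."""
    classes = sorted(set(labels))
    vocab = {w for d in docs for w in d.split()}
    counts = {c: Counter() for c in classes}
    for doc, label in zip(docs, labels):
        counts[label].update(doc.split())
    priors = {c: labels.count(c) / len(labels) for c in classes}
    return classes, vocab, counts, priors


def predict_proba(model, doc):
    """Return a dict mapping each class to its posterior probability."""
    classes, vocab, counts, priors = model
    log_scores = {}
    for c in classes:
        total = sum(counts[c].values()) + len(vocab)
        score = math.log(priors[c])
        for w in doc.split():
            if w in vocab:  # words never seen in training are ignored
                score += math.log((counts[c][w] + 1) / total)
        log_scores[c] = score
    top = max(log_scores.values())  # subtract the max for numerical stability
    exp = {c: math.exp(s - top) for c, s in log_scores.items()}
    norm = sum(exp.values())
    return {c: v / norm for c, v in exp.items()}


def least_confident(model, pool):
    """Index of the pool document whose top-class probability is lowest."""
    return min(range(len(pool)),
               key=lambda i: max(predict_proba(model, pool[i]).values()))


def active_learning(labeled, pool, oracle, budget, strategy, rng):
    """Query `budget` documents from the pool, retraining after each label."""
    docs = [d for d, _ in labeled]
    labels = [y for _, y in labeled]
    pool = list(pool)
    for _ in range(min(budget, len(pool))):
        model = train_nb(docs, labels)
        if strategy == "uncertainty":
            i = least_confident(model, pool)
        else:  # random-sampling baseline
            i = rng.randrange(len(pool))
        doc = pool.pop(i)
        docs.append(doc)
        labels.append(oracle(doc))  # the oracle plays the human annotator
    return train_nb(docs, labels), docs, labels


# Hypothetical toy data: two labeled seeds and a small unlabeled pool.
labeled = [("cheap pills offer", "spam"), ("meeting agenda attached", "ham")]
pool_truth = {
    "cheap offer today": "spam",
    "agenda for meeting": "ham",
    "pills discount offer": "spam",
    "project report attached": "ham",
}
```

Calling `active_learning(labeled, list(pool_truth), pool_truth.get, 2, "uncertainty", random.Random(0))` and then the same call with `"random"` gives two models trained under the same labeling budget; their error on a held-out set could then be compared, which is the kind of measurement the abstract reports.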

Remarks Papers presented at the 7th International Workshop on Combinations of Intelligent Methods and Applications (CIMA) as part of the 21st International Conference on Knowledge-Based and Intelligent Information & Engineering Systems, 6-8 September 2017, Marseille, France