Abstract


A REVIEW OF NOVEL AND EFFECTIVE APPROACH FOR CLUSTERING DOCUMENTS

Yashashvi Hirwale, Nisha Bhalse


In the computerized analysis of text documents, a substantial number of files are examined every day. Much of the data in those files consists of unstructured text, which is difficult for researchers to examine manually. In this situation, automated search methods are of great value to the analyst. In particular, document clustering algorithms can serve as a starting point for discovering new and useful knowledge in the records under examination. Here we present an approach that is useful for making recommendations to readers of a news portal. We illustrate the proposed approach through extensive tests on datasets with the well-known k-means clustering algorithm. Experiments are performed on a ready-to-use dataset, and different characteristics of the dataset are analyzed to achieve the defined objective. In addition, two relative validity indexes are used to estimate the number of clusters automatically. The objective of this work is to increase the efficiency and precision of the existing clustering algorithm. Finally, we also present several drawbacks of some classical clustering techniques, which can be valuable for researchers and practitioners of text mining.
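As a rough illustration of the pipeline the abstract describes (vectorize documents, cluster them with k-means, and pick the number of clusters with a relative validity index), a minimal sketch in Python might look like the following. Everything specific here is an assumption rather than the authors' method: the abstract does not name its two validity indexes, so the silhouette coefficient stands in for them; the TF-IDF weighting, the deterministic farthest-first initialization, the toy `DOCS` headlines, and the function names `tfidf_vectors`, `kmeans`, `silhouette`, and `choose_k` are all hypothetical.

```python
import math
from collections import Counter

# Toy news headlines (hypothetical data, two obvious topics).
DOCS = [
    "election vote parliament government",
    "government election minister vote",
    "parliament minister vote government",
    "football match goal team",
    "team goal league match",
    "league football team goal",
]

def tfidf_vectors(docs):
    """Whitespace-tokenize docs and return L2-normalized TF-IDF vectors."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n = len(docs)
    df = Counter(t for toks in tokenized for t in set(toks))
    # Smoothed inverse document frequency (assumed weighting scheme).
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in vocab}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vec = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, max_iter=100):
    """Lloyd's k-means with farthest-first initialization (an assumption)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(dist(p, c) for c in centroids))
        centroids.append(list(far))
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # converged: assignments no longer change
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters contribute 0."""
    clusters = {}
    for i, l in enumerate(labels):
        clusters.setdefault(l, []).append(i)
    if len(clusters) < 2:
        return -1.0
    total = 0.0
    for i, l in enumerate(labels):
        own = clusters[l]
        if len(own) == 1:
            continue
        a = sum(dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        b = min(sum(dist(points[i], points[j]) for j in other) / len(other)
                for m, other in clusters.items() if m != l)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

def choose_k(points, max_k):
    """Score k = 2..max_k by silhouette and return (best_k, scores)."""
    scores = {k: silhouette(points, kmeans(points, k))
              for k in range(2, max_k + 1)}
    return max(scores, key=scores.get), scores
```

Calling `choose_k(tfidf_vectors(DOCS), 4)` returns the candidate number of clusters with the highest silhouette score together with the per-k scores. A real news-portal recommender would need proper tokenization, stop-word removal, and a much larger corpus.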