Please use this identifier to cite or link to this item: http://localhost:8080/xmlui/handle/123456789/991
Title: Large Scale Data Clustering Using Various-Widths Clustering Approach
Authors: Agashe, Harshal R.
Banait, S. S.
Keywords: Clustering, k-Nearest Neighbor, Tree Index, large scale data, Map Reduce
Issue Date: Jan-2017
Publisher: International Journal for Scientific Research & Development
Abstract: k-nearest neighbor (k-NN) is one of the most widely used and powerful techniques for clustering, but it incurs a large computational cost on high-dimensional datasets. This work focuses on a k-NN approach based on various-widths clustering for large-scale data. We propose a modified k-NN approach that combines a MapReduce parallel computing algorithm with cluster grouping, with the goal of reducing clustering time, preprocessing cost, and querying cost on high-dimensional data. First, we present the k-NN method using various-widths clustering to efficiently extract the k nearest neighbors of an input query object from the dataset. The dataset is clustered using a global width; then each cluster that satisfies a predefined criterion, i.e., a threshold value, is recursively clustered using its local width. Earlier work used the triangle inequality to prune unlikely clusters; we instead design a tree-based approach in which cluster centers are grouped into a tree-based index to maximize the number of clusters pruned. To reduce processing and clustering time, we design a parallel computing algorithm based on MapReduce.
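The pipeline described in the abstract can be illustrated with a minimal single-machine sketch. It is not the authors' implementation: the greedy fixed-width assignment, the halving of the width for the local pass (`shrink`), the size threshold `max_size`, and all function names are assumptions made for illustration. The query step uses the plain triangle-inequality bound that the abstract attributes to earlier work, not the tree-based index or the MapReduce stage proposed in the paper.

```python
import heapq
import math


def fixed_width_clusters(points, width):
    """Greedy pass: a point joins the first cluster whose center lies
    within `width`; otherwise it becomes the center of a new cluster."""
    clusters = []  # list of (center, members)
    for p in points:
        for center, members in clusters:
            if math.dist(p, center) <= width:
                members.append(p)
                break
        else:
            clusters.append((p, [p]))
    return clusters


def various_width_clusters(points, width, max_size, shrink=0.5):
    """Cluster with a global width, then recursively re-cluster every
    cluster larger than `max_size` using a smaller local width.
    Halving the width (`shrink`) is an assumption of this sketch."""
    result = []  # list of (center, radius, members)
    for center, members in fixed_width_clusters(points, width):
        # Clusters made only of duplicates cannot be split further.
        if len(members) > max_size and len(set(members)) > 1:
            result.extend(
                various_width_clusters(members, width * shrink, max_size, shrink)
            )
        else:
            radius = max(math.dist(center, m) for m in members)
            result.append((center, radius, members))
    return result


def knn(query, clusters, k):
    """Exact k-NN over the clusters. By the triangle inequality, no
    member of a cluster is closer to the query than
    dist(query, center) - radius, so clusters are visited in order of
    that lower bound and the scan stops once it cannot improve."""
    order = sorted(clusters, key=lambda c: math.dist(query, c[0]) - c[1])
    best = []  # max-heap of (-distance, point), at most k entries
    for center, radius, members in order:
        lower = math.dist(query, center) - radius
        if len(best) == k and lower >= -best[0][0]:
            break  # every remaining cluster has an even larger bound
        for m in members:
            d = math.dist(query, m)
            if len(best) < k:
                heapq.heappush(best, (-d, m))
            elif d < -best[0][0]:
                heapq.heapreplace(best, (-d, m))
    return sorted((-nd, m) for nd, m in best)


if __name__ == "__main__":
    # Hypothetical toy data: points must be hashable tuples.
    data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (5.0, 5.0), (5.1, 5.0), (9.0, 1.0)]
    clusters = various_width_clusters(data, width=1.0, max_size=2)
    for distance, point in knn((0.05, 0.05), clusters, k=2):
        print(point, round(distance, 3))
```

Sorting clusters by their lower bound lets the search stop at the first cluster that cannot contain a closer point, which is the pruning effect the abstract aims to strengthen with a tree index over the cluster centers.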
URI: http://192.168.3.232:8080/jspui/handle/123456789/991
ISSN: 2321-0613
Appears in Collections:PG - Students

Files in This Item:
File: IJSRDV5I10324.pdf
Description: Large Scale Data Clustering Using Various-Widths Clustering Approach
Size: 330.08 kB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.