Please use this identifier to cite or link to this item:
http://localhost:8080/xmlui/handle/123456789/991
Title: | Large Scale Data Clustering Using Various-Widths Clustering Approach |
Authors: | Agashe, Harshal R.; Banait, S. S. |
Keywords: | Clustering, k-Nearest Neighbor, Tree Index, large scale data, Map Reduce |
Issue Date: | Jan-2017 |
Publisher: | International Journal for Scientific Research & Development |
Abstract: | k-nearest neighbor (k-NN) is one of the most widely used and powerful techniques for clustering, but its computational cost is high for high-dimensional datasets. This work proposes a k-NN approach based on various-widths clustering for large-scale data. We propose a modified k-NN approach that combines a MapReduce parallel computing algorithm with cluster grouping, with the goal of reducing clustering time, preprocessing cost, and querying cost on high-dimensional data. First, we present a k-NN method that uses various-widths clustering to efficiently extract the k nearest neighbors of an input query object from the dataset. The dataset is clustered using a global width; each cluster that satisfies a predefined criterion, i.e., a threshold value, is then recursively clustered using its local width. Earlier work used the triangle inequality to prune unlikely clusters; we instead design a tree-based approach in which cluster centers are grouped into a tree-based index to maximize the number of clusters pruned. To reduce processing and clustering time, we design a parallel computing algorithm based on MapReduce. |
URI: | http://192.168.3.232:8080/jspui/handle/123456789/991 |
ISSN: | 2321-0613 |
Appears in Collections: | PG - Students |
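The clustering and pruning steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it uses the earlier triangle-inequality pruning that the abstract mentions rather than the proposed tree-based index, and it omits the MapReduce stage. The function names, the `max_size` threshold, the `shrink` factor used to derive a local width, and the `depth` limit are all assumptions made for illustration.

```python
import heapq
import math


def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.dist(a, b)


def fixed_width_clusters(points, width):
    """Greedy fixed-width clustering: a point joins the first cluster whose
    center lies within `width`; otherwise it starts a new cluster."""
    clusters = []
    for p in points:
        for c in clusters:
            if dist(p, c["center"]) <= width:
                c["members"].append(p)
                break
        else:
            clusters.append({"center": p, "members": [p]})
    return clusters


def various_width_clusters(points, width, max_size, shrink=0.5, depth=8):
    """Cluster with a global width, then recursively re-cluster any cluster
    larger than `max_size` (assumed threshold) with a smaller local width.
    Returns a list of (center, radius, members) tuples."""
    out = []
    for c in fixed_width_clusters(points, width):
        if len(c["members"]) > max_size and depth > 0:
            out.extend(various_width_clusters(
                c["members"], width * shrink, max_size, shrink, depth - 1))
        else:
            radius = max(dist(c["center"], m) for m in c["members"])
            out.append((c["center"], radius, c["members"]))
    return out


def knn(query, clusters, k):
    """Exact k-NN search with triangle-inequality pruning.
    For any member m of a cluster: d(q, m) >= d(q, center) - radius."""
    bounds = sorted(
        ((dist(query, center) - radius, members)
         for center, radius, members in clusters),
        key=lambda t: t[0],
    )
    best = []  # max-heap of the k closest points, stored as (-distance, point)
    for lower_bound, members in bounds:
        # Clusters are visited by increasing lower bound, so once one cluster
        # cannot beat the current k-th neighbor, none of the rest can either.
        if len(best) == k and lower_bound >= -best[0][0]:
            break
        for m in members:
            d = dist(query, m)
            if len(best) < k:
                heapq.heappush(best, (-d, m))
            elif d < -best[0][0]:
                heapq.heapreplace(best, (-d, m))
    return sorted((-neg_d, m) for neg_d, m in best)
```

Calling `various_width_clusters(points, 0.3, 20)` on a list of coordinate tuples and passing the result to `knn(query, clusters, 5)` returns the five nearest points with their distances, in ascending order of distance. The values 0.3 and 20 are arbitrary illustrative choices.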
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
IJSRDV5I10324.pdf | Large Scale Data Clustering Using Various-Widths Clustering Approach | 330.08 kB | Adobe PDF | View/Open |