Fuzzy Logic based Document Inspection Using TF-IDF  
  Authors : M. Kishore Babu

Clustering is one of the main areas of the data mining literature, and many algorithms have been proposed for clustering documents. Most existing techniques, however, suffer from limitations such as poor practical applicability, low accuracy, and long classification time. Recently, incorporating fuzzy logic into clustering has been shown to improve clustering results. One widely used fuzzy-logic-based method is Fuzzy C-Means (FCM) clustering. To further improve clustering performance, this thesis uses Modified Fuzzy C-Means (MFCM) clustering. Before clustering, the documents are ranked using the Term Frequency–Inverse Document Frequency (TF–IDF) technique. The experimental results show that the proposed technique produces better clustering results than the existing technique.
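The TF–IDF weighting mentioned above can be illustrated with a minimal Python sketch. It assumes whitespace tokenization, tf as the term count divided by document length, and idf as log(N / document frequency). The abstract does not say how a document's rank is derived from its term weights, so the `rank_documents` rule below (sum of a document's weights) is a hypothetical choice for illustration, not the paper's method.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    tf  = term count / document length
    idf = log(N / number of documents containing the term)
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def rank_documents(docs):
    """Order document indices by summed TF-IDF weight, highest first.

    Assumption: the summed weight is a hypothetical ranking score.
    """
    scores = [sum(w.values()) for w in tf_idf(docs)]
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

A term that occurs in every document gets an idf of log(1) = 0, so it contributes nothing to any document's score; only terms that distinguish documents carry weight.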

Published In : IJCAT Journal Volume 3, Issue 10

Date of Publication : October 2016

Pages : 443-447

Figures : 04

Tables : --

Publication Link : Fuzzy Logic based Document Inspection Using TF-IDF

M. Kishore Babu : Assistant Professor, Department of CSE, Universal College of Engineering & Technology, Guntur, AP, India.

Fuzzy C-means Clustering, Datasets, and Multi View Point Clustering

Clustering determines the relationships between data objects in a database. Objects are grouped on the principle of “maximizing the intraclass similarity and minimizing the interclass similarity”, so that useful information can be discovered from the database. Clustering has its roots in many areas, such as data mining, statistics, biology, and machine learning. Clustering methods can be divided into several types: partitioning methods, hierarchical methods, density-based methods, grid-based methods, model-based methods, probabilistic methods, graph-theoretic methods, and fuzzy methods. The dynamic means algorithm is the main focus of this dissertation work. The dynamic means algorithm generates good clusters automatically, because the number of clusters does not need to be specified beforehand, but each data point can be a member of one and only one cluster at a time. In other words, each data point has a membership grade of one in exactly one cluster and zero in all remaining clusters, so its membership grades over all clusters sum to one. In our thesis, the dynamic algorithm is modified using a fuzzy algorithm. By applying the fuzzy algorithm on top of the dynamic algorithm, we can show the membership of each data point in all clusters, and clustering can be performed at a much faster rate. The approach is applicable to large amounts of data stored in databases. Overall, the results show that the modified dynamic algorithm reports the membership of each data point in every cluster.
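The difference between crisp and fuzzy membership described above can be sketched in Python. This is only an illustration under stated assumptions: it uses the standard Fuzzy C-Means membership formula with Euclidean distance and fuzzifier m = 2, not the modified algorithm of the thesis, whose details are not given here. The function names and the fixed cluster centers are hypothetical.

```python
import math


def distance(p, q):
    """Euclidean distance between two points given as sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def hard_memberships(point, centers):
    """Crisp assignment: grade 1 for the nearest center, 0 elsewhere."""
    dists = [distance(point, c) for c in centers]
    nearest = dists.index(min(dists))
    return [1.0 if j == nearest else 0.0 for j in range(len(centers))]


def fuzzy_memberships(point, centers, m=2.0):
    """Standard FCM grades: u_j = 1 / sum_k (d_j / d_k) ** (2 / (m - 1))."""
    dists = [distance(point, c) for c in centers]
    # A point lying exactly on a center belongs fully to that center.
    if 0.0 in dists:
        hit = dists.index(0.0)
        return [1.0 if j == hit else 0.0 for j in range(len(centers))]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((dj / dk) ** exponent for dk in dists) for dj in dists]
```

For the point (1, 0) and centers (0, 0) and (4, 0), the crisp grades are 1 and 0, while the fuzzy grades are approximately 0.9 and 0.1. In both cases the grades sum to one, but only the fuzzy version records partial membership in the farther cluster.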
