Spectral clustering maps data points from the original space into a low-dimensional eigenspace in which they become linearly separable, so it can handle data with complex structure. However, spectral clustering must store the entire similarity matrix and perform an eigendecomposition. Both steps consume considerable time and memory, which limits the application of spectral clustering in large-scale data environments. To reduce this complexity, the Nyström extension technique can be used to compute approximate eigenvectors from a small sample of the data points. This method trades some clustering accuracy for improved efficiency. To select more representative sample points that better reflect the distribution of the data set, this paper designs a dynamic incremental sampling method for Nyström spectral clustering, in which data points are sampled according to different probability distributions. We prove theoretically that increasing the number of sampling rounds effectively decreases the sampling error. The feasibility and effectiveness of the proposed algorithm are analyzed through experiments on UCI machine learning data sets.
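As an illustration of the Nyström extension mentioned above, the following is a minimal sketch and not the dynamic incremental sampling method proposed in the paper. It assumes a Gaussian (RBF) similarity and plain uniform sampling without replacement, and it approximates eigenvectors of the raw similarity matrix only; the degree normalization usually applied in spectral clustering is omitted. The names `rbf_kernel`, `nystrom_eigenvectors`, and the parameters `m`, `k`, and `gamma` are hypothetical and chosen for this example.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Pairwise squared Euclidean distances, clipped at 0 against round-off.
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def nystrom_eigenvectors(X, m, k, gamma=1.0, rng=None):
    """Approximate the top-k eigenpairs of the n x n similarity matrix
    using only m sampled columns (uniform sampling, an assumption here)."""
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    idx = rng.choice(n, size=m, replace=False)  # sampled point indices
    C = rbf_kernel(X, X[idx], gamma)            # n x m block of the similarity matrix
    W = C[idx]                                  # m x m block among the samples
    evals, evecs = np.linalg.eigh(W)            # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:k]         # keep the k largest
    evals, evecs = evals[order], evecs[:, order]
    # Nystrom extension: lift sample eigenvectors to all n points.
    # Assumes the retained eigenvalues of W are strictly positive.
    U = np.sqrt(m / n) * C @ (evecs / evals)
    return (n / m) * evals, U, idx
```

Only the `n x m` block `C` is ever formed, so memory grows linearly in `n` for a fixed sample size `m`. The rows of `U` can then be passed to a standard clustering step such as k-means; how the sample indices are chosen is exactly where a non-uniform or incremental scheme would replace the `rng.choice` call.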