Classification of textual information using bigram analysis
Abstract:
In this paper we consider examples of applying spectral analysis of nonsymmetric matrices to construct classification indicators for structuring textual information. The split indicator is the cosine of the angle between the left and right eigenvectors corresponding, respectively, to the minimum and maximum real eigenvalues of the stochastic matrix of conditional bigrams.
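The indicator described above can be illustrated with a minimal sketch; it is not the authors' implementation. Several choices are assumptions made only for illustration: bigrams are taken over characters, a symbol without a successor gets a uniform row, and the function names `bigram_matrix` and `split_indicators` are hypothetical. Because left and right eigenvectors belonging to different eigenvalues are always orthogonal, the sketch reads the abstract as pairing the left and right eigenvectors of the same eigenvalue, and it returns the cosine once for the minimum and once for the maximum real eigenvalue. Eigenvector signs are arbitrary, so the absolute value is reported.

```python
import numpy as np


def bigram_matrix(text):
    """Row-stochastic matrix P[i, j] = P(next symbol = j | current symbol = i).

    Assumption: bigrams are formed from consecutive characters of `text`.
    """
    symbols = sorted(set(text))
    index = {s: k for k, s in enumerate(symbols)}
    n = len(symbols)
    counts = np.zeros((n, n))
    for a, b in zip(text, text[1:]):
        counts[index[a], index[b]] += 1.0
    row_sums = counts.sum(axis=1, keepdims=True)
    # Assumption: a symbol that never has a successor gets a uniform row,
    # so that every row still sums to 1.
    safe_sums = np.where(row_sums > 0, row_sums, 1.0)
    P = np.where(row_sums > 0, counts / safe_sums, 1.0 / n)
    return symbols, P


def _cosine(u, v):
    # Absolute value, because the sign of an eigenvector is arbitrary.
    return abs(u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))


def split_indicators(P, tol=1e-9):
    """Cosines between left and right eigenvectors of P for its minimum
    and maximum real eigenvalues (one reading of the abstract)."""
    w_r, V = np.linalg.eig(P)      # right eigenvectors: P v = lambda v
    w_l, U = np.linalg.eig(P.T)    # left eigenvectors:  u^T P = lambda u^T
    real_r = np.flatnonzero(np.abs(w_r.imag) < tol)
    real_l = np.flatnonzero(np.abs(w_l.imag) < tol)

    def pair(select):
        # Pick the same extreme real eigenvalue on both sides.
        # For a repeated eigenvalue this pairing is not unique.
        i = real_r[select(w_r.real[real_r])]
        j = real_l[select(w_l.real[real_l])]
        return _cosine(U[:, j].real, V[:, i].real)

    return pair(np.argmin), pair(np.argmax)


if __name__ == "__main__":
    symbols, P = bigram_matrix("abracadabra")
    cos_min, cos_max = split_indicators(P)
    print(symbols)
    print(cos_min, cos_max)
```

A cosine close to 1 means the left and right eigenvectors are nearly parallel, as they would be for a symmetric matrix; smaller values reflect stronger nonsymmetry of the bigram matrix.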
Keywords:
stochastic matrix, spectral portrait, text classification
Publication language: Russian, pages: 22
Research direction:
Mathematical modelling in topical problems of science and technology