Publication

KIAM Preprint № 67, Moscow, 2022
Authors: Kislitsyna M.Y.
Analysis of the influence of text preprocessing on the author identification problem by the bigram method
Abstract:
Using a sufficiently representative set of authors and texts, a comparative analysis is carried out of how text preprocessing programs affect the ability to identify authors. The sensitivity of the identification error to the proportion of the source text that is changed is investigated. It is shown that the author's originality is preserved after preprocessing almost at the level of the original text.
Keywords:
machine classification, text preprocessing, bigram distribution, author identification
Publication language: Russian, pages: 18
Research direction:
Mathematical modelling in current problems of science and technology
About authors:
  • Kislitsyna Maria Yurievna, voronina.miu@yandex.ru, orcid.org/0000-0002-2542-8914, KIAM RAS