Publication

KIAM Preprint № 17, Moscow, 2024
Authors: Kislitsyna M.Y., Orlov Y.N.
Statistical analysis of the complete corpus of fiction in Russian and recognition of the author
Abstract:
Reference trigram statistics have been collected for the complete corpus of Russian-language fiction, including translated foreign authors. Distributions of the distances from individual texts to the references are constructed. The nearest-reference method for recognizing the author of a text has been tested. The recognition error was determined by genre, by subgroup of authors, and for the corpus as a whole. The errors have been classified with a view to developing a correction method.
Keywords:
trigrams, nearest neighbor method, text author recognition
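Illustration:
The nearest-reference idea described in the abstract can be sketched as follows. This is only a minimal illustration, not the authors' implementation: the abstract does not state the distance measure or how references are built, so the sketch assumes character trigrams, one pooled reference distribution per author, and the L1 distance between relative-frequency distributions. All function names are hypothetical.

```python
from collections import Counter


def trigram_counts(text: str) -> Counter:
    """Count overlapping character trigrams in a text."""
    return Counter(text[i:i + 3] for i in range(len(text) - 2))


def normalize(counts: Counter) -> dict:
    """Turn raw counts into relative frequencies (empty dict if no trigrams)."""
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()} if total else {}


def l1_distance(p: dict, q: dict) -> float:
    """L1 distance between two trigram distributions (assumed metric)."""
    return sum(abs(p.get(g, 0.0) - q.get(g, 0.0)) for g in p.keys() | q.keys())


def build_references(texts_by_author: dict) -> dict:
    """One reference distribution per author, pooled over that author's texts."""
    references = {}
    for author, texts in texts_by_author.items():
        pooled = Counter()
        for text in texts:
            pooled.update(trigram_counts(text))
        references[author] = normalize(pooled)
    return references


def nearest_author(text: str, references: dict) -> str:
    """Attribute a text to the author whose reference is closest."""
    dist = normalize(trigram_counts(text))
    return min(references, key=lambda a: l1_distance(dist, references[a]))
```

Collecting the distances from every text to every reference, rather than only the minimum, would give the distance distributions the abstract mentions; the error rate per genre or author subgroup is then the share of texts whose nearest reference is not their true author.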
Publication language: Russian, pages: 24
Research direction:
Mathematical modelling in current problems of science and technology
About authors:
  • Kislitsyna Maria Yurievna, voronina.miu@yandex.ru, orcid.org/0000-0002-2542-8914, KIAM RAS
  • Orlov Yurii Nikolaevich, ov31509f@yandex.ru, orcid.org/0000-0002-1356-5137, KIAM RAS