Publication

Conference material: "Proceedings of the 7th International Conference “Futurity designing. Digital reality problems” (February 15-16, 2024, Moscow)"
Authors: Gromov V.A., Borodin N.S., Kogan A.S., Dang Q.N., Yerbolova A.S., Bayan H.
Spot the bot: large-scale natural language structure
Abstract:
Specialized programs (bots) now write comments, news items, and reviews, any of which may contain false information. It is therefore important to determine whether a given text was written by a real person or by a bot. This work studies the semantic trajectories of natural-language texts in order to address this problem. The study uses vector embeddings and their n-grams, together with methods for (1) clustering the semantic space, (2) analysing the position of texts on the 'entropy-complexity' plane, (3) estimating the intrinsic dimensionality of vector language representations, and (4) topological data analysis.
Keywords:
semantic trajectories, natural language processing, bots, clustering, entropy-complexity plane, intrinsic dimensionality, topological data analysis
Publication language: Russian, pages: 32 (pp. 281-312)
About authors:
  • Gromov Vasilii Aleksandrovich, orcid.org/0000-0001-5891-6597, National Research University Higher School of Economics
  • Borodin Nikita Sergeevich, orcid.org/0000-0002-7102-4443, National Research University Higher School of Economics
  • Kogan Alexandra Sergeevna, orcid.org/0000-0002-6009-5203, National Research University Higher School of Economics
  • Dang Quynh Nhu, orcid.org/0000-0003-0450-7063, National Research University Higher School of Economics
  • Yerbolova Asel Serikanovna, orcid.org/0009-0007-7119-4665, National Research University Higher School of Economics
  • Bayan Hendawi, orcid.org/0000-0003-0096-8612, National Research University Higher School of Economics