Conference material: "Scientific service & Internet: proceedings of the 21st All-Russian Scientific Conference (September 23-28, 2019, Novorossiysk)"
Matching of authors and publications in multilingual bibliographic knowledge bases
Cross-lingual matching of authors and publications is a special case of assigning a unique identifier to the same real-world entity across multilingual data sources. This paper presents the results of experiments with several versions of a cross-lingual system that matches authors of English-language publications against a Russian-language data source. Different heuristics were tested in these versions of the system; here we consider only those that gave the best results. An important element of the system is its interactive visualization tool, which shows how publications are distributed among authors, breaks down each group of publications by co-authors and years of publication, and allows the results of the analysis to be edited. The visualization system is supplemented with methods for ordering adjacency matrices. Experiments have shown that the main way to improve the quality of the disambiguation algorithm is to extend the set of confirmed publications. The approaches used in this system are applicable to linking named entities in various multilingual data sources.
multilingual knowledge bases, cross-lingual matching of authors and publications, identity resolution, clustering, interactive visualization
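As an illustration of the adjacency-matrix ordering mentioned in the abstract, the following is a minimal sketch and not the method used in the paper, which the abstract does not specify. It assumes a co-authorship matrix built from author lists and a simple breadth-first ordering that places connected groups of co-authors next to each other, so that they appear as blocks along the diagonal. The function names (`coauthor_matrix`, `bfs_order`, `reorder`) and the sample author names are hypothetical.

```python
from collections import deque


def coauthor_matrix(publications):
    """Build a symmetric co-authorship count matrix from lists of author names."""
    authors = sorted({a for pub in publications for a in pub})
    index = {a: i for i, a in enumerate(authors)}
    n = len(authors)
    matrix = [[0] * n for _ in range(n)]
    for pub in publications:
        unique = sorted(set(pub))
        for i, a in enumerate(unique):
            for b in unique[i + 1:]:
                matrix[index[a]][index[b]] += 1
                matrix[index[b]][index[a]] += 1
    return authors, matrix


def bfs_order(matrix):
    """Return a permutation that lists each connected component contiguously.

    Assumption: a plain breadth-first traversal that starts each component
    from a lowest-degree vertex; this is an illustrative heuristic only.
    """
    n = len(matrix)
    degree = [sum(1 for w in row if w) for row in matrix]
    seen = [False] * n
    order = []
    for start in sorted(range(n), key=lambda v: (degree[v], v)):
        if seen[start]:
            continue
        seen[start] = True
        queue = deque([start])
        while queue:
            v = queue.popleft()
            order.append(v)
            neighbours = [u for u in range(n) if matrix[v][u] and not seen[u]]
            for u in sorted(neighbours, key=lambda u: (degree[u], u)):
                seen[u] = True
                queue.append(u)
    return order


def reorder(matrix, order):
    """Apply the same permutation to the rows and columns of the matrix."""
    return [[matrix[i][j] for j in order] for i in order]


# Hypothetical sample data: two unrelated groups of co-authors.
publications = [
    ["Ivanov", "Petrov"],
    ["Smith", "Jones"],
    ["Ivanov", "Sidorov"],
    ["Jones", "Brown"],
]
authors, matrix = coauthor_matrix(publications)
order = bfs_order(matrix)
ordered_authors = [authors[i] for i in order]
ordered_matrix = reorder(matrix, order)
```

In the reordered matrix the two co-author groups occupy separate diagonal blocks, which is the kind of visual grouping that matrix ordering is generally meant to expose. A production system would more likely use an established ordering method.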