From August 22 to August 31, a summer scientific school on probability theory was held. Students and young researchers from Novosibirsk, Moscow, St. Petersburg and other cities attended the course "New statistical methods for analyzing texts in natural language", taught by Artem P. Kovalevsky, Doctor of Physical and Mathematical Sciences and a researcher at the Mathematical Center in Akademgorodok.
During the school, the participants were introduced to statistical tests based on both simple and rather complex concepts of probability theory.
To reinforce the material, students were invited to take part in projects on text attribution (identifying the authorship of a text). One project asked which reconstruction of the 10th chapter of Eugene Onegin, the one by D. L. Bykov or the one by A. Yu. Chernov, is more similar to the first 9 chapters written by A. S. Pushkin. The results of that project were mixed, and the research will continue outside the school. Another project tested the hypothesis that each author uses a certain set of words (the author's invariant) "consistently" across all their works, while the frequency with which the words of the invariant occur varies from author to author.
Using the task of distinguishing texts by Dostoevsky and Tolstoy as an example, this project produced a statistical test that uses an invariant made up of particles and distinguishes the authors' novels in 92% of cases.
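The idea of comparing how often two texts use a fixed set of words can be illustrated with a minimal sketch. This is not the test built at the school, whose details are not given here: the word list `INVARIANT`, the function names, and the choice of a Pearson chi-square homogeneity statistic are all assumptions made purely for illustration, and English function words stand in for Russian particles.

```python
import re
from collections import Counter

# Hypothetical invariant: a few English function words, for illustration only.
INVARIANT = ("not", "only", "even", "just")


def invariant_counts(text, invariant=INVARIANT):
    """Return how many times each invariant word occurs in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w in invariant)
    return [counts[w] for w in invariant]


def chi_square_homogeneity(a, b):
    """Pearson chi-square statistic for a 2 x k table of word counts.

    A large value suggests that the two texts use the invariant words
    in different proportions.
    """
    total_a, total_b = sum(a), sum(b)
    if total_a == 0 or total_b == 0:
        raise ValueError("each text must contain at least one invariant word")
    grand = total_a + total_b
    stat = 0.0
    for x, y in zip(a, b):
        column = x + y
        if column == 0:
            continue  # word absent from both texts: contributes nothing
        for observed, row_total in ((x, total_a), (y, total_b)):
            expected = row_total * column / grand
            stat += (observed - expected) ** 2 / expected
    return stat
```

Under the hypothesis that both texts share the same word proportions, the statistic is approximately chi-square distributed with k - 1 degrees of freedom, where k is the number of invariant words that occur in at least one of the texts. A real attribution test would need much longer texts and a carefully chosen invariant.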
By tradition, the summer school included lectures in which students consolidated the material they had learned.
Alexander A. Trushin, a participant of the school and a first-year master's student at the Department of Mathematics and Mechanics of NSU, shared his impressions:
The summer school was a great opportunity for me to work on a real project in a team of mathematicians. Although I knew almost none of the other participants, I was able to join the team quickly, and by the end of the school I was talking with them freely and with pleasure, both about mathematical projects and about unrelated topics. For this, thanks are due first of all to the organizers of the summer school, and in particular to Evgeny I. Prokopenko.
But networking was not the only success. The lecture material and the proposed projects let me look at text analysis from an angle I had not expected. The curator of my project never left me lost: at all times I understood what I was doing and why.
The summer probability school leaves a very pleasant aftertaste: not only the knowledge gained, but also the feeling of time well spent, a boost in self-confidence and a desire to keep working.
The school was held with the support of the Mathematical Center in Akademgorodok.