Web service for ambiguous transliteration of full sentences from Latin to Cyrillic alphabet

Spasov, Stojance (2014) Web service for ambiguous transliteration of full sentences from Latin to Cyrillic alphabet. Masters thesis, University Goce Delcev - Stip.

Abstract

Many web applications and social networks host Macedonian content written in the Latin alphabet even though the Cyrillic alphabet is available, which raises the problem of ambiguity for certain words in a sentence. This happens because Internet users write content (for example advertisements and comments) in the Latin alphabet out of habit or because typing Cyrillic is difficult, or because they use electronic devices that offer no Cyrillic input option. Transliteration is used to overcome this problem.

This process converts every word from the Latin to the Cyrillic alphabet. Only a few words have more than one solution, i.e. several Cyrillic words with different meanings. For example, „postar” yields „постар” or „поштар”, and „teza” yields „тежа” or „теза”.
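The ambiguity described above can be illustrated with a minimal Python sketch. The table `LATIN_TO_CYRILLIC` and the functions `candidates` and `in_dictionary` are hypothetical names introduced here, not taken from the thesis. The mapping is a simplified assumption: it treats s, z and c as the only ambiguous letters and ignores digraphs and other spellings.

```python
from itertools import product

# Simplified, assumed letter table: most Latin letters map to one Cyrillic
# letter, while "s", "z" and "c" are treated as ambiguous.
LATIN_TO_CYRILLIC = {
    "a": ["а"], "b": ["б"], "v": ["в"], "g": ["г"], "d": ["д"],
    "e": ["е"], "z": ["з", "ж"], "i": ["и"], "j": ["ј"], "k": ["к"],
    "l": ["л"], "m": ["м"], "n": ["н"], "o": ["о"], "p": ["п"],
    "r": ["р"], "s": ["с", "ш"], "t": ["т"], "u": ["у"], "f": ["ф"],
    "h": ["х"], "c": ["ц", "ч"],
}


def candidates(word):
    """Return every Cyrillic spelling the table allows for a Latin word."""
    # Characters missing from the table are passed through unchanged.
    options = [LATIN_TO_CYRILLIC.get(ch, [ch]) for ch in word.lower()]
    return ["".join(combo) for combo in product(*options)]


def in_dictionary(word, dictionary):
    """Keep only the candidates that appear in a set of known words."""
    return [c for c in candidates(word) if c in dictionary]
```

Under these assumptions, `candidates("postar")` returns the two spellings from the example, and `in_dictionary` narrows the list to words found in a dictionary such as the ones described in the abstract.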

This master's thesis defines and presents the basic concepts of ambiguous transliteration, which is the key step in resolving the meaning and purpose of words in full sentences. It also describes the three dictionaries used for the analyses in the research: an open dictionary or OD (downloaded from OpenOffice Dictionaries), an expository dictionary or ED (downloaded from the digital dictionary of the Macedonian language), and a combined dictionary or CD (a combination of OD and ED). To resolve ambiguous transliterations, an algorithm based on conditional probability and Bayes' formulas is defined. The solution is implemented as a new transliteration algorithm for different types of content. During the research, automatic testing was performed on 300 selected sentences divided into three groups.
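One way such a Bayes-based choice between candidates could look is sketched below in Python. This is an illustration under stated assumptions, not the algorithm from the thesis: it uses only the preceding word as context, add-alpha smoothing, and invented toy counts. The names `pick_candidate`, `unigram` and `bigram` are hypothetical.

```python
from collections import Counter


def pick_candidate(candidates, prev_word, unigram, bigram, alpha=1.0):
    """Pick the candidate c that maximizes P(c) * P(prev_word | c).

    By Bayes' formula this product is proportional to P(c | prev_word).
    Add-alpha smoothing keeps unseen word pairs from scoring zero.
    """
    total = sum(unigram.values())
    vocab = len(unigram)

    def score(c):
        prior = (unigram[c] + alpha) / (total + alpha * vocab)
        likelihood = (bigram[(prev_word, c)] + alpha) / (unigram[c] + alpha * vocab)
        return prior * likelihood

    return max(candidates, key=score)


# Invented toy counts, for illustration only (not data from the thesis).
unigram = Counter({"постар": 5, "поштар": 3, "е": 10, "дојде": 4})
bigram = Counter({("е", "постар"): 4, ("дојде", "поштар"): 3})

after_e = pick_candidate(["постар", "поштар"], "е", unigram, bigram)
after_dojde = pick_candidate(["постар", "поштар"], "дојде", unigram, bigram)
```

With these toy counts the same Latin input „postar” resolves to „постар” after „е” and to „поштар” after „дојде”, which is the kind of context-dependent decision the abstract describes.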

For practical application of the developed algorithm, a web service was designed as a separate part of this master's thesis. Through a user interface and a service module, it can transliterate electronic messages written in the Latin alphabet, web publications, older texts, etc. The last part of the thesis gives a more detailed description of the web service, which applies the REST architecture, and its comparison with PHP and MySQL technologies.
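A REST-style service module of this kind might expose transliteration as a resource queried over HTTP. The Python sketch below is purely illustrative: the `/transliterate` path, the `text` parameter, the JSON layout, and the `STUB_RESULTS` lookup standing in for the real algorithm are all assumptions, not the interface of the service described in the thesis.

```python
import json
from urllib.parse import parse_qs, urlparse

# Placeholder lookup standing in for the real transliteration algorithm.
STUB_RESULTS = {"teza": ["теза", "тежа"], "postar": ["постар", "поштар"]}


def handle_get(url):
    """Map a GET request URL to an (HTTP status, JSON body) pair."""
    parsed = urlparse(url)
    if parsed.path != "/transliterate":
        return 404, json.dumps({"error": "unknown resource"})
    # parse_qs decodes "+" and percent-escapes in the query string.
    words = parse_qs(parsed.query).get("text", [""])[0].split()
    if not words:
        return 400, json.dumps({"error": "missing 'text' parameter"})
    result = [
        {"latin": w, "candidates": STUB_RESULTS.get(w.lower(), [])}
        for w in words
    ]
    return 200, json.dumps({"words": result}, ensure_ascii=False)
```

Keeping the handler a pure function of the request URL makes it easy to test and lets any HTTP server or framework call it. A request such as `handle_get("/transliterate?text=teza+postar")` returns status 200 and a JSON body listing the candidates for each word.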

Item Type: Thesis (Masters)
Uncontrolled Keywords: conditional probability, Bayes’ formulas, algorithm, user interface, service module, architecture (REST)
Subjects: Natural sciences > Computer and information sciences
Social Sciences > Educational sciences
Natural sciences > Mathematics
Divisions: Faculty of Computer Science
Depositing User: Stojance Spasov
Date Deposited: 08 Oct 2014 13:03
Last Modified: 10 Dec 2018 10:43
URI: https://eprints.ugd.edu.mk/id/eprint/11084
