The Balkan WordNet (BalkaNet) aimed to combine Balkan lexicography effectively with modern computation. Its most ambitious feature was the attempt to represent semantic relations and to organize lexical information from the Balkan languages in terms of word meanings. The main objective was to develop the individual WordNets and combine them in a common lexical database. Moreover, BalkaNet aimed not only to combine Balkan word forms in an online dictionary, but also to expand EuroWordNet further by tracing and exploring the relatedness of Romance and Balkan languages. Finally, one main goal of BalkaNet was to promote the study of the less-studied Balkan languages by creating a large-scale linguistic resource, and to develop a database for multilingual information retrieval in which words in one language are expanded to words in another.

BalkaNet has developed a multilingual database with WordNets for a set of Balkan languages. Each WordNet is seen as a linguistic ontology that reflects the lexicalization of, and the semantic relations between, the different concepts of the language. Like EuroWordNet, it is structured around the notion of a synset (a set of synonymous word meanings, between which basic semantic relations are expressed). Each Balkan partner participated in the following tasks:

  1. selection of a set of basic concepts, i.e. important meanings that play a major role in a number of semantic relations in Balkan languages. These concepts functioned as the main source for the development of the synsets.
  2. development and semantic organization of each language's synsets.
  3. development of an Inter-Lingual-Index, a database consisting of an unstructured list of concepts (ILI records), mainly taken from EuroWordNet but adapted to improve synset matching across Balkan languages. The ILI gives access to a shared top-ontology and a domain-ontology.
  4. design of the architecture of a multilingual database in which the multilingual synsets are encoded. The database enables the linking and matching of all WordNets, following the language-specific modules. To relate the multilingual synsets across a common database, more general, global matches have been used, so as to tackle problems that might otherwise have appeared.
  5. loading of the language-specific data into the database and comparison of a large set of WordNets to indicate the differences in the relations across the WordNets. Special interfaces have been developed to carry out this kind of comparison.
  6. validation of the BalkaNet system, and testing and improvement of the system by adding concepts that may be missing, improving the matching, adapting the ILI records, etc. The EuroWordNet documentation proved to be a useful reference during the implementation of the project.
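
The tasks above can be illustrated with a minimal Python sketch of language-specific synsets linked through ILI records, plus the kind of comparison described in tasks 5 and 6. This is not the BalkaNet database schema: the `Synset` class, the function names, the ILI identifiers such as `ili-dog`, and the toy data are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Synset:
    ili_id: str          # hypothetical ILI record identifier
    language: str        # language code, e.g. "ro", "bg"
    literals: list       # synonymous word forms in that language
    hypernyms: set = field(default_factory=set)  # ILI ids of broader concepts


def index_by_ili(synsets):
    """Group synsets by ILI record, then by language."""
    index = {}
    for synset in synsets:
        index.setdefault(synset.ili_id, {})[synset.language] = synset
    return index


def missing_concepts(index, language):
    """ILI records that have a synset in some language but not in `language`."""
    return sorted(ili for ili, by_lang in index.items() if language not in by_lang)


def relation_differences(index, lang_a, lang_b):
    """ILI records whose hypernym links differ between the two languages."""
    diffs = {}
    for ili, by_lang in index.items():
        if lang_a in by_lang and lang_b in by_lang:
            a = by_lang[lang_a].hypernyms
            b = by_lang[lang_b].hypernyms
            if a != b:
                # (links only in lang_a, links only in lang_b)
                diffs[ili] = (a - b, b - a)
    return diffs


# Toy data, invented for this example.
synsets = [
    Synset("ili-animal", "ro", ["animal"]),
    Synset("ili-animal", "bg", ["животно"]),
    Synset("ili-dog", "ro", ["câine"], {"ili-animal"}),
    Synset("ili-dog", "bg", ["куче", "пес"]),  # hypernym link not yet encoded
    Synset("ili-house", "ro", ["casă"]),
]
index = index_by_ili(synsets)
print(missing_concepts(index, "bg"))
print(relation_differences(index, "ro", "bg"))
```

In this sketch the ILI identifier is the only link between languages, which mirrors the description of the ILI as an unstructured list of concepts: the relations live in the language-specific synsets and are compared only through shared ILI records.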

A central multilingual lexical database with WordNets for a large set of Central and Eastern European languages has been developed. Each WordNet will cover the general vocabulary of its language so as to tackle the problem of multilinguality among Balkan countries. Furthermore, the Balkan WordNet will be aligned with the EuroWordNet project so as to extend it and make cross-language information retrieval efficient for the less-favored Balkan languages.
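
As a rough illustration of cross-language retrieval over ILI-aligned WordNets, the sketch below expands a query term in one language into the synonymous word forms of another. The `ALIGNED` table, its ILI identifiers, and the `expand_query` function are hypothetical and are not taken from the BalkaNet system.

```python
# Toy alignment, invented for this example:
# ILI id -> language code -> synonymous word forms.
ALIGNED = {
    "ili-dog": {"ro": ["câine"], "el": ["σκύλος", "σκυλί"], "bg": ["куче", "пес"]},
    "ili-house": {"ro": ["casă"], "el": ["σπίτι"], "bg": ["къща"]},
}


def expand_query(term, source_lang, target_lang, aligned=ALIGNED):
    """Return target-language word forms sharing an ILI record with `term`."""
    expansions = []
    for by_lang in aligned.values():
        if term in by_lang.get(source_lang, []):
            for literal in by_lang.get(target_lang, []):
                if literal not in expansions:  # keep order, drop duplicates
                    expansions.append(literal)
    return expansions


print(expand_query("câine", "ro", "bg"))
print(expand_query("σπίτι", "el", "ro"))
```

A term with several senses would match several ILI records and return the union of their translations; a real system would need word sense disambiguation to narrow that set.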

BalkaNet official site:

Final report

The special issue of the Romanian Journal of Information Science and Technology on BalkaNet; list of publications

  • October 2007
    A page dedicated to the Workshop on Romanian Linguistic Resources and Tools for Romanian Language Processing (Atelierul de Lucru Resurse Lingvistice Romanesti si Instrumente pentru Prelucrarea Limbii Romane), 14-15 December 2007, Iasi, has been added to the Activities/Events section.