Term suggestion techniques for Scientific Databases

Petsios Theofilos
School of Electrical and Computer Engineering, NTUA
2011
Diploma Thesis
Abstract. The main purpose of this thesis was to develop a better term suggestion mechanism for the DIANA web applications. We developed several tools for managing search engines, focusing on edit distance and n-gram techniques. These tools consist mainly of Perl programs that construct and maintain inverted indexes for n-gram-based search engines, and of MySQL UDFs (user-defined functions) that implement operations on n-grams. We modified the graphical interface of the web application using PHP and AJAX within the Yii framework. Overall, we achieved a major improvement in the response time of the average query on the web application. The search engine now offers a wider variety of options, and the web application is easier to use. We also created a series of administration tools for DIANA administrators. These tools are programs for managing the databases that hold the inverted indexes used in search operations, and they run on any operating system. The system administrator can choose to build indexes of variable gram length and can assign an arbitrary weight to the grams used. Finally, we modified the installer of the Flamingo software so that it also works on Mac OS X.
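As an illustration of the two techniques named in the abstract, the following is a minimal sketch of term suggestion over an n-gram inverted index, with edit distance used to break ties. It is not the thesis code: the thesis tools are written in Perl and as MySQL UDFs, whereas this sketch uses Python, and every function name, the padding characters, the scoring rule, and the sample terms are assumptions made for the example. The `n` and `weights` parameters stand in for the variable gram length and the per-gram weights mentioned above.

```python
from collections import defaultdict


def ngrams(term, n=3):
    """Return the character n-grams of a term, padded so word edges count."""
    # '#' and '$' are assumed padding characters for this sketch.
    padded = "#" * (n - 1) + term.lower() + "$" * (n - 1)
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def build_index(terms, n=3):
    """Build an inverted index: n-gram -> set of terms containing it."""
    index = defaultdict(set)
    for term in terms:
        for gram in ngrams(term, n):
            index[gram].add(term)
    return index


def edit_distance(a, b):
    """Levenshtein distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def suggest(query, index, n=3, weights=None, limit=5):
    """Rank indexed terms by weighted shared n-grams, then by edit distance."""
    weights = weights or {}  # hypothetical per-gram weights, default 1.0
    scores = defaultdict(float)
    for gram in set(ngrams(query, n)):
        for term in index.get(gram, ()):
            scores[term] += weights.get(gram, 1.0)
    ranked = sorted(
        scores,
        key=lambda t: (-scores[t], edit_distance(query.lower(), t.lower()), t),
    )
    return ranked[:limit]


# Sample vocabulary invented for the example.
terms = ["microRNA", "microarray", "mitochondria", "protein"]
index = build_index(terms, n=3)
suggestions = suggest("micrRNA", index, n=3)
```

In this sketch the index lookup narrows the candidates to terms that share at least one gram with the query, so edit distance is only computed for that subset rather than for the whole vocabulary.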