8 ANNIS framework
Find your way through https://corpus-tools.org/annis/, install ANNIS on your system, and try to import the zipped ANNIS SES corpus that you find in the HU-box.
> folder: sketch engine Work, naming scheme of the latest zip: [datestamp]_SES_annis_tagged_corpus.zip
8.1 SES sample procedure to create ANNIS corpus
The following only documents the process; you won't have to follow these steps. Just follow the instructions above to install ANNIS on your system and import the zipped corpus.
- upload the files in the HU-box folder "version without header for SketchEngine upload" to SketchEngine > create new corpus
  - expert compiler settings > adapt docscheme to sesCPT
  - with that done, you can already explore the SES corpus in the SketchEngine GUI using the built-in CQL (corpus query language) commands
- download corpus (vertical)
- corpus is now a database of token, PoS, lemma; tagged according to the GermanRF tagset1 used by SketchEngine
- process database in: conc-essai.R
  - splits the PoS tag (scheme: x.x.x.x.x) into separate columns defining classes of PoS tags
  - writes a single .xlsx file for each kid into a folder
- ANNIS preprocessing:
  - pepper: xls > treetagger format, from the .xlsx files folder (parameter file)
  - pepper: treetagger > annis graph format, from the treetagger files folder (parameter file)
  - zip the annis graph files
- upload annis.zip to ANNIS localhost server
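The tag-splitting step of the list above can be sketched as follows. This is only an illustration in Python, not the actual conc-essai.R script, and it rests on several assumptions: that the downloaded vertical file has one token per line with tab-separated token, PoS and lemma columns (in that order), that structural markup such as <doc ...> or <s> stands alone on its line, that documents carry an id attribute, and that the x.x.x.x.x scheme means five dot-separated classes. The function names and the column names pos1 to pos5 are hypothetical.

```python
import re
from collections import defaultdict

N_CLASSES = 5  # assumption: read off the x.x.x.x.x scheme


def split_pos(tag, n_classes=N_CLASSES):
    """Split a dot-separated PoS tag into exactly n_classes fields."""
    parts = tag.split(".")
    if len(parts) > n_classes:
        # fold surplus fields into the last column instead of dropping them
        parts = parts[:n_classes - 1] + [".".join(parts[n_classes - 1:])]
    # pad short tags with empty strings so every row has the same width
    return parts + [""] * (n_classes - len(parts))


def parse_vertical(text):
    """Return {document id: list of token rows} from vertical-format text."""
    docs = defaultdict(list)
    doc_id = "doc0"  # used for tokens that appear before any <doc> tag
    n_docs = 0
    for line in text.splitlines():
        if not line.strip():
            continue
        # structural markup: no tab, wrapped in angle brackets
        if "\t" not in line and line.startswith("<") and line.endswith(">"):
            if line.startswith("<doc"):
                n_docs += 1
                match = re.search(r'id="([^"]*)"', line)  # assumed attribute
                doc_id = match.group(1) if match else f"doc{n_docs}"
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # skip rows that do not have token, PoS and lemma
        token, tag, lemma = fields[:3]
        row = {"token": token, "lemma": lemma}
        for i, part in enumerate(split_pos(tag), start=1):
            row[f"pos{i}"] = part
        docs[doc_id].append(row)
    return dict(docs)
```

Each list of rows returned by parse_vertical corresponds to one table; writing it out as an .xlsx file would need an additional spreadsheet library and is left out here.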
8.2 ANNIS ready-to-use installation
An ANNIS server installation with the SES corpus ready to use will be available here: (link follows). (Note, 20230904: the link is not yet freely available; use the link shared in Moodle if you don't want to use your own local installation.)