9 ANNIS framework
Find your way through https://corpus-tools.org/annis/, install ANNIS on your system, and try to import the zipped ANNIS SES corpus you find in the HU-box (folder: sketch engine Work; naming scheme of the latest zip: [datestamp]_SES_annis_tagged_corpus.zip).
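If several zips have piled up in a local copy of that folder, a small helper can pick the latest one by file name. This is only a sketch in Python: the function name `latest_corpus_zip` is made up for illustration, and it assumes the datestamp prefix sorts chronologically when sorted as text (for example YYYYMMDD), which the naming scheme above does not guarantee.

```python
from pathlib import Path

# Fixed part of the file name, taken from the naming scheme above.
SUFFIX = "_SES_annis_tagged_corpus.zip"


def latest_corpus_zip(folder):
    """Return the matching zip whose name sorts last, or None if none exists.

    Assumption: the [datestamp] prefix sorts chronologically as plain text
    (e.g. YYYYMMDD). If it does not, compare modification times instead.
    """
    candidates = sorted(Path(folder).glob("*" + SUFFIX), key=lambda p: p.name)
    return candidates[-1] if candidates else None
```

Call it with the path of your local folder copy, e.g. `latest_corpus_zip("sketch engine Work")`, and import the returned file through the ANNIS interface.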
9.1 SES sample procedure to create ANNIS corpus
The following is just for documentation of the process; you won't have to follow these steps. Just follow the instructions above to install ANNIS on your system and import the zipped corpus.
- upload the files in the HU-box folder "version without header for SketchEngine upload" to SketchEngine > create new corpus
- expert compiler settings > adapt docscheme to > sesCPT
- with that done, you can already explore the SES corpus in the SketchEngine GUI using the built-in CQL (Corpus Query Language) commands
- download corpus (vertical)
- corpus is now a database of tokens, PoS tags, and lemmas, tagged according to the GermanRF tagset¹ used by SketchEngine
- process database in: conc-essai.R
  - splits the PoS tag (scheme: x.x.x.x.x) into separate columns defining classes of PoS tags
  - writes a single .xlsx file for each kid into a folder
- ANNIS preprocessing:
  - pepper: xls > treetagger format, from the .xlsx files folder (with parameter file)
  - pepper: treetagger > annis graph format, from the treetagger files folder (with parameter file)
- zip annis graph files
- upload annis.zip to ANNIS localhost server
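To make the tag-splitting step more concrete, here is a minimal sketch in Python. It is not the conc-essai.R script itself, only an illustration of the idea. It assumes the downloaded vertical file has one token per line with tab-separated columns in the order token, PoS, lemma, and that structure lines such as `<doc>` or `<s>` are wrapped in angle brackets. The function names, the column names `pos_1` to `pos_5`, and the sample tags are made up for illustration.

```python
def split_pos_tag(tag, n_levels=5, sep="."):
    """Split a dotted PoS tag into exactly n_levels parts.

    Shorter tags are padded with empty strings; anything beyond
    n_levels stays together in the last part, so nothing is lost.
    """
    parts = tag.split(sep, n_levels - 1)
    return parts + [""] * (n_levels - len(parts))


def parse_vertical(lines):
    """Turn vertical-format lines into one dict per token."""
    rows = []
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and structure lines like <doc ...>, </s>, <g/>.
        is_structure = line.startswith("<") and line.endswith(">") and "\t" not in line
        if not line or is_structure:
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # not a complete token/PoS/lemma line
        token, pos, lemma = fields[:3]
        row = {"token": token, "pos": pos, "lemma": lemma}
        for i, part in enumerate(split_pos_tag(pos), start=1):
            row[f"pos_{i}"] = part
        rows.append(row)
    return rows


# Illustrative sample only; the tags are hypothetical, not corpus data.
SAMPLE = (
    '<doc id="example">\n'
    "<s>\n"
    "der\tART.Def.Nom.Sg.Masc\tdie\n"
    "Hund\tN.Reg.Nom.Sg.Masc\tHund\n"
    ".\tSYM.Pun.Sent\t.\n"
    "</s>\n"
    "</doc>\n"
)

rows = parse_vertical(SAMPLE.splitlines())
```

Each resulting row keeps the full tag in `pos` and adds one column per tag level, which is the table shape that is then written out per kid.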
9.2 ANNIS ready-to-use installation
Please find here (link follows) an ANNIS server installation with the SES corpus ready to use. (Note, 20230904: the link is not yet freely available; use the link shared in Moodle if you don't want to use your own local installation.)
Bates et al., "Fitting Linear Mixed-Effects Models Using lme4". 2015. doi: 10.18637/jss.v067.i01