Corpora tagged "diachronic" (6)

Barometrs

Corpus of News Portal Comments

2011–2022, 26M comments (642M tokens)
Developers: RSU, IMCS UL

Disertācijas

Corpus of Latvian PhD Theses

1993–2020, 16.7M words (23.4M tokens)
Developers: IMCS UL

Likumi

Corpus of Legal Acts of the Republic of Latvia

1990–2022, 73.9M words (116.2M tokens)
Developers: IMCS UL

Saeima

Corpus of the Saeima (the Parliament of Latvia)

1993–2022, 20M words (24M tokens)
Developers: IMCS UL, RSU

Subtitri

Latvian Subtitles of Public Broadcasting

2015–2020, 1200 hours (10.8M tokens)
Developers: IMCS UL

Ziņas

Articles from Latvian news portals

2020–2022, 357.2M words (513.5M tokens)
Developers: IMCS UL
B. Saulīte, R. Darģis, N. Grūzītis, I. Auziņa, K. Levāne-Petrova, L. Pretkalniņa, L. Rituma, P. Paikens, A. Znotiņš, L. Strankale, K. Pokratniece, I. Poikāns, G. Bārzdiņš, I. Skadiņa, A. Baklāne, V. Saulespurēns, J. Ziediņš.
Latvian National Corpora Collection – Korpuss.lv
Proceedings of the 13th Language Resources and Evaluation Conference (LREC), 2022, pp. 5123–5129