LVMED

Latvian Radiology Speech Corpus

An anonymised text corpus of digital imaging reports, consisting of manual transcriptions of examination dictations. The corpus covers the following modalities: computed tomography, magnetic resonance imaging, mammography, computed radiography (X-ray) and ultrasound.

Publication to be cited:
R. Dargis, N. Gruzitis, I. Auzina, K. Stepanovs
Creation of Language Resources for the Development of a Medical Speech Recognition System for Latvian
IOS Press, 2020
Corpus size: 30 hours (157k tokens)
Development period: 2022
Developers: Institute of Mathematics and Computer Science UL, Riga East University Hospital
Funding: European Regional Development Fund (