Latin scansion with neural networks

Luuk Nolden and Suzan Verberne

We study the scansion of Latin poetry using neural networks. Using training sets created with the rule-based approaches of the Pedecerto and Anceps projects, we first investigate the best way to scan dactylic verse. We then investigate whether a model trained on dactylic meter generalises to other metrical systems, such as the iambic trimeter. We find that an LSTM with one-hot encoding outperforms Conditional Random Fields when scanning dactylic meter, with F1 scores of 0.99 for the long, short and elision labels versus 0.90. The models also have no problems scanning Latin from different authors, time periods and genres. The only requirement seems to be a training set of at least 3000 lines of poetry to reach F1 scores of 0.95. Generalising the models trained on dactylic verse to trimeters seems unfeasible, with F1 scores of ~0.45 for the long and short labels and 0.80 for elision. Using word embeddings trained at the syllable level as input to the LSTM improves scores to ~0.55. However, training on noisy trimeter data (with anceps labels instead of dedicated long/short labels) results in F1 scores of 0.85. This suggests that a model performs best when trained and tested on the same metrical system.
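
To make the terms above concrete, the following is a minimal sketch of what one-hot encoding of syllables and a per-label F1 score look like. It is an illustration only and not the code used in this work: the function names `one_hot_encode` and `f1_per_label`, the syllabification of the example line, and the example prediction are all assumptions chosen for the sketch.

```python
# Illustrative sketch only: hypothetical helpers, not the authors' implementation.

LABELS = ("long", "short", "elision")


def one_hot_encode(syllables, vocabulary):
    """Map each syllable to a one-hot vector over a fixed syllable vocabulary."""
    index = {syllable: i for i, syllable in enumerate(vocabulary)}
    vectors = []
    for syllable in syllables:
        vector = [0] * len(vocabulary)
        vector[index[syllable]] = 1  # exactly one position is set per syllable
        vectors.append(vector)
    return vectors


def f1_per_label(gold, predicted, labels=LABELS):
    """Compute the F1 score separately for each label.

    F1 = 2*TP / (2*TP + FP + FN); a label with no gold or predicted
    occurrences is given a score of 0.0 here (an assumption of this sketch).
    """
    scores = {}
    for label in labels:
        pairs = list(zip(gold, predicted))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        denominator = 2 * tp + fp + fn
        scores[label] = 2 * tp / denominator if denominator else 0.0
    return scores


# Opening of a dactylic hexameter, "arma virumque cano", split by hand.
syllables = ["ar", "ma", "vi", "rum", "que", "ca", "no"]
gold = ["long", "short", "short", "long", "short", "short", "long"]
# A made-up model output with a single error on the third syllable.
predicted = ["long", "short", "long", "long", "short", "short", "long"]

vectors = one_hot_encode(syllables, sorted(set(syllables)))
scores = f1_per_label(gold, predicted)
print(len(vectors), len(vectors[0]))
print({label: round(score, 2) for label, score in scores.items()})
```

In a full pipeline, the one-hot vectors for a line would be fed to a sequence model such as an LSTM, which predicts one label per syllable; the per-label F1 is then computed over all syllables in the test set rather than over a single line as here.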