Çağrı Çöltekin, Taraka Rama

Abstract
This paper describes our results at the NLI Shared Task 2017. We participated in the essay, speech, and fusion tracks, which use text, speech, and i-vectors to identify the native language of the author or speaker of the given input. In the essay track, a linear SVM system using word bigrams and character 7-grams performed best. In the speech track, an LDA classifier based only on i-vectors performed better than a combination system using i-vectors and text features from the speech transcriptions. In the fusion track, we experimented with systems that combined i-vectors with higher-order n-gram features, a system that combined i-vectors with word unigrams, a mean probability ensemble, and a stacked ensemble. Our finding is that word unigrams in combination with i-vectors achieve a higher score than systems trained with a larger number of $n$-gram features. Our best-performing systems achieved F1-scores of 87.16%, 83.33%, and 91.75% on the essay, speech, and fusion tracks, respectively.
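To make the "mean probability ensemble" mentioned in the abstract concrete, here is a minimal sketch of the general technique: average the class probabilities produced by several component systems and predict the class with the highest mean. It is not the authors' implementation; the function name, the language labels, and the probability values are hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def mean_probability_ensemble(system_probs: List[Dict[str, float]]) -> str:
    """Average per-class probabilities across systems and return the top label.

    Each element of `system_probs` maps every class label to the probability
    assigned by one component system (hypothetical inputs, same label set).
    """
    if not system_probs:
        raise ValueError("need at least one system")
    labels = list(system_probs[0].keys())
    mean = {
        label: sum(probs[label] for probs in system_probs) / len(system_probs)
        for label in labels
    }
    # Predict the label with the highest averaged probability.
    return max(mean, key=mean.get)


# Hypothetical outputs of a text-based system and an i-vector-based system.
text_system = {"TUR": 0.5, "GER": 0.3, "HIN": 0.2}
ivector_system = {"TUR": 0.2, "GER": 0.7, "HIN": 0.1}

prediction = mean_probability_ensemble([text_system, ivector_system])
```

In this toy case the two systems disagree, and the averaged probabilities favor the label that the i-vector system is more confident about. A stacked ensemble, by contrast, would train a second-level classifier on the component outputs rather than averaging them.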
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| native-language-identification-on-italki-nli | Tubasfs | Average F1: 0.5807 |