Language Model Pre-Training with Sparse Latent Typing

Liliang Ren, Zixuan Zhang, Han Wang, Clare R. Voss, ChengXiang Zhai, Heng Ji

Abstract
Modern large-scale Pre-trained Language Models (PLMs) have achieved tremendous success on a wide range of downstream tasks. However, most LM pre-training objectives focus only on text reconstruction and have not sought to learn latent-level, interpretable representations of sentences. In this paper, we push language models toward a deeper understanding of sentences by proposing a new pre-training objective, Sparse Latent Typing, which enables the model to sparsely extract sentence-level keywords with diverse latent types. Experimental results show that our model learns interpretable latent type categories in a self-supervised manner without using any external knowledge. Moreover, a language model pre-trained with this objective also significantly improves Information Extraction-related downstream tasks in both supervised and few-shot settings. Our code is publicly available at: https://github.com/renll/SparseLT.
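The abstract describes the mechanism only at a high level: tokens are sparsely selected as keywords and assigned discrete latent types, learned self-supervised alongside text reconstruction. The PyTorch sketch below shows one plausible way to realize such an objective; the class name `SparseLatentTyper`, the reserved non-keyword slot, the Gumbel-softmax relaxation, and the `sparsity_weight` coefficient are illustrative assumptions rather than the authors' implementation (see the linked repository for the actual code).

```python
# Minimal sketch of a sparse-latent-typing head on top of PLM token states.
# Each token is either assigned to a reserved "non-keyword" slot (index 0) or
# to one of K learned latent types; a sparsity penalty encourages the model to
# keep only a few keyword tokens per sentence. Illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SparseLatentTyper(nn.Module):
    def __init__(self, hidden_size: int, num_types: int = 16, sparsity_weight: float = 0.1):
        super().__init__()
        # Logits over K latent types plus one extra "non-keyword" slot (index 0).
        self.type_scorer = nn.Linear(hidden_size, num_types + 1)
        # Learned embedding for each latent type (a small type "codebook").
        self.type_embeddings = nn.Embedding(num_types + 1, hidden_size)
        self.sparsity_weight = sparsity_weight

    def forward(self, token_states: torch.Tensor, attention_mask: torch.Tensor):
        """token_states: (batch, seq_len, hidden); attention_mask: (batch, seq_len)."""
        logits = self.type_scorer(token_states)                      # (B, T, K+1)
        # Differentiable discrete assignment via straight-through Gumbel-softmax.
        assignments = F.gumbel_softmax(logits, tau=1.0, hard=True)   # one-hot per token
        typed_states = assignments @ self.type_embeddings.weight     # (B, T, H)

        # A token counts as a "keyword" whenever it avoids the non-keyword slot.
        keyword_prob = 1.0 - F.softmax(logits, dim=-1)[..., 0]       # (B, T)
        mask = attention_mask.float()
        sparsity_loss = (keyword_prob * mask).sum() / mask.sum()     # fewer keywords -> lower loss

        return typed_states, assignments, self.sparsity_weight * sparsity_loss
```

In pre-training, the returned sparsity penalty would be added to the usual text-reconstruction loss (e.g., masked language modeling) so that the few retained keywords stay informative rather than the gate collapsing to dropping every token.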
Code Repositories

renll/SparseLT (official): https://github.com/renll/SparseLT
Benchmarks
| Benchmark | Methodology | 5 way 1~2 shot | 5 way 5~10 shot | 10 way 1~2 shot | 10 way 5~10 shot |
|---|---|---|---|---|---|
| few-shot-ner-on-few-nerd-inter | BERT-SparseLT + CONTaiNER | 57.14 | 66.17 | 52.75 | 62.43 |
| few-shot-ner-on-few-nerd-intra | BERT-SparseLT + CONTaiNER | 47.20 | 59.67 | 40.48 | 53.04 |