CoIR: A Comprehensive Benchmark for Code Information Retrieval Models
Xiangyang Li, Kuicai Dong, Yi Quan Lee, Wei Xia, Hao Zhang, Xinyi Dai, Yasheng Wang, Ruiming Tang

Abstract
Despite the substantial success of Information Retrieval (IR) in various NLP tasks, most IR systems predominantly handle queries and corpora in natural language, neglecting the domain of code retrieval. Code retrieval is critically important yet remains under-explored, with existing methods and benchmarks inadequately representing the diversity of code in various domains and tasks. Addressing this gap, we present COIR (Code Information Retrieval Benchmark), a robust and comprehensive benchmark specifically designed to assess code retrieval capabilities. COIR comprises ten meticulously curated code datasets, spanning eight distinctive retrieval tasks across seven diverse domains. We first discuss the construction of COIR and its diverse dataset composition. Further, we evaluate nine widely used retrieval models using COIR, uncovering significant difficulties in performing code retrieval tasks even with state-of-the-art systems. To facilitate easy adoption and integration within existing research workflows, COIR has been developed as a user-friendly Python framework, readily installable via pip. It shares the same data schema as other popular benchmarks like MTEB and BEIR, enabling seamless cross-benchmark evaluations. Through COIR, we aim to invigorate research in the code retrieval domain, providing a versatile benchmarking tool that encourages further development and exploration of code retrieval systems. https://github.com/CoIR-team/coir
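The abstract describes COIR as a pip-installable Python framework that follows the BEIR/MTEB data schema. The snippet below is a minimal sketch of what an evaluation run could look like; the package name (`coir`), the `get_tasks` / `COIR` / `YourCustomDEModel` helpers, the task identifier, and the encoder checkpoint are assumptions made for illustration, not the library's verified API.

```python
# Hypothetical sketch of a COIR evaluation run, assuming a BEIR-style toolkit.
# Module, class, and task names below are assumptions, not the verified API.
# Install (assumed package name): pip install coir-eval

from coir.data_loader import get_tasks          # assumed helper for loading datasets
from coir.evaluation import COIR                # assumed evaluation driver
from coir.models import YourCustomDEModel       # assumed dense-encoder wrapper

# Any dense encoder exposing BEIR-style encode_queries/encode_corpus methods
# should plug in here; the checkpoint name is only an example.
model = YourCustomDEModel(model_name="intfloat/e5-base-v2")

# Load one of the ten COIR datasets by its task identifier (assumed name).
tasks = get_tasks(tasks=["codetrans-dl"])

# Run retrieval and report nDCG@10, the metric used in the benchmark tables.
evaluation = COIR(tasks=tasks, batch_size=128)
results = evaluation.run(model, output_folder="results/e5-base-v2")
print(results)
```

Because the data schema matches BEIR and MTEB (corpus, queries, and qrels in the same format), models and evaluation scripts written for those benchmarks should transfer with little or no change.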
Code Repositories
https://github.com/CoIR-team/coir
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| Code Search on CoIR | Voyage-code-002 | nDCG@10: 56.26 |
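The benchmark row above reports nDCG@10, the metric used throughout COIR. For reference, here is a minimal sketch of how nDCG@10 is computed for a single query; the document IDs and relevance judgments are made up for illustration.

```python
import math

def ndcg_at_k(ranked_doc_ids, qrels, k=10):
    """nDCG@k for one query: the DCG of the top-k ranking divided by the DCG
    of an ideal ranking of the same relevance judgments."""
    gains = [qrels.get(doc_id, 0) for doc_id in ranked_doc_ids[:k]]
    dcg = sum(g / math.log2(i + 2) for i, g in enumerate(gains))
    ideal = sorted(qrels.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(i + 2) for i, g in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0

# Illustrative example: the one relevant snippet is retrieved at rank 2.
qrels = {"func_17": 1}                       # hypothetical relevance judgments
ranking = ["func_03", "func_17", "func_42"]  # hypothetical retrieved ids
print(round(ndcg_at_k(ranking, qrels), 4))   # 0.6309
```

A benchmark score such as 56.26 is this per-query value averaged over all queries in the dataset, expressed as a percentage.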