A Burstiness-aware Approach for Document Dating

Kjetil Nørvåg, Nattiya Kanhabua, Dimitrios Gunopulos, Dimitrios Kotzias, Theodoros Lappas, Dimitrios Kotsakos

Abstract

A large number of mainstream applications, such as temporal search, event detection, and trend identification, assume knowledge of the timestamp of every document in a given textual collection. In many cases, however, the required timestamps are either unavailable or ambiguous. A characteristic instance of this problem emerges in the context of large repositories of old digitized documents. For such documents, the timestamp may be corrupted during the digitization process, or may simply be unavailable. In this paper, we study the task of approximating the timestamp of a document, known as document dating. We propose a content-based method that builds on recent advances in the domain of term burstiness, which allow it to overcome the drawbacks of previous document dating methods, e.g., the fixed time partition strategy. We use an extensive experimental evaluation on different datasets to validate the efficacy and advantages of our methodology, showing that our method outperforms state-of-the-art methods on document dating.
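To make the notion of term burstiness concrete, the following is a toy sketch in Python. It is not the algorithm from the paper: the threshold rule (a count above `factor` times the term's mean), the voting scheme, and the names `bursty_intervals` and `estimate_time` are all assumptions made for illustration. It only shows how burst intervals of a document's terms could, in principle, point to a time step without splitting the timeline into fixed partitions.

```python
from collections import Counter


def bursty_intervals(counts, factor=2.0):
    """Return maximal runs of time steps whose count exceeds factor * mean.

    `counts` is a hypothetical per-time-step frequency series for one term.
    The thresholding rule is an illustrative assumption, not the paper's.
    """
    mean = sum(counts) / len(counts)
    intervals, start = [], None
    for t, c in enumerate(counts):
        if c > factor * mean:
            if start is None:
                start = t  # a new burst begins here
        elif start is not None:
            intervals.append((start, t - 1))  # the burst ended at t - 1
            start = None
    if start is not None:
        intervals.append((start, len(counts) - 1))
    return intervals


def estimate_time(doc_terms, term_counts, factor=2.0):
    """Pick the time step covered by the most burst intervals of the
    document's terms (earliest step on ties); None if no term is bursty."""
    votes = Counter()
    for term in set(doc_terms):
        series = term_counts.get(term, [0])  # unseen terms never burst
        for lo, hi in bursty_intervals(series, factor):
            for t in range(lo, hi + 1):
                votes[t] += 1
    if not votes:
        return None
    best = max(votes.values())
    return min(t for t, v in votes.items() if v == best)


# Hypothetical frequency series over six time steps.
term_counts = {
    "election": [1, 1, 9, 8, 1, 1],
    "storm": [0, 0, 0, 7, 0, 0],
    "the": [5, 5, 5, 5, 5, 5],
}
print(estimate_time(["election", "storm", "the"], term_counts))  # prints 3
```

In this toy data, "election" bursts over steps 2 to 3 and "storm" at step 3, while the uniformly frequent "the" never bursts, so step 3 collects the most votes.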

Benchmarks

Benchmark                 Methodology       Metrics
document-dating-on-apw    BurstySimDater    Accuracy: 45.9
document-dating-on-nyt    BurstySimDater    Accuracy: 38.5
