messiaen/full-lattice-search

Full Text Search Over Probabilistic Lattices with Elasticsearch!

Score: 34 / 100 (Emerging)

This tool searches large collections of audio transcripts, scanned documents, or machine translations that may contain errors or alternative interpretations. It takes probabilistic 'lattices' (such as those produced by an ASR or OCR system), which record several candidate words or phrases at each position, and makes them searchable. Searches can therefore return relevant results even when the original transcription is uncertain. It is aimed at data analysts, linguists, and operations teams working with imperfect output from automated processing.
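To make the idea of searching a lattice concrete, here is a minimal Python sketch. It assumes a simplified, confusion-network-style lattice (one slot per position, each slot mapping candidate tokens to probabilities). The `Lattice` type, the `phrase_score` function, and the sample data are hypothetical illustrations; they are not the project's data format or its Elasticsearch integration, neither of which is described on this page.

```python
from typing import Dict, List

# Hypothetical simplified lattice: one dict per position, mapping each
# candidate token at that position to its probability.
Lattice = List[Dict[str, float]]


def phrase_score(lattice: Lattice, phrase: List[str]) -> float:
    """Return the highest probability of `phrase` matched over consecutive
    slots, or 0.0 if the phrase does not occur anywhere in the lattice."""
    n = len(phrase)
    if n == 0:
        return 0.0
    best = 0.0
    for start in range(len(lattice) - n + 1):
        score = 1.0
        for offset, token in enumerate(phrase):
            # A token missing from a slot contributes probability 0.
            score *= lattice[start + offset].get(token, 0.0)
            if score == 0.0:
                break
        best = max(best, score)
    return best


# Made-up sample: two competing readings of the same short utterance.
example: Lattice = [
    {"recognize": 0.75, "wreck": 0.25},
    {"speech": 0.5, "a": 0.5},
]

if __name__ == "__main__":
    print(phrase_score(example, ["recognize", "speech"]))
    print(phrase_score(example, ["wreck", "a"]))
```

A plain single-best transcript would keep only one reading per position, so a query for the lower-ranked alternative would find nothing. Keeping the alternatives with their probabilities lets a search match either reading and rank it by how likely it is.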

No commits in the last 6 months.

Use this if you need to perform accurate full-text searches across vast amounts of automatically generated text, where each word or phrase might have multiple probabilistic alternatives.

Not ideal if your text data is already perfectly accurate and unambiguous, as the added complexity of lattice search won't provide significant benefits.

speech-to-text document-processing machine-translation linguistics information-retrieval
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 5 / 25
Maturity 16 / 25
Community 13 / 25
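The page does not state how the overall score is derived, but the four sub-scores above add up to the 34 / 100 shown near the top. The short sketch below checks that arithmetic; treating the total as a plain sum is an assumption consistent with the numbers shown, not a documented formula.

```python
# Sub-scores copied from the listing; each is out of 25.
sub_scores = {"Maintenance": 0, "Adoption": 5, "Maturity": 16, "Community": 13}

# Assumption: the overall score is the plain sum of the sub-scores.
total = sum(sub_scores.values())
maximum = 25 * len(sub_scores)

print(f"{total} / {maximum}")  # prints "34 / 100"
```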


Stars: 10
Forks: 2
Language: Java
License: Apache-2.0
Last pushed: Nov 20, 2020
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/messiaen/full-lattice-search"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
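The same request as the curl command above can be issued from Python. This is a minimal sketch using only the standard library. The helper names `quality_url` and `fetch_quality` are hypothetical, the category/owner/repository reading of the path segments is inferred from the single URL shown, and the response is assumed to be JSON because the page does not document its format.

```python
import json
import urllib.request

# Base path taken from the curl example on this page.
BASE = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (segment meanings are an assumption)."""
    return f"{BASE}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record and parse it, assuming the body is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    print(fetch_quality("voice-ai", "messiaen", "full-lattice-search"))
```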