dbamman/book-nlp

Natural language processing pipeline for book-length documents (archival Java version; for current Python version, see: https://github.com/booknlp/booknlp)

Score: 37 / 100 (Emerging)

This tool helps researchers in literary studies and digital humanities automatically analyze long English texts such as novels. It processes a plain text file, identifying characters, mapping aliases to a single character, and attributing dialogue. It outputs a detailed annotated version of the text and a JSON file of character features, both suited to large-scale textual analysis.

316 stars. No commits in the last 6 months.

Use this if you need to deeply analyze literary texts, track characters, and understand narrative structure without manually reading and annotating every single book.

Not ideal if you're working with short documents, non-English texts, or primarily need a simple word count or topic modeling without deep character and discourse analysis.

literary-analysis digital-humanities textual-scholarship narrative-analysis character-studies
No License · Stale 6m · No Package · No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 8 / 25
Community 19 / 25
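
The four category scores above add up to the overall 37 / 100. The short check below makes that arithmetic explicit. It is only a sketch that assumes the overall score is a plain sum of the four categories, each capped at 25; the page itself does not state the formula, and the variable names are illustrative.

```python
# Category scores as listed on this page (each out of 25).
category_scores = {
    "Maintenance": 0,
    "Adoption": 10,
    "Maturity": 8,
    "Community": 19,
}

# Assumption: the overall score is the unweighted sum of the categories.
total = sum(category_scores.values())
max_total = 25 * len(category_scores)

print(f"{total} / {max_total}")  # prints "37 / 100"
```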


Stars: 316
Forks: 46
Language: Java
License: none
Last pushed: Feb 04, 2022
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/dbamman/book-nlp"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
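
The same request can be made from a script. The following is a minimal Python sketch using only the standard library. It assumes that the endpoint returns a JSON body (the response format is not documented on this page) and that the URL follows a category/owner/repo layout, which is inferred from the single example URL above. `quality_url` and `fetch_quality` are hypothetical helper names, not part of any published client.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL.

    Assumption: the path layout is <category>/<owner>/<repo>, inferred
    from the one example URL shown on the page.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return "/".join([BASE_URL, *parts])


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record and decode it, assuming the body is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only builds and prints the URL; call fetch_quality(...) to hit the network.
    print(quality_url("nlp", "dbamman", "book-nlp"))
```

Since unauthenticated access is limited to 100 requests per day, a script that polls many repositories may want to cache responses locally rather than refetch them.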