alex-raw/imsdb_parse

Parse movie scripts for linguistic analysis

Score: 20 / 100 (Experimental)

This tool helps linguists and researchers analyze movie and TV scripts by automatically breaking them down into their core components like dialogue, character names, and scene headings. You input raw HTML script files, and it outputs structured data in XML format, ready for detailed linguistic analysis. It's designed for anyone studying language patterns and structure within film and television content.
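To make the description above concrete, here is a minimal sketch of the general idea: tagging screenplay lines as scene headings, character names, dialogue, or action, and writing them out as XML. It is not taken from imsdb_parse. The regular expressions, the element names (`script`, `scene_heading`, `character`, `dialogue`, `action`), and the function names are all hypothetical, and the sketch assumes the script text has already been extracted from the HTML page.

```python
import re
import xml.etree.ElementTree as ET
from typing import Optional

# Hypothetical heuristics for illustration only; the real tool may work differently.
SCENE_RE = re.compile(r"^(INT\.|EXT\.|INT/EXT\.)")   # e.g. "INT. KITCHEN - DAY"
CHARACTER_RE = re.compile(r"^[A-Z][A-Z .'\-]*$")     # all-caps cue such as "ANNA"


def classify(line: str, prev_kind: Optional[str]) -> str:
    """Guess the element type of one non-empty script line."""
    text = line.strip()
    if SCENE_RE.match(text):
        return "scene_heading"
    if CHARACTER_RE.match(text):
        return "character"
    # Lines directly below a character cue (or more dialogue) count as dialogue.
    if prev_kind in ("character", "dialogue"):
        return "dialogue"
    return "action"


def script_to_xml(raw: str) -> ET.Element:
    """Build a flat XML tree with one child element per script line."""
    root = ET.Element("script")
    prev_kind = None
    for line in raw.splitlines():
        if not line.strip():
            prev_kind = None  # a blank line ends the current dialogue block
            continue
        kind = classify(line, prev_kind)
        element = ET.SubElement(root, kind)
        element.text = line.strip()
        prev_kind = kind
    return root


sample = """INT. KITCHEN - DAY

Rain hits the window.

ANNA
I told you not to call.
Not here.
"""

root = script_to_xml(sample)
print(ET.tostring(root, encoding="unicode"))
```

Real scripts vary far more than this sample, which is why the listing cautions against sources with diverse formatting styles.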

No commits in the last 6 months.

Use this if you need to reliably extract and categorize different elements from movie or TV scripts for linguistic research or content analysis.

Not ideal if you need to process scripts from a wide variety of sources beyond IMSDB, or if you require robust handling for diverse formatting styles.

Tags: linguistic-analysis, screenplay-analysis, corpus-linguistics, text-mining, media-studies
Status: Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 4 / 25
Maturity 16 / 25
Community 0 / 25

Stars: 7
Forks: (not listed)
Language: Python
License: GPL-3.0
Last pushed: Feb 16, 2022
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/alex-raw/imsdb_parse"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
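The same request can be made from Python with only the standard library. This is a sketch under assumptions: the URL pattern (`category/owner/repo`) is inferred from the single curl example above, the response is assumed to be JSON, and the helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL; the segment order is inferred from the curl example."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record, assuming the endpoint returns JSON."""
    with urllib.request.urlopen(quality_url(category, owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (performs a network request and counts against the daily limit):
# data = fetch_quality("nlp", "alex-raw", "imsdb_parse")
# print(data)
```

Since unauthenticated use is capped at 100 requests/day, caching the result locally is worth considering if you query many repositories.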