strubell/preprocess-conll05

Scripts for preprocessing the CoNLL-2005 SRL dataset.

30 / 100 (Emerging)

These scripts help computational linguists and NLP researchers prepare the CoNLL-2005 Semantic Role Labeling (SRL) dataset. They take the raw Penn Treebank and CoNLL-2005 data as input and produce structured text files with word forms, part-of-speech tags, gold syntax, and labeled semantic arguments, ready for training or evaluating SRL models.

No commits in the last 6 months.

Use this if you need to standardize and enrich the CoNLL-2005 dataset for semantic role labeling research, especially if you plan to convert constituency parses to dependency parses or use BIO format for span representation.

Not ideal if you are working with a different natural language processing task or dataset, as these scripts are specifically tailored for CoNLL-2005 SRL.
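As a rough illustration of the BIO span representation mentioned above, the sketch below converts one CoNLL-2005 style argument column (star-bracket tags such as `(A0*`, `*`, `*)`) into BIO labels. It is not taken from the repository's scripts; the function name `brackets_to_bio` is hypothetical, and it assumes the spans in a column are flat (not nested).

```python
from typing import List, Optional


def brackets_to_bio(tags: List[str]) -> List[str]:
    """Convert one star-bracket argument column to BIO labels.

    Hypothetical helper for illustration only; assumes flat (non-nested) spans.
    """
    bio: List[str] = []
    current: Optional[str] = None  # label of the span we are inside, if any
    for tag in tags:
        if tag.startswith("("):
            # Opening bracket: the label sits between "(" and "*".
            current = tag[1:tag.index("*")]
            bio.append("B-" + current)
        elif current is not None:
            # Still inside a span opened on an earlier token.
            bio.append("I-" + current)
        else:
            # Outside any argument span.
            bio.append("O")
        if tag.endswith(")"):
            # Closing bracket ends the current span after this token.
            current = None
    return bio
```

For a six-token sentence with the column `["(A0*", "*)", "(V*)", "(A1*", "*)", "*"]`, the function yields `["B-A0", "I-A0", "B-V", "B-A1", "I-A1", "O"]`.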

natural-language-processing semantic-role-labeling computational-linguistics text-annotation corpus-preparation
No License · Stale 6m · No Package · No Dependents
Maintenance 0 / 25
Adoption 6 / 25
Maturity 8 / 25
Community 16 / 25


Stars: 24
Forks: 6
Language: Shell
License: none listed
Last pushed: Mar 28, 2019
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/strubell/preprocess-conll05"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
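The same request can be made from Python using only the standard library. This is a minimal sketch: the helper names `build_url` and `fetch_quality` are hypothetical, the category/owner/repo path pattern is inferred from the single URL shown above, and the response body is returned as raw text because its schema is not documented here.

```python
import urllib.request

# Base path taken from the curl example; the path pattern below is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(category: str, owner: str, repo: str) -> str:
    """Build the request URL for one repository (hypothetical helper)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0) -> str:
    """Fetch the raw response body as text; the response format is not assumed."""
    with urllib.request.urlopen(build_url(category, owner, repo), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch_quality("nlp", "strubell", "preprocess-conll05")` requests the same URL as the curl command and counts against the same daily request limit.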