BramVanroy/spacy_conll
Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Doc and its sentences and tokens. Can also be used as a command-line tool.
This tool helps linguists, computational linguists, and NLP researchers convert raw text into CoNLL-U, a standardized format for linguistic annotation. It processes the input text with an NLP model (such as spaCy, Stanza, or UDPipe) and emits annotations such as parts of speech, lemmas, and dependencies as CoNLL-U, either as plain text or as a pandas DataFrame. Typical users need precise grammatical and syntactic information about their text.
Used by 1 other package. No commits in the last 6 months. Available on PyPI.
Use this if you need to process text and extract detailed grammatical information in the CoNLL-U format for linguistic analysis, dataset creation, or further NLP tasks.
Not ideal if you only need high-level text summaries or general sentiment analysis, as its primary purpose is deep linguistic annotation.
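To make the CoNLL-U output mentioned above more concrete, here is a minimal sketch of the format itself: ten tab-separated fields per token, with a blank line closing each sentence. It uses only the Python standard library and does not call spacy_conll or any parser. The helper names `to_conllu` and `from_conllu` and the sample annotations are hypothetical and written by hand for illustration.

```python
# Column order defined by the CoNLL-U format (10 tab-separated fields per token).
CONLLU_FIELDS = (
    "ID", "FORM", "LEMMA", "UPOS", "XPOS",
    "FEATS", "HEAD", "DEPREL", "DEPS", "MISC",
)


def to_conllu(tokens):
    """Render one sentence (a list of dicts) as CoNLL-U text.

    Fields missing from a token are written as '_' (the CoNLL-U
    placeholder for an unspecified value).
    """
    lines = [
        "\t".join(str(tok.get(field, "_")) for field in CONLLU_FIELDS)
        for tok in tokens
    ]
    # A blank line terminates each sentence.
    return "\n".join(lines) + "\n\n"


def from_conllu(block):
    """Parse one CoNLL-U sentence block into a list of dicts.

    Blank lines and '#' comment lines are skipped; all values stay strings.
    """
    rows = []
    for line in block.splitlines():
        if not line or line.startswith("#"):
            continue
        rows.append(dict(zip(CONLLU_FIELDS, line.split("\t"))))
    return rows


# Hand-written annotations for illustration only (not produced by a parser).
sentence = [
    {"ID": 1, "FORM": "Dogs", "LEMMA": "dog", "UPOS": "NOUN", "HEAD": 2, "DEPREL": "nsubj"},
    {"ID": 2, "FORM": "bark", "LEMMA": "bark", "UPOS": "VERB", "HEAD": 0, "DEPREL": "root"},
]

conllu_text = to_conllu(sentence)
parsed = from_conllu(conllu_text)
print(conllu_text, end="")
```

In practice the package derives these fields from the parser's output, so a hand-built structure like `sentence` is only a stand-in. The round trip through `from_conllu` shows why the format suits dataset creation: each token becomes one row with fixed, named columns.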
Stars: 81
Forks: 18
Language: Python
License: BSD-2-Clause
Category:
Last pushed: Jul 02, 2024
Commits (30d): 0
Dependencies: 1
Reverse dependents: 1
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/BramVanroy/spacy_conll"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
Related tools
blmoistawinde/HarvestText
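Text mining and preprocessing toolkit (text cleaning, new word discovery, sentiment analysis, entity recognition and linking, keyword extraction, knowledge extraction, syntactic parsing, and more), using unsupervised or weakly supervised methods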
huspacy/huspacy
HuSpaCy: industrial-strength Hungarian natural language processing
bnosac/udpipe
R package for Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing Based...
polm/unidic-py
Unidic packaged for installation via pip.
tanloong/neosca
L2SCA & LCA fork: cross-platform, GUI, without Java dependency