proycon/folia
FoLiA (Format for Linguistic Annotation) is a rich XML-based annotation format for representing language resources, including corpora, with linguistic annotations. It supports a wide variety of linguistic annotations, making it a useful format for NLP tasks and data interchange. Note that the actual Python library for processing FoLiA is implemented as part of PyNLPl; this repository contains higher-level tools that use the library, as well as the full documentation, validation schemas, and set definitions.
This project provides FoLiA, a standardized XML-based format for storing and exchanging language resources with rich linguistic annotations. Raw text or existing annotated corpora can be represented as a structured FoLiA XML file that records their linguistic features. Linguists, computational linguists, and researchers working with annotated text data can use it to manage and share their datasets.
Used by 1 other package. Available on PyPI.
Use this if you need a flexible and highly expressive format to represent diverse linguistic annotations in your language resources or corpora.
Not ideal if you work primarily with plain, unstructured text that needs no linguistic markup.
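To make the idea of an XML file carrying linguistic annotations concrete, here is a minimal sketch that reads tokens, part-of-speech tags, and lemmas from a small FoLiA-style fragment. The fragment is hand-written and simplified for illustration: it omits the metadata and annotation declarations a real document carries and is not validated against the FoLiA schema. The class values (NOUN, VERB) and the helper name read_tokens are hypothetical. The sketch uses only the Python standard library, not the FoLiA library itself.

```python
import xml.etree.ElementTree as ET

# Namespaces: the FoLiA namespace and the standard XML namespace (for xml:id).
FOLIA_NS = "http://ilk.uvt.nl/folia"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Simplified, hand-written fragment (assumption: not schema-validated).
SAMPLE = """<FoLiA xmlns="http://ilk.uvt.nl/folia" xml:id="example" version="2.0">
  <text xml:id="example.text">
    <s xml:id="example.s.1">
      <w xml:id="example.s.1.w.1"><t>Dogs</t><pos class="NOUN"/><lemma class="dog"/></w>
      <w xml:id="example.s.1.w.2"><t>bark</t><pos class="VERB"/><lemma class="bark"/></w>
    </s>
  </text>
</FoLiA>"""


def read_tokens(xml_text):
    """Return one dict per <w> element with its id, text, POS class, and lemma class."""
    ns = {"f": FOLIA_NS}
    root = ET.fromstring(xml_text)
    tokens = []
    for w in root.iterfind(".//f:w", ns):
        pos = w.find("f:pos", ns)
        lemma = w.find("f:lemma", ns)
        tokens.append({
            "id": w.get(f"{{{XML_NS}}}id"),
            "text": w.findtext("f:t", default="", namespaces=ns),
            "pos": pos.get("class") if pos is not None else None,
            "lemma": lemma.get("class") if lemma is not None else None,
        })
    return tokens


for token in read_tokens(SAMPLE):
    print(token["id"], token["text"], token["pos"], token["lemma"])
```

For real corpora, the dedicated FoLiA tooling is likely the better choice, since it handles declarations, set definitions, and validation that this sketch ignores.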
Stars: 65
Forks: 10
Language: Python
License: GPL-3.0
Category:
Last pushed: Dec 09, 2025
Commits (30d): 0
Dependencies: 3
Reverse dependents: 1
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/proycon/folia"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
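The same request can be made from Python. The sketch below is a rough equivalent of the curl call, under two assumptions not confirmed by this page: that the path follows a category/owner/repo pattern (inferred from the single URL shown) and that the endpoint returns JSON. The function names quality_url and fetch_quality are hypothetical.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL (assumed pattern: /quality/<category>/<owner>/<repo>)."""
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch and decode the response, assuming the body is JSON."""
    with urllib.request.urlopen(quality_url(category, owner, repo), timeout=timeout) as resp:
        return json.load(resp)


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("nlp", "proycon", "folia")
```

Since the response fields are not documented here, inspect the returned object before relying on any particular key.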
Related tools
apache/opennlp - Apache OpenNLP
stanfordnlp/CoreNLP - CoreNLP: A Java suite of core NLP tools for tokenization, sentence segmentation, NER, parsing,...
stanfordnlp/python-stanford-corenlp - Python interface to CoreNLP using a bidirectional server-client interface.
dkpro/dkpro-core - Collection of software components for natural language processing (NLP) based on the Apache UIMA...
apache/opennlp-sandbox - Apache OpenNLP Sandbox