ML4GLand/SeqPro
Genomic sequence preprocessing toolkit
This toolkit helps bioinformaticians and genomic researchers prepare DNA/RNA (and some protein) sequence data for analysis. It takes raw sequence strings or numerical representations and returns processed sequences, such as one-hot encoded, padded, or reverse-complemented versions, ready for downstream computational tasks.
Use this if you need to preprocess genomic or proteomic sequence data with common operations such as one-hot encoding, padding, reverse complementing, or calculating sequence content.
Not ideal if your primary need is complex sequence alignment, de novo assembly, or deep phylogenetic analysis, as this tool focuses on preprocessing rather than advanced bioinformatics algorithms.
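To make the preprocessing steps named above concrete, here is a minimal sketch in plain NumPy. It does not use or reproduce SeqPro's own API; the function names (`reverse_complement`, `pad`, `one_hot`, `gc_content`), the `ACGT` column order, and the choice of `N` as padding character are all assumptions made for illustration.

```python
import numpy as np

ALPHABET = "ACGT"  # assumed DNA alphabet; also the column order of the one-hot matrix
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")  # N (unknown base) maps to itself


def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse the string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def pad(seq: str, length: int, pad_char: str = "N") -> str:
    """Right-pad with pad_char, or truncate, to exactly `length` characters."""
    return seq[:length].ljust(length, pad_char)


def one_hot(seq: str) -> np.ndarray:
    """Return a (len(seq), 4) uint8 matrix.

    Characters outside ALPHABET (for example N) become all-zero rows.
    """
    codes = np.frombuffer(seq.upper().encode("ascii"), dtype=np.uint8)
    alphabet = np.frombuffer(ALPHABET.encode("ascii"), dtype=np.uint8)
    # Broadcast compare: each sequence position against each alphabet letter.
    return (codes[:, None] == alphabet[None, :]).astype(np.uint8)


def gc_content(seq: str) -> float:
    """Fraction of G and C among called bases (A, C, G, T); 0.0 if there are none."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in ALPHABET)
    return (seq.count("G") + seq.count("C")) / called if called else 0.0
```

A typical pipeline under these assumptions would pad first and encode second, for example `one_hot(pad(reverse_complement("ACGTTG"), 8))`, which yields an 8 x 4 matrix whose last two rows are zero because they correspond to the `N` padding.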
Stars: 13
Forks: n/a
Language: Python
License: MIT
Category: (none listed)
Last pushed: Jan 13, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/ML4GLand/SeqPro"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
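The same request can be made from Python with only the standard library. This is a sketch under two assumptions that the page does not confirm: that the URL path follows a category/owner/repo layout (inferred from the single example URL above), and that the endpoint returns a JSON body. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL; the category/owner/repo layout is an assumption."""
    parts = (quote(part, safe="") for part in (category, owner, repo))
    return f"{BASE_URL}/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record and parse it, assuming the response body is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("ml-frameworks", "ML4GLand", "SeqPro")
```

Because the unauthenticated tier is limited to 100 requests/day, it is worth caching the parsed result rather than calling `fetch_quality` repeatedly.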
Higher-rated alternatives
helicalAI/helical: A framework for state-of-the-art pre-trained bio foundation models on genomics and...
instadeepai/nucleotide-transformer: Foundation Models for Genomics & Transcriptomics
ML-Bioinfo-CEITEC/genomic_benchmarks: Benchmarks for classification of genomic sequences
FunctionLab/selene: a framework for training sequence-level deep learning networks
modernatx/seqlike: Unified biological sequence manipulation in Python