PennShenLab/FREEFORM
FREEFORM | Knowledge-Driven Feature Selection and Engineering with Large Language Models
This tool helps researchers predict complex traits or diseases from genetic information by identifying and improving the most relevant genetic markers. It takes raw genotype data and a target phenotype, and outputs a refined set of genetic features that are easier to interpret and lead to more accurate predictions. This is for geneticists, biomedical researchers, and computational biologists working with complex genetic datasets.
Use this if you need to predict complex biological traits or diseases from high-dimensional genetic data, especially when training data is limited and you need to understand which specific genetic variations are most influential.
Not ideal if your data is not genotype-phenotype data, or if you do not want to draw on the biological knowledge in large language models for feature selection and engineering.
Stars: 11
Forks: 1
Language: Jupyter Notebook
License: —
Category:
Last pushed: Feb 16, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/PennShenLab/FREEFORM"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
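The same request can be made from a script. The following is a minimal sketch that uses only the Python standard library. The helper names `quality_url` and `fetch_quality_text` are hypothetical, the URL pattern is inferred from the single curl example above, and the response format is not documented here, so the body is returned as plain text rather than parsed.

```python
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example above; treating it as a general
# pattern for other repositories is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"


def quality_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository (hypothetical helper)."""
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality_text(owner: str, repo: str, timeout: float = 10.0) -> str:
    """Fetch the raw response body as text (hypothetical helper).

    The response format is an unknown here, so no parsing is attempted.
    No API key is sent, which matches the keyless tier described above.
    """
    with urllib.request.urlopen(quality_url(owner, repo), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch_quality_text("PennShenLab", "FREEFORM")` issues the same GET request as the curl command. If the body turns out to be JSON, it can be passed to `json.loads` afterwards.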
Higher-rated alternatives
NVIDIA-NeMo/Curator
Scalable data preprocessing and curation toolkit for LLMs
MigoXLab/dingo
Dingo: A Comprehensive AI Data, Model and Application Quality Evaluation Tool
data-prep-kit/data-prep-kit
Open source project for data preparation for GenAI applications
TheDataStation/pneuma
LLM-Powered Data Discovery System for Tabular Data
cleanlab/cleanlab-studio
Client interface to Cleanlab Studio