PennShenLab/FREEFORM

FREEFORM | Knowledge-Driven Feature Selection and Engineering with Large Language Models


Emerging

This tool helps researchers predict complex traits or diseases from genetic data by selecting the most relevant genetic markers and engineering improved features from them. It takes raw genotype data and a target phenotype as input and outputs a refined feature set that is easier to interpret and leads to more accurate predictions. It is intended for geneticists, biomedical researchers, and computational biologists working with complex genetic datasets.
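As a rough illustration of the select-then-engineer idea described above, the following minimal sketch keeps the top-ranked genetic markers and derives one new feature from them. It is not FREEFORM's API: the function names, the toy genotype rows, the marker ranking (standing in for what a language model might suggest), and the choice of an interaction term as the engineered feature are all hypothetical assumptions made for this example.

```python
# Illustrative sketch only: NOT FREEFORM's API.
# All function names, SNP names, and data below are hypothetical.

def select_features(genotypes, ranked_snps, k):
    """Keep only the k highest-ranked SNP columns in every sample row."""
    keep = ranked_snps[:k]
    return [{snp: row[snp] for snp in keep} for row in genotypes]


def add_interaction(rows, snp_a, snp_b, name):
    """Engineer one feature: the product of two minor-allele counts."""
    return [{**row, name: row[snp_a] * row[snp_b]} for row in rows]


# Toy genotype matrix: minor-allele counts (0, 1, or 2) per sample.
genotypes = [
    {"snp_1": 0, "snp_2": 2, "snp_3": 1, "snp_4": 0},
    {"snp_1": 1, "snp_2": 1, "snp_3": 0, "snp_4": 2},
    {"snp_1": 2, "snp_2": 0, "snp_3": 2, "snp_4": 1},
]

# Hypothetical relevance ranking, assumed to come from an external
# knowledge source such as a language model prompt.
ranked_snps = ["snp_3", "snp_1", "snp_4", "snp_2"]

selected = select_features(genotypes, ranked_snps, k=2)
features = add_interaction(selected, "snp_3", "snp_1", "snp_3_x_snp_1")

for row in features:
    print(row)
```

The resulting rows could then be passed to any downstream classifier or regressor together with the target phenotype; which model to use is left open here.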

Use this if you need to predict complex biological traits or diseases from high-dimensional genetic data, especially when data is limited and you need to understand which specific genetic variations are most influential.

Not ideal if your data does not pair genotypes with phenotypes, or if you do not want to use the biological knowledge encoded in large language models to improve your features.

genetics, biomedical research, phenotype prediction, genotype data analysis, hereditary diseases

No License, No Package, No Dependents

Maintenance 10 / 25

Adoption 5 / 25

Maturity 8 / 25

Community 7 / 25


Language: Jupyter Notebook

License: none

Higher-rated alternatives

NVIDIA-NeMo/Curator

Scalable data preprocessing and curation toolkit for LLMs

MigoXLab/dingo

Dingo: A Comprehensive AI Data, Model and Application Quality Evaluation Tool

data-prep-kit/data-prep-kit

Open source project for data preparation for GenAI applications

TheDataStation/pneuma

LLM-Powered Data Discovery System for Tabular Data

cleanlab/cleanlab-studio

Client interface to Cleanlab Studio
