Strong-AI-Lab/Logical-and-abstract-reasoning

Evaluation on Logical Reasoning and Abstract Reasoning Challenges

/ 100

Emerging

This tool helps AI researchers and practitioners evaluate and fine-tune Large Language Models (LLMs) specifically on logical and abstract reasoning challenges. It takes existing LLM configurations and various reasoning datasets as input, then outputs performance metrics in a CSV file, showing how well the models understand and apply logic. Users can also fine-tune HuggingFace models on specific reasoning datasets to improve their performance.

No commits in the last 6 months.

Use this if you are an AI researcher or machine learning engineer looking to rigorously test and improve the logical and abstract reasoning capabilities of Large Language Models.

Not ideal if you are a general user looking for a pre-trained LLM for day-to-day tasks or if you are not comfortable with command-line interfaces and model configuration files.

AI research Large Language Models model evaluation natural language processing reasoning benchmarks

Stale 6m No Package No Dependents

Maintenance 2 / 25

Adoption 7 / 25

Maturity 16 / 25

Community 16 / 25

How are scores calculated?

Stars

Forks

Language

Python

License

MIT

Higher-rated alternatives

open-thought/reasoning-gym

[NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning with Verifiable Rewards

Hmbown/Hegelion

Dialectical reasoning architecture for LLMs (Thesis → Antithesis → Synthesis)

LLM360/Reasoning360

A repo for open research on building large reasoning models

bowang-lab/BioReason

BioReason: Incentivizing Multimodal Biological Reasoning within a DNA-LLM Model | NeurIPS '25

TsinghuaC3I/Awesome-RL-for-LRMs

A Survey of Reinforcement Learning for Large Reasoning Models

Explore LLM Tools

All categories Trending LLM Tool directory Insights