laiso/ts-bench
Measure and compare the performance of AI coding agents on TypeScript tasks.
This tool helps AI developers and researchers evaluate how different AI coding agents perform on TypeScript programming tasks. You provide an agent (such as Claude) and a TypeScript coding challenge, and it reports pass/fail results so you can compare how reliably different models generate correct code. It is aimed at anyone building, fine-tuning, or selecting AI models for code generation.
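The workflow this implies is a simple loop: for each task, let the agent write code, run the task's test suite, and record pass or fail. Below is a minimal TypeScript sketch of that loop; the Task and Agent shapes and the vitest test command are illustrative assumptions, not ts-bench's actual interfaces.

// Illustrative sketch only; ts-bench's real CLI and internals may differ.
import { execSync } from "node:child_process";

interface Task {
  name: string;
  dir: string; // project directory holding the task and its test suite (assumed layout)
}

// Hypothetical agent interface: given a task, the agent writes its solution into task.dir.
type Agent = (task: Task) => void;

function runBenchmark(agent: Agent, tasks: Task[]): Map<string, boolean> {
  const results = new Map<string, boolean>();
  for (const task of tasks) {
    agent(task); // let the agent attempt the task
    try {
      // A passing suite exits 0; any failure makes execSync throw.
      execSync("npx vitest run", { cwd: task.dir, stdio: "ignore" });
      results.set(task.name, true);
    } catch {
      results.set(task.name, false);
    }
  }
  return results;
}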
Use this if you need a quick way to benchmark and compare how effectively different AI coding agents solve TypeScript programming problems.
Not ideal if you require lab-grade, statistically rigorous evaluations of AI model performance.
Stars: 210
Forks: 10
Language: TypeScript
License: —
Category: —
Last pushed: Mar 12, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/agents/laiso/ts-bench"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
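A short TypeScript sketch of the same call (Node 18+ for the built-in fetch). Only the URL and rate limits come from this page; the Authorization header name and the response shape are assumptions, so check the API docs before relying on them.

// Sketch under stated assumptions; only the endpoint URL is taken from this page.
const BASE = "https://pt-edge.onrender.com/api/v1/quality/agents";

async function fetchQuality(repo: string, apiKey?: string): Promise<unknown> {
  const headers: Record<string, string> = {};
  if (apiKey) {
    headers["Authorization"] = `Bearer ${apiKey}`; // assumed header name
  }
  const res = await fetch(`${BASE}/${repo}`, { headers });
  if (!res.ok) {
    throw new Error(`Request failed: ${res.status} ${res.statusText}`);
  }
  return res.json(); // response shape undocumented here; inspect before using fields
}

// Anonymous access, 100 requests/day:
fetchQuality("laiso/ts-bench").then(console.log).catch(console.error);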
Higher-rated alternatives
- StonyBrookNLP/appworld: 🌍 AppWorld: A Controllable World of Apps and People for Benchmarking Function Calling and...
- qualifire-dev/rogue: AI Agent Evaluator & Red Team Platform
- microsoft/WindowsAgentArena: Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking of...
- future-agi/ai-evaluation: Evaluation Framework for all your AI related Workflows
- agentscope-ai/OpenJudge: OpenJudge: A Unified Framework for Holistic Evaluation and Quality Rewards