maty-bohacek/xgboost-vs-gpt4

Official Implementation of the 'When XGBoost Outperforms GPT-4 on Text Classification: A Case Study' NAACL-W 2024 paper

23 / 100 (Experimental)

This project compares traditional ensemble methods (such as XGBoost) with large language models (LLMs) on text classification tasks such as news trustworthiness classification, where text goes in and a classification decision comes out. It is aimed at data scientists, machine learning engineers, and NLP researchers, and offers insight into when a simpler, more established pipeline may be more effective than a complex LLM.

No commits in the last 6 months.

Use this if you need to classify text and want to understand whether a traditional machine learning approach or a large language model is more suitable for your specific dataset and performance requirements.

Not ideal if you are looking for a pre-built, production-ready text classification API that you can use without any custom model training or configuration.

text-classification news-analysis machine-learning-engineering natural-language-processing model-comparison
No License, Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 6 / 25
Maturity 8 / 25
Community 9 / 25


Stars: 16
Forks: 2
Language: Python
License: None
Last pushed: Dec 16, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/maty-bohacek/xgboost-vs-gpt4"

Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000 requests/day.
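
The same request can be made from Python. The following is a minimal sketch using only the standard library. The helper names `quality_url` and `fetch_quality` are hypothetical, the `category/owner/repo` path layout is inferred from the single curl example above, and the code assumes the endpoint returns JSON, which is not confirmed here.

```python
import json
import urllib.request

# Base path taken from the curl example; everything after it is assumed
# to follow the pattern <category>/<owner>/<repo>.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (hypothetical helper, pattern inferred)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality data; assumes the response body is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("nlp", "maty-bohacek", "xgboost-vs-gpt4")` requests the same URL as the curl command. Unauthenticated calls count against the 100 requests/day limit, so cache the result if you poll regularly.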