maty-bohacek/xgboost-vs-gpt4
Official Implementation of the 'When XGBoost Outperforms GPT-4 on Text Classification: A Case Study' NAACL-W 2024 paper
This project helps data scientists, machine learning engineers, and NLP researchers choose an efficient approach to text classification. It compares traditional ensemble methods such as XGBoost with large language models (LLMs) on tasks such as news trustworthiness classification, where text goes in and a classification decision comes out. It provides insight into when a simpler, more established pipeline can be more effective than a complex LLM.
No commits in the last 6 months.
Use this if you need to classify text and want to understand whether a traditional machine learning approach or a large language model is more suitable for your specific dataset and performance requirements.
Not ideal if you are looking for a pre-built, production-ready text classification API that you can use without any custom model training or configuration.
Stars: 16
Forks: 2
Language: Python
License: not specified
Category:
Last pushed: Dec 16, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/maty-bohacek/xgboost-vs-gpt4"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
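The same request can be made from Python. The following is a minimal sketch using only the standard library, not an official client. It assumes the endpoint returns JSON and that the URL follows a category/owner/repo layout, which is inferred from the single curl example above. The helper names build_quality_url and fetch_quality are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; everything after it is assumed
# to follow a <category>/<owner>/<repo> layout (an inference, not documented).
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL, percent-encoding each path segment."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the listing data for one repository.

    Assumes the response body is JSON; json.loads raises ValueError
    (json.JSONDecodeError) if it is not.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling fetch_quality("nlp", "maty-bohacek", "xgboost-vs-gpt4") requests the same URL as the curl command. Without a key, the 100 requests/day limit applies, so results are worth caching when looking up many repositories.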
Higher-rated alternatives
giacbrd/ShallowLearn
An experiment about re-implementing supervised learning models based on shallow neural network...
javedsha/text-classification
Machine Learning and NLP: Text Classification using python, scikit-learn and NLTK
Wluper/edm
Python package for understanding the difficulty of text classification datasets. (in CoNLL 2018)
chicago-justice-project/article-tagging
Natural Language Processing of Chicago news articles
fendouai/Awesome-Text-Classification
Awesome-Text-Classification Projects, Papers, Tutorial.