Intelligent-Internet/II-Commons
The `II-Commons` repository contains tools for managing large text and image datasets, including loading, fetching, and embedding them.
The project helps organizations and individuals build and manage large, shared knowledge bases from text and image data. It takes raw text documents and images, processes them, and outputs an organized, searchable knowledge base for AI applications such as chatbots or content recommendation systems. Data scientists, AI researchers, and developers building AI-powered products can use it to manage their training data.
No commits in the last 6 months.
Use this if you need to create, manage, and retrieve information from large collections of text and image data for AI model training or information retrieval.
Not ideal if you're looking for a simple document search tool or don't need to process and embed very large datasets for AI applications.
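To make the "embed, then search" idea from the description concrete, here is a minimal sketch in plain Python. It does not use II-Commons or any of its APIs: the `embed`, `build_index`, and `search` names are hypothetical, and the bag-of-words vector is only a toy stand-in for the learned embeddings a real pipeline would use.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: lowercase bag-of-words counts (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(docs):
    """Pair every document with its vector; a real system would store these in a database."""
    return [(doc, embed(doc)) for doc in docs]


def search(index, query, k=1):
    """Return the k documents whose vectors are most similar to the query vector."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


docs = [
    "cats are small domestic animals",
    "python is a programming language",
    "the stock market fell today",
]
index = build_index(docs)
top = search(index, "which programming language should I learn")
```

Here `top` holds the document that shares the most vocabulary with the query. At the dataset sizes this project targets, the linear scan in `search` would typically give way to a vector index, but the overall shape (embed once, store, compare at query time) stays the same.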
Stars: 33
Forks: 13
Language: Python
License: Apache-2.0
Category:
Last pushed: Jul 22, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/rag/Intelligent-Internet/II-Commons"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
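The same request can be made from Python with only the standard library. This is a sketch under two assumptions that the page does not confirm: that the URL path follows a category/owner/repo pattern (inferred from the single example above) and that the endpoint returns JSON. The `build_quality_url` and `fetch_quality` helpers are hypothetical names.

```python
import json
import urllib.request

# Base path taken from the curl example; the segment layout after it is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(owner, repo, category="rag"):
    """Build the endpoint URL, assuming a <category>/<owner>/<repo> path layout."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(owner, repo, category="rag", timeout=10):
    """Fetch and decode the response, assuming the body is JSON."""
    url = build_quality_url(owner, repo, category)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Reproduces the URL used in the curl example.
url = build_quality_url("Intelligent-Internet", "II-Commons")
```

Calling `fetch_quality("Intelligent-Internet", "II-Commons")` performs the network request. With the keyless limit of 100 requests/day, it is worth caching the result rather than calling it in a loop.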
Higher-rated alternatives
WangRongsheng/awesome-LLM-resources
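🧑‍🚀 The world's best summary of LLM resources (multimodal generation, agents, coding assistance, AI paper review, data processing, model training, model inference, o1 models, MCP, small language models, vision-language models) | Summary of the...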
SylphAI-Inc/AdalFlow
AdalFlow: The library to build & auto-optimize LLM applications.
LazyAGI/LazyLLM
The easiest and laziest way to build multi-agent LLM applications.
luhengshiwo/LLMForEverybody
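LLM knowledge sharing that everyone can understand; a must-read before LLM interviews in the spring and autumn recruiting seasons, so you can talk confidently with your interviewer.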
katanaml/sparrow
Structured data extraction and instruction calling with ML, LLM and Vision LLM