Evaluation Metrics
A curated list of projects related to evaluation metrics for machine learning, information retrieval, and language models
⚡️A Blazing-Fast Python Library for Ranking Evaluation, Comparison, and Fusion 🐍
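Ranking evaluation of this kind typically scores a ranked result list against graded relevance judgments with measures such as nDCG. As a minimal, library-independent sketch (the function names here are illustrative, not this library's API):

```python
import math

def dcg(relevances):
    # Discounted cumulative gain: each graded relevance is
    # discounted by log2 of its (1-based) rank + 1.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))

def ndcg_at_k(relevances, k):
    # Normalize DCG@k by the DCG of the ideal (descending) ordering,
    # so a perfect ranking scores 1.0.
    ideal = sorted(relevances, reverse=True)
    idcg = dcg(ideal[:k])
    return dcg(relevances[:k]) / idcg if idcg > 0 else 0.0

# relevance grades of the documents in the order the system ranked them
print(ndcg_at_k([3, 2, 3, 0, 1, 2], k=6))
```

Libraries in this space add vectorized implementations, statistical significance testing for comparing runs, and rank-fusion methods on top of metrics like this.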
Python SDK for AI agent monitoring, LLM cost tracking, benchmarking, and more. Integrates with most LLMs and agent frameworks including OpenAI Agents SDK, CrewAI, Langchain, Autogen, AG2, and CamelAI
Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends
🦄 Unitxt is a Python library for enterprise-grade evaluation of AI performance, offering the world's largest catalog of tools and data for end-to-end AI benchmarking
A list of works on evaluation of visual generation models, including evaluation metrics, models, and systems
Full named-entity (i.e., not tag/token) evaluation metrics based on SemEval’13
A Python package of functions that compute the character error rate (CER) and word error rate (WER) of Korean STT (speech-to-text) transcription output
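Both CER and WER are edit-distance rates: the Levenshtein distance between hypothesis and reference (over characters or words) divided by the reference length. A minimal sketch, independent of this package's API:

```python
def edit_distance(ref, hyp):
    # Classic dynamic-programming Levenshtein distance over a
    # single rolling row (substitution, insertion, deletion all cost 1).
    dp = list(range(len(hyp) + 1))
    for i in range(1, len(ref) + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, len(hyp) + 1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,            # deletion
                        dp[j - 1] + 1,        # insertion
                        prev + (ref[i - 1] != hyp[j - 1]))  # substitution
            prev = cur
    return dp[-1]

def wer(reference, hypothesis):
    # Word error rate: edit distance over word tokens / reference word count.
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)

def cer(reference, hypothesis):
    # Character error rate: edit distance over characters / reference length.
    return edit_distance(list(reference), list(hypothesis)) / len(reference)
```

The same computation applies to Korean text; only the tokenization (characters vs. whitespace-separated words) differs between the two rates.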
The most comprehensive Python package for evaluating survival analysis models.
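A core survival-analysis metric is Harrell's concordance index: among comparable pairs of subjects, the fraction where the model assigns higher risk to the subject who experienced the event earlier. A minimal sketch (illustrative, not this package's API):

```python
def concordance_index(times, events, risk_scores):
    # Harrell's C-index. A pair (i, j) is comparable when subject i
    # had an observed event (events[i] == 1) strictly before time[j];
    # it is concordant when the model gave i the higher risk score.
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # ties count as half-concordant
    return concordant / comparable
```

Censored subjects (event = 0) contribute only as the later member of a pair, since their true event time is unknown; full-featured packages extend this with censoring-robust variants and time-dependent metrics.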