VMLU

Free

About

VMLU is a human-centric benchmark suite for assessing the overall capabilities of foundation models in the Vietnamese language. It comprises four datasets, Vi-MQA, Vi-SQuAD, Vi-DROP, and Vi-Dialog, which target general knowledge, reading comprehension, logical reasoning, and conversational ability. Vi-MQA, for instance, is a multiple-choice question answering benchmark covering 58 subjects across STEM, Humanities, Social Sciences, and 'Others', at difficulty levels ranging from elementary to professional. Its questions are drawn primarily from examinations set by educational institutions and the Ministry of Education and Training. By offering varied evaluation tasks, VMLU broadens Vietnamese NLP evaluation and supports the development of more robust foundation models. The datasets are available for download, and a GitHub repository provides documentation, benchmarking results, and replication code.
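As an illustration of how a multiple-choice benchmark of this kind is typically scored, the sketch below computes accuracy per subject category. It is not VMLU's official evaluation code: the record fields (`id`, `category`, `answer`) and the function name are hypothetical, and the actual dataset schema and scoring script should be taken from the project's GitHub repository.

```python
from collections import defaultdict


def accuracy_by_category(records, predictions):
    """Return accuracy per category for multiple-choice questions.

    records: list of dicts with hypothetical keys 'id', 'category', 'answer'.
    predictions: dict mapping question id -> predicted choice letter.
    A missing prediction counts as incorrect.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        cat = rec["category"]
        total[cat] += 1
        pred = predictions.get(rec["id"])
        # Compare choice letters case-insensitively, ignoring stray whitespace.
        if pred is not None and pred.strip().upper() == rec["answer"].strip().upper():
            correct[cat] += 1
    return {cat: correct[cat] / total[cat] for cat in total}


# Toy data, invented for illustration only (not taken from VMLU).
records = [
    {"id": "q1", "category": "STEM", "answer": "A"},
    {"id": "q2", "category": "STEM", "answer": "C"},
    {"id": "q3", "category": "Humanities", "answer": "B"},
]
predictions = {"q1": "A", "q2": "B", "q3": "b"}
scores = accuracy_by_category(records, predictions)
print(scores)
```

Reporting accuracy per category rather than a single overall number makes it easier to see where a model is weak, for example strong on Humanities but weaker on STEM.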

Platform
Web
Task
model benchmarking

Features

Vietnamese multitask language understanding benchmark

Accessible datasets and benchmarking code

Support for various difficulty levels (elementary to professional)

58 distinct subjects across diverse domains

Vi-Dialog: dialogue dataset for conversational ability

Vi-DROP: discrete reasoning over paragraphs

Vi-SQuAD: reading comprehension in the style of the Stanford Question Answering Dataset

Vi-MQA: multiple-choice question answering

Pricing Plans

Free
Free Plan

Access to all VMLU datasets

GitHub repository access

Benchmarking code

Publicly available model results

Alternatives

Needle-in-a-Needlestack

Needle-in-a-Needlestack is an open-source platform for benchmarking large language models on their long-context understanding and retrieval capabilities.
