A New Benchmark for Automatic Essay Scoring in Portuguese

This repository contains all the code used in the experiments reported in A New Benchmark for Automatic Essay Scoring in Portuguese, which was presented at PROPOR 2024.

Dataset

The dataset used in this project is the AES ENEM Dataset, which can be accessed at:

AES ENEM Dataset on Hugging Face

Loading the Dataset

The dataset can be loaded using the following commands:

For loading the sourceAWithGraders slice of the dataset:

from datasets import load_dataset
dataset = load_dataset("kamel-usp/aes_enem_dataset", "sourceAWithGraders", cache_dir="/tmp/aes_enem")

For loading the sourceB slice of the dataset:

dataset = load_dataset("kamel-usp/aes_enem_dataset", "sourceB", cache_dir="/tmp/aes_enem")

For loading the sourceAOnly slice of the dataset:

dataset = load_dataset("kamel-usp/aes_enem_dataset", "sourceAOnly", cache_dir="/tmp/aes_enem")
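
The three calls above differ only in the configuration name, so they can be folded into a single helper. The following is a minimal sketch, assuming the `datasets` library is installed and the Hugging Face Hub is reachable; the names `SLICES` and `load_all_slices` are illustrative and are not part of this repository.

```python
# Configuration names taken from the loading commands in this README.
SLICES = ("sourceAWithGraders", "sourceB", "sourceAOnly")


def load_all_slices(slices=SLICES, cache_dir="/tmp/aes_enem"):
    """Return a dict mapping each slice name to its loaded dataset.

    Hypothetical helper: it only wraps the load_dataset calls shown above.
    """
    # Imported lazily so this module can be imported without `datasets`.
    from datasets import load_dataset

    return {
        name: load_dataset("kamel-usp/aes_enem_dataset", name, cache_dir=cache_dir)
        for name in slices
    }
```

Calling `load_all_slices()` downloads every slice on first use and reuses the cache directory afterwards; pass a shorter tuple such as `("sourceB",)` to load a single slice.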

Notebooks

We have three Jupyter notebooks available for different purposes:

Train Models Experiment (train_models_experiment.ipynb):

Used for training models on sourceA data.

SourceB MLM Pretraining (sourceB_mlm_pretraining.ipynb):

Used for training on sourceB data without a classification head, using the MLM (Masked Language Modeling) loss.

SourceB Classification-Head Pretraining (sourceB_classification-head_pretraining.ipynb):

Used for fine-tuning on all sourceB data, for each available concept, using Ordinal Regression.
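
For readers unfamiliar with ordinal regression, the sketch below shows one common way to frame it: a grade level is encoded as a vector of cumulative binary targets ("is the grade above level k?"), and a prediction is decoded by counting the thresholds passed. This is an illustration under assumptions, not necessarily the formulation used in the notebooks; the function names are hypothetical, and the default of 6 levels simply mirrors the `num_labels=6` setting used elsewhere in this README.

```python
def ordinal_targets(label, num_levels=6):
    """Encode an integer grade level in [0, num_levels) as cumulative binary targets.

    Position k is 1 when label > k, so level 3 of 6 becomes [1, 1, 1, 0, 0].
    """
    if not 0 <= label < num_levels:
        raise ValueError(f"label must be in [0, {num_levels}), got {label}")
    return [1 if label > k else 0 for k in range(num_levels - 1)]


def decode_ordinal(probabilities, threshold=0.5):
    """Decode per-threshold probabilities back to a grade level.

    Counts how many thresholds are predicted as passed.
    """
    return sum(1 for p in probabilities if p > threshold)


targets = ordinal_targets(3)
predicted_level = decode_ordinal([0.9, 0.8, 0.6, 0.3, 0.1])
```

Unlike plain classification, this encoding tells the model that predicting level 2 for a level-3 essay is a smaller error than predicting level 0, because fewer of the binary targets are wrong.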

Models

All models trained using these notebooks are available on Hugging Face under the models tab:

Models on Hugging Face

The notebooks are configured through constants such as REFERENCE_CONCEPT, OBJECTIVE, and VARIANT. These select the concept to train on, the objective (classification, regression, or ordinal regression), and the BERT model variant (base or large).

Usage

To use the trained models, load them as follows:

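from transformers import AutoModelForSequenceClassification, AutoTokenizer

VARIANT = "base"  # or "large"
TOKENIZER_NAME = f"neuralmind/bert-{VARIANT}-portuguese-cased"
# Replace each {PLACEHOLDER} below with the values of the model you want to load.
MODEL_NAME = "kamel-usp/aes_enem_models-sourceA-{OBJECTIVE}-from-{FINETUNED-MODEL}-{VARIANT}-C{REFERENCE_CONCEPT}"

tokenizer = AutoTokenizer.from_pretrained(TOKENIZER_NAME, use_fast=True)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME,
    cache_dir="/tmp/",
    num_labels=6,  # use 1 for regression
)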

If you use regression, num_labels must be 1. All model variants are available on Hugging Face.
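
Because the model name follows a fixed pattern, it can be assembled from the same parameters the notebooks use. The following is a minimal sketch, assuming the `transformers` library is installed; the helper names are hypothetical, and the exact spelling of each value (for example, that the regression objective appears as `"regression"` in the repository name) is an assumption that should be checked against the models listed on Hugging Face.

```python
# Naming pattern copied from this README; the field names are illustrative.
MODEL_TEMPLATE = (
    "kamel-usp/aes_enem_models-sourceA-{objective}-from-"
    "{finetuned_model}-{variant}-C{reference_concept}"
)


def tokenizer_name(variant):
    """Tokenizer repository for the given BERT variant ("base" or "large")."""
    return f"neuralmind/bert-{variant}-portuguese-cased"


def model_name(objective, finetuned_model, variant, reference_concept):
    """Fill the naming pattern; values must match an existing repository."""
    return MODEL_TEMPLATE.format(
        objective=objective,
        finetuned_model=finetuned_model,
        variant=variant,
        reference_concept=reference_concept,
    )


def num_labels_for(objective):
    """1 for regression, 6 otherwise, as stated in this README.

    Assumption: the regression objective is spelled "regression".
    """
    return 1 if objective == "regression" else 6


def load_model(objective, finetuned_model, variant, reference_concept, cache_dir="/tmp/"):
    """Load the tokenizer and model for one parameter combination."""
    # Imported lazily so the name helpers work without `transformers`.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(tokenizer_name(variant), use_fast=True)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_name(objective, finetuned_model, variant, reference_concept),
        cache_dir=cache_dir,
        num_labels=num_labels_for(objective),
    )
    return tokenizer, model
```

Keeping the name construction separate from the download step makes it easy to print and verify the repository name before any network call is made.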

Contact

Please feel free to contact the authors at the email address given in the paper.

Citation

If you use this dataset or any of the associated resources in your research, please cite the following:

@inproceedings{silveira-etal-2024-new,
title = "A New Benchmark for Automatic Essay Scoring in {P}ortuguese",
author = "Silveira, Igor Cataneo and Barbosa, Andr{\'e} and Mau{\'a}, Denis Deratani",
editor = "Gamallo, Pablo and Claro, Daniela and Teixeira, Ant{\'o}nio and Real, Livy and
Garcia, Marcos and Oliveira, Hugo Gon{\c{c}}alo and Amaro, Raquel",
booktitle = "Proceedings of the 16th International Conference on Computational Processing of Portuguese - Vol. 1",
month = mar,
year = "2024",
address = "Santiago de Compostela, Galicia/Spain",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2024.propor-1.23",
pages = "228--237",
}