AMwithLLMs-Meta-Llama-3.1-8B-Instruct-bnb-4bit

This model is a fine-tuned version of unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit on the Persuasive Essays (PE), Cornell eRulemaking Corpus (CDCP), and Abstracts of Randomized Control Trials (AbstRCT) datasets. It implements the fine-tuning process described in Argument Mining with Fine-Tuned Large Language Models (Cabessa et al., COLING 2025); the accompanying code is available at https://github.com/mohammadoumar/AMwithLLMs.
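
The snippet below is a minimal inference sketch, not the authors' exact pipeline: it assumes the adapter is hosted as andrewelawrence/AMwithLLMs-Meta-Llama-3.1-8B-Instruct-bnb-4bit and uses a simple chat-style prompt; the actual prompt templates for PE, CDCP, and AbstRCT are defined in the AMwithLLMs repository. Because the base checkpoint is pre-quantized to 4 bits, bitsandbytes must be installed.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit"
# Assumed adapter repo id; replace with the id you actually use.
adapter_id = "andrewelawrence/AMwithLLMs-Meta-Llama-3.1-8B-Instruct-bnb-4bit"

tokenizer = AutoTokenizer.from_pretrained(base_id)
# The base model is a bnb-4bit checkpoint, so it loads with its embedded quantization config.
model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(model, adapter_id)  # attach the fine-tuned adapter
model.eval()

# Illustrative argument-mining prompt; see the AMwithLLMs repo for the real templates.
messages = [{"role": "user", "content": "Classify the argumentative component: "
             "'Museums and art galleries provide a better understanding about arts than the Internet.'"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    output_ids = model.generate(input_ids, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```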

Citation

@inproceedings{cabessa-etal-2025-argument,
    author = "Cabessa, Jeremie and Hernault, Hugo and Mushtaq, Umer",
    title = "Argument Mining with Fine-Tuned Large Language Models",
    publisher = "Association for Computational Linguistics",
    booktitle = "Proceedings of the 31st International Conference on Computational Linguistics",
    editor = "Rambow, Owen and Wanner, Leo and Apidianaki, Marianna and Al-Khalifa, Hend and Eugenio, Barbara Di and Schockaert, Steven",
    month = jan,
    year = "2025",
    address = "Abu Dhabi, UAE",
    url = "https://aclanthology.org/2025.coling-main.442/",
    pages = "6624--6635",
}

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch is shown after the list):

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5
  • mixed_precision_training: Native AMP
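
As a rough sketch, the values above map onto a Hugging Face TrainingArguments configuration like the one below. This is illustrative only: the actual training scripts (which add Unsloth and PEFT/LoRA on top of this configuration) live in the AMwithLLMs repository, and output_dir is a placeholder.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="outputs",              # placeholder path
    learning_rate=5e-5,
    per_device_train_batch_size=2,     # 2 per device x 4 GPUs x 4 accumulation steps = 32 total
    per_device_eval_batch_size=8,      # 8 per device x 4 GPUs = 32 total eval batch
    gradient_accumulation_steps=4,
    num_train_epochs=5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
    fp16=True,                         # native AMP mixed precision
)
```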

Framework versions

  • PEFT 0.15.2
  • Transformers 4.52.4
  • PyTorch 2.7.1+cu126
  • Datasets 3.6.0
  • Tokenizers 0.21.1