# LexiMind: Multi-Task Transformer Model
LexiMind is a custom-built multi-task encoder-decoder Transformer that jointly performs abstractive summarization, emotion detection (multi-label, 28 classes), and topic classification (7 classes). It uses a FLAN-T5-base initialization with several architectural enhancements.
## Architecture
| Component | Detail |
|---|---|
| Base | FLAN-T5-base (272M parameters) |
| Encoder | 12 layers, 768 hidden dim, 12 heads |
| Decoder | 12 layers, 768 hidden dim, 12 heads |
| FFN | Gated-GELU, d_ff = 2048 |
| Position | Relative position bias (T5 style) |
| Vocab | 32,128 tokens (SentencePiece) |
| Summarization head | Decoder → linear projection → vocab |
| Emotion head | Attention-pooled encoder → 28-class sigmoid |
| Topic head | [CLS]-pooled encoder → 7-class softmax |
| Task sampling | Temperature-scaled proportional mixing (τ = 2.0) |
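Temperature-scaled mixing samples each task with probability proportional to its dataset size raised to 1/τ, so with τ = 2.0 smaller datasets are seen more often than pure proportional mixing would allow. A minimal sketch of the idea (dataset sizes below are illustrative placeholders, not the actual training counts):

```python
def task_probs(sizes: dict[str, int], tau: float = 2.0) -> dict[str, float]:
    """Temperature-scaled sampling: p_i ∝ n_i ** (1 / tau).

    tau = 1.0 recovers proportional mixing; larger tau flattens the
    distribution toward uniform, upweighting smaller datasets.
    """
    weights = {task: n ** (1.0 / tau) for task, n in sizes.items()}
    total = sum(weights.values())
    return {task: w / total for task, w in weights.items()}

# Illustrative sizes only.
sizes = {"summarization": 300_000, "emotion": 58_000, "topic": 120_000}
probs = task_probs(sizes)
```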
## Training
- Data: CNN/DailyMail + BookSum (summarization), GoEmotions (emotion), AG News (topic)
- Epochs: 8 (~9 hours on a single NVIDIA RTX 4070)
- Optimizer: AdamW, lr = 3e-4, weight decay = 0.01
- Scheduler: Linear warmup (500 steps) + cosine decay
- Gradient clipping: max norm = 1.0
- Mixed precision: FP16 via PyTorch AMP
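The schedule above (500 warmup steps into cosine decay) can be sketched as a step-indexed learning-rate function; this is an illustrative reimplementation, not the repository's actual scheduler code:

```python
import math

BASE_LR = 3e-4
WARMUP_STEPS = 500

def lr_at(step: int, total_steps: int) -> float:
    """Linear warmup for WARMUP_STEPS steps, then cosine decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / max(1, total_steps - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))
```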
## Evaluation Results
| Task | Metric | Value |
|---|---|---|
| Summarization | ROUGE-1 | 0.309 |
| Summarization | ROUGE-L | 0.185 |
| Summarization | BLEU-4 | 0.024 |
| Topic Classification | Accuracy | 85.7% |
| Topic Classification | Macro F1 | 0.854 |
| Emotion Detection | Sample-Avg F1 | 0.352 |
| Emotion Detection | Micro F1 | 0.443 |
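The two emotion F1 variants reported above measure different things: micro F1 pools true/false positives across all 28 labels, while sample-averaged F1 scores each example separately and averages. A self-contained sketch on set-valued predictions (toy code, not the actual evaluation script):

```python
def micro_f1(y_true: list[set[int]], y_pred: list[set[int]]) -> float:
    """Pool TP/FP/FN over all examples and labels, then compute one F1."""
    tp = sum(len(t & p) for t, p in zip(y_true, y_pred))
    fp = sum(len(p - t) for t, p in zip(y_true, y_pred))
    fn = sum(len(t - p) for t, p in zip(y_true, y_pred))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def sample_avg_f1(y_true: list[set[int]], y_pred: list[set[int]]) -> float:
    """Compute F1 per example, then average over examples."""
    scores = []
    for t, p in zip(y_true, y_pred):
        tp = len(t & p)
        scores.append(2 * tp / (len(t) + len(p)) if (t or p) else 1.0)
    return sum(scores) / len(scores)
```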
## Files

| File | Description |
|---|---|
| `best.pt` | Full model checkpoint (state dict + optimizer + metadata) |
| `labels.json` | Emotion (28) and topic (7) label mappings |
| `tokenizer.json` | SentencePiece tokenizer (flat format) |
| `hf_tokenizer/` | HuggingFace-compatible tokenizer directory |
## Usage

```python
import torch

from src.models.factory import build_model
from src.utils.io import load_labels

# `config` is the model configuration used at training time;
# see the repository for how it is constructed.
labels = load_labels("labels.json")
model = build_model(config, labels)

ckpt = torch.load("best.pt", map_location="cpu")
model.load_state_dict(ckpt["model_state_dict"])
model.eval()
```
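Because the emotion head is multi-label, its 28 sigmoid outputs must be thresholded to produce label names. A minimal post-processing sketch, assuming `labels.json` yields an index-to-name mapping and using a 0.5 threshold (both are assumptions, not confirmed by the repository):

```python
import math

def decode_emotions(logits: list[float], id2label: dict[int, str],
                    threshold: float = 0.5) -> list[str]:
    """Apply a sigmoid to each emotion logit and keep labels above threshold.

    The mapping and threshold are illustrative; adapt to the repo's actual
    label format.
    """
    probs = [1.0 / (1.0 + math.exp(-z)) for z in logits]
    return [id2label[i] for i, p in enumerate(probs) if p >= threshold]
```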
See the full codebase at [github.com/OliverPerrin/LexiMind](https://github.com/OliverPerrin/LexiMind) for inference scripts, the API server, and the Gradio demo.
## License

MIT