
Emin Temiz PRO

etemiz
https://pickabrain.ai

AI & ML interests

Alignment

Recent Activity

updated a model 1 day ago
etemiz/Ostrich-32B-Qwen3-260120-bnb-4bit
published a model 1 day ago
etemiz/Ostrich-32B-Qwen3-260120-bnb-4bit
posted an update 5 days ago
Which one is better for alignment: ORPO or GSPO? I think ORPO is pretty good and fast, but GSPO makes the model attack its own opinions, reflecting on itself and correcting itself. Although GSPO is much slower, it may still be pretty effective. And for GSPO you don't have to provide a whole reasoning corpus; you just provide the end result (maybe one word answering a binary question). GSPO may also be better than GRPO because it rewards whole trains of thought, whereas GRPO rewards single tokens. Alignment is mostly a train of thought, not a single token like a math answer.
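A minimal sketch of the odds-ratio term that ORPO optimizes, assuming the formulation from the ORPO paper (function names and the example log-probabilities are mine; this omits the SFT loss term and the λ weighting that the full objective adds):

```python
import math

def seq_logprob(token_logprobs):
    """Length-normalized log-probability of a sequence (P_theta in ORPO)."""
    return sum(token_logprobs) / len(token_logprobs)

def log_odds(logp):
    """log(p / (1 - p)), computed in log space for numerical stability."""
    return logp - math.log1p(-math.exp(logp))

def orpo_odds_ratio_loss(chosen, rejected):
    """ORPO preference term: -log sigmoid(log odds(chosen) - log odds(rejected))."""
    ratio = log_odds(seq_logprob(chosen)) - log_odds(seq_logprob(rejected))
    return -math.log(1.0 / (1.0 + math.exp(-ratio)))  # -log sigmoid(ratio)

# Hypothetical per-token logprobs from a policy model:
chosen_lp = [-0.2, -0.1, -0.3]    # preferred answer, assigned high probability
rejected_lp = [-1.5, -2.0, -1.0]  # dispreferred answer, low probability
print(orpo_odds_ratio_loss(chosen_lp, rejected_lp))  # small loss: preference already satisfied
```

When the chosen answer is already more probable than the rejected one, the sigmoid exceeds 0.5 and the loss falls below log 2; swapping the pair pushes it above log 2, which is what drives the gradient toward the preferred completion.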

Organizations

None yet

etemiz's datasets

None public yet