Yes it is. world_size refers to the total number of GPUs and, in general (with exceptions, most commonly EP), is simply the product of all parallelism dimensions. data_parallel_size = dp_shard_size * dp_replicate_size, and this quantity (the dp_world_size) denotes how many distinct batches you dispatch per step. The number of GPUs each batch runs on (i.e. the degree of model parallelism) is the product of the non-data-parallel sizes: non_data_parallel_size = tp_size * cp_size * sp_size * pp_size. Combining the two gives world_size = data_parallel_size * non_data_parallel_size.
All the assumptions above hold only under certain constraints, which can be broken by e.g. EP, which in the most common implementations borrows ranks from dp_shard_size.
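The arithmetic above can be sketched as a quick sanity check. This is a minimal illustration, not tied to any particular framework; the variable names and example degrees are assumptions mirroring the ones in the answer:

```python
# Illustrative parallelism degrees (hypothetical values).
dp_shard_size = 4      # FSDP-style sharded data parallel degree
dp_replicate_size = 2  # DDP-style replicated data parallel degree
tp_size = 2            # tensor parallel
cp_size = 1            # context parallel
sp_size = 1            # sequence parallel
pp_size = 2            # pipeline parallel

# How many distinct batches are dispatched per step:
data_parallel_size = dp_shard_size * dp_replicate_size

# How many GPUs each batch runs on (degree of model parallelism):
non_data_parallel_size = tp_size * cp_size * sp_size * pp_size

# Total GPU count is the product of the two.
world_size = data_parallel_size * non_data_parallel_size
print(data_parallel_size, non_data_parallel_size, world_size)  # 8 4 32
```

Note this check no longer holds verbatim once EP enters the picture, since EP typically reuses ranks from the dp_shard dimension rather than adding a new factor.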