
Would it be more efficient to have a specialized implementation instead of combining existing collectives?

Let the size of each $x_{i,j}$ be $n/p$ bytes.

  1. MPI_Reduce acts on the concatenation $x_{:,j}$, which has length $n$ bytes, hence the complexity is $\log_2(p)(\alpha + \beta n + \gamma n)$.

  2. MPI_Scatter has the same complexity as MPI_Gather (since it is the same operation run backwards in time): $\log_2(p) \alpha + \beta n$.

In total, the complexity is dominated by $\log_2(p) (\alpha + \beta n + \gamma n)$. Can we do better?

Start by exchanging between 1 and 2 and simultaneously between 3 and 4. Each process sends half of its data, so the complexity is $\alpha + 2(\beta + \gamma) n/4$.

| procid | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| | $x_{1,1} + x_{1,2}$ | | $x_{1,3} + x_{1,4}$ | |
| | | $x_{2,1} + x_{2,2}$ | | $x_{2,3} + x_{2,4}$ |
| | $x_{3,1} + x_{3,2}$ | | $x_{3,3} + x_{3,4}$ | |
| | | $x_{4,1} + x_{4,2}$ | | $x_{4,3} + x_{4,4}$ |

Next, we exchange between 1 and 3 and simultaneously between 2 and 4. The complexity is $\alpha + (\beta + \gamma) n/4$. In total, we have complexity

$$\begin{align} \log_2(p) \alpha + (\beta + \gamma) n(p/2 + \cdots + 4 + 2 + 1)/p & = \log_2(p) \alpha + (\beta + \gamma) n(p-1)/p\\ & \approx \log_2(p) \alpha + (\beta + \gamma) n. \end{align}$$

This is better than the approaches combining existing collectives above since we removed the $\log_2(p)$ in front of $\beta$ and $\gamma$.
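To compare the two approaches numerically, here is a minimal sketch in C of the cost formulas above. The function names are hypothetical, $p$ is assumed to be a power of two as in the 4-process example, and $\alpha$, $\beta$, $\gamma$ are plain numbers in arbitrary units.

```c
#include <assert.h>

/* Number of rounds log2(p); assumes p is a power of two. */
int log2_int(int p) {
    assert(p > 0 && (p & (p - 1)) == 0);
    int rounds = 0;
    while (p > 1) { p >>= 1; rounds++; }
    return rounds;
}

/* MPI_Reduce followed by MPI_Scatter:
   log2(p) (alpha + beta n + gamma n) + log2(p) alpha + beta n. */
double cost_reduce_then_scatter(int p, double n, double alpha, double beta, double gamma) {
    int rounds = log2_int(p);
    return rounds * (alpha + beta * n + gamma * n) + rounds * alpha + beta * n;
}

/* Specialized scheme: the first round exchanges n/2 bytes and each
   following round exchanges half as much as the previous one. */
double cost_recursive_halving(int p, double n, double alpha, double beta, double gamma) {
    int rounds = log2_int(p);
    double total = 0.0;
    double chunk = n / 2.0;
    for (int k = 0; k < rounds; k++) {
        total += alpha + (beta + gamma) * chunk;
        chunk /= 2.0;
    }
    return total;
}
```

Calling both functions with the same arguments shows how the gap grows with $p$, since only the first one multiplies the $\beta$ and $\gamma$ terms by $\log_2(p)$.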


All gather


Single Program Multiple Data (SPMD)


Reduce


Profiling with NVIDIA Nsight Systems


Topology


Distributed sum


Before

| procid | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| | $x$ | | | |

After

| procid | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| | $x$ | $x$ | $x$ | $x$ |

Fat-tree


What is the number of edges? What is the bisection width?

The number of edges is $n\log_2(n)$ and the bisection width is $n/2$.


Bisection bandwidth


What are the diameter and bisection width with $n$ computer nodes?

The diameter is $2\log_2(n)$ and the bisection width is 1.


There can be $n$ simultaneous communications, provided that each input communicates with a different output. The figure on the right provides an example of such non-conflicting communication, with the black dots indicating that the input of that row communicates with the corresponding output (case (a) of the above figure). The switch at row 1 and column 2 is just propagating the input data horizontally and the output data vertically (case (b) of the above figure). The switch at row 0 and column 5 is receiving no data.
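The non-conflict condition can be checked mechanically. The sketch below is only an illustration: the function name, the encoding of an idle input as -1 and the MAX_PORTS limit are assumptions, not part of the slides.

```c
#include <assert.h>

#define MAX_PORTS 64

/* output_of[i] is the output that input i talks to, or -1 if input i is idle.
   Returns 1 when no output is requested by two inputs, 0 otherwise. */
int crossbar_conflict_free(const int *output_of, int n) {
    assert(n >= 0 && n <= MAX_PORTS);
    int used[MAX_PORTS] = {0};
    for (int i = 0; i < n; i++) {
        int out = output_of[i];
        if (out < 0) continue;      /* idle input: nothing to route */
        assert(out < n);
        if (used[out]) return 0;    /* two inputs want the same output */
        used[out] = 1;
    }
    return 1;
}
```

Any permutation of the outputs passes the check, which is why up to $n$ communications can proceed at once.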


Would it be more efficient to have a specialized implementation instead of combining existing collectives?

Let the size of $x_i$ be $n/p$ bytes.

  1. MPI_Gather has complexity $\log_2(p)\alpha + \beta n$.

  2. MPI_Bcast acts on the concatenation $x_:$, which has length $n$ bytes, so the complexity is $\log_2(p) (\alpha + \beta n)$.

In total, the complexity is dominated by $\log_2(p) (\alpha + \beta n)$. Can we do better?

Start by exchanging between 1 and 2 and simultaneously between 3 and 4. Each process sends its single block, so the complexity is $\alpha + \beta n/4$.

| procid | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| | $x_1$ | $x_1$ | | |
| | $x_2$ | $x_2$ | | |
| | | | $x_3$ | $x_3$ |
| | | | $x_4$ | $x_4$ |

Next, we exchange between 1 and 3 and simultaneously between 2 and 4. The complexity is $\alpha + 2\beta n/4$. In total, we have complexity

$$\begin{align} \log_2(p) \alpha + \beta n(1 + 2 + 4 + \cdots + p/2)/p & = \log_2(p) \alpha + \beta n(p-1)/p\\ & \approx \log_2(p) \alpha + \beta n. \end{align}$$
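The exchange pattern above can be simulated in a single process to check that every process ends up with every block after sending $p - 1$ blocks. This is a sketch under stated assumptions: processes are numbered from 0 (so the exchange between 1 and 2 becomes 0 and 1), $p$ is a power of two, and the function name and bitmask encoding are made up for the illustration.

```c
#include <assert.h>

#define MAX_P 64

/* Simulate the pairwise exchanges on p processes (p a power of two, p <= MAX_P).
   held[i] is a bitmask of the blocks x_j currently stored on process i.
   Returns the number of blocks that each process has sent in total. */
int allgather_recursive_doubling(int p, unsigned long long held[]) {
    assert(p > 0 && p <= MAX_P && (p & (p - 1)) == 0);
    for (int i = 0; i < p; i++) held[i] = 1ULL << i;  /* process i starts with x_i */
    int blocks_sent = 0;
    for (int dist = 1; dist < p; dist <<= 1) {
        unsigned long long next[MAX_P];
        for (int i = 0; i < p; i++) {
            int partner = i ^ dist;                   /* exchange partner in this round */
            next[i] = held[i] | held[partner];
        }
        for (int i = 0; i < p; i++) held[i] = next[i];
        blocks_sent += dist;                          /* each process sends dist blocks */
    }
    return blocks_sent;
}
```

The returned count is $1 + 2 + \cdots + p/2 = p - 1$, which is the factor multiplying $\beta n/p$ in the formula.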


Launching a job

  int name_length = MPI_MAX_PROCESSOR_NAME;
  char proc_name[name_length];
  MPI_Get_processor_name(proc_name,&name_length);
  printf("Process %d/%d is running on node <<%s>>\n",
         procid,nprocs,proc_name);

Compiling : `mpicc -O3 -I/home/runner/.julia/artifacts/207eb5b8330e24674fe59b50d72f4b3d946219c8/include /tmp/jl_Ym4PTm/main.c -o /tmp/jl_Ym4PTm/bin`
Running : `mpiexec -n 2 /tmp/jl_Ym4PTm/bin`
Process 1/2 is running on node <>
Process 0/2 is running on node <>
  for (int size = 1; size <= (1<<20); size <<= 1) {
    char* buf = malloc(size);  // message buffer; its contents do not matter here
    if (procid == 0) {
      // Sender: wait until the receiver has posted its receive, then send.
      MPI_Barrier(MPI_COMM_WORLD);
      MPI_Send(buf, size, MPI_CHAR, procid + 1, 0, MPI_COMM_WORLD);
    }
    else {
      // Receiver: post the receive before the barrier, then time the wait for the data.
      MPI_Irecv(buf, size, MPI_CHAR, procid - 1, 0, MPI_COMM_WORLD, &rqst);
      MPI_Barrier(MPI_COMM_WORLD);
      double tic = MPI_Wtime();
      MPI_Wait(&rqst, MPI_STATUS_IGNORE);
      double toc = MPI_Wtime();
      printf("[%d] I have received %d B in %f sec\n", procid, size, (toc-tic));
    }
    free(buf);  // release the buffer before moving to the next message size
  }

[1] I have received 1 B in 0.000003 sec
[1] I have received 2 B in 0.000001 sec
[1] I have received 4 B in 0.000000 sec
[1] I have received 8 B in 0.000001 sec
[1] I have received 16 B in 0.000001 sec
[1] I have received 32 B in 0.000001 sec
[1] I have received 64 B in 0.000001 sec
[1] I have received 128 B in 0.000010 sec
[1] I have received 256 B in 0.000000 sec
[1] I have received 512 B in 0.000002 sec
[1] I have received 1024 B in 0.000001 sec
[1] I have received 2048 B in 0.000000 sec
[1] I have received 4096 B in 0.000016 sec
[1] I have received 8192 B in 0.000017 sec
[1] I have received 16384 B in 0.000014 sec
[1] I have received 32768 B in 0.000025 sec
[1] I have received 65536 B in 0.000050 sec
[1] I have received 131072 B in 0.000087 sec
[1] I have received 262144 B in 0.000154 sec
[1] I have received 524288 B in 0.000332 sec
[1] I have received 1048576 B in 0.000564 sec
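The timings printed by this benchmark can be turned into rough estimates of $\alpha$ and $\beta$ by fitting the model time $= \alpha + \beta \cdot$ size. The sketch below uses an ordinary least-squares fit; the function name is hypothetical and the choice of least squares is an assumption made for the illustration, not something the slides prescribe.

```c
#include <assert.h>

/* Least-squares fit of time = alpha + beta * size over count measurements.
   Writes the estimated latency to *alpha and the estimated time per byte to *beta. */
void fit_alpha_beta(const double *size, const double *time, int count,
                    double *alpha, double *beta) {
    assert(count >= 2);
    double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0;
    for (int i = 0; i < count; i++) {
        sx += size[i];
        sy += time[i];
        sxx += size[i] * size[i];
        sxy += size[i] * time[i];
    }
    double denom = count * sxx - sx * sx;  /* zero only if all sizes are equal */
    assert(denom != 0.0);
    *beta = (count * sxy - sx * sy) / denom;
    *alpha = (sy - *beta * sx) / count;
}
```

Feeding it the largest message sizes (for instance 262144, 524288 and 1048576 bytes with their measured times) gives an estimate dominated by bandwidth, while the smallest sizes are too noisy at this timer resolution to pin down the latency.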

num_processes = 2

  • srun : Synchronous (blocked) job

[blegat@lm4-f001 ~]$ srun --time=1 pwd
srun: job 3491072 queued and waiting for resources
srun: job 3491072 has been allocated resources
/home/users/b/l/blegat
  • $ sbatch submit.sh : Asynchronous job, get status with

  • $ squeue --me

  • More details on the README


Processor name identifies the node


Before

procid1234
$x_1$
$x_2$
$x_3$
$x_4$

After

procid1234
$x_1$
$x_2$
$x_3$
$x_4$

Blocking communication


Can MPI_Reduce_scatter be implemented by combining existing collectives?

MPI_Reduce_scatter can be implemented by MPI_Reduce followed by MPI_Scatter.


Bandwidth $\texttt{bw}(u, v)$ is the bandwidth of the cable if $(u, v) \in E$ or 0 otherwise. Given $S, T \subseteq V$,

$$\begin{align} \text{Width} &\qquad & w(S, T) & = |\{ (u, v) \in E \mid u \in S, v \in T \}|\\ \text{Bandwidth} & & \texttt{bw}(S, T) & = \sum_{u\in S, v\in T} \texttt{bw}(u,v) \end{align}$$
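As a small illustration of these two quantities, the sketch below counts the edges between two sets and sums their bandwidths. The Edge type, the function names and the 0/1 membership arrays are assumptions made for the example; edges are treated as undirected and $S$ and $T$ are assumed to be disjoint.

```c
#include <assert.h>

/* Undirected cable between nodes u and v with bandwidth bw (arbitrary units). */
typedef struct { int u, v; double bw; } Edge;

/* w(S, T): number of edges with one end point in S and the other in T.
   in_S[v] and in_T[v] are 1 when node v belongs to the set, 0 otherwise. */
int cut_width(const Edge *edges, int m, const int *in_S, const int *in_T) {
    int width = 0;
    for (int e = 0; e < m; e++) {
        int u = edges[e].u, v = edges[e].v;
        if ((in_S[u] && in_T[v]) || (in_S[v] && in_T[u])) width++;
    }
    return width;
}

/* bw(S, T): same edges, summing their bandwidths instead of counting them. */
double cut_bandwidth(const Edge *edges, int m, const int *in_S, const int *in_T) {
    double total = 0.0;
    for (int e = 0; e < m; e++) {
        int u = edges[e].u, v = edges[e].v;
        if ((in_S[u] && in_T[v]) || (in_S[v] && in_T[u])) total += edges[e].bw;
    }
    return total;
}
```

On a ring of 4 nodes split into two adjacent pairs, two cables cross the cut, so the width is 2 and the bandwidth is the sum of those two cables.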


Free up memory.

MPI_Finalize();

You could simply add -lmpi (together with the include and library paths) to your compiler flags, but using mpicc and mpic++ is easier.


Butterfly


What is the bisection width of an $n \times n$ 2D array?

It is $n = \sqrt{|V|}$:

What is the bisection width of an $n^d$ $d$D array?

It is 1 for $d = 1$, $n$ for $d = 2$ and $n^2$ for $d = 3$. In general, it is $n^{d-1} = |V|^{(d-1)/d}$.
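As a worked illustration of the general formula (the node count $|V| = 4096$ is an arbitrary choice, not taken from the slides), the same number of nodes arranged in arrays of increasing dimension gives:

$$\begin{align} d = 1,\ n = 4096 &: & n^{d-1} & = 1\\ d = 2,\ n = 64 &: & n^{d-1} & = 64\\ d = 3,\ n = 16 &: & n^{d-1} & = 256\\ d = 12,\ n = 2 &: & n^{d-1} & = 2048 = |V|/2 \end{align}$$

The last case, with $n = 2$, is the hypercube.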


What is the graph diameter?

It is $|V| - 1$, attained when $u$ and $v$ are the two end points of the array.


What is the graph diameter of an $n \times n$ 2D array?

It is $2(n-1)$, attained for opposite vertices of the square.

What is the graph diameter of an $n^d$ $d$D array?

It is $d(n-1)$, attained for opposite vertices of the hypercube.
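The 2D case can be checked by brute force. The sketch below runs a breadth-first search from every vertex of an $n \times n$ array and returns the longest shortest path; the function name and the MAX_SIDE limit are assumptions made for the illustration.

```c
#include <assert.h>

#define MAX_SIDE 16

/* Longest shortest path in an n x n array, found by a BFS from every vertex.
   Vertex (r, c) is stored at index r * n + c. */
int grid_diameter(int n) {
    assert(n >= 1 && n <= MAX_SIDE);
    int dist[MAX_SIDE * MAX_SIDE], queue[MAX_SIDE * MAX_SIDE];
    const int dr[4] = {1, -1, 0, 0}, dc[4] = {0, 0, 1, -1};
    int diameter = 0;
    for (int src = 0; src < n * n; src++) {
        for (int v = 0; v < n * n; v++) dist[v] = -1;   /* -1 marks unvisited */
        int head = 0, tail = 0;
        dist[src] = 0;
        queue[tail++] = src;
        while (head < tail) {
            int v = queue[head++];
            if (dist[v] > diameter) diameter = dist[v];
            for (int k = 0; k < 4; k++) {
                int r = v / n + dr[k], c = v % n + dc[k];
                if (r < 0 || r >= n || c < 0 || c >= n) continue;
                if (dist[r * n + c] == -1) {
                    dist[r * n + c] = dist[v] + 1;
                    queue[tail++] = r * n + c;
                }
            }
        }
    }
    return diameter;
}
```

For every side length tried, the result agrees with $2(n-1)$, the distance between two opposite corners.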

$ mpicc -show
gcc -I/usr/lib/x86_64-linux-gnu/openmpi/include -I/usr/lib/x86_64-linux-gnu/openmpi/include/openmpi -L/usr/lib/x86_64-linux-gnu/openmpi/lib -lmpi

Example


num_processes = 2


What is the lower bound on the complexity with $p$ processes if each $x_i$ has length $n/p$ bytes?

Lower bound: $\log_2(p) \alpha$ using a spanning tree algorithm, and $\beta n$ as all messages need to be sent at least once. The spanning tree is advantageous if $\alpha$ is larger than $\beta$; otherwise, sending directly to 1 is preferable. In practice, you want a mix of both.

First send $x_2$ from 2 to 1 and simultaneously send $x_4$ from 4 to 3. The complexity is $\alpha + \beta n/4$.

| procid | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| | $x_1$ | | | |
| | $x_2$ | $x_2$ | | |
| | | | $x_3$ | |
| | | | $x_4$ | $x_4$ |

Then send $(x_3, x_4)$ from 3 to 1. Complexity is $\alpha + 2\beta n/4$

| procid | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| | $x_1$ | | | |
| | $x_2$ | $x_2$ | | |
| | $x_3$ | | $x_3$ | |
| | $x_4$ | | $x_4$ | $x_4$ |

In total, it is $2\alpha + 3\beta n/4$. In general, we have

$$\log_2(p)\alpha + \beta n(1 + 2 + 4 + \cdots + p/2)/p = \log_2(p)\alpha + \beta n(p - 1)/p \approx \log_2(p)\alpha + \beta n$$
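As a worked instance of the general formula, take $p = 8$ (a value chosen only for illustration). The three rounds send 1, 2 and 4 blocks of $n/8$ bytes:

$$\begin{align} (\alpha + \beta n/8) + (\alpha + 2\beta n/8) + (\alpha + 4\beta n/8) & = 3\alpha + 7\beta n/8\\ & = \log_2(8)\alpha + \beta n(8 - 1)/8. \end{align}$$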


Can MPI_Allgather be implemented by combining existing collectives?

MPI_Allgather can be implemented by MPI_Gather followed by MPI_Bcast.


Blocking factor: ratio between the upper links and the lower links. The ratio is 1 for a fat-tree, which prevents bottlenecks if all nodes start communicating.


Eager vs rendezvous protocol


Blocking send/receive with MPI_Send and MPI_Recv.

The network cannot buffer the whole message (unless it is short). The sender needs to wait for the receiver to be ready and then transfers its copy of the data.


What is the bisection width?

The bisection width is 1:


What are the differences with Min-Cut?

In Min-Cut, we fix a node in $S$ and a node in $V \setminus S$, and the cardinality of $S$ is not constrained. These differences allow Min-Cut to be solvable in polynomial time.


Fat-trees need large switches; an alternative is the butterfly network:


How can we fix it?

We should load gompi or at least OpenMPI:

[blegat@lm4-f001 examples]$ module load OpenMPI
[blegat@lm4-f001 examples]$ mpicc procname.c
[blegat@lm4-f001 examples]$ mpiexec -n 4 a.out
Process 1/4 is running on node <<lm4-f001>>
Process 3/4 is running on node <<lm4-f001>>
Process 0/4 is running on node <<lm4-f001>>
Process 2/4 is running on node <<lm4-f001>>

Why are they all on the same node?

We are on the login node; we need to run jobs on the compute nodes using Slurm!


Consortium des Équipements de Calcul Intensif (CÉCI)


Let's try it


What is the lower bound on the complexity with $p$ processes if $x$ has length $n$ bytes?

Lower bound: $\log_2(p) (\alpha + \beta n)$ using a spanning tree algorithm:

After first communication (1 → 3):

| procid | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| | $x$ | | $x$ | |

After second communication (1 → 2 and 3 → 4 at the same time):

| procid | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| | $x$ | $x$ | $x$ | $x$ |
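The two communication steps above generalize to any power-of-two $p$. The sketch below computes who sends $x$ to whom and in which round; it is only an illustration, with processes numbered from 0 (so 1 → 3 becomes 0 → 2) and with a hypothetical function name and array layout.

```c
#include <assert.h>

#define MAX_P 64

/* Spanning-tree broadcast from process 0 on p processes (p a power of two).
   sender[i] is the process that delivers x to process i (-1 for the root) and
   round_of[i] is the round in which x arrives (0 for the root).
   Returns the number of rounds, i.e. log2(p). */
int broadcast_schedule(int p, int sender[], int round_of[]) {
    assert(p > 0 && p <= MAX_P && (p & (p - 1)) == 0);
    for (int i = 0; i < p; i++) { sender[i] = -1; round_of[i] = 0; }
    int round = 0;
    for (int stride = p / 2; stride >= 1; stride /= 2) {
        round++;
        /* The processes that already hold x are the multiples of 2 * stride. */
        for (int src = 0; src < p; src += 2 * stride) {
            sender[src + stride] = src;
            round_of[src + stride] = round;
        }
    }
    return round;
}
```

Each round sends the full $n$ bytes, which is where the $\log_2(p)(\alpha + \beta n)$ cost comes from.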


Crossbar


Distributed vector

  • Worst case pairwise communication of two groups $S$ and $V \setminus S$ of almost ($\pm 1$) equal size.

  • NP-hard to compute for general graphs
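For tiny graphs the bisection width can still be found by enumerating every balanced subset, which also shows why this does not scale. The sketch below is a brute-force illustration under stated assumptions: the function names, the edge-list layout and the MAX_VERTICES limit are made up, and $S$ is taken to have exactly $\lfloor |V|/2 \rfloor$ vertices.

```c
#include <assert.h>

#define MAX_VERTICES 20

/* Fill edges with the links of an n x n array and return their count. */
int grid_edges(int n, int edges[][2]) {
    int m = 0;
    for (int r = 0; r < n; r++)
        for (int c = 0; c < n; c++) {
            int v = r * n + c;
            if (c + 1 < n) { edges[m][0] = v; edges[m][1] = v + 1; m++; }
            if (r + 1 < n) { edges[m][0] = v; edges[m][1] = v + n; m++; }
        }
    return m;
}

/* Try every subset S with |S| = floor(nv / 2) and keep the smallest cut.
   The loop visits 2^nv subsets, so this only works for tiny graphs. */
int bisection_width(int nv, int edges[][2], int m) {
    assert(nv >= 2 && nv <= MAX_VERTICES);
    int best = m;
    for (unsigned long mask = 0; mask < (1UL << nv); mask++) {
        int size = 0;
        for (int v = 0; v < nv; v++) size += (int)((mask >> v) & 1UL);
        if (size != nv / 2) continue;   /* keep balanced subsets only */
        int cut = 0;
        for (int e = 0; e < m; e++)
            if (((mask >> edges[e][0]) & 1UL) != ((mask >> edges[e][1]) & 1UL)) cut++;
        if (cut < best) best = cut;
    }
    return best;
}
```

On a $4 \times 4$ array this already inspects $2^{16}$ subsets; each extra vertex doubles the work.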

  • Follow the README instructions to create an account and set up your computer

    • Don't wait until the last minute: if you get into trouble, it's easier to sort out this setup before you actually need it

  • Select cluster from the list + manneback for GPU. You only have access to Tier-2 clusters. This sadly leaves out:

    • Tier-1 clusters such as Lucia

    • Tier-0 cluster such as from

  • Connect with SSH using ssh lemaitre4 or ssh manneback.


Reduce scatter


Before

| procid | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| | $x_1$ | | | |
| | | $x_2$ | | |
| | | | $x_3$ | |
| | | | | $x_4$ |

After MPI_Allgather

procid1234
$x_1$$x_1$$x_1$$x_1$
$x_2$$x_2$$x_2$$x_2$
$x_3$$x_3$$x_3$$x_3$
$x_4$$x_4$$x_4$$x_4$
text/htmlclassnamestyle#display: flex; flex-direction: row;mime'application/vnd.pluto.divelement+objectrootassigneelast_run_timestampAeЊYpersist_js_state·has_pluto_hook_features§cell_id$a258eec9-f4f6-49bd-8470-8541836f5f6bdepends_on_disabled_cells§runtimeMZpublished_object_keysdepends_on_skipped_cells§errored$79b405a5-54b5-4727-a0cd-b79522ad109fqueued¤logsrunning¦outputbodyH

Point-to-point


MPI_Isend and MPI_Irecv where I stands for immediate or incomplete. MPI_Wait can be used to wait for the send and receive to finish.

[laptop]$ ssh lemaitre4
[blegat@lm4-f001 ~]$ cd LINMA2710/examples
[blegat@lm4-f001 examples]$ mpicc procname.c
-bash: mpicc: command not found
  • MPI is an open standard for distributed computing

  • Many implementations:

    • MPICH, from and

    • Open MPI (not to be confused with )

    • commercial implementations from , , , and


Before

| procid | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| | $x_1$ | $x_2$ | $x_3$ | $x_4$ |

After

| procid | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| | $x_1 + x_2 + x_3 + x_4$ | | | |

Message Passing Interface (MPI)

  • Specializing for the topology is important for communication libraries like MPI/NCCL. For instance, DeepSeek-V3 bypassed NCCL and used PTX directly to hardcode how their hardware should be used.

  • Specified in Slurm's topology.conf file.

  • Source : [Eij10; Section 2.7]


Multidimensional array and torus


Each process runs the same executable. So how can we make them do different things ?

Even if the code is the same, MPI_Comm_rank gives a different procid to each process, so any part of the program that depends on the value of procid can behave differently.


Hypercube


Nonblocking communication


What is a lower bound on the complexity with $p$ processes if each $x_i$ has length $n$ bytes and the arithmetic cost is $\gamma$ per byte?

Lower bound: $\log_2(p) (\alpha + \beta n) + \log_2(p) \gamma n$, using the spanning tree algorithm:

First communication (2 → 1 and 4 → 3 at the same time):

| procid | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| | $x_1 + x_2$ | | $x_3 + x_4$ | |

Then second communication (3 → 1)
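The two rounds above can be mimicked serially. The following is only a sketch under stated assumptions: `tree_reduce` is a hypothetical helper (not an MPI call), processes are numbered from 0 instead of 1, each "message" is a single `double`, and `p` is assumed to be a power of two. It illustrates why the number of rounds is $\log_2(p)$.

```c
#include <assert.h>

/* Serial simulation of a spanning-tree sum-reduce.
 * val[r] plays the role of the buffer held by process r (0-based here,
 * whereas the slides number processes from 1).
 * Returns the number of communication rounds; the total ends up in val[0].
 * Assumption: p is a power of two. */
int tree_reduce(double *val, int p)
{
    assert(p > 0 && (p & (p - 1)) == 0);
    int rounds = 0;
    for (int step = 1; step < p; step *= 2) {
        /* In one round, all pairs (r, r + step) communicate at the same time. */
        for (int r = 0; r + step < p; r += 2 * step)
            val[r] += val[r + step];   /* "receive" from r + step, then add */
        rounds++;
    }
    return rounds;
}
```

With `p = 4`, the first round adds entries 1 and 3 into entries 0 and 2 (the 2 → 1 and 4 → 3 step in the slide's numbering), and the second round adds entry 2 into entry 0 (the 3 → 1 step).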


Rings


Example


Collectives


What is the bisection width ?

The bisection width is 2:


Def: Graph diameter

Graph diameter is $d(G) := \max_{u, v \in V} d(G, u, v)$
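To make the definition concrete, here is a minimal sketch that computes $d(G)$ by running a breadth-first search from every vertex. The names `graph_diameter`, `make_chain` and `MAXV` are illustrative assumptions, and the graph is assumed to be undirected, connected and unweighted, so that $d(G, u, v)$ is the number of hops.

```c
#include <assert.h>

#define MAXV 64  /* illustrative upper bound on |V| */

/* Length of the longest shortest path in an undirected, connected graph
 * given as an adjacency matrix: d(G) = max over u, v of d(G, u, v). */
int graph_diameter(int n, unsigned char adj[MAXV][MAXV])
{
    assert(n > 0 && n <= MAXV);
    int diameter = 0;
    for (int src = 0; src < n; src++) {
        int dist[MAXV], queue[MAXV], head = 0, tail = 0;
        for (int v = 0; v < n; v++) dist[v] = -1;
        dist[src] = 0;
        queue[tail++] = src;
        while (head < tail) {              /* breadth-first search from src */
            int u = queue[head++];
            for (int v = 0; v < n; v++)
                if (adj[u][v] && dist[v] < 0) {
                    dist[v] = dist[u] + 1;
                    queue[tail++] = v;
                }
        }
        for (int v = 0; v < n; v++)
            if (dist[v] > diameter) diameter = dist[v];
    }
    return diameter;
}

/* Helper: fill adj with a ring (wrap = 1) or a linear array (wrap = 0). */
void make_chain(int n, int wrap, unsigned char adj[MAXV][MAXV])
{
    for (int u = 0; u < n; u++)
        for (int v = 0; v < n; v++) adj[u][v] = 0;
    for (int u = 0; u + 1 < n; u++) adj[u][u + 1] = adj[u + 1][u] = 1;
    if (wrap && n > 2) adj[0][n - 1] = adj[n - 1][0] = 1;
}
```

For 8 nodes, the ring has diameter 4 while the linear array has diameter 7, which shows how much the single wrap-around link shortens the worst-case path.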


MPI basics

Output of mpic++ -show:

g++ -I/usr/lib/x86_64-linux-gnu/openmpi/include -I/usr/lib/x86_64-linux-gnu/openmpi/include/openmpi -L/usr/lib/x86_64-linux-gnu/openmpi/lib -lmpi_cxx -lmpi

Processes that are on the same node share the same processor_name (the hostname).


What is the graph diameter ?

$|V|/2$


What are the number of switches, the number of edges, the graph diameter and the bisection width for $n$ compute nodes?

  • There are $n^2$ switches, one per intersection. This makes this architecture only suitable for small $n$.

  • The number of edges is $|E| = 2n^2$, which consists of $n$ connections from an input to a switch, $n$ connections from a switch to an output and $2n(n-1)$ connections between switches.

  • The diameter is 2 if we don't count the in-between switches, or $2n$ if we count them.

  • The bisection width is $n/2$.
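As a quick check of the edge count, the three contributions listed above add up as follows (the value $n = 4$ is only an illustrative choice):

$$|E| = n + n + 2n(n-1) = 2n + 2n^2 - 2n = 2n^2, \qquad \text{e.g. } n = 4: \; 4 + 4 + 24 = 32 = 2 \cdot 4^2.$$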


Broadcast


Can MPI_Allreduce be implemented by combining existing collectives ?

Let the size of each $x_i$ be $n$ bytes. MPI_Allreduce can be implemented either as MPI_Reduce followed by MPI_Bcast, or as MPI_Reduce_scatter followed by MPI_Allgather. The first choice leads to a complexity of $\log_2(p)(\alpha + \beta n + \gamma n)$. The second leads to a complexity of $\log_2(p)\alpha + \beta n + \gamma n$. The second approach is faster for large $p$ since the factor $\log_2(p)$ no longer multiplies $\beta$ and $\gamma$.
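The gap between the two formulas can be evaluated numerically. This is a rough sketch: the function names are hypothetical, `p` is assumed to be a power of two, and any values plugged in for $\alpha$, $\beta$ and $\gamma$ are illustrative assumptions rather than measurements.

```c
#include <assert.h>

/* Integer log2 for a power-of-two process count p. */
static int log2_int(int p)
{
    assert(p > 0 && (p & (p - 1)) == 0);
    int k = 0;
    while ((1 << k) < p) k++;
    return k;
}

/* MPI_Reduce followed by MPI_Bcast: log2(p) (alpha + beta n + gamma n). */
double cost_reduce_bcast(int p, double n, double alpha, double beta, double gamma)
{
    return log2_int(p) * (alpha + beta * n + gamma * n);
}

/* MPI_Reduce_scatter followed by MPI_Allgather: log2(p) alpha + beta n + gamma n. */
double cost_reduce_scatter_allgather(int p, double n, double alpha, double beta, double gamma)
{
    return log2_int(p) * alpha + beta * n + gamma * n;
}
```

For instance, with $p = 1024$, $n = 10^6$ bytes, $\alpha = 10^{-6}$, $\beta = 10^{-9}$ and $\gamma = 10^{-10}$ (made-up values), the first model gives roughly $1.1 \cdot 10^{-2}$ s and the second roughly $1.1 \cdot 10^{-3}$ s, a ratio close to $\log_2(p) = 10$.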


Gather


Linear array


Tree

[local computer]$ ssh lemaitre4

[blegat@lm4-f001 ~]$ module list

Currently Loaded Modules:
  1) tis/2018.01 (S)   2) releases/2023a (S)   3) StdEnv

  Where:
   S:  Module is Sticky, requires --force to unload or purge
[blegat@lm4-f001 ~]$ mpicc
-bash: mpicc: command not found

[blegat@lm4-f001 ~]$ module load gompi/2023a

[blegat@lm4-f001 ~]$ mpicc
gcc: fatal error: no input files
compilation terminated.

[blegat@lm4-f001 ~]$ module list

Currently Loaded Modules:
  1) tis/2018.01                   (S)  11) libpciaccess/0.17-GCCcore-12.3.0
  2) releases/2023a                (S)  12) hwloc/2.9.1-GCCcore-12.3.0
  3) StdEnv                             13) OpenSSL/1.1
  4) GCCcore/12.3.0                     14) libevent/2.1.12-GCCcore-12.3.0
  5) zlib/1.2.13-GCCcore-12.3.0         15) UCX/1.14.1-GCCcore-12.3.0
  6) binutils/2.40-GCCcore-12.3.0       16) libfabric/1.18.0-GCCcore-12.3.0
  7) GCC/12.3.0                         17) PMIx/4.2.4-GCCcore-12.3.0
  8) numactl/2.0.16-GCCcore-12.3.0      18) UCC/1.2.0-GCCcore-12.3.0
  9) XZ/5.4.2-GCCcore-12.3.0            19) OpenMPI/4.1.5-GCC-12.3.0
 10) libxml2/2.11.4-GCCcore-12.3.0      20) gompi/2023a

  Where:
   S:  Module is Sticky, requires --force to unload or purge
  • Each node input is a row and each node output is a column; source of figure below.

  • Each intersection is a switch. The cases (a) and (c) represent conflicting cases where two inputs want to simultaneously communicate with the same output.


Initializes MPI and removes mpiexec, etc. from argc and argv.

MPI_Init(&argc, &argv);
  int tag = 0;
  // Double the message size from 1 B up to 1 MiB.
  for (int size = 1; size <= (1 << 20); size <<= 1) {
    char *buf = malloc(size);
    if (procid == 0) {
      MPI_Send(buf, size, MPI_CHAR, procid + 1, tag++, comm);
    } else if (procid == 1) {
      // Time only the matching receive on process 1.
      double tic = MPI_Wtime();
      MPI_Recv(buf, size, MPI_CHAR, procid - 1, tag++, comm, MPI_STATUS_IGNORE);
      double toc = MPI_Wtime();
      printf("[%d] I have received %d B in %f sec\n", procid, size, (toc - tic));
    }
    free(buf); // release the buffer before the next message size
  }

Output:

[1] I have received 1 B in 0.000058 sec
[1] I have received 2 B in 0.000001 sec
[1] I have received 4 B in 0.000000 sec
[1] I have received 8 B in 0.000000 sec
[1] I have received 16 B in 0.000000 sec
[1] I have received 32 B in 0.000000 sec
[1] I have received 64 B in 0.000000 sec
[1] I have received 128 B in 0.000001 sec
[1] I have received 256 B in 0.000000 sec
[1] I have received 512 B in 0.000001 sec
[1] I have received 1024 B in 0.000001 sec
[1] I have received 2048 B in 0.000001 sec
[1] I have received 4096 B in 0.000017 sec
[1] I have received 8192 B in 0.000009 sec
[1] I have received 16384 B in 0.000013 sec
[1] I have received 32768 B in 0.000035 sec
[1] I have received 65536 B in 0.000047 sec
[1] I have received 131072 B in 0.000085 sec
[1] I have received 262144 B in 0.000153 sec
[1] I have received 524288 B in 0.000311 sec
[1] I have received 1048576 B in 0.000551 sec

Get the id (rank) of the calling process. procid is different for different processes.

int procid;
MPI_Comm_rank(MPI_COMM_WORLD, &procid);

Different processes may be on the same node


What is the number of edges ? What is the bisection width ?

Same as fat-tree.


LINMA2710 - Scientific Computing Distributed Computing with MPI

P.-A. Absil and B. Legat


How to collect the partial sums ?

MPI_Reduce


Get the number of processes. nprocs is the same on all processes.

int nprocs;
MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

Compiling


Slurm

  • NVIDIA Nsight Systems can profile CUDA code but also MPI

  • Available on manneback after loading the CUDA module:

[laptop]$ ssh manneback
[blegat@mbackf1 ~]$ nsys
-bash: nsys: command not found
[blegat@mbackf1 ~]$ ml CUDA
[blegat@mbackf1 ~]$ nsys

Allreduce

  // Each process sums its own chunk of vec.
  for (int i = stride * procid; i < last; i++)
    local_sum += vec[i];
  float total = 0;
  // Combine the partial sums; only the root (process 0) receives the total.
  MPI_Reduce(&local_sum, &total, 1, MPI_FLOAT, MPI_SUM, 0, comm);
  if (verbose >= 1)
    fprintf(stderr, "proc id : %d / %d : [local = %f] : [total = %f]\n", procid, nprocs, local_sum, total);

Compiling : `mpicc -O3 -I/home/runner/.julia/artifacts/207eb5b8330e24674fe59b50d72f4b3d946219c8/include /tmp/jl_Vy945P/main.c -o /tmp/jl_Vy945P/bin`
Running : `mpiexec -n 2 /tmp/jl_Vy945P/bin`
proc id : 1 / 2 4:8
proc id : 1 / 2 : [local = 35.000000] : [total = 0.000000]
proc id : 0 / 2 0:3
proc id : 0 / 2 : [local = 10.000000] : [total = 45.000000]
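The snippet above does not show how `stride` and `last` are obtained. The following serial sketch is one plausible reconstruction, not the notebook's actual code: it assumes `stride = n / nprocs` and that the last process also takes the remainder, which is consistent with the index ranges 0:3 and 4:8 reported in this cell's log for 2 processes. `chunk_bounds` and `local_sum` are hypothetical helper names.

```c
#include <assert.h>

/* Half-open index range [first, last) owned by process procid when n
 * elements are split over nprocs processes. Assumption: stride = n / nprocs
 * and the last process also takes the remainder. */
void chunk_bounds(int n, int nprocs, int procid, int *first, int *last)
{
    assert(nprocs > 0 && procid >= 0 && procid < nprocs);
    int stride = n / nprocs;
    *first = stride * procid;
    *last = (procid == nprocs - 1) ? n : *first + stride;
}

/* What each process computes locally before the reduction. */
float local_sum(const float *vec, int first, int last)
{
    float sum = 0.0f;
    for (int i = first; i < last; i++)
        sum += vec[i];
    return sum;
}
```

If `vec` holds the values 1 to 9 (an assumption, chosen because it matches the logged sums), process 0 gets 10, process 1 gets 35, and their sum is the 45 that MPI_Reduce delivers to the root.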

Graph diameter


The bisection width is:

$$\min_{S \subset V : \lfloor |V|/2 \rfloor \le |S| \le \lceil |V|/2 \rceil} \quad w(S, V \setminus S)$$
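For very small graphs, the minimum in this definition can be evaluated by exhaustive enumeration. The sketch below assumes that $w(S, V \setminus S)$ counts the edges with exactly one endpoint in $S$ (unit weights); `bisection_width` is a hypothetical helper name. Its running time is exponential in $|V|$, so it is only meant as an illustration.

```c
#include <assert.h>
#include <limits.h>

/* Brute-force bisection width of an undirected graph with n vertices and
 * m edges (edges[k] = {u, v}). Exponential in n, so only for tiny graphs. */
int bisection_width(int n, int m, const int edges[][2])
{
    assert(n >= 2 && n <= 20);
    int best = INT_MAX;
    for (unsigned long mask = 0; mask < (1UL << n); mask++) {
        int size = 0;
        for (int v = 0; v < n; v++) size += (int)((mask >> v) & 1UL);
        if (size != n / 2) continue;   /* |S| = floor(|V|/2); V \ S has the rest */
        int cut = 0;                   /* edges with exactly one end in S */
        for (int k = 0; k < m; k++) {
            unsigned long in_u = (mask >> edges[k][0]) & 1UL;
            unsigned long in_v = (mask >> edges[k][1]) & 1UL;
            if (in_u != in_v) cut++;
        }
        if (cut < best) best = cut;
    }
    return best;
}
```

Only subsets of size $\lfloor |V|/2 \rfloor$ are enumerated, since a subset of size $\lceil |V|/2 \rceil$ is the complement of one of them and gives the same cut. On a ring of 6 nodes this returns 2, and on a linear array of 6 nodes it returns 1.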


The bisection bandwidth is:

$$\min_{S \subset V : \lfloor |V|/2 \rfloor \le |S| \le \lceil |V|/2 \rceil} \quad \texttt{bw}(S, V \setminus S)$$


There are two protocols:

  • Rendezvous protocol

    1. the sender sends a header;

    2. the receiver returns a ‘ready-to-send’ message;

    3. the sender sends the actual data.

  • Eager protocol: the message is buffered, so MPI_Send can return eagerly, even before the receiver is ready.

The eager protocol is used when the data size is smaller than the eager limit, which is implementation-dependent. To force the rendezvous protocol, use MPI_Ssend, the synchronous send.
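The choice between the two protocols can be pictured with a toy model. This is not MPI code: `EAGER_LIMIT`, `send_trace` and the `synchronous` flag are hypothetical names, and the limit of 4096 bytes is an arbitrary value chosen for the illustration.

```python
EAGER_LIMIT = 4096  # hypothetical value in bytes; real limits depend on the MPI implementation

def send_trace(nbytes, eager_limit=EAGER_LIMIT, synchronous=False):
    """Return the messages exchanged for one send in this toy model.

    synchronous=True stands in for MPI_Ssend, which forces the rendezvous path.
    """
    if not synchronous and nbytes < eager_limit:
        # Eager: the data is sent at once and buffered until the receiver is ready.
        return [("sender -> receiver", "data (buffered)")]
    # Rendezvous: three-step handshake before the data moves.
    return [
        ("sender -> receiver", "header"),
        ("receiver -> sender", "ready-to-send"),
        ("sender -> receiver", "data"),
    ]

for size in (100, 10_000):
    print(size, [msg for _, msg in send_trace(size)])
print("ssend", [msg for _, msg in send_trace(100, synchronous=True)])
```

In the model, a small message costs one transfer but needs buffer space, while a large or synchronous message pays for the extra handshake.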


Before

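| procid | 1 | 2 | 3 | 4 |
|--------|---|---|---|---|
|        | $x_{1,1}$ | $x_{1,2}$ | $x_{1,3}$ | $x_{1,4}$ |
|        | $x_{2,1}$ | $x_{2,2}$ | $x_{2,3}$ | $x_{2,4}$ |
|        | $x_{3,1}$ | $x_{3,2}$ | $x_{3,3}$ | $x_{3,4}$ |
|        | $x_{4,1}$ | $x_{4,2}$ | $x_{4,3}$ | $x_{4,4}$ |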

After MPI_Reduce_scatter

| procid | 1 | 2 | 3 | 4 |
|--------|---|---|---|---|
|        | $x_{1,1} + \cdots + x_{1,4}$ | | | |
|        | | $x_{2,1} + \cdots + x_{2,4}$ | | |
|        | | | $x_{3,1} + \cdots + x_{3,4}$ | |
|        | | | | $x_{4,1} + \cdots + x_{4,4}$ |
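The effect of MPI_Reduce_scatter can be mimicked in plain Python. This sketch only reproduces the result, not the communication: the function name `reduce_scatter`, the 0-based indexing and the use of one number per block are assumptions made for the illustration.

```python
def reduce_scatter(x):
    """Simulate the result of a sum reduce-scatter.

    x[i][j] is block i held by process j before the call (0-based).
    Entry i of the result is the block owned by process i afterwards,
    i.e. the sum of x[i][j] over all processes j.
    """
    p = len(x)
    assert all(len(row) == p for row in x), "expected one block per process in every row"
    return [sum(x[i][j] for j in range(p)) for i in range(p)]

# Block i on process j holds 10*(i+1) + (j+1), so x_{1,1} = 11, x_{1,2} = 12, ...
x = [[10 * (i + 1) + (j + 1) for j in range(4)] for i in range(4)]
print(reduce_scatter(x))  # [50, 90, 130, 170]
```

Each process ends up with one reduced block instead of the whole reduced vector.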

Before

| procid | 1 | 2 | 3 | 4 |
|--------|---|---|---|---|
|        | $x_1$ | $x_2$ | $x_3$ | $x_4$ |

After MPI_Allreduce

| procid | 1 | 2 | 3 | 4 |
|--------|---|---|---|---|
|        | $x_1 + \cdots + x_4$ | $x_1 + \cdots + x_4$ | $x_1 + \cdots + x_4$ | $x_1 + \cdots + x_4$ |
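A plain-Python simulation shows how every process can end up with the full sum. The pairwise-exchange schedule used here (recursive doubling) is an assumption chosen for illustration, not necessarily what an MPI library does. The function name `allreduce` is hypothetical, and the sketch assumes the number of processes is a power of two.

```python
def allreduce(values):
    """Simulate a sum allreduce: values[r] is the contribution of process r.

    Sketch using recursive doubling, assuming len(values) is a power of two.
    """
    p = len(values)
    assert p > 0 and p & (p - 1) == 0, "sketch assumes a power-of-two number of processes"
    current = list(values)
    step = 1
    while step < p:
        # Process r exchanges its partial sum with partner r XOR step and adds it.
        current = [current[r] + current[r ^ step] for r in range(p)]
        step *= 2
    return current

print(allreduce([1, 2, 3, 4]))  # [10, 10, 10, 10]
```

With four processes the loop runs twice: after the first exchange the partial sums are `[3, 3, 7, 7]`, and after the second every process holds 10.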

Special case of multidimensional array


How can the nodes be ordered so that consecutive nodes in the order are adjacent in the graph?

Map each node to a binary number and order the nodes with a Gray code, in which consecutive codewords differ in exactly one bit.
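A minimal sketch of this ordering, assuming the graph is a $d$-dimensional hypercube whose nodes are labeled with $d$-bit numbers and are adjacent when their labels differ in one bit. It uses the standard binary-reflected Gray code; the function names `gray` and `gray_order` are chosen for the illustration.

```python
def gray(i):
    """Return the i-th codeword of the binary-reflected Gray code."""
    return i ^ (i >> 1)

def gray_order(d):
    """Order the 2**d node labels so that neighbours in the list differ in one bit."""
    return [gray(i) for i in range(2 ** d)]

order = gray_order(3)
print([format(node, "03b") for node in order])
# ['000', '001', '011', '010', '110', '111', '101', '100']

# Check: consecutive labels differ in exactly one bit, including the wrap-around.
pairs = zip(order, order[1:] + order[:1])
print(all(bin(a ^ b).count("1") == 1 for a, b in pairs))  # True
```

Because the last codeword also differs from the first in a single bit, the ordering closes into a ring embedded in the hypercube.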

f85$c55dcd4a-8438-4679-9c4a-78cceec6835d$e44b0038-d68f-4a49-9da2-67fbcbe098c3$7d37fbea-baa3-43ec-b003-a4707017a4cf$fc705b81-7310-44cc-ad9f-dc2cf8a9b645$86394e1c-0ff4-449a-8940-4b5906d8b6f0$d7117a24-aba6-4479-a40e-5005310a6b38$2257220c-6f0e-4edf-9fea-7e388b84df9b$2e4dc3f9-a132-444f-a35d-f583823a7dfd$8f46daf1-9ca2-4a08-99aa-4ed68af218b8$2c84bd84-b54d-4594-b9f8-35db2124d7e8$4309dc43-aeb8-4ec7-94fe-0e320b784349$a0566fdb-a08d-4bcf-9b2f-ed211c9f111f$e796b093-9c1d-4656-9acb-918de53f7e4d$d04b9af5-f004-4ca4-b1c9-2c86d46cb37d$97d3cf3f-ddac-4850-8b05-bdc0c4741f61$61af27f1-9f83-42f1-a419-06d12ea62133$143dca7c-f9a4-472a-a4bc-4578e4e8413b$954f1ab1-1e2f-458b-96d7-a1746631fac7$1bac238f-79c8-4f9f-a187-bacb288de3b0$21d507f6-02f8-4f8b-84f1-bcb84731df66$b53ec488-ff25-4647-ab00-fbf90963a795$de72d596-0daf-4629-bbb5-20bb8a67cbed$488b0c17-4f0f-43bf-a16c-b9faa7ae0595$10a1b3a7-21c7-4f97-93e1-006ad3aea40d$f7f097cb-d7bd-49eb-a030-ac26f8f61a67$6041a909-d26c-4ab1-836b-29953c578759$16f8d28b-f201-4fe5-8446-68d7d9ddfb3c$a59db59c-d34e-4abd-8865-9907607e06a8$f2417047-33fc-4489-8e89-115bc6b46c13$7565e3da-84ce-42b6-8d4b-3615576f33b7$c04bcc96-e5fe-4d6e-a12e-40dcde58c62e$4e32f7fb-cd5a-4190-9c92-ba4029313475$b94cd399-0370-49e9-a522-056f3af22955$370f0f20-e373-4028-bca1-83e93678cbcb$0e640e07-82c7-4dab-a8f1-2f634bbebdea$8a527c17-bf2b-4e6b-937f-ef3a269c5112$39f48c25-6efb-4ff2-aedc-9d3e722dad24$55e96151-2aa1-4ea0-b672-2038c57d911e$be0e3ba0-18cc-4b9a-a56d-2566f5148fae$d8bb1d43-bf42-4a09-bdeb-5db406ef1ccd$091dd042-580b-4fda-8086-e048663aed6c$7fc70992-973a-43c6-904a-dd1b622a5ed8$23bfbe95-7ba2-41b9-bd8b-dc4baa3ad53a$39b055f5-3dbf-403c-b21e-210e3813d8b0$b68eb860-a5b4-4e9e-9fbf-6eb6ce43ae69$f6f9447c-9bc9-432d-bd80-2c39f9d842f8$1551122c-70ae-4e37-b3fb-4be91fcc4afb$655e980d-b4e9-4f56-a5ae-380072242d27$e4d1de1d-d57a-48ab-ad7a-c09b427daa03$4aac6ab5-053a-4f60-9e2e-e8d61ff0cecb$3ec3c058-a94d-4717-b99f-66373f2fa31d$1152dec8-3810-42b1-bb2a-8755dcaef56c$133f4c7d-33e0-4e13-b716-f538125436ca$c45ff9b5-35d9-4a9d-a801-c762333a1f0
2$b0ca0392-71b8-4f44-8c6c-0978a02a0e6c$35aa1295-642f-4525-bf19-df2a42ff39d6$ce7bf747-7116-4e76-9004-f234317046c3$26aa369f-e5c7-4fe5-8b6b-903f4f4e91ba$e172f5c5-8b96-4efd-9cf3-805c58d1a6a3$49b596b8-891d-4f3f-a6a4-a62cc8237df3last_hot_reload_timeshortpath3_mpi.jlprocess_statusreadypath7/home/runner/work/LINMA2710/LINMA2710/Lectures/3_mpi.jlpluto_versionv0.20.24last_save_timeAe˾ߪcell_order$58e12afd-6eb0-4731-bd57-d9ae7ab4e164$bfab5c2d-61c3-468b-9ddf-4aaa49cb7785$063f0acc-c023-46d0-9ed9-fbd7fbdcfa3b$3e98c0ca-1b47-4631-83d7-cd0c8c0a431d$5a566137-fbd1-45b2-9a55-e4aded366bb3$a6c337c4-0c81-4463-ad4f-9a4528d953ab$c04bcc96-e5fe-4d6e-a12e-40dcde58c62e$cf799c26-1cea-4b38-9a15-8497813bd668$6d2b3dbc-0686-49f0-904a-56c3ce63b4dd$b5a3e471-af4a-466f-bbae-96306bcc7563$d722a86d-6d51-4d91-ac22-53af94c91497$c3590376-06ed-45a4-af0b-2d46f1a387c8$52d428d5-cb33-4f2a-89eb-3a8ce3f5bb81$273ad3a6-cb32-49bb-8702-fdaf8597e812$4e32f7fb-cd5a-4190-9c92-ba4029313475$82230d6c-25ce-4d12-8842-e0651fc4b143$7d9ac5f9-39bf-4052-ad8a-ac0fec15c64a$b0ca0392-71b8-4f44-8c6c-0978a02a0e6c$a103c5af-42fe-4f8c-b78c-6946895105d7$21b6133f-db59-4885-9b3d-331c3d6ef306$35ba1eea-56ae-4b74-af96-21ec5a93c455$8981b5e2-2497-478e-ab28-a14b62f6f916$5441e428-b320-433c-acde-15fe6bf58537$40606ee3-38cc-4123-9b86-b774bf89e499$b94cd399-0370-49e9-a522-056f3af22955$9b4cae31-c319-444e-98c8-2c0bfc6dfa0c$8b83570a-6982-47e5-a167-a6d6afee0f7d$5d72bf87-7f3a-4229-9d7a-2e63c115087d$7b1d26c6-9499-4e44-84c8-c272737a175e$fc43b343-79cd-4342-8d80-8ea72cf34942$233c13ff-f008-40b0-a6c5-c5395b2215ec$ad3559d1-6180-4eaa-b97d-3c1f10f036b9$c420ad25-6af1-4fb4-823a-b6bbd4e10f7f$db16e939-b490-497b-a03f-80ce2e8485af$4fdb4cd6-a794-4b14-84b0-72f484c6ea86$a258eec9-f4f6-49bd-8470-8541836f5f6b$6fc34de1-469b-41a9-9677-ff3182f7a498$de20bf96-7d33-4a78-8147-f0b7f8488e46$e119c2d3-1e24-464f-b812-62f28c00a913$dbc19cbb-1349-4904-b655-2452aa7e2452$2ff573a3-4a84-4497-9305-2d97e35e5e3d$6be49c46-4900-4457-81b4-0704cd7da0af$60bc118f-6795-43f9-97a2-865fd1704895$0d69e94b-492a-4acc-a
dba-a2126b871724$b9a9e335-1328-4c63-a213-ce21263bc201$a1b2d090-d498-4d5d-90a0-8cdc648dc833$a771f33f-7ed1-41aa-bee0-c215729a8c8d$370f0f20-e373-4028-bca1-83e93678cbcb$141d162c-c817-498f-be16-f1cd35d82487$7cf59087-efca-4f03-90dc-f2acefdcbc8a$35aa1295-642f-4525-bf19-df2a42ff39d6$4788d8b4-2efa-4489-80c3-71f405513644$e832ce25-94e2-4743-854d-02b52cc7b56d$79b405a5-54b5-4727-a0cd-b79522ad109f$d2104fbd-ba22-4501-b03a-8809271d598b$0e640e07-82c7-4dab-a8f1-2f634bbebdea$4569aa05-9963-4976-ac63-caf3f3979e83$34a10003-2c32-4332-b3e6-ce70eec0cbbe$ce7bf747-7116-4e76-9004-f234317046c3$d7e31ced-4eb2-4221-b83f-462e8f32fe89$c3c848ff-526a-450d-9b1c-5d9d3ccccf28$67dee339-98b4-4714-88b2-8098a13235f2$3a50ca06-06e8-4a61-ade2-afbfc52ca655$32f740e7-9338-4c42-8eaf-ce8022412c50$8a527c17-bf2b-4e6b-937f-ef3a269c5112$93f0c63c-b597-4f89-809c-7af0476f319a$568057f5-b0b8-4225-8e4b-5eec911a52ef$26aa369f-e5c7-4fe5-8b6b-903f4f4e91ba$a79c410a-bebf-434c-9730-568e0ff4f4c7$39f48c25-6efb-4ff2-aedc-9d3e722dad24$55e96151-2aa1-4ea0-b672-2038c57d911e$be0e3ba0-18cc-4b9a-a56d-2566f5148fae$c0daf219-cb87-4203-b835-49ab7eb955be$c1285653-38ba-418b-bdf5-cda99440998d$88f33f35-d922-4d98-af4a-ebb79d9b7dc6$e3474aea-ee14-4c78-ae46-5badc66a543a$6c1984f6-4e36-4637-b0da-c7dd8b0f9ff0$51d70f9a-cd67-44b9-8fd1-5ab70b526c7a$944d827e-bc6a-4de8-b959-5fde8790bedc$3a2bfd4e-0ce6-4a79-a578-fc1b4ef563c5$beee4908-d519-413a-964f-149bb82cdbb8$d8bb1d43-bf42-4a09-bdeb-5db406ef1ccd$b540d5e3-6686-479a-b2c7-c1f65b85b6ba$091dd042-580b-4fda-8086-e048663aed6c$9a100ccf-1ad3-4d2c-bbe0-e297969eb69e$921b5a18-0733-4032-a543-9d60e254b1b2$9612a1ef-fd3a-4a58-87b0-b2255ac86331$98392c40-6542-4a26-8552-c0960bbaa6a6$49b596b8-891d-4f3f-a6a4-a62cc8237df3$c253bb24-ad76-4b58-8dfc-7dc2576e3db5$1b617828-e2b2-4a94-a120-59fa533d3e11$f2ebc6fb-e07c-4922-897d-9bbe0f5fa1d0$8da580fe-6b56-4d8f-ad43-aed7b728a06e$fa024a5d-52a6-459d-894d-13a60ec723d2$360091c4-d3a0-462d-abcf-b9bbb9480871$e44b0038-d68f-4a49-9da2-67fbcbe098c3$3dc860be-016d-49ee-8535-7d9457c70f85$7fc70992-973a-43c6-904
a-dd1b622a5ed8$c55dcd4a-8438-4679-9c4a-78cceec6835d$7d37fbea-baa3-43ec-b003-a4707017a4cf$fc705b81-7310-44cc-ad9f-dc2cf8a9b645$86394e1c-0ff4-449a-8940-4b5906d8b6f0$23bfbe95-7ba2-41b9-bd8b-dc4baa3ad53a$d7117a24-aba6-4479-a40e-5005310a6b38$2257220c-6f0e-4edf-9fea-7e388b84df9b$39b055f5-3dbf-403c-b21e-210e3813d8b0$2e4dc3f9-a132-444f-a35d-f583823a7dfd$b68eb860-a5b4-4e9e-9fbf-6eb6ce43ae69$8f46daf1-9ca2-4a08-99aa-4ed68af218b8$2c84bd84-b54d-4594-b9f8-35db2124d7e8$4309dc43-aeb8-4ec7-94fe-0e320b784349$f6f9447c-9bc9-432d-bd80-2c39f9d842f8$1551122c-70ae-4e37-b3fb-4be91fcc4afb$a0566fdb-a08d-4bcf-9b2f-ed211c9f111f$e796b093-9c1d-4656-9acb-918de53f7e4d$d04b9af5-f004-4ca4-b1c9-2c86d46cb37d$655e980d-b4e9-4f56-a5ae-380072242d27$133f4c7d-33e0-4e13-b716-f538125436ca$97d3cf3f-ddac-4850-8b05-bdc0c4741f61$61af27f1-9f83-42f1-a419-06d12ea62133$143dca7c-f9a4-472a-a4bc-4578e4e8413b$1bac238f-79c8-4f9f-a187-bacb288de3b0$e4d1de1d-d57a-48ab-ad7a-c09b427daa03$954f1ab1-1e2f-458b-96d7-a1746631fac7$21d507f6-02f8-4f8b-84f1-bcb84731df66$4aac6ab5-053a-4f60-9e2e-e8d61ff0cecb$b53ec488-ff25-4647-ab00-fbf90963a795$de72d596-0daf-4629-bbb5-20bb8a67cbed$488b0c17-4f0f-43bf-a16c-b9faa7ae0595$10a1b3a7-21c7-4f97-93e1-006ad3aea40d$f7f097cb-d7bd-49eb-a030-ac26f8f61a67$3ec3c058-a94d-4717-b99f-66373f2fa31d$6041a909-d26c-4ab1-836b-29953c578759$16f8d28b-f201-4fe5-8446-68d7d9ddfb3c$a59db59c-d34e-4abd-8865-9907607e06a8$f2417047-33fc-4489-8e89-115bc6b46c13$8df4ff2f-d176-4b4e-a525-665b5d07ea52$1152dec8-3810-42b1-bb2a-8755dcaef56c$7565e3da-84ce-42b6-8d4b-3615576f33b7$c45ff9b5-35d9-4a9d-a801-c762333a1f02$e172f5c5-8b96-4efd-9cf3-805c58d1a6a3published_objectsnbpkginstall_time_nsy instantiatedòinstalled_versionsSimpleClang0.1.0StaticArrays1.9.17PlutoUI0.7.79HypertextLiteral1.0.0BenchmarkTools1.6.3Luxor4.4.1MarkdownstdlibPlutoTeachingTools0.4.7terminal_outputsStaticArrays Resolving... 
===  Project No packages added to or removed from `~/.julia/scratchspaces/c3e4b0f8-55cb-11ea-2926-15256bba5781/pkg_envs/env_jrucwapcwq/Project.toml`  Manifest No packages added to or removed from `~/.julia/scratchspaces/c3e4b0f8-55cb-11ea-2926-15256bba5781/pkg_envs/env_jrucwapcwq/Manifest.toml` Instantiating... === Precompiling... === Waiting for notebook process to start... Done. Starting precompilation...enabled÷restart_recommended_msgrestart_required_msgbusy_packageswaiting_for_permission,waiting_for_permission_but_probably_disabled«cell_inputs$c55dcd4a-8438-4679-9c4a-78cceec6835dcell_id$c55dcd4a-8438-4679-9c4a-78cceec6835dcoderfunction path(ring::Bool; s = 80, offset = 0.04) off(a, b) = a + sign(b - a) * offset p(i, j) = Point(i * s, j * s) c(m, i, j) = circle(p(i, j), 0.06s, action = :fill) a(i1, j1, i2, j2) = line(p(off(i1, i2), off(j1, j2)), p(off(i2, i1), off(j2, j1)), action = :stroke) function ac(i1, j1, i2, j2, m) a(i1, j1, i2, j2) c(m, i2, j2) end @draw begin c("1", -3, 0) ac(-3, 0, -2, 0, "2") ac(-2, 0, -1, 0, "3") ac(-1, 0, 0, 0, "4") ac(0, 0, 1, 0, "5") if ring move(p(off(1, 0), off(0, -1))) curve(p(off(1, 2), off(0, -1)), p(-1, -1), p(off(-3, -2), off(0, -1))) strokepath() end end 7.5s 1.7s end;metadatashow_logsèdisabled®skip_as_script«code_folded$6be49c46-4900-4457-81b4-0704cd7da0afcell_id$6be49c46-4900-4457-81b4-0704cd7da0afcodeFoldable( md""" Would it be more efficient to have a specialized implementation instead of combining existing collectives ? """, md""" Let the size of each ``x_{i,j}`` be ``n/p`` bytes. 1. `MPI_Reduce` acts on the concatenation ``x_{:,j}`` which has length ``n`` bytes hence the complexity is ``\log_2(p)(\alpha + \beta n + \gamma n)`` 2. `MPI_Scatter` has the same complexity as `MPI_Gather` (since it's the same but backwards in time) : ``\log_2(p) \alpha + \beta n`` In total, we have the complexity ``\log_2(p) (\alpha + \beta n + \gamma n)``. Can we do better ? 
Start by exchanging between 1 and 2 and simultaneously exchanging between 3 and 4. The complexity is ``\alpha + 2(\beta + \gamma) n/4``. | `procid` | 1 | 2 | 3 | 4 | |----------|---|---|---|---| | | ``x_{1,1} + x_{1,2}`` | | ``x_{1,3} + x_{1,4}`` | | | | | ``x_{2,1} + x_{2,2}`` | | ``x_{2,3} + x_{2,4}`` | | | ``x_{3,1} + x_{3,2}`` | | ``x_{3,3} + x_{3,4}`` | | | | | ``x_{4,1} + x_{4,2}`` | | ``x_{4,3} + x_{4,4}`` | Next, we exchange between 1 and 3 and simultaneously between 2 and 4. The complexity is ``\alpha + (\beta + \gamma) n/4``. In total, we have complexity ```math \begin{align} \log_2(p) \alpha + (\beta + \gamma) n(p/2 + \cdots + 4 + 2 + 1)/p & = \log_2(p) \alpha + (\beta + \gamma) n(p-1)/p\\ & \approx \log_2(p) \alpha + (\beta + \gamma) n. \end{align} ``` This is better than combining the existing collectives as above, since it removes the ``\log_2(p)`` factor in front of ``\beta`` and ``\gamma``. """, )metadatashow_logsèdisabled®skip_as_script«code_folded$c1285653-38ba-418b-bdf5-cda99440998dcell_id$c1285653-38ba-418b-bdf5-cda99440998dcodeaside(tip(Foldable(md"Use `module spider` to see which versions are available", md""" ``` [blegat@lm4-f001 ~]$ module spider gompi ---------------------------- gompi: ---------------------------- Description: GNU Compiler Collection (GCC) based compiler toolchain, including OpenMPI for MPI support. Versions: gompi/2021b gompi/2022b gompi/2023a gompi/2023b ---------------------------- For detailed information about a specific "gompi" package (including how to load the modules) use the module's full name. Note that names that have a trailing (E) are extensions provided by other modules. 
For example: $ module spider gompi/2023b ---------------------------- ``` """)), v_offset = -300)metadatashow_logsèdisabled®skip_as_script«code_folded$4fdb4cd6-a794-4b14-84b0-72f484c6ea86cell_id$4fdb4cd6-a794-4b14-84b0-72f484c6ea86codemd"## All gather"metadatashow_logsèdisabled®skip_as_script«code_folded$a59db59c-d34e-4abd-8865-9907607e06a8cell_id$a59db59c-d34e-4abd-8865-9907607e06a8code@aside(md"""From $(citeintro("Figure 2.27"))""", v_offset = -200)metadatashow_logsèdisabled®skip_as_script«code_folded$5a566137-fbd1-45b2-9a55-e4aded366bb3cell_id$5a566137-fbd1-45b2-9a55-e4aded366bb3code)md"# Single Program Multiple Data (SPMD)"metadatashow_logsèdisabled®skip_as_script«code_folded$ad3559d1-6180-4eaa-b97d-3c1f10f036b9cell_id$ad3559d1-6180-4eaa-b97d-3c1f10f036b9codemd"## Reduce"metadatashow_logsèdisabled®skip_as_script«code_folded$88f33f35-d922-4d98-af4a-ebb79d9b7dc6cell_id$88f33f35-d922-4d98-af4a-ebb79d9b7dc6codempicc_cmd = md""" ```sh [blegat@lm4-f001 ~]$ mpicc -bash: mpicc: command not found [blegat@lm4-f001 ~]$ module load gompi/2023a [blegat@lm4-f001 ~]$ mpicc gcc: fatal error: no input files compilation terminated. 
``` """;metadatashow_logsèdisabled®skip_as_script«code_folded$b540d5e3-6686-479a-b2c7-c1f65b85b6bacell_id$b540d5e3-6686-479a-b2c7-c1f65b85b6bacode+md"## Profiling with NVIDIA Nsight Systems"metadatashow_logsèdisabled®skip_as_script«code_folded$f6f9447c-9bc9-432d-bd80-2c39f9d842f8cell_id$f6f9447c-9bc9-432d-bd80-2c39f9d842f8code٣img("https://raw.githubusercontent.com/VictorEijkhout/TheArtOfHPC_vol1_scientificcomputing/refs/heads/main/booksources/graphics/hypercubes.jpg", :width => "400pt")metadatashow_logsèdisabled®skip_as_script«code_folded$9a100ccf-1ad3-4d2c-bbe0-e297969eb69ecell_id$9a100ccf-1ad3-4d2c-bbe0-e297969eb69ecodemd"# Topology"metadatashow_logsèdisabled®skip_as_script«code_folded$a1b2d090-d498-4d5d-90a0-8cdc648dc833cell_id$a1b2d090-d498-4d5d-90a0-8cdc648dc833codemd"# Distributed sum"metadatashow_logsèdisabled®skip_as_script«code_folded$8b83570a-6982-47e5-a167-a6d6afee0f7dcell_id$8b83570a-6982-47e5-a167-a6d6afee0f7dcodehbox([ md"""Before | `procid` | 1 | 2 | 3 | 4 | |----------|---|---|---|---| | | ``x`` | | | | """, Div(md"` `", style = Dict("margin" => "50pt")), md"""After | `procid` | 1 | 2 | 3 | 4 | |----------|---|---|---|---| | | ``x`` | ``x`` | ``x`` | ``x`` | """, ])metadatashow_logsèdisabled®skip_as_script«code_folded$e3474aea-ee14-4c78-ae46-5badc66a543acell_id$e3474aea-ee14-4c78-ae46-5badc66a543acodelist_1 = Foldable(md"`[blegat@lm4-f001 ~]$ module list`", md""" ``` Currently Loaded Modules: 1) tis/2018.01 (S) 2) releases/2023a (S) 3) StdEnv Where: S: Module is Sticky, requires --force to unload or purge ``` """);metadatashow_logsèdisabled®skip_as_script«code_folded$21d507f6-02f8-4f8b-84f1-bcb84731df66cell_id$21d507f6-02f8-4f8b-84f1-bcb84731df66codemd"## Fat-tree"metadatashow_logsèdisabled®skip_as_script«code_folded$de72d596-0daf-4629-bbb5-20bb8a67cbedcell_id$de72d596-0daf-4629-bbb5-20bb8a67cbedcodeٖFoldable(md"What is the number of edges ? What is the bisection width ?", md""" Number of edges is ``n\log_2(n)`` and bisection width is ``n/2``. 
""")metadatashow_logsèdisabled®skip_as_script«code_folded$c253bb24-ad76-4b58-8dfc-7dc2576e3db5cell_id$c253bb24-ad76-4b58-8dfc-7dc2576e3db5codemd"## Bisection bandwidth"metadatashow_logsèdisabled®skip_as_script«code_folded$e4d1de1d-d57a-48ab-ad7a-c09b427daa03cell_id$e4d1de1d-d57a-48ab-ad7a-c09b427daa03codeFoldable(md"What is the diameter and bisection width of ``n`` computer nodes ?", md""" Diameter is ``2\log_2(n)`` and bisection width is 1. $(img("https://upload.wikimedia.org/wikipedia/commons/d/da/Bisected_tree.jpg") ) """)metadatashow_logsèdisabled®skip_as_script«code_folded$133f4c7d-33e0-4e13-b716-f538125436cacell_id$133f4c7d-33e0-4e13-b716-f538125436cacodeCTwoColumnWideLeft( md""" There can be ``n`` simultaneous communication at the same time, provided that each input communicate with a different output. The figure on the right provides an example of such non-conflicting communication with the black dots indicating that the input of that row communicates to the corresponding output (case (a) of above figure). The switch at row 1 and column 2 is just propagating the input data horizontally and output data vertically (case (b) of above figure). The switch at row 0 and column 5 is receiving no data. 
""", img1("crossbar.jpg"), )metadatashow_logsèdisabled®skip_as_script«code_folded$3a50ca06-06e8-4a61-ade2-afbfc52ca655cell_id$3a50ca06-06e8-4a61-ade2-afbfc52ca655codeBaside(md"""See $(citepara("Section 4.1.4.2"))""", v_offset = -100)metadatashow_logsèdisabled®skip_as_script«code_folded$6c1984f6-4e36-4637-b0da-c7dd8b0f9ff0cell_id$6c1984f6-4e36-4637-b0da-c7dd8b0f9ff0code_list_2 = Foldable(md"`[blegat@lm4-f001 ~]$ module list`", md""" ``` Currently Loaded Modules: 1) tis/2018.01 (S) 11) libpciaccess/0.17-GCCcore-12.3.0 2) releases/2023a (S) 12) hwloc/2.9.1-GCCcore-12.3.0 3) StdEnv 13) OpenSSL/1.1 4) GCCcore/12.3.0 14) libevent/2.1.12-GCCcore-12.3.0 5) zlib/1.2.13-GCCcore-12.3.0 15) UCX/1.14.1-GCCcore-12.3.0 6) binutils/2.40-GCCcore-12.3.0 16) libfabric/1.18.0-GCCcore-12.3.0 7) GCC/12.3.0 17) PMIx/4.2.4-GCCcore-12.3.0 8) numactl/2.0.16-GCCcore-12.3.0 18) UCC/1.2.0-GCCcore-12.3.0 9) XZ/5.4.2-GCCcore-12.3.0 19) OpenMPI/4.1.5-GCC-12.3.0 10) libxml2/2.11.4-GCCcore-12.3.0 20) gompi/2023a Where: S: Module is Sticky, requires --force to unload or purge ``` """);metadatashow_logsèdisabled®skip_as_script«code_folded$de20bf96-7d33-4a78-8147-f0b7f8488e46cell_id$de20bf96-7d33-4a78-8147-f0b7f8488e46codeOFoldable( md""" Would it be more efficient to have a specialized implementation instead of combining existing collectives ? """, md""" Let the size of ``x_i`` be ``n/p`` bytes. 1. `MPI_Gather` has complexity ``\log_2(p)\alpha + \beta n`` 2. `MPI_Bcast` acts on the concatenation ``x_:`` which has length ``n`` bytes so the complexity is ``\log_2(p) (\alpha + \beta n)`` In total, we have the complexity ``\log_2(p) (\alpha + \beta n)``. Can we do better ? Start exchanging between 1 and 2 and simultaneously exchanging between 3 and 4. The complexity is ``\alpha + \beta n/4``. 
| `procid` | 1 | 2 | 3 | 4 | |----------|---|---|---|---| | | ``x_1`` | ``x_1`` | | | | | ``x_2`` | ``x_2`` | | | | | | | ``x_3`` | ``x_3`` | | | | | ``x_4`` | ``x_4`` | Next, we exchange between 1 and 3 and simultaneously between 2 and 4. The complexity is ``\alpha + 2\beta n/4``. In total, we have complexity ```math \begin{align} \log_2(p) \alpha + \beta n(1 + 2 + 4 + \cdots + p/2)/p & = \log_2(p) \alpha + \beta n(p-1)/p\\ & \approx \log_2(p) \alpha + \beta n. \end{align} ``` """, )metadatashow_logsèdisabled®skip_as_script«code_folded$51d70f9a-cd67-44b9-8fd1-5ab70b526c7acell_id$51d70f9a-cd67-44b9-8fd1-5ab70b526c7acodemd"## Launching a job"metadatashow_logsèdisabled®skip_as_script«code_folded$98392c40-6542-4a26-8552-c0960bbaa6a6cell_id$98392c40-6542-4a26-8552-c0960bbaa6a6codemd""" * Consider graph ``G`` with nodes ``v`` corresponding to computer nodes or switches. * There is an edge ``(u, v) \in E`` if there is an ethernet cable **directly** connecting ``u`` and ``v``. * ``e \in E`` are ethernet cables of bandwidth ``w_e`` * Distance (unweighted) from node ``u \in V`` to node ``v \in V`` is ``d(G, u, v)`` - Does not depend on bandwidth ``w_e`` of edges of the path """metadatashow_logsèdisabled®skip_as_script«code_folded$39b055f5-3dbf-403c-b21e-210e3813d8b0cell_id$39b055f5-3dbf-403c-b21e-210e3813d8b0codeٌimg("https://raw.githubusercontent.com/VictorEijkhout/TheArtOfHPC_vol1_scientificcomputing/refs/heads/main/booksources/graphics/torus.jpeg")metadatashow_logsèdisabled®skip_as_script«code_folded$b0ca0392-71b8-4f44-8c6c-0978a02a0e6ccell_id$b0ca0392-71b8-4f44-8c6c-0978a02a0e6ccodeلcompile_and_run(Example("MPI/procname.c"); mpi = true, verbose = 1, show_run_command = true, num_processes = procname_num_processes)metadatashow_logsèdisabled®skip_as_script«code_folded$26aa369f-e5c7-4fe5-8b6b-903f4f4e91bacell_id$26aa369f-e5c7-4fe5-8b6b-903f4f4e91bacodeKcompile_and_run(Example("MPI/mpi_bench2.c"), mpi = true, num_processes = 
2)metadatashow_logsèdisabled®skip_as_script«code_folded$a103c5af-42fe-4f8c-b78c-6946895105d7cell_id$a103c5af-42fe-4f8c-b78c-6946895105d7codeamd"`num_processes` = $(@bind procname_num_processes Slider(2:8, default = 2, show_value = true))"metadatashow_logsèdisabled®skip_as_script«code_folded$d8bb1d43-bf42-4a09-bdeb-5db406ef1ccdcell_id$d8bb1d43-bf42-4a09-bdeb-5db406ef1ccdcodehbox([Div(md""" * `srun` : Synchronous (blocked) job ``` [blegat@lm4-f001 ~]$ srun --time=1 pwd srun: job 3491072 queued and waiting for resources srun: job 3491072 has been allocated resources /home/users/b/l/blegat ``` * `$ sbatch submit.sh` : Asynchronous job, get status with * `$ squeue --me` * More details on the [README](https://github.com/blegat/LINMA2710) """, style = Dict("flex-grow" => "1", "margin-right" => "30px")), md""" $(img("https://upload.wikimedia.org/wikipedia/commons/3/3a/Slurm_logo.svg", :width => "160px", :height => "160px")) See [CÉCI documentation](https://support.ceci-hpc.be/doc/_contents/QuickStart/SubmittingJobs/SlurmTutorial.html) """, ])metadatashow_logsèdisabled®skip_as_script«code_folded$82230d6c-25ce-4d12-8842-e0651fc4b143cell_id$82230d6c-25ce-4d12-8842-e0651fc4b143code)md"## Processor name identifies the node"metadatashow_logsèdisabled®skip_as_script«code_folded$fc43b343-79cd-4342-8d80-8ea72cf34942cell_id$fc43b343-79cd-4342-8d80-8ea72cf34942codehbox([ md"""Before | `procid` | 1 | 2 | 3 | 4 | |----------|---|---|---|---| | | ``x_1`` | | | | | | | ``x_2`` | | | | | | | ``x_3`` | | | | | | | ``x_4`` | """, Div(md"` `", style = Dict("margin" => "50pt")), md"""After | `procid` | 1 | 2 | 3 | 4 | |----------|---|---|---|---| | | ``x_1`` | | | | | | ``x_2`` | | | | | | ``x_3`` | | | | | | ``x_4`` | | | | """, ])metadatashow_logsèdisabled®skip_as_script«code_folded$d2104fbd-ba22-4501-b03a-8809271d598bcell_id$d2104fbd-ba22-4501-b03a-8809271d598bcodemd"## Blocking 
communication"metadatashow_logsèdisabled®skip_as_script«code_folded$8a527c17-bf2b-4e6b-937f-ef3a269c5112cell_id$8a527c17-bf2b-4e6b-937f-ef3a269c5112code٫img("https://raw.githubusercontent.com/VictorEijkhout/TheArtOfHPC_vol2_parallelprogramming/refs/heads/main/booksources/graphics/send-nonblocking.jpeg", :height => "200pt")metadatashow_logsèdisabled®skip_as_script«code_folded$be0e3ba0-18cc-4b9a-a56d-2566f5148faecell_id$be0e3ba0-18cc-4b9a-a56d-2566f5148faecodekmd"""## $(img("https://github.com/TACC/Lmod/raw/main/logos/2x/Lmod-4color%402x.png", :height => "30px"))"""metadatashow_logsèdisabled®skip_as_script«code_folded$2ff573a3-4a84-4497-9305-2d97e35e5e3dcell_id$2ff573a3-4a84-4497-9305-2d97e35e5e3dcodeٰFoldable(md"Can `MPI_Reduce_scatter` be implemented by combining existing collectives ?", md"`MPI_Reduce_scatter` can be implemented by `MPI_Reduce` followed by `MPI_Scatter`")metadatashow_logsèdisabled®skip_as_script«code_folded$1b617828-e2b2-4a94-a120-59fa533d3e11cell_id$1b617828-e2b2-4a94-a120-59fa533d3e11codeJmd""" Bandwidth ``\texttt{bw}(u, v)`` is the bandwidth of the cable if ``(u, v) \in E`` or 0 otherwise. Given ``S, T \subseteq V``, ```math \begin{align} \text{Width} &\qquad & w(S, T) & = |\{ (u, v) \in E \mid u \in S, v \in T \}|\\ \text{Bandwidth} & & \texttt{bw}(S, T) & = \sum_{u \in S, v \in T} \texttt{bw}(u, v) \end{align} ``` """metadatashow_logsèdisabled®skip_as_script«code_folded$c3590376-06ed-45a4-af0b-2d46f1a387c8cell_id$c3590376-06ed-45a4-af0b-2d46f1a387c8codeghbox([ Div(md""" Free up memory. """; style = Dict("flex-grow" => "1")), c""" MPI_Finalize(); """, ])metadatashow_logsèdisabled®skip_as_script«code_folded$35ba1eea-56ae-4b74-af96-21ec5a93c455cell_id$35ba1eea-56ae-4b74-af96-21ec5a93c455codeOmd""" You could simply add `-lmpi` but using `mpicc` and `mpic++` is easier. 
"""metadatashow_logsèdisabled®skip_as_script«code_folded$488b0c17-4f0f-43bf-a16c-b9faa7ae0595cell_id$488b0c17-4f0f-43bf-a16c-b9faa7ae0595code4aside(citeintro("Section 2.7.6.3"), v_offset = -150)metadatashow_logsèdisabled®skip_as_script«code_folded$10a1b3a7-21c7-4f97-93e1-006ad3aea40dcell_id$10a1b3a7-21c7-4f97-93e1-006ad3aea40dcodemd"## Butterfly"metadatashow_logsèdisabled®skip_as_script«code_folded$b68eb860-a5b4-4e9e-9fbf-6eb6ce43ae69cell_id$b68eb860-a5b4-4e9e-9fbf-6eb6ce43ae69codeFoldable(md"What is the bisection width of a ``n \times n`` 2D array ?", md""" It is ``n = \sqrt{|V|}``: $(img("https://upload.wikimedia.org/wikipedia/commons/2/2f/Bisected_mesh.jpg", :width => "300pt")) $(Foldable(md"What is the bisection width of a ``n^d`` ``d``D array ?", md"It is 1 for ``d = 1``, ``n`` for ``d = 2`` and ``n^2`` for ``d = 3``. In general, it is ``n^{d-1} = |V|^{(d-1)/d}``")) """)metadatashow_logsèdisabled®skip_as_script«code_folded$3dc860be-016d-49ee-8535-7d9457c70f85cell_id$3dc860be-016d-49ee-8535-7d9457c70f85codenFoldable(md"What is the graph diameter ?", md"``|V| - 1`` if ``u`` and ``v`` are extreme points of the array")metadatashow_logsèdisabled®skip_as_script«code_folded$2e4dc3f9-a132-444f-a35d-f583823a7dfdcell_id$2e4dc3f9-a132-444f-a35d-f583823a7dfdcode"Foldable(md"What is the graph diameter of a ``n \times n`` 2D array ?", md""" It is ``2(n-1)``, attained for opposite vertices of the square. 
$(Foldable(md"What is the graph diameter of a ``n^d`` ``d``D array ?", md"It is ``d(n-1)``, attained for opposite vertices of the hypercube.")) """)metadatashow_logsèdisabled®skip_as_script«code_folded$8981b5e2-2497-478e-ab28-a14b62f6f916cell_id$8981b5e2-2497-478e-ab28-a14b62f6f916coderun(`mpicc -show`)metadatashow_logsèdisabled®skip_as_script«code_folded$34a10003-2c32-4332-b3e6-ce70eec0cbbecell_id$34a10003-2c32-4332-b3e6-ce70eec0cbbecodemd"## Example"metadatashow_logsèdisabled®skip_as_script«code_folded$a0566fdb-a08d-4bcf-9b2f-ed211c9f111fcell_id$a0566fdb-a08d-4bcf-9b2f-ed211c9f111fcode2aside(citeintro("Section 2.7.5"), v_offset = -150)metadatashow_logsèdisabled®skip_as_script«code_folded$655e980d-b4e9-4f56-a5ae-380072242d27cell_id$655e980d-b4e9-4f56-a5ae-380072242d27codehbox([ img("https://ars.els-cdn.com/content/image/3-s2.0-B9781558608528500043-f01-09-9781558608528.jpg"), img("https://ars.els-cdn.com/content/image/3-s2.0-B9781558608528500043-f01-10-9781558608528.jpg"), ])metadatashow_logsèdisabled®skip_as_script«code_folded$4788d8b4-2efa-4489-80c3-71f405513644cell_id$4788d8b4-2efa-4489-80c3-71f405513644code\md"`num_processes` = $(@bind sum_num_processes Slider(2:8, default = 2, show_value = true))"metadatashow_logsèdisabled®skip_as_script«code_folded$233c13ff-f008-40b0-a6c5-c5395b2215eccell_id$233c13ff-f008-40b0-a6c5-c5395b2215eccodeFoldable( md"Lower bound complexity with ``p`` processes if each ``x_i`` has length ``n/p`` bytes ?", md""" Lower bound : ``\log_2(p) \alpha`` using the *spanning tree* algorithm and ``\beta n`` as all messages need to be sent at least once. The *spanning tree* is advantageous if ``\alpha`` is larger than ``\beta``, and sending directly to `1` is advantageous otherwise. In practice, you want a mix of both. First send ``x_2`` from 2 to 1 and simultaneously send ``x_4`` from 4 to 3. 
Complexity is ``\alpha + \beta n/4`` | `procid` | 1 | 2 | 3 | 4 | |----------|---|---|---|---| | | ``x_1`` | | | | | | ``x_2`` | ``x_2`` | | | | | | | ``x_3`` | | | | | | ``x_4`` | ``x_4`` | Then send ``(x_3, x_4)`` from 3 to 1. Complexity is ``\alpha + 2\beta n/4`` | `procid` | 1 | 2 | 3 | 4 | |----------|---|---|---|---| | | ``x_1`` | | | | | | ``x_2`` | ``x_2`` | | | | | ``x_3`` | | ``x_3`` | | | | ``x_4`` | | ``x_4`` | ``x_4`` | In total, it is ``2\alpha + 3\beta n/4``. In general, we have ```math \log_2(p)\alpha + \beta n(1 + 2 + 4 + \cdots + p/2)/p = \log_2(p)\alpha + \beta n(p - 1)/p \approx \log_2(p)\alpha + \beta n ``` """ )metadatashow_logsèdisabled®skip_as_script«code_folded$6fc34de1-469b-41a9-9677-ff3182f7a498cell_id$6fc34de1-469b-41a9-9677-ff3182f7a498code٤Foldable(md"Can `MPI_Allgather` be implemented by combining existing collectives ?", md"`MPI_Allgather` can be implemented by `MPI_Gather` followed by `MPI_Bcast`")metadatashow_logsèdisabled®skip_as_script«code_folded$b53ec488-ff25-4647-ab00-fbf90963a795cell_id$b53ec488-ff25-4647-ab00-fbf90963a795codeٙmd""" *blocking factor* : Ratio between upper links and lower links. Ratio is 1 for fat-tree to prevent bottlenecks if all nodes start communicating. """metadatashow_logsèdisabled®skip_as_script«code_folded$c3c848ff-526a-450d-9b1c-5d9d3ccccf28cell_id$c3c848ff-526a-450d-9b1c-5d9d3ccccf28code#md"## Eager vs rendezvous protocol"metadatashow_logsèdisabled®skip_as_script«code_folded$063f0acc-c023-46d0-9ed9-fbd7fbdcfa3bcell_id$063f0acc-c023-46d0-9ed9-fbd7fbdcfa3bcode*citeintro(what) = "[Eij10; " * what * "]";metadatashow_logsèdisabled®skip_as_script«code_folded$4569aa05-9963-4976-ac63-caf3f3979e83cell_id$4569aa05-9963-4976-ac63-caf3f3979e83codemd""" Blocking send/receive with `MPI_Send` and `MPI_Recv`. The network cannot buffer the whole message (unless it is short). The sender needs to wait for the receiver to be ready and then transfer its copy of the data. 
"""metadatashow_logsèdisabled®skip_as_script«code_folded$7565e3da-84ce-42b6-8d4b-3615576f33b7cell_id$7565e3da-84ce-42b6-8d4b-3615576f33b7codetbegin struct Path path::String end function imgpath(path::Path) file = path.path if !('.' in file) file = file * ".png" end return joinpath(joinpath(@__DIR__, "images", file)) end function img(path::Path, args...; kws...) return PlutoUI.LocalResource(imgpath(path), args...) end struct URL url::String end function save_image(url::URL, html_attributes...; name = split(url.url, '/')[end], kws...) path = joinpath("cache", name) return PlutoTeachingTools.RobustLocalResource(url.url, path, html_attributes...), path end function img(url::URL, args...; kws...) r, _ = save_image(url, args...; kws...) return @htl("$r") end function img(file::String, args...; kws...) if startswith(file, "http") img(URL(file), args...; kws...) else img(Path(file), args...; kws...) end end end metadatashow_logsèdisabled®skip_as_script«code_folded$7fc70992-973a-43c6-904a-dd1b622a5ed8cell_id$7fc70992-973a-43c6-904a-dd1b622a5ed8codeټFoldable(md"What is the bisection width ?", md""" The bisection width is 1 : $(img("https://upload.wikimedia.org/wikipedia/commons/7/79/Bisected_linear_array.jpg", :width => "300pt")) """)metadatashow_logsèdisabled®skip_as_script«code_folded$fa024a5d-52a6-459d-894d-13a60ec723d2cell_id$fa024a5d-52a6-459d-894d-13a60ec723d2codeFoldable(md"What are the differences with Min-Cut ?", md""" In Min-Cut, we fix a node in ``S``, a node in ``V \setminus S`` and the cardinality of `S` is not constrained. These differences allow Min-Cut to be solvable in polynomial time. 
""")metadatashow_logsèdisabled®skip_as_script«code_folded$f7f097cb-d7bd-49eb-a030-ac26f8f61a67cell_id$f7f097cb-d7bd-49eb-a030-ac26f8f61a67codeCmd"Fat-tree need large switches, alternative is butterfly network:"metadatashow_logsèdisabled®skip_as_script«code_folded$3a2bfd4e-0ce6-4a79-a578-fc1b4ef563c5cell_id$3a2bfd4e-0ce6-4a79-a578-fc1b4ef563c5code'Foldable(md"How to fix it ?", md""" We should load `gompi` or at least `OpenMPI`: ```sh [blegat@lm4-f001 examples]$ module load OpenMPI [blegat@lm4-f001 examples]$ mpicc procname.c [blegat@lm4-f001 examples]$ mpiexec -n 4 a.out Process 1/4 is running on node <> Process 3/4 is running on node <> Process 0/4 is running on node <> Process 2/4 is running on node <> ``` $(Foldable(md"Why are they all on same node ?", md"We are on the *login node*, we need to run jobs on the *compute nodes* using Slurm !")) """)metadatashow_logsèdisabled®skip_as_script«code_folded$a79c410a-bebf-434c-9730-568e0ff4f4c7cell_id$a79c410a-bebf-434c-9730-568e0ff4f4c7code "15pt")) cluster from [the list](https://www.ceci-hpc.be/clusters.html) + `manneback` for GPU. You only have access to Tier-2 clusters. This sadly leaves out: - Tier-1 clusters such as Lucia - Tier-0 cluster such as $(img("https://www.lumi-supercomputer.eu/wp-content/uploads/2020/02/lumi_logo.png", :height => "15pt")) from $(img("https://upload.wikimedia.org/wikipedia/commons/8/8f/HPC_JU_logo_RGB.svg", :height => "20pt")) * Connect with SSH using `ssh lemaitre4` or `ssh manneback`. 
"""metadatashow_logsèdisabled®skip_as_script«code_folded$e119c2d3-1e24-464f-b812-62f28c00a913cell_id$e119c2d3-1e24-464f-b812-62f28c00a913codemd"## Reduce scatter"metadatashow_logsèdisabled®skip_as_script«code_folded$a258eec9-f4f6-49bd-8470-8541836f5f6bcell_id$a258eec9-f4f6-49bd-8470-8541836f5f6bcodehbox([ md"""Before | `procid` | 1 | 2 | 3 | 4 | |----------|---|---|---|---| | | ``x_1`` | | | | | | | ``x_2`` | | | | | | | ``x_3`` | | | | | | | ``x_4`` | """, Div(md"` `", style = Dict("margin" => "50pt")), md"""After `MPI_Allgather` | `procid` | 1 | 2 | 3 | 4 | |----------|---|---|---|---| | | ``x_1`` | ``x_1`` | ``x_1`` | ``x_1`` | | | ``x_2`` | ``x_2`` | ``x_2`` | ``x_2`` | | | ``x_3`` | ``x_3`` | ``x_3`` | ``x_3`` | | | ``x_4`` | ``x_4`` | ``x_4`` | ``x_4`` | """, ])metadatashow_logsèdisabled®skip_as_script«code_folded$79b405a5-54b5-4727-a0cd-b79522ad109fcell_id$79b405a5-54b5-4727-a0cd-b79522ad109fcodemd"# Point-to-point"metadatashow_logsèdisabled®skip_as_script«code_folded$93f0c63c-b597-4f89-809c-7af0476f319acell_id$93f0c63c-b597-4f89-809c-7af0476f319acodeٚmd""" `MPI_Isend` and `MPI_Irecv` where `I` stands for `immediate` or `incomplete`. `MPI_Wait` can be used to wait for the send and receive to finish. 
"""metadatashow_logsèdisabled®skip_as_script«code_folded$944d827e-bc6a-4de8-b959-5fde8790bedccell_id$944d827e-bc6a-4de8-b959-5fde8790bedccode٣md""" ```sh [laptop]$ ssh lemaitre4 [blegat@lm4-f001 ~]$ cd LINMA2710/examples [blegat@lm4-f001 examples]$ mpicc procname.c -bash: mpicc: command not found ``` """metadatashow_logsèdisabled®skip_as_script«code_folded$c04bcc96-e5fe-4d6e-a12e-40dcde58c62ecell_id$c04bcc96-e5fe-4d6e-a12e-40dcde58c62ecodemd""" * MPI $(img("https://avatars.githubusercontent.com/u/14836989", name = "MPI.png", :height => "20pt")) is an open standard for distributed computing * [Many implementations](https://www.mpi-forum.org/implementation-status/): - MPICH, from $(img("https://upload.wikimedia.org/wikipedia/commons/6/65/ArgonneLaboratoryLogo.png", :height => "20pt")) and $(img("https://upload.wikimedia.org/wikipedia/commons/6/69/Mississippi_State_University_logo.svg", :height => "20pt")) - Open MPI $(img("https://upload.wikimedia.org/wikipedia/commons/6/6f/Open_MPI_logo.png", :height => "20pt")) (not to be confused with $(img("https://upload.wikimedia.org/wikipedia/commons/e/eb/OpenMP_logo.png", :width => "45pt"))) - commercial implementations from $(img("https://upload.wikimedia.org/wikipedia/commons/4/46/Hewlett_Packard_Enterprise_logo.svg", :height => "20pt")), $(img("https://upload.wikimedia.org/wikipedia/commons/6/6a/Intel_logo_%282020%2C_dark_blue%29.svg", :height => "15pt")), $(img("https://upload.wikimedia.org/wikipedia/commons/9/96/Microsoft_logo_%282012%29.svg", :height => "15pt")), and $(img("https://upload.wikimedia.org/wikipedia/commons/9/96/NEC_logo.svg", :height => "15pt")) """metadatashow_logsèdisabled®skip_as_script«code_folded$c420ad25-6af1-4fb4-823a-b6bbd4e10f7fcell_id$c420ad25-6af1-4fb4-823a-b6bbd4e10f7fcode.hbox([ md"""Before | `procid` | 1 | 2 | 3 | 4 | |----------|---|---|---|---| | | ``x_1`` | ``x_2`` | ``x_3`` | ``x_4`` | """, Div(md"` `", style = Dict("margin" => "50pt")), md"""After | `procid` | 1 | 2 | 3 | 4 | 
|----------|---|---|---|---| | | ``x_1 + x_2 + x_3 + x_4`` | | | | """, ])metadatashow_logsèdisabled®skip_as_script«code_folded$a6c337c4-0c81-4463-ad4f-9a4528d953abcell_id$a6c337c4-0c81-4463-ad4f-9a4528d953abcode&md"## Message Passing Interface (MPI)"metadatashow_logsèdisabled®skip_as_script«code_folded$921b5a18-0733-4032-a543-9d60e254b1b2cell_id$921b5a18-0733-4032-a543-9d60e254b1b2codeNmd""" * Specializing on topology is important for communication libraries like MPI/NCCL. For instance, Deepseek-V3 by-passed NCCL and used PTX directly to hardcode how their hardware should be used. * Specified in [Slurm's `topology.conf` file](https://slurm.schedmd.com/topology.conf.html). * Source : $(citeintro("Section 2.7")) """metadatashow_logsèdisabled®skip_as_script«code_folded$2257220c-6f0e-4edf-9fea-7e388b84df9bcell_id$2257220c-6f0e-4edf-9fea-7e388b84df9bcode'md"## Multidimensional array and torus"metadatashow_logsèdisabled®skip_as_script«code_folded$c45ff9b5-35d9-4a9d-a801-c762333a1f02cell_id$c45ff9b5-35d9-4a9d-a801-c762333a1f02codebegin struct Example name::String end function code(example::Example) code = read(joinpath(dirname(@__DIR__), "examples", example.name), String) ext = split(example.name, '.')[end] if ext == "c" return CCode(code) elseif ext == "cpp" || ext == "cc" return CppCode(code) elseif ext == "cl" return CLCode(code) else error("Unrecognized extension `$ext`.") end end function SimpleClang.compile_and_run(example::Example; kws...) return SimpleClang.compile_and_run(code(example); kws...) end function SimpleClang.compile_lib(example::Example; kws...) return SimpleClang.compile_lib(code(example); kws...) 
end endmetadatashow_logsèdisabled®skip_as_script«code_folded$4aac6ab5-053a-4f60-9e2e-e8d61ff0cecbcell_id$4aac6ab5-053a-4f60-9e2e-e8d61ff0cecbcode١img("https://raw.githubusercontent.com/VictorEijkhout/TheArtOfHPC_vol1_scientificcomputing/refs/heads/main/booksources/graphics/fattree5.jpg", :width => "500pt")metadatashow_logsèdisabled®skip_as_script«code_folded$61af27f1-9f83-42f1-a419-06d12ea62133cell_id$61af27f1-9f83-42f1-a419-06d12ea62133code4aside(citeintro("Section 2.7.6.1"), v_offset = -200)metadatashow_logsèdisabled®skip_as_script«code_folded$52d428d5-cb33-4f2a-89eb-3a8ce3f5bb81cell_id$52d428d5-cb33-4f2a-89eb-3a8ce3f5bb81codeFoldable( md"Each process runs the **same** executable. So how can we make them do different things ?", md"Even if the code is the same, `MPI_Comm_rank` gives each process a different `procid`, so the parts of the program that depend on the value of `procid` behave differently.", )metadatashow_logsèdisabled®skip_as_script«code_folded$d7e31ced-4eb2-4221-b83f-462e8f32fe89cell_id$d7e31ced-4eb2-4221-b83f-462e8f32fe89codeWaside(Foldable(md"Does this measure the bandwidth accurately ?", md"No, the time also includes the time that process 0 has to wait until process 1 is ready to start receiving. If the message is small enough, MPI will just buffer it and `MPI_Send` can return before the other process has even reached `MPI_Recv`, see next slide." 
), v_offset = -500)metadatashow_logsèdisabled®skip_as_script«code_folded$2c84bd84-b54d-4594-b9f8-35db2124d7e8cell_id$2c84bd84-b54d-4594-b9f8-35db2124d7e8codemd"## Hypercube"metadatashow_logsèdisabled®skip_as_script«code_folded$32f740e7-9338-4c42-8eaf-ce8022412c50cell_id$32f740e7-9338-4c42-8eaf-ce8022412c50code md"## Nonblocking communication"metadatashow_logsèdisabled®skip_as_script«code_folded$db16e939-b490-497b-a03f-80ce2e8485afcell_id$db16e939-b490-497b-a03f-80ce2e8485afcodeFoldable( md"Lower bound complexity with ``p`` processes if each ``x_i`` has length ``n`` bytes and the arithmetic complexity is ``\gamma`` ?", md"""Lower bound : ``\log_2(p) (\alpha + \beta n) + \log_2(p) \gamma n`` using the *spanning tree* algorithm: First communication (2 → 1 and 4 → 3 at the same time): | `procid` | 1 | 2 | 3 | 4 | |----------|---|---|---|---| | | ``x_1 + x_2`` | | ``x_3 + x_4`` | | Then second communication (3 → 1) """ )metadatashow_logsèdisabled®skip_as_script«code_folded$7d37fbea-baa3-43ec-b003-a4707017a4cfcell_id$7d37fbea-baa3-43ec-b003-a4707017a4cfcodemd"## Rings"metadatashow_logsèdisabled®skip_as_script«code_folded$568057f5-b0b8-4225-8e4b-5eec911a52efcell_id$568057f5-b0b8-4225-8e4b-5eec911a52efcodemd"## Example"metadatashow_logsèdisabled®skip_as_script«code_folded$370f0f20-e373-4028-bca1-83e93678cbcbcell_id$370f0f20-e373-4028-bca1-83e93678cbcbcodeُimg("https://raw.githubusercontent.com/VictorEijkhout/TheArtOfHPC_vol2_parallelprogramming/refs/heads/main/booksources/graphics/mpi-array.png")metadatashow_logsèdisabled®skip_as_script«code_folded$1bac238f-79c8-4f9f-a187-bacb288de3b0cell_id$1bac238f-79c8-4f9f-a187-bacb288de3b0codetree()metadatashow_logsèdisabled®skip_as_script«code_folded$40606ee3-38cc-4123-9b86-b774bf89e499cell_id$40606ee3-38cc-4123-9b86-b774bf89e499codemd"# Collectives"metadatashow_logsèdisabled®skip_as_script«code_folded$8df4ff2f-d176-4b4e-a525-665b5d07ea52cell_id$8df4ff2f-d176-4b4e-a525-665b5d07ea52codeًusing SimpleClang, PlutoUI, 
PlutoUI.ExperimentalLayout, HypertextLiteral, Luxor, StaticArrays, BenchmarkTools, PlutoTeachingTools, Markdownmetadatashow_logsèdisabled®skip_as_script«code_folded$23bfbe95-7ba2-41b9-bd8b-dc4baa3ad53acell_id$23bfbe95-7ba2-41b9-bd8b-dc4baa3ad53acodeٳFoldable(md"What is the bisection width ?", md""" The bisection width is 2: $(img("https://upload.wikimedia.org/wikipedia/commons/5/51/Bisected_ring.jpg", :width => "300pt")) """)metadatashow_logsèdisabled®skip_as_script«code_folded$49b596b8-891d-4f3f-a6a4-a62cc8237df3cell_id$49b596b8-891d-4f3f-a6a4-a62cc8237df3code^definition("Graph diameter", md"*Graph diameter* is ``d(G) := \max_{u, v \in V} d(G, u, v)``")metadatashow_logsèdisabled®skip_as_script«code_folded$cf799c26-1cea-4b38-9a15-8497813bd668cell_id$cf799c26-1cea-4b38-9a15-8497813bd668codemd"## MPI basics"metadatashow_logsèdisabled®skip_as_script«code_folded$5441e428-b320-433c-acde-15fe6bf58537cell_id$5441e428-b320-433c-acde-15fe6bf58537coderun(`mpic++ -show`)metadatashow_logsèdisabled®skip_as_script«code_folded$bfab5c2d-61c3-468b-9ddf-4aaa49cb7785cell_id$bfab5c2d-61c3-468b-9ddf-4aaa49cb7785code(md""" * [Eij10] V. Eijkhout. [Introduction to High Performance Scientific Computing](https://theartofhpc.com/istc.html). 3 Edition, Vol. 1 (Lulu.com, 2010). * [Eij17] V. Eijkhout. [Parallel Programming in MPI and OpenMP](https://theartofhpc.com/pcse.html). 2 Edition, Vol. 2 (Lulu.com, 2017). """metadatashow_logsèdisabled®skip_as_script«code_folded$7d9ac5f9-39bf-4052-ad8a-ac0fec15c64acell_id$7d9ac5f9-39bf-4052-ad8a-ac0fec15c64acode_md""" Processes that are on the same node share the same `processor_name` (the `hostname`). 
"""metadatashow_logsèdisabled®skip_as_script«code_folded$86394e1c-0ff4-449a-8940-4b5906d8b6f0cell_id$86394e1c-0ff4-449a-8940-4b5906d8b6f0code9Foldable(md"What is the graph diameter ?", md"``|V|/2``")metadatashow_logsèdisabled®skip_as_script«code_folded$97d3cf3f-ddac-4850-8b05-bdc0c4741f61cell_id$97d3cf3f-ddac-4850-8b05-bdc0c4741f61code+Foldable(md"What are the number of switches, edges, graph diameter and bisection width for ``n`` computer nodes ?", md""" * There are ``n^2`` switches one per intersection. This makes this architecture only suitable for small ``n``. * The number of edges is : ``|E| = 2n^2`` which consists of ``n`` connections from an input to a switch, ``n`` connections from a switch to an output and ``2n(n-1)`` connections between switches. * The diameter 2 if we don't count the in-between switches or ``2n`` if we coun't them. * The bisection width is ``n/2``. """)metadatashow_logsèdisabled®skip_as_script«code_folded$9b4cae31-c319-444e-98c8-2c0bfc6dfa0ccell_id$9b4cae31-c319-444e-98c8-2c0bfc6dfa0ccodemd"## Broadcast"metadatashow_logsèdisabled®skip_as_script«code_folded$b9a9e335-1328-4c63-a213-ce21263bc201cell_id$b9a9e335-1328-4c63-a213-ce21263bc201codeEFoldable( md"Can `MPI_Allreduce` be implemented by combining existing collectives ?", md""" Let the size of each ``x_i`` be ``n`` bytes. `MPI_Allreduce` can be implemented either by combining `MPI_Reduce` followed by `MPI_Bcast` or `MPI_Reduce_scatter` followed by `MPI_Allgather`. The first choice would lead to a complexity of ``\log_2(p)(\alpha + \beta n + \gamma n )``. The second would lead to a complexity of ``\log_2(p)\alpha + \beta n + \gamma n``. This second approach is faster for large ``p`` since we removed ``\log_2(p)`` in front of ``\beta`` and ``\gamma``. 
""", )metadatashow_logsèdisabled®skip_as_script«code_folded$7b1d26c6-9499-4e44-84c8-c272737a175ecell_id$7b1d26c6-9499-4e44-84c8-c272737a175ecodemd"## Gather"metadatashow_logsèdisabled®skip_as_script«code_folded$360091c4-d3a0-462d-abcf-b9bbb9480871cell_id$360091c4-d3a0-462d-abcf-b9bbb9480871codemd"## Linear array"metadatashow_logsèdisabled®skip_as_script«code_folded$f2417047-33fc-4489-8e89-115bc6b46c13cell_id$f2417047-33fc-4489-8e89-115bc6b46c13code@aside(md"""From $(citeintro("Figure 2.30"))""", v_offset = -200)metadatashow_logsèdisabled®skip_as_script«code_folded$1152dec8-3810-42b1-bb2a-8755dcaef56ccell_id$1152dec8-3810-42b1-bb2a-8755dcaef56ccode٠img1(f, args...) = img("https://raw.githubusercontent.com/VictorEijkhout/TheArtOfHPC_vol1_scientificcomputing/refs/heads/main/booksources/graphics/$f", args...)metadatashow_logsèdisabled®skip_as_script«code_folded$8f46daf1-9ca2-4a08-99aa-4ed68af218b8cell_id$8f46daf1-9ca2-4a08-99aa-4ed68af218b8code2aside(citeintro("Section 2.7.4"), v_offset = -150)metadatashow_logsèdisabled®skip_as_script«code_folded$143dca7c-f9a4-472a-a4bc-4578e4e8413bcell_id$143dca7c-f9a4-472a-a4bc-4578e4e8413bcodemd"## Tree"metadatashow_logsèdisabled®skip_as_script«code_folded$c0daf219-cb87-4203-b835-49ab7eb955becell_id$c0daf219-cb87-4203-b835-49ab7eb955becodeOmd""" ``` [local computer]$ ssh lemaitre4 ``` $list_1 $mpicc_cmd $list_2 """metadatashow_logsèdisabled®skip_as_script«code_folded$d04b9af5-f004-4ca4-b1c9-2c86d46cb37dcell_id$d04b9af5-f004-4ca4-b1c9-2c86d46cb37dcodeEmd""" * Each node input is a row and each node output is a column; [source of figure below](https://www.sciencedirect.com/topics/computer-science/crossbar-network). * Each intersection is a switch. The cases (a) and (c) represent conflicting cases where two inputs want to simultaneously communicate with the same output. 
"""metadatashow_logsèdisabled®skip_as_script«code_folded$fc705b81-7310-44cc-ad9f-dc2cf8a9b645cell_id$fc705b81-7310-44cc-ad9f-dc2cf8a9b645codepath(true)metadatashow_logsèdisabled®skip_as_script«code_folded$6d2b3dbc-0686-49f0-904a-56c3ce63b4ddcell_id$6d2b3dbc-0686-49f0-904a-56c3ce63b4ddcodeٚhbox([ Div(md"Initializes MPI, remove `mpiexec`, etc... from `argc` and `argv`."; style = Dict("flex-grow" => "1")), c""" MPI_Init(&argc, &argv) """, ])metadatashow_logsèdisabled®skip_as_script«code_folded$e172f5c5-8b96-4efd-9cf3-805c58d1a6a3cell_id$e172f5c5-8b96-4efd-9cf3-805c58d1a6a3codezfunction definition(name, content) return Markdown.MD(Markdown.Admonition("key-concept", "Def: $name", [content])) endmetadatashow_logsèdisabled®skip_as_script«code_folded$ce7bf747-7116-4e76-9004-f234317046c3cell_id$ce7bf747-7116-4e76-9004-f234317046c3codeKcompile_and_run(Example("MPI/mpi_bench1.c"), mpi = true, num_processes = 2)metadatashow_logsèdisabled®skip_as_script«code_folded$d722a86d-6d51-4d91-ac22-53af94c91497cell_id$d722a86d-6d51-4d91-ac22-53af94c91497codevbox([ Div(md"Get the id of processes. `procid` is **different** for **different** processes."; style = Dict("flex-grow" => "1")), c""" int procid; MPI_Comm_rank(MPI_COMM_WORLD, &procid); """, ])metadatashow_logsèdisabled®skip_as_script«code_folded$273ad3a6-cb32-49bb-8702-fdaf8597e812cell_id$273ad3a6-cb32-49bb-8702-fdaf8597e812code2md"## Different processes may be on the same node"metadatashow_logsèdisabled®skip_as_script«code_folded$6041a909-d26c-4ab1-836b-29953c578759cell_id$6041a909-d26c-4ab1-836b-29953c578759codefFoldable(md"What is the number of edges ? What is the bisection width ?", md""" Same as fat-tree. 
""")metadatashow_logsèdisabled®skip_as_script«code_folded$b94cd399-0370-49e9-a522-056f3af22955cell_id$b94cd399-0370-49e9-a522-056f3af22955codeّimg("https://raw.githubusercontent.com/VictorEijkhout/TheArtOfHPC_vol2_parallelprogramming/refs/heads/main/booksources/graphics/collectives.jpg")metadatashow_logsèdisabled®skip_as_script«code_folded$58e12afd-6eb0-4731-bd57-d9ae7ab4e164cell_id$58e12afd-6eb0-4731-bd57-d9ae7ab4e164code@htl("""

LINMA2710 - Scientific Computing Distributed Computing with MPI

P.-A. Absil and B. Legat

$(PlutoTeachingTools.ChooseDisplayMode()) $(PlutoUI.TableOfContents(depth=1)) """)metadatashow_logsèdisabled®skip_as_script«code_folded$141d162c-c817-498f-be16-f1cd35d82487cell_id$141d162c-c817-498f-be16-f1cd35d82487codeAFoldable(md"How to collect the partial sums ?", md"`MPI_Reduce`")metadatashow_logsèdisabled®skip_as_script«code_folded$e44b0038-d68f-4a49-9da2-67fbcbe098c3cell_id$e44b0038-d68f-4a49-9da2-67fbcbe098c3codepath(false)metadatashow_logsèdisabled®skip_as_script«code_folded$b5a3e471-af4a-466f-bbae-96306bcc7563cell_id$b5a3e471-af4a-466f-bbae-96306bcc7563codeٽvbox([ Div(md"Get the number of processes. `nprocs` is the **same** on all processes."; style = Dict("flex-grow" => "1")), c""" int nprocs; MPI_Comm_size(MPI_COMM_WORLD, &nprocs); """, ])metadatashow_logsèdisabled®skip_as_script«code_folded$21b6133f-db59-4885-9b3d-331c3d6ef306cell_id$21b6133f-db59-4885-9b3d-331c3d6ef306codemd"## Compiling"metadatashow_logsèdisabled®skip_as_script«code_folded$beee4908-d519-413a-964f-149bb82cdbb8cell_id$beee4908-d519-413a-964f-149bb82cdbb8codemd"## Slurm"metadatashow_logsèdisabled®skip_as_script«code_folded$091dd042-580b-4fda-8086-e048663aed6ccell_id$091dd042-580b-4fda-8086-e048663aed6ccodemd""" * NVIDIA Nsight Systems $(img("https://developer.download.nvidia.com/images/nvidia-nsight-systems-icon-gbp-shaded-256.png", :width => "20pt")) can profile CUDA code but also MPI * Available on `manneback` after loading `CUDA` with $(img("https://github.com/TACC/Lmod/raw/main/logos/2x/Lmod-4color%402x.png", :height => "20px")) ```sh [laptop]$ ssh manneback [blegat@mbackf1 ~]$ nsys -bash: nsys: command not found [blegat@mbackf1 ~]$ ml CUDA [blegat@mbackf1 ~]$ nsys ``` """metadatashow_logsèdisabled®skip_as_script«code_folded$60bc118f-6795-43f9-97a2-865fd1704895cell_id$60bc118f-6795-43f9-97a2-865fd1704895codemd"## 
Allreduce"metadatashow_logsèdisabled®skip_as_script«code_folded$16f8d28b-f201-4fe5-8446-68d7d9ddfb3ccell_id$16f8d28b-f201-4fe5-8446-68d7d9ddfb3ccode4aside(citeintro("Section 2.7.6.2"), v_offset = -250)metadatashow_logsèdisabled®skip_as_script«code_folded$35aa1295-642f-4525-bf19-df2a42ff39d6cell_id$35aa1295-642f-4525-bf19-df2a42ff39d6codeecompile_and_run(Example("MPI/mpi_sum.c"), mpi = true, num_processes = sum_num_processes, verbose = 1)metadatashow_logsèdisabled®skip_as_script«code_folded$e832ce25-94e2-4743-854d-02b52cc7b56dcell_id$e832ce25-94e2-4743-854d-02b52cc7b56dcodeٯaside(Foldable(md"Why is it the first process that gets the sum ?", md"We gave 0 to the 6th argument of `MPI_Reduce`, this decides which node gets the sum."), v_offset = -100)metadatashow_logsèdisabled®skip_as_script«code_folded$9612a1ef-fd3a-4a58-87b0-b2255ac86331cell_id$9612a1ef-fd3a-4a58-87b0-b2255ac86331codemd"## Graph diameter"metadatashow_logsèdisabled®skip_as_script«code_folded$d7117a24-aba6-4479-a40e-5005310a6b38cell_id$d7117a24-aba6-4479-a40e-5005310a6b38code2aside(citeintro("Section 2.7.3"), v_offset = -150)metadatashow_logsèdisabled®skip_as_script«code_folded$0e640e07-82c7-4dab-a8f1-2f634bbebdeacell_id$0e640e07-82c7-4dab-a8f1-2f634bbebdeacodeZhbox([ img("https://raw.githubusercontent.com/VictorEijkhout/TheArtOfHPC_vol2_parallelprogramming/refs/heads/main/booksources/graphics/send-ideal.png", :height => "150pt"), img("https://raw.githubusercontent.com/VictorEijkhout/TheArtOfHPC_vol2_parallelprogramming/refs/heads/main/booksources/graphics/send-blocking.png", :height => "160pt"), ])metadatashow_logsèdisabled®skip_as_script«code_folded$f2ebc6fb-e07c-4922-897d-9bbe0f5fa1d0cell_id$f2ebc6fb-e07c-4922-897d-9bbe0f5fa1d0code#definition("Bisection bandwidth", hbox([ md""" The *bisection width* is: ```math \min_{S \subset V : \lfloor |V|/2 \rfloor \le |S| \le \lceil |V|/2 \rceil} \quad w(S, V \setminus S) ``` """, Div(html" ", style = Dict("flex-grow" => "1")), md""" The *bisection **band**width* 
is: ```math \min_{S \subset V : \lfloor |V|/2 \rfloor \le |S| \le \lceil |V|/2 \rceil} \quad \texttt{bw}(S, V \setminus S) ``` """])#)metadatashow_logsèdisabled®skip_as_script«code_folded$55e96151-2aa1-4ea0-b672-2038c57d911ecell_id$55e96151-2aa1-4ea0-b672-2038c57d911ecode|aside(img("https://upload.wikimedia.org/wikipedia/en/3/3e/The_LUMI_supercomputer.jpg", :height => "100pt"), v_offset = -140)metadatashow_logsèdisabled®skip_as_script«code_folded$67dee339-98b4-4714-88b2-8098a13235f2cell_id$67dee339-98b4-4714-88b2-8098a13235f2codemd""" There are two protocols: * Rendezvous protocol 1. the sender sends a header; 2. the receiver returns a ‘ready-to-send’ message; 3. the sender sends the actual data. * Eager protocol the message is buffered so `MPI_Send` can return eagerly, before the receiver is even ready Eager protocol is used if the data size is smaller than the *eager limit*. To force the rendezvous protocol, use `MPI_Ssend`. """metadatashow_logsèdisabled®skip_as_script«code_folded$4e32f7fb-cd5a-4190-9c92-ba4029313475cell_id$4e32f7fb-cd5a-4190-9c92-ba4029313475codeُimg("https://raw.githubusercontent.com/VictorEijkhout/TheArtOfHPC_vol2_parallelprogramming/refs/heads/main/booksources/graphics/mpi-node2.png")metadatashow_logsèdisabled®skip_as_script«code_folded$dbc19cbb-1349-4904-b655-2452aa7e2452cell_id$dbc19cbb-1349-4904-b655-2452aa7e2452codevbox([ md"""Before | `procid` | 1 | 2 | 3 | 4 | |----------|---|---|---|---| | | ``x_{1,1}`` | ``x_{1,2}`` | ``x_{1,3}`` | ``x_{1,4}`` | | | ``x_{2,1}`` | ``x_{2,2}`` | ``x_{2,3}`` | ``x_{2,4}`` | | | ``x_{3,1}`` | ``x_{3,2}`` | ``x_{3,3}`` | ``x_{3,4}`` | | | ``x_{4,1}`` | ``x_{4,2}`` | ``x_{4,3}`` | ``x_{4,4}`` | """, #Div(md"` `", style = Dict("margin" => "50pt")), md"""After `MPI_Reduce_scatter` | `procid` | 1 | 2 | 3 | 4 | |----------|---|---|---|---| | | ``x_{1,1} + \cdots + x_{1,4}`` | | | | | | | ``x_{2,1} + \cdots + x_{2,4}`` | | | | | | | ``x_{3,1} + \cdots + x_{3,4}`` | | | | | | | ``x_{4,1} + \cdots + x_{4,4}`` | 
""", ])metadatashow_logsèdisabled®skip_as_script«code_folded$0d69e94b-492a-4acc-adba-a2126b871724cell_id$0d69e94b-492a-4acc-adba-a2126b871724code}vbox([ md"""Before | `procid` | 1 | 2 | 3 | 4 | |----------|---|---|---|---| | | ``x_1`` | ``x_2`` | ``x_3`` | ``x_4`` | """, #Div(md"` `", style = Dict("margin" => "50pt")), md"""After `MPI_Allreduce` | `procid` | 1 | 2 | 3 | 4 | |----------|---|---|---|---| | | ``x_1 + \cdots + x_4`` |``x_1 + \cdots + x_4`` | ``x_1 + \cdots + x_4`` | ``x_1 + \cdots + x_4`` | """, ])metadatashow_logsèdisabled®skip_as_script«code_folded$4309dc43-aeb8-4ec7-94fe-0e320b784349cell_id$4309dc43-aeb8-4ec7-94fe-0e320b784349code*md"Special case of multidimensional array"metadatashow_logsèdisabled®skip_as_script«code_folded$1551122c-70ae-4e37-b3fb-4be91fcc4afbcell_id$1551122c-70ae-4e37-b3fb-4be91fcc4afbcodeFoldable( md""" How to order the nodes so that consecutive nodes in the order are adjacent in the graph ? """, md""" Map nodes to binary number and use [Gray code](https://en.wikipedia.org/wiki/Gray_code). $(img("https://raw.githubusercontent.com/VictorEijkhout/TheArtOfHPC_vol1_scientificcomputing/refs/heads/main/booksources/graphics/hypercubenumber.jpg", :width => "300pt")) """ )metadatashow_logsèdisabled®skip_as_script«code_folded$3e98c0ca-1b47-4631-83d7-cd0c8c0a431dcell_id$3e98c0ca-1b47-4631-83d7-cd0c8c0a431dcode)citepara(what) = "[Eij17; " * what * "]";metadatashow_logsèdisabled®skip_as_script«code_folded$3ec3c058-a94d-4717-b99f-66373f2fa31dcell_id$3ec3c058-a94d-4717-b99f-66373f2fa31dcodeّimg("https://raw.githubusercontent.com/VictorEijkhout/TheArtOfHPC_vol1_scientificcomputing/refs/heads/main/booksources/graphics/butterflys.jpeg")metadatashow_logsèdisabled®skip_as_script«code_foldedënotebook_id$6cc3c06c-4ab3-11f1-be57-ebdd0f7c0b17in_temp_dir¨metadata