
LINMA2710 - Scientific Computing Single Instruction Multiple Data (SIMD)

P.-A. Absil and B. Legat

     

Motivation


The need for parallelism


A bit of historical context

  • 1972 : C language created by Dennis Ritchie and Ken Thompson to ease development of Unix (previously developed in assembly)

  • 1985 : C++ created by Bjarne Stroustrup

  • 2003 : LLVM started at University of Illinois

  • 2005 : Apple hires Chris Lattner from the university

  • 2007 : He then creates the LLVM-based compiler Clang

  • 2009 : Mozilla starts developing an LLVM-based compiler for Rust

  • 2009 : Development starts on Julia, with an LLVM-based compiler


A sum function in C and Julia

float sum(float *vec, int length) {
    float total = 0;
    for (int i = 0; i < length; i++) {
        total += vec[i];
    }
    return total;
}
c_sum(x::Vector{Cfloat}) = ccall(("sum", sum_float_lib), Cfloat, (Ptr{Cfloat}, Cint), x, length(x));
function julia_sum(v::Vector{T}) where {T}
    total = zero(T)
    for i in eachindex(v)
        total += v[i]
    end
    return total
end

Let's make a small benchmark

using BenchmarkTools  # provides the @btime macro

vec_float = rand(Float32, 2^16)

@btime c_sum($vec_float)      # returns 32784.137f0

@btime julia_sum($vec_float)  # returns 32784.137f0

How to speed up the C code?

Try passing the following flags to Clang by selecting them and waiting for the benchmark timing to refresh:

What do they do? We'll see in the following slides.
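
As a hint of why compiler flags can matter here: the `sum` function above accumulates into a single `total`, so every addition has to wait for the previous one. The sketch below is one way to picture what a vectorizing compiler may do once it is allowed to reorder floating-point additions. It is an illustration under assumptions, not the code Clang actually emits: the name `sum4` and the choice of four accumulators are hypothetical. Because floating-point addition is not associative, the result can differ slightly from that of the sequential loop, which is why compilers usually need explicit permission for this reordering (with Clang, `-ffast-math` is one flag that grants it; this flag is named here as an assumption, not as one of the options offered above).

```c
#include <assert.h>

/* Hypothetical helper (not from the notebook): sums `length` floats using
 * four independent accumulators, mimicking the four lanes of a 128-bit
 * SIMD register holding 32-bit floats. */
float sum4(const float *vec, int length) {
    assert(length >= 0);
    float acc[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    int i = 0;
    /* Main loop: lane k accumulates elements k, k + 4, k + 8, ...
     * The four additions are independent of each other, so they can be
     * executed by a single SIMD instruction. */
    for (; i + 4 <= length; i += 4) {
        acc[0] += vec[i];
        acc[1] += vec[i + 1];
        acc[2] += vec[i + 2];
        acc[3] += vec[i + 3];
    }
    /* Horizontal reduction: combine the four partial sums. */
    float total = (acc[0] + acc[1]) + (acc[2] + acc[3]);
    /* Remainder loop: handle the last length % 4 elements one by one. */
    for (; i < length; i++) {
        total += vec[i];
    }
    return total;
}
```

The order of additions differs from that of `sum`, so the two functions agree exactly only when no rounding occurs, for instance on small integer-valued inputs.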


Summing with SIMD
