if rand(local_rng())^2 + rand(local_rng())^2 <= += 1)

Are there any rules of thumb for when to use and when to avoid vectorized code? It seems that vectorized/broadcasted code, or (?) non-scalar code in generators, is not the best idea when it causes additional allocations. Regarding parallelization, I fully get the point of good old for loops there.

Edit: the parallel loops make things worse:
function(N) # 364.900 μs (0 allocations: 0 bytes)
function(N) # 970.800 μs (12 allocations: 336 bytes)

I like that one, thanks! By the way, with Vectorized RNG it needs 900 µs, which is nice.

Crazy that count(sum(rand(2).^2) <= 1 for i in 1:N ) needs 20 ms and 18 …

That's very interesting, but if I understand it right, the limit for "small" N needs to be investigated for each problem, right?

Your version is indeed fast (1.5 ms, 0 allocations), and it does not matter whether sum or count is used.

I also tried the Vectorized RNG. It is a bit faster in the vectorized version:
function(N) # 993 µs, 3.83 MiB
function(N) # 1.1 ms (13 allocations: 3.83 MiB)

But it is much faster in the scalar loop version (it's now the fastest variant):
function(N) # 813.600 μs (0 allocations: 0 bytes)
function(N) # 364.900 μs (0 allocations: 0 bytes)

However, I don't understand why rand()^2 + rand()^2 does not allocate memory whereas norm(rand(2)) does. Do scalar-returning functions not need memory allocation?

julia> function f(n::MyInteger(10)

For large N, the good thing about writing loops is that you can parallelize them:
julia> using FLoops

Cool, so many answers, thank you very much. Thanks for that version.

Which one would you choose, based on performance and (imo also very important) readability?

For small N, you can build a data structure that informs the compiler of the size of what you are doing, and the function will specialize to that size, possibly being very fast and non-allocating:
# original
cnt = count(r.^2 + r.^2.
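On the question raised in the thread of why rand()^2 + rand()^2 does not allocate whereas norm(rand(2)) does: rand() returns a plain Float64, which does not need heap memory, while rand(2) builds a new Vector{Float64} on every call. The following is a minimal sketch of that difference, not code from the thread. The names scalar_hit, vector_hit, tuple_hit and bytes_allocated are hypothetical, and the tuple variant is only one assumed way of telling the compiler the size up front.

```julia
using LinearAlgebra  # provides norm

# Scalar version: rand() returns a plain Float64, so no array is created.
scalar_hit() = rand()^2 + rand()^2 <= 1

# Array version: rand(2) creates a new Vector{Float64} on every call.
vector_hit() = norm(rand(2)) <= 1

# Tuple version: two values whose count is known at compile time, no Vector.
tuple_hit() = sum(abs2, (rand(), rand())) <= 1

# Hypothetical helper: call once so compilation is not measured,
# then report the bytes allocated by a single call.
function bytes_allocated(f)
    f()
    return @allocated f()
end

for f in (scalar_hit, vector_hit, tuple_hit)
    println(f, ": ", bytes_allocated(f), " bytes")
end
```

Inside a loop over N samples, the per-call allocation of the array version is repeated N times, which would be consistent with the slow generator timing reported above for count(sum(rand(2).^2) <= 1 for i in 1:N ).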
Dear all, I'm trying to solve some toy problems in Julia and benchmark them, since there are often many different ways to go. I have a strong Matlab background, so it feels a bit weird to go back to simple loops (and I would prefer not to, due to their verbosity). But in case you tell me that's the recommended way, I'll give it a try. However, here are 9 different approaches to calculate Pi via Monte Carlo sampling (see BasicProblems), and I'd be extremely happy if you could point out the best version or propose your own best one. I'm quite unsure whether I should watch the total time or the number of allocations. Since it's clear that preallocation will hit some limits for large N, I'm looking for iterative solutions, but all generator-based or explicit-loop-based solutions are slower (in this example).
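To make the comparison concrete, here is a minimal sketch of three of the styles discussed: a vectorized version, an explicit scalar loop, and a generator. These are not the nine original approaches, which are not reproduced here. The function names pi_vectorized, pi_loop and pi_generator are hypothetical, and the default rand() is assumed as the random source.

```julia
# Vectorized version: allocates two length-N arrays plus broadcast temporaries.
function pi_vectorized(N)
    x = rand(N)
    y = rand(N)
    return 4 * count(x.^2 .+ y.^2 .<= 1) / N
end

# Scalar loop: draws two numbers per iteration and keeps only a counter.
function pi_loop(N)
    cnt = 0
    for _ in 1:N
        cnt += rand()^2 + rand()^2 <= 1   # a Bool adds as 0 or 1
    end
    return 4 * cnt / N
end

# Generator version: the same scalar work without an explicit loop.
pi_generator(N) = 4 * count(rand()^2 + rand()^2 <= 1 for _ in 1:N) / N

N = 10^6
println(pi_vectorized(N))
println(pi_loop(N))
println(pi_generator(N))
```

Each variant could be timed with BenchmarkTools, for example `@btime pi_loop($N)`, to see both the run time and the allocation count on a given machine. The loop and generator versions work on scalars only, so their memory use should not grow with N, unlike the vectorized version.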