Go Concurrency: Chudnovky's algorithm, slower than sync

Issue

I just recently started learning go at the recommendation of a friend. So far, I am loving it, but I wrote (what I thought would be) the perfect example of the power of lightweight concurrency, and got a surprising result… so I suspect I’m doing something wrong, or I’m misunderstanding how expensive goroutines are. I’m hoping some gophers here can provide insight.

I wrote Chudnovsky’s algorithm in Go using both goroutines and simple synchronous execution. I assumed, with each calculation being independent of the others, it’d be at least a little faster running concurrently.

note: I am running this on a 5th gen i7, so if goroutines are multiplexed onto threads as I was told, this should be both concurrent and parallel.

 package main

import (
  "fmt"
  "math"
  "strconv"
  "time"
)

func main() {
  var input string
  var sum float64
  var pi float64
  c := make(chan float64)

  fmt.Print("How many iterations? ")
  fmt.Scanln(&input)
  max,err := strconv.Atoi(input)

  if err != nil {
    panic("You did not enter a valid integer")
  }
  start := time.Now() //start timing execution of concurrent routine

  for i := 0; i < max; i++ {
    go chudnovskyConcurrent(i,c)
  }

  for i := 0; i < max; i++ {
    sum += <-c
  }
  end := time.Now() //end of concurrent routine
  fmt.Println("Duration of concurrent calculation: ",end.Sub(start))
  pi = 1/(12*sum)
  fmt.Println(pi)

  start = time.Now() //start timing execution of syncronous routine
  sum = 0
  for i := 0; i < max; i++ {
    sum += chudnovskySync(i)
  }
  end = time.Now() //end of syncronous routine
  fmt.Println("Duration of synchronous calculation: ",end.Sub(start))
  pi = 1/(12*sum)
  fmt.Println(pi)
}

func chudnovskyConcurrent(i int, c chan<- float64) {
  var numerator float64
  var denominator float64
  ifloat := float64(i)
  iun := uint64(i)
  numerator = math.Pow(-1, ifloat) * float64(factorial(6*iun)) * (545140134*ifloat+13591409)
  denominator = float64(factorial(3*iun)) * math.Pow(float64(factorial(iun)),3) * math.Pow(math.Pow(640320,3),ifloat+0.5)
  c <- numerator/denominator
}

func chudnovskySync(i int) (r float64) {
  var numerator float64
  var denominator float64
  ifloat := float64(i)
  iun := uint64(i)
  numerator = math.Pow(-1, ifloat) * float64(factorial(6*iun)) * (545140134*ifloat+13591409)
  denominator = float64(factorial(3*iun)) * math.Pow(float64(factorial(iun)),3) * math.Pow(math.Pow(640320,3),ifloat+0.5)
  r = numerator/denominator
  return
}

func factorial(n uint64) (res uint64) {
  if ( n > 0 ) {
    res = n * factorial(n-1)
    return res
  }

  return 1
}

And here are my results:

How many iterations? 20
Duration of concurrent calculation:  573.944µs
3.1415926535897936
Duration of synchronous calculation:  63.056µs
3.1415926535897936

Solution

The calculations you’re doing are too simple to do each one of them in a separate goroutine. You’re loosing more time in the runtime (creating goroutines, multiplexing, scheduling etc) than on actual calculations. What you’re trying to do is more suited for GPU, for example, where you have massive number of parallel execution units that can do this simple calculations all at ones in an instant. But you would need other languages and APIs to do that.

What you can do is to create software thread of execution for every hardware thread of execution. You want to split your max variable into big chunks and execute them in parallel. Here’s very simple take on it just to illustrate the idea:

package main

import (
  "fmt"
  "math"
  "strconv"
  "time"
  "runtime"
)

func main() {
  var input string
  var sum float64
  var pi float64
  c := make(chan float64, runtime.GOMAXPROCS(-1))
  fmt.Print("How many iterations? ")
  fmt.Scanln(&input)
  max,err := strconv.Atoi(input)

  if err != nil {
    panic("You did not enter a valid integer")
  }
  start := time.Now() //start timing execution of concurrent routine

  for i := 0; i < runtime.GOMAXPROCS(-1); i++ {
    go func(i int){
      var sum float64
      for j := 0; j < max/runtime.GOMAXPROCS(-1); j++  {
        sum += chudnovskySync(j + i*max/runtime.GOMAXPROCS(-1))
      }
      c <- sum
    }(i)
  }

  for i := 0; i < runtime.GOMAXPROCS(-1); i++ {
    sum += <-c
  }
  end := time.Now() //end of concurrent routine
  fmt.Println("Duration of concurrent calculation: ",end.Sub(start))
  pi = 1/(12*sum)
  fmt.Println(pi)

  start = time.Now() //start timing execution of syncronous routine
  sum = 0
  for i := 0; i < max; i++ {
    sum += chudnovskySync(i)
  }
  end = time.Now() //end of syncronous routine
  fmt.Println("Duration of synchronous calculation: ",end.Sub(start))
  pi = 1/(12*sum)
  fmt.Println(pi)
}

func chudnovskySync(i int) (r float64) {
  var numerator float64
  var denominator float64
  ifloat := float64(i)
  iun := uint64(i)
  numerator = math.Pow(-1, ifloat) * float64(factorial(6*iun)) * (545140134*ifloat+13591409)
  denominator = float64(factorial(3*iun)) * math.Pow(float64(factorial(iun)),3) * math.Pow(math.Pow(640320,3),ifloat+0.5)
  r = numerator/denominator
  return
}

func factorial(n uint64) (res uint64) {
  if ( n > 0 ) {
    res = n * factorial(n-1)
    return res
  }

  return 1
}

And here’s the results

$ go version
go version go1.5.2 windows/amd64

$ go run main.go
GOMAXPROCS = 4
How many iterations? 10000
Duration of concurrent calculation:  932.8916ms
NaN
Duration of synchronous calculation:  2.0639744s
NaN 

Answered By – creker

Answer Checked By – Jay B. (GoLangFix Admin)

Leave a Reply

Your email address will not be published.