习明昊 - 4 months ago
C Question

Why is cgo's performance so slow? Is there something wrong with my testing code?

I'm running a test that compares the execution times of a cgo function and a pure Go function, each called 100 million times. The cgo function takes much longer than the Go function, and I am confused by this result. My testing code is:

package main

import (
    "fmt"
    "time"
)

/*
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void show() {

}

*/
// #cgo LDFLAGS: -lstdc++
import "C"

//import "fmt"

func show() {

}

func main() {
    now := time.Now()
    for i := 0; i < 100000000; i = i + 1 {
        C.show()
    }
    end_time := time.Now()

    var dur_time time.Duration = end_time.Sub(now)
    var elapsed_min float64 = dur_time.Minutes()
    var elapsed_sec float64 = dur_time.Seconds()
    var elapsed_nano int64 = dur_time.Nanoseconds()
    fmt.Printf("cgo show function elasped %f minutes or \nelapsed %f seconds or \nelapsed %d nanoseconds\n",
        elapsed_min, elapsed_sec, elapsed_nano)

    now = time.Now()
    for i := 0; i < 100000000; i = i + 1 {
        show()
    }
    end_time = time.Now()

    dur_time = end_time.Sub(now)
    elapsed_min = dur_time.Minutes()
    elapsed_sec = dur_time.Seconds()
    elapsed_nano = dur_time.Nanoseconds()
    fmt.Printf("go show function elasped %f minutes or \nelapsed %f seconds or \nelapsed %d nanoseconds\n",
        elapsed_min, elapsed_sec, elapsed_nano)

    var input string
    fmt.Scanln(&input)
}


The result is:

cgo show function elasped 0.368096 minutes or
elapsed 22.085756 seconds or
elapsed 22085755775 nanoseconds

go show function elasped 0.000654 minutes or
elapsed 0.039257 seconds or
elapsed 39257120 nanoseconds


The results show that invoking the C function is far slower than invoking the Go function. Is there something wrong with my testing code?

My system is Mac OS X 10.9.4 (13E28).

Answer

As you've discovered, there is fairly high overhead in calling C/C++ code via CGo. So in general, you are best off trying to minimise the number of CGo calls you make. For the above example, rather than calling a CGo function repeatedly in a loop it might make sense to move the loop down to C.

There are a number of aspects of how the Go runtime sets up its threads that can break the expectations of many pieces of C code:

  1. Goroutines run on a relatively small stack, handling stack growth through segmented stacks (old versions) or by copying (new versions).
  2. Threads created by the Go runtime may not interact properly with libpthread's thread local storage implementation.
  3. The Go runtime's UNIX signal handler may interfere with traditional C or C++ code.
  4. Go reuses OS threads to run multiple goroutines. If the C code makes a blocking system call or otherwise monopolises the thread, it can starve other goroutines.

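For these reasons, CGo takes the safe approach of running the C code on a traditional, OS-allocated thread stack rather than on the goroutine's own stack, and of treating the call much like a blocking system call so that other goroutines can keep running. That switch on every call is where the overhead comes from.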

If you are coming from a language like Python, where it isn't uncommon to rewrite hotspots in C to speed up a program, you will be disappointed. At the same time, the performance gap between equivalent C and Go code is much smaller to begin with.

In general I reserve CGo for interfacing with existing libraries, possibly with small C wrapper functions that can reduce the number of calls I need to make from Go.