Performance Comparison of the GCD Core


In the previous tutorial, we built a C program for testing the GCD core. In this tutorial, we improve that program so that it compares the performance of the hardware GCD against a software GCD implemented in C.

Get the full source code from here. Some of the code was adapted from the book Embedded SoPC Design with Nios II Processor and Verilog Examples, by Pong P. Chu.

Performance Comparison

First, we are going to write two functions: calc_gcd_hw (lines 88-103) and calc_gcd_sw (lines 39-86). The function calc_gcd_hw is similar to the implementation in the previous tutorial: it calculates the GCD using the GCD core. The difference is that this version also reads the performance counter, in line 100. The function calc_gcd_sw implements the software GCD. It is similar to the C code in tutorial part 1.
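The tutorial's calc_gcd_sw is not reproduced here. As a rough illustration of what a software GCD in C can look like, the sketch below uses the binary (Stein's) GCD algorithm. That choice is an assumption and may not match the tutorial's code, and the name gcd_sw_sketch is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper, NOT the tutorial's calc_gcd_sw.
 * Binary (Stein's) GCD: uses only shifts, comparisons, and subtraction. */
uint32_t gcd_sw_sketch(uint32_t a, uint32_t b)
{
    unsigned shift = 0;

    /* gcd(0, x) = x and gcd(x, 0) = x. */
    if (a == 0) return b;
    if (b == 0) return a;

    /* Factor out the powers of two shared by both operands. */
    while (((a | b) & 1u) == 0) {
        a >>= 1;
        b >>= 1;
        shift++;
    }

    /* Remove the remaining factors of two from a, so a is odd. */
    while ((a & 1u) == 0) {
        a >>= 1;
    }

    /* Invariant: a is odd. Reduce b until it reaches zero. */
    while (b != 0) {
        while ((b & 1u) == 0) {
            b >>= 1;
        }
        if (a > b) {            /* keep a <= b before subtracting */
            uint32_t t = a;
            a = b;
            b = t;
        }
        b -= a;
    }

    /* Restore the common powers of two. */
    return a << shift;
}
```

For example, gcd_sw_sketch(48, 18) strips one common factor of two, reduces the odd parts to 3, and returns 6.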

How the code works:

  • First, in the main function, we perform 10000 GCD calculations in both hardware and software.
  • In line 26, we check whether the hardware results match the software results.
  • In lines 28-29, we accumulate the clock cycles reported by the performance counter for both the hardware and the software GCD.
  • We measure the clock cycles of the software GCD by using the performance counter. This is done by taking two time stamps, one before and one after the GCD computation inside calc_gcd_sw, as shown in lines 54-82.
  • In lines 49-51, we calculate the overhead of the C code that accesses the performance counter, and we subtract it from the final counter value.
  • Finally, in lines 31-34, we print the average number of clock cycles and the accelerator speedup to the SDK terminal.
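The bookkeeping described in the steps above can be sketched in plain C. This is a minimal sketch, not the tutorial's code: it assumes a 32-bit free-running counter, and the names gcd_stats, elapsed_cycles, stats_add, stats_avg_hw, stats_avg_sw, and stats_speedup are all hypothetical. Reading the actual performance counter is left out because it depends on the hardware platform.

```c
#include <stdint.h>

/* Hypothetical running totals; field names are illustrative only. */
typedef struct {
    uint64_t hw_cycles;   /* accumulated hardware GCD cycles */
    uint64_t sw_cycles;   /* accumulated software GCD cycles */
    uint32_t runs;        /* number of GCD calculations recorded */
    uint32_t mismatches;  /* runs where hardware and software disagreed */
} gcd_stats;

/* Cycles between two time stamps with the measurement overhead removed.
 * Assumes a 32-bit counter: unsigned subtraction stays correct across
 * a single wrap-around of the counter. */
uint32_t elapsed_cycles(uint32_t start, uint32_t end, uint32_t overhead)
{
    uint32_t raw = end - start;
    return (raw > overhead) ? (raw - overhead) : 0u;
}

/* Record one hardware/software pair: compare results, accumulate cycles. */
void stats_add(gcd_stats *s, uint32_t hw_result, uint32_t sw_result,
               uint32_t hw_cycles, uint32_t sw_cycles)
{
    if (hw_result != sw_result) {
        s->mismatches++;
    }
    s->hw_cycles += hw_cycles;
    s->sw_cycles += sw_cycles;
    s->runs++;
}

/* Average hardware cycles per calculation (0.0 if nothing was recorded). */
double stats_avg_hw(const gcd_stats *s)
{
    return s->runs ? (double)s->hw_cycles / (double)s->runs : 0.0;
}

/* Average software cycles per calculation (0.0 if nothing was recorded). */
double stats_avg_sw(const gcd_stats *s)
{
    return s->runs ? (double)s->sw_cycles / (double)s->runs : 0.0;
}

/* Speedup = software cycles / hardware cycles (0.0 if no hardware cycles). */
double stats_speedup(const gcd_stats *s)
{
    return s->hw_cycles ? (double)s->sw_cycles / (double)s->hw_cycles : 0.0;
}
```

In a loop like the one in main, each iteration would call stats_add once with the two results and the two cycle counts, and the averages and speedup would be printed after the loop. Dividing the accumulated totals gives the same speedup as dividing the two averages, since both averages share the same run count.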

The following figure shows the result from the SDK terminal.


In this tutorial, we built a C program that compares the performance of the hardware GCD and the software GCD. We used the performance counter to count how many clock cycles each GCD calculation takes in hardware and in software. Finally, we calculated the average number of clock cycles and the accelerator speedup.

Next: Configure the Linux System, Ethernet Connection, and Python Libraries