# RTL Simulation of the GCD Core

## Introduction

This is part 1 of the tutorial series, entitled, Easy FPGA and Embedded Linux on ZYBO. In this tutorial, we are going to build the GCD accelerator core in Verilog. After that, we will simulate the GCD core. We are going to use Xilinx Vivado 2016.4 for this tutorial series. The GCD algorithm in this tutorial is based on the code from the book Embedded SoPC Design with Nios II Processor and Verilog Examples, page 663-679, by Pong P. Chu. You can get the full source code of this part from here.

## GCD Algorithm

Greatest common divisor (GCD) between two numbers is the largest number that divides them without remainder. We have gcd(a, b). This gcd function returns the largest number that divides both a and b without remainder. For example, gcd(35, 25) is 5, and gcd(128, 72) is 8.

We will use the binary GCD algorithm. It is shown in the following figure. This algorithm uses only subtraction and divide-by-2 operations. It has six equations that should be applied repetitively until a is equal to b (equation 1) . For example, this is step-by-step how to calculate gcd(24, 15):

• gcd(24, 15), apply equation 4, then the result is,
• gcd(12, 15), apply equation 4, then the result is,
• gcd(6, 15), apply equation 4, then the result is,
• gcd(3, 15), apply equation 6, then the result is,
• gcd(3, 12), apply equation 3, then the result is,
• gcd(3, 6), apply equation 3, then the result is,
• gcd(3, 3), apply equation 1, then the result is, 3

## Software Implementation

Now that we have the GCD algorithm, the next step is to implement this in C program. Later on, we can use this C program for both bare metal application and Linux application. This C program is used for verification of the hardware implementation and performance comparison between hardware and software.

The C program is shown in the following listing:

In equation 2, we should multiply the gcd result by 2. In C program, it is implemented by counting the occurrences this condition, saved in a variable, n. At the end, in line 31, we should multiply the result by 2n. Note that multiplying by 2 can be done by shifting it to the left once. Then multiplying by 2n is equal to left shifting n times.

## Hardware GCD Core

Now that we have the C program for the GCD algorithm. Based on this program, we can build the Verilog code. The ASMD chart of the GCD algorithm is shown in the following figure.

The state machine has two states: S_IDLE and S_OP. In S_IDLE state, it waits until the start signal is equal to 1. In this state the ready signal is also 1 meaning that it is ready to accept new inputs. In S_OP state, it calculates the GCD using the same algorithm as in the C program. After the calculation is finished, it goes back to the S_IDLE state. The Verilog code for this GCD core is shown in the following listing:

## Simulation Result

The timing diagram of the GCD core is shown in the following figure. In the idle state, the ready signal is 1. It indicates that the GCD core is ready to accept new inputs. When the GCD core is in the process of calculating GCD, the ready signal is 0, and it goes back to 1 when the process is finished. The testbench file of this simulation is shown in the following listing:

## Summary

In this tutorial, we have build a GCD core in Verilog. The hardware GCD algorithm is modeled using the ASMD chart. The simulation result shows that GCD core is work without any error.