Design of Processor in Verilog

 

The project was carried out in the Verilog hardware description language, using Xilinx tools for compilation and synthesis. The processor was designed by creating Verilog modules for the radix-4 butterfly, twiddle factor ROM, commutator, complex multiplier, complex adder/subtractor, and controller. Once each unit was completed, a test bench was created to verify its functionality at the block level. As modules were completed, an input RAM was created and the parts of the processor were connected in a top-level module. Finally, a test bench containing the signals that load the RAM, together with instances of the input data RAM and the processor module, was created to test the complete processor.

 

Radix-4 Butterfly

Figure 12: Radix-4 Butterfly Block

 

| Pin Name | Description | Pin Name | Description |
|----------|-------------|----------|-------------|
| In1 | Input value from RAM or previous stage | Out1 | Output to next stage |
| In2 | Input value from RAM or previous stage | Out2 | Output to next stage |
| In3 | Input value from RAM or previous stage | Out3 | Output to next stage |
| In4 | Input value from RAM or previous stage | Out4 | Output to next stage |

The radix-4 butterfly Verilog module was designed as shown in Fig 6 of the FFT Architecture section. Four complex additions and four complex subtractions are implemented by instantiating the complex adder/subtractor module. The scaling by ±j shown in the bottom portion of Fig X is accomplished by exchanging the real and imaginary parts of the incoming data and then inverting the sign of either the real part or the imaginary part.
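The swap-and-negate trick can be illustrated with a minimal sketch. It assumes signed fixed-point data and inline arithmetic, whereas the design described here instantiates floating-point adder/subtractor modules. The module name, the split `_re`/`_im` port names, the width parameter `W`, and the assignment of results to Out1 through Out4 are all hypothetical.

```verilog
// Illustrative radix-4 butterfly (assumption: signed fixed-point, not the
// floating-point adder/subtractor instances used in the actual design).
module radix4_butterfly #(
    parameter W = 16                       // hypothetical width of each real/imag part
) (
    input  signed [W-1:0] in1_re, in1_im, in2_re, in2_im,
    input  signed [W-1:0] in3_re, in3_im, in4_re, in4_im,
    output signed [W+1:0] out1_re, out1_im, out2_re, out2_im,
    output signed [W+1:0] out3_re, out3_im, out4_re, out4_im
);
    // First rank: two complex additions and two complex subtractions.
    wire signed [W:0] s13_re = in1_re + in3_re;
    wire signed [W:0] s13_im = in1_im + in3_im;
    wire signed [W:0] s24_re = in2_re + in4_re;
    wire signed [W:0] s24_im = in2_im + in4_im;
    wire signed [W:0] d13_re = in1_re - in3_re;
    wire signed [W:0] d13_im = in1_im - in3_im;
    wire signed [W:0] d24_re = in2_re - in4_re;
    wire signed [W:0] d24_im = in2_im - in4_im;

    // Scaling by -j: (x + jy) * (-j) = y - jx.
    // Exchange real and imaginary parts, then negate the new imaginary part.
    wire signed [W:0] t_re = d24_im;
    wire signed [W:0] t_im = -d24_re;

    // Second rank: two more complex additions and two more subtractions.
    assign out1_re = s13_re + s24_re;      // (in1 + in3) + (in2 + in4)
    assign out1_im = s13_im + s24_im;
    assign out2_re = d13_re + t_re;        // (in1 - in3) - j(in2 - in4)
    assign out2_im = d13_im + t_im;
    assign out3_re = s13_re - s24_re;      // (in1 + in3) - (in2 + in4)
    assign out3_im = s13_im - s24_im;
    assign out4_re = d13_re - t_re;        // (in1 - in3) + j(in2 - in4)
    assign out4_im = d13_im - t_im;
endmodule
```

No multiplier is needed for the ±j factor: it costs only wiring and one negation, and the two extra output bits absorb the growth from two ranks of addition.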

 

Twiddle Factor ROM

Figure 13: Twiddle Factor ROM Block

 

| Pin Name | Description |
|----------|-------------|
| index | Selecting output values |
| Enable | Enable output values |
| Out1 | Output to complex multiplier |
| Out2 | Output to complex multiplier |
| Out3 | Output to complex multiplier |

The twiddle ROM was implemented with case-statement assignments to the three outputs. The values for each case come from a lookup table generated for the four sets of values required to form the complex twiddles. The radix-4 algorithm requires 16 twiddles, but because of the symmetry of the 16 values it is possible to store just half the numbers in four sets. The desired outputs are selected by the "index" input provided by the controller.
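A case-statement ROM of this kind might look like the following sketch. Several things here are assumptions not stated in the text: a 2-bit `index`, 16-bit signed fixed-point constants with 14 fractional bits (the actual design uses floating point), zero output when disabled, and the mapping of set n to the twiddles W^n, W^2n, W^3n with W = exp(-j2π/16).

```verilog
// Illustrative twiddle ROM for a 16-point radix-4 FFT.
// Assumptions: 2-bit index, Q2.14 fixed-point constants, outputs forced to
// zero when enable is low. The original design stores floating-point values.
module twiddle_rom (
    input  wire               enable,
    input  wire        [1:0]  index,
    output reg  signed [15:0] out1_re, out1_im,
    output reg  signed [15:0] out2_re, out2_im,
    output reg  signed [15:0] out3_re, out3_im
);
    // round(x * 2^14) for x = 1, cos(pi/8), cos(pi/4), cos(3*pi/8)
    localparam signed [15:0] ONE = 16'sd16384;
    localparam signed [15:0] C1  = 16'sd15137;
    localparam signed [15:0] C2  = 16'sd11585;
    localparam signed [15:0] C3  = 16'sd6270;

    always @* begin
        // Default (disabled): all outputs zero.
        out1_re = 16'sd0; out1_im = 16'sd0;
        out2_re = 16'sd0; out2_im = 16'sd0;
        out3_re = 16'sd0; out3_im = 16'sd0;
        if (enable) begin
            case (index)
                2'd0: begin                       // W^0, W^0, W^0
                    out1_re = ONE;  out1_im = 16'sd0;
                    out2_re = ONE;  out2_im = 16'sd0;
                    out3_re = ONE;  out3_im = 16'sd0;
                end
                2'd1: begin                       // W^1, W^2, W^3
                    out1_re = C1;   out1_im = -C3;
                    out2_re = C2;   out2_im = -C2;
                    out3_re = C3;   out3_im = -C1;
                end
                2'd2: begin                       // W^2, W^4, W^6
                    out1_re = C2;    out1_im = -C2;
                    out2_re = 16'sd0; out2_im = -ONE;
                    out3_re = -C2;   out3_im = -C2;
                end
                2'd3: begin                       // W^3, W^6, W^9
                    out1_re = C3;   out1_im = -C1;
                    out2_re = -C2;  out2_im = -C2;
                    out3_re = -C1;  out3_im = C3;
                end
            endcase
        end
    end
endmodule
```

Only four distinct magnitudes appear in the table; every other entry is a sign change or a real/imaginary exchange of those, which is the symmetry that lets the stored set be reduced.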

 

Commutator

Figure 14: Commutator Block

 

| Pin Name | Description |
|----------|-------------|
| index | Selecting output patterns |
| Enable | Enable output values |
| In1 | Input from butterfly |
| In2 | Input from butterfly |
| In3 | Input from butterfly |
| In4 | Input from butterfly |
| Out1 | Output to complex multiplier |
| Out2 | Output to complex multiplier |
| Out3 | Output to complex multiplier |
| Out4 | Output to complex multiplier |

The behavior of the switch was specified by a case statement that selects among four possible patterns for routing inputs to outputs. The case condition is the "index" input generated by the controller.
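As a rough sketch of such a switch, the module below routes four inputs to four outputs under a 2-bit `index`. The text does not specify the four patterns, so the cyclic rotations used here are purely an assumption, as are the packed word width `W` and the choice to output zeros when disabled.

```verilog
// Illustrative commutator: four routing patterns selected by index.
// Assumption: the patterns are cyclic rotations; the actual patterns used
// in the design are not given in the text.
module commutator #(
    parameter W = 32                      // hypothetical packed complex word width
) (
    input  wire         enable,
    input  wire [1:0]   index,
    input  wire [W-1:0] in1, in2, in3, in4,
    output reg  [W-1:0] out1, out2, out3, out4
);
    always @* begin
        // Default (disabled): drive zeros so no latches are inferred.
        {out1, out2, out3, out4} = {(4*W){1'b0}};
        if (enable) begin
            case (index)
                2'd0: {out1, out2, out3, out4} = {in1, in2, in3, in4};
                2'd1: {out1, out2, out3, out4} = {in2, in3, in4, in1};
                2'd2: {out1, out2, out3, out4} = {in3, in4, in1, in2};
                2'd3: {out1, out2, out3, out4} = {in4, in1, in2, in3};
            endcase
        end
    end
endmodule
```

Because every `index` value is covered and the outputs have a default assignment, the block stays purely combinational.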

 

Complex Adder/Subtractor and Multiplier

Figure 15: Complex Adder/Subtractor and Multiplier Block

 

| Pin Name | Description |
|----------|-------------|
| out | Output value |
| In1 | Input value |
| In2 | Input value |

The complex arithmetic units were built from floating-point adder and multiplier designs: each complex operation is realized as a combination of real floating-point additions and multiplications.
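The way real operations combine into a complex product, (a + jb)(c + jd) = (ac - bd) + j(ad + bc), can be sketched as follows. This version substitutes signed fixed-point arithmetic for the floating-point units of the actual design; the module name, port names, and the parameters `W` and `FRAC` are hypothetical.

```verilog
// Illustrative complex multiplier: four real multiplies, one subtract, one add.
// Assumption: signed fixed-point with FRAC fractional bits in operand b
// (for example a twiddle factor); the original design uses floating point.
module complex_mult #(
    parameter W    = 16,                  // hypothetical width of each part
    parameter FRAC = 14                   // hypothetical fractional bits of b
) (
    input  signed [W-1:0] a_re, a_im,     // data sample
    input  signed [W-1:0] b_re, b_im,     // scaled coefficient
    output signed [W-1:0] p_re, p_im
);
    // Full-precision partial products.
    wire signed [2*W-1:0] rr = a_re * b_re;
    wire signed [2*W-1:0] ii = a_im * b_im;
    wire signed [2*W-1:0] ri = a_re * b_im;
    wire signed [2*W-1:0] ir = a_im * b_re;

    // Real part: ac - bd. Imaginary part: ad + bc.
    wire signed [2*W:0] sum_re = rr - ii;
    wire signed [2*W:0] sum_im = ri + ir;

    // Remove the coefficient scaling. The result is truncated to W bits
    // without rounding or saturation, so inputs need headroom.
    assign p_re = sum_re >>> FRAC;
    assign p_im = sum_im >>> FRAC;
endmodule
```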

 

Controller

Figure 16: Controller Block

 

| Pin Name | Description |
|----------|-------------|
| clk | Clock |
| go | Starts the process |
| reset | Asynchronous reset |
| oe_ram_address | Sends output address to RAM |
| rom | Sends output address to ROM |
| Switch2 | Sends patterns to switch |
| En_ram_oe | Enables output from RAM |
| En_rom | Enables output from ROM |
| En_switch2 | Enables switch2 |
| run | Indicates the calculation is in progress |
| stop | Indicates the calculation is finished |

The controller generates clock-synchronous inputs for the twiddle ROM, the commutators, and the external input RAM. All signals are derived from a counter: each is specified by the counter state that corresponds to the point in the FFT cycle at which the twiddle factors and switching patterns change.

The counter starts counting from zero when a go signal is received and resets upon a stop signal. The output to the RAM is asserted during cycles 0 through 15, specifying the output of one sample per cycle for a 16-point transform. The signals to the ROMs carry the current count of the processing cycle, which is decoded into the appropriate addresses for the required twiddle factors. The commutator receives a similar count, which it decodes into the appropriate switching pattern for the given cycle. Additionally, run and stop signals are generated at the beginning and end of processing for convenience when the processor is used as a module in a larger design.
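The counter behavior described above might be sketched as below. This is a simplified assumption-laden version: it covers only the RAM read phase (cycles 0 through 15), treats reset as active high, and pulses stop for one cycle right after the last read. The ROM and switch outputs are omitted because their decode depends on pipeline timing that the text does not give, and a real controller would keep counting through the pipeline latency before asserting stop.

```verilog
// Illustrative counter-based controller (RAM read phase only).
// Assumptions: active-high asynchronous reset, one-cycle stop pulse,
// no modelling of pipeline latency, ROM/switch decode omitted.
module fft_controller (
    input  wire       clk,
    input  wire       reset,
    input  wire       go,
    output reg  [3:0] oe_ram_address,     // RAM read address, 0 through 15
    output reg        en_ram_oe,          // RAM output enable
    output reg        run,                // high while the count is in progress
    output reg        stop                // one-cycle pulse when the count ends
);
    always @(posedge clk or posedge reset) begin
        if (reset) begin
            oe_ram_address <= 4'd0;
            en_ram_oe      <= 1'b0;
            run            <= 1'b0;
            stop           <= 1'b0;
        end else begin
            stop <= 1'b0;                          // stop lasts one cycle
            if (!run) begin
                if (go) begin                      // start counting from zero
                    run            <= 1'b1;
                    en_ram_oe      <= 1'b1;
                    oe_ram_address <= 4'd0;
                end
            end else if (oe_ram_address == 4'd15) begin
                run            <= 1'b0;            // last sample has been read
                en_ram_oe      <= 1'b0;
                stop           <= 1'b1;
                oe_ram_address <= 4'd0;
            end else begin
                oe_ram_address <= oe_ram_address + 4'd1;
            end
        end
    end
endmodule
```

In a fuller version, the same count would also be driven out on the rom and Switch2 pins so that the twiddle ROM and commutator can decode their own patterns from it.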

 

Copyright (C)2004 CDS Technology Inc., All rights reserved.
