Design and Analysis of Hybrid Tree Multipliers for Reduction of Partial Products

This paper discusses one of the three phases of tree multipliers, i.e. the Partial Product reduction phase. Four types of hybrid tree multipliers are studied and proposed, using parallel counters (full adders and half adders) to reduce the Partial Products in the multiplication operation. After ANDing the bits of the multiplier and multiplicand, the Partial Products are arranged into two groups for reduction. Each group uses a different reduction technique, resulting in fewer gates than the parent tree reduction techniques. The results of the proposed tree reduction techniques are then tabulated and compared with the parent tree multipliers. The performance comparison is made in terms of the gate counts of the half adders and full adders used in the Partial Product reduction phase. The four hybrid tree multipliers are built from the CSA (Carry Save Adder) Array multiplier, Wallace Tree multiplier, Modified Wallace Tree multiplier and Dadda Tree multiplier. The results show a significant reduction in the number of full adders and half adders, with the slight overhead of a larger final addition stage in the hybrid multiplier. The proposed multipliers can be a better choice for digital signal processing designs, image processing designs and processor architectures.

INTRODUCTION
Multiplication produces a Partial Product matrix, which is reduced and then added to produce the final result [6]. Any multiplier unit has three phases: (i) generation of Partial Products, (ii) reduction of Partial Products and (iii) final addition. An NxN-bit multiplier needs N^2 AND gates for multiplication, forming N rows of the Partial Product matrix with N bits each, and the final sum is a 2N-bit result [7].
The Partial Products are generated by ANDing every bit of the multiplier with all the bits of multiplicand [8].
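The generation step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' hardware design: the function names `partial_products` and `matrix_value` are hypothetical, and bits are modeled as Python integers rather than gates.

```python
def partial_products(a: int, b: int, n: int) -> list[list[int]]:
    """Build the N x N Partial Product matrix for unsigned n-bit operands.

    Row i holds the multiplicand bits ANDed with multiplier bit i,
    so the matrix uses n * n AND operations in total.
    """
    rows = []
    for i in range(n):
        b_i = (b >> i) & 1  # bit i of the multiplier
        rows.append([((a >> j) & 1) & b_i for j in range(n)])
    return rows


def matrix_value(rows: list[list[int]]) -> int:
    """Sum the matrix; the bit at rows[i][j] carries weight 2**(i + j)."""
    return sum(bit << (i + j)
               for i, row in enumerate(rows)
               for j, bit in enumerate(row))


pp = partial_products(13, 11, 4)
assert matrix_value(pp) == 13 * 11  # the matrix encodes the full product
```

Summing the weighted bits recovers the product, which is why the later phases only need to reduce and add this matrix.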
After the Partial Products are generated, the Partial Product matrix is reduced to two rows. This is achieved by splitting the Partial Products into groups of three or two bits, which are then reduced using CSAs. The third step in multiplication is the addition of the two reduced rows of Partial Products using a CPA (Carry Propagate Adder) to obtain the final product. The multiplication process can be accelerated in each of these three steps.
Multipliers can be optimized in the generation of Partial Products, the reduction of the Partial Product matrix and the addition of the final two rows of the reduced Partial Product matrix.
The second step, i.e. Partial Product reduction, is the main factor that determines the performance of the multiplier, as it contributes most to the overall delay.
In this paper, four types of Hybrid Tree Multipliers are proposed and designed using the conventional CSA Array, Wallace Tree, Dadda Tree and Modified Wallace Tree Reduction Schemes. The proposed designs exploit the advantages of their parent multipliers and reduce the number of compressors in the Partial Product reduction phase of tree multipliers for fast, low-area applications such as signal and image processing.

RELATED WORK
A variety of multipliers have been discussed and proposed in the literature to obtain an efficient design [9]. In 1964 Wallace proposed a column compression scheme for fast multiplication whose total delay is proportional to the logarithm of the operand word length [10]. The column compression multiplier is faster than the array multiplier because, in an array multiplier, the delay increases linearly with the operand word length. In 1965 Dadda refined the scheme proposed by Wallace, reducing the number of adders used in column compression [11]. Baugh

MATERIALS AND METHOD
A fast, area-efficient multiplier is important in any design and signal processing application. For this purpose, this paper proposes hybrid multiplier designs for the reduction of Partial Products (PPs).

Carry Save Array
When several numbers are added consecutively, there is no need to propagate the carries through each addition.
As an alternative, the carries produced in each addition can be kept as partial carries for the next column. In the first stage, the CSA array adds a set of three or two Partial Products and produces a partial sum and a partial carry. In the second stage, the same approach is applied to the partial sum and partial carry from the earlier stage together with the next operand, producing a new partial sum and partial carry, and so on until the whole matrix of Partial Products is reduced to two rows.
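The carry-save idea can be illustrated at word level in Python. This is a sketch under simplifying assumptions: rows are modeled as shifted integers rather than individual adders, and the names `carry_save_add`, `reduce_rows` and `csa_array_multiply` are hypothetical.

```python
def carry_save_add(x: int, y: int, z: int) -> tuple[int, int]:
    """3:2 compression of three words into a partial sum and partial carry.

    No carry ripples between bit positions: each carry bit is simply
    moved one column to the left, so x + y + z == s + c.
    """
    s = x ^ y ^ z                                # per-column sum bits
    c = ((x & y) | (x & z) | (y & z)) << 1       # per-column carries, shifted
    return s, c


def reduce_rows(rows: list[int]) -> list[int]:
    """Reduce a list of rows to at most two, one row per stage."""
    rows = list(rows)
    while len(rows) > 2:
        s, c = carry_save_add(rows[0], rows[1], rows[2])
        rows = [s, c] + rows[3:]                 # feed sum/carry to next stage
    return rows


def csa_array_multiply(a: int, b: int, n: int) -> int:
    """Multiply unsigned n-bit a and b via carry-save reduction."""
    rows = [(a << i) if (b >> i) & 1 else 0 for i in range(n)]
    return sum(reduce_rows(rows))                # final carry propagate addition


assert csa_array_multiply(13, 11, 4) == 143
```

Only the last step, the sum of the two remaining rows, requires carries to propagate across the word.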

Wallace Tree Reduction
In 1964, the Australian computer scientist Chris Wallace proposed a method for rapid multiplication based on adding the Partial Product matrix in parallel by means of a tree of counters (most commonly 3:2 counters), which is known as the Wallace tree multiplier [6].
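A bit-level sketch of a conventional Wallace reduction is given below. It assumes the common textbook formulation: rows are grouped in threes, a full adder is used where a group has three bits in a column, a half adder where it has two, and leftover rows pass through unchanged. The function names and the dictionary row representation are illustrative assumptions, and the adder counts it returns are not claimed to match the paper's tables.

```python
def pp_rows(a: int, b: int, n: int) -> list[dict[int, int]]:
    """Partial Product rows as {column: bit}; row i occupies columns i..i+n-1."""
    return [{i + j: ((a >> j) & 1) & ((b >> i) & 1) for j in range(n)}
            for i in range(n)]


def wallace_reduce(rows):
    """Reduce rows to at most two; return (rows, full_adders, half_adders)."""
    fa = ha = 0
    while len(rows) > 2:
        nxt = []
        full = len(rows) - len(rows) % 3         # rows covered by groups of 3
        for g in range(0, full, 3):
            group = rows[g:g + 3]
            sum_row, carry_row = {}, {}
            for col in sorted(set().union(*group)):
                bits = [r[col] for r in group if col in r]
                if len(bits) == 3:               # full adder (3:2 counter)
                    x, y, z = bits
                    sum_row[col] = x ^ y ^ z
                    carry_row[col + 1] = (x & y) | (x & z) | (y & z)
                    fa += 1
                elif len(bits) == 2:             # half adder (2:2 counter)
                    x, y = bits
                    sum_row[col] = x ^ y
                    carry_row[col + 1] = x & y
                    ha += 1
                else:                            # single bit passes through
                    sum_row[col] = bits[0]
            nxt += [sum_row, carry_row]
        rows = nxt + rows[full:]                 # ungrouped rows carry over
    return rows, fa, ha


def rows_value(rows) -> int:
    """Weighted sum of all bits, i.e. what the final adder would produce."""
    return sum(bit << col for r in rows for col, bit in r.items())


final_rows, n_fa, n_ha = wallace_reduce(pp_rows(13, 11, 4))
assert len(final_rows) <= 2 and rows_value(final_rows) == 143
```

The adder counts depend only on the shape of the matrix, not on the operand values, so one run per word length is enough to tabulate them.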

Dadda Tree Reduction
In 1965, Dadda presented an alternative multiplier that is similar to the Wallace scheme but performs less reduction at each stage, i.e. it reduces only when required to meet the height limit of the next stage [11].
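The following sketch counts the adders used by a Dadda reduction, assuming the usual stage height limits 2, 3, 4, 6, 9, ... (each limit is the floor of 1.5 times the previous one). It tracks only column heights, not bit values, and the function name `dadda_counts` is hypothetical.

```python
def dadda_counts(n: int) -> tuple[int, int, list[int]]:
    """Return (full_adders, half_adders, final_heights) for an n x n multiplier."""
    # Column j of the Partial Product matrix initially holds this many bits.
    heights = [min(j + 1, n, 2 * n - 1 - j) for j in range(2 * n - 1)]
    heights.append(0)                      # spare column for a top carry

    # Stage height limits below n: 2, 3, 4, 6, 9, ...
    targets = [2]
    while targets[-1] * 3 // 2 < n:
        targets.append(targets[-1] * 3 // 2)

    fa = ha = 0
    for d in reversed(targets):            # largest limit first
        carry_in = 0
        for j in range(len(heights)):
            h = heights[j] + carry_in      # own bits plus carries from column j-1
            carry_out = 0
            while h > d:                   # reduce only as far as required
                if h == d + 1:
                    ha += 1                # half adder removes one bit
                    h -= 1
                else:
                    fa += 1                # full adder removes two bits
                    h -= 2
                carry_out += 1
            heights[j] = h
            carry_in = carry_out
    return fa, ha, heights


fa, ha, final = dadda_counts(8)
assert (fa, ha) == (35, 7) and max(final) == 2
```

For N of 3 or more this model agrees with the well-known closed forms of N^2 - 4N + 3 full adders and N - 1 half adders for a Dadda reduction.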

Modified Wallace Tree Reduction
The Modified Wallace Tree multiplier uses half adders and full adders for Partial Product matrix reduction, like the conventional Wallace Tree multiplier, but

Methodology
The multiplication process of the proposed NxN hybrid multipliers starts with the production of the Partial Product matrix using N^2 AND gates. Half of the rows of the Partial Product matrix are reduced to two rows using X reduction

RESULTS AND DISCUSSION
The second phase of multiplication, i.e. Partial Product reduction, is analyzed for 8, 12 and 16 bits using the hybridization of the conventional CSA Array, Wallace Tree, Dadda Tree and Modified Wallace Tree Reduction Multipliers. The results shown in Fig. 7 reveal that the proposed Hybrid Reduction Multipliers use significantly fewer full adders and half adders, and hence fewer total gates in the reduction phase of multiplication, than the parent reduction techniques. The number of full adders, half adders and the overall number of gates for the parent multipliers are given in Table 1 for reference.
The total gate count shown in Tables 1-2 is obtained from gate-level designs, where each full adder is realized using nine gates and each half adder using four gates (each XOR gate counts as three gates). Only the reduction phase is included in the tables; neither the N^2 AND gates of Partial Product generation nor the carry propagate adder of the final sum is included.
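The gate-count convention above reduces to a simple weighted sum, sketched here in Python. The decomposition of the adders into XOR, AND and OR gates is an assumption (the standard two-XOR, two-AND, one-OR full adder and one-XOR, one-AND half adder), chosen because it reproduces the stated nine and four gates. The example inputs are illustrative and are not taken from Tables 1-2.

```python
XOR_GATES = 3                                  # each XOR counts as three gates

# Assumed structures that match the stated totals of 9 and 4 gates.
FULL_ADDER_GATES = 2 * XOR_GATES + 2 + 1       # 2 XOR + 2 AND + 1 OR = 9
HALF_ADDER_GATES = XOR_GATES + 1               # 1 XOR + 1 AND = 4


def reduction_gate_count(full_adders: int, half_adders: int) -> int:
    """Total gates in the reduction phase only (no AND array, no final adder)."""
    return full_adders * FULL_ADDER_GATES + half_adders * HALF_ADDER_GATES


# Illustrative input: 35 full adders and 7 half adders.
assert reduction_gate_count(35, 7) == 343
```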
The results of Table 2 (area, delay, etc.) calculated using Xilinx are presented in Table 3.