6577

# An Efficient Reverse Converter for The New High Dynamic Range 5-Moduli Set

## Xiaolan Lv\*, Ming Xiao

College of Computer and Electronic Information, Guangdong University of Petrochemical Technology, Maoming 525000, Guangdong, China \*Corresponding author, e-mail: 29090137@qq.com

### Abstract

In this paper, an efficient residue to binary converter design for the new high dynamic range moduli set  $\{2^{n}-1,2^{n}+1,2^{2n},2^{2n}+1,2^{2n-1}-1\}$  is presented. The reverse conversion in the four-moduli set  $\{2^{2n}, 2^{2n}+1, 2^{n}+1, 2^{n$ 

**Keywords**: reverse converter, residue number system (RNS), new Chinese remainder theorems (new CRT).

### Copyright © 2013 Universitas Ahmad Dahlan. All rights reserved.

### 1. Introduction

The conventional weighted number system has carry propagation among arithmetic operations, resulting in performance degradation in hardware computing systems. The residue number system (RNS) is an efficient alternative number system which has been an important research field in computer arithmetic for many decades. One of the key advantages of RNS is a carry-free number system, which can represent number in a non-weighted form. High speed and less hardware complexity could be achieved by decomposing a large binary number into a set of smaller residues [1-3]. RNS has drawn widespread attention for increasing performance of communication system, Digital Signal Processing such as Fourier transform (FFT), digital filter, and image processing.

The basic building blocks for a RNS needed residue arithmetic unit, binary-to-redidue converters, and residue-to-binary converters [4, 5]. Choice of the moduli set that includes relatively prime integers and designs of the reverse converter are important because these issues have basic effects on reverse converter's performance, they have long been the performance bottleneck for RNS [8-15]. Another important arithmetic problem is converting residue to binary number, which is the crucial for a RNS application. The traditional algorithms of reverse converter are Chinese Remainder Theorem (CRT) and Mixed-radix conversion (MRC). However, the use of CRT is unprofitable since it is involve a large modulo M operation, where M is the product of all moduli [6]. The MRC is a sequential process which requires times [7]. Recently, Wang has proposed New Chinese Remainder Theorem 1 (New CRT I) and New Chinese Remainder Theorem 2 (New CRT II), which are based on CRT and MRC [8]. The New CRTs algorithm eliminates the drawbacks of traditional CRT and MRC, resulting in notable improvement in hardware.

By far, many different moduli sets have been proposed, which have different properties, which according to dynamic range (DR), arithmetic operations and hardware implementation. In some high-performance computation systems, they demand more parallelism with larger dynamic range. A few of five-moduli set have been reported, Skavantzos proposed a five-moduli set  $\{2^{n}-1,2^{n},2^{n}+1,2^{n}+2^{(n+1)/2}+1,2^{n}-2^{(n+1)/2}+1\}$  [9], Mathew et al. also proposed a five-moduli set  $\{2^{3n}-1,2^{3n},2^{n}+1,2^{3n}+2^{(3n+1)/2}+1,2^{3n}-2^{(3n+1)/2}+1\}$  [10], they have the same disadvantages that all the moduli are not in the form of  $2^{n}$  or  $2^{n}\pm1$ , the multiplicative inverses are in poor formats, resulting in the speed of the arithmetic unit is restricted. Skavantzos and Stouraitis proposed a five-

moduli set  $\{2^{n+1}, 2^n-1, 2^n+1, 2^{n+1}-1, 2^{n+1}+1\}$  [11], Cao et al. also proposed a relatively balanced five-moduli set  $\{2^n-1, 2^n, 2^n+1, 2^{n+1}-1, 2^{n-1}-1\}$  [12], though all the moduli are in the form of  $2^n$  or  $2^n\pm 1$  with this two moduli sets, they are only co-prime for even values of n and dynamic range are only 5n bits , larger dynamic range moduli sets have proposed, which comprises non co-prime moduli [13].

In this paper, we propose a residue to binary converter design based on New CRT I and New CRT II with our new moduli set  $\{2^{n}-1,2^{n}+1,2^{2n},2^{2n}+1,2^{2n-1}-1\}$ , which is valid any value of n. These involving moduli of this moduli set are in the form of  $2^{n}$  or  $2^{n}\pm1$ , this makes the converter design in less hardware complexity. The proposed moduli set is derived from the moduli set  $\{2^{2n}, 2^{2n}+1, 2^{n}+1, 2^{n}+$ 

The rest of the paper is organized as follows: In section 2, we provide a brief introduction for the RNSs and New CRTs. In section 3, the new 5-moduli set is proposed, and an efficient reverse converter algorithms for the new five-moduli set that based on New CRTs is provided. Hardware implementation is presented in section 4. Evaluating the proposed reverse converter and compare the result in terms of conversion delay and hardware cost with other similar reverse converters in section 5. The paper is concluded in section 6

### 2. Background

In a residue number system (RNS), an integer *X* can be defined by a moduli set  $\{m_1, m_2, ..., m_n\}$ , which is consisted of a set of pairwise relatively prime integer numbers, i.e., gcd  $(m_i, m_j)=1$  for  $i\neq j$ . The dynamic range (DR) M is the product of the moduli,  $M = \prod_{i=1}^n m_i$ , A weighted representation *X* can be represented as  $X=(x_1, x_2, ..., x_n)$ , where:

$$x_{i} = |X|_{m_{i}} = X \mod m_{i} \qquad i = 1, 2, \dots n \quad 0 \le x_{i} < m_{i}$$
(1)

Then integer X has a unique n-tuple represention:  $X \xrightarrow{RNS} (x_1, x_2, \cdots , x_n) 0 \le X < M$ .

New Chinese Remainder Theorem 1 (New CRT I) [8]: For a moduli set  $\{m_1, m_2, ..., m_n\}$ , the weighted number X can be converted from its a set of residue number  $(x_1, x_2, ..., x_n)$ , it is presented using New CRT I as:

$$X = x_{1} + m_{1} \begin{vmatrix} k_{1}(x_{2} - x_{1}) + k_{2}m_{2}(x_{3} - x_{2}) + \cdots \\ + k_{n-1}m_{2}m_{3}\cdots m_{n-1}(x_{n} - x_{n-1}) \end{vmatrix}_{m_{2}m_{3}\cdots m_{n-1}m_{n}}$$
(2)

Where  $|k_1m_1|_{m_2m_3\cdots m_{n-1}m_n} = 1$ ,  $|k_2m_1m_2|_{m_3\cdots m_{n-1}m_n} = 1$ , ...,  $|k_2m_1m_2|_{m_3\cdots m_{n-1}m_n} = 1$ 

New Chinese Remainder Theorem 2(New CRT II) [8]: For a 2-moduli set  $\{m_1, m_2\}$ , where  $m_1 < m_2$ . the weighted number X can be converted from its residue representation  $(x_1, x_2)$ , it can be represented by New CRT II as follows:

$$X = x_2 + m_2 \left| k_0 (x_1 - x_2) \right|_{m_1}$$
(3)

$$|k_0 m_2|_{m_1} = 1 \tag{4}$$

Where,  $k_i$  is the multiplicative inverse.

## 3. Reverse Converter to 5-Moduli Set $\{2^{n}-1, 2^{n}+1, 2^{2n}, 2^{2n}+1, 2^{2n-1}-1\}$

In this section, we apple the New CRT-I and the New CRT-II to the proposed superset  $\{2^{n}-1,2^{n}+1,2^{2n},2^{2n}+1,2^{2n-1}-1\}$  to achieve a high-performance residue to binary converter, its dynamic range is 8n-1 bits. We denote the residues corresponding to this moduli set  $(m_1, m_2, m_3)$  $m_3, m_4, m_5$ ) as  $(x_1, x_2, x_3, x_4, x_5)$ .

A two-level conversion algorithm for the residue to binary conversion is presented. The moduli set  $S=\{2^{n}-1,2^{n}+1,2^{2n},2^{2n}+1,2^{2n-1}-1\}$  is decomposed into two subsets. i.e.,  $S_1=\{2^{2n}, 2^{2n}+1, 2^{n}+1, 2^{n}+$  $2^{n}+1$ ,  $2^{n}-1$  are pairwise prime when n is valid for any value. Therefore, it is necessary that prove the all moduli of S<sub>1</sub> are all pairwise prime to the moduli of  $2^{2n-1}$ -1 for any value of *n*. Theorem 1: The number of  $2^{n}$ -1, $2^{n}$ +1, $2^{2n}$ , $2^{2n}$ +1,and  $2^{2n-1}$ -1 are pairwise relatively prime

for any value of *n*.

Proof: According to the Euclid's Theorem gcd (a,b)=gcd $(b,a \mod b)$ , it is easy to find.

$$gcd(2^{2n-1}-1, 2^{n}-1) = gcd(2^{n}-1, 2^{n-1}-1) = gcd(2^{n-1}-1, 1) = 1$$

$$gcd(2^{2n-1}-1, 2^{n}+1) = gcd(2^{n}+1, -(2^{n-1}+1)) = gcd(-(2^{n-1}+1), 2) = 1$$

$$gcd(2^{2n-1}-1, 2^{2n}) = gcd(2^{2n}, -1) = 1$$

$$gcd(2^{2n}+1, 2^{2n-1}-1, 2) = gcd(2^{2n-1}-1, 3) = 1$$
(5)

It can be seen that all the greatest common divisors are equal to 1, therefore the four numbers  $2^{n}-1, 2^{n}+1, 2^{2n}$ , and  $2^{2n}+1$  are relatively prime to the modulo  $2^{2n-1}-1$ . At the beginning, we assume that an integer  $X_1$  is represented as  $(x_1, x_2, x_3, x_4)$  based on the

four-moduli set  $S_1 = \{2^{2n}, 2^{2n}+1, 2^n+1, 2^n-1\}$ . The final weighted X can be calculated from the residues  $(X_1, x_5)$  correspond to the second moduli set  $S_2 = \{2^{2n-1}-1, (2^n-1)(2^n+1)(2^{2n}+1)2^{2n}\}$ . In the first level of the converter, According to the conversion algorithm of [14], the weighted  $X_1 = (x_1, x_2, x_3, x_4)$  can be obtained by:

$$X_1 = x_1 + 2^{2n} Z (6)$$

*Z* is the output of moduo  $2^{4n}$ -1 adder, therefore Z is a 4n-bit integer, then X<sub>1</sub> has 6n bits. In the second level of the reverse converter, by applying (3) to the moduli set  $S_2$ ,  $S_2$  can be rewritten as:

$$S_{2} = \left(2^{2n} \left(2^{n} - 1\right)\left(2^{n} + 1\right)\left(2^{2n} + 1\right), 2^{2n-1} - 1\right) = \left(2^{2n} \left(2^{4n} - 1\right), 2^{2n-1} - 1\right)$$
(7)

X can be calculated by New CRT-II.

$$X = X_{1} + 2^{2n} (2^{4n} - 1) \left| k_{0} (x_{5} - X_{1}) \right|_{2^{2n-1} - 1}$$
(8)

Lemma 1: The multiplicative inverse of the number  $2^{2n}(2^{4n}-1)$  modulo  $2^{2n-1}-1$  is  $k_0$ .

$$k_0 = \frac{1}{3} \Big[ 2(2^{2n-1} - 1) + 1 \Big] \times 2^{2n-2}$$
(9)

*Proof.* According to (4), we have  $|k_0 \times 2^{2n} (2^{4n} - 1)|_{2^{2n-1}} = 1 \circ$ 

Then. (8) can be rewritten as:

$$X = X_{1} + 2^{2n} (2^{4n} - 1) \left| \frac{1}{3} \left[ 2(2^{2n-1} - 1) + 1 \right] \times 2^{2n-2} (x_{5} - X_{1}) \right|_{2^{2n-1} - 1}$$
(10)

Its original form of  $k_0$  is not amenable to hardware design, then it is expanded into geometrical series.

$$\frac{1}{3}[2(2^{2n-1}-1)+1)] = 2^{0} + 2^{2} + 2^{4} + 2^{6} + \dots + 2^{2n-2}$$
(11)

Finally, the weighted X can be represented as follow:

$$X = X_1 + 2^{2n} (2^{4n} - 1) \left| (2^0 + 2^2 + 2^4 + 2^6 + \dots + 2^{2n-2}) \times 2^{2n-2} (x_5 - X_1) \right|_{2^{2n-1} - 1}$$
(12)

## 4. Hardware Realization of Reverse Converter

Hardware frame of the proposed 5-moduli set  $S=\{2^{n}-1,2^{n}+1,2^{2n},2^{2n}+1,2^{2n-1}-1\}$  reverse converter is shown Figure 1.



Figure 1. Frame of the Proposed 5-moduli Set Reverse Converter

Hardware implementation for integer  $X_1$  from the residues ( $x_1$ ,  $x_2$ ,  $x_3$ ,  $x_4$ ) correspond to the four-moduli set  $S_1=\{2^{2n}, 2^{2n}+1, 2^n+1, 2^n-1\}$  has completed by [14]. The aim of this section is to implement the hardware architecture of X from the residues ( $X_1$ ,  $x_5$ ) correspond to the second moduli set  $S_2=\{2^{2n-1}-1, (2^n-1)(2^n+1)(2^{2n}+1)2^{2n}\}$ . Figure 2 shows the architecture of the residue to binary converter for the 2-moduli set  $S_2$ .



Figure 2. Architecture of Reverse Converter for the 2-moduli set S<sub>2</sub>

The (12) can be rewritten as:

$$X = X_1 + 2^{2n} (2^{4n} - 1)R$$
(13)

Where:

$$R = \left| (2^{0} + 2^{2} + \dots + 2^{2n-2}) \times 2^{2n-2} (x_{5} - X_{1}) \right|_{2^{2n-1} - 1} = \left| (2^{0} + 2^{2} + \dots + 2^{2n-2}) (r_{1} + r_{2}) \right|_{2^{2n-1} - 1}$$
(14)  
$$r_{1} = \left| 2^{2n-2} x_{5} \right|_{2^{2n-1} - 1}, r_{2} = \left| 2^{2n-2} (-X_{1}) \right|_{2^{2n-1} - 1}$$

Now, we consider Equation (13) and (14), and simplify them for efficient hardware implement. First, we show two *Lemmas*, they can be used to implement modulo  $2^{n}$ -1 adder by simply circular shift.

*Lemma 2*:  $|x2^n|_{2^m-1}$  is equivalent to circular lef shift a *m*-bits binary namber *x* by *n* bits [12]. *Lemma 3*:  $|-x2^n|_{2^m-1}$  is equivalent to circular lef shift a *m*-bits binary namber *x* by *n* bits , then complement of result [12].

Lemma 4:  $\left|2^{m_{s+i}}\right|_{2^{m}-1}$  is equivalent to  $\left|2^{i}\right|_{2^{m}-1}$ 

Where, *n* and *m* are any natural numbers.

Since  $X_1$  is a 6n-bits number and  $x_5$  is a (2n-1)-bits number, they are represented in their binary forms as follows:

$$X_{1} = \underbrace{X_{1,6n-1}X_{1,6n-2}\cdots X_{1,1}X_{1,0}}_{6n} \qquad \qquad x_{5} = \underbrace{x_{5,2n-2}x_{5,2n-3}\cdots x_{5,1}x_{5,0}}_{2n-1}$$

Where,  $x_{i,j}$  denote the jth of  $x_i$ . Then,  $r_1$  and  $r_2$  can be furthermore simply by lemma 2 and lemma 3.

$$r_{1} = \left| 2^{2n-2} x_{5} \right|_{2^{2n-1}-1} = \underbrace{x_{5,0} x_{5,2n-2} x_{5,2n-3} \cdots x_{5,1}}_{2n-1}$$
(15)

$$r_{2} = |2^{2n-2}(-X_{1})|_{2^{2n-1}-1}$$

$$= \begin{vmatrix} \overline{X'_{6n-3}} \underbrace{11\cdots 11}_{2n-4} \underbrace{\overline{X'_{6n-1}} X'_{6n-2}}_{2n-4} \\ + \underbrace{\overline{X'_{2n-1}} X'_{4n-3} \underbrace{X'_{4n-4}}_{2n-1} \\ + \underbrace{\overline{X'_{2n-1}} X'_{4n-3} \underbrace{X'_{4n-4}}_{2n-1} \\ - \underbrace{\overline{X'_{2n-1}} X'_{4n-3} \underbrace{X'_{4n-4}}_{2n-1} \\ - \underbrace{\overline{X'_{2n-1}} X'_{2n-1} }_{2n-1} \\ + \underbrace{\overline{X'_{2n-1}} X'_{4n-3} \underbrace{X'_{4n-4}}_{2n-1} \\ - \underbrace{\overline{X'_{2n-1}} X'_{2n-1} }_{2n-1} \\ - \underbrace{\overline{X'_{2n-1}} X'_{2n-1} }_{2n-1} \\ + \underbrace{\overline{X'_{2n-1}} X'_{4n-3} \underbrace{X'_{4n-4} \cdots \underbrace{X'_{2n}}_{2n-1} }_{2n-1} \\ - \underbrace{\overline{X'_{2n-1}} X'_{2n-1} }_{2n-1} \\ - \underbrace{\overline{X'_{2n-1}} X'_{4n-3} \underbrace{X'_{4n-4} \cdots \underbrace{X'_{2n}}_{2n-1} }_{2n-1} \\ - \underbrace{\overline{X'_{2n-1}} X'_{2n-1} \\ - \underbrace{\overline{X'_{2n-1}} X'_{2n-1} \\- \underbrace{\overline{X'_{2$$

Where,  $x_i$  is the complement of  $x_i$ , and  $r_{21}$ ,  $r_{22}$ ,  $r_{23}$ ,  $r_{24}$  can be represented in their binary forms. By substituting (15) and (16) into (14), R can be calculated by:

$$R = \left| (2^{0} + 2^{2} + 2^{4} + \dots + 2^{2n-2})(r_{1} + r_{21} + r_{22} + r_{23} + r_{24}) \right|_{2^{2n-1}-1}$$
(17)

By applying *Lemma 2, Lemma 3* and *Lemma 4* in Equation (15), (16) and (17), they can be simplified to decrease the hardware complexity. The Operand Preparation Unit (OPU) includes simply manipulating the routing of the bits and inverters of the residues that prepares the operands for modulo adder. The implementation of R requires a 5*n*-operand carry save adder (CSA) with end-around carry (EAC) [15] tree followed by a modulo  $2^{2n-1}$ -1 adder that is one (5*n*,  $2^{2n-1}$ -1) Multi-Operand Modular Adder (MOMA). In the paper, we considered modular adder that is the carry propagate adder (CPA) with end around carry (EAC) [16]. Furthermore, (13) can be rewritten as:

$$X = X_1 + R \times 2^{2n} (2^{4n} - 1) = X_1 + R \times 2^{6n} - R \times 2^{2n} = Y - R \times 2^{2n}$$
(18)

Where:

$$Y = X_{1} + R \times 2^{6n} = \underbrace{R_{2n-2}}_{2n-1} \underbrace{R_{2n-3}}_{2n-1} \cdots \underbrace{R_{1}}_{R_{0}} \underbrace{X_{1,6n-1}}_{1,6n-2} \underbrace{X_{1,1}}_{6n} \underbrace{X_{1,1}}_{1,1} \underbrace{X_{1,0}}_{6n}$$
(19)

It should be noted, because  $X_1$  is a 6*n*-bits number, (19) can be realized by simple concatenation without additional hardware and computation cost.

Finally, by substituting (19) in (18), we have:

$$X = Y_1 - V \times 2^{2n} = Y_1 + \underbrace{11\cdots}_{4n} \underbrace{1}_{2n-2} \underbrace{V_{2n-3}}_{2n-1} \cdots \underbrace{V_1}_{2n-1} \underbrace{V_1}_{2n} \underbrace{11\cdots}_{2n} \underbrace{1}_{2n} + 1$$
(20)

Realization of X relies on a (8n-1)-bits binary subtracter, which can be implemented rely on a (8n-1)-bits regular CPA with '1' carry-in and (2n-1) NOT gates.

**Example 1:** For *n*=8, the proposed 5-moduli set is S=(65536,65537,255,257, 32767), the weighted number X can be calculated from its RNS representation (121, 165, 2741, 1056, 30125), the residues have binary representation  $x_1 = (000010101010101)$ ,  $x_2$ =(00000010000100000),  $x_3$ =(01111001),  $x_4$ =(010100101) and  $x_5$ =(111010110101101). For the first level of the proposed reverse converter, the 4-moduli set  $S_1$ =(65536,65537,255,257), its equivalent weighted number  $X_1$ =(11944497348884)<sub>10</sub> that we can obtained by (6). For the converter, second level reverse 2-moduli of the proposed the  $S_2$ =(32767,255×257×65536×65537), according to (14)-(19), we have R=(30804) <sub>10</sub> and  $Y_1 = (8670674627568536245)_{10}$ , finally,  $X = (8670674625549765301)_{10}$  can be obtained by (20).

### 5. Performance Evaluation

In this section, the area and delay of proposed reverse converter for the new five-moduli superset are evaluated. According to Figure 1 and Figure 2, the proposed 5-moduli residue-tobinary converter consists of one four-moduli set converter, one  $(5n, 2^{2n-1}-1)$  MOMA for the calculation of R, one (8n-1)-bit binary subtracter and some inverters. The estimated hardware costs according to full adder (FA), and modular adder (mod add), subtractor (sub) and some logic gates. Full adders (FA's) with constant bits of 1's or 0's are reduced to pairs of XOR/AND or XNOR/OR gates. The total conversion delay and hardware costs are the sum of the residue-to-binary converter for the 4-moduli set S<sub>1</sub> and 2-moduli set S<sub>2</sub>. Table 1 shows the hardware costs and conversion delays of the proposed 5-moduli set reverse converters, where  $t_{FA}$  denote the delay of an FA and L denote the number of levels in an CSA tree, respectively.

| Mod set        | part         | FA                                                                                                   | NOT           | XOR/AND       | XNOR/OR       | Delay                          |  |  |
|----------------|--------------|------------------------------------------------------------------------------------------------------|---------------|---------------|---------------|--------------------------------|--|--|
| S <sub>1</sub> | OPU          | -                                                                                                    | 6 <i>n</i> +3 | -             | -             | t <sub>not</sub>               |  |  |
|                | CSA1         | 2 <i>n</i> +1                                                                                        | -             | 2 <i>n</i> -1 | -             | t <sub>FA</sub>                |  |  |
|                | CSA2         | 2 <i>n</i> +2                                                                                        | -             | 2 <i>n</i> -2 | -             | t <sub>FA</sub>                |  |  |
|                | CSA3         | 2 <i>n</i> +3                                                                                        | -             | -             | 2 <i>n</i> -3 | t <sub>FA</sub>                |  |  |
|                | CPA          | 4 <i>n</i>                                                                                           | -             | -             | -             | (8 <i>n</i> )t <sub>FA</sub>   |  |  |
|                | Area         | $(10n+6)A_{FA}+(4n-3)A_{xor}+(4n-3)A_{and}+(2n-3)A_{xnor}+(2n-3)A_{or}$<br>+(6n+3)A <sub>not</sub>   |               |               |               |                                |  |  |
|                | Delay        | $(8n+3)t_{FA}+t_{not}$                                                                               |               |               |               |                                |  |  |
| S <sub>2</sub> | OPU1         | -                                                                                                    | 6 <i>n</i>    | -             | -             | t <sub>NOT</sub>               |  |  |
|                | CSA          | 10 <i>n</i> ²-9 <i>n</i> +2                                                                          | -             | -             |               | $L_1$                          |  |  |
|                | CPA          | 2 <i>n</i> -1                                                                                        | -             | -             | -             | (4 <i>n</i> -2)t <sub>FA</sub> |  |  |
|                | OPU2         |                                                                                                      | 2 <i>n</i> -1 | -             | -             | t <sub>NOT</sub>               |  |  |
|                | 8n-1 bit sub | 8 <i>n</i> -1                                                                                        | -             | -             | -             | (8 <i>n</i> -1)t <sub>FA</sub> |  |  |
|                | Area         | $(10n^2+n)A_{FA}+(8n-1)A_{not}$                                                                      |               |               |               |                                |  |  |
|                | Delay        | L <sub>1</sub> +2 t <sub>NOT</sub> +(12 <i>n</i> -3)t <sub>FA</sub>                                  |               |               |               |                                |  |  |
| S              | Area         | a $(10n^2+11n+6) t_{FA}+(4n-3) t_{xor}+(4n-3) t_{and}+(2n-2) t_{xnor}+(2n-2) t_{or}+(14n+2) t_{not}$ |               |               |               |                                |  |  |
|                | Delay        | $3t_{NOT}+20nt_{FA}+L_1$                                                                             |               |               |               |                                |  |  |

Table 1. Hardware Characterization of Each Part of the Proposed Reverse Converer

The dynamic range of the proposed 5-moduli set has 8*n*-1 bits. Up to now, it is difficult to find a existing moduli sets that has the same dynamic range as the proposed 5-moduli set. Moduli

6583

sets of [10] and [12] have the same numbers of channels as our proposed moduli set, but have different dynamic range. We compare our moduli set with [10] and [12]. For a fair comparison, *n*-bit CPAs with EAC are considered for the implementation of the modulo  $2^n$ -1 adder for all converters. Table 2 shows the hardware costs and conversion delays of the proposed 5-moduli set reverse converters, [10] and [12]. It is clear from the table that proposed 5-moduli set reverse converters has a significant superiority in high dynamic range moduli set with efficient reverse converter.  $L_1$  and  $L_2$  are number of the levels of the CSA tree with (5n) input and ((n/2)+1) input respectively.

 Table 2. Hardware Complexity and Conversion Delay Comparison

| converter           | DR   | n    | area                                                                                                       | Delay                                                           |  |  |  |
|---------------------|------|------|------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------|--|--|--|
| [10]                | 15n  | odd  | >( 36n <sup>2</sup> +144 <i>n</i> ) A <sub>FA</sub>                                                        | >(27 <i>n</i> +9) <i>t</i> <sub>FA</sub>                        |  |  |  |
| [12]( <i>n</i> =6k) | 5n   | even | $\frac{1}{6}$ (5n <sup>2</sup> +145n-12) A <sub>FA</sub> n=6k                                              | (18 <i>n</i> + <i>L</i> <sub>2</sub> +7) <i>t</i> <sub>FA</sub> |  |  |  |
| Proposed            | 8n-1 | any  | $(10n^{2}+11n+6) A_{FA} + (4n-3)A_{xor} + (4n-3)A_{and} + (2n-2)A_{xnor} + (2n-2)A_{or} + (14n+2) A_{not}$ | $(3t_{NOT}+20nt_{FA}+L_1) t_{FA}$                               |  |  |  |
|                     |      |      |                                                                                                            |                                                                 |  |  |  |

## 6. Conclusion

This paper introduced the new high dynamic range 5-moduli set  $\{2^{n}-1,2^{n}+1,2^{2n}, 2^{2n}+1,2^{2n-1}-1\}$ , where *n* is any positive integer. The efficient reverse converter for the moduli set is presented that is based on the NEW CRTs and the converse algorithm for the four-moduli set  $\{2^{n}-1,2^{n}+1,2^{2n},2^{2n}+1\}$ . The overall speed of the arithmetic unit of RNS systems based on the moduli set  $S_1$  and S is restricted to the  $2^{2n}+1$  channel, but the proposed 5-moduli set reverse converter that its dynamic range has raised to 8n-1 bits.

## References

- [1] HL Garner. The Residue Number System. *IRE Transactions on Electronic Computers*. 1959; 8(6): 140-147.
- [2] NS Szabo, RJ Tanaka. Residue Arithmetic and its Applications to Computer Technology. New York: McGraw-Hill.1967; 1-20.
- [3] MA Soderstrand. Residue number system arithmetic: modern applications in digital signal processing. California: IEEE Press. 1986: 1-32.
- [4] B Parhami. Computer Arithmetic: Algorithms and Hardware Design. Oxford, U.K.: Oxford Univ. 2000.
- [5] PVA Mohan. Residue Number Systems: Algorithms and Architectures. Norwell, MA: Kluwer, 2002.
- [6] G Dimauro, S Impdedovo, G Pirlo. A new technique for fast number comparison in the residue number systems. *IEEE Trans. Comput.*, 1993; 42: 608–612.
- [7] M Lu, J Chiang. A novel division algorithm for the residue number system. *IEEE Trans. Comput.,* 1992; 41: 1026–1032.
- [8] Y Wang. New Chinese remainder theorems. Proc. 32th Asilomar Conf. Signals, Systems, Computers, 1998; 1: 165–171.
- [9] A Skavantzos. An efficient residue to weighted converter for a new residue number system. Proc. 8th Great Lakes Symp. VLSI, New Orleans, LA. 1998; 9: 185–191.
- [10] J Mathew, D Radhakrishnan, T Srikanthan. Fast residue-to-binary converter architectures. Proc. 4nd Midwest Symp. Circuits Syst. 1999; 2: 1090–1093.
- [11] A Skavantzos, T Stouraitis. *Grouped-moduli residue number systems for fast signal processing*. Proc. Int. Symp. Circuits Syst. 1999; 3: 478–483.
- [12] B Cao, CH Chang, T Srikanthan. A residue-to-binary converter for new five-moduli set. *IEEE Trans. Circuits Systems I.* 2007; 54(5): 1041–1049.
- [13] A Skavantzos, M Abdallah. Implementation issues of the two-level residue number system with pairs of conjugate moduli. *IEEE Trans. Signal Process.* 1999; 47(3): 826–838.
- [14] AS Molahosseini, K Navi, O Hashemipour, et al. Efficient reverse converter for the New 4-moduli sets {2<sup>n</sup>-1, 2<sup>n</sup>, 2<sup>n</sup>+1, 2<sup>2n+1</sup>-1} and {2<sup>n</sup>-1, 2<sup>n</sup>+1, 2<sup>2n</sup>, 2<sup>2n</sup>+1} based on the New CRTs. *IEEE Trans. on Circuits and Systems I.* 2010; 57(4): 823-835
- [15] SJ Piestrak. A high- speed realisation of residue to binary system conversion. IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process. 1995; 42(6): 661–663,
- [16] Patel RA, Benaissa M, Boussakta S. Fast parallel-prefix architectures for modulo 2<sup>n</sup>-1 Addition with a single representation of zero. *IEEE Transactions on Computers*. 2007; 56(11): 1484-1492.