# Low-Power Manhattan Distance Calculation Circuit for Self-Organizing Neural Networks Implemented in the CMOS Technology

Rafał Długosz<sup>1,2</sup>, Tomasz Talaśka<sup>1</sup>, Witold Pedrycz<sup>3</sup>, Pierre-André Farine<sup>2</sup>

1- Faculty of Telecommunication and Electrical Engineering University of Technology and Life Sciences ul. Kaliskiego 7, 85-796, Bydgoszcz, Poland

2- Institute of Microtechnology Swiss Federal Institute of Technology in Lausanne (EPFL) Rue A.-L. Breguet 2, CH-2000, Neuchâtel, Switzerland

3- Department of Electrical and Computer Engineering University of Alberta Edmonton, AB T6G 2V4, Canada

Abstract. The paper presents an analog, current-mode circuit that calculates the distance between the neuron weight vectors W and the input learning patterns X. The circuit can be used as a component of different self-organizing neural networks (NN) implemented in CMOS technology. The same circuit can calculate the distance between X and W in Self-Organizing Maps (SOM) as well as in NNs using the Neural Gas or the Winner Takes All (WTA) learning algorithms, which makes it a universal structure. Detailed system-level simulations of the WTA NN and the Kohonen SOM showed that the Euclidean (L2) and the Manhattan (L1) distance measures lead to similar learning results. For this reason the L1 measure has been implemented: in this case the circuit is much simpler than one using the L2 distance, resulting in very low chip area and low power dissipation. This makes it possible to include even large NNs in miniaturized portable devices, such as sensors in Wireless Sensor Networks (WSN) or Wireless Body Area Networks (WBAN).

## 1 Introduction

In today's world there is a strong demand for new medical health care systems that are able to provide continuous monitoring of persons who suffer from different disabilities, such as cardiovascular disease (CVD), diabetes, etc. One of the new emerging solutions suitable for such purposes is Wireless Body Area Networks (WBAN), as described by Latré *et al.* in [1]. The frequent monitoring of such persons that becomes possible in this case reduces, for example, the risk of sudden death due to a stroke. Looking at the problem from another point of view, the literature contains many examples of using artificial neural networks (ANNs) in the analysis of various biomedical signals [2],[3]. Due to their high efficiency, ANNs are able to aid medical staff in monitoring patients. However, as ANNs are usually realized in software, it is rather difficult to use them in monitoring systems based on WBAN.

The WBAN technology is currently being intensively developed around the world [1]. Such systems are composed of miniaturized wireless sensors placed on a human body that communicate with a base station (master processing unit – MPU). The currently used WBAN systems are based on relatively simple sensors that usually perform only several basic tasks, such as data collection, analog-to-digital conversion (ADC), and simple data preprocessing and conditioning. Finally, the radio frequency (RF) communication block transmits the data to the MPU for further detailed analysis. One of the main problems encountered in such systems today is the very large amount of energy (even 95% of the total energy) lost during the RF wireless transmission. For this reason we are considering a new approach. The aim is to develop a miniaturized ultra-low-power NN that will be used directly, as an additional component, in particular sensors of the WBAN. As a result, advanced data processing and analysis tasks will be performed at the sensor level, while the RF block will be used only in emergency situations, typically remaining in the 'standby mode'. As the expected power dissipation of the NN will be much smaller than that of the RF block, this approach will reduce the energy consumed by particular sensors by as much as 70-90%.

The new approach requires simple learning algorithms that can be easily implemented in hardware, such as the WTA NN or the SOM. In this case only basic arithmetic operations, such as addition, subtraction and multiplication, are required. The Kohonen SOMs are commonly used, for example, in the analysis and classification of the ECG signal [4], with an efficiency of up to 97%. The number of neurons required in this case usually does not exceed 150 [5], which makes a hardware implementation of such a network feasible. An example application of the Kohonen SOM in a wearable system is described in [4]. This system, which is not wireless, is able to recognize the most significant cardiac arrhythmias. In this system the NN is implemented on the MPU, while the sensors are only used to collect data that are transmitted to the MPU by wires.

The proposed distance calculation circuit (DCC) is very simple, containing only 16 medium-sized transistors per x - w pair. As a result, the overall NN will be small enough to be used in particular sensors of the WBAN. This will make systems like the one in [4] much more convenient to use.

## 2 The proposed analog Distance Calculation Circuit

The distance calculation circuit (DCC), which is the topic of this paper, is one of the main components of hardware-implemented self-organizing NNs. While designing this block, it is first necessary to determine a proper distance measure between the X and the W vectors. The simulations performed by means of the software model of both the WTA NN and the Kohonen SOM show that using the Manhattan distance (L1) does not negatively impact the learning process. Let us recall that the L1 and the L2 distance measures are given by (1) and (2), respectively.

ESANN 2012 proceedings, European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning. Bruges (Belgium), 25-27 April 2012, i6doc.com publ., ISBN 978-2-87419-049-0. Available from http://www.i6doc.com/en/livre/?GCOI=28001100967420.



Fig. 1: General block diagram of the proposed analog, current-mode DCC



Fig. 2: Main components of the proposed circuit: (a) current-mode comparator (CMP), (b) absolute function block (ABS)

$$I_{\rm L1} = k \cdot \sum_{i=1}^{m} |x_i - w_{i,j}| \tag{1}$$

$$I_{\rm L2} = \sqrt{k \cdot \sum_{i=1}^{m} (x_i - w_{i,j})^2} \tag{2}$$

In the L1 case we avoid the squaring and rooting operations required to compute the L2 distance, which significantly simplifies the overall structure of the DCC. Note that in (1) only summation and subtraction operations are required. In this case the current-mode approach is the most suitable, given that these operations are realized simply by adding or subtracting currents in junctions. The 'absolute' function is realized in a very direct manner in the proposed circuit.
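The difference between the two measures in (1) and (2) can be illustrated numerically. The following is a minimal software sketch, not a model of the circuit; the function names, the default `k = 1` and the sample values are assumptions chosen only for illustration.

```python
import math

def l1_distance(x, w, k=1.0):
    """Eq. (1): k times the sum of absolute component differences."""
    return k * sum(abs(xi - wi) for xi, wi in zip(x, w))

def l2_distance(x, w, k=1.0):
    """Eq. (2): square root of k times the sum of squared differences."""
    return math.sqrt(k * sum((xi - wi) ** 2 for xi, wi in zip(x, w)))

# Hypothetical input pattern and weight vector (arbitrary units).
x = [1.0, 4.0, 6.0]
w = [2.0, 4.0, 3.0]

d1 = l1_distance(x, w)  # needs only subtraction, absolute value and summation
d2 = l2_distance(x, w)  # additionally needs squaring and a square root
```

The L1 version uses only the operations that the text identifies as easy to realize in the current mode.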

A general block diagram of the proposed DCC is shown in Fig. 1, while the components of this circuit are presented in Fig. 2. The main block of the DCC is the ABS block (Fig. 2 (b)), which calculates the absolute value of the $(x_i - w_{i,j})$ term for each weight w. This block is controlled by a current-mode comparator (Fig. 2 (a)) that compares a given input signal $x_i$ with the corresponding weight $w_{i,j}$. The output signal of the comparator, $s_{i,j}$, controls the switches in the ABS block in such a way that the larger current of each x - w pair is always added to the junction A, while the smaller one is subtracted from this junction. The factor k is a constant parameter determined by transistor sizing. The $I_{ABS_i}$ currents, coming from particular ABS blocks, are summed in the output junction of each neuron, providing a signal proportional to the distance measure.
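The switching principle described above can be expressed as a short behavioral sketch. It is an idealized software model under stated assumptions, not the transistor-level circuit: the function names are hypothetical, the polarity of the comparator output (true when x exceeds w) is assumed, and delays and mismatch are ignored.

```python
def comparator(i_x, i_w):
    """Idealized CMP block: True when the input current exceeds the weight current.
    The polarity of this signal is an assumption made for illustration."""
    return i_x > i_w

def abs_block(i_x, i_w, k=1.0):
    """Idealized ABS block: the larger current is added to junction A,
    the smaller one is subtracted from it. Returns (output current, s)."""
    s = comparator(i_x, i_w)
    added, subtracted = (i_x, i_w) if s else (i_w, i_x)
    return k * (added - subtracted), s

def dcc_output(x, w, k=1.0):
    """Sum the ABS-block currents of one neuron in its output junction."""
    return sum(abs_block(xi, wi, k)[0] for xi, wi in zip(x, w))

# Hypothetical currents in amperes, within the range used in the verification.
x = [1.0e-6, 4.0e-6, 6.0e-6]
w = [2.0e-6, 4.0e-6, 3.0e-6]
i_out = dcc_output(x, w)
```

Because the comparator decides which current is added and which is subtracted, the output of each block is never negative, which is what makes it act as the absolute-value function in (1).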

The proposed ABS block is a kind of rectifier. Many circuits of this type have been described in the literature [6, 7], but the existing solutions are not useful in this case. The proposed circuit calculates the rectified value of the difference between two signals regardless of which of them is greater. Additionally, the output signal of the comparator, along with the $\eta \cdot (x_i - w_{i,j})$ signal calculated by the ABS block, is used by a subsequent adaptation block of the NN, where the parameter $\eta$ plays the role of the learning rate.
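To show where the $\eta \cdot (x_i - w_{i,j})$ term enters the learning process, the sketch below implements one step of the standard WTA rule in software: the neuron with the smallest L1 distance wins and its weights move toward the input. This is a generic textbook formulation, not the authors' adaptation circuit; the function name `wta_step` and all numeric values are hypothetical.

```python
def wta_step(x, weights, eta=0.1):
    """One Winner-Takes-All update using the L1 distance.
    Modifies `weights` in place and returns the index of the winning neuron."""
    distances = [sum(abs(xi - wi) for xi, wi in zip(x, w)) for w in weights]
    winner = distances.index(min(distances))
    # Adaptation: each weight of the winner moves by eta * (x_i - w_ij).
    weights[winner] = [wi + eta * (xi - wi) for xi, wi in zip(x, weights[winner])]
    return winner

# Two hypothetical neurons with two inputs each.
weights = [[0.0, 0.0], [1.0, 1.0]]
winner = wta_step([0.75, 1.0], weights, eta=0.5)
```

Only the ordering of the distances matters for selecting the winner, so a systematic error that is similar for all neurons has little effect on this step.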

One of the advantages of the proposed circuit is parallel and asynchronous data processing. For n inputs of the NN, each neuron contains n ABS blocks working in parallel. The total number of such blocks in the NN equals $m \cdot n$, where m is the number of neurons. All these blocks operate in parallel without a controlling clock, which substantially simplifies the structure of the circuit. The number of transistors in a single ABS block does not exceed 20.

Another advantage of the proposed DCC is the short signal paths between the inputs and the output of the circuit, each containing only two or three current mirrors (CM). A short signal path is very important for minimizing the influence of the mismatch effect, which modifies the gain of particular CMs. With equal transistors in a CM the gain should equal 1, but the mismatch effect can change it by as much as 1-10%, depending on transistor sizes.
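The benefit of a short signal path can be illustrated with a simple Monte Carlo sketch. It assumes, purely for illustration, that each current mirror has an independent Gaussian gain error with a standard deviation of 1% (the lower end of the range quoted above); the statistical model, the function names and the ten-mirror comparison case are assumptions, not results from the paper.

```python
import random

def chain_gain(n_mirrors, sigma, rng):
    """Overall gain of n cascaded current mirrors, each with nominal gain 1
    and an independent Gaussian gain error (assumed mismatch model)."""
    gain = 1.0
    for _ in range(n_mirrors):
        gain *= 1.0 + rng.gauss(0.0, sigma)
    return gain

def gain_spread(n_mirrors, sigma, trials=20000, seed=0):
    """Estimate the standard deviation of the overall gain by Monte Carlo."""
    rng = random.Random(seed)
    gains = [chain_gain(n_mirrors, sigma, rng) for _ in range(trials)]
    mean = sum(gains) / trials
    return (sum((g - mean) ** 2 for g in gains) / trials) ** 0.5

spread_short = gain_spread(3, 0.01)   # three mirrors, the longest path in the DCC
spread_long = gain_spread(10, 0.01)   # hypothetical longer path for comparison
```

Under this model the spread of the overall gain grows roughly with the square root of the number of mirrors, so a path of two or three mirrors accumulates noticeably less error than a longer one.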

## 3 Verification of the proposed circuit

The proposed DCC has been tested by means of transistor-level simulations and on the basis of the software model, to compare the results and evaluate the obtained precision. The DCC has been tested for an example case of three inputs x and three corresponding weights w that varied in the range of 100 nA – 6 $\mu$A. The waveforms of particular signals have been carefully selected to present the behavior of the circuit in different situations. Figs. 3 and 4 present selected results for large (1–6 $\mu$A) and small (100–600 nA) x and w signals, respectively. In the time period of 0–10 $\mu$s particular x signals and their corresponding weights w differ by very small values, for selected x - w pairs by as little as 0.5%. The resultant DCC output current is very small, which is representative of the situation in which the weights of a given neuron are located very close to a given pattern X. Even for such small differences the comparators operate properly, but for such signals they exhibit the slowest performance. The calculation time was equal to 100 ns and 500 ns for large and small signals, respectively. Such a situation, which can be viewed as the worst-case scenario, occurs rather seldom during the overall learning process.




Fig. 3: Verification of the DCC at the transistor level for large currents in the range of 1–6 $\mu$A: (a, b, c) input ($I_x$), weight ($I_w$) and the comparator output signals, (d) theoretical ($I_{out_t}$) and real ($I_{out_r}$) DCC output current and the power dissipation, (e) error: $I_{out_t} - I_{out_r}$ in reference to the maximum range of $I_{out}$. The DCC output current is normalized, i.e. divided by the number of inputs.



Fig. 4: Results for small signals (in the range of 100–600 nA).

In the period from 10 to 18 $\mu$s the signals differ more visibly, which is the more common situation, as neurons are typically more distant from particular vectors X. The waveforms of the signals between 0 and 18 $\mu$s have multiple-level, step-like shapes. Such shapes are representative of the way in which input data as well as neuron weights are provided to the DCC, i.e. as steady signals.

In the period from 18 to 30 $\mu$s, x and w are relatively fast-changing signals. Such signals make it possible to observe dynamic parameters of the circuit, e.g. the introduced delays. Figs. 3 and 4 (e) present the difference between the ideal (theoretical) output signal (proportional to the L1 distance) and the real output signal provided by the DCC. In the period starting from 18 $\mu$s the error fluctuates in the range of up to 10% (20% for small signals), which is due to the delay introduced by the DCC. For "slower" signals the error is much smaller. The most reliable values of the error are those obtained between 10 and 18 $\mu$s, always after the transient state has finished for particular data samples. In these cases the obtained values are below 0.4% and 1.3% for large and small signals, respectively. This is a good result, especially given that this error is systematic and almost equal for all neurons. It is also worth mentioning that accurate values of the distances are of secondary importance in the process of detecting the winner.
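The error measure used in the figures, i.e. the difference between the theoretical and the real output current referred to the maximum range of the output, can be written as a one-line computation. The sketch below is illustrative only: the function name is hypothetical and the sample currents are made-up values, not data read from the simulations.

```python
def relative_error_percent(i_theoretical, i_real, i_full_scale):
    """Error I_out_t - I_out_r as a percentage of the maximum output range."""
    return 100.0 * (i_theoretical - i_real) / i_full_scale

# Hypothetical steady-state sample (amperes), chosen only for illustration.
err = relative_error_percent(i_theoretical=2.00e-6, i_real=1.98e-6,
                             i_full_scale=6.0e-6)
```

Referring the error to the full output range rather than to the instantaneous value keeps the measure well defined when the distance itself is close to zero.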

Figs. 3 and 4 (d) also present the power dissipation. In the second case (small signals) the circuit dissipates less power, but at the same time it requires more time to calculate the value of the L1 distance. As a result, the energy consumed during the calculation of a single distance is smaller for larger signals (around 5 pJ). Additionally, for larger signals the circuit produces more precise results.
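The trade-off between power and calculation time can be made explicit with a short worked relation. It assumes that the energy per distance calculation is the average power multiplied by the calculation time, and that the worst-case times of 100 ns (large signals) and 500 ns (small signals) quoted earlier apply; the symbols $P_{\rm avg}$ and $t_{\rm calc}$ are introduced here only for illustration.

$$E = P_{\rm avg} \cdot t_{\rm calc}, \qquad \frac{E_{\rm small}}{E_{\rm large}} = \frac{P_{\rm small}}{P_{\rm large}} \cdot \frac{t_{\rm small}}{t_{\rm large}} = 5 \cdot \frac{P_{\rm small}}{P_{\rm large}}$$

Under these assumptions the small-signal mode would consume less energy only if its average power were more than five times lower than in the large-signal mode. The observation that the energy is smaller for larger signals therefore suggests that the power reduction does not compensate for the longer calculation time.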

## 4 Conclusions

In this paper we have presented a new low-power, high-precision distance calculation circuit for analog self-organizing neural networks. The circuit operates asynchronously and fully in parallel. It features a simple structure, resulting in low chip area. This makes the circuit suitable for application in miniaturized portable devices, e.g. in WBAN systems.

## References

- [1] B. Latré, B. Braem, I. Moerman, C. Blondia, "A survey on wireless body area networks", *Wireless Networks*, vol. 17 (1), 2011, pp. 1-18
- [2] S. Osowski, T.H. Linh, "ECG beat recognition using fuzzy hybrid neural network", *IEEE Transactions on Biomedical Engineering*, vol. 48 (11), 2001, pp. 1265-1271
- [3] A. Gacek, "Preprocessing and analysis of ECG signals – A self-organizing maps approach", *Expert Systems with Applications*, vol. 38 (7), 2011, pp. 9008-9013
- [4] G. Valenza, A. Lanata, M. Ferro, E.P. Scilingo, "Real-time discrimination of multiple cardiac arrhythmias for wearable systems based on neural networks", *Computers in Cardiology*, vol. 35, 2008, pp. 1053-1056
- [5] O. Inan, L. Giovangrandi, G. Kovacs, "Robust neural-network-based classification of premature ventricular contractions using wavelet transform and timing interval features", *IEEE Transactions on Biomedical Engineering*, vol. 53 (12), 2006, pp. 2507-2515
- [6] S. Khucharoensin, V. Kasemsuwan, "A High Performance CMOS Current-Mode Precision Full-Wave Rectifier (PFWR)", *International Symposium on Circuits and Systems (ISCAS)*, vol. 1, May 2003, pp. I-41–I-44
- [7] B. Boonchu, W. Surakampontorn, "A CMOS current-mode squarer/rectifier circuit", *International Symposium on Circuits and Systems (ISCAS)*, vol. 1, May 2003, pp. I-405–I-408