Soft Computing Research

Soft Computing Techniques to Advance Non-Linear Mappings for
Multivariate Data Visualization and Wireless Sensor Localization


Kuncup Iswandy and Andreas König

Institute of Integrated Sensor Systems,
Department of Electrical Engineering and Information Technology,
University of Kaiserslautern, Kaiserslautern D-67655, Germany





Abstract

The estimation of variables representing pivot points in a given space Y from similarity information in an original space X, under the constraint of preserving this similarity information, is a common task in pattern recognition, data analysis, and most recently also in the localization required for context-based systems, such as wireless sensor networks. Non-linear mappings, such as Multi-Dimensional Scaling or Sammon’s mapping, are commonly applied to these tasks based on the Euclidean distance as similarity measure. In visualization, data is commonly mapped from known measurement values in a high-dimensional space to a low-dimensional target space, e.g., 2D or 3D, while in classification the target space dimension can vary. In localization, the original and target space have the same dimension, and coordinate values are unknown and have to be estimated from potentially noisy and incomplete ranging data. Commonly, gradient descent techniques are applied for error minimization in the non-linear mapping process. The achievable visualization, analysis, classification, and localization quality depends on the mapping error reduction. Given the limitations of the gradient descent technique, potentially more powerful soft computing methods, e.g., genetic algorithms or particle swarm optimization, are regarded in our work. In this paper, we propose a novel particle swarm optimization approach to non-linear mappings, based on the Michigan representation as well as a relaxing neighborhood function inspired by Kohonen’s SOM for improved convergence. The approach was applied to several visualization benchmarks and to generated artificial localization data and found to work well. Compared with results from standard gradient descent, it returned better results, faster results, or both.


1. Introduction

Intelligent systems, based on an increasing variety of sensorial information, find more and more widespread application. In medical, military, automotive, industrial control and automation, agriculture, data analysis, home automation, or ambient intelligence applications, soft computing methods are required to map data, for compaction or analysis purposes, from a commonly high-dimensional original space to a lower-dimensional target space. Numerous linear and non-linear as well as unsupervised and supervised methods have been investigated in this context; see, e.g., [2, 3]. For instance, linear discriminant analysis is widely used, e.g., in gas sensing applications. Non-linear mappings, such as corresponding variants of Multi-Dimensional Scaling (MDS) or Sammon’s mapping [1], have found widespread use for data compression to 2- or 3-dimensional representations employed for multivariate data visualization and analysis. These mappings try to preserve the similarity of the data in the original space X, commonly expressed based on the Euclidean metric, in the low-dimensional target space Y. For this aim, pivot point coordinates are estimated in space Y that minimize the following error function, denoted as Sammon’s stress:

E(t) = \frac{1}{\sum_{i<j} d^{*}_{ij}} \sum_{i<j} \frac{\left( d^{*}_{ij} - d_{ij}(t) \right)^{2}}{d^{*}_{ij}}    (1)

d^{*}_{ij} = \left\| \mathbf{x}_i - \mathbf{x}_j \right\|    (2)

The error minimization iteratively moves the pivot points to suitable locations in Y space so that the distance differences between both spaces are reduced. For this aim, the distances in Y are recomputed in every iteration:

d_{ij}(t) = \left\| \mathbf{y}_i(t) - \mathbf{y}_j(t) \right\|    (3)

d_{ij}(t) = \sqrt{\sum_{q=1}^{m} \left( y_{iq}(t) - y_{jq}(t) \right)^{2}}    (4)
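To make the role of Eqs. (1)–(4) concrete, the following minimal Python sketch (our own illustration; function and variable names are not taken from the paper) computes Sammon’s stress for a candidate configuration Y with respect to the original data X:

import numpy as np

def pairwise_distances(Z):
    """Euclidean distance matrix between all rows of Z."""
    diff = Z[:, None, :] - Z[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def sammon_stress(X, Y, eps=1e-12):
    """Sammon's stress (Eq. 1) of a target configuration Y w.r.t. the data X."""
    d_star = pairwise_distances(X)          # distances in original space X (Eq. 2)
    d = pairwise_distances(Y)               # distances in target space Y (Eqs. 3, 4)
    iu = np.triu_indices(len(X), k=1)       # each pair (i, j), i < j, counted once
    c = d_star[iu].sum()                    # normalization constant
    return ((d_star[iu] - d[iu]) ** 2 / (d_star[iu] + eps)).sum() / c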

Commonly, the iteration is based on a gradient descent approach, employing first and potentially second order derivatives:

y_{pq}(t+1) = y_{pq}(t) - \mathrm{MF} \cdot \Delta_{pq}(t)    (5)

\Delta_{pq}(t) = \frac{\partial E(t) / \partial y_{pq}(t)}{\left| \partial^{2} E(t) / \partial y_{pq}(t)^{2} \right|}    (6)

\frac{\partial E(t)}{\partial y_{pq}(t)} = -\frac{2}{c} \sum_{j=1,\, j \neq p}^{n} \left( \frac{d^{*}_{pj} - d_{pj}}{d_{pj}\, d^{*}_{pj}} \right) \left( y_{pq} - y_{jq} \right)    (7)

\frac{\partial^{2} E(t)}{\partial y_{pq}(t)^{2}} = -\frac{2}{c} \sum_{j=1,\, j \neq p}^{n} \frac{1}{d_{pj}\, d^{*}_{pj}} \left[ \left( d^{*}_{pj} - d_{pj} \right) - \frac{\left( y_{pq} - y_{jq} \right)^{2}}{d_{pj}} \left( 1 + \frac{d^{*}_{pj} - d_{pj}}{d_{pj}} \right) \right]    (8)

with c = \sum_{i<j} d^{*}_{ij} and MF denoting Sammon’s “magic factor”.
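Under the assumption that Eqs. (5)–(8) follow Sammon’s original pseudo-Newton update [1], one iteration of the gradient descent mapping could be sketched as follows (illustrative only; SciPy’s cdist is used for the distance matrices):

import numpy as np
from scipy.spatial.distance import cdist

def sammon_gd_step(X, Y, mf=0.3, eps=1e-12):
    """One pseudo-Newton update of all pivot points in Y (cf. Eqs. 5-8)."""
    n, m = Y.shape
    d_star = cdist(X, X)                    # distances in the original space
    d = cdist(Y, Y)                         # distances in the target space
    c = d_star[np.triu_indices(n, k=1)].sum()
    Y_new = Y.copy()
    for p in range(n):
        for q in range(m):
            g1 = g2 = 0.0
            for j in range(n):
                if j == p:
                    continue
                dpj, dspj = d[p, j] + eps, d_star[p, j] + eps
                diff = Y[p, q] - Y[j, q]
                g1 += (dspj - dpj) / (dpj * dspj) * diff              # Eq. 7 terms
                g2 += ((dspj - dpj)
                       - diff ** 2 / dpj * (1.0 + (dspj - dpj) / dpj)) / (dpj * dspj)
            g1 *= -2.0 / c                                            # dE/dy_pq
            g2 *= -2.0 / c                                            # d2E/dy_pq^2
            Y_new[p, q] = Y[p, q] - mf * g1 / (abs(g2) + eps)         # Eqs. 5, 6
    return Y_new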

Thus, the approach is sensitive to the, commonly random, initialization and prone to getting trapped in the nearest local minimum. Naturally, more capable optimization methods are interesting candidates for this versatile mapping type [4, 5], however, also under efficiency constraints. The regarded mapping can also serve as a non-linear projection method for data reduction in classification, where the dimension of the Y space is not fixed. A required recall extension is provided in [2]. Most recently, such mappings have also found widespread use in context-based applications, where the localization context has to be extracted repetitively. For instance, in Wireless Sensor Networks (WSN), fast, efficient, and accurate localization from potentially noisy communication ranging data requires the estimation of unknown relative coordinates. In this application, the X and Y spaces have the same dimension, and the pivot points in Y space are estimated to minimize the distance error [7, 9, 11]. With additional information, e.g., from anchor nodes, absolute location information can be obtained from the estimates.


In all these application domains, the achievable visualization, analysis, classification, and localization quality crucially depends on the mapping error reduction.


In this paper, we propose for this aim a novel particle swarm optimization approach to non-linear mappings, based on the Michigan representation [10, 12] jointly with a relaxing neighborhood function inspired by Kohonen’s SOM for improved convergence; a related time-dependent weighting has been proposed before in the context of mapping focus in such non-linear mappings [3]. The next section describes the basic PSO and the novel method developed to advance mapping quality and efficiency. The third section describes the application of the new method to standard benchmarks for visualization. The fourth section, before concluding, demonstrates a localization application, taken as a benchmark from prior work [9].



2. Improved Optimization Methods for Non-Linear Mapping

2.1 Standard PSO Algorithm

Particles have their own position, Ak, and velocity, vk, vectors (k = 1, 2, …, L) of dimension S. In the standard PSO algorithm used for non-linear mapping, a particle’s position in the swarm is represented as the concatenation of the coordinates of all n patterns in the dataset. In general, the particle’s position is represented as

\mathbf{A}_k = \left[ \mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n \right]    (9)

where ai = [ai,1, ai,2, …, ai,d] is the i-th pattern, so the dimension of a particle is S = n × d.


In the process of finding the optimum solution (i.e., the best particle), the position and velocity of each particle are updated according to the standard PSO [4, 8], i.e.,

\mathbf{v}_k(t+1) = w\, \mathbf{v}_k(t) + c_1 R_1 \left( \mathbf{P}_k - \mathbf{A}_k(t) \right) + c_2 R_2 \left( \mathbf{P}_g - \mathbf{A}_k(t) \right)    (10)

\mathbf{A}_k(t+1) = \mathbf{A}_k(t) + \mathbf{v}_k(t+1)    (11)

where w is the inertia weight, c1 and c2 are positive acceleration constants, and R1 and R2 are two random numbers uniformly distributed in the range [0, 1]. The velocity in each dimension is limited to ±Vmax. The best previously visited position of the k-th particle is denoted as Pk, and the best particle in the swarm is denoted as Pg. Each updated particle position is evaluated using Sammon’s stress as the fitness measure, and the current fitness of a particle is compared with that of its previous best position Pk.
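As an illustration of Eqs. (9)–(11), a single synchronous update of the whole swarm could look as follows (a minimal sketch under common PSO conventions; the parameter defaults are illustrative, not the paper’s settings):

import numpy as np

def standard_pso_step(A, V, P, Pg, w=0.7, c1=2.0, c2=2.0, v_max=0.3, rng=None):
    """One standard PSO update (cf. Eqs. 10, 11).

    A  : (L, S) particle positions, each row a flattened n*d mapping (Eq. 9)
    V  : (L, S) particle velocities
    P  : (L, S) personal best positions
    Pg : (S,)   best position found by the swarm
    """
    rng = np.random.default_rng() if rng is None else rng
    R1, R2 = rng.random(A.shape), rng.random(A.shape)
    V = w * V + c1 * R1 * (P - A) + c2 * R2 * (Pg - A)   # Eq. 10
    V = np.clip(V, -v_max, v_max)                        # limit velocity to +/- Vmax
    return A + V, V                                      # Eq. 11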


In [4], the authors reported successful results employing the standard PSO algorithm for non-linear mapping, but this algorithm requires high computational effort and time to converge to the optimum solution (8000 iterations). Similarly, in our previous work [9] on wireless sensor localization, the standard PSO, set to a maximum of 2000 iterations, could not compete with gradient descent set to 100 iterations. Thus, under efficiency constraints, state-of-the-art methods seem not to be competitive.


2.2 Novel PSO Algorithm based on Michigan Representation

To overcome these shortcomings of the standard PSO, we introduce a novel PSO algorithm for non-linear mapping called Shrinking Neighborhood Michigan PSO (SNMPSO). In this algorithm, the position of each single particle represents only a single pattern of the whole dataset. This representation of particles is based on the Michigan approach from the area of genetic classifier systems [10, 12] and is denoted as follows

\mathbf{A}_k = \left[ a_{k,1}, a_{k,2}, \ldots, a_{k,d} \right], \quad k = 1, \ldots, n    (12)


The movement rules are modified such that the particle positions are updated subsequently, and the updated position of one particle is already used by the other particles within the same iteration. The members of a particle’s neighborhood are split into two groups, i.e., attraction and repulsion sets, by comparing the distances between the high-dimensional data and the low-dimensional data. A Gaussian function, gx, with a decaying width σ(t), computed on the original data, is applied to select the members of a particle’s neighborhood and to control the attraction of particles in the particle’s movement. Another Gaussian function, gy, computed on the low-dimensional data, is used to control the repulsion effect. The decay of σ(t) can follow a linear or an exponential function over the iterations, as shown in Fig. 1. The neighbors of a particle are selected if their probability value is larger than or equal to 0.1.


Fig. 1. The decay of σ(t) as (a) a linear function or (b) an exponential function
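The shrinking neighborhood described above could be realized, for example, as follows (a sketch of our reading of the text; the exact functional forms of σ(t) and gx used in the paper are not reproduced here):

import numpy as np

def sigma_schedule(t, t_max, sigma0, sigma_end=0.01, mode="exp"):
    """Decaying neighborhood width sigma(t), linear or exponential (cf. Fig. 1)."""
    if mode == "linear":
        return sigma0 + (sigma_end - sigma0) * t / t_max
    return sigma0 * (sigma_end / sigma0) ** (t / t_max)

def select_neighbors(i, dist_X, sigma, threshold=0.1):
    """Neighbors of pattern i: Gaussian weight on original-space distances >= 0.1."""
    g = np.exp(-dist_X[i] ** 2 / (2.0 * sigma ** 2))     # g_x in the text
    mask = (g >= threshold) & (np.arange(len(g)) != i)
    return np.where(mask)[0], g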


The movement of each particle in our SNMPSO algorithm is computed as follows

(13)
(14)
(15)
(16)
(17)
(18)

where zatt and zrep are the attraction and repulsion vectors, respectively. The procedure of SNMPSO is shown in Fig. 2.


  1. Load dataset.
  2. Initialize the swarm; the dimension of each particle equals the dimension of the patterns, and the number of particles equals the number of patterns in the dataset.
  3. Until maximum number of iterations or determined minimum error reached:
    1. Move the particles subsequently, one by one; for each particle:
      1. Compute the multivariate normal density (Gaussian Function) based on original dataset.
      2. Compute the multivariate normal density (Gaussian Function) based on swarm.
      3. Select which particles in its neighborhood region are put into the attraction or repulsion set.
      4. Compute the particle's next position based on attraction, repulsion, and its best position achieved so far.
      5. Update the particle.
    2. Evaluate the Sammon's mapping error.
    3. If the swarm gives the best result so far, record the current particle positions as the current best swarm.

Fig. 2. The procedure of the SNMPSO algorithm for NLM
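The following skeleton mirrors the steps of Fig. 2 only; it does not reproduce Eqs. (13)–(18). The attraction/repulsion split (pull toward neighbors that are farther away in Y than in X, push away from those that are closer) and the use of the best swarm as the particle’s reference are our simplified reading, and the helper functions sammon_stress, sigma_schedule, and select_neighbors from the earlier sketches are reused:

import numpy as np
from scipy.spatial.distance import cdist

def snmpso_sketch(X, d_out=2, t_max=100, w=0.9, c1=2.5, c2=2.5, c3=1.0,
                  v_max=0.3, seed=0):
    """Illustrative SNMPSO-style loop following Fig. 2 (not the paper's Eqs. 13-18)."""
    rng = np.random.default_rng(seed)
    n = len(X)
    dist_X = cdist(X, X)
    A = rng.uniform(-1.0, 1.0, (n, d_out))        # one particle per pattern (Eq. 12)
    V = np.zeros_like(A)
    best_A, best_err = A.copy(), np.inf
    for t in range(t_max):
        sigma = sigma_schedule(t, t_max, sigma0=dist_X.max() / 2.0)
        for i in range(n):                         # particles are moved one by one
            nbr, _ = select_neighbors(i, dist_X, sigma)
            if len(nbr) == 0:
                continue
            dist_Y = np.linalg.norm(A[nbr] - A[i], axis=1)
            att = nbr[dist_Y > dist_X[i, nbr]]     # too far apart in Y: attract
            rep = nbr[dist_Y <= dist_X[i, nbr]]    # too close in Y: repel
            z_att = (A[att] - A[i]).mean(axis=0) if len(att) else 0.0
            z_rep = (A[i] - A[rep]).mean(axis=0) if len(rep) else 0.0
            r1, r2, r3 = rng.random(3)
            V[i] = (w * V[i] + c1 * r1 * z_att + c2 * r2 * z_rep
                    + c3 * r3 * (best_A[i] - A[i]))
            V[i] = np.clip(V[i], -v_max, v_max)
            A[i] = A[i] + V[i]                     # updated position is reused at once
        err = sammon_stress(X, A)                  # fitness: Sammon's stress
        if err < best_err:
            best_err, best_A = err, A.copy()
    return best_A, best_err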



3. Experiments for Multivariate Data Visualization

In these experiments for multivariate data visualization, two benchmark datasets (Iris and Wine) and two real-world datasets (Mechatronic [5] and Gas [6]) were used; they are summarized in Table 1.

The simulation results and the quantifying measures, i.e., Sammon’s stress or mapping error [1] and topology preservation [2], are presented for each dataset. For each algorithm, ten simulation runs were executed. Sammon’s magic factor (MF) of the gradient descent technique was set to 0.3, as in [1]. The SNMPSO parameters were set to an initial w = 0.9 with a gradual decrease toward 0.4, c1 = c2 = 2.5, c3 = 1, and Vmax = 0.3. Gradient descent (GD) and SNMPSO were run for 100 iterations in the simulations using the Iris, Wine, and Mechatronic datasets, and for 200 iterations using the Gas dataset.
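The experiment settings above might be collected as a configuration like the following (the linear schedule for w is one plausible reading of the “gradual decrease” from 0.9 to 0.4; the names are ours):

# Assumed bundling of the settings stated in the text (Sec. 3).
SNMPSO_PARAMS = {"c1": 2.5, "c2": 2.5, "c3": 1.0,   # acceleration constants
                 "v_max": 0.3,                      # velocity limit
                 "w_start": 0.9, "w_end": 0.4}      # inertia weight range
GD_PARAMS = {"magic_factor": 0.3}                   # Sammon's MF, as in [1]
ITERATIONS = {"Iris": 100, "Wine": 100, "Mechatronic": 100, "Gas": 200}

def inertia_weight(t, t_max, w_start=0.9, w_end=0.4):
    """Linearly decreasing inertia weight (assumed schedule)."""
    return w_start + (w_end - w_start) * t / t_max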


Table 1. Datasets used in the experiments.

Dataset        Attributes   Classes   Samples per class
Iris           4            3         50 / 50 / 50
Wine           13           3         59 / 71 / 48
Mechatronic    24           4         50 / 100 / 200 / 25
Gas            410          4         66 / 66 / 66 / 66


Table 2. Sammon’s stress in 10 runs.

Method          Iris             Wine             Mechatronic      Gas
                mean (std.)      mean (std.)      mean (std.)      mean (std.)
NLM – GD        0.0101 (0.0023)  0.0668 (0.0056)  0.0505 (0.0178)  0.0140 (0.0231)
NLM – SNMPSO    0.0082 (0.0001)  0.0655 (0.0015)  0.0464 (0.0011)  0.0030 (0.0003)

In Table 2, the mapping error (Sammon’s stress), averaged over 10 simulation runs, is given for both gradient descent and SNMPSO. The best Sammon’s stress was obtained by SNMPSO for all benchmarks. Notice that the mapping error variances of SNMPSO are lower than the respective variances obtained by the gradient descent technique. Figure 3 shows that our SNMPSO converges to the optimum solution faster than the gradient descent technique. The topology preservation measures of gradient descent are better than those of SNMPSO for the Iris, Mechatronic, and Gas datasets, as shown in Table 3. For the Wine data, the topology preservation measure is slightly better for SNMPSO than for gradient descent. Note that the assessment function used in both algorithms is Sammon’s stress and that the topology preservation measure is not necessarily correlated with Sammon’s stress; similar observations have been reported in [4]. Figure 4 shows the projection results of all four benchmarks for gradient descent and SNMPSO.



Table 3. Topology preservation measures in 10 runs.

Method          Iris             Wine             Mechatronic      Gas
                mean (std.)      mean (std.)      mean (std.)      mean (std.)
NLM – GD        0.7051 (0.0093)  0.4665 (0.0136)  0.4048 (0.0160)  0.7279 (0.0120)
NLM – SNMPSO    0.6867 (0.0056)  0.4820 (0.0071)  0.3755 (0.0098)  0.6097 (0.0267)


(a) Iris (b) Wine
(c) Mechatronic (d) Gas

Fig. 3. Sammon’s mapping error (average of 10 runs): Gradient vs. SNMPSO



(a) Iris: Gradient   (b) Iris: SNMPSO
(c) Wine: Gradient   (d) Wine: SNMPSO
(e) Mechatronic: Gradient   (f) Mechatronic: SNMPSO
(g) Gas: Gradient   (h) Gas: SNMPSO

Fig. 4. Projection results: Gradient vs. SNMPSO

 

4. Experiments for Wireless Sensor Localization

The simulation process for the localization algorithms involves deployment, ranging, localizing, and analysis steps. We applied non-linear mapping to the localization of wireless sensors and compared gradient descent and SNMPSO as optimization algorithms. Similar to common simulations of wireless sensor applications [9], artificial sensor localization data based on the Received Signal Strength Indicator (RSSI) were generated. In a cubic volume of 10 m × 10 m × 10 m, 125 sensor nodes and 18 anchors / base stations were deployed. Sensors were deployed on a random grid and anchors on a uniform grid, as shown in Fig. 5 below.

Fig. 5. Deployment of anchor and sensor nodes in 3D space (×: anchor, •: sensor)



(a) FDM   (b) IDM

Fig. 6. Sammon’s stress (average of 10 runs): Gradient vs. SNMPSO


Based on the power capacity of a sensor node, the maximum range dmax, which corresponds to the maximum range of the wireless signal, is defined. The ranging process assigns distances up to dmax, with additive Gaussian noise, to model the connectivity between nodes in real deployment scenarios.


Variations in the range dmax lead to variations in the connectivity between nodes, providing different Incomplete Distance Matrices (IDM). For the ranging size, dmax was selected as 2.5 m, 5 m, 6 m, 7.5 m, and 10 m. The Gaussian noise level (σ) was set to level 5. In these experiments, we investigated both the Full Distance Matrix (FDM) approach and the IDM approach. For each experiment, the localization algorithms were executed 10 times with a maximum of 200 iterations for both IDM and FDM.
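The deployment and ranging steps described above could be simulated roughly as follows (a sketch under our assumptions: isotropic additive Gaussian range noise and links beyond dmax marked as unknown; the paper’s RSSI model and noise level are not reproduced exactly):

import numpy as np
from scipy.spatial.distance import cdist

def simulate_ranging(n_sensors=125, side=10.0, d_max=5.0, noise_std=0.5, seed=0):
    """Random 3D deployment plus noisy ranging; returns FDM and IDM variants."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(0.0, side, (n_sensors, 3))    # true positions in the 10 m cube
    d = cdist(pos, pos)                             # true pairwise distances
    noise = rng.normal(0.0, noise_std, d.shape)
    noise = (noise + noise.T) / 2.0                 # keep the range matrix symmetric
    fdm = np.clip(d + noise, 0.0, None)             # full distance matrix with noise
    np.fill_diagonal(fdm, 0.0)
    idm = np.where(d <= d_max, fdm, np.nan)         # links beyond d_max are unknown
    return pos, fdm, idm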


Using the FDM approach, the mapping error for sensor localization obtained by gradient descent is better than that of SNMPSO for dmax = 2.5 m, 5 m, and 6 m, as shown in Fig. 6(a). However, when using the IDM approach (see Fig. 6(b)), our SNMPSO performs, in general, much better than the gradient descent technique, especially for dmax = 2.5 m, 5 m, 6 m, and 10 m.


5. Conclusion

In this work, we picked up the challenge of improving the versatile non-linear mappings [1] employed in visualization, analysis, classification [2, 3], and localization [9, 11] with regard to both mapping quality and efficiency. The proposed novel approach employs PSO with a Michigan representation along with a time-dependent neighborhood function and weighting [3], basically inspired by Kohonen’s SOM.


The algorithm was compared to the common gradient descent implementation on several visualization benchmarks and on artificial wireless sensor localization data and found to be competitive with regard to achievable mapping error, speed, or both.


In future work, we will refine the method, apply it to more benchmarks in visualization as well as localization, and extend its use to classification. In particular, its application to localization with sensor modalities other than RF ranging information and options for distributed computation in WSN will be subjects of study.

 

References

[1] J.W. Sammon, "A nonlinear mapping for data structure analysis", IEEE Transactions on Computers, C-18: 401-409, 1969.


[2] A. König, "Interactive visualization and analysis of hierarchical neural projections for data mining", IEEE Transactions on Neural Networks, 11(3): 615-624, 2000.


[3] A. König, "Dimensionality reduction techniques for interactive visualization, exploratory data analysis, and classification", Pattern Recognition in Soft Computing Paradigm, World Scientific, FLSI Soft Computing Series, Nikhil R. Pal (eds.), 2: 1-37, 2001.


[4] C.J. Figueroa, P.A. Estévez, R.E. Hernández, "Non-linear mappings based on particle swarm optimization", Proc. International Joint Conference on Neural Networks, 1487-1492, 2005.


[5] K. Iswandy, A. König, "Evolutionary multidimensional scaling for data visualization and classification", Application of Soft Computing: Recent Trends, Tiwari et al. (eds.), Springer, 177-186, 2006.


[6] K. Iswandy, A. König, T. Fricke, M. Baumbach, A. Schütze, "Towards automated configuration of multi-sensor systems using evolutionary computation – a method and a case study", J. Comput. Theor. Nanosci., 2: 574-582, 2005.


[7] K. Whitehouse, D.E. Culler, "A robustness analysis of multi-hop ranging-based localization approximations", Proc. of the fifth Int. Conf. on Information Processing in Sensor Networks, 2006.


[8] J. Kennedy, R.C. Eberhart, "Particle swarm optimization", Proc. IEEE International Conference on Neural Networks, 1995.


[9] L. Rao, K. Iswandy, A. König, "Cost effective 3D localization for low-power WSN based on modified Sammon’s stress function with incomplete distance information", Online World Conf. on Soft Computing in Industrial Applications, WSC14, 2009.


[10] A. Cervantes, I.M. Galván, P. Isasi, "AMPSO: a new particle swarm method for nearest neighborhood classification", IEEE Transactions on Systems, Man, and Cybernetics – Part B: Cybernetics, 39(5): 1082-1091, 2009.


[11] Y. Shang, W. Ruml, Y. Zhang, M.P.J. Fromherz, "Localization from mere connectivity", Proc. of the fourth ACM Int. Symposium on Mobile Ad Hoc Networking and Computing (MobiHoc), 201-212, 2003.


[12] S.W. Wilson, "Classifier fitness based on accuracy", Evol. Comput., 3(2): 149-175, 1995.



About the Authors

Kuncup Iswandy received the B.E. degree from the University of Trisakti, Jakarta, Indonesia, in 1999 and the M.Sc. degree from the University of Kaiserslautern, Germany, in 2005. He is currently working toward the Ph.D. degree at the Institute of Integrated Sensor Systems, EIT, University of Kaiserslautern, Germany. His current research focuses on the automated design of intelligent integrated sensor systems, wireless sensor networks, pattern recognition, and evolutionary computation techniques.

 

 

Andreas König received the diploma and Ph.D. degrees in electrical engineering from TU Darmstadt, Darmstadt, Germany, in 1990 and 1995, in Microelectronic Systems and Neural Networks, respectively. In 1995, he was with the Fraunhofer Institute for Information and Data Processing (IITB) in Karlsruhe, in the Department of Image and Signal Interpretation, Machine Vision Group, pursuing research in visual automated inspection and aerial/satellite image evaluation. In 1996, he was appointed Assistant Professor (Hochschuldozent) for Electronic Devices at TU Dresden, pursuing research on massively parallel architectures and analog integrated circuits for robust vision systems. After a temporary assignment as head of the chair of computer engineering at TU Chemnitz in 2001, he was appointed Professor and Head of the Institute of Integrated Sensor Systems, Dept. of Electrical and Computer Engineering, TU Kaiserslautern, pursuing research on robust, adaptive/self-x multi-sensory system design, integration, and application, e.g., in industrial vision, measurement and instrumentation, home automation, Ambient Intelligence, and wireless sensor networks. He founded and currently heads the IEEE CIS German chapter and is a Senior Member of the IEEE. He serves as an editorial/advisory board member for the Journals of Hybrid Intelligent Systems and Knowledge-Based Engineering and is a member of MIR Labs. He is author or coauthor of more than 100 technical papers in the field of his research.