Kuncup
Iswandy and Andreas König
Institute of Integrated Sensor Systems,
Department of Electrical Engineering and Information Technology,
University of Kaiserslautern, Kaiserslautern D67655, Germany
Email:
,
The estimation of variables representing pivot
points in a given space Y from similarity information in an original space X
under the constraint of preserving this similarity information is a common task
in pattern recognition, data analysis, and most recently also in localization
required for contextbased systems, such as wirelesssensornetworks.
Nonlinear mappings, such as MultiDimensional Scaling or Sammon’s mapping are
commonly applied for these tasks based on the Euclidean distance as similarity
measure. In visualization, data is commonly mapped from known measurement
values in a highdimensional space to a low dimensional target space, e.g., 2D
or 3D, while in classification the target space dimension can vary. In
localization, the original and target space have the same dimension, coordinate
values are unknown and have to be estimated from potentially noisy and
incomplete ranging data. Basically, gradient descent techniques are applied for
error minimization in the nonlinear mapping process. Achievable visualization,
analysis, classification, and localization quality depend on the mapping error
reduction. Reminiscent of the gradient descent technique’s limitation,
potentially more powerful soft computing methods are regarded in our work,
e.g., genetic algorithms or particle swarm optimization. In this paper, we
propose a novel particle swarm optimization approach to nonlinear mappings,
based on the
Intelligent systems, based on increasing variety of sensorial information, find more and more widespread application. In medical, military, automotive, industrial control and automation, agriculture, data analysis, home automation, or ambient intelligence applications soft computing methods are required to map data for compaction or analysis purpose from a commonly high dimensional original space to a lower dimensional target space. Numerous linear and nonlinear as well as unsupervised or supervised methods have been investigated in this context; see, e.g., [2, 3]. For instance, linear discriminance analysis is widely used, e.g., in gas sensing application. Nonlinear mappings, such as corresponding variants of MultiDimensionalScaling (MDS) or Sammons’s mapping [1] have found widespread use for data compression to 2 or 3dimensional representations employed for multivariate data visualization and analysis. These mappings try to preserve the similarity of data in the original space X, commonly expressed based on the Euclidean metric, in the lowdimensional target space Y. For this aim, pivot point coordinates are estimated in space Y, that satisfy the following error function, denoted as Sammon’s stress:
(1)  
(2) 
The error function tries to minimize the difference of error in both spaces by iteratively changing pivot point values to suitable pivot point locations in Y space. For this aim, the distance in Y are recomputed every iteration:
(3)  
(4) 
Commonly, the iteration is based on gradient descent approach, employing first and potentially second order derivatives:
(5)  
(6)  
(7)  
(8) 
Thus, the approach is sensitive to the, commonly random, initialization, as well as being trapped in the nearest local minimum. Naturally, more capable optimization methods are interesting candidates to employ in this versatile mapping type [4, 5], however, also under efficiency constraints. The regarded mapping can also serve as a nonlinear projection method for data reduction in classification, where the dimension of Yspace is not fixed. A required recall extension is provided in [2]. Most recently, such mappings have also found widespread use in contextbased applications, where the localization context has to be extracted repetitively. For instance, in WirelessSensorNetworks (WSN), fast, efficient, and accurate localization from potentially noisy communication range data requires the estimation of unknown relative coordinates. Actually, X and Y spaces are identical in this application, and the pivot points in Yspace are estimated to minimize the distance error [7, 9, 11]. By additional information, e.g., from anchor nodes, absolute location information can be gained from the estimates.
In all these applications domains, achievable visualization, analysis, classification, and localization quality crucially depend on the mapping error reduction.
In this paper, we propose for this aim a novel particle swarm optimization approach to nonlinear mappings, based on the Michigan representation [10, 12] jointly with a relaxing neighborhood function inspired by Kohonen’s SOM for improved convergence, which has been proposed before in the context of mapping focus [3] in such nonlinear mappings. The next section will describe the basic PSO and the novel method developed for advance in mapping quality and efficiency. The third section will describe the application of the new method to standard benchmarks for visualization. The fourth section, before concluding, will demonstrate a localization application, taken as a benchmark from prior work [8].
Particles have their own position, A_{k},
and velocity, v_{k}, vectors (k = 1, 2, …, L) of dimension S. In standard PSO algorithm
used for nonlinear mapping, the particle’s position in the swarm is
represented as the concatenation of n pairs of coordinates of the whole
patterns in the dataset. In general, the particle’s position is represented as
(9) 
where a_{i}_{ }=
[a_{i,1}, a_{i,2}, …, a_{i,d}] is
the ith pattern, so the dimension of particle is S = n ×d.
In the process of finding the optimum solution (i.e., the best particle), the update of position and velocity of each particle is based on the standard PSO [4] [7], i.e.,
(10)  
(11) 
where w is the inertia weight, c_{1}
and c_{2} are the acceleration positive constants, and R_{1}
and R_{2} are two random numbers uniformly distributed in the
range [0, 1]. The velocity on each dimension is limited to be ±V_{max}.
The best previously visited position of particle kth is denoted as P_{k}.
The best particle in the swarm is denoted as P_{g}. Each new
updated position of particle is evaluated by the Sammon’s stress as the fitness
measure. The current fitness of particle is compared with its previous best
position P_{k}_{.}
In [4], the authors reported the successful results by employing standard PSO algorithm for nonlinear mapping, but this algorithm requires high computational effort and time for converging to optimum solution (8000 iterations). Similar to our previous work [8] in wireless sensor localization, the standard PSO set to maximum 2000 iterations could not compete the gradient descent set to 100 iterations. Thus, under efficiency constraints, stateoftheart methods seem not be competitive.
To overcome the imperfection of standard
PSO, our novel PSO algorithm for nonlinear mapping called Shrinking
Neighborhood Michigan PSO (SNMPSO) is introduced. In this algorithm, each
position of single particle represents only a single pattern of whole dataset.
This representation of particles is based on
(12) 
Movement rules are modified, where the
particle’s positions are subsequently updated and the updated position of one
particle is used by other particles in the same iteration. The member of
particle’s neighborhood is split into two groups, i.e., attraction and
repulsion sets, by comparing the distances between high dimensional data and
low dimensional data. The Gaussian function, g_{x}, with
decaying of σ(t), computed based on
original data is applied to select the member of particle’s neighborhood and to
control the attraction of particles in the particle’s movement. Another
Gaussian function, g_{y}, computed based on the low dimensional
data is used for controlling the repulsion effect. The decaying of σ(t) can be as linear or exponential function
during iteration as shown in Fig. 1. The neighbors of a particle are selected,
if their probability value is larger or equal 0.1.
Fig. 1. The decaying of σ(t) as (a) linear function or (b) exponential function
The movement of each particle in our SNMPSO
algorithm is computed as follows
(13)  
(14)  
(15)  
(16)  
(17)  
(18) 
where z_{att} and z_{rep}
are attraction and repulsion vectors, respectively. The procedure of this
SNMPSO is shown in Fig.2.

Fig.2.The procedure of SNMPSO algorithm for NLM
In these experiments for multivariate data
visualization, two benchmark datasets (Iris and Wine) and two realworld
datasets (Mechatronic [5] and Gas [6]) were used and summarized in Table 1.
The simulation results and the quantifying
measures, i.e., Sammon’s stress or mapping error [1] and topology preservation
[2], were presented for each dataset. In each algorithm, ten simulation runs
were executed. The Sammon’s magic factor (MF) of gradient decent technique was
set at 0.3 similar to [1]. The SNMPSO
parameters were set for initial w = 0.9 with a gradual decrease toward
0.4, c_{1} = c_{2}_{ }= 2.5, c_{3}
= 1 and V_{max}_{ }= 0.3. Gradient descent (GD) and MPSO algorithms
were set at 100 iterations for the simulations using Iris, Wine, and
Mechatronic datasets, and 200 iterations for using Gas datasets.
Table 1. Datasets used in the experiments.
Dataset  Attributes  Classes  Samples 
Iris  4  3  50 / 50 / 50 
Wine  13  3  59 / 71 / 48 
Mechatronic  24  4  50 / 100 / 200 / 25 
Gas  410  4  66 / 66 / 66 / 66 
Table 2. Sammon’s stress in 10 runs.
Method  Iris mean (std.)  Wine mean (std.)  Mechatronic mean (std.)  Gas mean (std.) 
NLM – GD  0.0101 (0.0023)  0.0668 (0.0056)  0.0505 (0.0178)  0.0140 (0.0231) 
NLM – SNMPSO  0.0082 (0.0001)  0.0655 (0.0015)  0.0464 (0.0011)  0.0030 (0.0003) 
In Table 2, the mapping error (Sammon’s
stress), averaged over 10 run simulations, is given for both gradient descent
and SNMPSO. The best Sammon’s stress was obtained by SNMPSO for all benchmarks.
Notice that the mapping error variances of SNMPSO are lower than the respective
variances obtained by gradient descent technique. Figure 3 show that, our
SNMPSO can converge to the optimum solution faster than achieved by gradient
descent technique. The topology preservation measures of gradient descent are
better than SNMPSO for Iris, Wine, and Gas datasets as shown in Table 3. For
Wine data, the topology preservation measure is slightly better for SNMPSO than
achieved by gradient descent. Note the assessment function used in both
algorithms is the Sammon’s stress and the topology preservation measure is not
correlated to the Sammon’s stress. The same results have been reported in [4].
Figure 4 shows the projection results of all four benchmarks for gradient
descent and SNMPSO based on visualization algorithms.
Table 3. Topology preservation measures in 10 runs.
Method  Iris mean (std.)  Wine mean (std.)  Mechatronic mean (std.)  Gas mean (std.) 
NLM – GD  0.7051 (0.0093)  0.4665 (0.0136)  0.4048 (0.0160)  0.7279 (0.0120) 
NLM – SNMPSO  0.6867 (0.0056)  0.4820 (0.0071)  0.3755 (0.0098)  0.6097 (0.0267) 
(a) Iris  (b) Wine 
(c) Mechatronic  (d) Gas 
Fig. 3. Sammon’s mapping error (average of 10 runs): Gradient vs. SNMPSO
(a) Iris: Gradient  (b) Iris: SNMPSO 
(c) Wine: Gradient  (d) Wine: SNMPSO 
(e) Mechatronic: Gradient  (f) Mechatronic: SNMPSO 
(g) Gas: Gradient  (h) Gas: SNMPSO 
Fig. 4. Projection results: Gradient vs. SNMPSO
Simulation process
for localization algorithms involves deployment, ranging, localizing, and
analysis steps, respectively. We applied nonlinear mapping for
localization of wireless sensors and compared the optimization algorithms
between gradient descent and SNMPSO. Similar to common simulation of wireless
sensor applications [9], the artificial sensor localization data based on
Received Signal Strength Indicator (RSSI) were generated. A cubical volume of [10m X 10m X 10m], 125
sensor nodes and 18 anchors / base stations were deployed. Sensors were deployed
in random grid and anchors were deployed on uniform grid as shown in the Fig. 5
below.
Fig. 5. Deployment of anchor and sensor nodes in 3D space (x: anchor and ٠: sensor)
(a) FDM  (b) IDM 
Fig.
6. Sammon’s stress (average of 10 runs): Gradient vs. SNMPSO
Based on the power
capacity of sensor node, maximum range d_{max}, which
corresponds to maximum range of the wireless signal, is defined. Ranging
process involves assigning of d_{max}, with
additive Gaussian noise, to model connectivity between nodes in real deployment
scenarios.
Variations in range, d_{max} would lead
to variations in connectivity between nodes providing us with different Incomplete
Distance Matrix (IDM). For the ranging size, d_{max} is
selected to be 2.5m, 5m, 6m, 7.5m and 10m. Gaussian noise level (s ) was set
at level 5. In these experiments, we investigated both Full Distance
Matrix (FDM) approach and IDM. For each experiment, localization
algorithms were executed 10 times and maximum of 200 iterations for both IDM
and FDM.
The mapping error for sensor localization is
better for gradient descent in the range of 250, 500, and 600 using FDM
approach than obtained by SNMPSO as shown in Fig. 6(a). However, when using IDM
approach (see Fig. 6(b)), our SNMPSO can perform, in general, much better than
gradient descent technique, especially for the range 250, 500, 600, and 1000.
In this work, we picked up the challenge to
improve versatile nonlinear mappings [1] employed in visualization, analysis,
classification [2,3], and localization [8, 10] with regard to mapping quality
as well as efficiency. The proposed novel approach employed PSO with Michigan
representation along with timedependent neighborhood function and weighting
[3], basically inspired by Kohonen’s SOM.
The algorithm was compared for several
benchmarks in visualization and the artificial localization of wireless sensor
data to the common gradient descent implementation and found to be competitive
either with regard to achievable mapping error, speed, or both.
In future work, we will refine the method,
employ it to more benchmarks in visualization as well as localization and
extend its use to classification. In particular, its application to
localization in sensor modalities other than RF ranging information and options
of distributed computation in WSN will be subject of study.
[1] J.W. Sammon, "A nonlinear mapping for
data structure analysis", IEEE Transactions on Computers, C18: 401409,
1969.
[2] A. König, "Interactive
visualization and analysis of hierarchical neural projections for data mining",
IEEE Transactions on Neural Networks, 11(3): 615624, 2000.
[3] A. König, "Dimensionality reduction
techniques for interactive visualization, exploratory data analysis, and
classification", Pattern Recognition in Soft Computing Paradigm, World
Scientific, FLSI Soft Computing Series, Nikhil R. Pal (eds.), 2: 137, 2001.
[4] C.J. Figueroa, P.A. Estévez, R.E.
Hernández, "Nonlinear mappings based on particle swarm optimization",
Proc. International Joint Conference on Neural Networks, 14871492, 2005.
[5] K. Iswandy, A. König, "Evolutionary
multidimensional scaling for data visualization and classification",
Application of Soft Computing: Recent Trends, Tiwari et al. (eds.), Springer,
177186, 2006.
[6] K.Iswandy, A. Koenig, T. Fricke, M.
Baumbach, A. Schuetze, "Towards automated configuration of multisensor
systems using evolutionary computation – a method and a case study",
Journal Comput. Theoret. Nanosci., 2:574582, 2005.
[7] K. Whitehouse, D.E. Culler, "A robustness
analysis of multihop rangingbased localization approximations", Proc. of
the fifth Int. Conf. on Information Processing in Sensor Networks, 2006.
[8] J. Kennedy, R.C. Eberhart, "Particle
swarm optimization", Proc. IEEE International Conference on Neural
Networks, IJCNN, 1995.
[9] L. Rao, K. Iswandy, A. König, "Cost
effective 3D localization for lowpower WSN based on modified Sammon’s stress
function with incomplete distance information", Online World Conf. on Soft
Computing in Industrial Applications, WSC14, 2009.
[10] A. Cervantes, I.M. Galván, P. Isasi, "AMPSO:
a new particle swarm method for nearest neighborhood classification", IEEE
Transactions on Systems, Man, and, Cybernetics – Part B: Cybernetics, 39(5):
10821091, 2009.
[11] Y. Shang, W. Ruml, Y. Zhang, M.P.J.
Fromherz, "Localization from mere connectivity", Proc. of the fourth ACM
Int. Symposium on Mobile Ad Hoc Networking and Computing (MobiHoc), 201212,
2003.
[12] S.W. Wilson, "Classifier fitness based
on accuracy", Evol. Comput., 3(2): 149175, 1995.

Kuncup Iswandy received the B.E. degrees from University of
Trisakti, Jakarta, Indonesia, in 1999 and M.Sc. degrees from University of
Kaiserslautern, Germany, in 2005, respectively. He is currently working
toward the Ph.D. degree in Institute of Integrated Sensor Systems, EIT,
University of Kaiserslautern, Germany. His current research focuses on
automated design of intelligent integrated sensor systems, wireless sensor
networks, pattern recognition, and evolutionary computation techniques. 



Andreas König received the diploma and Ph.D. degrees in
electrical engineering from TU Darmstadt Darmstadt, Germany, in 1990 and 1995
in Microelectronic Systems and Neural Networks, respectively. In 1995, he was
with FraunhoferInstitute for Information and Data processing IITB in
Karlsruhe in the Department Image and Signal Interpretation, Machine Vision
Group pursuing research in visual automated inspection and aerial/satellite
image evaluation. In 1996, he
became appointed Assistant Prof. (Hochschuldozent) for
Electronic Devices at TU Dresden pursuing research on massively parallel
architectures and analog integrated circuits for robust vision systems. After
a temporary assignment as head of the chair of computer engineering at TU
Chemnitz in 2001, he was appointed Professor and Head of Institute of
Integrated Sensor Systems, Dept. of Electrical and Computer Engineering, TU
Kaiserslautern, pursuing research on robust, adaptive/selfx multisensory
system design, integration, and application, e.g., in industrial vision,
measurement and instrumentation, home automation, Ambient Intelligence, and
wireless sensor networks. He founded and currently heads the IEEE CIS German
chapter and is Senior member of IEEE. He serves an editorial/advisory board
member for the Journals of Hybrid Intelligent Systems and
KnowledgeBasedEngineering and is member of the MIRLabs. He is author or
coauthor of more than 100 technical papers in the field of his research. 