# Clustering during Construction

JOptTourOptimizer uses a multi-algorithm solving strategy. Meaning, different algorithms are used in direct succession to solve a Tour-Optimization problem where the output of one algorithm is the input of the subsequent algorithm.

By default, in the first step, a deterministic construction algorithm is used to create an initial starting solution (working solution). Even though theoretically, it is not necessary to use a constructed starting solution, it helps to converge faster against a better solution in subsequently executed non-deterministic optimization algorithms.

Indeed, the expectable quality of the solution (utilizing the same processing power for crunching) is **highly** improved when using construction.

Here we would like to present you some examples of the output of our clustering-construction algorithm that was developed by DNA-Evolutions to precisely match our customers' requirements.

## Overview

## At a glance

The **HybridMultiDimensionalClusteringAlgorithm** is developed to fulfil the following requirements:

**Fast**: The runtime should be on a second, up to a low minute scale, even for big problem sets (> 1000 Elements).**Robust**: No matter how the input data looks, the construction result should be meaningful and provide a feasible solution.**Multi-dimensional**: Often, data clustering is done using a specific merit figure. In TourOptimization problems, the distance between elements is often used to define clusters. However,`WorkingHours`

of Resources and`OpeningHours`

of Nodes play a critical role to decide which Nodes belong to a cluster and which Resource (or collection of Resources) should own that cluster. Moreover, constraints can exclude a Resource from visiting a Node.**Dynamic**: As the user input purely defines the number of Resources and Nodes (and all their properties), it is impossible to determine the number of clusters a priori. Therefore, the generation of an ideal number of clusters and their population is left solely to the algorithm.

The basic idea of the **HybridMultiDimensionalClusteringAlgorithm** is a multi-stage attempt. In general, the algorithm uses strategies from different well-known clustering algorithms like K-Means, DBSCAN, Hierarchical Clustering, and others but tries to overcome some of their drawbacks with the goal in mind, to be most suitable for TourOptimization problems. For example, the algorithm should be able to handle single-, multi-depot and effective multi-depot problems.

Summarized working principle:

- Collect Resources and Nodes and calculate free path lengths between them to define an approximate local size and number of potential clusters.
- Let Resources collect Nodes in parallel and compete for Nodes acquired by multiple Resource clusters.
- Use a parallel multidimensional savings-like algorithm to re-adjust clusters and create an initial solution.
- Use disturbance-theory strategies to position elements that were not positionable so far.

(The internal name of **HybridMultiDimensionalClusteringAlgorithm** is **MultiAttemptParallelMultiDimensionalHybridSavingsAlgo** )

## Examples

In the following, we describe the result of the **HybridMultiDimensionalClusteringAlgorithm**. We do not use any further optimization stages besides the construction phase.

### Phyllotaxis - Example One

For dynamically creating a large number of elements, we use a phyllotaxis pattern. *Figure - Phyllotaxis - Overview* shows a setup of 1000 Nodes (yellow needles) and 40 Resources (blue needles). The node ids are defined as *N_0* to *N_1000*, and the resource ids are defined as *R_0* to *R_40*. Further, all Nodes (and all Resources) have the same stack of properties besides their individual position. Therefore we expect a pure distance driven result.

*Figure - Phyllotaxis - Overview* (Click to enlarge)

*Figure - Phyllotaxis - Zoom* (Click to enlarge)

When zooming in, it becomes evident that the result of the algorithm is convincing. The routes have approximately the same length, and further, they are not crossing each other, nor do they have in-route crossings.

Example

Please visit our GitHub-Page for the Phyllotaxis-Example. Note: You can only run the example when you have a valid license.

### Phyllotaxis - Example Two

For the following example, we modify the first example. All Nodes with an odd number (e.g. N_13, N_25 etc.) are open on the 7th of May (8 am to 10 pm), and all even-numbered nodes are open on the 6th of May. Equally, odd-numbered Resources are open on the 7th (even-numbered on the 6th).

We expect that the clustering algorithm is now also creating clusters and routes based on this changing environment (*Figure - Phyllotaxis - Odd and Even*).

*Figure - Phyllotaxis - Odd and Even* (Click to enlarge)

It is hard to understand why 'Figure - Phyllotaxis - Odd and Even' represents a good starting solution. Therefore, we look at the odd and even solution separately.

*Figure - Phyllotaxis - Odd* (Click to enlarge)

*Figure - Phyllotaxis - Even* (Click to enlarge)

The clustering algorithm created two strictly separated sub-populations of clusters. Each solution in itself shows no crossings and routes of the same length.

### City-to-City - Example Three

Besides the ability to solve inner-city multi-depot problems, the algorithm also needs to be able to solve single depot problems. In the following, we look at an effective-single depot problem covering Germany. Meaning, Resources are positioned far away from each other and Nodes are arranged in clusters (*Figure - Germany - Overview*).

*Figure - Germany - Overview* (Click to enlarge)

We expect that Resources are efficiently utilized to reduce the overall distance. For fun, you can try to solve the problem by hand before looking at the actual solution (*Figure - Germany - Overview - Solution*).

The algorithm shows its strength, for example, in the north area of the problem set. Usually, the city 'Eberswalde' would be assigned to the light green route, which would cause over time. Further, the west area also shows a nice clustering behaviour (around City *Essen*).

## Closing Words

The **HybridMultiDimensionalClusteringAlgorithm** construction algorithm is suitable for creating a feasible solution for different problem scenarios. Moreover, it became the default construction algorithm used in JOpt-TourOptimizer and is already heavily used worldwide in production. Indeed, it is also used to provide an approximate solution to end customers before doing any optional additional optimization steps.

## Authors

A product by dna-evolutions ©