Differences

Below are the differences between two revisions of the page.

--- projets:plim:20152016:gr6 [2015/11/22 15:57] – [User matching] chammam
+++ projets:plim:20152016:gr6 [2015/11/24 17:34] (current version) – [Machine learning and clustering] boussarsar
@@ Line 39: / Line 39: @@
 ==== Machine learning and clustering ====
-Based on trails information (distance and duration) the next step is to build a dataset, the database. K-Means is the supervised machine learning algorithm implemented in this solution to ensure clustering. There are three clusters which are: easyTrails, mediumTrails and hardTrails.
+Based on trail information (distance and duration), the next step is to build a dataset, the database. K-Means is the unsupervised machine learning algorithm, or simply clustering algorithm, implemented in this solution to group similar users together. There are three clusters: easyTrails, mediumTrails and hardTrails.
 The first step is to initialize the input, a two-dimensional array of distance and duration, choose an appropriate distance measure and fix the number of clusters, called k. The algorithm then randomly chooses k centroids and calculates the Euclidean distance between each centroid and all the other points. After initializing the clusters, the algorithm recomputes the centre of each cluster and takes it as the new centroid, using the intra-cluster distance (distance between points in the same cluster). It also calculates the inter-cluster distance (distance between the centroid of each cluster and points of other clusters) to identify which point belongs to which cluster. The system repeats these steps until the cluster centroids are stable or the iteration limit is reached.
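
The loop described above can be sketched in plain Python. This is a minimal illustration of standard K-Means, not the project's actual implementation. The function name `kmeans`, the helper `nearest`, the sample `trails` values (assumed to be kilometres and hours) and the rule that names clusters by increasing centroid distance are all assumptions made for the example.

```python
import math
import random

def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda c: math.dist(point, centroids[c]))

def kmeans(points, k=3, max_iter=100, seed=0):
    """Cluster 2-D (distance, duration) points with plain K-Means."""
    rng = random.Random(seed)
    # Pick k distinct data points as the initial centroids.
    centroids = [tuple(p) for p in rng.sample(points, k)]
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        labels = [nearest(p, centroids) for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                mean = tuple(sum(v) / len(members) for v in zip(*members))
                new_centroids.append(mean)
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # centroids are stable
        centroids = new_centroids
    # Final assignment so labels always match the returned centroids.
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels

# Hypothetical trails as (distance in km, duration in hours).
trails = [(2.0, 0.5), (2.5, 0.7), (3.0, 0.8),
          (8.0, 2.0), (9.0, 2.5), (8.5, 2.2),
          (20.0, 6.0), (22.0, 7.0), (21.0, 6.5)]
centroids, labels = kmeans(trails, k=3)

# Assumed naming rule: order clusters by centroid distance.
order = sorted(range(3), key=lambda c: centroids[c][0])
names = dict(zip(order, ["easyTrails", "mediumTrails", "hardTrails"]))
for trail, lab in zip(trails, labels):
    print(trail, names[lab])
```

Because the initial centroids are chosen at random, a different `seed` can lead to a different grouping. Running the function several times and keeping the result with the smallest total intra-cluster distance is a common way to reduce that sensitivity.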