Differences

Below are the differences between two revisions of the page.

--- projets:plim:20152016:gr6 [2015/11/22 11:22] – [User matching] chammam
+++ projets:plim:20152016:gr6 [2015/11/24 17:34] (current version) – [Machine learning and clustering] boussarsar
@@ Line 33: / Line 33: @@
 ===== How it works =====
-==== Data gathering (acquisition) ====
-  * GPS data (latitude, longitude) => distance
-  * time-stamps => duration
 ==== Data manipulation ====
-Measuring the length of the trail circuit and its total duration.
+After collecting the raw sensor data, a set of latitude and longitude pairs called Geopoints, the next step is to calculate the length of the trail circuit.
+To do so, the application's algorithm takes each pair of Geopoints as input, together with the unit of measure of the latitude and longitude, and returns the distance between the two points. To measure the trail's duration, the algorithm stores two times: the trail's start and end times, which are the system timestamps at the beginning and at the end of the trail. These steps run in a background task.
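The distance and duration steps above can be sketched as follows. This is a minimal illustration, not the application's actual code: it assumes that Geopoints are `(latitude, longitude)` pairs in degrees, that the trail length is the sum of the distances between consecutive Geopoints, and that the haversine (great-circle) formula is an acceptable distance measure. The names `haversine_m`, `trail_length_m` and `trail_duration_s` are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres (assumed value)


def haversine_m(p1, p2):
    """Great-circle distance in metres between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, p1)
    lat2, lon2 = map(math.radians, p2)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot.
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(min(1.0, a)))


def trail_length_m(geopoints):
    """Sum the distances between consecutive Geopoints (assumed ordering)."""
    return sum(haversine_m(a, b) for a, b in zip(geopoints, geopoints[1:]))


def trail_duration_s(start_ts, end_ts):
    """Duration in seconds from two system timestamps (seconds since epoch)."""
    return end_ts - start_ts
```

For example, `trail_length_m([(0, 0), (0, 1), (0, 2)])` adds two segments of one degree of longitude along the equator, each roughly 111 km long.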
+==== Machine learning and clustering ====
-==== Clustering ====
+The next step is to build a dataset (the database) from the trail information (distance and duration). K-Means, an unsupervised machine learning algorithm (more simply, a clustering algorithm), is implemented in this solution to group similar trails, and therefore similar users, together. There are three clusters: easyTrails, mediumTrails and hardTrails.
+The first step is to initialize the input, a two-dimensional array of distance and duration values, to choose the appropriate distance measure and to fix the number of clusters, called k. The algorithm then randomly chooses k centroids and calculates the Euclidean distance between each centroid and all the other points. Once the clusters are initialized, the algorithm uses the intra-cluster distance (the distance between points in the same cluster) to find the centre of each cluster, which becomes its new centroid. It also uses the inter-cluster distance (the distance between the centroid of each cluster and the points of other clusters) to identify which point belongs to which cluster. The system repeats these instructions until the cluster centroids are stable or the iteration limit is reached.
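The loop described above can be sketched with the standard K-Means iteration (assign each point to its nearest centroid, then move each centroid to the mean of its points). This is an illustrative sketch under assumptions, not the project's implementation: the function name `kmeans`, the `seed` parameter and the handling of empty clusters are hypothetical, and the input is assumed to be a list of `(distance, duration)` tuples.

```python
import math
import random


def assign(points, centroids):
    """Label each point with the index of its nearest centroid (Euclidean)."""
    return [min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points]


def kmeans(points, k=3, max_iter=100, seed=0):
    """Cluster 2-D points; returns (centroids, labels)."""
    rng = random.Random(seed)          # seeded only to make runs repeatable
    centroids = rng.sample(points, k)  # random initial centroids
    for _ in range(max_iter):
        labels = assign(points, centroids)
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                # New centroid is the mean of the cluster's points.
                new_centroids.append(
                    tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                # Assumption: an empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:  # centroids are stable: stop
            break
        centroids = new_centroids
    return centroids, assign(points, centroids)
```

Because distance and duration are on different scales, the larger-valued feature dominates the Euclidean distance; normalising both columns before clustering is a common precaution, although the source does not say whether the application does so.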
-Clustering every "trail" into one of three different classes: Easy, Medium and Hard.
+==== User matching ====
-==== User matching ====
-When a user searches for a hiking buddy, the app will look for another person with similar activities (has a number of trails within the same cluster).
+The main goal of this application is to let users find trail buddies based on the similarity of their hiking activities. The user matching algorithm is quite simple: it compares the connected user with all the other users of the application.
+The comparison mechanism is based on two conditions: two trail buddies are similar if their numbers of trails differ by at most 5% and if their percentages of trails per cluster differ by at most 5%. The resulting set of similar trail buddies is then listed to the connected user, along with their phone numbers, so that the user can contact them and organise a trail together.
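The two matching conditions can be sketched as follows. This is one possible reading of the rule, stated as an assumption: trail counts may differ by at most 5% of the larger count, and each cluster's share of a user's trails may differ by at most 5 percentage points. The names `cluster_shares`, `is_similar` and `find_buddies` are hypothetical, and each user is represented simply as a list of cluster labels, one per trail.

```python
CLUSTERS = ("easyTrails", "mediumTrails", "hardTrails")
TOLERANCE = 0.05  # the 5% threshold from the text


def cluster_shares(trail_labels):
    """Fraction of a user's trails that falls in each cluster."""
    total = len(trail_labels)
    return {c: (trail_labels.count(c) / total if total else 0.0)
            for c in CLUSTERS}


def is_similar(user_a, user_b, tol=TOLERANCE):
    """Apply both conditions: similar trail count and similar cluster shares."""
    n_a, n_b = len(user_a), len(user_b)
    # Condition 1: trail counts within 5% of the larger count (assumed reading).
    if abs(n_a - n_b) > tol * max(n_a, n_b):
        return False
    # Condition 2: per-cluster shares within 5 percentage points.
    s_a, s_b = cluster_shares(user_a), cluster_shares(user_b)
    return all(abs(s_a[c] - s_b[c]) <= tol for c in CLUSTERS)


def find_buddies(connected_user, other_users):
    """Return the names of all users similar to the connected user."""
    return [name for name, trails in other_users.items()
            if is_similar(connected_user, trails)]
```

In this sketch, `other_users` is a dictionary mapping a user name to that user's list of trail labels; a real listing would also carry the phone number mentioned above.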
 === Developed SOFTWARE ===