
Getting the most visited geographical places in a typical week

Group members

Ben Aicha Assma, IHM, benaicha@polytech.unice.fr
El Amin Moustafa, IAM, moustafa.o.elamin@gmail.com
Zayani Amal, IAM, amal.zayani@esprit.tn

Equipment

Mobile: HTC 8S
Personal device: no
IMEI: 3537600589022026

Summary

I. Description of the project

II. Preliminary study

a. Data model
b. Architecture of the project

III. GPS sensor Module

a. Main idea
b. GPS APIs
c. Collecting data

IV. Classification Module

a. K-means classification
b. Sequential classification

V. Software packages & User Interface

VI. Further improvements

a. Battery optimization
b. Classification
c. Data Storage

I. Description of the Project

This application identifies the geographical places that a given user visits most often during a week.

To do this, we use an HTC 8S phone, which has a GPS sensor.
The application regularly (for example, every hour) records the user's position together with the time and date.

Then, by classifying these records, we determine the places the user visits most often and build a typical week from them.

The typical week lets us predict the user's location. We can also imagine the application identifying a user's trends over a season.

Scenario

Combined with an intelligent agenda, this application can help plan the user's TODO list according to his geographic habits.

In addition, it can optimize the user's trips and time. To illustrate the feasibility of our application, we wrote a short scenario.

"I have to go to FNAC in Nice this week to bring my new headphones, between 9h and 19h30. My intelligent calendar relies on a module (this application) to propose the best time to go there. Our application detects that, according to my habits, I usually go to Nice each Wednesday between 14h and 18h.

So my calendar suggests that I go to FNAC Nice on Wednesday at 18h30, after finishing my usual task."

II. Preliminary study

a. Data model

To show how we model and manage our data, we have drawn up a data model.

b. Architecture of the project

Our project is based on an MVC architecture, with a model, a controller and a view for collecting data. The GPS module runs asynchronously: when it collects a new data item, it notifies the controller by raising an event, so that the controller can add the new data.

The Classification Module is designed as a service: it takes data as input and returns classified data. The classification is divided into two parts: K-means and sequential classification.

This classification is performed once every week, at the start of the app or at the user's request. We do this to avoid generating a meaningless typical week updated from only one day or one time slot.

Data is stored in a file in JSON format, loaded when the app starts and saved when the app closes.
To save, we use the local folder: all read/write I/O operations are restricted to the local folder, using isolated storage settings. Data serialization follows the lifecycle of the app: when the app is launched or reactivated, data is deserialized from isolated storage; when the app is tombstoned or terminated, data is serialized to isolated storage. Serialization uses DataContractJsonSerializer. More information on the data storage process in Windows Phone 8 can be found here.
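As an illustration of the serialization step, here is a minimal sketch using DataContractJsonSerializer. The SlotRecord and SlotStore names are hypothetical and the real data model may differ. The sketch works on any seekable stream so that it can run outside the phone; on the device the stream would come from isolated storage.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Json;

// Hypothetical record; the project's own data model may differ.
[DataContract]
public class SlotRecord
{
    [DataMember] public double Latitude { get; set; }
    [DataMember] public double Longitude { get; set; }
    [DataMember] public DateTime Start { get; set; }
    [DataMember] public DateTime End { get; set; }
}

public static class SlotStore
{
    // Writes the list as JSON to any writable stream
    // (on the phone: an isolated storage file stream).
    public static void Save(Stream output, List<SlotRecord> slots)
    {
        var serializer = new DataContractJsonSerializer(typeof(List<SlotRecord>));
        serializer.WriteObject(output, slots);
    }

    // Reads the list back; an empty (seekable) stream yields an empty list.
    public static List<SlotRecord> Load(Stream input)
    {
        if (input.Length == 0) return new List<SlotRecord>();
        var serializer = new DataContractJsonSerializer(typeof(List<SlotRecord>));
        return (List<SlotRecord>)serializer.ReadObject(input);
    }
}
```

Save would typically be called when the app is tombstoned or terminated, and Load when it is launched or reactivated.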

The user interface contains three tabs:

- A home tab, where global information such as the date and location is displayed.
- A tab that shows the typical week, day by day.
- A last tab that lists the most popular places visited during the week.

Our architecture is summarized in this diagram:

If the diagram is not correctly displayed, click here.

III. GPS sensor module

a. Main idea

The GPS (Global Positioning System) sensor lets a smartphone determine its user's location. A group of classes exposes the Windows Phone location service, and therefore the phone's position, to an application.
This technology, known as "GPS" coordinates or "GPS" location, may also rely on an Internet connection, and provides the latitude, longitude, altitude, accuracy and other information.
Windows.Devices.Geolocation provides the location service, which gives the phone's current longitude, latitude and speed of travel. However, this service provides only basic information about a position, without further details. That is why we also use Microsoft.Phone.Maps.Services to get more details about the target position.

b. GPS APIs

In our project, we import the following namespaces to use the device's GPS:
- using Microsoft.Phone.Controls;
- using System.IO.IsolatedStorage;
- using Windows.Devices.Geolocation;
- using Microsoft.Phone.Maps.Services;

c. Collecting data

To collect location data, we use the Geolocator class. A Geolocator object returns a Geoposition object, which provides the latitude and longitude:

position = new Geolocator();
// Arguments: maximum age of a cached position, then timeout
Geoposition geo = await position.GetGeopositionAsync(TimeSpan.FromMinutes(5), TimeSpan.FromSeconds(00));
double x = geo.Coordinate.Latitude;
double y = geo.Coordinate.Longitude;

To get additional information, we use a MapAddress object in this way:

MapAddress address = e.Result[0].Information.Address;
this.district = address.District;

First, during initialization in the constructor of the GPS module, we check that the user has given permission to access the GPS:

(bool)IsolatedStorageSettings.ApplicationSettings["LocationConsent"] == true

After the initialization step, we start tracking the location with the following method:

public async void TrackLocation()

In this method, we configure what we expect from the GPS. First, the accuracy is set to high in order to get a precise location. Then we set a movement threshold of 50 meters.
After that, two handlers do the work: "position_StatusChanged" is called when the status of the location service changes, and "position_PositionChanged" is called with each new location, so that we can update the current position.

position_StatusChanged(Geolocator sender, StatusChangedEventArgs args)

This method reports the different statuses of the tracking, such as initializing, ready, permission to access location ...

position_PositionChanged(Geolocator sender, PositionChangedEventArgs args)

For each new data item, we define a slot time ("slotime"). When we get a new location, a new slot time starts, with the new latitude and longitude and a start date set to the current time (DateTime.Now).
When another location is detected, we close the current slot time with an end time, also set to the current time, and start a new slot time. A slot time shorter than 15 minutes is not taken into account. When a slot time is closed, we raise an event to notify the controller that new data is available:

// Event declaration
public class GetNewLocationEventArg : System.EventArgs { }
public delegate void GetNewLocationEventHandler(object sender, GetNewLocationEventArg e);
public event GetNewLocationEventHandler UpdateLocation;

// Raises UpdateLocation so that subscribers (the controller) are notified
protected virtual void OnUpdateLocation(GetNewLocationEventArg e)
{
    GetNewLocationEventHandler handler = UpdateLocation;
    if (handler != null) handler(this, e);
}

// When a slot time is closed, we call OnUpdateLocation
OnUpdateLocation(new GetNewLocationEventArg());

Then the controller will collect the new data.
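The slot time logic described above can be sketched as follows. This is a simplified illustration, not the project's code: the Slot and SlotTracker names are hypothetical, the event uses a plain Action instead of the custom delegate, and the current time is passed in as a parameter so that the logic can be exercised without a GPS.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical slot model: one stay at one location.
public class Slot
{
    public double Latitude, Longitude;
    public DateTime Start, End;
}

public class SlotTracker
{
    static readonly TimeSpan MinDuration = TimeSpan.FromMinutes(15);
    Slot current;

    // Raised each time a slot of at least 15 minutes is closed.
    public event Action<Slot> SlotClosed;

    // To be called on every position change reported by the GPS module.
    public void OnNewLocation(double latitude, double longitude, DateTime now)
    {
        if (current != null)
        {
            current.End = now;
            // Slots shorter than 15 minutes are dropped without notification.
            if (current.End - current.Start >= MinDuration && SlotClosed != null)
                SlotClosed(current);
        }
        // The new location always opens a new slot.
        current = new Slot { Latitude = latitude, Longitude = longitude, Start = now };
    }
}
```

A controller would subscribe to SlotClosed and add each received slot to the model.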

IV. Classification Module

a. K-means classification

Before using the clustering algorithm, it is important to define the following:

- Classification: supervised vs. unsupervised

- Clustering and the K-means method

- Clustering of areas through K-means

a.1. Supervised vs. unsupervised classification

There are two ways to classify items into different categories:

- Supervised Classification
Supervised classification methods are based on user-defined classes and corresponding representative sample sets. These sample sets are specified by training data sets, which must be created before entering the automatic classification process. Supervised classification methods include Minimum Distance to Mean, Maximum Likelihood, etc.

- Unsupervised Classification
Unsupervised classification methods are algorithms that analyze and classify a large number of raster cells. The entire input raster set is processed, and a classification rule is used to assign each raster cell to one of the defined classes. Unsupervised classification methods include Simple One-Pass Clustering, K Means and Fuzzy C Means.

a.2. Clustering and K-means method

Data clustering is the process of assigning an object to a class according to its characteristics A1 ... An. The most common technique for clustering numeric data is the k-means algorithm.
K-means is an unsupervised classification method. It is used to find groups of objects that are similar (or related) to one another and different from (or unrelated to) the objects in other groups.

[1]
If the image is not correctly displayed, click here.

a.3. General description of K-means:

k-means clustering is a method of classifying items into k groups. Its main characteristics are:
- Each point is assigned to the cluster with the closest centroid
- A centroid is "the center of mass of a geometric object of uniform density".[2]
- The number of clusters K must be specified
- The grouping is done by minimizing the sum of squared distances (Euclidean distances) between items and the corresponding centroid (center point)

a.4. Algorithm

The basic algorithm is very simple. In pseudo-code, k-means is: [3]

initialize clustering
loop until done
    compute mean of each cluster
    update clustering based on new means
end loop
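The pseudo-code above can be turned into a compact C# sketch. This is an illustration under simplifying assumptions, not the project's KMeans class: the KMeansSketch name is hypothetical, the initial clustering is a simple round-robin assignment instead of a random one, the data is not normalized, and the loop simply stops if a cluster becomes empty.

```csharp
using System;

public static class KMeansSketch
{
    // Returns, for each row of data, the index (0..k-1) of its cluster.
    // Assumes data.Length >= k and rows of equal length.
    public static int[] Cluster(double[][] data, int k, int maxIterations = 100)
    {
        int n = data.Length, dim = data[0].Length;
        int[] assignment = new int[n];
        // Initialize clustering: round-robin, so no cluster starts empty.
        for (int i = 0; i < n; i++) assignment[i] = i % k;

        double[][] means = new double[k][];
        for (int iter = 0; iter < maxIterations; iter++)
        {
            // Compute the mean of each cluster.
            int[] counts = new int[k];
            for (int c = 0; c < k; c++) means[c] = new double[dim];
            for (int i = 0; i < n; i++)
            {
                counts[assignment[i]]++;
                for (int d = 0; d < dim; d++) means[assignment[i]][d] += data[i][d];
            }
            for (int c = 0; c < k; c++)
            {
                if (counts[c] == 0) return assignment; // empty cluster: stop here
                for (int d = 0; d < dim; d++) means[c][d] /= counts[c];
            }

            // Update clustering: move each point to the closest mean.
            bool changed = false;
            for (int i = 0; i < n; i++)
            {
                int best = 0;
                for (int c = 1; c < k; c++)
                    if (SquaredDistance(data[i], means[c]) < SquaredDistance(data[i], means[best]))
                        best = c;
                if (best != assignment[i]) { assignment[i] = best; changed = true; }
            }
            if (!changed) break; // no point moved: converged
        }
        return assignment;
    }

    // Squared Euclidean distance between two points of the same dimension.
    static double SquaredDistance(double[] a, double[] b)
    {
        double sum = 0;
        for (int d = 0; d < a.Length; d++) sum += (a[d] - b[d]) * (a[d] - b[d]);
        return sum;
    }
}
```

With latitude and longitude as the two columns of data, points recorded around the same place end up with the same cluster index.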

a.5. K-means clustering of visited areas:

These clustering techniques can be used in different domains and applications. We use this method to group the areas we visit and to rank them by how frequently they are visited during the week. To apply k-means in our project, we implemented the sample code, with all the methods we need, in a KMeans class.

public class KMeans

The Cluster method takes as parameters the data collected from the sensor and the number of clusters that we want. It returns an array that gives, for each data item, the index of the cluster it belongs to.

public int[] Cluster(double[][] rawData, int numClusters)

This method relies on several helper methods:

private double[][] Normalized(double[][] rawData)
private int[] InitClustering(int numTuples, int numClusters, int randomSeed)
private bool UpdateMeans(double[][] data, int[] clustering, double[][] means)
private bool UpdateClustering(double[][] data, int[] clustering, double[][] means)
private double Distance(double[] tuple, double[] mean)
private int MinIndex(double[] distances)

To display the clusters, we can use:

ShowClustered(rawData, clustering, numClusters, 1);

b. Sequential classification

This part of the classification aims to build a typical day; a typical week is composed of 5 typical days.
First of all, we group slot times by day of the week: we give a list of slot times and the desired day as input parameters, and the following method returns the list of slot times corresponding to this day:

- public List<Slotime> slotimeForOneday(List<Slotime> brutSlotimes, DayOfWeek day){...}
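A minimal sketch of this filtering step, assuming a hypothetical RawSlot type with a start date (the project's Slotime class may differ):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical slot model; only the start date matters for this step.
public class RawSlot
{
    public string Location;
    public DateTime Start, End;
}

public static class DayFilter
{
    // Keeps only the slot times that started on the requested day of the week.
    public static List<RawSlot> ForOneDay(IEnumerable<RawSlot> slots, DayOfWeek day)
    {
        return slots.Where(s => s.Start.DayOfWeek == day).ToList();
    }
}
```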

After that, starting from the list of slot times for one day of the week, we group similar slot times and compute their frequencies.
Two slot times are considered similar according to three parameters: start time, end time and location. To simplify, we allow a margin of error of about 15 minutes.
We pass in a list of slot times for a given day of the week, and we get back a list in which similar slot times are grouped and each distinct slot time has a frequency.

- public List<Slotime> getSlotimeSortedByFrequence(List<Slotime> list){...}
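The grouping can be sketched as below. The DaySlot type, the SlotGrouping class and the choice of comparing each slot with the first member of its group are assumptions made for the illustration; the project's own method may group differently.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical slot model; the project's Slotime class may differ.
public class DaySlot
{
    public string Location;
    public TimeSpan Start, End;   // time of day
    public int Frequency = 1;
}

public static class SlotGrouping
{
    static readonly TimeSpan Margin = TimeSpan.FromMinutes(15);

    // Same location, and start and end times within the 15-minute margin.
    static bool Similar(DaySlot a, DaySlot b)
    {
        return a.Location == b.Location
            && (a.Start - b.Start).Duration() <= Margin
            && (a.End - b.End).Duration() <= Margin;
    }

    // Merges similar slots and returns one representative per group,
    // most frequent first.
    public static List<DaySlot> SortedByFrequency(IEnumerable<DaySlot> slots)
    {
        var groups = new List<DaySlot>();
        foreach (var slot in slots)
        {
            var match = groups.FirstOrDefault(g => Similar(g, slot));
            if (match != null) match.Frequency++;
            else groups.Add(new DaySlot { Location = slot.Location, Start = slot.Start, End = slot.End });
        }
        return groups.OrderByDescending(g => g.Frequency).ToList();
    }
}
```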

Finally, we build a list of typical slots from a list of slot times. The list we provide must already have been processed by the previous methods.
The following method gives the location calendar for one day of the week, in slots of 15 minutes. For every 15-minute slot between 8 am and 7 pm, we assign the location of the slot time with the highest frequency during this period.

- public List<TypicalSlot> getTypicalSlotOfDay(List<Slotime> listSorted){...}
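As a sketch of this last step, assuming hypothetical FrequentSlot and TypicalDay types (the project's Slotime and TypicalSlot classes may differ), each 15-minute slot receives the location of the most frequent slot time that covers it:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type; stands in for a slot time with its frequency.
public class FrequentSlot
{
    public string Location;
    public TimeSpan Start, End;   // time of day
    public int Frequency;
}

public static class TypicalDay
{
    // Maps the start of each 15-minute slot between 08:00 and 19:00 to the
    // location of the most frequent slot time covering it (null if none).
    public static SortedDictionary<TimeSpan, string> Build(List<FrequentSlot> slots)
    {
        var step = TimeSpan.FromMinutes(15);
        var day = new SortedDictionary<TimeSpan, string>();
        for (var t = TimeSpan.FromHours(8); t < TimeSpan.FromHours(19); t += step)
        {
            var best = slots
                .Where(s => s.Start <= t && t < s.End)
                .OrderByDescending(s => s.Frequency)
                .FirstOrDefault();
            day[t] = best == null ? null : best.Location;
        }
        return day;
    }
}
```

Slots that no slot time covers are left without a location, which a user interface could display as "unknown".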

V. Software packages & User Interface

Software packages

Download project from GoogleDrive

Readme

User Interface

The interface of our application "Getting the most visited geographical places in a typical week" is designed to be simple, so that the user can quickly retrieve the desired information.

While describing the user interface, we briefly present the purpose of each functionality.

When the application starts, the user has to allow it to access the phone's location, so that the GPS can be used while the service is running.

If the image is not correctly displayed, click here.

We then reach the main page.
As shown below, the main page has three buttons (Home, Week and Locations), one for each functionality.

If the image is not correctly displayed, click here.

First, "Home" shows the time and date and the user's current location.
Second, the "Locations" button shows a ranking of the geographical places most visited by the phone owner in a typical week.

If the image is not correctly displayed, click here.

Then, the user can get more details about the places he usually visits on a specific day. For example, each Monday morning, Joe visits his grandmother in Nice. Using this kind of information, Joe can plan ahead if he wants to do a specific task in Nice afterwards.

If the image is not correctly displayed, click here.

Note: In the future, we plan to extend our application with a more attractive user interface and a more animated design.

VI. Further improvements

a. Battery optimization

This demo offers useful functionalities to the phone holder, such as getting his typical week and, more precisely, his typical day, based on his past visits over a certain period of time.
Our goal now is to improve the existing application by changing existing methods or creating new ones to make its processes more efficient.
One of our concerns is battery consumption. We could not make our application work in background mode, since this option is not available on the HTC 8S. As a result, the application has to stay open all day to build a complete typical week, at the cost of heavy battery consumption.
To fix this problem, we have to implement a background mode, because many WP8 devices can run applications in the background. Location tracking should also run only during useful hours, that is, between 8 am and 7 pm.

b. Classification Improvements

To improve our classification, we can use K-means for the whole classification process and move towards machine learning. For this, our classification has to be divided into several steps.

We can proceed as follows:

1. Get the slot times by day of the week (Monday, Tuesday, Wednesday, Thursday, Friday).
2. Get the clusters for each day, using X, Y and the start time (DateTime in minutes from 00h00): Cluster(x,y,startTime)
3. For every cluster obtained in the previous step, get new clusters using the end time: Cluster(x,y,endTime)
4. Define the most popular slot times with their frequency, by counting the number of values in every cluster obtained.
5. Define the typical day by keeping the slot times with the highest frequency.
6. Assign every new data item to the nearest cluster, or, when a cluster becomes too large, divide it into two clusters.

c. Data Storage

The main information that we have to store is the slot times, using an SQLite database. As a further optimization, we can save the clusters and reload them instead of running the classification each time.

There are two types of database structure:

- "Flat file": all the information about the customer is contained in the same file, which can be of variable length
- Relational: the information about the customer is contained in several files linked by a common "key", for example the customer number

We can then store the data coming from the sensor in a database that respects our data model, and implement methods to insert, delete and fetch data.

To use the database in our project, we use "DataContext", which requires the following namespace:
Namespace: System.Data.Linq [4]

- To create our database:

public static string DBConnectionString = @"isostore:/Databases.sdf";
public DataDataContext(string connectionString)

- To connect to the database:

using (DataDataContext context = new DataDataContext(DataDataContext.DBConnectionString))

- To add items:

context.datas.InsertOnSubmit(d);
context.SubmitChanges();

- To fetch data:

ToList()

References

[]GPS Sensor
[1] http://www.cs.uky.edu/~jzhang/CS689/PPDM-Chapter3.pdf
[2] http://en.wikipedia.org/wiki/Centroid
[3] http://visualstudiomagazine.com/articles/2013/12/01/k-means-data-clustering-using-c.aspx
[4] http://msdn.microsoft.com/en-us/library/system.data.linq.datacontext%28v=vs.90%29.aspx
K-means