Modelling and predicting early dropouts in a crowdsensing app

Schleicher, Miro (2018) Modelling and predicting early dropouts in a crowdsensing app. Masters thesis, Otto-von-Guericke-University Magdeburg.

Today, most people in the western world own smart mobile devices. This opens up opportunities for research to expand its traditional tools. An emerging trend is mobile crowd sensing, in which a large group of people use the capabilities of their devices (e.g. sensors) to collect and share data, producing very large data sets. This technique is increasingly used even in the medical field. One example is the TrackYourTinnitus (TYT) app, with which tinnitus patients can monitor their condition while the data is simultaneously made available for research. Tinnitus is the phantom perception of sound and a neuropsychiatric disorder. Participation helps participants to cope with their tinnitus, and it helps researchers by providing valuable data that is otherwise difficult to capture in clinical trials due to the particular characteristics of the disease (e.g. high variability). However, some people stop using the app at some point. Some pause for a while, others stop for good, and a few stop after a very short time. The intended behavior is that the app is used every day. This raises the question of whether early dropouts can be modelled and predicted. This thesis presents a method to model and predict early dropouts in a crowd sensing app. For this purpose, a way to define early dropouts is proposed, and this definition is applied to a data set. Subsequently, various classification and clustering methods are presented to make the corresponding predictions. For the TYT data set, "early" turns out to mean the first 10 days. According to the definition, each participant is labeled with one of four classes, two of which cover dropouts and two of which cover non-dropouts. Static data assessed prior to the usage of the app (e.g. personal data or the tinnitus assessment catalog Mini-TQ) turns out not to be predictive of early dropout.
The algorithms Rotation Forest, Random Forest, C4.5 and Shapelet Transform, trained on time series with a maximum length of 10 days, achieve a mean accuracy of >= 75% in predicting the classes. The thesis also shows that the clustering algorithm “Agglomerative Hierarchical Clustering”, used with Ward’s method as the linkage method, can produce clustering results with a silhouette coefficient of 0.7964. The algorithm is applied to a distance matrix created from the users' univariate time series for the assessment question about tinnitus consciousness. The distance matrix was created with dynamic time warping, which turns out to be a better distance measure than the Fréchet distance for this data set. In conclusion, it is possible to model early dropouts and to predict them using the information from the first 10 days of assessment.
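
The distance and evaluation steps named in the abstract can be illustrated with a minimal sketch. This is not the thesis code: the function names, the toy series and the cluster labels below are hypothetical, the point-wise cost is assumed to be the absolute difference, and the Ward-linkage clustering step itself is omitted (the labels are simply given). The sketch only shows how a dynamic time warping distance matrix over univariate series of unequal length can be built and how a silhouette coefficient can be computed from it.

```python
from math import inf


def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences.

    Assumption: point-wise cost is the absolute difference, no warping window.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def distance_matrix(series):
    """Symmetric pairwise DTW distance matrix for a list of series."""
    k = len(series)
    dist = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            dist[i][j] = dist[j][i] = dtw_distance(series[i], series[j])
    return dist


def silhouette(dist, labels):
    """Mean silhouette coefficient from a precomputed distance matrix.

    Requires at least two distinct cluster labels.
    """
    scores = []
    for i, own_label in enumerate(labels):
        own = [dist[i][j] for j, lab in enumerate(labels)
               if lab == own_label and j != i]
        if not own:
            scores.append(0.0)  # convention: singleton clusters score 0
            continue
        a = sum(own) / len(own)  # mean distance within the own cluster
        b = min(                 # mean distance to the nearest other cluster
            sum(dist[i][j] for j, lab in enumerate(labels) if lab == other)
            / labels.count(other)
            for other in set(labels) if other != own_label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: four short daily rating series of unequal length.
toy_series = [[1, 1, 2, 2], [1, 2, 2], [8, 9, 9, 8], [9, 9, 8]]
toy_labels = [0, 0, 1, 1]  # assumed cluster assignment, not computed here
toy_score = silhouette(distance_matrix(toy_series), toy_labels)
```

A value of `toy_score` close to 1 indicates compact, well-separated clusters. In a fuller pipeline the labels would come from a hierarchical clustering run on the same distance matrix rather than being supplied by hand.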

Item Type: Thesis (Masters)
Subjects: DBIS Research > Master and Phd-Thesis
ID Code: 1703
Deposited By: Ruediger Pryss
Deposited On: 16 Nov 2018 15:22
Last Modified: 16 Nov 2018 15:22
