Skype: franck.dernoncourt
MIT, Cambridge, USA

profile for Franck Dernoncourt on Stack Exchange, a network of free, community-driven Q&A sites


Google scholar profile

TrackMania is intractable [ENG] 2014
beatDB: A Large Scale Waveform Feature Repository [ENG] 2013
MoocViz: A Large Scale, Open Access, Collaborative, Data Analytics Platform for MOOCs [ENG] 2013
MOOC En Images (MIT Technical Report) [ENG] 2013
MOOCdb: Developing Standards and Systems for MOOC Data Science (MIT Technical Report) [ENG] 2013
MOOCdb: Developing Data Standards for MOOC Data Science (MOOCshop paper) [ENG] 2013
Machine Learning Algorithms for In-Database Analytics [ENG] 2013
Efficient training set use for blood pressure prediction in a large scale learning classifier system [ENG] 2013
Replacing the computer mouse [ENG] 2012
Artificial Intelligence: why should firms care? [ENG] 2012
Of the use of natural dialogue to hide MCQs in serious games [ENG/FR] 2012
Designing an intelligent dialogue system for serious games [ENG/FR] 2012
The medial Reticular Formation (mRF): a neural substrate for action selection? An evaluation via evolutionary computation. [ENG/FR] 2011
Fuzzy logic: introducing human reasoning within decision support systems? [ENG/FR] 2011
Fuzzy logic: between human reasoning and artificial intelligence [ENG/FR] 2011
Presentation on the Motion-Induced Blindness (MIB) phenomenon [ENG/FR] 2011
Presentation on the paper Automated Variable Weighting in k-Means Type Clustering [ENG/FR] 2010
Prediction of the water inflow to a lake [FR] 2010

Abstract: We prove that completing an untimed, unbounded track in TrackMania Nations Forever is NP-complete by using a reduction from 3-SAT and showing that a solution can be checked in polynomial time.
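The polynomial-time check is what places the problem in NP. As a minimal sketch of the same idea on 3-SAT, the problem used in the reduction, a candidate assignment can be verified in a single pass over the clauses. This is not the paper's track verifier: the clause encoding and the function name `satisfies` are assumptions made purely for illustration.

```python
def satisfies(clauses, assignment):
    """Return True if `assignment` satisfies every clause.

    clauses: list of 3-tuples of non-zero ints; k means variable k,
             -k means its negation (hypothetical encoding).
    assignment: dict mapping variable index -> bool.
    Runs in time linear in the number of literals.
    """
    for clause in clauses:
        # A clause holds if at least one of its literals is true.
        if not any(assignment[abs(lit)] == (lit > 0) for lit in clause):
            return False
    return True


# (x1 or not x2 or x3) and (not x1 or x2 or x3)
formula = [(1, -2, 3), (-1, 2, 3)]
print(satisfies(formula, {1: True, 2: True, 3: False}))
print(satisfies(formula, {1: True, 2: False, 3: False}))
```

The first assignment satisfies both clauses, while the second violates the second clause.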

Download paper
TrackMania is intractable
Abstract: A great majority of the effort is spent assembling the data and formulating the features, while, rather ironically, the model building exercise takes relatively less time. beatDB aims at radically shrinking the time of large scale investigations by judiciously pre-computing beat features which are likely to be frequently used. In this poster we present the structure of beatDB and use it for a concrete research study: predicting acute hypotensive events from blood pressure.

Download report Download Poster
Overview of MOOCEnImages
Abstract: In this paper we present an open access large scale analytics platform that helps researchers analyze MOOC data from multiple platforms without the need to share the data. It allows researchers to share scripts and effort and to compare results, and it attempts to engage the community to achieve shared educational science goals. The platform utilizes some well known tools and packages and provides multiple levels of access to address a wide variety of needs around the data. We demonstrate the platform's capability by analyzing data from two MOOCs, one from Coursera (offered by Stanford University) and one from edX (offered by MITx). This is the first time two courses from two platforms have been jointly analyzed. The analysis and the platform are made possible by the joint adoption of a data model called MOOCdb.

Download report Download Poster
Overview of MOOCEnImages
Abstract: This report provides a view into different descriptive statistics extracted from the data recorded during 6.002x, the first course offered by MITx. We have developed a generalizable analytics framework, and this report demonstrates the use of this framework. This is a working document, and we are expanding its scope as we add analytical tools and interfaces to our framework.

Download report
Overview of MOOCEnImages
This MIT Technical Report is an extended version of the MOOCshop paper.

Abstract: The intent of this document is to enable the development of data standards for MOOCs and to build enabling technology. This document will be updated from time to time with feedback from the community as well as from our internal development process.

Download report
Overview of MOOCdb
Abstract: The intent of this article is to propose data standards for MOOCs. Our team has been conducting research related to mining information, building models, and interpreting data from the inaugural course offered by edX, 6.002x: Circuits and Electronics, since the Fall of 2012. This involves a set of steps, undertaken in most data science studies, which entail positing a hypothesis, assembling data and features (aka properties, covariates, explanatory variables, decision variables), identifying response variables, building a statistical model, then validating, inspecting and interpreting the model. In our domain, and others like it that require behavioral analyses of an online setting, a great majority of the effort (in our case approximately 70%) is spent assembling the data and formulating the features, while, rather ironically, the model building exercise takes relatively less time. As we advance to analyzing cross-course data, it has become apparent that our algorithms which deal with data assembly and feature engineering lack cross-course generality. This is not a fault of our software design. The lack of generality reflects the diverse ad hoc data schemas we have adopted for each course. These schemas result partly from the fact that some of the courses are being offered for the first time and that it is the first time behavioral data has been collected. They also arise from initial investigations taking a local perspective on each course rather than a global one extending across multiple courses.

Download report Download Presentation Download BibTeX
Overview of MOOCdb
Abstract: Our project focused on extending the functionality of MADlib. MADlib is an open source machine learning and statistics library which works with Postgres or Greenplum to provide in-database analytics. Although some machine learning algorithms have been implemented in MADlib, there is room for additional contributions. We have implemented two different machine learning algorithms, symbolic regression with genetic programming and adaptive boosting for MADlib, and are in the process of contributing our code to the MADlib community codebase. We have also assessed the performance of our implementations and compared their performance with the same algorithms outside MADlib.

Download report
Overview of MOOCdb
Abstract: We define a machine learning problem to forecast arterial blood pressure. Our goal is to solve this problem with a large scale learning classifier system. Because learning classifier systems are extremely computationally intensive and this problem's eventually large training set will be very costly to execute, we address how to use less of the training set while not negatively impacting learning accuracy. Our approach is to allow competition among solutions which have not been evaluated on the entire training set. The best of these solutions are then evaluated on more of the training set while their offspring start off being evaluated on less of the training set. To keep selection fair, we divide competing solutions according to how many training examples they have been tested on.
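The fairness rule in the last sentence can be sketched as follows. This is only an illustration under stated assumptions, not the ECstar implementation: the `Solution` record, the `layer_size` bucket width and the `select_per_layer` helper are hypothetical names, and fitness is assumed to be "higher is better".

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Solution:
    name: str
    fitness: float      # higher is better (assumption)
    n_evaluated: int    # training examples this solution has been tested on


def select_per_layer(population, layer_size=100, keep=1):
    """Keep the `keep` fittest solutions within each evaluation layer.

    Solutions only compete with others that have seen a comparable
    number of training examples (same bucket of width `layer_size`).
    """
    layers = defaultdict(list)
    for sol in population:
        layers[sol.n_evaluated // layer_size].append(sol)
    survivors = []
    for layer in sorted(layers):
        ranked = sorted(layers[layer], key=lambda s: s.fitness, reverse=True)
        survivors.extend(ranked[:keep])
    return survivors


population = [
    Solution("a", 0.90, 50),
    Solution("b", 0.70, 80),
    Solution("c", 0.60, 450),
    Solution("d", 0.65, 420),
]
print([s.name for s in select_per_layer(population)])
```

Here "a" and "b" compete only with each other, as do "c" and "d", so a lightly tested solution with a lucky score never displaces one that has been tested on far more data.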

Download report Download Presentation Download BibTeX
Overview of ECstar
Abstract: In a few months the computer mouse will be half a century old. It is known to have many drawbacks, the main ones being: loss of productivity due to constant switching between keyboard and mouse, health issues such as RSI, the medical impossibility of using a mouse (e.g. with a broken or amputated arm), and an unnatural human-computer interface, as with the keyboard. Nevertheless, almost everybody still uses a computer mouse today.

In this short article, we explore computer mouse alternatives. Our research shows that moving the mouse cursor can be done efficiently with the SmartNav device and that mouse clicks can be emulated in many complementary ways. We believe that computer users can improve their productivity and their health by using those alternatives. There are a few exceptions, such as advanced users of graphics editing programs or FPS gamers, who will still be more efficient using a computer mouse.

This article is deliberately short and not overly technical, our main motivation being to make readers aware of these solutions and their efficiency. Details can be found in the appendices and by following the URLs and references. The primary intended readers are computer scientists, people with RSI, physicians and interface pioneers. Feedback is highly welcome: this is work in progress, so feel free to e-mail the main author at

Download report Download Presentation Download BibTeX
Talk given on May 30th, 2012 at the Swedish Chamber of Commerce in Paris. As I was reading an article about IBM Watson, a small sentence drew my attention: "Eighty or 90 per cent of these requests don't need Watson anyway, technology already exists for what they need." This epitomizes the growing need for the business world to catch up with the latest developments in artificial intelligence. What is AI? What is the state of the art? Why should I care, i.e. what can AI bring to the business world? From law to finance, every field will be reshaped in the long term by AI.

Download Presentation
Replace CS by AI
Abstract of the original paper: A major weakness of serious games at the moment is that they often incorporate multiple choice questionnaires (MCQs). However, no study has demonstrated that MCQs can accurately assess the level of understanding of a learner. On the contrary, some studies have experimentally shown that allowing the learner to input a free-text answer in the program instead of just selecting one answer in an MCQ allows a much finer evaluation of the learner's skills. We therefore propose to design a conversational agent that can understand statements in natural language within a narrow semantic context corresponding to the area of competence on which we assess the learner. This feature is intended to allow a natural dialogue with the learner, especially in the context of serious games. Such interaction in natural language aims to hide the underlying MCQs. This paper presents our approach.

Download report Download BibTeX                 Download report in French Download Presentation in French
Abstract of the original paper: The objective of our work is to design a conversational agent (chatterbot) capable of understanding natural language statements in a restricted semantic domain. This feature is intended to allow a natural dialogue with a learner, especially in the context of serious games. This conversational agent will be tested in a serious game for staff training, in which it simulates a client. It does not address natural language understanding in its full generality since, firstly, the semantic domain of a game is generally well defined and, secondly, we will restrict the types of sentences found in the dialogue.

Download report Download BibTeX                 Download report in French Download Presentation in French

The medial Reticular Formation (mRF) is located in the brainstem: it receives many sensory inputs and it can control motor actions through its projections on the spinal cord and cranial nerves. The mRF is phylogenetically one of the oldest neural structures of the brainstem, the latter being regarded as one of the oldest centers of the central nervous system. Consequently, it seems to be a low-level system for action selection.

The first model of the mRF was proposed by Kilmer and McCulloch in 1969, who already suggested that the mRF could be a "mode selector". Humphries et al. (2005) tested the efficiency of this model in the minimal survival task defined in Girard et al. (2003). It performed poorly, but another version of it that included artificially evolved weights performed reasonably well. As a result, Humphries proposed a second model of the mRF, based on neural network formalism and taking into account new anatomical data. Nevertheless, it performed poorly in the minimal survival task and turned out not to be very plausible anatomically.

In this Master's Thesis, we propose a new model of the mRF:

  1. constrained by anatomical information about its structure,

  2. constructed based on neural networks generated by artificial evolution,

  3. assessed on tasks of action selection.

The model we obtain successfully handles the selection tasks, indicating that the mRF can be used as an action selection system. We also demonstrate an anatomical property of the mRF which, coupled with the results of Humphries et al. (2006), shows that it is very likely that the mRF network has a small-world structure.

This project was funded by the ANR (the French National Agency for Research), grant ANR-09-EMER-005-01, as part of the EvoNeuro project.

Download report Download Presentation Download BibTeX Download Source code                 Download report in French Download Presentation in French Watch presentation video in French
Fuzzy logic is based on solid mathematical foundations, including the mathematical theory of fuzzy sets, generalizing classical set theory. Firstly, we define fuzzy operators, which generalize operators of classical logic.
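As a minimal sketch of what "generalize" means here, assuming Zadeh's min/max/complement operators (one common choice; the report may define others), the fuzzy operators agree with Boolean AND, OR and NOT whenever the truth degrees are restricted to 0 and 1:

```python
from itertools import product


def fuzzy_and(a, b):
    """Zadeh conjunction: the minimum of the two truth degrees."""
    return min(a, b)


def fuzzy_or(a, b):
    """Zadeh disjunction: the maximum of the two truth degrees."""
    return max(a, b)


def fuzzy_not(a):
    """Standard complement of a truth degree in [0, 1]."""
    return 1.0 - a


# On crisp inputs {0, 1} the operators coincide with classical logic.
for a, b in product([0, 1], repeat=2):
    assert fuzzy_and(a, b) == int(bool(a) and bool(b))
    assert fuzzy_or(a, b) == int(bool(a) or bool(b))
    assert fuzzy_not(a) == int(not a)

# Intermediate degrees of truth are also allowed.
print(fuzzy_and(0.7, 0.4), fuzzy_or(0.7, 0.4), fuzzy_not(0.25))
```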

As a second step, we see how fuzzy logic can imitate human reasoning. We analyze the contribution of fuzzy logic to the modeling of human reasoning, and also experimentally investigate whether the decisions taken by humans correspond to decisions taken by fuzzy systems. Given that the literature is deficient on this point, we design an experiment for that purpose and analyze the results.

In Chapter 5, we study the potential applications for databases and decision support systems. How can the advantages of fuzzy logic be integrated into databases? To what extent can decision-making systems use the flexibility of fuzzy logic?

We show that decision systems, which sit at the heart of the company and bring together all the relevant information from the operational databases, could benefit greatly from fuzzy logic: it gives them the keys to human reasoning and thereby allows decision-making to be refined.

Database theorists know what fuzzy logic could bring them in terms of information modeling: more intuitive and more powerful queries on the one hand, and data more consistent with reality on the other. Many papers have been written, but few significant achievements have followed. The lack of consensus on a standard is probably the main reason.

Download report Download Presentation Download BibTeX                 Download report in French Download Presentation in French
Lac St-Jean
Fuzzy logic is an extension of Boolean logic introduced by Lotfi Zadeh in 1965, based on the mathematical theory of fuzzy sets, which is a generalization of classical set theory. By introducing the concept of degree in the verification of a condition, thus allowing a condition to be in a state other than true or false, fuzzy logic brings valuable flexibility to reasoning and makes it possible to take inaccuracies and uncertainties into account. One of the advantages of fuzzy logic for formalizing human reasoning is that the rules are stated in natural language.

In this report, we:

  1. introduce the basic concepts of fuzzy logic,

  2. propose some arguments which support the view that fuzzy logic can model human reasoning better than standard logic and probability theory,

  3. conduct a psychological experiment on humans to see if their way of thinking can be reflected by fuzzy logic.

We show that fuzzy logic can explain many experiments that had undermined traditional models of human reasoning in the 20th century. We show how the non-additivity of probability judgments can be expressed in a fuzzy system. We then confront fuzzy logic with some paradoxes that classical logic runs into when it tries to model human reasoning: the sorites paradox is typically the kind of threshold problem that fuzzy logic reduces, and the paradox of entailment does not pose a problem in fuzzy logic. It would be interesting to further explore Hempel's paradox, and especially how we could express it in a neuro-fuzzy system. Similarly, the Wason selection task would require further analysis, this time by focusing on fuzzy modus ponens and modus tollens.

Thus fuzzy logic appears as a powerful theoretical framework for studying human reasoning. Surprisingly, we find only one study comparing the decisions made by humans with those of a fuzzy system, and its purpose was essentially to design a decision support system for medical personnel, not to analyze human reasoning as such. We conduct our own experiment and investigate whether a fuzzy system could mimic the results observed in humans. For this purpose, we use a technique for optimizing fuzzy systems using neural networks (neuro-fuzzy), through which we obtain good results, although the correlation between the two input criteria is high: a fuzzy system gives results closer to experimental values than those obtained by a polynomial system. This result reinforces the hypothesis that fuzzy logic can be used to explain decisions from human reasoning.

Download report Download BibTeX                 Download report in French
Lac St-Jean
The visual system has a number of 'bugs', some of which we call illusions. Motion-induced blindness (MIB) belongs to a very interesting class of illusions in which objects in plain sight just disappear from phenomenal perception. Other classical examples of disappearance illusions are:

  1. Binocular rivalry, in which two very different objects are presented to the two eyes, and at any given moment one of the objects--or most of it--remains invisible,

  2. Backward masking, in which a stimulus is 'erased' from perception by a second stimulus, called a "mask", presented a brief time later,

  3. Troxler fading, in which a low-contrast object may fade from visual perception after some time.

In addition, a number of neurological conditions usually involving lesions in the parietal cortex, such as hemineglect and extinction, lead to cases in which objects in plain view are not seen, or not noticed. For a good review of these phenomena, see the article "Psychophysical magic" by Kim and Blake (2005).

Motion-induced blindness (MIB) is a recently discovered and quite spectacular example of a disappearance illusion. The stimulus consists of a field of small objects, moving in a coherent way (either a 2D or 3D rotation, for example). Superimposed on this moving field are a number of high-contrast stationary objects. When most observers fixate a stationary point in this stimulus (such as one of the high-contrast objects, or a fixation point), after several seconds one or more of the stationary objects just disappear.

Download Source code Watch demonstration videos Download Presentation Download Presentation in French
Activate full-screen mode and fixate on the white point in the center. After a few seconds, you will notice that the yellow point seems to disappear.
Abstract of the original paper: This paper proposes a k-means type clustering algorithm that can automatically calculate variable weights. A new step is introduced to the k-means clustering process to iteratively update variable weights based on the current partition of data and a formula for weight calculation is proposed. The convergency theorem of the new clustering process is given. The variable weights produced by the algorithm measure the importance of variables in clustering and can be used in variable selection in data mining applications where large and complex real data are often involved. Experimental results on both synthetic and real data have shown that the new algorithm outperformed the standard k-means type algorithms in recovering clusters in data.
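The extra step described in this abstract, updating variable weights from the current partition, can be sketched as below. This is a hedged illustration, not the paper's exact procedure: the inverse-dispersion formula, the `beta` exponent, the zero-dispersion convention and the `update_weights` name are assumptions made for this sketch, and the paper should be consulted for the actual expression.

```python
def update_weights(points, labels, centers, beta=2.0):
    """One weight-update step for a fixed partition (assumed rule).

    D[j] is the within-cluster squared dispersion of variable j.
    Among variables with non-zero dispersion, a smaller dispersion
    yields a larger weight, and the weights sum to 1 (beta > 1).
    """
    n_vars = len(points[0])
    D = [0.0] * n_vars
    for x, k in zip(points, labels):
        for j in range(n_vars):
            D[j] += (x[j] - centers[k][j]) ** 2
    exponent = 1.0 / (beta - 1.0)
    weights = []
    for j in range(n_vars):
        if D[j] == 0.0:
            # Convention assumed for this sketch: a variable with zero
            # dispersion receives weight 0.
            weights.append(0.0)
        else:
            total = sum((D[j] / D[t]) ** exponent
                        for t in range(n_vars) if D[t] > 0.0)
            weights.append(1.0 / total)
    return weights


# Variable 0 separates the two clusters cleanly; variable 1 is noisy.
points = [(1.0, 10.0), (1.2, 20.0), (5.0, 12.0), (5.2, 22.0)]
labels = [0, 0, 1, 1]
centers = [(1.1, 15.0), (5.1, 17.0)]
weights = update_weights(points, labels, centers)
print([round(w, 4) for w in weights])
```

In a full k-means type loop this step would alternate with the usual assignment and center updates, with distances computed using the current weights.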

Download Presentation Download Presentation in French
K-means algorithm
The purpose of this project is to predict the water inflow to a lake, the Lac St-Jean, based on the history of this inflow and on snowmelt and precipitation in the watershed. All the data for this work have already been collected: our work aims to process, analyze and use these data to build a model able to accurately predict the lake's water inflow.

In the first part, we conduct a preliminary study of the data so as to extract general information. In the second part, we establish a classification of the data to see the main trends. In the third and last part, we build several predictive models and evaluate them using quality measures.

Download BibTeX Download Source code Download report in French
Lac St-Jean