Monday, July 21, 2014

Potentially Invalid Data From All Trials

Thanks to a fundamental misunderstanding of how "elites" function in our GA (elites being the select few members of a generation that get passed on to the next), it's possible that all of our data is invalid. My understanding, from what I was told about the code and from my own examination of it, was that an elite was kept for one generation and discarded if it failed. This is apparently not the case: elites are kept until the end of the GA regardless of failures along the way.

When I looked at the data before, what I thought I was seeing was 30-40 great results and ~10 errors in the last generation of each population. Now I look at the same data knowing that the 30 elites could be hiding 30 errors underneath them. If that is the case, then virtually all of our data is invalid, because we cannot be certain that the PaaS-es were outputting results throughout a run of the GA.
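
To make the worry concrete, here is a toy sketch in Python. It is not our GA code: the names (`Individual`, `reported_results`, `worst_case_population`) and the fitness values are made up for illustration, and it assumes the last-generation log shows an elite's carried-over score whenever the elite is kept. The counts (30 elites, 10 non-elites) come from the numbers above, and the population is the worst case where every evaluation in the last generation failed.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Individual:
    stored_fitness: float  # score recorded in some earlier generation
    evaluation_ok: bool    # did this generation's evaluation return a result?


def reported_results(population: List[Individual], n_elites: int,
                     discard_failed_elites: bool) -> List[Union[float, str]]:
    """Return what a last-generation log would show: a fitness or "ERROR"."""
    ranked = sorted(population, key=lambda ind: ind.stored_fitness, reverse=True)
    elites, rest = ranked[:n_elites], ranked[n_elites:]
    report: List[Union[float, str]] = []
    for ind in elites:
        if ind.evaluation_ok or not discard_failed_elites:
            # Elite is kept, so its old score is shown even if this run failed.
            report.append(ind.stored_fitness)
        else:
            # The policy I thought we had: a failed elite is dropped.
            report.append("ERROR")
    for ind in rest:
        report.append(ind.stored_fitness if ind.evaluation_ok else "ERROR")
    return report


def count_errors(report: List[Union[float, str]]) -> int:
    return sum(1 for entry in report if entry == "ERROR")


def worst_case_population() -> List[Individual]:
    """Hypothetical worst case: 30 elites and 10 others, all failing now."""
    elites = [Individual(stored_fitness=0.9, evaluation_ok=False) for _ in range(30)]
    others = [Individual(stored_fitness=0.1, evaluation_ok=False) for _ in range(10)]
    return elites + others


if __name__ == "__main__":
    pop = worst_case_population()
    sticky = reported_results(pop, n_elites=30, discard_failed_elites=False)
    strict = reported_results(pop, n_elites=30, discard_failed_elites=True)
    print("errors visible with elites always kept:", count_errors(sticky))
    print("errors visible with failed elites discarded:", count_errors(strict))
```

With elites always kept, this toy log shows 30 good scores and 10 errors even though nothing in the last generation actually produced a result, which is exactly the pattern I had been reading as healthy.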

What this means for the project is that a month's worth of data collection and analysis may have to be thrown out the window (worst-case scenario). I'm conferring with Dr. Remy to see how we should move forward.
