I think kernel methods, including but not limited to SVM, KNN, and GPR, should not be included in the evaluation methods. The reason is that they do not weigh features themselves: the feature with the largest scale will effectively be weighted the highest, in this case age, while the others, if scaled to 0-1, will not contribute much anyway. Participants will then have to focus their attention on tuning the relative weights of features by changing their scales and numbers rather than on extracting interesting features.

Similarly, I think a neural net shouldn't be included either, because it will essentially latch onto one of the features while ignoring the rest of the useful features.

In that sense, I think good ensemble methods for evaluating the features, rather than directing participants to fit the limitations of the algorithm itself, should be methods that integrate and weigh all features by themselves, such as random forest, extra trees, elastic net, L1, L2, etc., many of which are already listed in the slides.

How do you ensemble the predictions then? For example, the neural net output will be a sigmoid, while the elastic net output is linear; when you ensemble by the mean, only the former practically contributes to the final predictions.

Because I am sure that even for the best features, there will be standard algorithms that render them completely signal-less.

**But in any case, the scoring algorithm, including how you ensemble the models, should be decided and announced well before the deadline. Then, even if it is wrong, it is fine. Otherwise, whatever the result is, all non-winning participants will think the organizers manipulated the results toward what they wanted after seeing everyone's features, either by adjusting the relative contributions or the parameters of the algorithms. Even if the organizers claim they didn't intentionally do so, that claim will be hard to trust.** But on the other hand, if, by any chance, you are willing to do it in a scientific, fair, and transparent way, it will also be appreciated by every one of us.
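To make the scale argument concrete, here is a minimal sketch (not the challenge's actual scoring code) in which the only informative feature is already on a 0-1 scale and an uninformative age-like column dominates a KNN regressor's distance metric until the columns are standardized. All data and feature names here are made up.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 500
age = rng.uniform(40, 80, n)            # large-scale demographic feature, no signal here
gait = rng.uniform(0, 1, n)             # extracted feature already scaled to [0, 1]
y = 5 * gait + rng.normal(0, 0.1, n)    # all of the signal lives in the small-scale feature

X = np.column_stack([age, gait])
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

raw = KNeighborsRegressor().fit(X_tr, y_tr)
scaled = make_pipeline(StandardScaler(), KNeighborsRegressor()).fit(X_tr, y_tr)

# Euclidean distance on the raw matrix is dominated by age, so the informative
# 0-1 feature is effectively ignored until the columns are standardized.
print("R^2 without scaling:", raw.score(X_te, y_te))
print("R^2 with scaling:   ", scaled.score(X_te, y_te))
```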

Created by Yuanfang Guan (yuanfang.guan)
@yuanfang.guan - As we mentioned on the webinar, we will be releasing the model-building portion of the scoring code when we open the scoring queues.   Solly
> since we are centering and scaling all the predictors first

I guess you are going to scale all features to 0-1? That doesn't solve the problem, but if you could share the scoring code, that would still be fine, since we would know exactly what you do. What I meant is that whether the scoring is wrong or right doesn't matter at all; what matters is that it is done in a predefined way, rather than decided after getting everyone's results.

When I read the slides today, I think it is wrong to take the median of the input features and then make one prediction per patient, because one feature often interacts with another. This is especially true for frequency-derived features, e.g. the frequency and the magnitude are bound together as a pair. It is more appropriate to make a prediction for each record and take the median of the outputs. But again, even if it is wrong, it is fine, as long as it is made clear through documented code before submission.

You know that for SVM, for any data, there is a set of parameters that makes all predictions the same value regardless of the input, right?

I think it is not a very high standard to demand of any competition organizers that the scoring method be fixed before the competition ends.
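For the aggregation point above, here is a minimal sketch of the two orderings, assuming a hypothetical per-record feature table with a `patient_id` column and an already-trained `model`; neither function is the organizers' actual scoring code.

```python
import pandas as pd

def predict_from_median_features(features: pd.DataFrame, model) -> pd.Series:
    """Collapse each patient's records to median feature values, then predict once.
    This decouples features that only make sense as a pair on the same record,
    e.g. a dominant frequency and its magnitude."""
    per_patient = features.groupby("patient_id").median()
    return pd.Series(model.predict(per_patient.values), index=per_patient.index)

def median_of_record_predictions(features: pd.DataFrame, model) -> pd.Series:
    """Predict on every record, then take the median prediction per patient,
    so paired features are always seen together by the model."""
    preds = model.predict(features.drop(columns="patient_id").values)
    return pd.Series(preds, index=features["patient_id"]).groupby(level=0).median()
```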
Thank you both for your input. We discussed this internally and did not see the concern, since we are centering and scaling all the predictors first. If this doesn't alleviate your concern @yuanfang.guan, please let me know.
Exactly. Looking at the algorithms that were chosen, KNN, SVM, a neural net, or just adding age/sex as new features, I think this is building swords and spears on top of bombers and tanks. But as mentioned above, this should be easy to solve: no post-hoc adding of new features on the organizers' side, so we can add our own, and then choose the maximal performance. I believe this was one of the options the organizers have considered, and I hope they will reconsider it.
I concur. If the aim is to understand characteristics, an ad hoc solution of a 'bunch of random models' is inappropriate. But this is how the Netflix Prize played out, and BellKor won with over 100 ad hoc models (https://en.wikipedia.org/wiki/Netflix_Prize). The goal of this challenge should be extraction of features related to Parkinson's Disease, such as FOG (freezing of gait), sway, shuffling, and so on. Such characteristic extraction from motion data could benefit PD researchers throughout the ecosystem. The rationale of some of the machine-learning approaches chosen to solve this is really not aligned with the intent MJFF has for this challenge.
I think there are two ways to partially solve this (and in any case the scoring code should be released beforehand rather than afterwards, to make the result fair and convincing). I know I have been nagging, but I think these can address most of the concerns and best serve the purpose of using mobile data to monitor disease progression:

1. Pick the maximal performance across all base learners, instead of the average. I think that mimics the real situation, where a single best model is built instead of ensembling a bunch of random models (see the sketch after this list).

2. Allow a separate submission queue with direct final predictions. The reason is that I think adding age and gender in the ways listed above could significantly drop the optimal performance, and we would have to spend too much time tuning all features to the same scale as an average of age and gender (?) so that they actually contribute, given the algorithms that were picked. After that, the best-performing submission can be picked from either feature-based predictions or direct predictions. I think this serves the purpose of extracting features just as well, since, as you have said, the feature can be a single one (but that one has to be a good combination of age/gender together with other features extracted from Fourier analysis, variation, skew, etc.).
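For point 1, a minimal sketch of scoring a feature set by its best base learner rather than by an ensemble average; the particular estimators and cross-validation settings here are placeholders, not the challenge's actual configuration.

```python
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import RandomForestRegressor, ExtraTreesRegressor
from sklearn.linear_model import ElasticNet

def best_base_learner_score(X, y, cv=5):
    """Return the best mean cross-validated score across a set of base learners,
    instead of averaging predictions over all of them."""
    estimators = [
        RandomForestRegressor(random_state=0),
        ExtraTreesRegressor(random_state=0),
        ElasticNet(),
    ]
    scores = [cross_val_score(est, X, y, cv=cv).mean() for est in estimators]
    return max(scores)
```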

Concerns with kernel methods being included in base learners as in webinar slides