Hi,
For some files in both the training and the test set (L-Dopa data), I have only NaN values except for the timestamp column. Is this known to the organizers? Since we should provide features for all samples, I think it would make sense to remove them.
I also just noticed that the 'Accessing Data' page currently has no information about the L-Dopa data. Does this mean that a change to the data is planned?
Best regards,
Balint
Created by Balint Armin Pataki patbaa
See https://www.synapse.org/#!Synapse:syn8717496/discussion/threadId=2764
@Raj Could you send me a dataFileHandleId or two that is all NaN values except for the timestamp?
Phil-
If these samples don't have sensor data, then they should be removed.
Solly
Yes, like subchallenge 1, training and test features will be in one file. @phil, can you look into whether there are NA samples in the submission template?
So I noticed that when I limit my feature generation to only the file list in the L-Dopa sample submission template, it still includes the files that consist of only NaN values. Is this intentional? Should we just provide null features for those entries? Also, should we submit the train and test features separately or as one file?
Thank you @sieberts.
@patbaa, during the L-Dopa data collection, recording failed for some tasks and devices. When dividing the data, we did not throw out those tasks and devices but left the accelerometer readings as NaN. As Solly mentioned, we will not be building models or scoring on those samples.
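For anyone who wants to flag these recordings locally in the meantime, here is a minimal sketch in Python. It assumes each recording has already been loaded as a list of row dictionaries and that the time column is called `timestamp`. Both are assumptions for illustration, not something confirmed in this thread, and the helper names `is_missing` and `has_no_sensor_data` are hypothetical.

```python
import math


def is_missing(value):
    """Return True for None, blank or 'NaN'/'NA'/'null' text, or a float NaN."""
    if value is None:
        return True
    if isinstance(value, str):
        return value.strip().lower() in {"", "nan", "na", "null"}
    try:
        return math.isnan(value)
    except TypeError:
        # Not a number (e.g. a list); treat it as present data.
        return False


def has_no_sensor_data(rows, timestamp_key="timestamp"):
    """True if every field other than the timestamp is missing in every row.

    `timestamp_key` is an assumed column name; adjust it to the real files.
    A recording with no rows at all is also reported as having no sensor data.
    """
    return all(
        is_missing(value)
        for row in rows
        for key, value in row.items()
        if key != timestamp_key
    )


# Made-up rows for illustration only, not taken from the challenge data.
nan = float("nan")
empty_recording = [
    {"timestamp": 0.00, "x": nan, "y": nan, "z": nan},
    {"timestamp": 0.02, "x": nan, "y": nan, "z": nan},
]
partial_recording = [
    {"timestamp": 0.00, "x": 0.1, "y": nan, "z": 0.3},
]

print(has_no_sensor_data(empty_recording))
print(has_no_sensor_data(partial_recording))
```

A recording is flagged only when all non-timestamp values are missing, so files with occasional dropped readings are kept.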
@patbaa -
Yes, it is correct that some entries may be completely null. As this has just come to our attention, we will make sure these samples are not included in the submission templates. @larssono
We have restored the information about the L-Dopa data in the Accessing Data page. Thank you for pointing out this inadvertent deletion.
Solly