Hi moderators,
I am getting the message "No prediction file generated, please submit to the express lanes to debug your model!" for my SC2 submission.
Since there is no log file I have no idea what is going on. This submission has been tested and scored on express lane.
Can someone check my submission?
submission name: UoA_sc2_repo1
submission ID: 9650120
I have gone through the discussions here, and I don't think I am making any of the reported mistakes.
Our model uses the CNA and RNA feature names (gene symbols) as provided in the express lane data.
Are they named differently in the test set?
We have spent a lot of time on this challenge, and would really like to submit.
Any clue would be appreciated.
Thanks,
Created by Sunil Kalmady Sunil
Yes. It did. Thanks.
I had assumed test features wouldn't have missing values, just like filtered training or express training features didn't.
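For anyone who hits the same error, here is a minimal sketch of one way to guard against missing values before calling `predict`. It is not the fix used in this submission. It assumes mean imputation is acceptable for the model, and the helper names (`fit_column_means`, `fill_missing`) and the toy arrays are hypothetical.

```python
import numpy as np

def fit_column_means(X_train):
    """Per-column means of the training features, ignoring NaNs."""
    means = np.nanmean(np.asarray(X_train, dtype=float), axis=0)
    # Assumption: a column that is entirely NaN in training falls back to 0.0.
    return np.where(np.isfinite(means), means, 0.0)

def fill_missing(X, column_means):
    """Return a copy of X with NaN/inf entries replaced by the column means."""
    X = np.array(X, dtype=float)  # copy, so the caller's array is untouched
    bad = ~np.isfinite(X)
    # np.where(bad)[1] holds the column index of every bad entry, in the same
    # row-major order that boolean-mask assignment uses.
    X[bad] = np.take(column_means, np.where(bad)[1])
    return X

# Toy data standing in for the RNA/CNA feature matrices.
X_train = np.array([[1.0, 10.0], [3.0, 30.0]])
X_test = np.array([[np.nan, 20.0], [2.0, np.nan]])

means = fit_column_means(X_train)
X_test_filled = fill_missing(X_test, means)
```

The means are computed from the training data only, so the test set does not influence the imputed values.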
Ok, great. Sounds like that took care of it?
Thanks Andrew.
I could submit to main lane.
The problem was with missing/NaN values in RNA/CNA data.
I apologize that I can't be more specific, but there are missing values in the main-lane data. We're asking some of the organizers who know the data better to help here. Unfortunately, I think I've reached the limit of my usefulness on this problem.
Thanks Andrew,
It seems to me that this error is caused by a peculiarity of the test data that is not present in the training set or the express lane test set.
Are there missing values in the features of this test set?
In the training set, RNA and CNA didn't have any missing values; only protein had missing values (except for 3 RNAs where all values were missing).
Or do you have any advice here?
Thanks,
Hi Sunil,
Looks like you're still getting the same non-pickle error:
goclf predictor. Total Time Taken: --- 110.6456651687622 seconds ---
Traceback (most recent call last):
  File "sc2_run.py", line 273, in <module>
    allclf_pred_test[kind] = allclf_predictor(ftname, allclf, kind, ov_cna_e, ov_rna_e, sub_cna_rna)
  File "sc2_run.py", line 260, in allclf_predictor
    clf_pred[prot_name] = np.around(all_clf[kind][prot_name].predict(X_basetest),12)
  File "/root/.local/lib/python3.5/site-packages/sklearn/utils/metaestimators.py", line 54, in <lambda>
    out = lambda *args, **kwargs: self.fn(obj, *args, **kwargs)
  File "/root/.local/lib/python3.5/site-packages/sklearn/pipeline.py", line 326, in predict
    Xt = transform.transform(Xt)
  File "/root/.local/lib/python3.5/site-packages/sklearn/preprocessing/data.py", line 646, in transform
    estimator=self, dtype=FLOAT_DTYPES)
  File "/root/.local/lib/python3.5/site-packages/sklearn/utils/validation.py", line 407, in check_array
    _assert_all_finite(array)
  File "/root/.local/lib/python3.5/site-packages/sklearn/utils/validation.py", line 58, in _assert_all_finite
    " or a value too large for %r." % X.dtype)
ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
Hi Andrew,
I sorted out the user warning about the pickled scikit-learn version.
Now the express lane submission log no longer shows that warning.
Still, I am not able to complete the main lane submission.
Can you please show the log for:
submission name: sc2_repo1_ver2
submission ID: 9650235
Thanks,
Hi Andrew,
Thank you for your reply.
No, I am not assuming the sample names in my code.
The local Docker version runs just fine with a different set of samples.
The log message about the pickled version is usually just a user warning.
The prediction file is created even with this warning in the express lane.
But the prediction file is not created in the main lane?
Do the express and main lanes have very different architectures?
Thanks,
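As a side note on the version warning discussed above, here is a minimal sketch of one way to make a version mismatch explicit at load time, by storing the library version next to the pickled model. The helper names and the dictionary layout are hypothetical. In real use the version string would come from `sklearn.__version__`; plain strings and a stand-in model are used here so the sketch runs without scikit-learn.

```python
import pickle

def dump_with_version(model, library_version):
    """Serialize the model together with the library version used to train it."""
    return pickle.dumps({"library_version": library_version, "model": model})

def load_with_version_check(payload, running_version):
    """Load the model and report whether the stored and running versions match."""
    bundle = pickle.loads(payload)
    versions_match = bundle["library_version"] == running_version
    return bundle["model"], versions_match

# Stand-in for a fitted estimator; the version strings mirror the log in this thread.
payload = dump_with_version({"coef": [0.5, 1.5]}, "0.18.1")
model, versions_match = load_with_version_check(payload, "0.19.0")
if not versions_match:
    print("Model was pickled with a different library version; consider retraining.")
```

This only flags the mismatch. Whether the unpickled model still behaves correctly depends on the library.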
Hi Sunil,
I want to make sure you know that the samples and sample names in the express and main lanes are different. So if you are assuming the same sample names, your script won't work right.
I can show you part of the log:
Order of subjects in cna and RNA data are same!
goclf predictor. Total Time Taken: --- 110.5472719669342 seconds ---
/root/.local/lib/python3.5/site-packages/sklearn/base.py:312: UserWarning: Trying to unpickle estimator StandardScaler from version 0.18.1 when using version 0.19.0. This might lead to breaking code or invalid results. Use at your own risk.
UserWarning)
/root/.local/lib/python3.5/site-packages/sklearn/base.py:312: UserWarning: Trying to unpickle estimator LinearRegression from version 0.18.1 when using version 0.19.0. This might lead to breaking code or invalid results. Use at your own risk.
UserWarning)
/root/.local/lib/python3.5/site-packages/sklearn/base.py:312: UserWarning: Trying to unpickle estimator Pipeline from version 0.18.1 when using version 0.19.0. This might lead to breaking code or invalid results. Use at your own risk.
UserWarning)
/root/.local/lib/python3.5/site-packages/sklearn/base.py:312: UserWarning: Trying to unpickle estimator GenericUnivariateSelect from version 0.18.1 when using version 0.19.0. This might lead to breaking code or invalid results. Use at your own risk.
UserWarning)
/root/.local/lib/python3.5/site-packages/sklearn/base.py:312: UserWarning: Trying to unpickle estimator Lasso from version 0.18.1 when using version 0.19.0. This might lead to breaking code or invalid results. Use at your own risk.
UserWarning)
Traceback (most recent call last):
  File "sc2_run.py", line 273, in <module>
    allclf_pred_test[kind] = allclf_predictor(ftname, allclf, kind, ov_cna_e, ov_rna_e, sub_cna_rna)
  File "sc2_run.py", line 260, in allclf_predictor
    clf_pred[prot_name] = np.around(all_clf[kind][prot_name].predict(X_basetest),12)
  File "/root/.local/lib/python3.5/site-packages/sklearn/utils/metaestimators.py", line 115, in <lambda>
    out = lambda *args, **kwargs: self.fn(obj, *args, **kwargs)
  File "/root/.local/lib/python3.5/site-packages/sklearn/pipeline.py", line 315, in predict
    Xt = transform.transform(Xt)
  File "/root/.local/lib/python3.5/site-packages/sklearn/preprocessing/data.py", line 681, in transform
    estimator=self, dtype=FLOAT_DTYPES)
  File "/root/.local/lib/python3.5/site-packages/sklearn/utils/validation.py", line 422, in check_array
    _assert_all_finite(array)
  File "/root/.local/lib/python3.5/site-packages/sklearn/utils/validation.py", line 43, in _assert_all_finite
    " or a value too large for %r." % X.dtype)
ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
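The `ValueError` at the end of this traceback only says that some input value is NaN or infinite. A minimal sketch of how one might find the offending features before calling `predict` follows. The function name `report_non_finite` and the gene labels are hypothetical, and the toy array only borrows the variable name `X_basetest` from the traceback.

```python
import numpy as np

def report_non_finite(X, feature_names):
    """Map each feature name to its count of NaN/inf entries (features with none are omitted)."""
    X = np.asarray(X, dtype=float)
    counts = (~np.isfinite(X)).sum(axis=0)  # per-column count of bad values
    return {name: int(n) for name, n in zip(feature_names, counts) if n > 0}

# Toy matrix: rows are samples, columns are features.
X_basetest = np.array([[0.1, np.nan, 2.0],
                       [0.3, np.nan, np.inf]])
bad_features = report_non_finite(X_basetest, ["GENE_A", "GENE_B", "GENE_C"])
print(bad_features)
```

Writing such a report to the log before prediction makes this kind of failure easier to diagnose when the full log is not visible to the submitter.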
SC2: No prediction file generated, please submit to the express lanes to debug your model!