Feel free to use any data external to the challenge:
- [pyrfume](https://github.com/pyrfume/pyrfume) has compiled a lot of data on single molecules and some mixtures
We included Intensity and Pleasantness in the data to give participants who are interested in predicting these dimensions more flexibility. However, you're absolutely right that they are not part of the leaderboard or test set, and thus should not be used in training if your goal is to submit for this challenge. For anyone focused purely on the challenge targets, these columns can simply be ignored, but we hope having them available adds value for those who may want to explore broader perceptual modeling questions.

These columns cannot be used in training the model if they are not available in the leaderboard or test data. Including them would definitely have made for a more robust model, as we are predicting odor at different intensities.

"Intensity" and "Pleasantness" are not included in the leaderboard or test data because the main objective of this challenge is to predict the values of the 51 semantic descriptors. They were not considered target variables for this task.

Why are "Intensity" and "Pleasantness" not included in the leaderboard or test data?
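
For anyone who wants to ignore these two columns for a challenge submission, here is a minimal sketch of one way to drop them before training. Only the column names "Intensity" and "Pleasantness" come from this thread; the sample rows, the `stimulus`, `Sweet` and `Fruity` columns, and the helper `drop_non_target_columns` are hypothetical placeholders, not the actual challenge file layout.

```python
import csv
import io

# Columns present in the training data but absent from the leaderboard and test sets.
NON_TARGET_COLUMNS = ("Intensity", "Pleasantness")


def drop_non_target_columns(rows, excluded=NON_TARGET_COLUMNS):
    """Return copies of the rows with the excluded columns removed."""
    return [{k: v for k, v in row.items() if k not in excluded} for row in rows]


# Hypothetical two-row sample; the real training file will look different.
sample_csv = io.StringIO(
    "stimulus,Intensity,Pleasantness,Sweet,Fruity\n"
    "1,0.62,0.40,0.10,0.35\n"
    "2,0.48,0.55,0.30,0.05\n"
)
rows = list(csv.DictReader(sample_csv))
train_rows = drop_non_target_columns(rows)
print(list(train_rows[0].keys()))  # ['stimulus', 'Sweet', 'Fruity']
```

The original `rows` are left untouched, so the two extra columns remain available for anyone exploring intensity or pleasantness modeling outside the challenge.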