Melting-Point Prediction (Multi-level ML)

Effect of stacking order on ensemble performance. Each block = 10 tuned models (RF/XGB/LGBM/MLP)

R² ≈ 0.83 | ~3.041k samples (Citrination) | Custom decorrelated (1st) and stacking (2nd) layers

What: Predict melting points of organic compounds with a two-level stacking ensemble (RF/XGB/LGBM/MLP).

Why: Reproduced and extended a published approach (DOI) to validate best practices on a different materials system.

How: Parsed a ~3.041k-sample dataset from Citrination; featurized SMILES with RDKit plus custom bond-count features; used SHAP to guide model selection, feature selection, and ensemble optimization; combined the tuned models in a custom stacking ensemble.
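A minimal sketch of the bond-count featurization step, assuming RDKit is installed. The function name `featurize` and the feature keys are hypothetical, chosen for illustration; the project's actual feature set is not listed here.

```python
from collections import Counter

from rdkit import Chem
from rdkit.Chem import Descriptors


def featurize(smiles):
    """Return a small feature dict for one SMILES string, or None if it fails to parse.

    Hypothetical helper: the keys below are illustrative, not the project's real features.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:  # RDKit returns None for SMILES it cannot parse
        return None

    # Count bonds by type (implicit hydrogens are not included in the bond list)
    bond_counts = Counter(bond.GetBondType() for bond in mol.GetBonds())

    return {
        "mol_wt": Descriptors.MolWt(mol),
        "n_single": bond_counts[Chem.BondType.SINGLE],
        "n_double": bond_counts[Chem.BondType.DOUBLE],
        "n_triple": bond_counts[Chem.BondType.TRIPLE],
        "n_aromatic": bond_counts[Chem.BondType.AROMATIC],
    }


if __name__ == "__main__":
    for smi in ["CCO", "CC(=O)O", "c1ccccc1"]:
        print(smi, featurize(smi))
```

Unparseable SMILES yield `None`, so such rows can be dropped before model fitting rather than raising mid-pipeline.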

Results: R² ≈ 0.83; the custom stacking order improved on the baseline by ~4% out-of-sample.
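An out-of-sample comparison between a single baseline model and a two-level stack can be sketched as follows. This is an illustration under stated assumptions, not the project's pipeline: it uses synthetic data from scikit-learn instead of the Citrination set, substitutes `GradientBoostingRegressor` and `Ridge` for the XGB/LGBM/MLP models, and uses scikit-learn's generic `StackingRegressor` rather than the custom decorrelated layer.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the featurized dataset (assumption: not the real data)
X, y = make_regression(
    n_samples=400, n_features=20, n_informative=10, noise=10.0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Baseline: a single first-level model evaluated on the held-out split
baseline = RandomForestRegressor(n_estimators=50, random_state=0)
baseline.fit(X_train, y_train)
r2_baseline = r2_score(y_test, baseline.predict(X_test))

# Level 1: heterogeneous base models (stand-ins for RF/XGB/LGBM/MLP)
base_models = [
    ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
    ("gbr", GradientBoostingRegressor(random_state=0)),
    ("ridge", Ridge(alpha=1.0)),
]

# Level 2: a meta-learner fit on out-of-fold predictions of the base models
stack = StackingRegressor(
    estimators=base_models, final_estimator=Ridge(alpha=1.0), cv=5
)
stack.fit(X_train, y_train)
r2_stack = r2_score(y_test, stack.predict(X_test))

print(f"baseline R2: {r2_baseline:.3f}")
print(f"stacked  R2: {r2_stack:.3f}")
```

Fitting the meta-learner on cross-validated (out-of-fold) predictions, as `cv=5` does here, keeps the second level from learning on predictions the base models made for their own training rows.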

GitHub