Build a predictive analytics workflow from data preparation through model selection, training, validation, and deployment for business forecasting.
Paste into any LLM. Describe what you want to predict. Use the framework to build models that improve business decision-making.
You are a machine learning engineer with expertise in business-focused predictive analytics, having built models for churn prediction, demand forecasting, lead scoring, and fraud detection that generated measurable business impact.

[PREDICTION TARGET]: What you want to predict (churn, sales, demand, etc.)
[DATA AVAILABLE]: What data you have (types, size, quality)
[BUSINESS CONTEXT]: How predictions will be used in decisions
[TOOLS]: Python (scikit-learn, XGBoost), R, AutoML, cloud ML
[TIMELINE]: When you need results
[ML EXPERIENCE]: Beginner / Intermediate / Advanced

Build a predictive analytics workflow:

**1. Problem Framing**
- Classification vs. regression vs. time series
- Target variable definition and encoding
- Prediction window and observation window
- Business success criteria (not just model accuracy)
- Baseline model (the simplest predictor yours must beat)

**2. Feature Engineering**
- Feature brainstorming by category (demographic, behavioral, transactional, temporal)
- Feature creation techniques
- Temporal feature design (recency, frequency, monetary)
- Text and categorical feature encoding
- Feature interaction exploration
- Preliminary feature importance analysis

**3. Model Selection**
- Algorithm comparison for your problem type
- Logistic regression (interpretable baseline)
- Random forest (robust, handles non-linearity)
- Gradient boosting (XGBoost/LightGBM for performance)
- Neural networks (when and why)
- AutoML tools for rapid experimentation
- Selection criteria: accuracy, interpretability, speed, data size

**4. Training and Validation**
- Train/validation/test split strategy
- Cross-validation approach
- Hyperparameter tuning methodology
- Handling class imbalance (SMOTE, class weighting, threshold tuning)
- Overfitting detection and prevention
- Learning curve analysis

**5. Model Evaluation**
- Metric selection by problem type
- Classification: precision, recall, F1, AUC-ROC, confusion matrix
- Regression: MAE, RMSE, MAPE, R-squared
- Business metric translation (model accuracy to dollars)
- Calibration assessment
- Fairness and bias evaluation

**6. Deployment and Monitoring**
- Model serving architecture
- Batch vs. real-time prediction design
- Model monitoring metrics
- Data drift detection
- Retraining triggers and schedule
- A/B testing of model versions
- Documentation and handoff requirements
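To make the model-selection step concrete, here is a minimal scikit-learn sketch comparing an interpretable baseline against a non-linear model under cross-validation. The synthetic data from `make_classification` stands in for your prepared feature matrix; the 90/10 class weighting mimics a churn-style imbalance.

```python
# Minimal model-comparison sketch: interpretable baseline vs. non-linear model.
# Synthetic data stands in for your prepared feature matrix.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(
    n_samples=2000, n_features=20, weights=[0.9, 0.1], random_state=42
)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=42),
}
for name, model in models.items():
    # AUC-ROC tolerates class imbalance; 5-fold CV guards against a lucky split.
    scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    print(f"{name}: mean AUC = {scores.mean():.3f}")
```

If the simple baseline is within a point or two of the ensemble, the interpretability and speed criteria from section 3 usually favor keeping the baseline.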
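The threshold tuning mentioned under class imbalance can be sketched like this (again with synthetic data): instead of accepting the default 0.5 cutoff, pick the decision threshold that maximizes F1 on a held-out validation set. Maximizing F1 is one reasonable choice among several; a real project would optimize the business metric from section 5.

```python
# Threshold tuning sketch: choose the cutoff that maximizes validation F1.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=3000, weights=[0.95, 0.05], random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# class_weight="balanced" reweights the loss to counter the 95/5 imbalance.
clf = LogisticRegression(max_iter=1000, class_weight="balanced")
clf.fit(X_train, y_train)
probs = clf.predict_proba(X_val)[:, 1]

precision, recall, thresholds = precision_recall_curve(y_val, probs)
f1 = 2 * precision * recall / (precision + recall + 1e-12)
# The final precision/recall pair has no associated threshold, hence f1[:-1].
best = thresholds[np.argmax(f1[:-1])]
print(f"best threshold: {best:.2f}")
```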
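The "model accuracy to dollars" translation in section 5 can be as simple as an expected-value calculation over the customers the model flags. Every dollar figure and rate below is a made-up assumption for illustration, not a benchmark; the point is that a higher-precision, narrower campaign can out-earn a broader one even though it saves fewer customers.

```python
# Toy translation of classifier precision into campaign dollars.
# All figures are illustrative assumptions, not benchmarks.
def campaign_profit(n_targeted, precision, value_per_save, save_rate, cost_per_contact):
    """Expected profit of contacting the model's top-scored customers.

    precision: fraction of targeted customers who would actually churn.
    save_rate: fraction of true churners the retention offer keeps.
    """
    true_churners = n_targeted * precision
    revenue_saved = true_churners * save_rate * value_per_save
    cost = n_targeted * cost_per_contact
    return revenue_saved - cost

# Two operating points of the same model (thresholds trade precision for reach):
narrow = campaign_profit(1_000, precision=0.40, value_per_save=600,
                         save_rate=0.3, cost_per_contact=20)
broad = campaign_profit(5_000, precision=0.15, value_per_save=600,
                        save_rate=0.3, cost_per_contact=20)
print(f"narrow campaign: ${narrow:,.0f}")
print(f"broad campaign:  ${broad:,.0f}")
```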
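For the data drift detection in section 6, one common approach (among several) is the population stability index, which compares a production sample of a feature or score against its training-time distribution. This is a minimal NumPy sketch; the 0.1 "investigate" and 0.25 "act" levels are conventional rules of thumb, not hard limits.

```python
# Population Stability Index (PSI) sketch for data drift detection.
import numpy as np

def psi(expected, actual, bins=10):
    """PSI between a reference (training-time) sample and a production sample."""
    # Bin edges come from the reference distribution's quantiles.
    edges = np.percentile(expected, np.linspace(0, 100, bins + 1))
    # Clip production values into the reference range so every point lands in a bin.
    actual = np.clip(actual, edges[0], edges[-1])
    e_frac = np.histogram(expected, bins=edges)[0] / len(expected)
    a_frac = np.histogram(actual, bins=edges)[0] / len(actual)
    # Epsilon floor avoids log(0) on empty bins.
    e_frac = np.clip(e_frac, 1e-6, None)
    a_frac = np.clip(a_frac, 1e-6, None)
    return float(np.sum((a_frac - e_frac) * np.log(a_frac / e_frac)))

rng = np.random.default_rng(7)
train_scores = rng.normal(0.0, 1.0, 10_000)
production_scores = rng.normal(0.5, 1.0, 10_000)  # mean shift simulates drift

print(f"identical sample PSI: {psi(train_scores, train_scores):.3f}")
print(f"shifted sample PSI:   {psi(train_scores, production_scores):.3f}")
```

A scheduled monitoring job can compute PSI per feature against a frozen training snapshot and open a retraining review when values cross the agreed alert level.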