The mid-level scikit-learn exam. Built by the people who maintain it.

The Professional Practitioner Certification is for working data scientists. It covers regularization, ensembles, feature engineering, nested cross-validation, and the judgement to pick a model and defend it to the business.

Seven competencies of a working mid-level data scientist.

The Professional certification is designed to ensure that our certified professionals possess both the conceptual understanding and the practical skills of a mid-level data scientist.

Advanced ML knowledge — Proficiency in a broad range of machine learning algorithms and the ability to select appropriate models for specific problems.
Programming expertise — Strong coding skills in Python, with experience in optimizing code for performance and scalability.
Data handling and engineering — Ability to handle large datasets, including data extraction, transformation, and loading processes.
Feature engineering — Experience in creating and selecting features to improve model performance.
Tuning and optimization — Proficiency in hyperparameter tuning, model selection, and ensemble methods to improve model performance.
Critical thinking — Ability to approach complex problems systematically and evaluate multiple solutions, including diagnosing issues in a model pipeline.
Business expertise — Understanding of how ML projects align with business goals, and the ability to translate technical results into actionable business insights.

Five topics. The shape of the Professional exam.

A step beyond Associate. You need to recognize when a model is regularized correctly and when a CV strategy leaks, and be able to communicate both to non-technical readers.

Machine learning concepts

The advanced mental model. Probabilistic outputs, regularization regimes, and what overfitting does to soft predictions.

Supervised and unsupervised learning: regression, classification, clustering, dimensionality reduction
Model families: tree-based, linear, ensemble, neighbors
Regularization: L1, L2, Elastic Net
Hard and soft predictions: predict vs predict_proba
Overfitting and underfitting, impact on soft predictions
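
As an illustration of the hard versus soft prediction topic, here is a minimal sketch on synthetic data. The dataset, the two values of C, and the variable names are assumptions chosen for the example, not exam material. It fits the same L2-penalized logistic regression with a strong and a weak penalty and compares how confident the soft predictions are.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic data, for illustration only.
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# C is the inverse regularization strength: a small C means a strong L2 penalty.
strong = LogisticRegression(C=0.01, max_iter=1000).fit(X_train, y_train)
weak = LogisticRegression(C=100.0, max_iter=1000).fit(X_train, y_train)

hard = strong.predict(X_test)        # class labels, shape (n_samples,)
soft = strong.predict_proba(X_test)  # class probabilities, shape (n_samples, 2)

# A stronger penalty shrinks the coefficients, which typically pulls the
# predicted probabilities toward 0.5 (less confident soft predictions).
for name, model in [("strong penalty", strong), ("weak penalty", weak)]:
    coef_norm = np.linalg.norm(model.coef_)
    mean_confidence = model.predict_proba(X_test).max(axis=1).mean()
    print(f"{name}: |coef| = {coef_norm:.2f}, mean confidence = {mean_confidence:.2f}")
```

The hard labels are the argmax of the soft predictions, so two models can agree on most labels while disagreeing noticeably on how confident they are.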

Model building and evaluation

Pick the baseline, regularize the noise, ensemble when warranted, and choose the metric that fits the problem.

Linear models as baselines
Handling correlation with regularization and feature selection
Bagging and boosting, the working ensemble methods
Choosing metrics for outliers and imbalanced settings
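
To make the point about metrics in imbalanced settings concrete, the following sketch scores a baseline that always predicts the majority class. The data and the 5% positive rate are made up for the example.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score, balanced_accuracy_score

# Made-up imbalanced target: 50 positives out of 1000 samples (5%).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
y = np.zeros(1000, dtype=int)
y[:50] = 1

# A baseline that ignores the features and always predicts the majority class.
baseline = DummyClassifier(strategy="most_frequent").fit(X, y)
pred = baseline.predict(X)

acc = accuracy_score(y, pred)           # 0.95: looks good, but is misleading
bal = balanced_accuracy_score(y, pred)  # 0.5: mean of per-class recall (1.0 and 0.0)
print(f"accuracy = {acc:.2f}, balanced accuracy = {bal:.2f}")
```

Accuracy rewards the baseline for the class imbalance alone, while balanced accuracy shows that it never finds a positive.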

Interpretation and communication

Read the plot, name the failure mode, explain it without using the word probability twice.

Visualizing results with intermediate matplotlib and seaborn techniques
Interpreting model outputs and performance metrics
Communicating results to non-technical stakeholders

Data preprocessing

Heatmaps, PCA, polynomial features, label propagation. The shaping work that makes a real-world dataset trainable.

Loading parquet datasets
Heatmaps and PCA for first look
Identifying strongly correlated features
Missing values in the target via label propagation
Feature engineering with PolynomialFeatures, SplineTransformer
Combining features with FeatureUnion

Model selection and validation

Group structure, non-i.i.d. data, nested CV, stable hyperparameters across folds.

Cross-validation with group structure and non-i.i.d. data
Hyperparameter tuning: GridSearchCV, RandomizedSearchCV
Stability of optimal hyperparameters via nested cross-validation
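
The validation topics above can be sketched in two short steps. Everything specific here is an assumption made for the example: the synthetic data, the hypothetical group ids, the grid over C, and the fold counts. GridSearchCV is used for the inner search; RandomizedSearchCV would slot in the same way.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, GroupKFold, KFold, cross_validate

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Hypothetical group labels: 30 groups of 10 samples (e.g. one id per patient).
groups = np.repeat(np.arange(30), 10)

# 1) Group-aware splitting: no group appears on both sides of a split,
#    so samples from the same group cannot leak from train to test.
for train_idx, test_idx in GroupKFold(n_splits=5).split(X, y, groups=groups):
    assert set(groups[train_idx]).isdisjoint(groups[test_idx])

# 2) Nested cross-validation: the inner loop tunes C, the outer loop
#    scores the whole tuning procedure on data the search never saw.
inner = KFold(n_splits=3, shuffle=True, random_state=0)
outer = KFold(n_splits=5, shuffle=True, random_state=1)
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},
    cv=inner,
)
results = cross_validate(search, X, y, cv=outer, return_estimator=True)

# One fitted search per outer fold: compare the selected C across folds.
best_Cs = [est.best_params_["C"] for est in results["estimator"]]
print("best C per outer fold:", best_Cs)
print("outer test scores:", results["test_score"].round(3))
```

If the selected value changes widely from one outer fold to the next, the tuned hyperparameter is not stable and a single reported "best" value should be treated with caution.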

Three levels. You are on the second.

Three certifications, each matching a level and a typical data scientist career path.

Associate Practitioner

Junior data scientist. Fundamental ML, preprocessing, evaluation.

Professional

Mid-level. Regularization, ensembles, feature engineering, nested CV.

Expert

Senior practitioner. Production ML, scaling, governance.

Prepare with the Professional course on Skolar. Free to start.

The Professional track on Skolar matches this exam: regularization, ensembles, feature unions, and nested validation, with notebooks and practice questions written by the scikit-learn team.

Logistics, plain.

Everything you need to plan your sitting, in five questions.

Do I need Associate before Professional? No. Associate is recommended as a stepping stone but not required. If you have a year or two of hands-on data science experience with scikit-learn, you can sit Professional directly.

Is there a hands-on component? Yes. The Professional exam adds one hands-on lab on top of the multiple-choice questions. You will write and tune a small pipeline against a held-out dataset, in a sandboxed scikit-learn environment.

What about retakes? One retake is included with your registration. After that, retakes are discounted. There is a 21-day cool-down between attempts so you can revisit weak topics on Skolar.

Is the credential verifiable? Yes. Every passing candidate gets a credential ID and a public verification page on probabl.ai. Recruiters can confirm validity without contacting you.

Does it expire? The Professional certification is valid for 3 years. Renew by passing the Expert exam, or by retaking Professional at a discount.

Certify the work you already do, with scikit-learn.

120 minutes. $349 USD. Multiple-choice questions plus a hands-on lab, and a credential issued by the maintainers themselves.