xwOBA/xwOBACON Model and Dashboard

I built an xwOBA/xwOBACON model for the NECBL using all batted-ball data from the 2021 through 2025 seasons, a sample of roughly 52,000 batted balls.

The model predicts xwOBA from three Trackman inputs (exit velocity, launch angle, and bearing), estimating the expected outcome of each batted ball from historical patterns. The data was split 70/30 for training and validation. The model is an XGBoost classifier with 250 trees and a multi-class softmax objective, so it outputs a probability for each outcome (out, single, double, triple, home run). Those probabilities are converted to an expected run value, which is then scaled so that 100 represents league average.
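The probability-to-value step can be sketched roughly as follows, in Python for illustration. The linear weights below are placeholder values assumed for the example, not the weights used in the model, and the function name is hypothetical.

```python
# Outcome classes, in the order the classifier is assumed to emit them.
OUTCOMES = ("out", "single", "double", "triple", "home_run")

# Placeholder linear weights (assumed for illustration only; the model's
# actual run values are not given in this write-up).
WEIGHTS = {"out": 0.0, "single": 0.9, "double": 1.25, "triple": 1.6, "home_run": 2.0}


def expected_value(probs):
    """Collapse one batted ball's outcome probabilities into an expected value."""
    if len(probs) != len(OUTCOMES):
        raise ValueError("expected one probability per outcome class")
    if abs(sum(probs) - 1.0) > 1e-6:
        raise ValueError("probabilities must sum to 1")
    # Probability-weighted sum of the per-outcome values.
    return sum(p * WEIGHTS[outcome] for p, outcome in zip(probs, OUTCOMES))


# Made-up probabilities for a hard line drive: mostly singles and doubles.
line_drive = [0.30, 0.45, 0.20, 0.02, 0.03]
print(round(expected_value(line_drive), 3))
```

Averaging these per-ball values over a player's batted balls would give that player's xwOBACON under the assumed weights.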

xwOBACON measures batted-ball quality only (total weighted value divided by batted balls), while xwOBA covers all plate appearances: ((mean_xwobacon * batted_balls) + (0.690 * (BB - IBB + HBP))) / total PAs.
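As a worked example, the xwOBA formula can be written as a small Python function. The function name and the sample stat line are made up for illustration; only the 0.690 weight and the structure of the formula come from the description above.

```python
WALK_HBP_WEIGHT = 0.690  # value credited to each unintentional walk or HBP


def xwoba(mean_xwobacon, batted_balls, bb, ibb, hbp, pa):
    """Blend contact quality with walks and HBP over all plate appearances."""
    if pa <= 0:
        raise ValueError("plate appearances must be positive")
    contact_value = mean_xwobacon * batted_balls          # value earned on contact
    free_pass_value = WALK_HBP_WEIGHT * (bb - ibb + hbp)  # unintentional BB plus HBP
    return (contact_value + free_pass_value) / pa


# Hypothetical hitter: 100 PA, 70 batted balls at a .400 xwOBACON,
# 12 walks (1 intentional), 2 HBP; the remaining PAs are strikeouts.
print(round(xwoba(0.400, 70, bb=12, ibb=1, hbp=2, pa=100), 4))
```

Strikeouts add nothing to the numerator but still count in the denominator, which is why a high-strikeout hitter's xwOBA can sit well below his xwOBACON.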

The dashboard uses season-aware composite key matching to connect model predictions with actual NECBL statistics. This provides correct BB and K totals and allows a proper comparison against the model results to see who has been getting lucky or unlucky. It standardizes both the model's and the scraped player names into "LASTNAME, F" format, then creates composite keys combining last name, first initial, team abbreviation, and season. This keeps players who appear in multiple seasons properly separated and prevents cross-season contamination of individual season totals. The standardization also handles accented characters and apostrophes, which makes it possible to match player names in the model with the NECBL data.
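A minimal Python sketch of this kind of key construction is shown below. It assumes that "handling" accents and apostrophes means stripping them, uses a pipe-delimited key layout, and uses a made-up team abbreviation; the dashboard's actual rules may differ.

```python
import unicodedata


def strip_accents(text):
    """Decompose accented characters and drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def standardize_name(first, last):
    """Return 'LASTNAME, F' with accents and apostrophes removed (assumed rule)."""
    last_clean = strip_accents(last).replace("'", "").replace("\u2019", "")
    first_clean = strip_accents(first).strip()
    return f"{last_clean.strip().upper()}, {first_clean[:1].upper()}"


def composite_key(first, last, team, season):
    """Combine name, team abbreviation, and season into one join key."""
    return f"{standardize_name(first, last)}|{team.strip().upper()}|{season}"


# Hypothetical player and team abbreviation, spelled differently by each source.
print(composite_key("José", "O'Brien", "nbp", 2024))
print(composite_key("Jose", "O\u2019Brien", "NBP", 2024))
```

Because the season is part of the key, the same player in two different seasons produces two different keys, so season totals never merge.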

xwOBA/xwOBACON Dashboard

The xwOBA vs. wOBA tab (the default tab when the dashboard opens) compares model predictions against actual results for players in the selected season. A dropdown at the top switches between xwOBA vs. wOBA and xwOBACON vs. wOBACON. The scatter plot shows each player as a green point, with actual wOBA on the x-axis and xwOBA on the y-axis. Hovering shows player name, team, season, PAs, both stats, and the difference. When someone clicks "Load Selected Season," the app scrapes NECBL stats for that year and builds composite keys (last name, first initial, team abbreviation, and season) on both the model side and the scraped side so the two can be matched. Below the plot are a player dropdown and a summary table that shows all players with PA counts, xwOBA, wOBA, xwOBACON, wOBACON, teams, seasons, and calculated differences. Conditional formatting marks underperformers in green and overperformers in red. The table sorts by largest absolute wOBA difference.

The next tab lists underperformers: players whose xwOBA or xwOBACON exceeds their actual wOBA or wOBACON by more than 0.025. The plate appearance threshold defaults to 20, and every player who meets both requirements appears in the table. The table shows PA count, xwOBA, wOBA, xwOBACON, wOBACON, team, season, and both calculated differences. It sorts by descending wOBA_Diff so the most “unlucky” players are shown first. These players are due for positive regression, since their batted-ball quality suggests their numbers should be better. Page length is set to 15 entries, with search and sorting functionality.
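The filtering logic for this tab can be sketched in Python as follows. The sign convention (difference = expected minus actual), the field names, and the sample players are all assumptions for illustration; only the 0.025 gap and the 20 PA default come from the description above.

```python
DIFF_THRESHOLD = 0.025  # minimum gap between expected and actual
MIN_PA = 20             # default plate appearance threshold


def underperformers(players, min_pa=MIN_PA, threshold=DIFF_THRESHOLD):
    """Return players whose expected stats beat their actual stats, unluckiest first."""
    flagged = []
    for p in players:
        if p["pa"] < min_pa:
            continue
        # Assumed convention: positive difference means expected > actual.
        woba_diff = p["xwoba"] - p["woba"]
        wobacon_diff = p["xwobacon"] - p["wobacon"]
        if woba_diff > threshold or wobacon_diff > threshold:
            flagged.append({**p, "woba_diff": woba_diff, "wobacon_diff": wobacon_diff})
    return sorted(flagged, key=lambda row: row["woba_diff"], reverse=True)


# Made-up sample rows.
sample = [
    {"name": "A", "pa": 45, "xwoba": 0.380, "woba": 0.320, "xwobacon": 0.420, "wobacon": 0.400},
    {"name": "B", "pa": 12, "xwoba": 0.400, "woba": 0.300, "xwobacon": 0.450, "wobacon": 0.350},
    {"name": "C", "pa": 60, "xwoba": 0.330, "woba": 0.340, "xwobacon": 0.410, "wobacon": 0.370},
    {"name": "D", "pa": 50, "xwoba": 0.350, "woba": 0.340, "xwobacon": 0.390, "wobacon": 0.385},
]
print([row["name"] for row in underperformers(sample)])
```

Player B has the largest gap but is dropped by the PA threshold, which is the point of the threshold: small samples produce large but unreliable differences.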

The next tab is the player comparison feature. It allows head-to-head tracking of two selected players' xwOBA, xwOBACON, wOBA, and wOBACON across a specific date range, which defaults to a 90-day stretch but can be changed in the top right corner. The comparison plot shows cumulative expected and actual values over each player's plate appearances. I used facet_wrap to create separate panels for wOBA and wOBACON. Expected stats appear as solid lines and actual stats as dashed lines, color-coded by player (purple and teal gradient). Below the plot, the Comparison Summary table aggregates both players' stats over the selected date range. Both the plot and the table update reactively when selections change.
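The cumulative lines in the comparison plot amount to a running average over a player's plate appearances within the chosen dates. A rough Python sketch of that calculation is below; the event layout (date, per-PA value), the sample values, and the function name are assumptions for illustration.

```python
from datetime import date
from itertools import accumulate


def cumulative_average(events, start, end):
    """Running mean of per-PA values for events dated within [start, end]."""
    # Keep only events inside the date range, in chronological order.
    in_range = sorted((e for e in events if start <= e[0] <= end), key=lambda e: e[0])
    values = [value for _, value in in_range]
    # accumulate yields running totals; divide each by the PA count so far.
    return [total / n for n, total in enumerate(accumulate(values), start=1)]


# Made-up per-PA values for one player; the May event falls outside the range.
events = [
    (date(2025, 5, 20), 2.0),
    (date(2025, 6, 3), 0.0),
    (date(2025, 6, 5), 0.9),
    (date(2025, 6, 9), 0.0),
    (date(2025, 6, 14), 2.0),
]
series = cumulative_average(events, date(2025, 6, 1), date(2025, 6, 30))
print([round(v, 3) for v in series])
```

Running the same function once on expected values and once on actual values for each player gives the solid and dashed lines described above.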

The next tab is overperformers, the mirror image of the underperformers tab. It automatically displays players whose actual wOBA or wOBACON exceeds their xwOBA or xwOBACON by more than 0.025. It sorts by descending wOBA_Diff so the “luckiest” players appear first. These players are due for negative regression, since their batted-ball data suggests they are actually performing worse than their numbers indicate. Page length is set to 15 entries with search and sort functionality, and the table updates automatically when the season or threshold changes.

The Team Browser tab features a table of all players with flexible filtering options. I built it with four controls at the top, including a Filter by Team dropdown (defaults to All Teams), a Filter by Player dropdown (defaults to All Players), and a dropdown that sorts the table by xwOBA, xwOBACON, wOBA, wOBACON, or PAs. The table updates automatically when you change seasons, PA thresholds, team filters, player filters, or sorting preferences.
