How I Built The NHL Betting Model That Beat The Books

Circles Off

2026-04-08

How to Build an NHL Betting Model: The Essential Framework (Tested)

You’re tired of placing bets based on gut feelings or what the talking heads say. You know there has to be a structured, repeatable way to find an edge in hockey betting, but building an NHL betting model feels impossibly complex. I completely get that frustration. I’m pulling back the curtain on my 2016-17 spreadsheet model that returned nearly 12% ROI over 341 bets and beat the closing line over 87% of the time. This isn't about copying an old spreadsheet; this is about understanding the winning mindset.

I’m going to walk you through every tab, every formula decision, and every validation step used in that original structure. While the specific numbers have changed since then, the architecture for separating successful analysts from casual bettors remains exactly the same. You’ll walk away with a crystal-clear blueprint for how predictive analysis works.

Here's What We'll Cover

Why an old, high-ROI model is still the best learning tool
The core statistical inputs that actually predict hockey wins
How to accurately estimate team strength when data is sparse
The critical role of goalie modeling and player adjustments
The method used to confirm real betting value

Why Reviewing an Old NHL Betting Model Still Matters

People constantly ask how to start sports modeling and what the process looks like from the inside. The best answer, frankly, is to show a worked example. I built this first NHL betting model entirely in Excel about a decade ago, and back then it did a good job of separating skill from luck.

I must be upfront. If you copy this exact spreadsheet today, it won't be profitable. The market is far more efficient now. The edges that existed in 2017 have been priced in by smarter analysts with better tools. However, the foundation—the architectural thinking—is entirely evergreen. Winning in betting isn't about having the newest proprietary data; it's about asking the right questions about data you already possess.

Think of this journey as observing a professional dissecting a problem. We’ll cover the structure, the variables we chose, and how we judged whether the model actually found something real. That's the valuable part for you.

Spreadsheet Organization: Color Coding the Process

The structure of the file itself reflects a clear workflow. I color-coded the main tabs for easy navigation, even though there were 20 other hidden tabs feeding data, like raw inputs for multi-year goalie breakdowns.

Here’s the color scheme I used:

Blue Tabs: The bedrock. This is where research data and foundational assumptions lived.
Green Tabs: Team Ratings. This is where we calculated baseline strengths.
Orange Tabs: Goalie Modeling. Because goalies are massive variables, they required dedicated attention.
Yellow Tabs: Game Day Adjustments. This accounts for immediate factors like injuries or travel.
Red Tabs: The output. Where the final picks and projected lines appeared.
Light Blue Tabs: The results ledger, tracking performance versus expectations.

This organization ensures that every piece of input funnels logically toward the final prediction output. You see the complexity without getting lost in the noise.

Deconstructing the Core Predictive Engine

The first hurdle in building any statistical NHL betting model is identifying what *actually* causes success. Casual fans point to goals scored. And yes, goals win games. But goals are incredibly noisy indicators over a small sample size. One bad bounce can cost you the game even if you completely dominated play.

We needed stats that reliably predicted goal share over the long run. This meant running a regression analysis to map inputs (stats) to the output we cared about (goal share).

Key Variables That Explained 95% of Variance

I tested several variables, but three inputs explained roughly 95% of the variance in goal share across about 300 team seasons of NHL data going back a decade. That's an incredibly strong statistical fit.

These were:

1. CF percentage (Corsi For %): This measures shot attempt share. If your team generates 55% of all shot attempts at even strength, your Corsi For is 55%. It acts as a proxy for territorial control, meaning time spent in the offensive zone versus the defensive zone.

2. SH percentage (Shooting %): This is simply what percentage of your team's shots actually go in. This relates directly to finishing talent.

3. SV percentage (Save %): Your team's save percentage on the shots the opposing team takes. This is the rate at which your goalie stops pucks aimed at your net.

If you can accurately estimate those three numbers for two teams facing each other, you can predict with high confidence which team *should* score more goals over 82 games.

I also tested replacing Corsi with Scoring Chances (SCF %). The R-squared dropped to about 82.6%, which is still good but not as strong. Part of the reason is sample size: there are far more shot attempts than scoring chances, so the Corsi signal stabilizes faster. Corsi became the backbone.
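To make the regression step concrete, here is a minimal sketch in Python rather than Excel. It is not the original spreadsheet's regression. The data is synthetic, generated under the assumption that goals follow shot share times conversion rates plus noise, so the high R-squared is partly by construction. The names `fit_ols`, `cf`, `sh`, and `sv` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 300  # synthetic team-seasons, NOT real NHL data

cf = rng.normal(0.50, 0.025, n)   # shot-attempt share (CF%)
sh = rng.normal(0.080, 0.006, n)  # team shooting percentage
sv = rng.normal(0.920, 0.006, n)  # team save percentage

# Assumed data-generating process: goals for and against scale with
# shot share and conversion rates, with random noise on top.
gf = cf * sh
ga = (1 - cf) * (1 - sv)
goal_share = gf / (gf + ga) + rng.normal(0, 0.005, n)


def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns (coefficients, R^2)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_res = np.sum(resid ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return coef, 1 - ss_res / ss_tot


X = np.column_stack([cf, sh, sv])
coef, r2 = fit_ols(X, goal_share)
print("intercept, CF, SH, SV coefficients:", np.round(coef, 3))
print("R^2:", round(r2, 3))
```

With real data you would swap the synthetic arrays for one row per team season and check whether the fit holds up on seasons the regression has not seen.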

Creating Stable Team Ratings from Noisy Data

Here's where most attempts at sports modeling fail. You can’t just look at the standings after 15 games and assume your team rating is accurate. Early season data is thin, and lucky streaks or unlucky cold spells distort initial numbers.

The Solution: Blending Time Frames

The solution we implemented in the Team Ratings tab was blending historical context with current form. This is how we stabilized the early-season projections.

We blended three things, focusing only on 5v5 play because special teams add unnecessary noise:

Full Season Context: The team's full-season stats, with the prior year's numbers acting as a stabilizer going into a new season.
Multi-Year History: Stabilizing data going back several years.
Recent Form: The team’s performance over the last 10 or 15 games.

For the start of the season, we leaned heavily on history. As the year went on, we gradually shifted more weight toward recent results, allowing the model to adapt to actual roster changes.

This process generated a composite rating that projected an expected goals-for percentage for each team on a neutral surface. That let us rank every team from best to worst, with Tampa Bay, Pittsburgh, and Vegas at the top.
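The blending idea can be sketched as a weighted average whose weights shift as games accumulate. The specific weights and the 40-game ramp below are assumptions chosen for illustration; the original spreadsheet's weights are not given here.

```python
def blend_weights(games_played, ramp_games=40):
    """Return (multi_year, season, recent) weights that sum to 1.

    Early in the season most weight sits on multi-year history; by
    `ramp_games` it has shifted toward the season and recent form.
    All numbers here are illustrative assumptions.
    """
    progress = min(games_played / ramp_games, 1.0)
    multi_year = 0.60 * (1 - progress) + 0.15 * progress
    season = 0.30 * (1 - progress) + 0.50 * progress
    recent = 1.0 - multi_year - season
    return multi_year, season, recent


def blended_rating(multi_year_gf, season_gf, recent_gf, games_played):
    """Blend three 5v5 goal-share estimates into one composite rating."""
    w_my, w_s, w_r = blend_weights(games_played)
    return w_my * multi_year_gf + w_s * season_gf + w_r * recent_gf


# A team with strong history but a cold start, 5 games in vs. 60 games in.
early = blended_rating(0.53, 0.48, 0.46, games_played=5)
late = blended_rating(0.53, 0.48, 0.46, games_played=60)
print(round(early, 4), round(late, 4))
```

The same cold stretch barely moves the rating after 5 games but pulls it down noticeably after 60, which is the behavior the blend is meant to produce.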

Adjusting for Player Absences

Crucially, these projections needed to account for injuries. Columns on the right side of the Team Rating tab tracked short-term and long-term absences of significant players. These injuries forced mathematical adjustments to the projected team strength. Losing a key offensive driver like Nashville's Filip Forsberg visibly lowered their projected strength rating.

Addressing the Single Biggest Variable: Goalie Modeling

Save percentage isn't a team trait; it's a measurement that belongs almost exclusively to the individual starting goalie. You can’t just plug in the team’s season save percentage and call it a day.

I built a specific goalie model that looked at goalie performance across three seasons, weighted heavily toward the most recent year. Shots faced also served as a sample size weight. More shots faced means a more reliable save percentage figure.
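A rough sketch of that weighting is below. The recency weights, the league-average save percentage, and the size of the pull toward league average are all assumptions made for illustration. The episode transcript mentions regressing small-sample goalies toward league average, but not how strongly.

```python
from dataclasses import dataclass

LEAGUE_AVG_SV = 0.915              # assumed league-average save percentage
RECENCY_WEIGHTS = (1.0, 0.6, 0.3)  # most recent season first; illustrative
PRIOR_SHOTS = 1000                 # assumed strength of the pull to average


@dataclass
class GoalieSeason:
    shots_faced: int
    saves: int


def projected_save_pct(seasons):
    """Project save % from up to three seasons, most recent first.

    Each season counts in proportion to shots faced times a recency
    weight, then the total is shrunk toward league average by adding
    PRIOR_SHOTS worth of league-average shots.
    """
    weighted_shots = 0.0
    weighted_saves = 0.0
    for weight, season in zip(RECENCY_WEIGHTS, seasons):
        weighted_shots += weight * season.shots_faced
        weighted_saves += weight * season.saves
    return (weighted_saves + PRIOR_SHOTS * LEAGUE_AVG_SV) / (
        weighted_shots + PRIOR_SHOTS
    )


veteran = [GoalieSeason(1800, 1674)] * 3  # .930 over three full seasons
rookie = [GoalieSeason(100, 95)]          # .950 over a handful of games
print(round(projected_save_pct(veteran), 4))
print(round(projected_save_pct(rookie), 4))
```

The rookie's raw .950 is higher, but the projection trusts the veteran's .930 more because it rests on far more shots.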

Value Above Replacement Player (VARP) for Goalies

We used a custom Value Above Replacement Player (VARP) metric for goalies, built with Dom Luszczyszyn. This metric quantified how many wins a specific goalie provided compared to a replacement-level goalie.

This was powerful for applying real-time adjustments. If a top goalie like Carey Price (who rated at about 2.8 wins above replacement) was injured and replaced by a low-tier backup, the model immediately dropped the team’s projected strength significantly. For example, replacing Vegas’s starter with a truly weak backup dropped them from 3rd to 14th in the rankings.

Keep in mind, goalie analysis has exploded since 2017. We didn't have access to things like Goals Saved Above Expected, which are common now. Our goalie model was just weighted save percentage with regression toward league average, which made it the model's weakest link, but it was still effective back then.

Game Day Factors and Finalizing the Edge

Once the baseline team and goalie projections were set, the final layer involved real-time, game-specific adjustments. This is where we layered in skater value and schedule fatigue.

Skater Impact Using VARP

For skaters, we used VARP to measure what a player contributed above a replacement level. If Connor McDavid is out, you’re losing the massive performance gap between McDavid and the player who replaces him in the lineup. This loss is quantifiable.

Depth players sitting out doesn't move the needle much, but an elite absence absolutely does. The rankings tab was adjusted based on these VARP losses or gains. For instance, losing two key players like Brad Marchand and Charlie McAvoy might translate to a loss of 3.3 projected wins for Boston over the rest of their schedule.

Schedule Compression Adjustments

The model also recognized physical toll. Playing the second game of a back-to-back slate or having three games in four nights imparts a measurable disadvantage. The model slightly lowered the win probability for the tired team accordingly.

When the schedule tab ran its final calculation for a specific game, it pulled the adjusted home/away ratings, factored in goalie quality, accounted for player VARP losses, and applied rest penalties. The output was a projected win probability for both sides.

For example, our opening night projection for St. Louis at Pittsburgh made Pittsburgh a -172 favorite. Pinnacle had Pittsburgh at -180 and St. Louis at +162. A gap that small is not a bet; it simply shows that the model's early-season projections tracked the market closely.
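One way to check a projected line against the market is to convert both to probabilities and compute expected value at the quoted prices. The sketch below uses the St. Louis at Pittsburgh numbers, with the +162 price on St. Louis taken from the episode transcript. It assumes the model's -172 is a fair line with no margin built in, and the function names are illustrative.

```python
def american_to_prob(odds):
    """Implied win probability of an American moneyline price."""
    if odds < 0:
        return -odds / (-odds + 100)
    return 100 / (odds + 100)


def american_to_payout(odds):
    """Profit per 1 unit staked if the bet wins."""
    return 100 / -odds if odds < 0 else odds / 100


def no_vig(home_odds, away_odds):
    """Strip the bookmaker margin by normalizing the two implied probabilities."""
    p_home = american_to_prob(home_odds)
    p_away = american_to_prob(away_odds)
    total = p_home + p_away
    return p_home / total, p_away / total


def expected_value(win_prob, odds):
    """Expected profit per 1 unit staked at the quoted price."""
    return win_prob * american_to_payout(odds) - (1 - win_prob)


model_home = american_to_prob(-172)        # model's Pittsburgh win probability
fair_home, fair_away = no_vig(-180, 162)   # market probabilities without margin
ev_home = expected_value(model_home, -180)
ev_away = expected_value(1 - model_home, 162)

print(f"model: {model_home:.3f}  market (no vig): {fair_home:.3f}")
print(f"EV Pittsburgh -180: {ev_home:+.3f}  EV St. Louis +162: {ev_away:+.3f}")
```

With these inputs both expected values come out slightly negative, so neither side clears the bookmaker's margin at the quoted prices.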

Tracking Wagers and Validating the True Edge

Finding value is one thing. Proving you can consistently beat the market is another. This is the crucial step many bettors skip.

I manually tracked every wager in the Wagers tab. I recorded the line we took, the closing line, whether we won or lost, and the resulting closing line value (CLV). It wasn't automated like today, but the discipline was non-negotiable.

The Importance of Closing Line Value

The results showed a 56.3% win rate and an 11.7% ROI over 341 bets. But the real test was CLV. We beat the closing line 87.4% of the time. On bets that beat the close, our ROI was 13.8%. On bets that lost to the close, our ROI was negative 3%.

This confirmed that the model found real, exploitable edges, not just variance luck. We were identifying lines that the broader market would eventually correct to.
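A wager ledger like this can be sketched in a few lines. The four wagers below are made up for illustration and are not from the original Wagers tab. The sketch assumes the closing price is for the same side as the bet, and it compares raw prices without removing the bookmaker's margin.

```python
from dataclasses import dataclass


@dataclass
class Wager:
    odds_taken: int    # American odds at the time of the bet
    closing_odds: int  # closing American odds on the same side
    stake: float
    won: bool


def implied_prob(odds):
    """Implied win probability of an American moneyline price."""
    return -odds / (-odds + 100) if odds < 0 else 100 / (odds + 100)


def profit(w):
    """Profit or loss on a settled wager."""
    payout = 100 / -w.odds_taken if w.odds_taken < 0 else w.odds_taken / 100
    return w.stake * payout if w.won else -w.stake


def beat_close(w):
    """True if the price taken was better than the closing price."""
    return implied_prob(w.odds_taken) < implied_prob(w.closing_odds)


def summarize(wagers):
    """ROI and beat-the-close rate for a list of wagers."""
    staked = sum(w.stake for w in wagers)
    return {
        "bets": len(wagers),
        "roi": sum(profit(w) for w in wagers) / staked,
        "beat_close_rate": sum(beat_close(w) for w in wagers) / len(wagers),
    }


ledger = [  # hypothetical wagers
    Wager(-120, -140, 100, True),
    Wager(150, 130, 100, False),
    Wager(-110, -105, 100, True),
    Wager(120, 135, 100, False),
]
print(summarize(ledger))
print(summarize([w for w in ledger if beat_close(w)]))
```

Splitting the ledger by `beat_close` gives the same kind of comparison as the 13.8% versus negative 3% figures above.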

Interestingly, the model was consistently wrong about certain teams for betting purposes. Betting against Arizona cost us about $21,000, and betting against the Kings and Vegas lost money too. Conversely, fading Ottawa returned about $25,000.

Common Questions About NHL Betting Model Construction

The Easiest Way to Start Building Today

If you're looking to start your own NHL betting model, forget the fancy stats initially. Start simple. Take those foundational three inputs—Corsi, Shooting %, Save %—and run a basic regression in a spreadsheet. The goal is to establish a repeatable framework before adding complexity like VARP or schedule adjustments.

Why Can't I Just Use Season Averages?

Season averages are too noisy, especially early in the year, due to small sample sizes. If your star player misses 10 games, your season average is skewed. We blend recent form with multi-year history to create a tempered, more reliable true team strength projection, stabilizing the input data.

How Important Is Goalie VARP in Modern Hockey Modeling?

Goalie VARP is extremely important. Goalies impact win probability more dramatically than any other single position in hockey. While newer metrics are better than simple weighted save percentage, understanding how much value a starter adds over a replacement is essential for accurate game projection.

What Was the Biggest Flaw in the 2017 Model?

The biggest flaw was the simplicity of the goalie analysis and the manual injury tracking. Today, analysts use goals saved above expected derived from underlying shot data, which captures true goaltending skill better than we could in 2017. Automation would also replace the manual injury logs we maintained.

How Do I Know If My Model Found a Real Edge?

The best indicator is Closing Line Value. If the lines you bet are consistently better than the closing line, you are finding value, regardless of short-term results. Long-term profitability tends to follow consistent CLV.

Your Next Steps

This throwback model proved that sophisticated analysis doesn't require proprietary chips or massive cloud computing power. It requires asking the right questions about observable variables. You need the right statistical framework and, most importantly, the honesty to validate whether your framework is actually working by tracking your bets against closing lines.

We covered the core regression, the stabilizing tactics, and the necessary game-day adjustments. Now, it’s your turn to apply this mindset. If you want to discuss these components, share your own model structures, or ask granular questions about the Excel tabs, join our free community.

We talk modeling strategy constantly on our Discord server. Go check out discord.gg/hammer, or find the link in the description below. Start building your framework today; the thinking is what matters.

Episode Transcript

[00:00] ## Introduction to NHL Betting Model
[00:02] In the 2016-17 NHL season, I placed 341 bets on hockey using a model that I built entirely in an Excel spreadsheet. Those 341 bets returned almost a **12% ROI**, and beat the closing line **87% of the time**.
[00:18] Today, I'm going to open up that spreadsheet publicly for the first time and walk you through exactly how I built it — every tab, every formula, every decision. Not because the model still works — it doesn't, it was built a decade ago — but because the thinking behind this model is the same thinking that separates people who *actually* make money betting sports from those who just think they do.
[00:46] ## Overview: Why Show This Model?
[00:49] Many people have been asking about sports modeling: how to start, what it looks like under the hood. I believe the best answer is just to show an example.
[01:00] I'll pull up my 2016-17 model right here — it's just a bunch of tabs in an Excel sheet.
[01:08] I want to be upfront: this model would **not beat the market today.** The data landscape has changed, competition is different, and the edges exploited by this model have largely been priced in now.
[01:21] But the **architecture** of building a model — the questions you ask, how you structure data, how you validate the model's edges — none of that has changed. This is evergreen.
[01:34] So, think of this less as "here's a model to copy" and more as "here's how a professional bettor's brain works when building a model from scratch."
[01:45] ## Spreadsheet Tabs and Organization
[01:47] At the bottom, I color-coded and organized the tabs:
[01:50] - **Blue:** Research foundation
[01:52] - **Green:** Team ratings
[01:53] - **Orange:** Goalie modeling
[01:55] - **Yellow:** Game day adjustments
[01:57] - **Red:** Where the model spits out picks
[02:00] - **Light blue:** Results
[02:02] There are actually about 20 more hidden tabs — raw data inputs, player stats, multi-year goalie breakdowns — feeding into these visible tabs. I hid them to keep it clean and followable, but the model is deeper than what you see.
[02:18] ## The Core Predictive Model: Regression and Variables
[02:21] ### What Predicts Hockey Wins?
[02:23] The first question was simple: what actually predicts whether a hockey team wins or loses?
[02:29] Casual viewers might say goals — the team that scores more goals wins, literally how the sport works. But goals are noisy. A team can outplay their opponent for 58 minutes and lose on a bad bounce — that happens often.
[02:46] So the question became: **what stats reliably predict which teams will score more goals over a season?**
[02:53] ### Running a Regression
[02:54] I ran a regression — mathematically finding relationships between inputs and outputs.
[02:59] - Inputs I used were **CF percentage (Corsi For %), SH percentage (Shooting %), and SV percentage (Save %)**.
[03:07] #### Definitions:
[03:07] - **CF percentage:** measures shot attempt share — e.g., if your team generates 55% of shot attempts, you have a CF of 55%. It acts as a proxy for territorial dominance (time spent in offensive vs. defensive zone).
[03:23] - **SH percentage:** shooting percentage of your team’s shots, the % that score.
[03:28] - **SV percentage:** save percentage of shots the opposing team takes, i.e., what % the goalie stops.
[03:35] These three inputs explained about **95% of the variance** in goal share across about 300 team seasons of NHL data going back a decade — a very strong fit.
[03:46] If I can accurately estimate these three numbers for each team, I can predict which team should score more goals with high confidence.
[03:55] ### Alternative Stats: Scoring Chances
[03:57] I reran the regression replacing Corsi with scoring chances (SCF %). The R squared dropped to about **82.6%**, still good but not as strong.
[04:07] Part of that was sample size: way more shot attempts (Corsi) than scoring chances, so Corsi signal stabilizes faster.
[04:15] This first regression became the **backbone** of the whole model.
[04:19] ## Estimating Team Stats Accurately
[04:21] ### The Challenge
[04:22] You can't just look at season stats and call it a day — early in the season, data is sparse (e.g., only 10 games), so your team's stats might look good or bad by chance.
[04:36] ### Solution: Blend Multiple Time Frames
[04:38] This tab takes every team's full 2016-17 season stats at **5v5 play** (not overall, because special teams add noise). 5v5 is stickier season over season and shows biggest team differences.
[04:50] I blended:
[04:51] - Full season
[04:52] - Recent form
[04:53] - Multi-year history
[04:55] The idea was to use a longer history early in the season (stabilizer) then shift weight to recent performance as the season progresses.
[05:04] For example, I used previous year stats as a stabilizer going into the new season.
[05:10] ### Team Ratings Tab
[05:11] This tab is the **brain of the model.** Each team gets a composite rating blending multi-year, current season, and recent form.
[05:20] It produces a projected **expected goals for percentage** — what % of goals the team should score on a neutral surface.
[05:28] Teams are ranked top to bottom: Tampa Bay (1), Pittsburgh (2), Vegas Golden Knights (3), Nashville (4), down to Buffalo, Ottawa, Arizona at the bottom. Seattle was not in the league yet.
[05:41] ### Injuries
[05:42] Columns on the right track injuries — day-to-day and long-term absences of key players are manually entered (this would be automated today).
[05:51] These injuries **mathematically adjust** team strength projections.
[05:53] For example, Nashville missing Filip Forsberg or Boston missing Brad Marchand affects the projected rating.
[05:59] ## Goalie Modeling
[06:01] ### Why Needed?
[06:02] Save percentage isn’t really a team stat; it’s a goalie stat.
[06:06] Goalies vary nightly, so I built a **separate goalie model**.
[06:10] ### Data Collection
[06:11] Goalie data going back three seasons, weighted based on recency and shots faced:
[06:17] - Most recent season weighted highest
[06:19] - Shots faced acts as a sample size weight
[06:23] ### Sample Illustration
[06:24] Sergey Bobrovsky is the #1 goalie in the model, with save percentage improving from ~91.5% three years ago to 93.2% most recently.
[06:33] The model regresses small-sample goalies toward league average to avoid distorting projections.
[06:37] ### Value Above Replacement Player (VARP) for Goalies
[06:41] I worked with Dom Luszczyszyn to build a VARP metric ranking goalies by value over a replacement-level goalie.
[06:48] Example: Carey Price ranked top at about 2.8 wins above replacement.
[06:52] This matters when a team uses a backup or different goalie — it adjusts projected strength.
[06:59] ### Example
[06:59] Switching Vegas goalie from Marc-André Fleury to Brian Elliott reduces team strength slightly.
[07:05] Switching to a weaker goalie like Harri Säteri lowers Vegas's ranking dramatically from 3rd to 14th.
[07:11] ### Model Weakness
[07:12] Goalie analysis has become more sophisticated since 2017, with advanced stats like goals saved above expected, pre-shot movement data, rebound control — most of which were unavailable then.
[07:23] My goalie model was just weighted save percentage with regression, making it the weakest link.
[07:29] ## Community and Learning
[07:31] If you’re interested, we discuss modeling constantly on our free Discord community at **discord.gg/hammer** — people build models, share numbers, and collaborate.
[07:40] ## Game Day Factors: Injuries and Adjustments
[07:43] ### Value Above Replacement Player (VARP)
[07:45] For skaters, VARP quantifies how much each player adds compared to a replacement-level player.
[07:51] If McDavid is out, Edmonton loses the gap between McDavid and his replacement.
[07:56] Example top players’ VARP: McDavid, Kucherov, Marchand, Pastrnak, Bergeron.
[08:00] Lower-tier players don’t move the needle much in real time.
[08:04] ### Using VARP in the Model
[08:06] Injuries on the rankings tab adjust team strength based on player absences, measured in wins lost or gained.
[08:13] E.g., Nashville losing Forsberg costs 1.54 wins, Boston losing Marchand and McAvoy costs 3.3 wins.
[08:19] ### Other Adjustments: Schedule Compression
[08:21] The model accounts for days rest, back to back games, and other schedule factors.
[08:27] For instance, Pittsburgh playing the second game in two nights is at a measurable disadvantage, and the model lowers their win probability accordingly.
[08:36] ## Final Model: Putting it All Together
[08:39] ### Schedule Tab
[08:40] For every game, the model:
[08:42] - Pulls away and home team ratings
[08:45] - Adjusts for goalie and injuries
[08:47] - Adjusts for rest
[08:49] - Outputs projected win probability for each team
[08:52] Example: Opening day game Oct 4, St. Louis at Pittsburgh.
[08:56] Model’s expected line was -172 for Pittsburgh, Pinnacle odds were -180 (home) and +162 (away).
[09:02] Early season predictions closely aligned with market lines.
[09:05] ### Bets and Edges
[09:07] On Oct 7, model flagged bets on Tampa Bay, Carolina, Chicago, and Vegas.
[09:12] The model’s edge for Tampa Bay was about 0.6%, and bet size was calculated accordingly (e.g., $1,200 for Vegas).
[09:20] The model’s bets moved the market somewhat.
[09:23] ## Tracking Bets and Performance
[09:25] ### Wagers Tab
[09:26] I tracked every bet manually — before automatic tools existed.
[09:30] Recorded line taken, closing line, win/loss, closing line value.
[09:33] ### Results Tab
[09:35] - 341 total bets
[09:36] - 56.3% win rate
[09:38] - ROI just under 11.7%
[09:40] - Beat the closing line 87.4% of the time
[09:44] When beating the close, ROI was 13.8%; losing to the close, ROI was -3%.
[09:49] This CLV data gave confidence that the model was finding **real edges**.
[09:54] ### Team-Specific Betting Results
[09:56] - Betting against Arizona cost me about $21,000
[09:59] - Betting against the Kings and Vegas Golden Knights cost me too
[10:04] - Betting against Ottawa returned about $25,000
[10:06] ## Closing Thoughts
[10:08] ### Why This Model Doesn’t Work Today
[10:10] 1. Data availability has improved dramatically since 2017.
[10:14] 2. Market efficiency has increased, competition sharpened.
[10:16] 3. Operational workflows have evolved — automation now replaces manual data entry.
[10:21] ### Advice for Beginners
[10:23] Start simple:
[10:23] - Use three variables in a regression
[10:26] - Build it in a spreadsheet
[10:29] My first hockey model made money *not* because it was sophisticated, but because I asked the right questions, used the right framework, and validated the results honestly.
[10:39] The tools changed, but the thinking stayed the same.
[10:43] ### Community Engagement
[10:44] If you have questions, want details about tabs, or want to share your model ideas, join our Discord or leave a comment below.
[10:53] Just type **discord.gg/hammer** or find the link in the description.
[10:57] Thanks for watching — hit subscribe if you found this useful!
[11:02] See you in the next one. Peace out.