The 2025 hurricane season was a coming-of-age story for AI weather models, which have existed in some form since about 2022.

Last hurricane season, AI models not only proved they could compete with the gold-standard corrected consensus models – the guidance human forecasters rely on most, which blends the best physics-based forecast models and corrects for their biases in real time – but beat them, even besting the National Hurricane Center’s official forecast across nearly every forecast period.

Forecast track skill for the main hurricane track models for the 2025 Atlantic hurricane season relative to climatology and persistence (the “no-skill” benchmark for forecast models). Higher numbers indicate higher skill and more accurate forecasts. The best-performing model in 2025 was Google’s machine learning-based DeepMind (green line). NHC’s official forecast (black line) was the second-most skillful forecast, followed by the gold-standard Hurricane Forecast Improvement Program (HFIP) Corrected Consensus Approach (HCCA) model (pink line). (National Hurricane Center)

It was impressive, though not altogether surprising, for an AI-designed hurricane model to be competitive in forecasting hurricane track, but it was downright incredible to see an AI forecast model top the National Hurricane Center’s official intensity forecast last season, as Google’s DeepMind did across most forecast periods.

Forecast intensity skill for the main hurricane intensity models for the 2025 Atlantic hurricane season relative to the “no-skill” Decay-SHIFOR5 (Statistical Hurricane Intensity FORecast) benchmark. The higher numbers indicate higher skill and more accurate forecasts. The top intensity forecasts in 2025 for most periods came from Google’s machine learning-based DeepMind (green line), followed by NHC’s official intensity forecast (black line). The least-skillful intensity forecasts were from our global models such as the American GFS (dark blue line) and European model (light blue line). Global forecast models can excel with hurricane track forecasting but don’t capture the fine details necessary for hurricane intensity forecasting. (National Hurricane Center)
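For readers curious what “skill” means in the two charts above, here is a minimal sketch in Python. It assumes the common verification convention of expressing skill as the percent improvement of a model’s average error over the error of the no-skill baseline. The function name and the sample numbers are made up for illustration and are not NHC data.

```python
def skill_percent(model_error: float, baseline_error: float) -> float:
    """Percent improvement of a model's mean error over a no-skill baseline.

    Positive values mean the model beat the baseline, zero means it tied,
    and negative values mean it did worse than the baseline.
    """
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 100.0 * (baseline_error - model_error) / baseline_error


# Hypothetical 72-hour mean track errors in nautical miles (not real data).
baseline_nm = 200.0  # assumed error of the no-skill benchmark
model_nm = 80.0      # assumed error of the model being verified

print(skill_percent(model_nm, baseline_nm))  # 60.0
```

Under this convention, a model that merely matches the benchmark scores zero, which is how a forecast can show “no skill” even when its errors look small in absolute terms.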

While hurricane track forecasting has made tremendous strides over the past 30 years, intensity forecasting has often stagnated.

Over the past 15 years, however, slow progress has been made in reducing forecast error for hurricane intensity and detecting rapidly intensifying hurricanes in advance.

These improvements have been largely attributed to ultra-high-resolution, storm-following hurricane models – like NOAA’s HAFS and its predecessor, HWRF – that better capture the fine inner-core details and complicated interaction between the air and ocean that meteorologists argue are necessary to accurately forecast hurricane intensity.

The remarkable feat of DeepMind and its stellar intensity forecasts last hurricane season was that it didn’t have all the fine details of our traditional high-resolution, physics-based hurricane models.

It simply takes a snapshot of the current state of the atmosphere as assessed by the European Centre’s high-resolution global weather prediction system (known as the Integrated Forecasting System or IFS) – the only thing it presumably knows about the state of the atmosphere – and then makes a forecast drawing only from data it’s trained on, which includes historical weather data back to 1940 (from the European Centre’s fifth-generation atmospheric reanalysis of the global climate or ERA5) and global unified hurricane track data (from the International Best Track Archive for Climate Stewardship or IBTrACS).

That’s it.

The beauty is in AI’s simplicity, and AI forecasts are made in a fraction of the time and with immensely less computing power than what’s required of conventional physics-based weather models, models that harness some of the most powerful and expensive supercomputers in the world.

Once the AI model is trained, it can make a forecast in minutes using a standard laptop, whereas traditional physics-based models take hours to produce a full forecast on banks of expansive, mainframe supercomputers stored in large server rooms around the country.

So, unlike other types of AI-related tasks, which often require massive, energy-hungry AI data centers and industrial-sized water draws for continuous cooling, AI weather models actually have the potential to make weather forecasting more streamlined and cost-efficient. In fact, NOAA estimates the AI version of its global forecast model uses a mere 0.3% of the computing resources of its physics-based version.

If AI weather models can prove as accurate as, or more accurate than, traditional weather forecast models, as Google DeepMind is doing in the hurricane world, we’re looking at significantly cheaper and faster forecasts in the future.

Machine learning versus other types of AI

Artificial intelligence or AI is an umbrella term used for many different types of new technology nowadays. It’s important to distinguish which subset of AI new weather models are harnessing, however, compared to the type of AI technology most accessible to the public through AI chatbots.

The new AI weather models are not large language models or LLMs that simply mimic human speech patterns or text. They’re also not generative AI that produces images or videos based on user prompts. These are the most accessible and familiar types of AI found on platforms like OpenAI’s ChatGPT, Google’s Gemini, and Anthropic’s Claude.

Instead, AI weather models belong to a deep, machine learning-based subset of AI that uses complex neural networks to effectively mimic how the human brain processes information. This makes these models behave similarly to how a human might forecast weather, relying on pattern recognition based on their understanding of historical weather. But in the case of AI, the models are trained on petabytes of data going back a century or longer, more data than most of us could holistically comprehend in routine forecasts.

These advanced machine learning-based weather models are a far cry from the AI “slop” we often see floating around the dark corners of social media and the internet.

Not all AI weather models are created equal

For the time being, Google DeepMind has the hurricane forecast model market cornered. While DeepMind ran circles around all other models in 2025, the same can’t be said about other AI newcomers.

For example, the European Centre, which runs the famed Euro model – the world’s leading physics-based global weather model – also has an AI-based version that ran last hurricane season.

While it performed exceptionally well for hurricane track forecasts – better than the conventional physics-based Euro – it was a total flop for hurricane intensity forecasts, basically showing no skill whatsoever.

Track error from assorted hurricane models across the Atlantic and Pacific basins last hurricane season. Both Google DeepMind (GDMI red line above) and the European Centre’s AI weather model (AIFI gold line above) had lower track errors than even the National Hurricane Center’s official forecast (black line). (James Franklin/Mark DeMaria/NOAA)

Intensity error from assorted hurricane models across the Atlantic and Pacific basins last hurricane season. While Google DeepMind (GDMI red line above) had extremely low intensity error comparable to NHC’s official forecast (black line), the European Centre’s AI weather model (AIFI gold line above) exhibited no skill relative to the OCD5 “no-skill” baseline. (James Franklin/Mark DeMaria/NOAA)

A major reason for the AI-based Euro model’s poor intensity forecast performance is its reliance on coarse reanalysis training data from ERA5. Trained to minimize error on that data, the model more often than not weakens hurricanes too quickly, resulting in a large low bias in intensity.

Google DeepMind bypasses this issue by incorporating historical hurricane intensities in its training dataset so maximum winds act like extra grid points in the model.

Presumably, other AI models will eventually catch up to Google DeepMind’s approach, but for now, when it comes to intensity forecasting, DeepMind is the AI champion.

What’s new with AI models for the 2026 hurricane season?

Google has several new features and improvements to DeepMind available this hurricane season. Perhaps the biggest change is that DeepMind now offers up to 1,000 ensemble members, compared to the 50-member maximum last hurricane season.

It’s not yet clear where the larger ensembles might help, but the thinking is they’ll better detect outlier events (low-probability events that could still pose high-risk impacts).

Also, with the addition of 1,000-member ensembles, Google and NOAA are experimenting with incorporating them in wind speed probability forecasts that currently use a 1,000-member Monte Carlo-based model to simulate storm scenarios.
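One way to see why a bigger ensemble could help with outliers: if a probability is estimated by simply counting members, the smallest nonzero probability an N-member ensemble can express is 1/N. The Python sketch below illustrates only that counting idea. The function names and wind values are hypothetical, and this is not a description of how NHC’s Monte Carlo wind speed probability model or Google’s system actually works.

```python
def exceedance_probability(member_winds_kt, threshold_kt):
    """Fraction of ensemble members at or above a wind threshold (knots)."""
    if not member_winds_kt:
        raise ValueError("need at least one ensemble member")
    hits = sum(1 for wind in member_winds_kt if wind >= threshold_kt)
    return hits / len(member_winds_kt)


def smallest_nonzero_probability(n_members):
    """Finest probability step a simple member count can resolve."""
    return 1.0 / n_members


# Hypothetical point forecast: one member out of 50 reaches hurricane force.
small_ensemble = [40.0] * 49 + [70.0]
HURRICANE_FORCE_KT = 64.0  # 64 knots is the hurricane-force wind threshold

print(exceedance_probability(small_ensemble, HURRICANE_FORCE_KT))  # 0.02
print(smallest_nonzero_probability(50))    # 0.02
print(smallest_nonzero_probability(1000))  # 0.001
```

With 50 members, any event rarer than about 1-in-50 shows up as either 0% or at least 2%. With 1,000 members the same count can distinguish chances down to 0.1%, which is the kind of low-probability, high-impact scenario described above.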

It’s also possible the larger set of ensembles will improve detection of hurricane rapid intensification, which Google DeepMind’s 50-member ensemble helped with at longer lead times last season.

Brier skill score for various probabilistic rapid intensification models for last hurricane season across both the Atlantic and eastern North Pacific basins. Higher values indicate higher skill and more accurate forecasts. Google DeepMind’s ensemble system was the best performing model for predicting rapid intensification at long lead times (2-3 days out), but struggled at short lead times. It’s worth noting that DeepMind updated its AI tracker in September, which may have affected these verification numbers. (Mark DeMaria/NOAA)
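For context on the metric in this chart, here is a minimal Python sketch of a Brier skill score. The Brier score is the average squared difference between forecast probabilities and what happened (1 if rapid intensification occurred, 0 if not), and the skill score compares it to a reference forecast. Using a constant base-rate forecast as the reference is an assumption here, as are the function names and sample values; the chart’s actual reference forecast isn’t specified.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and outcomes."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and equal length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_skill_score(probs, outcomes, reference_probs):
    """1 is perfect, 0 matches the reference, negative is worse than it."""
    reference = brier_score(reference_probs, outcomes)
    if reference == 0:
        raise ValueError("reference forecast is perfect; skill is undefined")
    return 1.0 - brier_score(probs, outcomes) / reference


# Hypothetical cases: did rapid intensification occur (1) or not (0)?
outcomes = [1, 0, 0, 0]
model_probs = [0.5, 0.0, 0.0, 0.5]  # made-up model probabilities
base_rate = [0.25] * 4              # assumed reference: always forecast 25%

print(round(brier_skill_score(model_probs, outcomes, base_rate), 2))  # 0.33
```

A positive score means the probabilistic forecast added value over the reference, while a score at or below zero means it did no better than always quoting the same background odds.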

It’s also worth noting that the folks at Google updated their AI hurricane tracker last September, which seemed to improve DeepMind’s forecasts leading up to Hurricane Melissa last October.

As I wrote about toward the end of last hurricane season, the American GFS had an especially bad year for hurricane forecasts, its worst performance in over two decades. Much of that error can be attributed to its especially poor forecasts for Hurricane Melissa, the most impactful hurricane last season, which the GFS inaccurately showed recurving too quickly before reaching Jamaica.

American GFS model track errors for the 2025 Atlantic hurricane season including Hurricane Melissa (black bars) and excluding Hurricane Melissa (red bars). Hurricane Melissa increased the GFS’s Atlantic basin track errors by roughly 20-30% in the 4-to-7-day window. (NOAA/EMC)

NOAA is planning to release an update to its GFS model (from version 16 to version 17) in October, which the agency says shows improvements in tropical cyclone track forecasting.

Additionally, the AI version of the American GFS ensemble system (known as AI-GEFS), which NHC will be evaluating this hurricane season, has shown considerable promise in improving hurricane forecasts versus the conventional physics-based GFS.

Is the writing on the wall for conventional models and human forecasters?

Although the new era of hurricane forecasting will increasingly look to AI models for speed, efficiency, and accuracy, don’t expect conventional physics-based models to be going away anytime soon.

Those models are what’s training the AI models, so the two will continue to work hand-in-glove, even if AI forecasts outpace their physics-based counterparts.

As for the human side of forecasting, humans will remain vital to future forecasts even, and especially, with the addition of AI: understanding model biases, detecting quick changes in hurricane strength or organization, harvesting and quality-controlling the real-time data used to train and improve AI models, designing products and decision-support tools around AI, and messaging the complexities of the forecast.

AI weather models changing the hurricane forecast game

Updates to AI forecast models in 2026 look to build upon the successes of last hurricane season

About The Author

Michael Lowry