Comment by tndl

10 months ago

This is super interesting, I'd never come across ARGO before. Data assimilation is a similar problem for our data, and there currently exist systems for assimilating weather balloon observations into gridded reanalysis data (https://www2.mmm.ucar.edu/wrf/users/). One thing we believe, however, is that the reanalysis step in weather forecasting is unnecessary in the long term, and that future (ML) weather models will eventually opt to generate predictions based on un-assimilated raw data and will get better results in doing so.

That being said, trajectory-based data tooling could be super interesting to us. Let's definitely chat: austin@sorcerer.earth

And re: recovery, we're pretty confident we'll be able to recover the majority of our systems. Being in the air has the advantage that we can choose to 'beach' ourselves in a specific location, rather than at the first place we run across land, as with the buoys. At his previous company, Alex wrote a prediction engine able to get similar balloon systems to land in a predicted 1 km x 1 km zone for recovery.

> One thing we believe, however, is that the reanalysis step in weather forecasting is unnecessary in the long term, and that future (ML) weather models will eventually opt to generate predictions based on un-assimilated raw data and will get better results in doing so.

The idea that we'll be able to run ML weather models using "raw" observations and skip or implicitly incorporate an assimilation step is spot-on - there's been an enormous shift in the AI-weather community over the past year to acknowledge that this is coming, and very soon.

But... in your launch announcement you seem to imply that you're already using your data for building and running these types of models. Can you clarify how you're actually going to be using your data over the next 12-24 months while this next-generation AI approach matures? Are you just doing traditional assimilation with NWP?

Also, to the point about reanalysis - that's almost certainly not correct. There are massive avenues of scientific research which rely on a fully-assimilated and reconciled, corrected, consistent analysis of atmospheric conditions. AI models in the form of foundation models or embeddings might provide new pathways to build reanalysis products, but those products are a vital tool and will likely remain so for the foreseeable future.

  • > There are massive avenues of scientific research which rely on a fully-assimilated and reconciled, corrected, consistent analysis of atmospheric conditions.

    That’s a good point! In fact, observation-based foundation models will likely include a "reanalysis-like" step to produce their final output.

    Regarding the next 6-12 months, we will be integrating our data with traditional NWP models and utilizing AI for forecasting. We've developed a compact AI model that can directly assimilate our "ground truth" data with reanalysis, specifically for use in AI forecasting models.

    Once we have hundreds of systems deployed, we'll use the collected observations, combined with historical publicly available data, to train a foundational model that will directly predict specific variables based on raw observations.