TimesFM: zero‑shot forecasting
Google’s open‑source TimesFM time‑series foundation model reportedly produces zero‑shot predictions on market data and can outperform LLMs for short‑term price forecasting without additional training. The result points to new foundation models tailored to time series that may simplify baseline forecasting tasks for trading research. (x.com)
Prices look random because most of the signal is tiny and most of the chart is noise. A model that works on electricity demand or store sales can look smart there and still fall apart on a stock that gaps 4% on one headline. (research.google.com)
Most forecasting systems are built one dataset at a time. You collect a series, choose a model, tune it, retrain it, and only then find out whether it was worth the effort. (research.google.com)
A foundation model tries to skip that setup phase by learning general patterns first. It is the forecasting version of a model that has already read a giant library before you ask it one specific question. (arxiv.org)
Google’s TimesFM is one of the clearest attempts at that idea. The original model was trained on 100 billion real-world time points and was built to make forecasts on new series with no extra training. (research.google.com)
That “no extra training” part is what researchers call zero-shot forecasting. You hand the model a history it has never seen before, and it gives you a forecast immediately instead of waiting for a custom training run. (research.google.com)
TimesFM is also smaller than many language models people try to repurpose for numbers. Google said the first public version used 200 million parameters and still came close to state-of-the-art supervised systems on several public forecasting benchmarks. (research.google.com)
Those public benchmarks mostly contain regular series like weather, traffic, and search trends. Financial prices are harder because they are more irregular and have a lower signal-to-noise ratio, so success on benchmark data does not automatically carry over to markets. (tech.preferred.jp)
That is where the new story starts. A 2024 Preferred Networks study tested TimesFM on price prediction and found that directly applying the base model to financial data gave unsatisfactory results because market data behaves differently from cleaner benchmark series. (arxiv.org)
The same study then kept training TimesFM on financial data instead of starting from scratch. The added dataset covered about 100 million time points across stocks, currencies, commodities, indices, and cryptocurrencies at daily and hourly frequencies. (arxiv.org)
After that financial fine-tuning, the authors reported better price-prediction accuracy than the baseline TimesFM model and stronger mock-trading results on returns, Sharpe ratio, max drawdown, and trading cost. That is a useful detail because it suggests the model got better on both forecast error and trading-style evaluation. (arxiv.org)
So the clean version of the claim is narrower than “Google solved market prediction.” The open-source TimesFM base model shows that a time-series-specific foundation model can make zero-shot forecasts, but finance still appears to benefit from extra domain training rather than pure out-of-the-box use. (github.com, arxiv.org)
That still changes the workflow for trading research. Instead of treating every new series like a fresh machine-learning project, a team can start with a strong pretrained baseline and spend its time testing whether domain tuning, covariates, or execution rules add real value. (docs.cloud.google.com, research.google.com)
Google has kept pushing the model since the first release. The public repository lists TimesFM 2.5 as the latest open version, and Google Cloud now exposes TimesFM inside BigQuery through built-in forecasting functions. (github.com, cloud.google.com)
The bigger pattern is that time series may be splitting off from language as its own foundation-model category. If your data is a sequence of prices, temperatures, trips, or sensor readings, a model trained for sequences of numbers may be a better starting point than a language model trained for sequences of words. (arxiv.org, research.google.com)
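For readers who want to see what a trading-style evaluation like the one described above involves, here is a minimal sketch in Python. It is not the Preferred Networks study's code and it does not call TimesFM. The function names, the sign-of-forecast trading rule, the default cost of 0.1% per position change, and the 252-period annualization are all assumptions made for illustration. The `momentum_forecaster` function is a hypothetical stand-in for whatever model produces the one-step forecast.

```python
import math
from statistics import mean, stdev
from typing import Callable, List, Sequence

# Any function that maps a price history to a one-step-ahead price forecast.
Forecaster = Callable[[Sequence[float]], float]


def momentum_forecaster(history: Sequence[float]) -> float:
    """Hypothetical placeholder model: extend the last price change one step."""
    return history[-1] + (history[-1] - history[-2])


def walk_forward_returns(
    prices: Sequence[float],
    forecaster: Forecaster,
    context_len: int = 20,
    cost_per_trade: float = 0.001,  # assumed cost per unit of position change
) -> List[float]:
    """Go long or short on the sign of each forecast; return net per-step returns."""
    returns = []
    position = 0
    for t in range(context_len, len(prices)):
        history = prices[t - context_len:t]  # the model only sees past prices
        predicted = forecaster(history)
        # +1 if the forecast is above the last price, -1 if below, 0 if equal.
        new_position = (predicted > history[-1]) - (predicted < history[-1])
        cost = cost_per_trade * abs(new_position - position)
        realized = prices[t] / prices[t - 1] - 1.0
        returns.append(new_position * realized - cost)
        position = new_position
    return returns


def sharpe_ratio(returns: Sequence[float], periods_per_year: int = 252) -> float:
    """Annualized mean return over volatility, with a zero risk-free rate assumed."""
    if len(returns) < 2:
        return 0.0
    volatility = stdev(returns)
    if volatility == 0:
        return 0.0
    return mean(returns) / volatility * math.sqrt(periods_per_year)


def max_drawdown(returns: Sequence[float]) -> float:
    """Largest peak-to-trough loss of the compounded equity curve, as a fraction."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst


if __name__ == "__main__":
    # Toy price series, invented for the demo; not market data.
    toy_prices = [100.0, 101.0, 100.5, 102.0, 101.0, 103.0, 104.0, 102.5]
    net = walk_forward_returns(toy_prices, momentum_forecaster, context_len=2)
    print("Sharpe:", sharpe_ratio(net), "Max drawdown:", max_drawdown(net))
```

To compare a pretrained model against a fine-tuned one, each would be wrapped in a function with the same `Forecaster` shape and run through the same loop. Forecast error and trading metrics can then be read side by side, net of the assumed cost.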