The Bank for International Settlements Innovation Hub, working with the Deutsche Bundesbank and the European Central Bank, published a report on Project Spectrum, which demonstrates an embedding-based approach to automatically classifying high-frequency product descriptions for inflation analysis using the ECB’s Daily Price Dataset. The project transforms unstructured product text into high-dimensional embeddings and then applies traditional machine learning classifiers to assign products to ECOICOP categories, reaching accuracy somewhat below but comparable to direct large language model prompting while materially reducing compute time and cost. For the dataset of around 34 million unique products, the report estimates that classification via GPT-5 would take over six months and cost more than EUR 0.5 million, whereas embeddings plus machine learning complete a full classification in five days for about EUR 1,500. In an evaluation across categories covering about 50% of the euro area CPI basket, direct LLM prompting achieved 86% weighted accuracy, while embedding-based classifiers achieved 80% with a feedforward neural network and 75% with k-nearest neighbours, at much lower per-record processing latency and cost. The next steps identified are testing the method on different datasets and languages, and a follow-up phase to benchmark constructed CPI indices at the ECOICOP subclass level against official inflation series, which would require historical price data outside the current project scope.
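To put the headline figures on a per-record basis, the short calculation below derives the implied throughput and unit cost for each approach. It is a back-of-the-envelope sketch, not a calculation from the report: the 182-day figure is an assumed stand-in for "over six months", the LLM inputs are the report's lower bounds ("over", "more than"), and the function and variable names are illustrative.

```python
# Implied per-record throughput and cost from the headline figures.
PRODUCTS = 34_000_000  # "around 34 million unique products"

# name: (elapsed days, total cost in EUR)
SCENARIOS = {
    # Assumption: "over six months" approximated as 182 days;
    # "more than EUR 0.5 million" taken at its lower bound.
    "direct LLM prompting": (182, 500_000),
    "embeddings + ML": (5, 1_500),
}


def per_record(days: float, cost_eur: float, n: int = PRODUCTS) -> tuple[float, float]:
    """Return (records per second, EUR per 1,000 records)."""
    records_per_second = n / (days * 24 * 60 * 60)
    cost_per_thousand = cost_eur / n * 1000
    return records_per_second, cost_per_thousand


for name, (days, cost) in SCENARIOS.items():
    rps, cost_k = per_record(days, cost)
    print(f"{name}: {rps:.1f} records/s, EUR {cost_k:.3f} per 1,000 records")
```

Under these assumptions the embedding route works out to roughly 79 records per second at about EUR 0.04 per thousand records, against at most about 2 records per second and at least about EUR 15 per thousand records for direct prompting.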
Bank for International Settlements - Innovation Hub 2026-02-17
Bank for International Settlements Innovation Hub publishes Project Spectrum results on embedding-based AI classification for inflation nowcasting
The Bank for International Settlements Innovation Hub, Deutsche Bundesbank, and European Central Bank released a report on Project Spectrum, showcasing an embedding-based approach to classifying high-frequency product descriptions for inflation analysis. The method classifies around 34 million products in five days for about EUR 1,500, against an estimated six-plus months and more than EUR 0.5 million for direct large language model prompting, with weighted accuracy of 75% to 80% versus 86%. Next steps include testing on other datasets and languages and benchmarking constructed CPI indices against official inflation series.
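The two-stage idea summarised above (embed the product text, then classify the vector with a conventional model) can be illustrated with a minimal, standard-library-only sketch. It is not the project's implementation: `embed` is a toy stand-in that hashes character trigrams where the project would use a real embedding model, the k-nearest-neighbours step uses plain cosine similarity, and the product descriptions and category labels are made-up examples in the style of ECOICOP classes.

```python
import math
import zlib
from collections import Counter

DIM = 256  # toy embedding width; real embedding models are much wider


def embed(text: str) -> list[float]:
    """Toy stand-in for an embedding model: hashed character trigrams, L2-normalised."""
    vec = [0.0] * DIM
    padded = f"  {text.lower()}  "
    for i in range(len(padded) - 2):
        trigram = padded[i:i + 3]
        vec[zlib.crc32(trigram.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; inputs are already unit length, so this is a dot product."""
    return sum(x * y for x, y in zip(a, b))


def knn_classify(text: str, labelled: list[tuple[list[float], str]], k: int = 3) -> str:
    """Assign the majority label among the k most similar labelled embeddings."""
    query = embed(text)
    ranked = sorted(labelled, key=lambda item: cosine(query, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical labelled examples (descriptions and labels are illustrative only).
TRAINING = [
    ("wholemeal sliced bread 500g", "01.1.1 Bread and cereals"),
    ("crusty white baguette", "01.1.1 Bread and cereals"),
    ("rolled porridge oats 1kg", "01.1.1 Bread and cereals"),
    ("semi-skimmed fresh milk 1l", "01.1.4 Milk, cheese and eggs"),
    ("mature cheddar cheese block", "01.1.4 Milk, cheese and eggs"),
    ("free range eggs pack of 12", "01.1.4 Milk, cheese and eggs"),
    ("ground arabica coffee 250g", "01.2.1 Coffee, tea and cocoa"),
    ("english breakfast tea bags x80", "01.2.1 Coffee, tea and cocoa"),
    ("instant hot cocoa powder", "01.2.1 Coffee, tea and cocoa"),
]

# Embed once, then reuse the vectors for every classification request.
LABELLED = [(embed(text), label) for text, label in TRAINING]

print(knn_classify("organic whole milk 2l", LABELLED, k=3))
```

The design point the sketch tries to convey is that the expensive step (embedding) runs once per product, after which classification is a cheap vector comparison, which is consistent with the lower per-record latency and cost the report describes. A feedforward neural network trained on the same vectors could replace the nearest-neighbour vote.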