The Spanish Securities Commission (CNMV) has published a research paper assessing how advanced large language models perform when generating stock investment predictions, concluding that using such tools without human oversight carries significant operational risks that could ultimately lead to investor losses.

The study compares outputs from ChatGPT, Gemini, DeepSeek and Perplexity and identifies recurring reasoning failures, including computational errors, incorrect financial interpretations, and reliance on outdated or invented information (hallucinations). Errors were more common when users posed simple, unstructured queries, pointing to the need for clearer analytical instructions and robust supervision.

The paper also argues that more reliable outcomes require a governance framework combining rigorous verification processes with systematic human validation, and that grounding models in official, regulated and standardised sources, such as supervisor data, materially improves result quality and reduces errors compared with relying on general web information.
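The contrast the paper draws between simple, unstructured queries and grounded, structured instructions can be sketched in code. The snippet below is purely illustrative: the function name, data fields, and prompt wording are assumptions for demonstration, not anything specified by the CNMV study. It shows the general idea of composing a query that pins the model to regulator-sourced figures and explicit analytical steps, rather than letting it free-associate from general web knowledge.

```python
def build_grounded_prompt(ticker, official_data, steps):
    """Compose a structured analytical query grounded in official figures.

    Hypothetical sketch: field names and prompt wording are illustrative,
    not taken from the CNMV paper.
    """
    # Present supervisor-sourced data as an explicit, bounded context.
    data_lines = "\n".join(f"- {key}: {value}" for key, value in official_data.items())
    # Number the analytical steps so the model follows a defined procedure.
    step_lines = "\n".join(f"{i}. {step}" for i, step in enumerate(steps, 1))
    return (
        f"Analyse {ticker} using ONLY the official figures below.\n"
        f"Official data (regulator filings):\n{data_lines}\n"
        f"Analytical steps:\n{step_lines}\n"
        "If a required figure is missing, say so rather than estimating."
    )


# The error-prone style the study associates with more failures:
unstructured_query = "Will ACME stock go up?"

# The grounded, structured alternative:
structured_query = build_grounded_prompt(
    "ACME",
    {"revenue_2023": "EUR 1.2bn", "net_debt": "EUR 300m"},
    [
        "Compute the net-debt-to-revenue ratio.",
        "Flag any inputs that are missing or outdated.",
        "State every assumption explicitly.",
    ],
)
print(structured_query)
```

Whatever the exact implementation, the design point matches the paper's argument: the structured query bounds the model to verified data and a defined procedure, which a human validator can then check line by line.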