๐Ÿคฏ Mistral AI's Leanstral 1.5: Genius? ๐Ÿš€

July 04, 2026 |

AI

๐ŸŽง Audio Summaries
English flag
French flag
German flag
Japanese flag
Korean flag
Mandarin flag
Spanish flag
๐Ÿ›’ Shop on Amazon

๐Ÿง Quick Intel


  • Mistral AI released Leanstral 1.5, a code agent model for Lean 4, targeting automated theorem proving and proof engineering.
  • Leanstral 1.5 has 119B parameters, with 6.5B activated per token and a 256k token context length, accepting multimodal input (text and image) and producing text output.
  • The model utilizes a mixture-of-experts (MoE) architecture with 128 experts, 4 active per token, enabling low compute while maintaining high capacity.
  • During training, Leanstral 1.5 progresses through mid-training, supervised fine-tuning, and reinforcement learning with CISPO, shaped by a multitrurn environment and a code agent environment.
  • On the PutnamBench benchmark, Leanstral 1.5 achieves 100% saturation on validation and test sets, solving 587 of 672 problems, outperforming Seed-Prover 1.5 by 7 problems.
  • On the FLTEval benchmark, pass@1 rises to 28.9 and pass@8 rises to 43.2, significantly outperforming open-source models three to ten times larger.
  • Leanstral 1.5 flags 47 violated properties and 11 genuine bugs across 57 repositories, including a critical bug in the sign function for zigzag decoding.
  • ๐Ÿ“Summary


    Today, Mistral AI released Leanstral 1.5, a code agent model designed for Lean 4, a proof assistant. This update focuses on automated theorem proving and proof engineering, utilizing a mixture-of-experts architecture with 128 experts. The model, trained across three stages including reinforcement learning, demonstrates significant performance gains. Specifically, Leanstral 1.5 achieves 100% accuracy on validation and test sets, solving 587 of 672 PutnamBench problems and achieving state-of-the-art results on the FATE-H and FATE-X benchmarks. The modelโ€™s capabilities extend to verifying code, identifying bugs, and generating correctness properties, leveraging a free API endpoint and OpenAI-style tool calling.

    ๐Ÿ’กInsights

    โ–ผ


    LEANSTRAL 1.5: A NEW GENERATION OF PROOF ASSISTANTS
    Mistral AI has released Leanstral 1.5, a novel code agent model specifically designed for automated theorem proving and proof engineering using Lean 4. This release represents a significant advancement, targeting complex logical tasks and offering a free API endpoint for immediate experimentation.

    THE CORE ARCHITECTURE: MOE AND SCALE
    Leanstral 1.5 leverages a mixture-of-experts (MoE) architecture to optimize both computational efficiency and overall capacity. The model employs 128 experts, with four active per token, allowing it to handle complex queries while maintaining a substantial parameter count of 119 billion. The context length is a generous 256k tokens, accommodating extensive logical arguments. Input is multimodal, accepting both text and image data, while output is strictly text-based, streamlining the workflow.

    TRAINING STRATEGIES: A MULTI-PHASE APPROACH
    The training process for Leanstral 1.5 is meticulously structured across three distinct phases. Initially, mid-training establishes a foundational understanding of Lean 4 and its capabilities. Subsequently, supervised fine-tuning refines the modelโ€™s performance on specific tasks. Finally, reinforcement learning, guided by two carefully designed environments, shapes the modelโ€™s agentic behavior and problem-solving strategies.

    AGENTIC BEHAVIOR: MULTITURN AND CODE AGENT ENVIRONMENTS
    The reinforcement learning environments are crucial to Leanstralโ€™s functionality. The multiturn environment simulates a theorem proving scenario where the model iteratively refines its proof attempts, learning from Lean compiler feedback. The code agent environment places Leanstral within a raw filesystem, allowing it to directly edit files, execute bash commands, and utilize the Lean language server for real-time assistance, enabling the model to build auxiliary lemmas and persist through context compaction.

    STATE-OF-THE-ART PERFORMANCE: BENCHMARK RESULTS
    Leanstral 1.5 achieves remarkable results on a variety of benchmarks. It saturates miniF2F, reaching 100% on both validation and test sets. Notably, it solves 587 of 672 PutnamBench problems and sets a new state-of-the-art on the FATE-H and FATE-X algebra benchmarks, achieving 87% and 34% respectively. Furthermore, on FLTEval, pass@1 increases to 28.9 and pass@8 rises to 43.2, demonstrating a significant improvement over open-source models.

    PUTNAMBENCH AND FATE: A CLEAR LEAD
    On PutnamBench, Leanstral edges Seed-Prover 1.5 high by 7 problems, operating at an estimated cost of $4 per problem, compared to Seed-Proverโ€™s near $300 per problem. The model also outperforms Goedel-Architect and AxProverBase, with Aleph Prover costing roughly $54 to $68 per problem. The test-time scaling behavior is a key characteristic, where increasing the token budget per attempt elevates PutnamBench Pass@8.

    TEST-TIME SCALING: EXPONENTIAL GROWTH
    The modelโ€™s performance scales exponentially with the token budget per attempt. Specifically, 44 solved at 50k tokens, 244 at 200k, 493 at 1M, and 587 at 4M tokens, demonstrating the power of this scaling strategy. An interactive explorer allows users to visualize this relationship directly.

    APPLICATION CASE STUDIES: PRACTICAL USE CASES
    Leanstral 1.5's training primarily on mathematics extends to code verification. Case studies reveal practical applications for engineers, including flagging 47 violated properties and 11 genuine bugs across 57 repositories, with five previously unreported on GitHub. The model can generate correctness properties for functions automatically and stress-test Rust code by proving or disproving inferred invariants.

    ACCESS AND INTEGRATION: A USER-FRIENDLY EXPERIENCE
    Leanstral 1.5 is accessible through Mistral Vibe, Mistralโ€™s agent CLI. It runs on Mistralโ€™s free plan, enabling users to enable โ€˜Labs modelsโ€™ and create an API key. Installation of vLLM 0.24.0 or newer allows for self-hosting, and the OpenAI-compatible client facilitates interaction. Key integrations include Lean-lsp-mcpserver for tighter Lean integration, and support for OpenAI-style tool calling.

    FURTHER RESOURCES AND COMMUNITY
    For further exploration, users can access the Mistral AI announcement, the Leanstral 1.5 model card, and the Hugging Face page. Additionally, engagement with the Mistral AI community via Twitter and the 150k+ ML SubReddit is encouraged. Telegram support is also available, and collaboration opportunities exist for promoting GitHub Repositories, Hugging Face Pages, Product Releases, or Webinars.