
ScholarIntern-S1 Open-Sourced, Reimagining Research Productivity

On July 26, the 2025 World Artificial Intelligence Conference (WAIC 2025) kicked off. At the Science Frontiers Plenary Session that afternoon, the Shanghai Artificial Intelligence Laboratory (Shanghai AI Lab) announced the open-source release of Intern-S1, its new scientific multimodal large model. The release marks a significant step for AI-driven scientific discovery, since traditional single-modal analysis often fails to capture the complexity of interdisciplinary research. Intern-S1 integrates the strengths of the ShuSheng model family, balancing language and multimodal capabilities while incorporating rich multidisciplinary expertise. According to the lab, it is the first open-source general-purpose model with specialized scientific capabilities, and it delivers the best performance among current open-source multimodal models.

Intern-S1 also serves as the foundation for the newly launched ShuSheng science discovery platform, Intern-Discovery, which aims to empower researchers, tools, and research subjects alike, helping scientific research shift from isolated team efforts toward a scalable, data-driven approach. A hosted demo is available at https://chat.intern-ai.org.cn, and the model can also be accessed via GitHub, HuggingFace, and ModelScope.

At its core, Intern-S1 introduces a "cross-modal scientific analysis engine" capable of accurately interpreting complex scientific data such as chemical formulas, protein structures, and seismic wave signals. It supports advanced scientific tasks, including predicting chemical synthesis pathways, assessing the feasibility of chemical reactions, and identifying seismic events, turning AI from a conversational assistant into a genuine scientific collaborator. On specialized scientific tasks, the model outperforms leading closed-source models such as Grok-4; on multimodal benchmarks, it surpasses popular open-source models such as InternVL3 and Qwen2.5-VL, making it a standout "all-rounder" among AI models.

Leveraging the model's strong cross-modal perception and integration of biological information, the Shanghai AI Lab collaborated with the Lingang Laboratory, Shanghai Jiao Tong University, Fudan University, MIT, and other institutions to develop "YuanSheng" (OriGene), a multi-agent virtual disease expert system for target discovery and clinical translation evaluation. OriGene has already identified new targets, GPR160 for liver cancer and ARG2 for colorectal cancer, verified through real clinical samples and animal experiments, establishing a complete scientific loop.

These capabilities are supported by a series of systematic technological innovations. Since launching the ShuSheng model family, the Shanghai AI Lab has developed the large language model ShuSheng·Puyu (InternLM), the multimodal model ShuSheng·WanXiang (InternVL), and the strong-reasoning model ShuSheng·SiKe (InternThinker). Through a "general-special fusion" approach, the research team built on these lines to create Intern-S1, setting a new benchmark for next-generation models.

Intern-S1 adopts an innovative scientific multimodal architecture that deeply integrates heterogeneous scientific data, from chemical formulas in materials science and chemistry to protein sequences in biopharmaceuticals, light curves from astronomical surveys, gravitational wave signals from cosmic collisions, and seismic waveforms.
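The article does not describe the architecture's internals, so the following is only a minimal sketch of the general idea of routing heterogeneous scientific signals into one shared token sequence. All module names, dimensions, and the two example modalities are illustrative assumptions, not Intern-S1's actual design:

```python
import torch
import torch.nn as nn

class TimeSeriesEncoder(nn.Module):
    """Toy temporal-signal encoder (hypothetical): a 1-D convolutional
    front end followed by a projection into the LLM embedding space."""
    def __init__(self, d_model: int):
        super().__init__()
        self.conv = nn.Conv1d(1, 64, kernel_size=7, stride=2, padding=3)
        self.proj = nn.Linear(64, d_model)

    def forward(self, signal: torch.Tensor) -> torch.Tensor:
        # signal: (batch, 1, time) -> embeddings: (batch, time // 2, d_model)
        h = torch.relu(self.conv(signal))
        return self.proj(h.transpose(1, 2))

class CrossModalRouter(nn.Module):
    """Route each input to a modality-specific encoder, then concatenate
    all embeddings into a single sequence for a shared LLM backbone."""
    def __init__(self, d_model: int = 512, vocab_size: int = 32000):
        super().__init__()
        self.text_embed = nn.Embedding(vocab_size, d_model)
        self.encoders = nn.ModuleDict({
            "seismic": TimeSeriesEncoder(d_model),      # seismic waveforms
            "light_curve": TimeSeriesEncoder(d_model),  # astronomical surveys
        })

    def forward(self, text_ids: torch.Tensor, signals: dict) -> torch.Tensor:
        parts = [self.text_embed(text_ids)]             # ordinary text tokens
        for name, tensor in signals.items():
            parts.append(self.encoders[name](tensor))   # signal "tokens"
        return torch.cat(parts, dim=1)                  # unified sequence

# Usage: embed a 16-token prompt alongside a 1,024-sample seismic trace.
router = CrossModalRouter()
seq = router(torch.randint(0, 32000, (1, 16)),
             {"seismic": torch.randn(1, 1, 1024)})
print(seq.shape)  # torch.Size([1, 528, 512]) -- 16 text + 512 signal tokens
```

A production system would add per-modality preprocessing and a far larger backbone; the sketch only shows how differently typed scientific data can end up in one token stream that a single model attends over.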
The architecture features a dynamic tokenizer and a temporal signal encoder, allowing efficient processing of complex scientific data. Its compression rate for chemical formulas, for example, is reported to be over 70% higher than DeepSeek-R1's, yielding better performance at lower computational cost.

To handle highly specialized scientific tasks, the research team developed a "general-special fusion" data synthesis approach: vast amounts of general scientific data are combined with highly readable, clearly reasoned scientific data generated by specialized models and validated by domain-specific agents. This closed loop continuously improves the base model, giving it both strong general reasoning and top-tier specialized capabilities, so a single model can tackle many scientific tasks.

On the systems side, the team cut the cost of large-scale reinforcement learning tenfold by combining infrastructure and algorithm improvements. The RL pipeline separates training from inference: a self-developed inference engine performs efficient FP8 asynchronous inference, and a data-parallel balancing strategy addresses long-tail decoding. On the training side, block-based FP8 training significantly improves efficiency. The training system will also be open-sourced.

On the algorithm side, the Mixture of Rewards (MoR) learning approach, built on the Intern·BootCamp large-scale multi-task interaction environment, integrates multiple reward signals and feedback: easily verifiable tasks are trained with RLVR (reinforcement learning with verifiable rewards), while harder-to-score tasks such as dialogue and writing rely on reward models (a simplified sketch of this reward dispatch appears below). The team also incorporated several of the Shanghai AI Lab's research results on large-model reinforcement learning strategies, markedly improving training efficiency and stability.

The Shanghai AI Lab continues to emphasize openness: Intern-S1 and its full toolchain are freely available for commercial use, alongside hosted online services, with the goal of fostering a broader open-source ecosystem and building AI assistants that better understand science.

In practice, Intern-S1 demonstrates strong scientific reasoning. It can, for instance, accurately identify black holes in complex image-based CAPTCHAs, and it can analyze artistic works with a scientific mindset, offering rational, knowledge-driven observations that interpret art through a scientific lens.
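Returning to the MoR approach mentioned above: the article describes it only at a high level. As a rough illustration of the core idea, mixing binary verifiable rewards with learned reward-model scores, here is a hypothetical dispatcher; the task categories and function signatures are assumptions, not Intern·BootCamp's actual API:

```python
from typing import Callable, Optional

def mixture_of_rewards(
    task_type: str,
    response: str,
    reference: Optional[str],
    verifier: Callable[[str, str], bool],   # exact checker for verifiable tasks
    reward_model: Callable[[str], float],   # learned scorer for open-ended tasks
) -> float:
    """Hypothetical MoR-style reward dispatch: verifiable tasks (math, code)
    get a binary RLVR reward; subjective tasks (dialogue, writing) fall back
    to a scalar score from a learned reward model."""
    if task_type in {"math", "code"} and reference is not None:
        # RLVR: reward 1.0 only if the response checks out against the reference.
        return 1.0 if verifier(response, reference) else 0.0
    # Dialogue/writing: no ground truth, so use the reward model's score.
    return reward_model(response)

# Usage with a toy checker and scorer:
score = mixture_of_rewards(
    task_type="math",
    response="42",
    reference="42",
    verifier=lambda r, ref: r.strip() == ref.strip(),
    reward_model=lambda r: 0.5,
)
print(score)  # 1.0
```

In a full RL pipeline these scores would feed the policy update; the dispatch itself is what lets one training loop span both checkable and open-ended tasks.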
