Market Research Report
Product code
1518881

Cockpit AI Agent (2024)

Cockpit AI Agent Research Report, 2024

Publication date: | Publisher: ResearchInChina | English, 200 pages | Delivery: as fast as 1-2 business days


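Core competence of cockpit Agents

AI Agents in the era of foundation models are built on LLMs, whose strong reasoning broadens the scenarios where Agents can be applied; feedback gathered during operation can in turn improve the foundation model's reasoning. In the cockpit, the Agent capability paradigm can be roughly divided into "Understanding" + "Planning" + "Tool Use" + "Reflection".

When Agents are first deployed in cars, cognitive and planning abilities matter most. How well an Agent understands the task goal and chooses an execution path directly determines the accuracy of the result, which in turn affects how often users rely on Agent scenarios.

For example, in Xiaomi's voice interaction process, semantic understanding is the main difficulty of the whole automotive voice processing pipeline. XiaoAi handles semantic parsing with a semantic parsing model.

After Agents reach mass production, a major highlight will be personalized cockpits that let users customize scenario modes, together with Agentic Workflows that keep learning and optimizing. Reflection becomes the most important core competence at this stage.

For example, Lixiang Tongxue from Li Auto supports creating a scenario with one sentence. It is backed by Mind GPT's built-in memory network and online reinforcement learning capabilities. Mind GPT can remember personalized preferences and habits from past conversations. When a similar scenario recurs, it can set the scenario parameters automatically from historical data to match the user's original intent.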

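At the AI OS architecture level, take SAIC Z-One as an example.

Z-One adds an LLM kernel (LLM OS) at the kernel layer. Together with the original microkernel, it controls the interfaces of the AI OS SDK and ASF respectively; the AI OS SDK receives scheduling from the LLM and supports the Agent service framework at the application layer. The Z-One AI OS architecture tightly integrates AI with the CPU. Through SOA atomic services, AI is connected to the vehicle's sensors, actuators and controllers. This architecture, based on a terminal-cloud foundation model, can supplement the computing power available to the terminal-side foundation model and reduce runtime latency.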

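Application Difficulty of Cockpit AI Agents

Agents connect to users and execute commands. In application, besides the technical difficulty of deploying foundation models on vehicles, they also face scenario difficulties. Across the chain of command reception, semantic analysis, intention reasoning and task execution, the accuracy of the result and the latency of human-computer interaction directly affect the user's in-car experience.

Humanization of interaction

For example, in the "emotional consultant" scenario, Agents need to empathize with car owners and behave in a human-like way. Generally, there are three forms of anthropomorphism for AI Agents: physical anthropomorphism, personality anthropomorphism, and emotional anthropomorphism.

This report surveys and analyzes China's automotive industry and provides information on cockpit AI Agent technologies and applications.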

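Table of Contents

Chapter 1 Introduction to Cockpit AI Agent

  • Status Quo of AI Agent
  • Cockpit AI Agent
  • Application Scenario of Cockpit Agent
  • Application Status of Cockpit Agent

Chapter 2 AI Agent Technology Implementation Path

  • AI OS Architecture
  • Technology related to AI Vision Large Model
  • Application Technology of Cockpit Large Model
  • Application Trends of AI in Cockpit
  • Cockpit Agent Solution

Chapter 3 Application Analysis of Cockpit AI Agent of Suppliers

  • List of Cockpit Agent Functions by Suppliers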
  • ThunderSoft
  • Huawei
  • Alibaba Cloud
  • Tencent
  • Baidu
  • iFLYTEK
  • AISpeech
  • Lenovo
  • SAIC Z-ONE
  • Zhipu AI
  • Microsoft
  • TINNOVE
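  • 4 Main Application Scenarios of Desay SV Cockpit Large Model
  • Rockchip Uses AI Sound Field Technology in Cockpit
  • NNG Applies AI to Navigation Technology

Chapter 4 Application Analysis of Cockpit AI Agent of OEMs

  • Application Status of Cockpit Agent in OEMs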
  • NIO
  • Xpeng
  • Li Auto
  • Xiaomi
  • BAIC
  • Hozon
  • Dongfeng
  • JAC
  • Chang'an
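  • Volkswagen: Evolving to Agents through GPTs
  • Mercedes-Benz: Personalized Service with MBUX Virtual Assistant
  • Cockpit Application of GAC AI Large Model
  • Cockpit Application of Great Wall Large Model
  • Chery self-developed LION AI + iFLYTEK Spark Model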
  • Geely
  • IM AI Large Model Builds Active Perception Scenario

Cockpit AI Agent: Autonomous scenario creation becomes the first step to personalize cockpits

In its AI Foundation Models' Impacts on Vehicle Intelligent Design and Development Research Report, 2024, ResearchInChina noted that an AI Agent uses a large language model (LLM) as its core computing engine (LLM OS). In the AI service framework, the LLM acts as the AI core and the Agent acts as the AI app. With the reasoning and generation capabilities of AI foundation models, the Agent can create more cockpit scenarios while further improving existing cockpit technologies such as multimodal interaction and voice processing.

Compared with a traditional OS, an AI OS is programmed in natural language, which makes it easier for people to operate and to customize for personalized human-computer interaction. The Agent responds to user prompts or instructions in natural language and can perform many tasks through its interaction with the physical or digital environment.

Foundation models spawn Agent customized scenarios

Since AI foundation models were installed in cars, they have been widely used in multimodal interaction scenarios. OEMs have started from voice assistants to further improve intelligent processing of user commands:

Before foundation models were deployed in cars, the wake-up rate and the share of effective instructions of cockpit voice assistants were low;

After foundation models became available in cars, voice recognition became more intelligent, and expression monitoring and gesture control also improved, with effective commands and daily active rates increasing significantly.

AI foundation models further enhance the user experience of smart cockpits through strong context processing and reasoning capabilities. Please refer to the following dialogue:

(Mobility scenario)

Tongxue! (the name of a voice assistant)

I am here!

There seems to be a prairie ahead. Please take a picture for me!

Wait a moment, and I will capture the beautiful scenery of the prairie for you. I'll let you know when the photo is taken. Is there anything else I can help you with?

Send it to Moments immediately. I want my friends to have a look at this beautiful scenery.

Ok, according to your authorized account, please choose the material and filter style.

Just this one, with a caption.

Ok, three captions have been generated for you. Which one do you like?

Just this one.

(You enter the office and receive a phone call)

There is a phone call from a customer, and I have transferred it to the IVI for you. Would you like to answer it?

Yes, and turn on the office mode.

Ok, I will generate an avatar of the customer for you and start the conversation. Do you have any questions or things to discuss? I will record the conversation and provide you with the minutes of the meeting after it is over.

(The avatar is generated)

Now you can start the conversation.

The above scenarios did not materialize in the cockpit until 2024, when some OEMs installed foundation models on vehicles.

For example, IM L6 has built Carlog and City Drive scenarios to enable the AI foundation models to proactively recommend food and attractions and allow users to post them on social media:

Carlog: Actively perceive the scenario during driving through AI vision foundation model, mobilize four cameras to take photos, automatically save and edit them, and support one-click share in Moments.

City Drive: Cooperate with Volcengine to model nearby food, scenic spots and landmarks in real time on the digital screen, and push them according to users' habits and preferences.

The applicability of foundation models in various scenarios has stimulated users' demand for intelligent agents that can uniformly manage cockpit functions. In 2024, OEMs such as NIO, Li Auto, and Hozon successively launched Agent frameworks, using voice assistants as the starting point to manage functions and applications in cockpits.

Agent service frameworks can not only manage cockpit functions in a unified way, but also provide more abundant scenario modes according to customers' needs and preferences, especially supporting customized scenarios for users, which accelerates the advent of the cockpit personalization era.

For example, NIO's NOMI GPT allows users to set an AI scenario with just one sentence:

Core competence of cockpit Agents

AI Agents in the era of foundation models are built on LLMs, whose strong reasoning broadens the scenarios where Agents can be applied; feedback gathered during operation can in turn improve the foundation model's reasoning. In the cockpit, the Agent capability paradigm can be roughly divided into "Understanding" + "Planning" + "Tool Use" + "Reflection".
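
The four-part paradigm can be pictured as a simple loop. The sketch below is only an illustration under stated assumptions: keyword matching stands in for LLM-based understanding, and the tool names (set_ac, play_music), the fixed plans and the CockpitAgent class are hypothetical, not any vendor's API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical cockpit tools; names and behaviour are illustrative only.
def set_ac(temp_c: int) -> str:
    return f"AC set to {temp_c} C"

def play_music(genre: str) -> str:
    return f"Playing {genre} music"

TOOLS: Dict[str, Callable[..., str]] = {"set_ac": set_ac, "play_music": play_music}

@dataclass
class CockpitAgent:
    tools: Dict[str, Callable[..., str]]
    memory: List[dict] = field(default_factory=list)

    def understand(self, utterance: str) -> str:
        # Understanding: keyword matching stands in for an LLM here.
        text = utterance.lower()
        if "hot" in text or "cold" in text:
            return "adjust_climate"
        if "music" in text or "song" in text:
            return "entertainment"
        return "unknown"

    def plan(self, intent: str) -> List[Tuple[str, dict]]:
        # Planning: map an intent to an ordered list of tool calls.
        plans = {
            "adjust_climate": [("set_ac", {"temp_c": 22})],
            "entertainment": [("play_music", {"genre": "jazz"})],
        }
        return plans.get(intent, [])

    def use_tools(self, steps: List[Tuple[str, dict]]) -> List[str]:
        # Tool use: execute each planned call against the registered tools.
        return [self.tools[name](**args) for name, args in steps]

    def reflect(self, utterance: str, intent: str, results: List[str]) -> dict:
        # Reflection: record the outcome so later turns can learn from it.
        record = {"utterance": utterance, "intent": intent, "ok": bool(results)}
        self.memory.append(record)
        return record

    def handle(self, utterance: str) -> List[str]:
        intent = self.understand(utterance)
        steps = self.plan(intent)
        results = self.use_tools(steps)
        self.reflect(utterance, intent, results)
        return results

agent = CockpitAgent(TOOLS)
print(agent.handle("It is too hot in here"))
```

Here the reflection step only logs whether a request could be served; a production Agent would feed such records back into planning or model tuning.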

When Agents are first deployed in cars, cognitive and planning abilities matter most. How well an Agent understands the task goal and chooses an implementation path directly determines the accuracy of the result, which in turn affects the scenario utilization rate of Agents.

For example, in Xiaomi's voice interaction process, semantic understanding is the main difficulty of the whole automotive voice processing pipeline. XiaoAi handles semantic parsing through a semantic parsing model.

After Agents reach mass production, personalized cockpits that let users customize scenario modes become the highlight. Reflection becomes the most important core competence at this stage, so it is necessary to build an Agentic Workflow that keeps learning and optimizing.

For example, Lixiang Tongxue offered by Li Auto supports the creation of one-sentence scenarios. It is backed by Mind GPT's built-in memory network and online reinforcement learning capabilities. Mind GPT can remember personalized preferences and habits based on historical conversations. When similar scenarios recur, it can automatically set scenario parameters through historical data to fit the user's original intentions.
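
The idea of reusing historical data to pre-fill scenario parameters can be sketched as follows. This is a deliberately simple frequency-based stand-in, not Mind GPT's memory network or online reinforcement learning; the ScenarioMemory class, the scenario name "nap" and the parameter names are all hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List

class ScenarioMemory:
    """Remembers the parameters a user chose for each named scenario."""

    def __init__(self) -> None:
        self._history: Dict[str, List[dict]] = defaultdict(list)

    def record(self, scenario: str, params: dict) -> None:
        # Store a copy so later changes by the caller do not alter history.
        self._history[scenario].append(dict(params))

    def suggest(self, scenario: str, defaults: dict) -> dict:
        # For each parameter, prefer the value used most often in the past;
        # fall back to the default when there is no history for it.
        suggestion = dict(defaults)
        past = self._history.get(scenario, [])
        for key in defaults:
            values = [p[key] for p in past if key in p]
            if values:
                suggestion[key] = Counter(values).most_common(1)[0][0]
        return suggestion

memory = ScenarioMemory()
memory.record("nap", {"seat_recline": 40, "ac_temp": 24})
memory.record("nap", {"seat_recline": 40, "ac_temp": 25})
memory.record("nap", {"seat_recline": 35, "ac_temp": 25})
print(memory.suggest("nap", {"seat_recline": 20, "ac_temp": 22, "ambient_light": "off"}))
```

Parameters the user has never set (ambient_light above) keep their defaults, which mirrors the report's point that history only fills in what earlier conversations revealed.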

At the AI OS architecture setting level, we take SAIC Z-One as an example:

Z-One adds an LLM kernel (LLM OS) at the kernel layer. Together with the original microkernel, it controls the interfaces of the AI OS SDK and ASF respectively; the AI OS SDK receives scheduling from the LLM and supports the Agent service framework at the application layer. The Z-One AI OS architecture tightly integrates AI with the CPU. Through SOA atomic services, AI is then connected to the vehicle's sensors, actuators and controllers. This architecture, based on a terminal-cloud foundation model, can supplement the computing power available to the terminal-side foundation model and reduce operational latency.

Application Difficulty of Cockpit AI Agents

Agents connect to users and execute commands. In application, besides the technical difficulty of deploying foundation models on vehicles, they also face scenario difficulties. Across the chain of command reception, semantic analysis, intention reasoning and task execution, the accuracy of the result and the latency of human-computer interaction directly affect the user's in-car experience.
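
Because both accuracy and latency accumulate across the four stages, it can help to time each stage separately. The sketch below assumes trivial placeholder stages (receive, parse, infer_intent and execute are hypothetical functions) and only shows how per-stage latency might be measured.

```python
import time
from typing import Any, Callable, List, Tuple

# Placeholder stages; real systems would call ASR, NLU and vehicle services.
def receive(raw: str) -> str:
    return raw.strip()

def parse(text: str) -> dict:
    return {"tokens": text.lower().split()}

def infer_intent(parsed: dict) -> str:
    return "open_window" if "window" in parsed["tokens"] else "unknown"

def execute(intent: str) -> dict:
    return {"intent": intent, "done": intent != "unknown"}

STAGES: List[Tuple[str, Callable[[Any], Any]]] = [
    ("command reception", receive),
    ("semantic analysis", parse),
    ("intention reasoning", infer_intent),
    ("task execution", execute),
]

def run_pipeline(command: str, stages=STAGES):
    """Run the stages in order and return (result, per-stage latency in ms)."""
    timings = []
    value: Any = command
    for name, stage in stages:
        start = time.perf_counter()
        value = stage(value)
        timings.append((name, (time.perf_counter() - start) * 1000.0))
    return value, timings

result, timings = run_pipeline("  Open the window  ")
for name, ms in timings:
    print(f"{name}: {ms:.3f} ms")
print(result)
```

With real stages, the same breakdown shows whether delay comes from on-device inference, a cloud round trip or the vehicle service call.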

Humanization of interaction

For example, in the "emotional consultant" scenario, Agents should empathize with car owners and behave in a human-like way. Generally, there are three forms of anthropomorphism for AI Agents: physical anthropomorphism, personality anthropomorphism, and emotional anthropomorphism.

Foundation model performance

In the "encyclopedia question and answer" scenario, Agents may fail to answer the user's questions accurately, especially open questions, because LLM hallucination can arise after semantic analysis, database search, answer generation and similar steps.

Current solutions include advanced prompting, RAG + knowledge graph, ReAct, CoT/ToT, etc., none of which can completely eliminate LLM hallucination. In the cockpit, external databases, RAG, self-consistency and other methods are more often used to reduce the frequency of LLM hallucination.
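
As one illustration of these methods, self-consistency can be approximated by sampling several answers and keeping the majority. The sketch below is a simplified vote over final answers, assuming a hypothetical sample_fn that returns one sampled model answer per call; it declines to answer when agreement is low.

```python
from collections import Counter
from typing import Callable, Optional

def self_consistent_answer(
    sample_fn: Callable[[str], str],
    question: str,
    n_samples: int = 5,
    min_agreement: float = 0.6,
) -> Optional[str]:
    """Sample several answers and return the majority one, or None.

    sample_fn is an assumed callable wrapping a model with sampling enabled.
    """
    answers = [sample_fn(question) for _ in range(n_samples)]
    # Normalize so trivial formatting differences do not split the vote.
    normalized = [a.strip().lower() for a in answers]
    best, count = Counter(normalized).most_common(1)[0]
    if count / n_samples >= min_agreement:
        return best
    return None  # Low agreement: better to say "not sure" than to guess.

def make_sampler(outputs):
    # Deterministic stand-in for a sampling LLM, for demonstration only.
    it = iter(outputs)
    return lambda question: next(it)

sampler = make_sampler(["Paris", "paris ", "Lyon", "Paris", "Paris"])
print(self_consistent_answer(sampler, "What is the capital of France?"))
```

Returning None on disagreement lets the voice assistant fall back to a retrieval step or an explicit "I am not sure" reply instead of a confident wrong answer.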

Some foundation model developers have improved on the above solutions. For example, Meta has proposed reducing LLM hallucination through Chain-of-Verification (CoVe). This method breaks fact-checking down into more detailed sub-questions to improve response accuracy, in line with how humans carry out fact-checking. It can effectively improve the FActScore metric in long-form generation tasks.
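
The CoVe flow (draft, plan verification questions, answer them independently, revise) can be outlined as below. This is a minimal sketch assuming a generic llm callable that maps a prompt to text; the prompt wording and the fake_llm stub are hypothetical and are not taken from Meta's implementation.

```python
from typing import Callable, List, Tuple

def chain_of_verification(llm: Callable[[str], str], question: str) -> str:
    # Step 1: draft a baseline answer.
    draft = llm(f"Answer the question.\nQ: {question}")

    # Step 2: plan verification sub-questions that fact-check the draft.
    plan = llm(
        "List verification questions, one per line, for this answer.\n"
        f"Q: {question}\nA: {draft}"
    )
    checks: List[str] = [line.strip() for line in plan.splitlines() if line.strip()]

    # Step 3: answer each sub-question on its own, without showing the draft,
    # so that errors in the draft are not simply repeated.
    evidence: List[Tuple[str, str]] = [
        (q, llm(f"Answer concisely.\nQ: {q}")) for q in checks
    ]

    # Step 4: produce the final answer, revised against the verified facts.
    notes = "\n".join(f"{q} -> {a}" for q, a in evidence)
    return llm(
        "Revise the draft using the verified facts.\n"
        f"Q: {question}\nDraft: {draft}\nVerified:\n{notes}"
    )

def fake_llm(prompt: str) -> str:
    # Canned stand-in for a real model so the flow can be run offline.
    if prompt.startswith("Answer the question."):
        return "draft answer"
    if prompt.startswith("List verification"):
        return "check one?\n\ncheck two?"
    if prompt.startswith("Answer concisely."):
        return "fact"
    return "final answer"

print(chain_of_verification(fake_llm, "Who designed the car?"))
```

The cost is several extra model calls per question, which matters in a cockpit where interaction latency is already a concern.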

Table of Contents

1 Introduction to Cockpit AI Agent

  • 1.1 Status Quo of AI Agent
    • 1.1.1 What Is AI Agent?
    • 1.1.2 Status Quo of AI Agent
    • 1.1.3 Four Capabilities of AI Agent
    • 1.1.4 Three Collaborative Models of AI Agent
    • 1.1.5 Application Scenarios of AI Agent
    • 1.1.6 Agentic Workflow
  • 1.2 Cockpit AI Agent
    • 1.2.1 Cockpit AI Agent Classification
    • 1.2.2 Cockpit AI Agent Evolution: Cognitive Driven
    • 1.2.3 Process of AI Agent Landing in Cockpit: from Large Model to AIOS
    • 1.2.3 Process of AI Agent Landing in Cockpit: Two Ways
    • 1.2.4 Cockpit AI Agent Interaction Mechanism
    • 1.2.5 Four Capability Paradigms of Cockpit AI Agent
    • 1.2.6 Evolution of AI Agent: Active Interaction
    • 1.2.6 Evolution of AI Agent: Reflection Optimization
  • 1.3 Application Scenario of Cockpit Agent
    • 1.3.1 Application Scenario Classification: by Interaction Type
    • 1.3.2 Application Scenario Classification: by Large Model Type
    • 1.3.3 Application Scenario Classification: by Function Type
  • 1.4 Application Status of Cockpit Agent
    • 1.4.1 Application Status (1): Multi-modal Interaction Spawns Agent landing
    • 1.4.2 Application Status (2): Scenario Creation Becomes an Important Approach to Agent Evolution
    • 1.4.3 Application Status (3):
    • 1.4.4 Application Status (4):
    • 1.4.5 Application Status (5): Transition Program

2 AI Agent Technology Implementation Path

  • 2.1 AI OS Architecture
    • 2.1.1 AI OS Architecture Design
    • 2.1.2 8 Features of AI OS Components
    • 2.1.3 AI OS Optimization Technology (1): MemGPT Optimizes Context Expansion Process
    • 2.1.3 AI OS Optimization Technology (2):
    • 2.1.4 AI OS Core: Key Capabilities of LLM
    • 2.1.5 Core Features of AI OS components
    • 2.1.6 AI OS Design Case: AI Service Structure of SAIC Z-ONE
    • 2.1.6 AI OS Design Case: Meizu Flyme AI OS can be Transferred to Car
  • 2.2 Technology related to AI Vision Large Model
    • 2.2.1 Advantages and Disadvantages of commonly used AI Technology in Automobiles
    • 2.2.2 Construction of AI Model for In-Cabin Monitoring
    • 2.2.3 AI Application Cases of In-Cabin Monitoring
  • 2.3 Application Technology of Cockpit Large Model
    • 2.3.1 Overview
    • 2.3.2 Baidu: 5 Steps of Emotional Cockpit Adjustment
    • 2.3.3 Geely: End-side Large Model Deployment Technology
    • 2.3.4 Leopard: Knowledge Graph Optimization for Q&A Scenarios
    • 2.3.5 University of Mining and Technology: Visual Large Model + Adaptive Adjustment
  • 2.4 Application Trends of AI in Cockpit
    • 2.4.1 Trend 1:
    • 2.4.2 Trend 2:
    • 2.4.3 Trend 3:
    • 2.4.4 Trend 4:
    • 2.4.5 Trend 5:
  • 2.5 Cockpit Agent Solution
    • 2.5.1 Application Pain Points of Agent in Automotive
    • 2.5.2 Solution (1): RAG Enhances Voice Assistant's Intelligent Q&A Capabilities
    • 2.5.3 Solution (2):
    • 2.5.4 Solution (3): Zero Trust Architecture & Confidential Computing Protect Cloud Data Security
    • 2.5.5 Solution (4):
    • 2.5.6 Solution (5): Working Memory and Brain Science Become one of the Paths to Promote Evolution of AI Agents
    • 2.5.7 Solution (6):
    • 2.5.8 Solution (7): Anthropomorphizing Emotional Cockpit
    • 2.5.8 Solution (7): NIO NomiGPT Emotional Personification Case
    • 2.5.8 Solution (7):
    • 2.5.8 Solution (7): Digital Human Enhances Agent Emotional Applicability

3 Application Analysis of Cockpit AI Agent of Suppliers

  • List of Cockpit Agent Functions by Suppliers
  • 3.1 ThunderSoft
    • 3.1.1 Large Model Layout
    • 3.1.2 AquaDrive OS and Rubik's Cube Model Integration
    • 3.1.3 AI Framework Design in AquaDrive OS
  • 3.2 Huawei
    • 3.2.1 AI Application Planning
    • 3.2.2 Function Construction of HarmonySpace Smart Cockpit
    • 3.2.3 AI Features of Harmony OS
    • 3.2.4 Two Implementations of Huawei Harmony OS "Visible to Say"
  • 3.3 Alibaba Cloud
    • 3.3.1 Ali Edge AI Model and Cloud Computing Combination
    • 3.3.2 Functional Application of Qianwen Edge AI Model on IVI
    • 3.3.3 Qianwen Edge AI Model is Installed in FAW IVI
  • 3.4 Tencent
    • 3.4.1 Function of Hunyuan Large Model in Cockpit
  • 3.5 Baidu
    • 3.5.1 Baidu Smart Cockpit 2.0 with ERNIE Bot
    • 3.5.2 Baidu Edge AI Model Mounted on Jiyue
  • 3.6 iFLYTEK
    • 3.6.1 Function List of iFLYTEK Spark Model
    • 3.6.2 Development History of iFLYTEK Spark Model
    • 3.6.3 How iFLYTEK Spark Cockpit Integrates into AI Services
    • 3.6.4 iFLYTEK Spark Cockpit Edge Deployment Mode
    • 3.6.5 Application of Spark Model in Mobile phone-Vehicle Interconnection
    • 3.6.6 Application Technology of iFLYTEK Spark Model
  • 3.7 AISpeech
    • 3.7.1 Development History of AI Speech Technology
    • 3.7.2 DFM Large Model Iterated to 3.0
    • 3.7.3 DFM Large Model "1 + N" Layout
    • 3.7.4 Fusion Large Model Solution
  • 3.8 Lenovo
    • 3.8.1 Six Characteristics of AI Agent Architecture
    • 3.8.2 Agent "Three Characteristics" Accelerate Cockpit Deployment
    • 3.8.3 AI Vehicle Computing Framework Applies to Both Smart Driving and Cockpit
    • 3.8.4 Vientiane Cockpit AI Platform Supports Three Types of Functions
    • 3.8.5 Core Competencies of Edge Applications
  • 3.9 SAIC Z-ONE
    • 3.9.1 AI Service Structure is Built According to 4 Levels
    • 3.9.2 AI Changes to Hardware Layer
    • 3.9.3 AI Changes to Software Layer
    • 3.9.4 AI Changes to Cloud/Vehicle Deployment
  • 3.10 Zhipu AI
    • 3.10.1 Cockpit Design Architecture Based on AI Large Model
    • 3.10.2 Scenario Design of AI Large Model
    • 3.10.3 Design of AI Large Model for Cockpit Interaction Pain Points
  • 3.11 Microsoft
    • 3.11.1 Cockpit Voice Solution
    • 3.11.2 Improves Cockpit Performance by Integrating Private Enterprise Knowledge
  • 3.12 TINNOVE
    • 3.12.1 Three Levels of AI Model Empower Cockpit
    • 3.12.2 Four Stages of Smart Cockpit Planning
    • 3.12.3 AI Cockpit Architecture Design
    • 3.12.4 AI Large Model Service Form
    • 3.12.5 AI Large Model Application Scenario
    • 3.12.6 TTI OS and Digital Human Combination
  • 3.13 4 Main Application Scenarios of Desay SV Cockpit Large Model
  • 3.14 Rockchip Uses AI Sound Field Technology in Cockpit
  • 3.15 NNG Applies AI to Navigation Technology

4 Application Analysis of Cockpit AI Agent of OEMs

  • 4.1 Application Status of Cockpit Agent in OEMs
    • 4.1.1 List of Cockpit Agent Functions of each OEM
    • 4.1.2 List of Cockpit Agent Scenarios of each OEM
    • 4.1.3 List of Large Models that have been filed in Automotive Industry
  • 4.2 NIO
    • 4.2.1 NIO NOMI GPT Supports Edge Deployment
    • 4.2.2 NIO NOMI GPT Adopts Modal Internal and External Multi-dimensional Comparative Learning Technology
    • 4.2.3 Six Scenarios of NIO NOMI GPT
  • 4.3 Xpeng
    • 4.3.1 Three Application Scenarios of Xpeng AI Tianji system
  • 4.4 Li Auto
    • 4.4.1 Lixiang Tongxue: Building Multiple Scenes
    • 4.4.2 Mind GPT: Building AI Agent as Core of Large Model
    • 4.4.3 Mind GPT: Multimodal Perception
    • 4.4.4 Large Model Training Platform Adopts 4D Parallel Mode
    • 4.4.5 Cooperate with NVIDIA to Land Inference Engine
    • 4.4.6 Mind GPT: L9 Ultra Passes AI Large model A-level Certification
  • 4.5 Xiaomi
    • 4.5.1 Xiao Ai Covers Scenarios through Voice Commands
    • 4.5.2 Voice Task Analysis and Execution Process
    • 4.5.3 Xiao Ai Accurate Matching by RAG
    • 4.5.4 Deployment Location of Xiaomi AI Service Framework in Operating System
    • 4.5.5 Two types of Large Models as Core of Xiaomi AI
    • 4.5.6 Essence of Xiaomi AI Smart Center
  • 4.6 BAIC
    • 4.6.1 Three Stages of BAIC Large Model Development
    • 4.6.2 Specific Scenario of BAIC Large Model (1): Customized Scenario Function
    • 4.6.2 Specific Scenario of BAIC Large Model (2): Emotional Mode + Digital Human
    • 4.6.3 BAIC Agent Platform Architecture: Baimo Huichuang
    • 4.6.4 BAIC's Planning Ideas for Large Model Products
  • 4.7 Hozon
    • 4.7.1 Application Status of AI Cockpit Function
    • 4.7.2 Cockpit Design Concept for New Human-Machine Interaction Mode
    • 4.7.3 Cockpit Application of Neta AI model
  • 4.8 Dongfeng
    • 4.8.1 Cockpit Architecture Based on AI Large Model
    • 4.8.2 Application Types of AI Large Model
    • 4.8.3 Main Scenario and Design Path of AI Large Model in Cockpit
    • 4.8.4 Application of Dongfeng Vision AI Large Model in Cockpit
    • 4.8.5 Workflow of Dongfeng AI Large Model in/out of Cockpit: "Four-step" Paradigm
    • 4.8.6 Next-step Planning of Dongfeng AI Cockpit
  • 4.9 JAC
    • 4.9.1 Four Applications of JAC AI Cockpit
    • 4.9.2 JAC AI Large Model Source and Boarding Case
  • 4.10 Chang'an
    • 4.10.1 Changan will Integrate AI into SOA Architecture Layer
    • 4.10.2 Planning of "Digital & Intelligent" Cockpit
    • 4.10.3 AI Achievements and Strategic Priorities
    • 4.10.4 Realize Automatic Switching of Cockpit Scenarios and Functions
  • 4.11 Volkswagen: Evolving to Agents through GPTs
  • 4.12 Mercedes-Benz: Personalized Service with MBUX Virtual Assistant
  • 4.13 Cockpit Application of GAC AI Large Model
  • 4.14 Cockpit Application of Great Wall Large Model
  • 4.15 Chery self-developed LION AI + iFLYTEK Spark Model
  • 4.16 Geely
    • 4.16.1 Two Forms of Geely Large Model Cockpit Application
    • 4.16.2 Geely Xingrui Large Model Application Case
  • 4.17 IM AI Large Model Builds Active Perception Scenario