
1. Introduction

MPS (Multi-Process Service) is NVIDIA's alternative, binary-compatible implementation of the CUDA application programming interface. It takes advantage of the Hyper-Q capability of NVIDIA GPUs to let CUDA kernels from multiple processes execute concurrently on the same GPU: https://docs.nvidia.com/deploy/mps/index.html

MIG (Multi-Instance GPU) is a GPU partitioning technology that NVIDIA introduced for specific cards of the Ampere architecture and later. It splits a physical card into multiple sub-instances according to predefined profiles: https://docs.nvidia.com/datacenter/tesla/mig-user-guide/index.html

Comparison of MPS and MIG:

| Feature | MPS (Multi-Process Service) | MIG (Multi-Instance GPU) |
| --- | --- | --- |
| Isolation level | Software-level isolation | Hardware-level isolation |
| Resource allocation | Dynamic; compute units are shared | Static; compute units are dedicated |
| Fault propagation | Faults can propagate between clients | Fully isolated, no fault propagation |
| Flexibility | High; resource allocation can be adjusted dynamically | Low; partition profiles must be defined in advance |
| Deployment complexity | Low; software configuration only | High; all workloads must be evicted before reconfiguration |
| Supported GPUs | GPUs with compute capability >= 3.5 | Specific GPUs of the Ampere architecture and later |
| Typical scenarios | Multi-replica inference services, small-model training | Multi-tenant environments, workloads requiring strong isolation |


2. MPS-Based GPU Partitioning

2.1 Dependencies and Limitations

2.1.1 Environment Requirements

  • Resource constraints apply (see the compute-capability requirement in section 2.3).
  • Linux only.
  • Per-process exclusive-mode GPU settings are not honored; the MPS control daemon queues requests from different users and serializes their access.

2.1.2 Deployment Dependencies

  • Depends on the GPU Operator component.
  • Kubernetes version 1.22-1.29 (a GPU Operator requirement).
  • The GPU must be fully released: all Pods using the GPU have to be evicted (evicting all Pods on the node is recommended).

2.2 Pros and Cons

Advantages

  1. Running multiple services on a single card effectively improves that card's GPU utilization and throughput.
  2. Partitioning is lightweight; it only requires software-level configuration.
  3. Compared with simply running multiple services on one card, running multiple containerized services on an MPS-enabled card makes it possible to limit compute and memory per container, so QoS between services is guaranteed, while also avoiding the GPU context-switching overhead of a plain single-card multi-service setup.

Disadvantages

  1. Because multiple processes share one CUDA context, faults can propagate: when one client connection hits an error, it can spread to the contexts of the other clients.
  2. In certain situations, MPS client processes may interfere with one another; NVIDIA recommends using MPS mainly for a single application's main process and its cooperating processes.
  3. Although each MPS client process gets a fully isolated GPU address space, those address spaces are allocated as partitions of the same GPU virtual address space of the server process, so an out-of-bounds memory access may not trigger a fault and may instead read or write another process's memory.

Applicable Scenarios

Given the fault-propagation risk, MPS is typically used to deploy multi-replica inference services and small models.

2.3 How to Tell Whether a GPU Supports MPS

MPS requires a compute capability (nvidia.com/gpu.compute.major) >= 3.5 (see https://developer.nvidia.com/cuda-gpus ), which almost all mainstream cards meet. Check the node labels for details:

# omitted...
nvidia.com/gpu: nvidia-ada-4090
nvidia.com/gpu-driver-upgrade-state: pod-restart-required
nvidia.com/gpu.compute.major: "8" # compute capability major version
nvidia.com/gpu.compute.minor: "9"
nvidia.com/gpu.count: "8"
nvidia.com/gpu.family: ampere
nvidia.com/gpu.machine: R8428-G13
nvidia.com/gpu.memory: "24564"
nvidia.com/gpu.present: "true"
nvidia.com/gpu.product: NVIDIA-GeForce-RTX-4090
nvidia.com/gpu.replicas: "2"
nvidia.com/gpu.sharing-strategy: mps
nvidia.com/mig.capable: "false"
nvidia.com/mig.config: all-disabled
nvidia.com/mps.capable: "true"
# omitted...
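
A quick way to read just the compute-capability labels is a jsonpath query (a minimal sketch; <node-name> is a placeholder for the actual node name):

kubectl get node <node-name> -o jsonpath='{.metadata.labels.nvidia\.com/gpu\.compute\.major}{"."}{.metadata.labels.nvidia\.com/gpu\.compute\.minor}{"\n"}'

For the node above this prints 8.9, comfortably above the 3.5 requirement.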

2.4 Enabling the Feature

After the GPU Operator is installed, the MPS feature is disabled on every node by default, and machines with GPUs carry the label nvidia.com/mps.capable=false. MPS is only turned on after an MPS sharing policy is enabled for the node through the configuration file. The steps are as follows:

2.4.1 Check the GPU Node's Labels

The MPS feature is disabled by default.

# omitted...
nvidia.com/gpu.count: "8"
nvidia.com/gpu.family: ampere
nvidia.com/gpu.machine: R8428-G13
nvidia.com/gpu.memory: "24564"
nvidia.com/gpu.present: "true"
nvidia.com/gpu.product: NVIDIA-GeForce-RTX-4090
nvidia.com/gpu.replicas: "2"
nvidia.com/gpu.sharing-strategy: mps
nvidia.com/mig.capable: "false"
nvidia.com/mig.config: all-disabled
nvidia.com/mps.capable: "false" # MPS feature not enabled
# omitted...

2.4.2 Determine the Configuration Entry Used for MPS

The node label nvidia.com/device-plugin.config: default tells us that the configuration entry in use is named default:

# omitted...
nvidia.com/device-plugin.config: default # name of the device plugin configuration entry
nvidia.com/gfd.timestamp: "1748575819"
nvidia.com/gpu: nvidia-ada-4090
nvidia.com/gpu.compute.major: "8"
nvidia.com/gpu.compute.minor: "9"
nvidia.com/gpu.count: "8"
# omitted...

By default, the device plugin's configuration is managed through the device-plugin-config ConfigMap:

$ kubectl get cm -n gpu-operator
NAME DATA AGE
custom-mig-parted-config 1 346d
default-gpu-clients 1 347d
default-mig-parted-config 1 347d
device-plugin-config 52 346d
gpu-clients 1 259d
kube-root-ca.crt 1 347d
nvidia-container-toolkit-entrypoint 1 347d
nvidia-dcgm-exporter 1 247d
nvidia-device-plugin-entrypoint 1 347d
nvidia-mig-manager-entrypoint 1 347d
stable-node-feature-discovery-master-conf 1 347d
stable-node-feature-discovery-topology-updater-conf 1 347d
stable-node-feature-discovery-worker-conf 1 347d

This ConfigMap is mounted into nvidia-device-plugin-daemonset as a configuration file:

$ kubectl get pod -n gpu-operator
NAME READY STATUS RESTARTS AGE
gpu-feature-discovery-jztwz 2/2 Running 0 22h
gpu-operator-699bc5544b-885lt 1/1 Running 15 (59d ago) 129d
node-agent-96wx8 1/1 Running 0 23h
nvidia-container-toolkit-daemonset-d7j66 1/1 Running 0 23h
nvidia-cuda-validator-zhp78 0/1 Completed 0 23h
nvidia-dcgm-exporter-bgpbh 1/1 Running 0 23h
nvidia-device-plugin-daemonset-rcgf9 2/2 Running 0 23h
nvidia-device-plugin-mps-control-daemon-mzbh9 2/2 Running 0 5h23m
nvidia-operator-validator-c5m8f 1/1 Running 0 23h
stable-node-feature-discovery-gc-85f45bc45-gxwj6 1/1 Running 0 78d
stable-node-feature-discovery-master-7dc854f47f-67cwd 1/1 Running 0 78d
stable-node-feature-discovery-worker-2tkjp 1/1 Running 23 (59d ago) 77d
stable-node-feature-discovery-worker-k4pz5 1/1 Running 23 (59d ago) 77d
stable-node-feature-discovery-worker-kzkvq 1/1 Running 0 23h
stable-node-feature-discovery-worker-qjprr 1/1 Running 23 (59d ago) 77d
stable-node-feature-discovery-worker-xn6sl 1/1 Running 25 (59d ago) 77d

Dumping the Pod nvidia-device-plugin-daemonset-rcgf9 as YAML shows the volume declaration for that configuration:

kubectl get pod -n gpu-operator nvidia-device-plugin-daemonset-rcgf9 -oyaml

  volumes:
  - configMap:
      defaultMode: 448
      name: nvidia-device-plugin-entrypoint
    name: nvidia-device-plugin-entrypoint
  - hostPath:
      path: /var/lib/kubelet/device-plugins
      type: ""
    name: device-plugin
  - hostPath:
      path: /run/nvidia
      type: Directory
    name: run-nvidia
  - hostPath:
      path: /
      type: ""
    name: host-root
  - hostPath:
      path: /var/run/cdi
      type: DirectoryOrCreate
    name: cdi-root
  - hostPath:
      path: /run/nvidia/mps
      type: DirectoryOrCreate
    name: mps-root
  - hostPath:
      path: /run/nvidia/mps/shm
      type: ""
    name: mps-shm
  - configMap:
      defaultMode: 420
      name: device-plugin-config # name of the mounted ConfigMap
    name: device-plugin-config
  # omitted...

2.4.3 Add an MPS Partitioning Entry

Inspect the contents of device-plugin-config:

kubectl get cm -n gpu-operator device-plugin-config -oyaml

You can see that the default entry referenced by the node label nvidia.com/device-plugin.config: default does not enable MPS:

apiVersion: v1
data:
  # default entry
  default: |-
    version: v1
    flags:
      migStrategy: none
  mig-mixed: |-
    version: v1
    flags:
      migStrategy: mixed
  mig-single: |-
    version: v1
    flags:
      migStrategy: single
# omitted...

We add a new entry named test-sharing that defines a specific sharing policy:

apiVersion: v1
data:
  # default entry
  default: |-
    version: v1
    flags:
      migStrategy: none
  mig-mixed: |-
    version: v1
    flags:
      migStrategy: mixed
  mig-single: |-
    version: v1
    flags:
      migStrategy: single
  # new entry for testing MPS
  test-sharing: |-
    version: v1
    sharing:
      mps:
        renameByDefault: true
        resources:
        - name: nvidia.com/gpu
          replicas: 3
          devices: ["0", "1", "2", "3"]
# omitted...
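
One way to add the new entry without touching the existing keys is a merge patch against the ConfigMap (a sketch; kubectl edit cm -n gpu-operator device-plugin-config works just as well):

# append the test-sharing entry shown above to the device plugin ConfigMap
kubectl patch configmap device-plugin-config -n gpu-operator --type merge -p '
data:
  test-sharing: |-
    version: v1
    sharing:
      mps:
        renameByDefault: true
        resources:
        - name: nvidia.com/gpu
          replicas: 3
          devices: ["0", "1", "2", "3"]
'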

Next, update the GPU node's labels so that nvidia.com/device-plugin.config points to this entry, i.e. change it to nvidia.com/device-plugin.config: test-sharing:

# omitted...
nvidia.com/device-plugin.config: test-sharing # point the label at the new entry
nvidia.com/gfd.timestamp: "1748575819"
nvidia.com/gpu: nvidia-ada-4090
nvidia.com/gpu.compute.major: "8"
nvidia.com/gpu.compute.minor: "9"
nvidia.com/gpu.count: "8"
# omitted...
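
The relabeling itself can be done with kubectl (assuming <node-name> stands for the GPU node shown above):

kubectl label node <node-name> nvidia.com/device-plugin.config=test-sharing --overwrite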

Shortly after the change is applied, querying the node again shows that the nvidia.com/mps.capable label has automatically changed from false to true:

# omitted...
nvidia.com/gpu.count: "8"
nvidia.com/gpu.family: ampere
nvidia.com/gpu.machine: R8428-G13
nvidia.com/gpu.memory: "24564"
nvidia.com/gpu.present: "true"
nvidia.com/gpu.product: NVIDIA-GeForce-RTX-4090
nvidia.com/gpu.replicas: "2"
nvidia.com/gpu.sharing-strategy: mps
nvidia.com/mig.capable: "false"
nvidia.com/mig.config: all-disabled
nvidia.com/mps.capable: "true" # this node label was automatically updated to true
# omitted...

At the same time, an MPS control daemon, nvidia-device-plugin-mps-control-daemon-mzbh9, is created automatically in the gpu-operator namespace:

$ kubectl get pod -n gpu-operator
NAME READY STATUS RESTARTS AGE
gpu-feature-discovery-jztwz 2/2 Running 0 22h
gpu-operator-699bc5544b-885lt 1/1 Running 15 (59d ago) 129d
node-agent-96wx8 1/1 Running 0 23h
nvidia-container-toolkit-daemonset-d7j66 1/1 Running 0 23h
nvidia-cuda-validator-zhp78 0/1 Completed 0 23h
nvidia-dcgm-exporter-bgpbh 1/1 Running 0 23h
nvidia-device-plugin-daemonset-rcgf9 2/2 Running 0 23h
nvidia-device-plugin-mps-control-daemon-mzbh9 2/2 Running 0 5h23m
nvidia-operator-validator-c5m8f 1/1 Running 0 23h
stable-node-feature-discovery-gc-85f45bc45-gxwj6 1/1 Running 0 78d
stable-node-feature-discovery-master-7dc854f47f-67cwd 1/1 Running 0 78d
stable-node-feature-discovery-worker-2tkjp 1/1 Running 23 (59d ago) 77d
stable-node-feature-discovery-worker-k4pz5 1/1 Running 23 (59d ago) 77d
stable-node-feature-discovery-worker-kzkvq 1/1 Running 0 23h
stable-node-feature-discovery-worker-qjprr 1/1 Running 23 (59d ago) 77d
stable-node-feature-discovery-worker-xn6sl 1/1 Running 25 (59d ago) 77d

2.5 How to Partition a Node's GPUs with MPS

Partitioning boils down to editing the device-plugin-config ConfigMap, adding or modifying the relevant entry (manually or automatically by a program), and then updating the nvidia.com/device-plugin.config label on the corresponding node. Once partitioning is done, the node exposes a new resource type, nvidia.com/gpu.shared, which training and inference workloads can request.

Capacity:
  cpu:                    128
  ephemeral-storage:      575546624Ki
  hugepages-1Gi:          0
  hugepages-2Mi:          0
  memory:                 263234272Ki
  nvidia.com/gpu:         7
  nvidia.com/gpu.shared:  2 # newly added resource type
  pods:                   110
  rdma/hca_ib_dev:        0
Allocatable:
  cpu:                    127600m
  ephemeral-storage:      575546624Ki
  hugepages-1Gi:          0
  hugepages-2Mi:          0
  memory:                 255550011601
  nvidia.com/gpu:         7
  nvidia.com/gpu.shared:  2 # newly added resource type
  pods:                   110
  rdma/hca_ib_dev:        0
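
A workload can then request the shared resource by name. Below is a minimal Pod sketch; the Pod name and image are placeholders, not part of the original setup:

apiVersion: v1
kind: Pod
metadata:
  name: mps-demo
spec:
  restartPolicy: Never
  containers:
  - name: cuda-app
    image: <your-cuda-image>      # placeholder image with nvidia-smi available
    command: ["nvidia-smi"]       # just print the GPU visible to this MPS share
    resources:
      limits:
        nvidia.com/gpu.shared: 1  # request one MPS share instead of a whole GPU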

3. MIG-Based GPU Partitioning

3.1 Dependencies and Limitations

  • Depends on the GPU Operator component.
  • Kubernetes version 1.22-1.29 (a GPU Operator requirement).
  • The GPU must be fully released: all Pods using the GPU have to be evicted (evicting all Pods on the node is recommended).

3.2 Pros and Cons

Advantages

  1. Each instance has its own memory and compute units, so resource isolation is strong and faults cannot propagate between instances.
  2. The strong isolation makes it well suited to multi-tenant workloads.

Disadvantages

  1. Flexibility is relatively low: each card model only supports a fixed set of partition profiles.
  2. Partitioning is a "heavy" operation: all workloads must be evicted and the kernel-level GPU usage released before it can be performed.
  3. It is demanding on hardware: only the Ampere architecture and later card models are supported (and only specific cards, not every Ampere GPU).

Notes

  1. On the Ampere architecture, once MIG is enabled the MIG configuration persists across machine reboots.
  2. The H100 series requires CUDA 12+ / driver R525+.
  3. The A100 and A30 series require CUDA 11+ / driver R450+.

Applicable Scenarios

Workloads with high QoS requirements that need strong isolation.

3.3 How to Tell Whether a GPU Supports MIG

You can refer to the GPU Operator source code: https://github.com/NVIDIA/gpu-operator/blob/main/assets/state-mig-manager/0400_configmap.yaml


The file contains comments with approximate model names along with their hexadecimal device IDs, but those names do not line up exactly with the values of nvidia.com/gpu.product, which makes automated matching awkward. As a first step you can maintain a configuration file that lists the card models used in your organization; later, a script can read the GPU's hexadecimal device ID and automatically apply it as a node label. The following nvidia-smi command extracts the hexadecimal device ID of the GPU:

nvidia-smi -q -x | grep '<pci_device_id>' | sed 's/.*<pci_device_id>\(.*\)<\/pci_device_id>.*/\1/'| uniq
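
Building on that command, a node-labeling sketch could look like the following. The label key example.com/gpu.pci-device-id is purely hypothetical, and the script assumes it runs on the GPU node itself and that the Kubernetes node name equals the hostname:

# read the GPU's hexadecimal PCI device ID via nvidia-smi
DEVICE_ID=$(nvidia-smi -q -x | grep '<pci_device_id>' | sed 's/.*<pci_device_id>\(.*\)<\/pci_device_id>.*/\1/' | uniq | head -n1)
# write it to a custom node label (hypothetical key) for later matching against a model table
kubectl label node "$(hostname)" "example.com/gpu.pci-device-id=${DEVICE_ID}" --overwrite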

3.4 Enabling the Feature

3.4.1 Check the GPU Node's Labels

Looking at the node labels, nvidia.com/mig.capable: "false" shows that the MIG feature is not enabled:

# omitted...
nvidia.com/gpu.family: hopper
nvidia.com/gpu.machine: OpenStack-Nova
nvidia.com/gpu.memory: "97871"
nvidia.com/gpu.present: "true"
nvidia.com/gpu.product: NVIDIA-H20
nvidia.com/gpu.replicas: "1"
nvidia.com/gpu.sharing-strategy: none
nvidia.com/mig.capable: "false"
nvidia.com/mig.config: all-disabled
# omitted...

3.4.2 Determine the Configuration Name Used for MIG

Query the clusterpolicy and check the name used for its MIG configuration:

kubectl get clusterpolicy -oyaml

In the output YAML, the following section contains the main MIG settings; default-mig-parted-config is the name of the ConfigMap backing the MIG feature:

# omitted ...
mig:
  strategy: single # single: every card uses the same profile; mixed: cards can be partitioned differently
migManager:
  config:
    default: all-disabled # default policy
    name: default-mig-parted-config # ConfigMap holding the custom profiles
  enabled: true
  env:
  - name: WITH_REBOOT # whether to reboot the node after applying the configuration
    value: "false"
  gpuClientsConfig:
    name: ""
  image: k8s-mig-manager
  imagePullPolicy: IfNotPresent
  repository: nvcr.io/nvidia/cloud-native
  version: v0.12.1-ubuntu20.04
# omitted ...
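
If you only need these two fields, a jsonpath query saves scrolling through the full YAML (a sketch; clusterpolicy is cluster-scoped, so no namespace is required):

kubectl get clusterpolicy -o jsonpath='{.items[0].spec.mig.strategy}{"\n"}{.items[0].spec.migManager.config.name}{"\n"}'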

3.4.3 Modify the Global MIG Configuration

3.4.3.1 device-plugin-config

Run the following command to check whether the device plugin configuration already contains a mig-mixed entry; add it if it is missing:

kubectl get cm -n gpu-operator device-plugin-config -oyaml 

The mig-mixed entry looks like this:

# omitted ...
mig-mixed: |-
  version: v1
  flags:
    migStrategy: mixed
# omitted ...
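
If the entry is missing, it can be appended with a merge patch mirroring the content above (a sketch):

kubectl patch configmap device-plugin-config -n gpu-operator --type merge -p '
data:
  mig-mixed: |-
    version: v1
    flags:
      migStrategy: mixed
'
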
3.4.3.2 clusterpolicy

Modify the relevant part of the clusterpolicy as follows:

# omitted ...
mig:
  strategy: mixed
migManager:
  config:
    default: all-disabled
    name: custom-mig-parted-config
  enabled: true
  env:
  - name: WITH_REBOOT
    value: "false"
  gpuClientsConfig:
    name: ""
  image: k8s-mig-manager
  imagePullPolicy: IfNotPresent
  repository: harbor.hl.zkj.local/pa/mirror-stuff/nvidia/cloud-native
  version: v0.7.0-ubuntu20.04
# omitted ...

In the clusterpolicy, default-mig-parted-config is the name of the backing ConfigMap; here it has been changed to a custom name, custom-mig-parted-config.
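
Both changes can be applied in one go with kubectl patch (a sketch; the ClusterPolicy object is usually named cluster-policy, which you can confirm with kubectl get clusterpolicy):

kubectl patch clusterpolicy cluster-policy --type merge -p '
spec:
  mig:
    strategy: mixed
  migManager:
    config:
      default: all-disabled
      name: custom-mig-parted-config
'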

As with MPS, the MIG feature only truly takes effect once a partitioning entry has been added and the corresponding entry name has been attached to the labels of the target GPU node. Next, let's look at how to configure the MIG partitioning itself.

3.5 How to Partition a Node's GPUs with MIG

3.5.1 Inspect the Existing MIG Configuration

View the contents of the custom-mig-parted-config ConfigMap:

kubectl get cm -n gpu-operator custom-mig-parted-config -oyaml

The output looks like this:

apiVersion: v1
kind: ConfigMap
metadata:
  creationTimestamp: "2024-12-05T06:16:49Z"
  name: custom-mig-parted-config
  namespace: gpu-operator
  resourceVersion: "15465565"
  uid: b9d510b5-98b2-4a6a-8f9a-f045485e96ff
data:
  config.yaml: |
    version: v1
    mig-configs:
      all-1g.5gb:
      - devices: all
        mig-enabled: true
        mig-devices:
          1g.5gb: 7
      all-1g.5gb.me:
      - devices: all
        mig-enabled: true
        mig-devices:
          1g.5gb+me: 1
      all-1g.6gb:
      - devices: all
        mig-enabled: true
        mig-devices:
          1g.6gb: 4
      all-1g.6gb.me:
      - devices: all
        mig-enabled: true
        mig-devices:
          1g.6gb+me: 1
      # omitted ... many more entries

This is in fact the default MIG partitioning configuration; see the GPU Operator source code: https://github.com/NVIDIA/gpu-operator/blob/main/assets/state-mig-manager/0400_configmap.yaml

MIG profile naming convention: (figure omitted)

3.5.2 Add a Custom MIG Partitioning Entry

We append a custom MIG partitioning entry at the very end of the custom-mig-parted-config ConfigMap. This change can be made by hand, but is usually applied automatically by a program:

# omitted ...
zkj-pa-uat-gpu-004:
- devices: [3]
  mig-enabled: true
  mig-devices:
    1g.12gb: 2
    1g.24gb: 1
    3g.48gb: 1
# omitted ...

This node originally has eight NVIDIA-H20 GPUs. The entry above splits the card at index 3 into four virtual devices: two 1g.12gb, one 1g.24gb, and one 3g.48gb.
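
For a manual test, the simplest way to add such an entry is to edit the ConfigMap in place (an automated pipeline would typically patch it instead):

kubectl edit cm -n gpu-operator custom-mig-parted-config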

3.5.3 Attach the Partitioning Entry to the Node

The MIG-related node labels currently look like this:

nvidia.com/device-plugin.config: default
nvidia.com/mig.config: all-disabled

Change them to the following:

nvidia.com/device-plugin.config: mig-mixed
nvidia.com/mig.config: zkj-pa-uat-gpu-004
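
Both labels can be updated with kubectl (assuming <node-name> is the GPU node being partitioned):

kubectl label node <node-name> nvidia.com/device-plugin.config=mig-mixed --overwrite
kubectl label node <node-name> nvidia.com/mig.config=zkj-pa-uat-gpu-004 --overwrite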

Shortly after the change is saved, the MIG manager updates the node labels as follows:

nvidia.com/mig-1g.12gb.count: "2"
nvidia.com/mig-1g.12gb.engines.copy: "1"
nvidia.com/mig-1g.12gb.engines.decoder: "1"
nvidia.com/mig-1g.12gb.engines.encoder: "0"
nvidia.com/mig-1g.12gb.engines.jpeg: "1"
nvidia.com/mig-1g.12gb.engines.ofa: "0"
nvidia.com/mig-1g.12gb.memory: "12032"
nvidia.com/mig-1g.12gb.multiprocessors: "8"
nvidia.com/mig-1g.12gb.product: NVIDIA-H20-MIG-1g.12gb
nvidia.com/mig-1g.12gb.replicas: "1"
nvidia.com/mig-1g.12gb.sharing-strategy: none
nvidia.com/mig-1g.12gb.slices.ci: "1"
nvidia.com/mig-1g.12gb.slices.gi: "1"
nvidia.com/mig-1g.24gb.count: "1"
nvidia.com/mig-1g.24gb.engines.copy: "1"
nvidia.com/mig-1g.24gb.engines.decoder: "1"
nvidia.com/mig-1g.24gb.engines.encoder: "0"
nvidia.com/mig-1g.24gb.engines.jpeg: "1"
nvidia.com/mig-1g.24gb.engines.ofa: "0"
nvidia.com/mig-1g.24gb.memory: "24192"
nvidia.com/mig-1g.24gb.multiprocessors: "8"
nvidia.com/mig-1g.24gb.product: NVIDIA-H20-MIG-1g.24gb
nvidia.com/mig-1g.24gb.replicas: "1"
nvidia.com/mig-1g.24gb.sharing-strategy: none
nvidia.com/mig-1g.24gb.slices.ci: "1"
nvidia.com/mig-1g.24gb.slices.gi: "1"
nvidia.com/mig-3g.48gb.count: "1"
nvidia.com/mig-3g.48gb.engines.copy: "3"
nvidia.com/mig-3g.48gb.engines.decoder: "3"
nvidia.com/mig-3g.48gb.engines.encoder: "0"
nvidia.com/mig-3g.48gb.engines.jpeg: "3"
nvidia.com/mig-3g.48gb.engines.ofa: "0"
nvidia.com/mig-3g.48gb.memory: "48512"
nvidia.com/mig-3g.48gb.multiprocessors: "32"
nvidia.com/mig-3g.48gb.product: NVIDIA-H20-MIG-3g.48gb
nvidia.com/mig-3g.48gb.replicas: "1"
nvidia.com/mig-3g.48gb.sharing-strategy: none
nvidia.com/mig-3g.48gb.slices.ci: "3"
nvidia.com/mig-3g.48gb.slices.gi: "3"
nvidia.com/mig.capable: "true"
nvidia.com/mig.config: zkj-pa-uat-gpu-004
nvidia.com/mig.config.state: success
nvidia.com/mig.strategy: mixed
nvidia.com/mps.capable: "false"

You can see that MIG has been successfully enabled on this node and the partitioning details have been written back to the node labels automatically. The node also gains several new resource types:

  allocatable:
    cpu: "128"
    ephemeral-storage: "96626364666"
    hugepages-1Gi: "0"
    hugepages-2Mi: "0"
    memory: 1056189400Ki
    nvidia.com/gpu: "7"
    nvidia.com/mig-1g.12gb: "2"
    nvidia.com/mig-1g.24gb: "1"
    nvidia.com/mig-3g.48gb: "1"
    pods: "110"
  capacity:
    cpu: "128"
    ephemeral-storage: 104846316Ki
    hugepages-1Gi: "0"
    hugepages-2Mi: "0"
    memory: 1056291800Ki
    nvidia.com/gpu: "7"
    nvidia.com/mig-1g.12gb: "2"
    nvidia.com/mig-1g.24gb: "1"
    nvidia.com/mig-3g.48gb: "1"
    pods: "110"

These are the newly added resource types; training and inference workloads can request partitioned GPU resources under these names:

nvidia.com/mig-1g.12gb: "2"
nvidia.com/mig-1g.24gb: "1"
nvidia.com/mig-3g.48gb: "1"
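
Workloads request these MIG devices like any other extended resource. Below is a minimal Pod sketch; the Pod name and image are placeholders:

apiVersion: v1
kind: Pod
metadata:
  name: mig-demo
spec:
  restartPolicy: Never
  containers:
  - name: cuda-app
    image: <your-cuda-image>       # placeholder image with nvidia-smi available
    command: ["nvidia-smi", "-L"]  # list the MIG device visible to the container
    resources:
      limits:
        nvidia.com/mig-1g.12gb: 1  # request one 1g.12gb MIG instance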

4. Summary

NVIDIA's MPS and MIG technologies offer effective solutions to low GPU utilization and multi-tenant isolation. MPS relies on software-level isolation and supports dynamic resource allocation, which suits multi-replica inference services and lightweight training; MIG provides hardware-level isolation with full performance and fault isolation, which suits multi-tenant environments and workloads that demand strong isolation. Choosing between them should weigh business requirements, hardware constraints, operational complexity, and the target scenarios. In real deployments, pay attention to resource monitoring, configuration automation, and security hardening to keep the system stable and performant. Applied well, these GPU sharing techniques can significantly raise GPU utilization and lower the cost of deploying AI applications while meeting the needs of different workloads.