Yuan Tang
Senior Principal Software Engineer, Red Hat AI
Project Lead: KServe, Argo, and Kubeflow
Co-chair: K8s WG Serving and WG AI Conformance
Co-chair: CNCF TAG Workloads Foundation

KServe Deep Dive: Evolving Model Serving for the Generative AI Era
#IBMTechXchange

Agenda
01 What is KServe
02 History
03 Open Source Community
04 Architecture
05 Key Features

What is KServe?
Highly scalable, standard, cloud agnostic model inference platform on Kubernetes

Why KServe?
Provides a performant, standardized inference protocol across ML frameworks.
Supports modern serverless inference workloads with autoscaling, including Scale to Zero on GPU.
Provides high scalability, density packing, and intelligent routing using ModelMesh.
Simple and pluggable production ML serving, including prediction, pre/post processing, monitoring, and explainability.
Advanced deployments with canary rollout, experiments, ensembles, and transformers.
GenAI capabilities: Envoy AI Gateway, KEDA, LMCache, Model Cache, vLLM multi-node inference.

History of KServe
From Kubeflow/KFServing to KServe
Developed collaboratively by Google, IBM, Bloomberg, NVIDIA, and Seldon in 2019 under the Kubeflow project.
The project graduated from Kubeflow and was rebranded from KFServing to the standalone KServe project in Sep 2021.
KServe 0.7 was released outside of Kubeflow in Oct 2021, with a migration guide for minimal disruption.
KServe was donated to the LF AI & Data Foundation in Nov 2021.
KServe was accepted as a CNCF incubating project in Oct 2025.

History of KServe
Kubeflow Ecosystem

Community: Maintainers and Contributors
19 maintainers and 300+ contributors!

Community: Adopters
30+ companies ranging from vendors to end users!

KServe Architecture

KServe Feature: Serving Runtimes
Pluggable, reu