Accelerating Inference at the Edge: Unlock Scalable, Secure, and Low-Latency Connectivity for the AI Era with IP over DWDM
Frank Yang

Some Interesting Survey Results
- Nearly 95% of AI spend is on inference (run-time) vs. pre-training, according to the Menlo Ventures enterprise AI survey, 2024.
- 79% of financial services companies follow a distributed data approach, according to Digital Realty's AI in financial services survey, 2024.
- 77% of respondents believe inferencing at the edge to be an essential strategy for managing AI workloads effectively, according to the Datacenter Dynamics edge AI survey, 2025.

AI Inference Workflow
[Figure: AI inference workflow. Labels: AI Framework (TensorFlow, PyTorch, TensorRT, etc.), Model Optimization, Inference Serving, AI Application, Data Sources, AI Infrastructure, Trained Models, Model Repository. Tiers: Far Edge, Edge Data Center, Core Data Center/Cloud.]

AI Inference Use Case Explorations

Use case: Smart Retail / Mall Analytics
  Edge workload (partial inference): Shopper detection, anonymized embeddings, foot traffic heatmaps
  Core workload (final inference): Cross-store profiling, campaign analytics, personalization
  Key network requirements (latency and bandwidth): Moderate bandwidth (camera to edge), low local latency (100 ms), privacy/anonymization

Use case: Healthcare Imaging / Diagnostics
  Edge workload (partial inference): DICOM ingestion, image preprocessing, candidate lesion detection
  Core workload (final inference): Multi-modal correlation, ensemble diagnosis, access to historical records/EMR
  Key network requirements (latency and bandwidth): High bandwidth (images), lowish latency for triage (500 ms), strict security/compliance

Use case: Real-Time Financial Fraud Detection
  Edge workload (partial inference): Transaction scoring, feature extraction, early anomaly flagging at branch/POS
  Core workload (final inference): Graph correlation, cross-bank analysis, escalation and identity verification
  Key network requirements (latency and bandwidth): Ultra-low latency (50 ms), extremely reliable and auditable
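
To make the latency figures in the use-case rows above more concrete, here is a minimal Python sketch of a latency-budget calculation for an edge-to-core fiber link. It is an illustration under stated assumptions, not part of the original slides: it treats the quoted figures (100 ms, 500 ms, 50 ms) as end-to-end budgets, uses the common approximation of about 4.9 microseconds of one-way propagation delay per kilometer of single-mode fiber, and ignores switching, queuing, and serialization delay. The function names, dictionary keys, and any processing times passed in are hypothetical.

```python
# Approximate one-way propagation delay in single-mode fiber.
# Light travels at roughly c / 1.47 in glass, i.e. about 4.9 us per km.
FIBER_DELAY_US_PER_KM = 4.9

# Latency figures quoted in the use-case table, in milliseconds.
# Assumption: they are treated here as end-to-end budgets.
LATENCY_BUDGET_MS = {
    "smart_retail": 100.0,
    "healthcare_imaging": 500.0,
    "fraud_detection": 50.0,
}


def round_trip_fiber_ms(distance_km: float) -> float:
    """Round-trip propagation delay over a fiber path of the given length."""
    return 2.0 * distance_km * FIBER_DELAY_US_PER_KM / 1000.0


def remaining_budget_ms(use_case: str, distance_km: float, processing_ms: float) -> float:
    """Budget left after fiber propagation and (hypothetical) inference time."""
    budget = LATENCY_BUDGET_MS[use_case]
    return budget - round_trip_fiber_ms(distance_km) - processing_ms


def max_distance_km(use_case: str, processing_ms: float) -> float:
    """Longest edge-to-core fiber path that still fits the budget."""
    spare_ms = LATENCY_BUDGET_MS[use_case] - processing_ms
    if spare_ms <= 0:
        return 0.0
    return spare_ms * 1000.0 / (2.0 * FIBER_DELAY_US_PER_KM)


if __name__ == "__main__":
    # Hypothetical example: 30 ms of total inference time, 100 km of fiber.
    for name in LATENCY_BUDGET_MS:
        left = remaining_budget_ms(name, distance_km=100.0, processing_ms=30.0)
        print(f"{name}: {left:.2f} ms of budget left")
```

Under these assumptions, fiber propagation over metro-scale distances consumes only about 1 ms per 100 km of round trip, so most of each budget is left for inference and network processing.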