© 2025, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Own Your AI: Blazing Fast OSS AI on AWS
STP104

2025 was the year of the agents
Health/Medical
Manufacturing
Security
Coding Agents
Document Agents
Sales/Marketing Agents
Hiring Agents
Customer Service Agents
Retail
Insurance
Finance
Education

But, building agents is very hard
ML expertise
DevOps complexity
Continuous optimizations
Model selection
Latency scale
Quality/Accuracy
Cost per request
Infra complexity
Data privacy
Compliance
Security requirements
Availability

Fireworks Platform
BUILD: SDK, CLI, SERVICES API, MODEL LIBRARY, Whisper
TUNE: CUSTOMIZATION ENGINE, MODEL LIFECYCLE MANAGEMENT, FINE TUNING (SFT & DPO), REINFORCEMENT LEARNING TUNING
SCALE: VIRTUAL CLOUD INFRASTRUCTURE, MULTI CLOUD, MULTI HW, BYOC, MANAGEMENT AND SECURITY

Fireworks Platform
The inference cloud, a highly optimized, globally available platform for building AI agents and applications

Production Workload and Profile
Personalized configuration for faster, higher-quality inference

Fireworks Optimizer
Transforms generic models into tailored enterprise solutions with blazing fast speed personalized to your workload
Adaptive speculation
Adaptive caching
Quantization
Hardware Setup Optimization
FireAttention: custom CUDA kernel
quality

The optimization space for LLM inference is complex
84,000+ possible combinations
3 speculative decoding modes
5 quantization modes
7 hardware SKUs
4 parallel execution modes
5 cross-host setups
10 kernel options
4 quality tuning approaches
Product of the options listed: 3 × 5 × 7 × 4 × 5 × 10 × 4 = 84,000.

Optimization goals
FireOptimizer turns an intractable manual optimization problem into a workload-specific solution.

Fireworks Optimizer
Your model is your IP, your model is your product.
Fine-tune to beat top closed-s