{"product_id":"hands-on-llm-serving-and-optimization-hosting-llms-at-scale-paperback","title":"Hands-On LLM Serving and Optimization: Hosting LLMs at Scale - Paperback","description":"\u003cp\u003eby \u003cb\u003eChi Wang\u003c\/b\u003e (Author), \u003cb\u003ePeiheng Hu\u003c\/b\u003e (Author)\u003c\/p\u003e\u003cp\u003eLarge language models (LLMs) are the reasoning engines of modern AI. Today, a major inflection point has arrived: as the world races to deploy AI at scale, model inference has moved to the center of the stack. Welcome to the inference era.\u003c\/p\u003e \u003cp\u003eWithout proper optimization, however, LLMs can be expensive and slow to serve. \u003cem\u003eHands-On LLM Serving and Optimization\u003c\/em\u003e is a comprehensive guide to the complexities of deploying and optimizing LLMs at scale.\u003c\/p\u003e \u003cp\u003eIn this hands-on, engineering-focused book, authors Chi Wang and Peiheng Hu combine practical examples, code, and strategies for building robust, performant, and cost-efficient AI token factories. 
Whether you're building the LLM inference infrastructure or the applications that consume it, a deep understanding of LLM serving will make you a more effective, future-ready engineer as AI transforms how we work and build.\u003c\/p\u003e \u003cul\u003e \u003cli\u003eLearn the foundations of model serving with core concepts, design paradigms, and industry best practices\u003c\/li\u003e \u003cli\u003eUnderstand the common challenges of hosting LLMs at scale\u003c\/li\u003e \u003cli\u003eBalance latency and throughput to meet the demands of AI applications and business requirements\u003c\/li\u003e \u003cli\u003eHost LLMs cost-effectively with practical, code-backed techniques\u003c\/li\u003e \u003c\/ul\u003e \n            \u003cdiv\u003e\n\u003cstrong\u003eNumber of Pages:\u003c\/strong\u003e 371\u003c\/div\u003e\n            \u003cdiv\u003e\n\u003cstrong\u003eDimensions:\u003c\/strong\u003e 0.77 x 9.19 x 7 IN\u003c\/div\u003e\n            \u003cdiv\u003e\n\u003cstrong\u003ePublication Date:\u003c\/strong\u003e June 02, 2026\u003c\/div\u003e\n            ","brand":"BooksCloud","offers":[{"title":"Default Title","offer_id":53510214123827,"sku":"9798341621497","price":93.38,"currency_code":"USD","in_stock":true}],"thumbnail_url":"\/\/cdn.shopify.com\/s\/files\/1\/0300\/5595\/6612\/files\/j_p_Nfp2eC9798341621497.webp?v=1781748036","url":"https:\/\/www.vysn.com\/products\/hands-on-llm-serving-and-optimization-hosting-llms-at-scale-paperback","provider":"VYSN","version":"1.0","type":"link"}