November 20, 2024

Server Market Sees Two Major Shifts

At the outset of 2023, a major internet company approached Inspur Information with a novel operational challenge. The client ran diverse application scenarios, and in practice the most suitable processor platform varied from one to the next. Lightweight container applications, for instance, called for moderate performance under tight constraints on power consumption and physical density, whereas high-performance computing workloads such as scientific simulation and complex modeling favored processors with strong parallel processing capability and a larger number of high-frequency cores. The client's question was pointed: how could it rapidly deploy servers with different processors across its many lines of business?

Traditionally, general-purpose server systems were designed around a single processor platform, which meant significant custom engineering for each application.

However, faced with a growing need for diversified computing power and clients unwilling to settle for "either/or" solutions, the industry found itself at a tipping point: nearly three decades of general-purpose server architecture called for fundamental transformation. This was no longer just a matter of performance or power consumption; it required a paradigm shift in how servers were designed and built.

Simultaneously, the advent of artificial intelligence placed further demands on general-purpose servers, particularly in areas such as data storage for massive model training. While AI servers predominantly handled the training and inference workloads, general-purpose server architectures needed to adapt to AI's requirements, adopting high-density configurations similar to those used in dedicated AI hardware. As data centers worldwide evolved toward immense computing clusters supporting hundreds of thousands or even millions of cores, general-purpose servers likewise began to adopt advanced smart acceleration capabilities.

These emerging market dynamics marked a pivotal moment: general-purpose servers, though a mature product category, once again stood on the cusp of innovation.

If projections hold, shipment growth for general-purpose servers will stay at a modest 5% to 6%. That outlook points toward an inevitable disruption driven by newly established standards and competitive forces within the industry.

In response to the internet giant's multi-faceted computing requirements, Inspur Information held collaborative brainstorming sessions with the client, arriving at the concept of decoupling the server architecture.

Previous experience with competing acceleration chips in AI servers had shown the efficacy of the Open Accelerator Module (OAM) standard, whose decoupled, standardized modules allowed chips from different vendors to be adopted and scaled quickly.

"The notion of OAM was a significant inspiration for us," remarked Zhao Shuai, general manager of Inspur Information's server product line. The insight was clear: if general-purpose servers could break with traditional market norms and, instead of being organized around a single processor, be built from standardized modules for processors, storage, I/O, and power supplies, clients could assemble systems like Lego bricks to suit their varied needs.
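The Lego analogy can be sketched in code. The following is an illustrative Python sketch, not any real Inspur API: every module name and wattage is hypothetical, and it only shows how standardized, interchangeable modules let two very different builds share everything but the CPU.

```python
from dataclasses import dataclass, field

# Hypothetical catalog of standardized, interchangeable server modules.
@dataclass(frozen=True)
class Module:
    kind: str   # "cpu", "storage", "io", "power", ...
    name: str
    watts: int

@dataclass
class ServerConfig:
    modules: list = field(default_factory=list)

    def add(self, module: Module) -> "ServerConfig":
        self.modules.append(module)
        return self

    def power_budget(self) -> int:
        # Sum of nominal power across all fitted modules.
        return sum(m.watts for m in self.modules)

# Two builds share the same storage and I/O modules but differ in CPU.
shared = [Module("storage", "nvme-tray", 120), Module("io", "25g-nic", 25)]

container_node = ServerConfig()
for m in [Module("cpu", "low-power-arm", 180), *shared]:
    container_node.add(m)

hpc_node = ServerConfig()
for m in [Module("cpu", "high-freq-x86", 500), *shared]:
    hpc_node.add(m)

print(container_node.power_budget())  # 325
print(hpc_node.power_budget())        # 645
```

Swapping the CPU module changes the build without touching the shared storage and I/O definitions, which is precisely the flexibility the client was asking for.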

More than a year later, after considerable effort across the industry chain, the decoupling approach was finally set in motion.

The Open Computing Module (OCM) standard was initiated by the Open Computing Technology Consortium (OCTC), establishing a framework for standardized computing modules and enabling multiple CPUs to be integrated seamlessly in a single server. Under the standard, future servers can accommodate a variety of CPU platforms, including Intel, AMD, and ARM, interchangeably or even simultaneously. It marked the first server computing module design standard in China.

The OCM initiative marked a turning point, breaking with decades of rigid general-purpose server design, and several catalysts within the industry made it possible. For one, the once-monolithic processor market has shifted significantly in recent years: diverse computing demands spurred not only the continued evolution of the x86 architecture but also aggressive forays into the market by RISC-V and ARM players, igniting fierce competition among chip manufacturers.

The race to reach users and facilitate swift operational rollouts became paramount.

In addition, enterprise end users voiced urgent needs for flexible, rapidly deployable computing. Internet companies demanded adaptable server nodes, while telecom operators felt pressure to scale diverse computing solutions quickly.

With server manufacturers confronting an ever-wider array of chip platforms, development workloads surged and costs rose. There was therefore an acute incentive among these companies to develop multi-platform servers more efficiently.

Meanwhile, national standards bodies recognized a gap in computing module standards and expressed willingness to develop them, helping lift the domestic server industry toward global benchmarks.

This collective momentum fostered collaboration across the industry chain.

Thus, when the OCTC launched the OCM standard in 2024, the founding members included the China Electronics Technology Standardization Institute, Baidu, Xiaohongshu, Inspur Information, Intel, AMD, Lenovo, and other stakeholders.

The standard-setting process, however, ran into challenges of its own, as stakeholders' demands sometimes conflicted. The internet giants wanted leading chip platforms formally recognized within the standard, while some domestic and international chip manufacturers were more concerned with platform compatibility and with showcasing their own competitive strengths. The final consensus was to evaluate all processor platforms comprehensively and make the standard compatible with them.

Server manufacturers each had distinct needs as well, often wanting the standard to tilt toward their own architectures.

Ultimately, a standardized-board-plus-tray approach was adopted, allowing rapid coupling with different chassis and platform technologies and resolving the potential conflicts.

Reflecting on the standard's initiation and development, Luo Jian, product planning manager for Inspur Information's server product line, emphasized that consensus emerged largely because everyone saw the overarching benefit to the healthy evolution of the industry. That collaborative spirit gave OCM a level playing field on which all participants could jointly champion the high-quality growth of the computing sector.

After the introduction of the OCM standard, the industry commenced its productization efforts.

Inspur Information promptly introduced the first general-purpose server built on the OCM framework, the Yuan Nerve NF3290G8. The server accommodates the two latest CPU families: Intel® Xeon® 6 processors and fifth-generation AMD EPYC™ 9005 series processors.

The former demonstrates substantial performance enhancements across AI inference and scientific research contexts, while the latter excels in scenarios demanding all-flash storage solutions and high-frequency trading.

This pivotal productization effort around the OCM standard surfaced three major trends that warrant industry attention: first, continued decoupling; second, intelligent management built on large-model technology; and third, a move toward openness in both hardware and software.

The decoupling trend driven by OCM points to a promising evolution for server architectures. "From a system efficiency perspective, once we break systems down into distinct standardized modules for general-purpose computing, memory, and heterogeneous computing, we can create uniform power supply, cooling, and control strategies tailored to each hardware resource to achieve optimal power efficiency," stated Luo Jian.

The initial outcomes are already visible with the NF3290G8 model adopting the OCM standard.

To realize the decoupled, modular design, engineers focused on standardizing the crucial issues of power supply, management, and high-speed interconnects. Management was a particular challenge: because management interfaces and protocols vary widely across processor chips, the baseboard management controller (BMC) must translate these divergent signals into one coherent management scheme. Historically this was the forte of independent BMC firmware vendors (IBVs), but in 2023 Inspur Information used the OpenBMC open-source project to build key firmware development capabilities in house, laying a foundation for standardized processor management.
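The BMC's role here is essentially that of an adapter layer, which a minimal sketch can illustrate. The drivers and sensor readings below are hypothetical stubs, not OpenBMC code; real BMCs speak vendor-specific protocols (for example, margin-style versus absolute temperature readings) and normalize them behind one interface.

```python
from abc import ABC, abstractmethod

# Hypothetical per-platform management drivers. Real CPU platforms expose
# different protocols and data formats, which the BMC must normalize.
class CpuMgmtDriver(ABC):
    @abstractmethod
    def read_temp_celsius(self) -> float: ...

class PlatformADriver(CpuMgmtDriver):
    def read_temp_celsius(self) -> float:
        # Margin-style reading: degrees below a 100 C throttle point (stub).
        return 100.0 - 38.0

class PlatformBDriver(CpuMgmtDriver):
    def read_temp_celsius(self) -> float:
        # Absolute sensor reading (stub).
        return 61.5

class UnifiedBmc:
    """Presents one management interface regardless of the CPU module."""
    def __init__(self, driver: CpuMgmtDriver):
        self.driver = driver

    def cpu_temperature(self) -> float:
        return self.driver.read_temp_celsius()

# Swapping the CPU module only swaps the driver; callers are unchanged.
for driver in (PlatformADriver(), PlatformBDriver()):
    bmc = UnifiedBmc(driver)
    print(round(bmc.cpu_temperature(), 1))
```

The design point is that upper-layer management code depends only on `UnifiedBmc`, so supporting a new CPU platform means writing one driver rather than reworking the whole management stack.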

In terms of smart product management, new-generation server platforms exploit large models' ability to learn from vast datasets, enabling fault prediction.

Drawing on Inspur's proprietary "Source" model, past server fault logs were used to train a predictive model that is now integrated into the BMC management engine. The result is an early-warning system that alerts clients to potential failures up to a week in advance, substantially reducing unplanned downtime and business disruption.
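As a toy illustration of the idea (not Inspur's actual "Source"-based model), fault prediction can be reduced to scoring a window of recent BMC log events; the event names, weights, and threshold below are all invented for the sketch.

```python
from collections import Counter

# Illustrative weights for log events that often precede hardware failure.
# These values are invented for the sketch, not taken from any real model.
EVENT_WEIGHTS = {"ecc_corrected": 1.0, "fan_deviation": 2.0, "ps_warning": 3.0}
ALERT_THRESHOLD = 8.0

def failure_risk(events: list[str]) -> float:
    """Score a sliding window of BMC log events; higher means riskier."""
    counts = Counter(events)
    return sum(EVENT_WEIGHTS.get(e, 0.0) * n for e, n in counts.items())

def should_alert(events: list[str]) -> bool:
    return failure_risk(events) >= ALERT_THRESHOLD

healthy = ["ecc_corrected", "boot_ok", "ecc_corrected"]
degrading = ["ecc_corrected"] * 3 + ["fan_deviation"] * 2 + ["ps_warning"]

print(failure_risk(healthy), should_alert(healthy))      # 2.0 False
print(failure_risk(degrading), should_alert(degrading))  # 10.0 True
```

A learned model replaces the hand-set weights and threshold, but the deployment shape is the same: the BMC continuously scores recent telemetry and raises an alert well before the predicted failure.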

Additionally, the trend toward open, shared hardware design around OCM has spurred contributions to the open community, giving customers access to the relevant resources and documentation. On the software side, open-source technology from the OpenBMC community helped Inspur Information solve key problems in the decoupling effort, and the results have been contributed back to the open-source ecosystem. This open collaboration continually pools technological strengths, ultimately reinforcing both the company and the wider industry.

Alongside these three trends, the power consumption of general-purpose servers and the attendant heat dissipation challenges have also drawn industry attention.

Reports indicate that managing heat dissipation has emerged as a major hurdle in the productization phase.

Current forecasts put future processor platform power in the 500 to 600 watt range, alongside four GPUs at roughly 350 watts each. The adoption of smart NICs as standard equipment in cloud services adds further draw. Aggregated, a server's total power requirement can approach 3,000 watts, making heat dissipation a paramount challenge. Luo Jian disclosed that one strategy the engineers adopted is to segregate the thermal channels for CPUs, GPUs, and smart NICs, improving heat dissipation by more than 5%. That improvement bears directly on a data center's Power Usage Effectiveness (PUE).
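The arithmetic behind the roughly 3,000-watt figure can be laid out explicitly. The article gives only the CPU and GPU numbers; the dual-socket assumption and the smart-NIC and "other" line items below are assumptions added to show how the components could plausibly sum to that total.

```python
# Rough per-server power budget using the article's figures. The two-socket
# split and the "smart_nic"/"other" entries are assumptions for illustration.
budget_watts = {
    "cpu": 2 * 600,       # two sockets at the 500-600 W upper bound
    "gpu": 4 * 350,       # four accelerators at ~350 W each
    "smart_nic": 2 * 75,  # assumed pair of smart NICs
    "other": 250,         # memory, drives, fans, PSU losses (assumed)
}
total = sum(budget_watts.values())
print(total)  # 3000
```

Under these assumptions the accelerators alone account for nearly half the budget, which is why the thermal design splits their airflow path from the CPUs' and NICs'.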

Looking ahead, as power requirements for general-purpose servers continue to climb, conventional air cooling may reach its limits, prompting a potential shift toward liquid cooling under the OCM standard.

Embracing the OCM framework has also considerably reduced server development costs.

The decoupled approach eliminated much repetitive development work, accelerating the path from chip research and testing to deployment. As a result, Inspur Information has compressed its product development cycle from 18 months to 6 to 8 months. Nor has the redefined architecture undermined server reliability, thanks to tightened standards for signal integrity, power management, structural requirements, and overall system stability.

As these transitions unfold, the general-purpose server sector clearly stands at the threshold of a transformative era. OCM is a milestone in the evolution of server design, but the years ahead will likely see an even more profound impact on general-purpose servers from intelligent computing.

Currently, intelligent computing technologies are steering the direction of the entire industry.

The surging compute demand of large models is driving exponential growth in intelligent computing. IDC market research forecasts the AI server market to keep doubling through 2023 and 2024; in the Chinese market alone, AI servers were projected to reach the $10 billion mark in 2023 and nearly $20 billion in 2024. These developments position AI servers to claim a substantial share of the overall server market, and a prevailing view now holds that overall market health can be gauged by AI server demand.

Within AI servers, flagship GPU chips are evolving through chiplet design, linking multiple chiplets in a single package for higher processing efficiency. That approach, however, comes with sharply escalating power draw: certain configurations reach 1,200 or even 1,600 watts, intensifying the energy demands on the infrastructure supporting computation.

Over the past decade, data center infrastructure has evolved only gradually.

Most data centers today are built to supply 10 to 12 kilowatts per rack. With the rise of intelligent computing, however, projections see future rack power surging toward or beyond the 100-kilowatt threshold, with specific AI configurations potentially reaching 400 kilowatts.

"Given this overarching context, we anticipate that significant transformative cycles could emerge for general-purpose computing," Luo Jian explained. He underscored that current general-purpose server deployment models are inadequate for such high-power data centers, and that low yields and efficiencies may drive a long-term transition toward high-density, liquid-cooled cabinet designs.

If general-purpose servers do migrate toward such high-density cabinet configurations, their component nodes will be designed around layered decoupling.

OCM's decoupling philosophy turns computing into smaller modules, potentially serving as the springboard for denser data center deployments. Liquid cooling could then push those densities further still.

Luo Jian expects significant product design changes to arise during this shift toward high-density, liquid-cooled solutions. Innovations might include memory laid flat against the motherboard, or mounted on both sides of it, with adaptations that ease liquid-cooling integration.

Enabling such transformations will inevitably expand the industry chain, drawing liquid cooling, memory, and power delivery companies into close collaboration. "OCM serves as an excellent starting point," Luo Jian noted.
