#intelxeoncpu
govindhtech · 7 months ago
Intel oneDPL(oneAPI DPC++ Library) Offloads C++ To SYCL
Offload Standard Parallel C++ Code to a SYCL Device Using the Intel oneDPL (oneAPI DPC++ Library).
Enhance C++ Parallel STL methods with multi-platform parallel computing capabilities. C++ algorithms can be executed in parallel and vectorized with the Parallel Standard Template Library, commonly known as Parallel STL or pSTL.
Utilizing the cross-platform parallelism capabilities of SYCL and the computational power of heterogeneous architectures, you may improve application performance by offloading Parallel STL algorithms to several devices (CPUs or GPUs) that support the SYCL programming framework. Multiarchitecture, accelerated parallel programming across heterogeneous hardware is made possible by the Intel oneAPI DPC++ Library (oneDPL), which allows you to offload Parallel STL code to SYCL devices.
The code example in this article will show how to offload C++ Parallel STL code to a SYCL device using the oneDPL pSTL_offload preview function.
Parallel API
As outlined in ISO/IEC 14882:2017 (commonly referred to as C++17) and C++20, the Parallel API in the Intel oneAPI DPC++ Library (oneDPL) implements the C++ standard algorithms with execution policies. It provides data-parallel execution of these algorithms on accelerators supported by SYCL in the Intel oneAPI DPC++/C++ Compiler, as well as threaded and SIMD execution on Intel processors, built on top of OpenMP and oneTBB.
The Parallel API also offers parallel range algorithms that accept an execution policy, extending the range algorithm capabilities introduced in C++20.
Furthermore, oneDPL offers specialized variants of several algorithms, such as:
Segmented reduction
Segmented scan
Vectorized search algorithms
Key-value pair sorting
Conditional transformation
Iterators and function object classes are part of the utility API. The iterators include counting and discard iterators, as well as permutation, zip, and transform iterators that operate on other iterators. The function object classes provide identity, minimum, and maximum operations that can be passed to reduction or transform algorithms.
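As a rough illustration of how these utility pieces combine with a device execution policy, the hedged sketch below reduces a generated sequence using oneDPL's counting iterator and maximum function object; the header, policy, and class names follow the public oneDPL documentation and should be verified against your installed oneDPL version.

```cpp
// Minimal sketch (not from the article): reducing 0..N-1 on a SYCL device with
// oneDPL's counting iterator and the maximum function object.
#include <oneapi/dpl/execution>
#include <oneapi/dpl/algorithm>
#include <oneapi/dpl/numeric>
#include <oneapi/dpl/iterator>
#include <oneapi/dpl/functional>
#include <iostream>

int main() {
    namespace dpl = oneapi::dpl;
    const int n = 1000;

    // counting_iterator generates 0, 1, 2, ... on the fly, so no host buffer is needed.
    dpl::counting_iterator<int> first(0), last(n);

    // dpcpp_default is oneDPL's device execution policy (runs on the default SYCL device).
    int sum = std::reduce(dpl::execution::dpcpp_default, first, last, 0);

    // A function object such as dpl::maximum can be supplied to the reduction.
    int max_val = std::reduce(dpl::execution::dpcpp_default, first, last, 0, dpl::maximum<int>());

    std::cout << "sum=" << sum << " max=" << max_val << "\n";
    return 0;
}
```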
An experimental implementation of asynchronous algorithms is also included in oneDPL.
Intel oneAPI DPC++ Library (oneDPL): An Overview
When used with the Intel oneAPI DPC++/C++ Compiler, oneDPL speeds up SYCL kernels for accelerated parallel programming on a variety of hardware accelerators and architectures. With the help of its Parallel API, which offers range-based algorithms, execution policies, and parallel extensions of C++ STL algorithms, C++ STL-style programs can be efficiently executed in parallel on multi-core CPUs and offloaded to GPUs.
It supports parallel computing libraries that developers are already acquainted with, such as Boost.Compute and Parallel STL. Its SYCL-specific API aids in GPU acceleration of SYCL kernels. In addition, you can use oneDPL's Device Selection API to dynamically assign available computing resources to your workload in accordance with pre-established device execution policies.
For simple, automatic CUDA to SYCL code conversion for multiarchitecture programming free from vendor lock-in, the library easily interfaces with the Intel DPC++ Compatibility Tool and its open equivalent, SYCLomatic.
About the Code Sample 
With just a few code modifications, the pSTL offload code example demonstrates how to offload common C++ parallel algorithms to SYCL devices (CPUs and GPUs). Using the -fsycl-pstl-offload option with the Intel oneAPI DPC++/C++ Compiler, it exploits an experimental oneDPL capability.
To perform data parallel computations on heterogeneous devices, the oneDPL Parallel API offers the following execution policies:
unseq for vectorized (SIMD) execution
par for parallel (multithreaded) execution
par_unseq, which combines the effects of the par and unseq policies
The following three programs/sub-samples make up the code sample:
FileWordCount uses C++17 parallel algorithms to count the words in a file,
WordCount counts how many words are generated, also using C++17 parallel algorithms, and
ParSTLTests implements various STL algorithms with the aforementioned execution policies (unseq, par, and par_unseq).
The code example shows how to use the -fsycl-pstl-offload compiler option and standard header inclusion in existing code to automatically offload STL algorithms invoked with the std::execution::par_unseq policy to a selected SYCL device.
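For context, the pattern relied on here is ordinary C++17 Parallel STL code, as in the hedged sketch below (not taken from the sample itself); the compile line in the comment is an assumption based on the option named above and should be checked against the oneDPL documentation.

```cpp
// Plain C++17 Parallel STL code -- no SYCL-specific constructs required.
// Assumed compile line (verify against the oneDPL documentation):
//   icpx -fsycl -fsycl-pstl-offload=gpu pstl_offload_demo.cpp
#include <algorithm>
#include <execution>
#include <numeric>
#include <vector>
#include <iostream>

int main() {
    std::vector<float> data(1'000'000, 1.0f);

    // Algorithms invoked with std::execution::par_unseq are the ones the
    // pSTL offload feature redirects to the chosen SYCL device.
    std::transform(std::execution::par_unseq, data.begin(), data.end(),
                   data.begin(), [](float x) { return x * 2.0f + 1.0f; });

    float sum = std::reduce(std::execution::par_unseq, data.begin(), data.end(), 0.0f);
    std::cout << "sum = " << sum << "\n";
    return 0;
}
```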
You may offload your SYCL or OpenMP code to a specialized computing resource or an accelerator (such as a CPU, GPU, or FPGA) by using specific device-selection environment variables offered by the oneAPI programming paradigm. One such environment variable is ONEAPI_DEVICE_SELECTOR, which restricts the selection of devices from among all the compute resources that may be used to run the code in SYCL- and OpenMP-based applications. Additionally, the variable enables the selection of sub-devices as separate execution devices.
The code example demonstrates how to use the ONEAPI_DEVICE_SELECTOR variable to offload the code to a selected target device; oneDPL is then used to implement the offloaded code. By default, if no device is specified through the variable, the code is offloaded to the default SYCL device.
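To confirm which device the selector resolved to at runtime, a small standard SYCL 2020 query such as the following sketch can be used; the ONEAPI_DEVICE_SELECTOR value shown in the comment is only an illustrative example.

```cpp
// Prints the device the default SYCL queue resolves to.
// Example run (value is illustrative): ONEAPI_DEVICE_SELECTOR=level_zero:gpu ./device_check
#include <sycl/sycl.hpp>
#include <iostream>

int main() {
    sycl::queue q;  // default selector; honors ONEAPI_DEVICE_SELECTOR when set
    std::cout << "Running on: "
              << q.get_device().get_info<sycl::info::device::name>() << "\n";
    return 0;
}
```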
The example shows how to offload STL code to an Intel Xeon CPU and an Intel Data Center GPU Max. However, offloading C++ STL code to any SYCL device may be done in the same way.
What Comes Next?
To speed up SYCL kernels on the newest CPUs, GPUs, and other accelerators, get started with oneDPL and examine oneDPL code examples right now!
For accelerated, multiarchitecture, high-performance parallel computing, we also encourage you to investigate other AI and HPC technologies built on the unified oneAPI programming paradigm.
Read more on govindhtech.com
govindhtech · 11 months ago
New NVIDIA L40S GPU-accelerated OCI Compute Instances
Oracle Cloud Infrastructure is Expanding NVIDIA GPU-Accelerated Instances for AI, Digital Twins, and Other Uses
In order to boost productivity, cut expenses, and spur creativity, businesses are quickly using generative  AI, large language models (LLMs), sophisticated visuals, and digital twins.
But for businesses to use these technologies effectively, they must have access to cutting-edge, full-stack accelerated computing systems. To meet this demand, Oracle Cloud Infrastructure (OCI) today announced the imminent release of a new virtual machine powered by a single NVIDIA H100 Tensor Core GPU, as well as the availability of NVIDIA L40S GPU bare-metal instances that can now be ordered. The new virtual machine complements OCI's existing H100 offering, which includes an NVIDIA HGX H100 8-GPU bare-metal instance.
These platforms offer strong performance and efficiency when combined with NVIDIA networking and the NVIDIA software stack, allowing businesses to enhance generative  AI.
You can now order the NVIDIA L40S GPU on OCI
Designed to provide innovative multi-workload acceleration for generative AI, graphics, and video applications, the NVIDIA L40S GPU is a universal data centre GPU. With its fourth-generation Tensor Cores and FP8 data format support, the L40S GPU is an excellent choice for inference in a variety of generative AI use cases, as well as for training and optimising small- to mid-size LLMs.
For Llama 3 8B with NVIDIA TensorRT-LLM at an input and output sequence length of 128, for instance, a single L40S GPU (FP8) may produce up to 1.4 times as many tokens per second as a single NVIDIA A100 Tensor Core GPU (FP16).
Additionally, the NVIDIA L40S GPU offers media acceleration and best-in-class graphics. It is perfect for digital twin and complex visualisation applications because of its numerous encode/decode engines and third-generation NVIDIA Ray Tracing Cores (RT Cores).
With support for NVIDIA DLSS 3, the L40S GPU offers up to 3.8 times the real-time ray-tracing capabilities of its predecessor, resulting in quicker rendering and smoother frame rates. Because of this, the GPU is perfect for creating apps on the NVIDIA Omniverse platform, which enables AI-enabled digital twins and real-time, lifelike 3D simulations. Businesses may create sophisticated 3D apps and workflows for industrial digitalization using Omniverse on the L40S GPU. These will enable them to design, simulate, and optimise facilities, processes, and products in real time before they go into production.
NVIDIA L40S 48GB
OCI's BM.GPU.L40S shape will include the L40S GPU. This bare-metal compute shape features four NVIDIA L40S GPUs, each with 48GB of GDDR6 memory, along with 1TB of system memory, 7.38TB of local NVMe SSD storage, and 112-core 4th generation Intel Xeon CPUs.
With OCI's bare-metal compute architecture, these shapes do away with the overhead of any virtualisation for high-throughput and latency-sensitive AI or machine learning workloads. By offloading data centre tasks from the CPUs, the NVIDIA BlueField-3 DPU in the accelerated compute shape improves server efficiency and speeds up networking, storage, and security workloads. By utilising BlueField-3 DPUs, OCI is advancing its off-box virtualisation approach across its whole fleet.
OCI Supercluster with NVIDIA L40S allows for ultra-high performance for up to 3,840 GPUs with minimal latency and 800Gbps internode bandwidth. NVIDIA ConnectX-7 NICs over RoCE v2 are used by OCI’s cluster network to handle workloads that are latency-sensitive and high throughput, such as  AI training.
"For 30% more efficient video encoding, we chose OCI AI infrastructure with bare-metal instances and NVIDIA L40S GPUs," stated Beamr Cloud CEO Sharon Carmel. Videos processed with Beamr Cloud on OCI will use 50% less network and storage traffic, resulting in two times faster file transfers and higher end-user productivity. Beamr will offer video AI workflows to OCI clients, getting them ready for the future of video.
OCI to Feature Single-GPU H100 VMs Soon
Soon to be available at OCI, the VM.GPU.H100.1 compute virtual machine shape is powered by a single NVIDIA H100 Tensor Core GPU. For businesses wishing to use the power of NVIDIA H100 GPUs for their generative  AI and HPC workloads, this will offer affordable, on-demand access.
A single H100 is a solid platform for LLM inference and smaller workloads. For instance, with NVIDIA TensorRT-LLM at an input and output sequence length of 128 and FP8 precision, a single H100 GPU can produce more than 27,000 tokens per second for Llama 3 8B (up to 4x greater throughput than a single A100 GPU at FP16 precision).
The VM.GPU.H100.1 shape is well-suited to a variety of AI workloads, with 13 cores of 4th Gen Intel Xeon processors, 246GB of system memory, and capacity for two 3.4TB NVMe drives.
"Oracle Cloud's bare-metal compute with NVIDIA H100 and A100 GPUs, low-latency Supercluster, and high-performance storage delivers up to 20% better price-performance for Altair's computational fluid dynamics and structural mechanics solvers," claimed Yeshwant Mummaneni, head engineer of data management analytics at Altair. "We are eager to use these GPUs in conjunction with virtual machines to power the Altair Unlimited virtual appliance."
GH200 Bare-Metal Instances Are Available for Validation
The BM.GPU.GH200 compute form is also available for customer testing from OCI. It has the NVIDIA Grace Hopper Superchip and NVLink-C2C, which connects the NVIDIA Grace CPU and NVIDIA Hopper GPU at 900GB/s with high bandwidth and cache coherence. With more than 600GB of RAM that is available, apps handling terabytes of data can operate up to 10 times faster than they would on an NVIDIA A100 GPU.
Software That’s Optimised for Enterprise AI
Businesses can speed up their  AI, HPC, and data analytics workloads on OCI with a range of NVIDIA GPUs. But an optimised software layer is necessary to fully realise the potential of these GPU-accelerated compute instances.
World-class generative AI applications may be deployed securely and reliably with the help of NVIDIA NIM, a set of user-friendly microservices that are part of the NVIDIA AI Enterprise software platform that is available on the OCI Marketplace. NVIDIA NIM is designed for high-performance AI model inference.
NIM pre-built containers, which are optimised for NVIDIA GPUs, give developers better security, a quicker time to market, and a lower total cost of ownership. The NVIDIA API Catalogue offers NIM microservices for common community models, which can be easily deployed on Oracle Cloud Infrastructure (OCI).
With the arrival of future GPU-accelerated instances, such as NVIDIA Blackwell and H200 Tensor Core GPUs, performance will only get better with time.
Contact OCI to test the GH200 Superchip and order the L40S GPU. Join Oracle and NVIDIA at SIGGRAPH, the world's preeminent graphics conference, taking place until August 1st, to find out more.
NVIDIA L40S Price
Priced at approximately $10,000 USD, the NVIDIA L40S GPU is intended for use in data centres and  AI tasks. It is an improved L40 that was created especially for AI applications rather than visualisation jobs. This GPU can be used for a variety of high-performance applications, including media acceleration, large language model (LLM) training, inference, and 3D graphics rendering. It is driven by NVIDIA’s Ada Lovelace architecture.
Read more on govindhtech.com
govindhtech · 1 year ago
Aurora Supercomputer Sets a New Record for AI Speed!
Intel Aurora Supercomputer
Together with Argonne National Laboratory and Hewlett Packard Enterprise (HPE), Intel announced at ISC High Performance 2024 that the Aurora supercomputer has broken the exascale barrier at 1.012 exaflops and is now the world's fastest AI system for open science, achieving 10.6 AI exaflops. Additionally, Intel will discuss how open ecosystems are essential to the advancement of AI-accelerated high performance computing (HPC).
Why This Is Important:
From the beginning, Aurora was intended to be an AI-centric system that would enable scientists to use generative AI models to hasten scientific discoveries. Early AI-driven research at Argonne has advanced significantly. Among the many achievements are the mapping of the 80 billion neurons in the human brain, the improvement of high-energy particle physics by deep learning, and the acceleration of drug discovery and design using machine learning.
Analysis
The Aurora supercomputer has 166 racks, 10,624 compute blades, 21,248 Intel Xeon CPU Max Series processors, and 63,744 Intel Data Centre GPU Max Series units, making it one of the world's largest GPU clusters. Its 84,992 HPE Slingshot fabric endpoints make up the largest open, Ethernet-based supercomputing interconnect on a single system.
The Aurora supercomputer crossed the exascale barrier at 1.012 exaflops using 9,234 nodes, just 87% of the system, and came in second on the high-performance LINPACK (HPL) benchmark. Aurora placed third on the HPCG benchmark at 5,612 TF/s with 39% of the machine. This benchmark aims to evaluate more realistic situations that offer insights into memory access and communication patterns, two crucial components of real-world HPC systems. It provides a fuller perspective of a system's capabilities, complementing benchmarks such as LINPACK.
How AI is Optimized
The Intel Data Centre GPU Max Series is the brains behind the Aurora supercomputer. The core of the Max Series is the Intel Xe GPU architecture, which includes specialised hardware such as matrix and vector compute blocks that are ideal for AI and HPC applications. Because of the unmatched computational performance provided by the Intel Xe architecture, the Aurora supercomputer won the high-performance LINPACK-mixed precision (HPL-MxP) benchmark, which best illustrates the significance of AI workloads in HPC.
The parallel processing power of the Xe architecture excels at handling the complex matrix-vector operations that are an essential part of neural network AI computing. These compute cores are key to speeding up the matrix operations that deep learning models rely on. In addition to a rich collection of performance libraries, optimised AI frameworks, and Intel's suite of software tools, which includes the Intel oneAPI DPC++/C++ Compiler, the Xe architecture supports an open ecosystem for developers that is distinguished by adaptability and scalability across a range of devices and form factors.
Enhancing Accelerated Computing with Open Software and Capacity
Intel will stress the value of oneAPI, which provides a consistent programming model across a variety of architectures. OneAPI, which is based on open standards, gives developers the freedom to write code that works across a range of hardware platforms without requiring significant changes or vendor lock-in. To overcome proprietary lock-in, Arm, Google, Intel, Qualcomm, and others are working towards this goal through the Linux Foundation's Unified Acceleration Foundation (UXL), which is creating an open environment for all accelerators and unified heterogeneous compute on open standards. The UXL Foundation is expanding its coalition by adding new members.
Meanwhile, Intel Tiber Developer Cloud is growing its compute capacity by adding new, cutting-edge hardware platforms and new service features that enable developers and businesses to evaluate the newest Intel architectures, rapidly innovate and optimise AI workloads and models, and then deploy AI models at scale. Large-scale Intel Gaudi 2-based and Intel Data Centre GPU Max Series-based clusters, as well as previews of Intel Xeon 6 E-core and P-core systems for select customers, are among the new hardware offerings. Intel Kubernetes Service for multiuser accounts and cloud-native AI training and inference workloads is one of the new features.
Next Up
Intel's commitment to advancing HPC and AI is demonstrated by the new supercomputers being deployed with Intel Xeon CPU Max Series and Intel Data Centre GPU Max Series technologies. The Italian National Agency for New Technologies, Energy and Sustainable Economic Development (ENEA) CRESCO 8 system will help advance fusion energy; the Texas Advanced Computing Centre (TACC) system is fully operational and will enable analyses ranging from biology to supersonic turbulence flows and atomistic simulations of a wide range of materials; and the United Kingdom Atomic Energy Authority (UKAEA) will solve memory-bound problems that underpin the design of future fusion power plants. These systems also include the Euro-Mediterranean Centre on Climate Change (CMCC) Cassandra climate change modelling system.
The results of the mixed-precision AI benchmark will serve as the foundation for Falcon Shores, Intel's next-generation GPU for AI and HPC. Falcon Shores will combine the best features of Intel Gaudi with the next-generation Intel Xe architecture, an integration that enables a single programming interface.
In comparison to the previous generation, early performance results on the Intel Xeon 6 with P-cores and Multiplexer Combined Ranks (MCR) memory at 8800 megatransfers per second (MT/s) deliver up to 2.3x performance improvement for real-world HPC applications, such as Nucleus for European Modelling of the Ocean (NEMO). This solidifies the chip’s position as the host CPU of choice for HPC solutions.
Read more on govindhtech.com
govindhtech · 2 years ago
Intel Cloud Optimization Enhances AWS AI
Intel Cloud Optimization on AWS
Because it provides infrastructure and scalability, cloud computing is often used to create and operate large AI systems. Amazon Web Services (AWS), one of the largest and most prominent CSPs, offers hundreds of services for building any cloud application. The platform's purpose-built databases and tools for AI and machine learning let developers and enterprises innovate faster, more cheaply, and more agilely.
Developers may accelerate their innovation on popular hardware technologies and further boost model efficiency by utilizing pre-built optimizations and tools for a wide range of applications and use cases on AWS. It can take a lot of time and resources to find and implement the best tools and optimizations for your project. The pain of adding additional architectures to code can be mitigated for developers by providing comprehensive documentation and guides that make the implementation of these optimizations simple.
Intel Cloud Optimization Modules: What Are They?
Intel Cloud Optimization Modules are a set of cloud-native, open-source reference architectures designed with production AI developers in mind. They further optimize the potential of cloud-based solutions that easily connect with AI workloads. These modules enable developers to apply AI solutions that are optimized for Intel processors and GPUs, thereby increasing workload efficiency and achieving peak performance.
With specially designed tools to complement and enrich the cloud experience on AWS with pertinent codified Intel AI software optimizations, the cloud optimization modules are accessible for well-known cloud platforms like AWS. With end-to-end AI software and optimizations for a range of use cases, including computer vision and natural language processing, these optimizations provide numerous important advantages for driving AI solutions.
Every module has a content bundle that contains a whitepaper with additional details on the module and its contents as well as the open-source GitHub repository with all of the documentation. The content packages also include a cheat sheet that lists the most pertinent code for each module, a video series, practical implementation walkthroughs, and the opportunity to attend office hours if you have any special implementation-related issues.
Intel Cloud Optimization Modules for the AWS Cloud
AWS users can choose from a number of Intel Cloud Optimization Modules, which include optimizations for popular AWS tools like SageMaker and Amazon Elastic Kubernetes Service. You can learn more about the various AWS optimization modules below:
GPT2-Small Distributed Training
Generative pre-trained transformer (GPT) models are widely used across a range of fields as GenAI applications. Since compact models are easier to build and deploy, a smaller large language model (LLM) is often sufficient for many use cases. This module shows developers how to optimize a GPT2-small (124M parameter) model for high-performance distributed training on an AWS cluster of Intel Xeon CPUs.
Using software optimizations and frameworks such as the Intel Extension for PyTorch and the oneAPI Collective Communications Library (oneCCL) to speed up the process and improve model performance in an efficient multi-node training environment, the module walks through the whole lifecycle of fine-tuning an LLM on a configured AWS cluster. The end result is an LLM on AWS that generates text tuned to your particular task and dataset.
SageMaker with XGBoost
Amazon SageMaker, a popular tool for creating, training, and deploying machine learning applications on AWS, comes with built-in Jupyter notebook instances and commonly used, optimized machine learning algorithms for faster model building. Working through this module will teach you how to activate the Intel AI Tools for accelerated models and inject your own training and inference code into a prebuilt SageMaker pipeline. This module accelerates an end-to-end custom machine learning pipeline on SageMaker by leveraging the Intel Optimization for XGBoost. The Lambda container has all the parts needed to create custom AWS Lambda functions with XGBoost and Intel oneDAL optimizations, while the XGBoost oneDAL container comes with the oneAPI Data Analytics Library to speed up model algorithms.
XGBoost within Kubernetes
Amazon Elastic Kubernetes Service (EKS), an automatically managed service, makes it simple for developers to launch, operate, and scale Kubernetes applications on AWS. Using EKS and the Intel AI Tools, this module makes it easier for developers to create and launch accelerated AI applications on AWS. With Intel oneDAL optimizations, developers can learn how to construct an accelerated Kubernetes cluster that uses the Intel Optimization for XGBoost for AI workloads. The module makes use of Elastic Load Balancer (ELB), Amazon Elastic Container Registry (ECR), and Amazon Elastic Compute Cloud (EC2) in addition to EKS.
Use Intel Cloud Optimization Modules to improve your AI projects on AWS by leveraging Intel optimizations and containers for widely used tools. To further your projects, you can learn how to use strong software optimizations and construct accelerated models on your preferred AWS tools and services. Use these modules to maximize the potential of your AWS projects, and register for office hours if you have any inquiries concerning implementation!
We invite you to explore Intel's additional AI Tools and framework optimizations and to discover the oneAPI programming paradigm, a unified, open, standards-based framework that serves as the basis for Intel's AI Software Portfolio. Additionally, visit the Intel Developer Cloud to try the newest AI-optimized software and hardware to help create and deploy your next cutting-edge AI projects!
Read more on Govindhtech.com
govindhtech · 7 months ago
Intel Data Direct I/O Performance With Intel VTune Profiler
Improve Intel Data Direct I/O (DDIO) Workload Performance with Intel VTune Profiler.
Profile uncore hardware performance events in Intel Xeon processors with oneAPI
One hardware feature included in Intel Xeon CPUs is Intel Data Direct I/O (DDIO) technology. By making the CPU cache the primary point of entry and exit for I/O data going into and out of the Intel Ethernet controllers and adapters, it contributes to advances in I/O performance.
To monitor the effectiveness of DDIO and Intel Virtualization Technology (Intel VT) for Directed I/O (Intel VT-d), which permits the independent execution of several operating systems and applications, it is essential to monitor uncore events, or events that take place outside the CPU core. By analyzing uncore hardware events, you can improve the performance of Intel Data Direct I/O (DDIO) workloads using Intel VTune Profiler, a performance analysis and debugging tool powered by oneAPI.
We’ll talk about using VTune Profiler to evaluate and enhance directed I/O performance in this blog. Let’s take a quick look at Intel Data Direct I/O technology before we go into the profiling approach.
Overview of the Intel Data Direct I/O (DDIO) Technology
Intel DDIO, part of Intel Integrated I/O technology, was launched in 2012 with the Intel Xeon processor E5 and E7 v2 generations. It aims to increase system-level I/O performance by employing a new processor-to-I/O data flow.
Before the development of Data Direct I/O technology, I/O operations were sluggish and processor cache was a scarce resource. Any data arriving from, or departing to, an Ethernet controller or adapter had to be stored in and retrieved from the host processor's main memory, and the data then had to be moved from main memory into the cache before it could be worked on.
This led to a lot of read and write operations in the memory. This also caused some additional, speculative read operations from the I/O hub in some of the older designs. Excessive memory accesses often lead to higher system power consumption and deterioration of I/O performance.
Intel DDIO technology was created to rearrange the flow of I/O data by making the processor cache the primary source and destination of I/O data instead of the main memory, as the processor cache is no longer a restricted resource.
Depending on the kind of workload at the workstation or on the server, the DDIO approach offers benefits like:
Higher transaction rates, reduced power consumption, reduced latency, increased bandwidth, and more.
There is no industry enablement needed for the Data Direct I/O technology.
It doesn’t rely on any hardware, and it doesn’t need any modifications to your operating system, drivers, or software.
Boost DDIO Performance Using Intel VTune Profiler
An uncore event is a function carried out in a CPU's uncore section, outside the processor core itself, that nevertheless affects processor performance as a whole. For instance, these events may be connected to activity in the Intel Ultra Path Interconnect (UPI) block, the memory controller, or the I/O stack.
A new recipe in the VTune Profiler Cookbook explains how to count these kinds of uncore hardware events using the tool's Input and Output analysis feature. You can use the resulting data to better understand Peripheral Component Interconnect Express (PCIe) traffic and behavior, and to analyze Data Direct I/O and VT-d efficiency.
The recipe explains how to run the Input and Output analysis, evaluate the findings, and interpret the resulting I/O metrics. In essence, VTune Profiler v2023.2 or later and a first-generation (or later) Intel Xeon Scalable CPU are needed. Although the approach applies to the most recent Intel Xeon processors, the I/O metrics and events covered in the recipe are based on the third-generation Intel Xeon Scalable processor.
Perform I/O Analysis with VTune Profiler
Start by analyzing your application's input and output using VTune Profiler. With the analysis feature, you can examine CPU, bus, and I/O subsystem use through a variety of platform-level metrics. You can get data indicating Intel Data Direct I/O (DDIO) use efficiency by turning on the PCIe traffic analysis option.
Analyze the I/O Metrics
VTune Profiler Web Server or VTune Profiler GUI may be used to examine the report that is produced as a consequence of the input and output analysis. Using the VTune Profiler Web Server Interface, the recipe illustrates the examination of many I/O performance indicators, including:
A platform diagram showing utilization of the physical cores, DRAM, PCIe, and Intel UPI links.
PCIe Traffic Summary, which includes metrics for both outgoing (caused by the CPU) and incoming (caused by I/O devices) PCIe traffic.
These measurements aid in the computation of CPU/IO conflicts, latency for incoming read/write requests, PCIe bandwidth and efficient use, and other factors.
Metrics to assess the workload’s effectiveness in re-mapping incoming I/O device memory locations to various host addresses using Intel VT-d technology.
Usage of DRAM and UPI bandwidth.
Read more on Govindhtech.com
govindhtech · 11 months ago
Next-Gen Computing: Exploring the Dell PowerEdge XR8000
The Dell PowerEdge XR8000 is your edge hero: built for simplicity, efficiency, and flexibility, it can help you realize your edge computing vision.
The Dell PowerEdge XR8000 is a game-changer, allowing for the seamless integration of artificial intelligence (AI), User Plane Function (UPF) and Multi-access Edge Computing (MEC) to enable a multitude of functionality at the edge. For applications like autonomous vehicles, smart cities, and industrial automation, the XR8000’s MEC reduces latency and improves user experience by bringing processing capacity closer to the data source.
Because of its AI capabilities, enterprises may implement machine learning models, inferencing, and intelligent analytics right at the edge, resulting in operational efficiency and real-time decision-making. It can facilitate data traffic control for UPF workloads in 5G networks, enhancing network dependability and performance.
Combining these capabilities makes the Dell PowerEdge XR8000 a vital tool for businesses looking to remain ahead of the rapidly changing digital landscape. It provides a reliable solution that can be tailored to even the most demanding edge computing environments and is future-proof.
Multi-Access Edge Computing (MEC)
By using MEC and bringing processing capability closer to data creation, communications service providers (CSPs) can boost IoT applications and real-time analytics for enterprises. This deliberate move takes advantage of 5G’s low latency and high bandwidth and establishes new revenue streams by offering cutting-edge solutions that drive digital transformation in multiple industries.
STL expects the MEC addressable market will expand 48% to $445 billion by 2030. MEC benefits business, public utilities, gaming and entertainment, and healthcare with its many applications.
Collaboration amongst several ecosystem participants is necessary for the implementation of MEC, including CSPs, infrastructure providers, and third-party application providers. The efficiency of the MEC hardware at the edge and the third-party MEC apps that are essential for particular industrial verticals determine the success of a MEC solution. The fact that Dell Technologies is an authority in the business sector is a plus.
Dell PowerEdge XR8000 provides a computational infrastructure for the MEC platform that may be utilized to host the MEC applications thanks to its distinctive sled-based architecture. Better ROI for consumers is made possible by its support for L4 GPUs, best-in-class Network Interface Cards, and 12-year warranty after purchase. Gaming, video surveillance, and content delivery networks are a few of the main uses.
The Dell PowerEdge XR8000 fulfils every criteria a provider might have in terms of hardware to meet MEC regulations. Because of its small depth and ruggedized design (NEBS level 3 Certification), this platform may be installed in an edge environment with confidence. Dense computation, ease of deployment, and a safe cyber platform for client data at the edge are all features of the XR8000.
Artificial intelligence (AI)
 AI’s growing needs in the telecom industry highlight the need for edge computing solutions strengthened by more powerful GPUs, more cores, and more thermal design power (TDP). With a projected size of $20.39 billion in 2023 and a projected growth rate of 27.5% from 2024 to 2032, the worldwide edge AI market is expected to reach $186.44 billion by 2032.
The buzz around the newest AI capabilities for telecom companies, which promise significantly enhanced operations and open doors to new services, needs to be balanced with the need for a server platform built for AI in telecom networks.
Edge computing and artificial intelligence are two new technologies that are combined to create  AI at the edge. AI provides business intelligence to the processed data for business insights, and edge computing assists in processing data at the edge.
Because there is no need to send the data back to the core, AI at the edge offers amazing benefits like reduced latency, more security, and cheaper operating costs in addition to greater bandwidth efficiency. Less data transmission volume to the cloud and real-time data processing while preserving data security and integrity are further advantages.
The latest Intel Xeon CPUs are supported by the Dell PowerEdge XR8000, which is a ruggedised, AI-capable server for the edge thanks to its support for NVIDIA L4 GPUs. It is a processing powerhouse for AI and GenAI that can support up to six L4 GPUs in a 2U form factor, which greatly enhances computer vision, inference performance, and data analytics.
The ability of the Dell PowerEdge XR8000 for AI to handle several AI workloads on a single chassis is what sets it apart from the competition and allows CSPs to diversify their deployment to diverse AI telecom workloads while also improving return on investment. Because of its flexible, compute-dense sled architecture, CSPs will be able to quickly enable new AI capabilities and confidently and easily deploy solutions.
As an AI server, the Dell PowerEdge XR8000 can be used to target the automotive, manufacturing, healthcare, energy, and telecom industries.
User Plane Function (UPF)
5G brings numerous new services and performs faster, more reliably, and with lower latency than 4G deployments. The separation of the 5G core into control and user planes (CUPS) enables CSPs to deploy the UPF at different locations and on different platforms, even though the RAN plays a crucial role in helping 5G achieve its goals.
By taking advantage of this, a Distributed User Plane Function (D-UPF) allows CSPs to locate the UPF close to the edge, where data is created. This will lower backhaul networking costs for CSPs, allow them to diversify revenue streams, and let them charge more for differentiated services.
For optimal performance, the UPF should be hosted on a commercially available off-the-shelf (COTS) platform, which can take use of cloudification and virtualization. A hardware platform that has been ruggedized for the edge is necessary for the deployment of edge UPF. The Dell PowerEdge XR8000 platform is a NEBS level 3 certified system that is highly suitable for D-UPF due to its temperature tolerance range of -20 to 65 degrees Celsius.
The hot pluggable sled-based architecture of the PowerEdge XR8000 provides redundancy in both power and computing. CSPs choose it as their preferred platform for D-UPF.
Championing technology, the Dell PowerEdge XR8000 is the most optimized edge server platform available. By putting processing capacity closer to the data source, lowering latency, and enhancing real-time processing, it strengthens MEC. Because of its strong architecture, which can withstand powerful  AI and GenAI capabilities, it enables intelligent data analysis and edge decision-making, sparking innovation in a variety of industries.
The Dell PowerEdge XR8000 guarantees smooth data routing and network traffic management for UPF, which is crucial for 5G installations. Discover the Dell PowerEdge XR8000 hero and open up new revenue opportunities at the edge.
Read more on govindhtech.com
govindhtech · 1 year ago
IBM Cloud Bare Metal Servers for VPCs Use 4th Gen Intel Xeon
The range of IBM  Cloud Bare Metal Servers for Virtual Private Clouds is being shaken up by new 4th Gen Intel Xeon processors and dynamic network bandwidth.
IBM is thrilled to announce that fourth-generation Intel Xeon CPUs are now available on IBM Cloud Bare Metal Servers for Virtual Private Cloud. IBM customers can now provision Intel's most recent microarchitecture within their own virtual private cloud, giving them access to a variety of performance benefits, such as increased core-to-memory ratios (21 new server profiles) and dynamic network bandwidth that is only available through IBM Cloud VPC. For those keeping track, that is three times as many provisioning options as the current second-generation Intel Xeon CPUs. Take a look around.
Are these bare metal servers suitable for my needs?
In addition to offering rapid provisioning, excellent network speeds, and the most secure software-defined resources available within IBM, IBM Cloud Bare Metal Servers for Virtual Private Cloud are hosted on IBM's most recent and most developer-friendly platform. Every one of your CPUs would be based on 4th Gen Intel Xeon processors, which IBM first introduced on IBM Cloud Bare Metal Servers for classic infrastructure in conjunction with Intel's day-one product release.
IBM Cloud Virtual Private Cloud is distinct from the classic IBM Cloud infrastructure. The classic infrastructure is better suited to large, steady-state, predictable workloads that call for the highest possible level of customisation, whereas IBM Cloud Virtual Private Cloud is an excellent solution for high-availability and maximum-elasticity requirements. Take a look at this brief introduction video to get a better understanding of which environment best suits your workload requirements.
If IBM Cloud Bare Metal Servers for Virtual Private Cloud turn out to be your preferred choice, the customisation options available to you include five pre-set profile families, which determine your number of CPU instances, RAM, and bandwidth. What sets IBM Cloud apart from other cloud services is that each profile provides DDR5 memory and dynamic network bandwidth ranging from 10 to 200 Gbps. Compute profiles are the most effective solution for tasks that require a significant amount of CPU power, such as heavy web traffic operations, production batch processing, and front-end web servers.
Balanced profiles are designed to provide a combination of performance and scalability, making them a great choice for databases of a moderate size and cloud applications that experience moderate traffic.
Memory profiles are most effective when applied to workloads that require a significant amount of memory, such as large cache and database applications, as well as in-memory analytics.
When it comes to running small to medium in-memory databases and OLAP, such as SAP BW/4 HANA, very high profiles are the most effective solutions.
Large in-memory databases and online transaction processing workloads are both excellent for ultra-high profiles because they offer the most memory per core.
For these bare metal servers, what kinds of workloads do you propose they handle?
Over the course of this year, IBM’s beta programme was exposed to a wide variety of workloads; nonetheless, there were a few noteworthy success stories that particularly stood out:
Building on top of IBM Cloud with VMware Cloud Foundation: these workloads required high core performance, interoperability with VMware, licence portability, a smaller core-count variety, and the Generic operating system option that IBM recently launched. In a dedicated location, they conducted tests for both VMware-managed VMware Cloud Foundation (VCF) and build-your-own VCF deployments.
They were happy with the customisation freedom and benchmark performance enhancements that backed up their findings. During the second half of the year, these workloads will be accessible on Intel Xeon profiles of the fourth generation within the IBM  Cloud Virtual Private Cloud.
With regard to HPCaaS, this workload was one of a kind, and IBM believes it is a primary use case for this distribution. Terraform and IBM Storage Scale were used in their tests to see whether improved performance could be achieved, and they were delighted with the throughput improvement and the agile provisioning experience across platforms and networking.
The task of providing financial services and banking necessitated both powerful and dedicated system performance, as well as the highest possible level of security and compliance. After conducting tests to determine capacity expansion, user interface experience, security controls, and security management, they were thrilled to find that production times had been reduced.
Beginning the process
Bare metal servers powered by 4th Gen Intel Xeon processors are currently available in IBM Cloud's Dallas, Texas data centres, with additional sites to be added in the second half of 2024. The IBM Cloud Bare Metal Servers for Virtual Private Cloud catalogue allows you to view all of the pricing and provisioning options for the new 4th Gen Intel Xeon processors and save a quote to your account. As an alternative, you could start a chat and get answers right now. You can find more information in the getting started guides and tutorials in the IBM Cloud docs.
Spend one thousand dollars in IBM Cloud credits
If you are an existing customer who is interested in provisioning new workloads or if you are inquisitive about deploying your first workload on IBM Cloud VPC, then you should be sure to take advantage of their limited time promotion for IBM Cloud VPC. By entering the promotional code VPC1000 within either the bare metal or virtual server catalogues, you will receive USD 1,000 in credits that may be used towards the purchase of your new virtual private cloud (VPC) resources. These resources include computing, network, and storage components. Only profiles based on the second generation of Intel Xeon processors and profiles from earlier generations are eligible for this promotion, which is only available for a limited period.
Read more on Govindhtech.com