#Hyperdisk
Explore tagged Tumblr posts
Text
Principal Advantages Of The Storage Pool + Hyperdisk On GKE

Do you want to pay less for GKE block storage? Storage Pools for Hyperdisk may help. Whether you’re managing GKE clusters, conventional virtual machines, or both, it’s critical to automate as many of your operational chores as you can in an economical way.
Storage Pools
Hyperdisk Storage Pools are a pre-purchased collection of capacity, throughput, and IOPS that you can then supply to your applications as required. Hyperdisk is a next-generation network-attached block storage solution. When you place Hyperdisk block storage disks in storage pools, you can optimize operations and costs by sharing capacity and performance across all the disks in the pool. Hyperdisk Storage Pools may reduce your storage-related Total Cost of Ownership (TCO) by up to 30–50%, and as of Google Kubernetes Engine (GKE) 1.29.2, they can be used on GKE!
Thin provisioning in Storage Pools makes this feasible by consuming the capacity allocated inside the pool only when data is written, not when pool disks are provisioned. Rather than provisioning each disk for peak demand regardless of whether it ever experiences that load, you buy capacity, IOPS, and throughput at the pool level, and the disks in the pool consume them on an as-needed basis, enabling you to share resources as needed.
Why is Hyperdisk used?
Hyperdisk, the next generation of Google Cloud persistent block storage, differs from conventional Persistent Disk in that it lets you control throughput and IOPS in addition to capacity. Additionally, even after the disks are first configured, you can adjust their performance to match your specific application requirements, eliminating extra capacity and enabling cost savings.
How about Storage Pool?
Storage Pools, in contrast, let you share a thinly provisioned pool of capacity across many Hyperdisks that are all located in the same zone of a single project; this is an “Advanced Capacity” Storage Pool. Rather than paying for the capacity provisioned on each disk, you buy the pool’s capacity up front and consume it only for data that is actually written. Throughput and IOPS can be shared in a similar manner in a storage pool referred to as “Advanced Capacity & Advanced Performance.”
Combining Hyperdisk with Storage Pools reduces the total cost of ownership (TCO) for block storage by shifting management from the disk level to the pool level, where all disks within the pool absorb changes. A Storage Pool is a zonal resource with a minimum capacity of 10 TB, and it holds Hyperdisks of the matching type (Throughput or Balanced).
Storage Pool + Hyperdisk on GKE
As of GKE 1.29.2, Hyperdisk Balanced boot disks and Hyperdisk Balanced or Hyperdisk Throughput attached disks can be created on GKE nodes within a Storage Pool.
Imagine you have a demanding stateful application running in us-central1-a and you want to be able to tune performance to suit the workload. You decide to use Hyperdisk Balanced for the workload’s block storage. Instead of trying to right-size each disk in your application, you use a Hyperdisk Balanced Advanced Capacity, Advanced Performance Storage Pool and pay for the capacity and performance up front.
Pool performance is consumed when the disks in the storage pool see an increase in IOPS or throughput, while pool capacity is consumed only when your application writes data to the disks. The Storage Pool(s) must be created before the Hyperdisks inside them.
Use the following gcloud command to create an Advanced Capacity, Advanced Performance Storage Pool:

gcloud compute storage-pools create pool-us-central1-a \
    --provisioned-capacity=10tb \
    --storage-pool-type=hyperdisk-balanced \
    --zone=us-central1-a \
    --project=my-project-id \
    --capacity-provisioning-type=advanced \
    --performance-provisioning-type=advanced \
    --provisioned-iops=10000 \
    --provisioned-throughput=1024
The Google Cloud console may also be used to create Storage Pools.
If your GKE nodes use Hyperdisk Balanced as their boot disks, you can also provision the node boot disks in the storage pool. This can be configured at cluster or node-pool creation, as well as during node-pool updates. You can use the Google Cloud console or a gcloud command like the one below to provision your Hyperdisk Balanced node boot disks in your Storage Pool at cluster creation. Keep in mind that your Storage Pool has to be created in the same zone as your cluster and that the machine type of the nodes needs to support Hyperdisk Balanced.
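A minimal sketch of such a cluster-creation command, assuming the --storage-pools flag available in recent gcloud releases; the cluster name, machine type, and storage pool resource path are illustrative and should be adapted to your project:

gcloud container clusters create my-cluster \
    --zone=us-central1-a \
    --node-locations=us-central1-a \
    --machine-type=c3-standard-4 \
    --disk-type=hyperdisk-balanced \
    --storage-pools=projects/my-project-id/zones/us-central1-a/storagePools/pool-us-central1-a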
To deploy the Hyperdisk Balanced disks that your stateful application uses inside the Storage Pool, you must specify the pool with the storage-pools StorageClass parameter. A PersistentVolumeClaim (PVC) that references the StorageClass then provisions the Hyperdisk Balanced volume your application will use.
The provisioned-throughput-on-create and provisioned-iops-on-create parameters of the StorageClass are optional. If they are left unset, the volume defaults to 3000 IOPS and 140Mi of throughput. Only IOPS and throughput above these baseline values are drawn from the Storage Pool.
The allowed IOPS and throughput values vary based on the size of the disk.
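Here is a minimal sketch of such a StorageClass, assuming the pd.csi.storage.gke.io CSI provisioner and the pool created above; the 4000 IOPS and 180Mi throughput values are illustrative and were chosen so that the pool draw matches the figures quoted below:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: storage-pools-sc
provisioner: pd.csi.storage.gke.io
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
parameters:
  type: hyperdisk-balanced
  provisioned-throughput-on-create: "180Mi"
  provisioned-iops-on-create: "4000"
  storage-pools: projects/my-project-id/zones/us-central1-a/storagePools/pool-us-central1-a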
Volumes provisioned with this StorageClass will draw only 40 MiB of throughput and 1000 IOPS from the Storage Pool.
Next, create a PVC that references the storage-pools-sc StorageClass.
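A minimal PVC sketch; the claim name and requested size are illustrative:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-data-pvc
spec:
  storageClassName: storage-pools-sc
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 2Ti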
The storage-pools-sc StorageClass uses volumeBindingMode: WaitForFirstConsumer, so the PersistentVolume is only bound and provisioned when a Pod that uses the PVC is created.
Finally, use the PVC above to attach these Hyperdisk volumes to your stateful application. Your application must be scheduled to a node pool whose machines can attach Hyperdisk Balanced.
The Postgres Deployment uses a nodeSelector to make sure that Pods are scheduled to nodes that support attaching Hyperdisk Balanced, that is, C3 machine types.
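A sketch of the relevant part of such a Deployment’s Pod template, assuming the standard cloud.google.com/machine-family node label and the PVC above; the image and names are illustrative:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
spec:
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      nodeSelector:
        cloud.google.com/machine-family: c3   # only C3 nodes, which can attach Hyperdisk Balanced
      containers:
      - name: postgres
        image: postgres:16
        volumeMounts:
        - name: postgres-data
          mountPath: /var/lib/postgresql/data
      volumes:
      - name: postgres-data
        persistentVolumeClaim:
          claimName: postgres-data-pvc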
You should now be able to see your Hyperdisk Balanced volume deployed in your Storage Pool.
Next actions
You can maximize storage cost savings and efficiency for your stateful applications by using a Storage Pools + Hyperdisk approach on GKE.
Read more on Govindhtech.com
#StoragePool#GKEclusters#HyperdiskStoragePools#Hyperdisk#GoogleCloud#storage#StorageClass#news#technews#technology#technologynews#technologytrends#govindhtech
Text
HYPERDISK v1.0
High Quality Music Player
Available in two colors
- Icicle White ❄
- Cherry Red 🍒
[More colors coming soon]
Some more fake Y2K advertising. Hope you enjoy :)
Photo

UK 1995
Link
Google Compute Engine C3 Virtual Machines (VMs) and Hyperdisk are two products the Google Cloud Platform offers. Google Compute Engine C3 VMs are cloud-based virtual machines that offer high-performance virtual computing power, allowing users to run large-scale applications. Hyperdisk is a highly available, high-performance, low-latency block storage system that offers persistent storage for Google Compute Engine instances. It is designed for applications that require high availability and scalability.
Text
HyperDisk, a small 45-gram external SSD that is on Kickstarter
All the details at: https://hardwaresfera.com/noticias/perifericos/hyperdisk-un-pequeno-ssd-externo-de-45-gramos-que-esta-en-kickstarter/
SSDs are currently at a really affordable price, partly because NAND Flash memory has come down in price. The smartphone fever has passed, and that benefits us all. This is allowing other very interesting solutions to be developed, such as the HyperDisk. This extremely compact SSD drive is …

Text
Reduce the Google Compute Engine Cost with 5 Tricks

Google Compute Engine Cost
Compute Engine provides several options for cutting expenses, such as optimising your infrastructure and utilising discounts. Google Cloud is sharing some useful advice to help you reduce your Google Compute Engine cost in this two-part blog post. This guide has something for everyone, whether you work for a huge organisation trying to optimise its budget or a small business just getting started with cloud computing.
Examine your present budgetary plan
Before you embark on a journey to optimise your Google Compute Engine cost, it helps to have a map of your present circumstances and spending structure, so you can make well-informed decisions about your next course of action. That map is the billing panel in the Google Cloud console. It gives you a detailed breakdown of your spending, tracking each expense to a specific SKU. You can use it to examine the overall financial picture of your company and to determine how much using a given product costs for a given project.
By taking a closer look at your spending, you can find resources you are still paying for but no longer require. After all, nothing is a better way to save money than simply not spending it.
Examine the automated suggestions
On the page where your virtual machines are listed, have you noticed the lightbulbs next to some of your machines? These are Google Cloud’s automated suggestions for things you could do to cut costs. Recommendation Hub, a newer capability, covers the following categories: cost, security, performance, reliability, management, and sustainability. Based on its understanding of your fleet structure, the recommendation system can suggest actions for you to consider. Google Cloud’s main objective is to help you cut costs without sacrificing fleet performance.
A machine can be scaled down according to its utilisation, or its machine type can be changed (for example, from N1 to E2). When you click on one of the recommendations, you get a summary of the proposed modification along with the expected cost savings. You can then choose whether or not to apply the modification. Recall that the instance must be restarted for the modification to take effect.
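If you prefer the command line, the same machine-type recommendations can also be listed through the Recommender service; a sketch, assuming the google.compute.instance.MachineTypeRecommender recommender ID and illustrative project and zone values:

gcloud recommender recommendations list \
    --project=my-project-id \
    --location=us-central1-a \
    --recommender=google.compute.instance.MachineTypeRecommender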
Check the types of disks you have
You must attach at least one persistent disk to each virtual machine in your fleet. Google Cloud offers a variety of disk types with varying features and performance. The available types are:
Hyperdisk
With a full range of data durability and administration features, Hyperdisk is a scalable, high-performance storage solution built for the most demanding mission-critical applications.
Hyperdisk Storage Pools
Hyperdisk Storage Pools are pre-aggregated pools of capacity, throughput, and IOPS that you can reserve in advance and allocate to your apps as required.
Persistent Disk
Persistent Disk is the default storage option for your virtual machines. It can be zonal or regional and comes in four variations:
Standard
The cloud equivalent of a desktop HDD. It offers the least expensive storage, with slower I/O speed.
SSD
A speed-focused option with excellent I/O performance, albeit at a higher cost per gigabyte.
Balanced
The default for newly created compute instances; it strikes a balance between Standard and SSD.
Extreme
Suitable for the most demanding workloads. It lets you configure the disk’s IOPS in addition to its size.
Local SSD
A Local SSD is physically attached to the host that runs your virtual machine. It is extremely fast but ephemeral.
Since Persistent Disk is the most widely used type of storage, let’s concentrate on it. The Balanced disk type, which offers a decent compromise between performance and cost, is the default when creating a new virtual machine. Although this works well in many situations, it might not be the ideal choice in every one.
Fast disk I/O is not needed, for instance, by stateless apps that are part of auto-scaling deployments and keep all relevant data in an external cache or database. These apps are excellent candidates for switching to Standard disks, which, depending on the region, can be up to three times cheaper per gigabyte than Balanced disks.
You can obtain a list of the disks used in your project with:

gcloud compute disks list \
    --format="table(name, type, zone, sizeGb, users)"
To change a disk’s type, you must clone the disk to the new type and then update the virtual machines that use it so they use the new disk.
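A sketch of that clone step, with illustrative disk names and zone; once the new disk exists, detach the old disk from the VM and attach the new one in its place:

gcloud compute disks create web-data-standard \
    --zone=us-central1-a \
    --type=pd-standard \
    --source-disk=web-data-balanced \
    --source-disk-zone=us-central1-a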
Free up any unused disk space
Moving on to storage, disk type is not the only factor that influences price: you should also consider how much disk utilisation affects your budget. You will be billed for the full 100 GB of persistent disk space allocated for your project, whether you use 20%, 70%, or 100% of it. Even if your application does not use Persistent Disks for data storage, you may still want to monitor your boot disks closely.
Unless your stateless programme really needs a disk with many gigabytes of free space, think about reducing the size of your disks to match your actual needs. Because they enjoy round numbers, people frequently create 20 GB disks even when they only require 12 GB. Save money and act more like a machine.
Commit to using committed use discounts (CUDs)
This advice applies to more than just Compute Engine. If you can commit to using a specific number of virtual machines for a year, or for three or more years, you can receive a significant discount! A range of committed use discounts (CUDs) gives you substantially lower costs for local SSDs, GPUs, vCPUs, memory, sole-tenant nodes, and software licences. With Flex CUDs, you are not even limited to allocating your vCPU and memory to a particular project, region, or machine series.
Committed use discounts are offered on a number of Google Cloud products. If you’re satisfied with Google Cloud and have no intention of switching providers anytime soon, you should seriously consider using CUDs whenever you can to save a lot of money. For Compute Engine, you can buy CUDs straight from the Google Cloud console.
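A sketch of a commitment purchase from the command line; the commitment name, region, and resource amounts are illustrative, and the flag syntax may differ slightly between gcloud versions, so verify it before running:

gcloud compute commitments create my-12-month-commitment \
    --plan=12-month \
    --region=us-central1 \
    --resources=vcpu=16,memory=64GB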
Read more on govindhtech.com
#GoogleComputeEngine#CPU#GoogleCloud#CloudComputing#hyperdisk#SSDs#GPU#VCPU#ComputeEngine#news#technews#technology#technologynews#technologytrends#govindhtech
Text
HyperDisk Kickstarter: An Affordable 1,000 MB/s Pocket-Sized SSD Drive
HyperDisk has a strong Kickstarter campaign for its super-fast SSD drives. This pocket-sized drive offers capacities of up to 2 TB with transfer speeds of up to 1,000 MB/s.
If you’re looking for a super-fast, super-small, and affordable drive, the HyperDisk Kickstarter may be a good place to start.
With transfer speeds of up to 1,000 MB/s and available in 512 GB, 1 TB, and 2 TB…
View On WordPress
Text
AudioEye, Inc. (AEYE: OTCQB) | Global Agency HyperDisk Marketing Selects AudioEye to Provide Sustainable Web Accessibility to Clients Worldwide
from OTC Markets Group - News, Filings & Corporate Actions http://www.otcmarkets.com/stock/AEYE/news?id=195155
Text
Hyperdisk Storage Pools: A Guide to Block Storage Management

Hyperdisk Storage Pools, the first block storage offering from a hyperscale cloud provider to support thin provisioning, data reduction, and capacity pooling, is now generally available, as Google Cloud announced at Google Cloud Next 2024. By reducing your Total Cost of Ownership (TCO) by up to 30–50%, Hyperdisk Storage Pools let you streamline block storage management, modernise SAN-based operations, and increase efficiency. Storage Pools are available right now via the Google Cloud console. Let’s examine how Hyperdisk Storage Pools work and how they fit into your environment in this blog post.
Hyperdisk Storage Pools let you manage large amounts of Compute Engine block storage. A Hyperdisk Storage Pool provides pre-purchased capacity, throughput, and IOPS to applications as needed. With Hyperdisk Storage Pools you create, manage, and use disks in pools for multiple workloads, managing disks in bulk to save money while growing capacity and performance. By consuming exactly the storage you need from a Hyperdisk Storage Pool, you simplify capacity forecasting and reduce management overhead, going from hundreds of disks to a single pool.
Benefits of storage pools include:
Best-in-class TCO – Thin provisioning and data reduction let Hyperdisk Storage Pools store data efficiently, maximizing resource utilization and minimizing TCO.
Reduced management overhead and higher flexibility – Hyperdisk Storage Pools allow workload owners to provision larger disks and use only what they need, eliminating capacity and performance forecasting and rescaling downtime.
Workloads use Hyperdisk volumes the same way with storage pools. No downtime or workload disruptions are needed.
Data on disks in a storage pool remains isolated, just as if the disks were not in a pool.
Use storage pools
Storage pools solve these issues:
Difficulty predicting resource requirements when moving on-premises SAN workloads to Google Cloud.
Estimating application performance and capacity can take weeks and be error-prone, delaying a cloud migration or application rollout.
With Hyperdisk Storage Pools, you can overprovision the capacity used to create disks and then consume only the disk space you actually write data to.
Underuse of resources
Ensuring good volume utilization is difficult and painful. Block storage is often underutilized because peak capacity and performance must be provisioned to avoid outages and slowdowns, yet few applications ever reach those levels.
Using Hyperdisk Storage Pools, you establish a pool to meet workload capacity needs. To keep utilization below 80%, the Hyperdisk Storage Pool automatically adds capacity.
Complex workload block storage management
Managing hundreds or thousands of disk volumes takes time and resources away from innovation.
When creating VM disks in a storage pool, you can choose a size bigger than the workload is expected to need. As workloads write to the disks, they consume storage pool capacity: only the block storage you actually use is subtracted from the pool’s capacity, not the size you chose at disk creation. That disk size only acts as an upper limit, and you only need to change it if the limit is reached.
If you build disks in the storage pool and your workloads outgrow the capacity you planned across numerous disks, you can expand the pool’s capacity, and all disks in the storage pool can use the extra capacity.
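A sketch of both operations, assuming the --storage-pool flag on disk creation and a storage-pools update command available in recent gcloud releases; names, sizes, and the zone are illustrative:

gcloud compute disks create data-disk-1 \
    --zone=us-central1-a \
    --type=hyperdisk-balanced \
    --size=5TB \
    --storage-pool=pool-us-central1-a

gcloud compute storage-pools update pool-us-central1-a \
    --zone=us-central1-a \
    --provisioned-capacity=15tb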
Hyperdisk Storage Pools options
Features of storage pools include:
Capacity thin provisioning: blocks are allocated as needed instead of all at once. This prevents low storage utilisation, where lots of disk space is assigned but never used.
Data reduction: storage pools improve efficiency with various data reduction technologies. How much data can be reduced depends strongly on the data type; data that is compressed or encrypted before being stored in a Hyperdisk Storage Pool won’t reduce.
Automatic capacity growth: to prevent failures caused by insufficient capacity, Hyperdisk Storage Pools automatically add capacity when utilisation surpasses 80% of provisioned capacity.
Hyperdisk Storage Pool operation
You create a storage pool with the aggregate capacity and performance your workloads need, then create disks in it and attach them to your VMs. Create disks with a larger size than you currently need; this allows future growth without disk resizing.
If a disk was created in an Advanced capacity storage pool, only the disk space your workloads actually require after data reduction is withdrawn from the pool.
To increase disk capacity, increase the storage pool’s provisioned capacity. The disks’ used space can then grow up to the size you set when creating them. By creating disks in an Advanced capacity storage pool with a large initial size and allocating additional pool space later, you consolidate disk storage administration and decrease costs.
Storage pool capacity is managed centrally: increase or decrease a storage pool’s provisioned capacity to change the capacity available to its disks. If the storage pool’s used capacity, or the aggregate capacity of all its disks, reaches 80% of its provisioned capacity, an Advanced capacity storage pool automatically adds capacity. If auto-grow fails, you can manually add storage pool capacity, up to 1 PiB.
If a storage pool reaches 100% utilization and has no free space, writes to all of its disks fail until data or disks are deleted. Most commercial software treats full-disk write errors like hardware failures.
To avoid out-of-space issues, actively maintain and monitor your storage pool. You should also know how your workload will react to an out-of-space error on a disk in a storage pool.
Hyperdisk Storage Pool provisioning
You can configure a Hyperdisk Storage Pool with Standard or Advanced capacity.
Storage pools standard capacity
With Standard capacity provisioning, you can create disks in the storage pool until the combined size of all the disks reaches the pool’s provisioned capacity. Disks in a Standard capacity storage pool consume capacity the same way as disks outside a storage pool.
Advanced capacity storage pools
Advanced capacity storage pools support thin provisioning and data reduction for capacity, allowing you to provision disks with more capacity than you purchased. Because an Advanced capacity storage pool consumes capacity based only on the bytes written to its disks after data reduction, you can offer end users and applications more capacity than you paid for.
Advanced capacity lets you create disks in the storage pool whose combined size exceeds the pool’s provisioned capacity by up to 500%. The data written, not the disk sizes, determines how much storage pool capacity is used, so disks in an Advanced capacity storage pool consume capacity differently from Standard capacity and non-storage-pool disks.
You can fill disks in an Advanced capacity storage pool up to their allotted size as long as the data written to all disks doesn’t exceed the pool’s capacity. The auto-grow feature adds capacity to the storage pool when utilization reaches 80% of provisioned capacity. If the storage pool is completely full, writes to all of its disks will fail until you delete data or disks to reduce its utilization. Since most software applications treat writes to full disks as hardware failures, it’s important to both:
Monitor your storage pool to minimise the risk of running out of disk space.
Know how your workload reacts if a disk does run out of space.
Remember that storage pools can’t see your file system. Deleted data is still counted as in use until your OS marks it unused with DISCARD or TRIM commands. Most third-party OS images and all Google-provided OS images do this by default, but you should confirm it if you are not using one of them. Find out how to verify or configure this feature at Disable lazy initialization and enable DISCARD commands.
Hyperdisk Storage Pool Types
The disks you can create in a Hyperdisk Storage Pool depend on its type.
Hyperdisk Throughput Storage Pool: you choose capacity and throughput when creating the storage pool. Hyperdisk Throughput disks created in the storage pool consume some of its allotted capacity and throughput.
Hyperdisk Balanced Storage Pool: you set capacity, throughput, and IOPS when creating the storage pool. Hyperdisk Balanced disks created in the pool with capacity and performance above the baseline consume some of the storage pool’s capacity and performance.
High-throughput storage pools
Hyperdisk Throughput Storage Pools let you manage Hyperdisk Throughput disk capacity and throughput at the pool level.
Performance
Disks in a storage pool perform the same as disks outside a storage pool.
Pricing
Capacity, throughput, and IOPS determine Hyperdisk Storage Pool pricing.
Standard capacity in a storage pool is priced like standalone disk capacity; Standard capacity in a Hyperdisk Balanced Storage Pool costs the same as standalone capacity.
Advanced capacity costs more because thin provisioning and data reduction carry a premium. Despite this premium, thin provisioning and data reduction can lower overall block storage costs by increasing efficiency and utilization.
Read more on govindhtech.com
#hyperdisk#blockstorage#cloudprovider#googlecloud#computer#data#technology#technews#news#govindhtech
Text
A3 Ultra VMs With NVIDIA H200 GPUs Pre-launch This Month

Strong infrastructure advancements for your future that prioritizes AI
To increase customer performance, usability, and cost-effectiveness, Google Cloud implemented improvements throughout the AI Hypercomputer stack this year. Google Cloud announced the following at the App Dev & Infrastructure Summit:
Trillium, Google’s sixth-generation TPU, is currently available for preview.
Next month, A3 Ultra VMs with NVIDIA H200 Tensor Core GPUs will be available for preview.
Google’s new, highly scalable clustering system, Hypercompute Cluster, will be accessible beginning with A3 Ultra VMs.
Based on Axion, Google’s proprietary Arm processors, C4A virtual machines (VMs) are now widely available.
AI workload-focused additions to Titanium, Google Cloud’s host offload capability, and Jupiter, its data center network.
Google Cloud’s AI/ML-focused block storage service, Hyperdisk ML, is widely accessible.
Trillium: A new era of TPU performance
A new era of TPU performance is being ushered in by TPUs, which power Google’s most sophisticated models like Gemini, well-known Google services like Maps, Photos, and Search, as well as scientific innovations like AlphaFold 2, which was just awarded a Nobel Prize. Google is happy to announce that Google Cloud users can now preview Trillium, its sixth-generation TPU.
Taking advantage of NVIDIA Accelerated Computing to broaden perspectives
Google Cloud also keeps investing in its partnership and capabilities with NVIDIA, fusing the best of its data center, infrastructure, and software expertise with the NVIDIA AI platform, exemplified by A3 and A3 Mega VMs powered by NVIDIA H100 Tensor Core GPUs.
Google Cloud announced that the new A3 Ultra VMs featuring NVIDIA H200 Tensor Core GPUs will be available on Google Cloud starting next month.
Compared to earlier versions, A3 Ultra VMs offer a notable performance improvement. Their foundation is NVIDIA ConnectX-7 network interface cards (NICs) and servers equipped with new Titanium ML network adapter, which is tailored to provide a safe, high-performance cloud experience for AI workloads. A3 Ultra VMs provide non-blocking 3.2 Tbps of GPU-to-GPU traffic using RDMA over Converged Ethernet (RoCE) when paired with our datacenter-wide 4-way rail-aligned network.
In contrast to A3 Mega, A3 Ultra provides:
With the support of Google’s Jupiter data center network and Google Cloud’s Titanium ML network adapter, double the GPU-to-GPU networking bandwidth
With almost twice the memory capacity and 1.4 times the memory bandwidth, LLM inferencing performance can increase by up to 2 times.
Capacity to expand to tens of thousands of GPUs in a dense cluster with performance optimization for heavy workloads in HPC and AI.
Google Kubernetes Engine (GKE), which offers an open, portable, extensible, and highly scalable platform for large-scale training and AI workloads, will also offer A3 Ultra VMs.
Hypercompute Cluster: Simplify and expand clusters of AI accelerators
It’s not just about individual accelerators or virtual machines, though; when dealing with AI and HPC workloads, you have to deploy, maintain, and optimize a huge number of AI accelerators along with the networking and storage that go along with them. This may be difficult and time-consuming. For this reason, Google Cloud is introducing Hypercompute Cluster, which simplifies the provisioning of workloads and infrastructure as well as the continuous operations of AI supercomputers with tens of thousands of accelerators.
Fundamentally, Hypercompute Cluster integrates the most advanced AI infrastructure technologies from Google Cloud, enabling you to install and operate several accelerators as a single, seamless unit. You can run your most demanding AI and HPC workloads with confidence thanks to Hypercompute Cluster’s exceptional performance and resilience, which includes features like targeted workload placement, dense resource co-location with ultra-low latency networking, and sophisticated maintenance controls to reduce workload disruptions.
For dependable and repeatable deployments, you can use pre-configured and validated templates to set up a Hypercompute Cluster with just one API call. This includes containerized software with orchestration (e.g., GKE, Slurm), framework and reference implementations (e.g., JAX, PyTorch, MaxText), and well-known open models like Gemma2 and Llama3. As part of the AI Hypercomputer architecture, each pre-configured template is available and has been verified for effectiveness and performance, allowing you to concentrate on business innovation.
A3 Ultra VMs will be the first Hypercompute Cluster to be made available next month.
An early look at the NVIDIA GB200 NVL72
Google Cloud is also awaiting the developments made possible by NVIDIA GB200 NVL72 GPUs, and it will be providing more information about this fascinating improvement soon. In the meantime, here is a preview of the racks Google is constructing to deliver the NVIDIA Blackwell platform’s performance advantages to Google Cloud’s cutting-edge, environmentally friendly data centers in the early months of next year.
Redefining CPU efficiency and performance with Google Axion Processors
CPUs are a cost-effective solution for a variety of general-purpose workloads, and they are frequently used alongside AI workloads to build complex applications, even if TPUs and GPUs are superior at specialized jobs. Google announced Axion Processors, its first custom Arm-based CPUs for the data center, at Google Cloud Next ’24. Customers using Google Cloud may now benefit from C4A virtual machines, the first Axion-based VM series, which offer up to 10% better price-performance compared to the newest Arm-based instances offered by other top cloud providers.
Additionally, compared to comparable current-generation x86-based instances, C4A offers up to 60% more energy efficiency and up to 65% better price performance for general-purpose workloads such as media processing, AI inferencing applications, web and app servers, containerized microservices, open-source databases, in-memory caches, and data analytics engines.
Titanium and Jupiter Network: Making AI possible at the speed of light
Titanium, the offload technology system that supports Google’s infrastructure, has been improved to accommodate workloads related to artificial intelligence. Titanium provides greater compute and memory resources for your applications by lowering the host’s processing overhead through a combination of on-host and off-host offloads. Furthermore, although Titanium’s fundamental features can be applied to AI infrastructure, the accelerator-to-accelerator performance needs of AI workloads are distinct.
Google has released a new Titanium ML network adapter to address these demands, which incorporates and expands upon NVIDIA ConnectX-7 NICs to provide further support for virtualization, traffic encryption, and VPCs. The system offers best-in-class security and infrastructure management along with non-blocking 3.2 Tbps of GPU-to-GPU traffic across RoCE when combined with its data center’s 4-way rail-aligned network.
Google’s Jupiter optical circuit switching network fabric and its updated data center network significantly expand Titanium’s capabilities. With native 400 Gb/s link rates and a total bisection bandwidth of 13.1 Pb/s (a practical bandwidth metric that reflects how one half of the network can connect to the other), Jupiter could handle a video conversation for every person on Earth at the same time. In order to meet the increasing demands of AI computation, this enormous scale is essential.
Hyperdisk ML is widely accessible
For computing resources to be used effectively, system-level performance maximized, and costs kept down, high-performance storage is essential. Google launched Hyperdisk ML, its block storage solution optimized for AI/ML workloads, in April 2024. Now widely available, it adds dedicated storage for AI and HPC workloads to the networking and computing advancements.
Hyperdisk ML efficiently speeds up data load times. It drives up to 11.9x faster model load time for inference workloads and up to 4.3x quicker training time for training workloads.
With 1.2 TB/s of aggregate throughput per volume, you may attach 2500 instances to the same volume. This is more than 100 times more than what big block storage competitors are giving.
Reduced accelerator idle time and increased cost efficiency are the results of shorter data load times.
Multi-zone volumes are now automatically created for your data by GKE. In addition to quicker model loading with Hyperdisk ML, this enables you to run across zones for more computing flexibility (such as lowering Spot preemption).
Developing AI’s future
Google Cloud enables companies and researchers to push the limits of AI innovation with these developments in AI infrastructure. It anticipates that this strong foundation will give rise to revolutionary new AI applications.
Read more on Govindhtech.com
#A3UltraVMs#NVIDIAH200#AI#Trillium#HypercomputeCluster#GoogleAxionProcessors#Titanium#News#Technews#Technology#Technologynews#Technologytrends#Govindhtech
Text
H3 Virtual Machines: Compute Engine-Optimized Machine family

Compute Engine’s compute-optimized machine family
Compute-optimized virtual machine (VM) instances are best suited for compute-intensive tasks and high performance computing (HPC) workloads. With an architecture that makes use of features like non-uniform memory access (NUMA) for optimal, dependable, consistent performance, compute-optimized VMs provide the best performance per core.
This family of machines includes the following machine series:
H3 virtual machines are powered by two 4th-generation Intel Xeon Scalable processors (code-named Sapphire Rapids) with an all-core frequency of 3.0 GHz, and offer 88 virtual cores (vCPUs) and 352 GB of DDR5 memory.
C2D virtual machines are powered by the 3rd-generation AMD EPYC Milan CPU, which has a maximum boost frequency of 3.5 GHz, and scale flexibly between 2 and 112 vCPUs with 2 to 8 GB of RAM per vCPU.
C2 virtual machines are powered by the 2nd-generation Intel Xeon Scalable processor (Cascade Lake), which has a sustained single-core maximum turbo frequency of 3.9 GHz, and offer 4 to 60 vCPUs with 4 GB of memory per vCPU.
H3 machine series
The 4th generation Intel Xeon Scalable processors (code-named Sapphire Rapids), DDR5 memory, and Titanium offload processors power the H3 machine series and H3 virtual machines.
For compute-intensive high performance computing (HPC) workloads on Compute Engine, H3 virtual machines (VMs) provide the best price-performance. The single-threaded-per-core H3 VMs are ideal for a wide range of modeling and simulation tasks, such as financial modeling, genomics, crash safety, computational fluid dynamics, and general scientific and engineering computing. H3 VMs support compact placement, which is ideal for tightly coupled applications that scale across several nodes.
There is only one size in the H3 series, and it takes up a whole host server. You can change the number of visible cores to reduce licensing fees, but the VM still costs the same. H3 virtual machines (VMs) have a default network bandwidth rate of up to 200 Gbps and are able to utilize the full host network capacity. However, there is a 1 Gbps limit on VM-to-internet bandwidth.
H3 virtual machines do not support simultaneous multithreading (SMT). To guarantee consistent performance, there is also no overcommitment.
H3 virtual machines can be purchased on-demand or with committed use discounts (CUDs) for one or three years. Google Kubernetes Engine can be utilized with H3 virtual machines.
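A sketch of creating a standalone H3 VM with gcloud; the instance name, zone, and image are illustrative, and the zone must be one where H3 is offered:

gcloud compute instances create h3-demo \
    --zone=us-central1-a \
    --machine-type=h3-standard-88 \
    --image-family=debian-12 \
    --image-project=debian-cloud \
    --network-interface=nic-type=GVNIC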
H3 VMs Limitations
The following are the limitations of the H3 machine series:
The H3 machine series offers only a single machine type; no custom machine shapes are available.
GPUs cannot be used with H3 virtual machines.
There is a 1 Gbps limit on outgoing data transfer.
The performance limits for Google Cloud Hyperdisk and Persistent Disk are 240 MBps throughput and 15,000 IOPS.
Machine images are not supported by H3 virtual machines.
The NVMe storage interface is the only one supported by H3 virtual machines.
Disks cannot be created from H3 VM images.
Read-only or multi-writer disk sharing is not supported by H3 virtual machines.
Different types of H3 machines
Machine type h3-standard-88: vCPUs*: 88, memory: 352 GB, max network egress bandwidth†: up to 200 Gbps
With no simultaneous multithreading (SMT), a vCPU represents a whole core. † The default egress bandwidth is limited to the specified value. The destination IP address and several variables determine the actual egress bandwidth. Refer to the network bandwidth.
Supported disk types for H3
The following block storage types are compatible with H3 virtual machines:
Balanced Persistent Disk (pd-balanced)
Hyperdisk Balanced (hyperdisk-balanced)
Hyperdisk Throughput (hyperdisk-throughput)
Capacity and disk limitations
You can attach a combination of Persistent Disk and Hyperdisk volumes to a virtual machine, but there are some limitations:
Each virtual machine can have no more than 128 hyperdisk and persistent disk volumes combined.
All disk types’ combined maximum total disk capacity (in TiB) cannot be greater than:
For machine types with fewer than 32 vCPUs:
257 TiB for all Hyperdisk or all Persistent Disk
257 TiB for a mixture of Hyperdisk and Persistent Disk
For machine types with 32 or more vCPUs:
512 TiB for all Hyperdisk
512 TiB for a mixture of Hyperdisk and Persistent Disk
257 TiB for all Persistent Disk
H3 storage limits (maximum number of disks per VM for machine type h3-standard-88): all disk types: 128; all Hyperdisk types: 64; Hyperdisk Balanced: 8; Hyperdisk Throughput: 64; Hyperdisk Extreme: 0; Local SSD: not supported.
Network compatibility for H3 virtual machines
gVNIC network interfaces are needed for H3 virtual machines. For typical networking, H3 allows up to 200 Gbps of network capacity.
Make sure the operating system image you use is fully compatible with H3 before moving to H3 or setting up H3 virtual machines. Even though the guest OS displays the gve driver version as 1.0.0, fully supported images come with the most recent version of the gVNIC driver. The VM may not be able to reach the maximum network bandwidth for H3 VMs if it is running an operating system with limited support, which includes an outdated version of the gVNIC driver.
The most recent gVNIC driver can be manually installed if you use a custom OS image with the H3 machine series. For H3 virtual machines, the gVNIC driver version v1.3.0 or later is advised. To take advantage of more features and bug improvements, Google advises using the most recent version of the gVNIC driver.
Read more on Govindhtech.com
#H3VMs#H3virtualmachines#computeengine#virtualmachine#VMs#Google#googlecloud#govindhtech#NEWS#TechNews#technology#technologytrends#technologynews
Text
Hyperdisk ML: Integration To Speed Up Loading AI/ML Data

Hyperdisk ML can speed up the loading of AI/ML data. This tutorial explains how to use it to streamline and speed up the loading of AI/ML model weights on Google Kubernetes Engine (GKE). The main method for accessing Hyperdisk ML storage with GKE clusters is through the Compute Engine Persistent Disk CSI driver.
What is Hyperdisk ML?
You can scale up your applications with Hyperdisk ML, a high-performance storage solution. It is perfect for running AI/ML tasks that require access to a lot of data since it offers high aggregate throughput to several virtual machines at once.
Overview
It can speed up model weight loading by up to 11.9X when activated in read-only-many mode, as opposed to loading straight from a model registry. The Google Cloud Hyperdisk design, which enables scalability to 2,500 concurrent nodes at 1.2 TB/s, is responsible for this acceleration. This enables you to decrease pod over-provisioning and improve load times for your AI/ML inference workloads.
The following are the high-level procedures for creating and utilizing Hyperdisk ML:
Pre-cache or hydrate data in a persistent disk image: fill Hyperdisk ML volumes with serving-ready data from an external data source (for example, Gemma weights fetched from Cloud Storage). The Persistent Disk used for the disk image needs to be compatible with Google Cloud Hyperdisk.
Using an existing Google Cloud Hyperdisk, create a Hyperdisk ML volume: Make a Kubernetes volume that points to the data-loaded Hyperdisk ML volume. To make sure your data is accessible in every zone where your pods will operate, you can optionally establish multi-zone storage classes.
Create a Kubernetes Deployment that consumes the Hyperdisk ML volume: reference the volume so your applications benefit from its rapid data loading. A minimal sketch of the Kubernetes volume objects follows below.
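A minimal sketch of the Kubernetes objects for step 2, assuming a pre-hydrated Hyperdisk ML disk and a StorageClass named hyperdisk-ml-sc served by the cluster’s Compute Engine Persistent Disk CSI driver; all names, the zone, and the size are illustrative:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: hdml-weights-pv
spec:
  storageClassName: hyperdisk-ml-sc
  capacity:
    storage: 300Gi
  accessModes:
  - ReadOnlyMany
  csi:
    driver: pd.csi.storage.gke.io
    # volumeHandle points at the existing, data-loaded Hyperdisk ML disk
    volumeHandle: projects/my-project-id/zones/us-central1-a/disks/hdml-weights-disk
    fsType: ext4
    readOnly: true
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: hdml-weights-pvc
spec:
  storageClassName: hyperdisk-ml-sc
  volumeName: hdml-weights-pv
  accessModes:
  - ReadOnlyMany
  resources:
    requests:
      storage: 300Gi

The Deployment for step 3 then mounts hdml-weights-pvc read-only at the path where the inference server expects its model weights.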
Multi-zone Hyperdisk ML volumes
Hyperdisk ML disks are accessible in only one zone. Alternatively, you can use the Hyperdisk ML multi-zone capability to dynamically join many zonal disks with identical content under a single logical PersistentVolumeClaim and PersistentVolume. The zonal disks referenced by the multi-zone feature have to be in the same region. For instance, if your regional cluster is created in us-central1, the multi-zone disks must be located in zones of that region (such as us-central1-a and us-central1-b).
Running Pods across zones for increased accelerator availability and cost effectiveness with Spot VMs is a popular use case for AI/ML inference. Because Hyperdisk ML is zonal, if your inference server runs several Pods across zones, GKE automatically clones the disks across zones to make sure your data follows your application.
The limitations of multi-zone Hyperdisk ML volumes are as follows:
There is no support for volume resizing or volume snapshots.
Only read-only mode is available for multi-zone Hyperdisk ML volumes.
GKE does not verify that the disk content is consistent across zones when utilizing pre-existing disks with a multi-zone Hyperdisk ML volume. Make sure your program considers the possibility of inconsistencies between zones if any of the disks have divergent material.
Requirements
Your clusters must meet the following requirements in order to use Hyperdisk ML volumes in GKE:
Use Linux clusters running GKE 1.30.2-gke.1394000 or later. If you use a release channel, make sure it contains the GKE version required for this driver, or a later one.
The Compute Engine Persistent Disk CSI driver must be installed. It is enabled by default on new Autopilot and Standard clusters and cannot be disabled or edited when using Autopilot. If you need to enable the driver on an existing cluster, see Enabling the Compute Engine Persistent Disk CSI Driver on an Existing Cluster.
You should use GKE version 1.29.2-gke.1217000 or later if you wish to adjust the readahead value.
You must use GKE version 1.30.2-gke.1394000 or later in order to utilize the multi-zone dynamically provisioned capability.
Hyperdisk ML is available only on specific node types and in specific zones.
Conclusion
This article offers a thorough tutorial on using Hyperdisk ML to speed up AI/ML data loading on Google Kubernetes Engine (GKE). It explains how to pre-cache data in a disk image, create a Hyperdisk ML volume that your workload in GKE can read, and create a deployment that uses this volume. The article also discusses how to fix problems such as a low Hyperdisk ML throughput quota and provides advice on adjusting readahead values for best results.
Read more on Govindhtech.com
#AI#ML#HyperdiskML#GoogleKubernetesEngine#GKE#VMs#Kubernetes#News#Technews#Technology#Technologynews#Technologytrends#Govindhtech
Text
Google Cloud Storage Fuse Speeds Model + Weight Load Times

Cloud Storage Fuse or Hyperdisk ML accelerates model + weight load times from Google Cloud Storage.
The amount of model data required to support more complex AI models is growing. Costs and the end-user experience may be impacted by the seconds or even minutes of scaling delay that comes with loading the models, weights, and frameworks required to provide them for inference.
Inference servers like Triton, Text Generation Inference (TGI), or vLLM, for instance, are packaged as containers that are often larger than 10 GB; this can cause them to download slowly and prolong the time it takes for pods to start up in Kubernetes. The data loading issue is exacerbated by the fact that the inference pod must load model weights, which may be hundreds of gigabytes in size, after it has started.
In order to reduce the total time required to load your AI/ML inference workload on Google Kubernetes Engine (GKE), this article examines methods for speeding up data loading for both downloading models + weights and inference serving containers.
Using secondary boot drives to cache container images with your inference engine and relevant libraries directly on the GKE node can speed up container load times.
Using Cloud Storage Fuse or Hyperdisk ML to speed up model + weight load times from Google Cloud Storage.
The graphic above depicts a secondary boot disk (1) that stores the container image in advance so the image download step can be skipped during pod/container startup. Additionally, Cloud Storage Fuse (2) and Hyperdisk ML (3) are options for connecting the pod to model + weight data stored in Cloud Storage or on a network-attached disk, for AI/ML inference applications with demanding performance and scalability requirements. Below, we’ll examine each of these strategies in more depth.
Accelerating container load times with secondary boot disks
At node-pool creation time, GKE lets you pre-cache your container image onto a secondary boot disk attached to your node. Loading your containers this way lets you start launching them right away and skip the image download stage, which significantly reduces startup time. The graphic below illustrates how download times for container images increase linearly with image size, and compares those times against a cached copy of the container image that is already loaded on the node.
When a 16 GB container image is cached on a secondary boot disk in advance, load times may be as much as 29 times faster than when the image is downloaded from a container registry. Furthermore, the acceleration applies regardless of container size, which makes it possible for big container images to load consistently quickly!
To use secondary boot disks, first construct a disk containing all of your images, then make an image from the disk and provide the disk image when you build secondary boot disks for your GKE node pools. See the documentation for more details.
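A sketch of attaching such a disk image as a secondary boot disk when creating a node pool, assuming the --secondary-boot-disk flag and CONTAINER_IMAGE_CACHE mode available in recent gcloud releases; the cluster, node pool, image name, and machine type are illustrative:

gcloud container node-pools create inference-pool \
    --cluster=my-cluster \
    --zone=us-central1-a \
    --machine-type=g2-standard-24 \
    --secondary-boot-disk=disk-image=projects/my-project-id/global/images/inference-containers,mode=CONTAINER_IMAGE_CACHE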
Accelerating model weights load times
Checkpoints, or snapshots of model weights, are produced by many machine learning frameworks and stored in object storage like Google Cloud Storage, which is a popular option for long-term storage. The two primary solutions for retrieving your data at the GKE-pod level using Cloud Storage as the source of truth are Hyperdisk ML (HdML) and Cloud Storage Fuse.
There are two primary factors to take into account while choosing a product:
Performance: the speed at which the GKE node can load the data
Operational simplicity: how simple is it to make changes to this information?
For model weights stored in object storage buckets, Cloud Storage Fuse offers a direct connection to Cloud Storage. To avoid repeated downloads from the source bucket, which increase latency, there is also a caching option for files that must be read more than once.
Because a pod doesn’t need to do any pre-hydration operational tasks in order to download new files into a designated bucket, Cloud Storage Fuse is an attractive option. Note that you will need to restart the pod with an updated Cloud Storage Fuse configuration if you alter the buckets to which the pod is attached. You may increase speed even further by turning on parallel downloads, which cause many workers to download a model at once, greatly enhancing model pull performance.
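A sketch of a Pod that mounts model weights through the Cloud Storage FUSE CSI driver with parallel downloads turned on; the image, bucket name, and mount options are illustrative, and the exact option names should be checked against the GKE Cloud Storage FUSE CSI driver documentation:

apiVersion: v1
kind: Pod
metadata:
  name: inference-server
  annotations:
    gke-gcsfuse/volumes: "true"   # injects the Cloud Storage FUSE sidecar
spec:
  containers:
  - name: server
    image: vllm/vllm-openai:latest
    volumeMounts:
    - name: model-weights
      mountPath: /models
      readOnly: true
  volumes:
  - name: model-weights
    csi:
      driver: gcsfuse.csi.storage.gke.io
      readOnly: true
      volumeAttributes:
        bucketName: my-model-weights-bucket
        mountOptions: "implicit-dirs,file-cache:enable-parallel-downloads:true"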
Compared to downloading files straight to the pod from Cloud Storage or another online source, Hyperdisk ML offers superior speed and scalability. Furthermore, a single Hyperdisk ML instance can support up to 2,500 nodes with an aggregate bandwidth of up to 1.2 TiB/sec. This makes it a good option for inference tasks involving several nodes and read-only downloads of the same data. To use Hyperdisk ML, you load your data onto the disk before use and load it again after each update; keep in mind that if your data changes often, this adds operational overhead.
As you can see, while designing a successful model loading approach, there are more factors to consider in addition to throughput.
In conclusion
Loading big AI models, weights, and container images into GKE-based AI workloads can prolong workload startup times. Using a mix of the three techniques above (a secondary boot disk for container images, and Hyperdisk ML or Cloud Storage Fuse for models + weights), you can speed up your AI/ML inference apps’ data load times.
Read more on Govindhtech.com
#CloudStorageFuse#GoogleCloudStorageFuse#Google#AIMlInference#GKE#HyperdiskML#Kubernetes#govindhtech#NEWS#technews#TechnologyNews#technologies#technology#technologytrends#googlecloud
Text
C4A Virtual Machines: Google’s Axion CPU Powered VMs

C4A Virtual Machines
Google’s first Arm-based Axion processor powers C4A virtual machines. C4A offers standard, highmem, and highcpu machine types, with up to 72 vCPUs and 576 GB of DDR5 memory. Built on Titanium, it has network offloads and can achieve standard network bandwidth of up to 50 Gbps and Tier_1 networking performance of up to 100 Gbps per virtual machine.
With up to 10% higher price performance than the newest Arm-based instances offered by top cloud providers, Google is excited to announce today the general availability of C4A virtual machines, the first Axion-based VM series. Web and app servers, containerized microservices, open-source databases, in-memory caches, data analytics engines, media processing, and AI inference applications are just a few of the general-purpose workloads that C4A virtual machines are an excellent choice for.
To provide constant performance, C4A virtual machines (VMs) are housed within a single node with Uniform Memory Access (UMA) and support sole tenant nodes. C4A makes use of the newest storage options from Google Cloud, such as Hyperdisk Extreme and Hyperdisk Balanced.
The C4A machine series, in brief:
Is driven by Titanium and the Google Axion CPU.
Supports 576 GB of DDR5 RAM and up to 72 virtual CPUs.
Provides a variety of predefined machine types.
Supports up to 50 Gbps of bandwidth in a normal network configuration.
Supports up to 100 Gbps of bandwidth for per-VM Tier_1 networking performance.
Accepts the following choices for consumption and discounts:
Resource-based and flexible committed use discounts
Reservations
Spot VMs
Supports the performance monitoring unit (PMU).
Compact placement policies are not supported.
C4A machines types
C4A virtual machines have sizes ranging from 1 vCPU to 72 vCPUs and up to 576 GB of memory as standard configurations.
standard: 4 GB memory per vCPU
highcpu: 2 GB memory per vCPU
highmem: 8 GB memory per vCPU
Supported disk types for C4A
C4A virtual machines can utilize the following Hyperdisk block storage and only support the NVMe disk interface:
Hyperdisk Balanced (hyperdisk-balanced)
Hyperdisk Extreme (hyperdisk-extreme)
C4A doesn’t support Persistent Disk.
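A sketch of creating a C4A VM with a Hyperdisk Balanced boot disk; the instance name, zone, machine type, and image are illustrative, and the zone must be one where C4A is offered:

gcloud compute instances create c4a-demo \
    --zone=us-central1-a \
    --machine-type=c4a-standard-8 \
    --image-family=debian-12-arm64 \
    --image-project=debian-cloud \
    --boot-disk-type=hyperdisk-balanced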
Limits on disk and capacity
A virtual machine (VM) can have a variety of hyperdisk kinds, but the total disk capacity (in TiB) of all disk types cannot be greater than:
For machine types with less than 32 vCPUs: 257 TiB for all Hyperdisk
For machine types with 32 or more vCPUs: 512 TiB for all Hyperdisk
C4A VM network support
gVNIC network interfaces are necessary for C4A instances. For regular networking, C4A instances can handle up to 50 Gbps of network bandwidth; for VM Tier_1 networking performance, they can handle up to 100 Gbps.
Make sure the operating system image you use is fully compatible with C4A before moving to C4A or setting up C4A virtual machines. The upgraded gVNIC driver is included in fully supported images, even if the guest OS displays the gve driver version as 1.0.0. Your C4A virtual machine may not be able to reach the maximum network bandwidth for C4A VMs if it is running an operating system with limited support, which includes an outdated version of the gVNIC driver.
The latest gVNIC driver can be manually installed if you use a custom OS image with the C4A machine series. It is advised to use the gVNIC driver version v1.3.0 or later while working with C4A virtual machines. To take advantage of new features and problem improvements, Google advises using the most recent version of the gVNIC driver.
With Titanium offload technology and superior maintenance capabilities, C4A is designed to handle your most demanding workloads with up to 60% more energy efficiency and 65% better price performance than comparable current-generation x86-based instances.
Key Google services like Bigtable, Spanner, BigQuery, F1 Query, Blobstore, Pub/Sub, Google Earth Engine, and the YouTube Ads platform have already begun implementing Axion-based servers in production due to these efficiency and performance improvements.
Customers of Google Cloud can use C4A in various services, such as Batch, Dataproc, Google Compute Engine, and Google Kubernetes Engine (GKE). Additionally, C4A virtual machines are currently in preview in Dataflow, with support for Cloud SQL, AlloyDB, and other services on the horizon. The most widely used Linux operating systems, such as Container-Optimized OS, RHEL, SUSE Linux Enterprise Server, Ubuntu, Rocky Linux, and others, are supported by C4A instances. Google recently introduced preview support for migrating Arm-based instances in the Migrate to Virtual Machines service, and Arm-compatible software and solutions are available on the Google Cloud Marketplace.
Additionally, C4A VMs provide the storage and connection performance that your business requires:
High-Bandwidth Networking: To accommodate high-traffic applications, Tier_1 networking can offer up to 100 Gbps of bandwidth in addition to the typical 50 Gbps.
With a throughput of up to 5 GB/s and 350k IOPS, Google Cloud’s latest iteration of balanced and extreme hyperdisk storage provides scalable, high-performance storage.
Find out more
C4A instances are now widely available with one- and three-year committed use discounts (CUDs), reservations, Spot VMs, and Flex CUDs. C4A virtual machines are currently accessible in the following regions: asia-southeast1 (Singapore), europe-west1 (Belgium), europe-west4 (Netherlands), europe-west3 (Frankfurt), us-central1 (Iowa), us-east4 (Virginia), and us-east1 (South Carolina).
Read more on Govindhtech.com
#C4A#C4Avirtualmachines#C4AVMs#Google#Googlecloud#axion#virtualmachines#Govindhtech#news#technews#technology#technologytrends#technologynews