#kubectl-ai | Explore Tumblr posts and blogs

generativeinai · 1 month ago

Top 10 Ways Generative AI in IT Workspace Is Redefining DevOps, Infrastructure Management, and IT Operations

Generative AI is no longer just a buzzword in enterprise IT — it’s a force multiplier. As businesses strive for faster delivery, resilient infrastructure, and autonomous IT operations, generative AI is becoming the secret weapon behind the scenes. From automating code to predicting outages before they happen, generative AI is transforming how DevOps teams, system admins, and IT managers operate daily.

In this blog, we’ll explore the top 10 real-world ways generative AI is redefining the IT workspace—specifically in the areas of DevOps, infrastructure management, and IT operations.

1. AI-Generated Infrastructure as Code (IaC)

Generative AI can automatically create, test, and optimize infrastructure-as-code templates based on user input or workload requirements.

Instead of manually writing Terraform or CloudFormation scripts, engineers can describe their desired setup in plain English.

AI tools like GitHub Copilot or bespoke enterprise copilots generate IaC snippets on demand, reducing human error and speeding up cloud provisioning.

Impact: Saves hours of setup time, increases reproducibility, and enforces security-compliant defaults.

2. Predictive Incident Management and Self-Healing Systems

Generative AI models trained on historical incident logs can predict recurring issues and suggest preventive measures in real-time.

Integrated into observability platforms, AI can flag anomalies before they impact end users.

When tied into automation workflows (e.g., via ServiceNow or PagerDuty), it can trigger remediation scripts, effectively enabling self-healing infrastructure.

Impact: Reduces MTTR (Mean Time to Resolve), enhances uptime, and frees up SRE teams from firefighting.

3. Automated Code Review and Deployment Optimization

Generative AI assists in reviewing code commits with suggestions for performance, security, and best practices.

AI bots can flag problematic code patterns, auto-suggest fixes, and even optimize CI/CD pipelines.

In DevOps, AI tools can recommend the best deployment strategy (blue-green, canary, etc.) based on application type and past deployment metrics.

Impact: Speeds up release cycles while reducing bugs and deployment risks.

4. Natural Language Interfaces for DevOps Tools

Generative AI turns complex CLI and scripting tasks into simple prompts.

Instead of memorizing kubectl commands or writing bash scripts, developers can just ask: “Scale my deployment to 5 replicas and restart it.”

AI interprets the intent and executes the backend commands accordingly.

Impact: Democratizes access to DevOps tools for non-experts and accelerates operations.
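As a rough illustration of the last step, here is a minimal sketch of how an already-parsed intent for the prompt above could be turned into kubectl invocations. The `ScaleAndRestart` shape and the `to_kubectl_commands` helper are hypothetical names chosen for this example, and the language-model parsing step is assumed to have happened already. Only `kubectl scale` and `kubectl rollout restart` are real commands.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScaleAndRestart:
    """Structured intent an assistant might extract from the prompt (hypothetical shape)."""
    deployment: str
    replicas: int
    namespace: str = "default"


def to_kubectl_commands(intent: ScaleAndRestart) -> List[List[str]]:
    """Translate the intent into argv lists; nothing is executed here."""
    if intent.replicas < 0:
        raise ValueError("replicas must be non-negative")
    target = f"deployment/{intent.deployment}"
    ns = ["--namespace", intent.namespace]
    return [
        # Set the desired replica count on the Deployment.
        ["kubectl", "scale", target, f"--replicas={intent.replicas}", *ns],
        # Trigger a rolling restart of the Deployment's pods.
        ["kubectl", "rollout", "restart", target, *ns],
    ]


if __name__ == "__main__":
    for argv in to_kubectl_commands(ScaleAndRestart("web", 5)):
        print(" ".join(argv))
```

Returning argv lists instead of running them leaves room for a human confirmation step before anything touches the cluster.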

5. Dynamic Knowledge Management and Documentation

Keeping IT documentation up to date is painful — generative AI changes that.

It auto-generates technical documentation based on system changes, deployment logs, and config files.

Integrated with enterprise wikis or GitHub repositories, AI ensures every process is captured in real time.

Impact: Saves time, ensures compliance, and keeps institutional knowledge fresh.

6. Smart Capacity Planning and Resource Optimization

AI-powered models predict workload trends and auto-scale infrastructure accordingly.

Generative AI can simulate future demand scenarios, suggesting cost-saving measures like right-sizing or moving workloads to spot instances.

In Kubernetes environments, AI can recommend pod-level resource adjustments.

Impact: Cuts infrastructure costs and ensures optimal performance during traffic spikes.

7. Personalized IT Assistant for Developers and Admins

Think of this as a ChatGPT specifically trained on your IT stack.

Developers can ask, “Why did the build fail yesterday at 3 PM?” or “How do I restart the staging DB?”

The AI assistant fetches logs, searches through config files, and provides contextual answers.

Impact: Reduces dependency on IT support, accelerates troubleshooting, and enhances developer autonomy.

8. AI-Augmented Threat Detection and Security Auditing

Generative AI scans code, configs, and network activity to detect vulnerabilities.

It can generate risk reports, simulate attack vectors, and recommend patching sequences.

Integrated into DevSecOps workflows, it ensures security is not bolted on, but baked in.

Impact: Proactively secures the IT environment without slowing down innovation.

9. Cross-Platform Automation of Repetitive IT Tasks

Routine tasks like server patching, log rotation, or service restarts can be automated through generative scripts.

AI can orchestrate cross-platform operations involving AWS, Azure, GCP, and on-prem servers from a single interface.

It also ensures proper logging and alerting are in place for all automated actions.

Impact: Enhances operational efficiency and reduces human toil.

10. Continuous Learning from Logs and Feedback Loops

Generative AI models improve over time by learning from logs, performance metrics, and operator feedback.

Each remediation or change adds to the AI’s knowledge base, making it smarter with every iteration.

This creates a virtuous cycle of continuous improvement across the IT workspace.

Impact: Builds an adaptive IT environment that evolves with business needs.

Final Thoughts: The AI-Augmented Future of IT Is Here

Generative AI isn’t replacing IT teams — it’s amplifying their capabilities. Whether you're a DevOps engineer deploying daily, an SRE managing thousands of endpoints, or an IT manager overseeing compliance and uptime, generative AI offers tools to automate, accelerate, and augment your workflows.

As we move toward hyper-automation, the organizations that succeed will be those that integrate Generative AI in the IT workspace strategically and securely.

#ai #artificial intelligence

qbokubernetesengine · 1 year ago

What is Kubeflow and How to Deploy it on Kubernetes

Kubeflow is an open-source platform that simplifies and streamlines machine learning (ML) workflows on Kubernetes, the leading container orchestration technology. From data preprocessing to model deployment, it works as a dedicated toolbox for managing ML and AI operations within the Kubernetes ecosystem. Read on to learn how to deploy Kubeflow on Kubernetes.

Why Kubeflow?

Integrated Approach

Kubeflow makes complex ML processes easier to manage because it brings several tools and components together in a single ecosystem.

Efficiency in scaling

Because it is built on Kubernetes, Kubeflow can scale to handle massive datasets and compute-intensive ML tasks.

Consistent results

Kubeflow emphasises reproducibility: ML workflows are defined as code, so experiments can be replicated and tracked.

Maximising the use of available resources

Isolating ML workloads inside Kubernetes avoids resource conflicts and keeps workloads from interfering with each other.

Easy Implementation

Kubeflow makes it easier to deploy machine learning models as web services, which opens the door to real-time applications.

Integration of Kubeflow with Kubernetes on GCP

For this example, we will use Google Cloud Platform (GCP) and its managed Kubernetes service, GKE. There may be subtle differences with other providers, but most of this tutorial still applies.

Set up the GCP project

Follow these steps to prepare the project for the Kubeflow deployment.

Select an existing project or create a new one in the GCP Console.

Make sure you have the "owner" role for the project. The deployment process creates several service accounts with the permissions needed to integrate smoothly with GCP services.

Make sure billing is enabled for your project. To change billing for a project, refer to the Billing Settings Guide.

Make sure the following APIs are enabled, each on its GCP Console page:

o Compute Engine API

o Kubernetes Engine API

o Identity and Access Management (IAM) API

o Deployment Manager API

o Cloud Resource Manager API

o Cloud Filestore API

o AI Platform Training & Prediction API

Note that the default GCP installation of Kubeflow cannot run on the GCP Free Tier because of space constraints, even if you are using the 12-month trial with the $300 credit. You need a billable account.

Deploy Kubeflow using the CLI

Before running the command-line installer for Kubeflow:

Make sure you have the necessary tools installed:

kubectl

gcloud

Check the GCP documentation for the minimum requirements and make sure your project meets them.

Prepare your environment

The steps so far assume you already have a GKE cluster that you can connect to and operate. If not, create one first:

gcloud container clusters create <cluster-name> --zone <compute-zone>

More details about this command can be found in the official documentation.

To get the Kubeflow CLI binary file, follow these instructions:

Go to the kfctl releases page and download the v1.0.2 version.

Unpack the tarball:

tar -xvf kfctl_v1.0.2_<platform>.tar.gz

• Sign in. You only need to run this command once:

gcloud auth login

• Set up application default credentials. You only need to run this command once:

gcloud auth application-default login

• Set the default project and zone in gcloud.

To begin setting up the Kubeflow deployment, enter your GCP project ID and choose the zone:

export PROJECT=<your GCP project ID>
export ZONE=<your GCP zone>

gcloud config set project ${PROJECT}
gcloud config set compute/zone ${ZONE}

Select the KFDef spec to use for your deployment:

export CONFIG_URI="https://raw.githubusercontent.com/kubeflow/manifests/v1.0-branch/kfdef/kfctl_gcp_iap.v1.0.2.yaml"

Set environment variables for the OAuth client ID and secret you created earlier:

export CLIENT_ID=<CLIENT_ID from OAuth page>
export CLIENT_SECRET=<CLIENT_SECRET from OAuth page>

You can find the CLIENT_ID and CLIENT_SECRET in the Cloud Console under APIs & Services -> Credentials.

Choose a name for your Kubeflow deployment (KF_NAME) and a directory for its configuration (KF_DIR):

export KF_NAME=<your choice of name for the Kubeflow deployment>
export BASE_DIR=<path to a base directory>
export KF_DIR=${BASE_DIR}/${KF_NAME}

Running the kfctl apply command deploys Kubeflow with the default settings:

mkdir -p ${KF_DIR}
cd ${KF_DIR}
kfctl apply -V -f ${CONFIG_URI}

By default, kfctl tries to fill in a number of values in the KFDef specification.

Conclusion

You are now familiar with the basics of deploying Kubeflow on Kubernetes, although more advanced customisations can make the process more challenging. A containerised, Kubernetes-managed, cloud-based ML workflow such as Kubeflow resolves many of the issues raised by the computational demands of machine learning. It provides scalable access to CPUs and GPUs, which can be scaled up automatically to handle spikes in computing demand.

#Kubeflow deployment in Kubernetes #Nvidia GPU deployment in Kubernetes

suwa-sh · 6 years ago

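Covers every stage of the lifecycle. BUILD: once an image is built, scan it (vulnerabilities, compliance). SHIP/RUN: cloud-native firewall, inter-container communication, runtime defense, access control. Automatic learning with AI. Execution control for docker and kubectl commands. #k8se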

— 諏訪真一 (@suwa_sh) May 17, 2019

via Twitter https://twitter.com/suwa_sh May 17, 2019 at 08:10PM

#IFTTT #Twitter

miusukeeee · 7 years ago

Devsummit2018

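Old notes

Dev Summit operations: seating is 8+8 per row × N rows, maybe about 16 rows? Room B: from what I can see there are about 5 staff per room: reception, sound, and about 2 people guiding latecomers.

It would be good to have a sign showing whether photography is OK or not.

For lunch sessions, better to let people in quickly and have them eat first, then listen; eating is noisy too.

Giving attendees the slides when they answer a web survey seems like an easy approach, doesn't it?

On operations: think of a way to move to the Tokyo region with no downtime. Data and cache look painful, so put a slave in Tokyo and promote it to master at switchover. But Taiwan would still think it is the master... no, that dies, impossible. Scale units: define the pod unit precisely and work out a plan for scaling. This time around we have to make the Nodes scalable, and after that make service-deployment scalable on k8s, sakura-deployment and nginx too. I want to build a proper canary environment.

We use two memcached instances, but if one dies we die, so migrate to something else.

Eventually we want to spread across both Tokyo and Osaka.

For now, maybe hand out the SRE book.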

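B2: the NoOps session by 全アーキテクト (likely ZEN Architects) and Microsoft

A talk along the lines of: operations should be considered at design time.

"We want to launch whatever we like, whenever we like" vs. "we don't want to change a system once it is running" (SRE book)

Separating development and operations and so on is brutal. The current MDU does operations and development at the same time → that's DevOps, right?

No, NoOps.

Ops: the ten-year war

Operations and maintenance, as in the ITIL reference, has a ridiculous number of reviews; lead times are long; physical limits → the PDCA cycle is painfully long.

"Change is the number one cause of bugs" → lots of paperwork, specs get frozen. But hardware still dies, so operations is hard.

Virtualization since 2008 decouples hardware from software, and once you go all the way to cloud you don't have to operate it.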

landing.google.com/sre

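In the past: robustness = reliability. AWS era: operate on the assumption that things break. NoOps: give the system the ability to operate itself autonomously. Design assuming failure, and design all the way through the preparation for repairing after a failure.

Resiliency design: azure/architecture/resiliency. How to keep from stopping. Implementing resiliency in middleware is insanely painful; it should live in the platform. Cloud-native is better -> Serverless (Lambda etc.)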

NoOps in Azure App Service

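Thinking about service

In the current setup the ingress lives inside the Node, so we can't build the service across regions (e.g. placing part of it in Osaka).

Consider geographic redundancy too.

Self-healing: k8s takes care of it when a health check fails.

In-flight renewing is not possible.

Adaptive scale: GKE autoscaling, k8s autoscaling. But DB and cache can't autoscale. Honestly, for this part alone I'd like to use AWS Aurora.

I want to migrate the DB.

河野-san?

superriver, "imperial soldier"

Azure console

Things like App restart look really good. You can see what you'd otherwise have to dig through logs for, and in the first place it happens without you noticing, which is great.

We want the application to have resiliency too.

Small granularity, stateless design, asynchronous processing by default, idempotency -> the serverless way of thinking. DevOps -> toward DevDevDev.
- Put jobs on a queue, run the jobs in parallel, write out the results
- Do our score calculation this way
- There are samples and hands-on materials for things like large batch jobs; take a look
- This is cool, I want to copy it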
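The queue, parallel jobs, and idempotency idea noted above might look roughly like this. This is only a sketch: an in-memory dict stands in for the result store, squaring a number stands in for the real job, and `ResultStore`, `square_job`, and `run_jobs` are names made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import Lock
from typing import Callable, Dict, Iterable, List, Tuple


class ResultStore:
    """In-memory stand-in for a durable result store keyed by job ID."""

    def __init__(self) -> None:
        self._results: Dict[str, int] = {}
        self._lock = Lock()

    def put_if_absent(self, job_id: str, compute: Callable[[], int]) -> int:
        # Fast path: the job already ran, so re-delivery is a no-op.
        with self._lock:
            if job_id in self._results:
                return self._results[job_id]
        # compute() may run more than once under a race,
        # but only the first written result is kept.
        value = compute()
        with self._lock:
            return self._results.setdefault(job_id, value)

    def snapshot(self) -> Dict[str, int]:
        with self._lock:
            return dict(self._results)


def square_job(store: ResultStore, job: Tuple[str, int]) -> int:
    job_id, n = job
    return store.put_if_absent(job_id, lambda: n * n)


def run_jobs(jobs: Iterable[Tuple[str, int]], store: ResultStore, workers: int = 4) -> List[int]:
    """Run jobs in parallel; duplicate job IDs yield the same stored result."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda job: square_job(store, job), jobs))


if __name__ == "__main__":
    store = ResultStore()
    queue = [("job-1", 2), ("job-2", 3), ("job-1", 2)]  # job-1 delivered twice
    print(run_jobs(queue, store))
    print(store.snapshot())
```

Because results are keyed by job ID, retrying a failed batch or receiving the same message twice does not change the stored outcome. That property is what makes queue-based, retry-heavy designs safe.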

github.com/noops-jp

noops.connpass.com

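BL: "What if an engineer at an SIer read the SRE book", by Tomoki Ando of AP Communications (APC). Calls APIs with Python; wants to get into Ansible and Terraform.

Unreasonable demands backed by an absolute power relationship... runbooks & double checks... requests... inefficiency...

Want to systematize even operations. Raise psychological safety to raise productivity.

Software engineering + server/network infrastructure engineer... that's me!

Accept risk and don't aim for 100% reliability. Set SLOs based on SLIs; this is not an SLA. Standardize on decisions driven by data, not people. Error budget = 100% - SLO. Development and operations share the error budget.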
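The error budget arithmetic above (error budget = 100% - SLO) can be sketched in a few lines. The function names, the 30-day window, and the request counts are assumptions picked for the example, not figures from the talk.

```python
def error_budget(slo_percent: float) -> float:
    """Error budget in percent: 100% minus the SLO."""
    if not 0 < slo_percent < 100:
        # A 100% SLO leaves no budget, which is exactly what the notes warn against.
        raise ValueError("SLO must be strictly between 0 and 100 percent")
    return 100.0 - slo_percent


def allowed_downtime_minutes(slo_percent: float, window_days: int = 30) -> float:
    """Minutes of full outage the budget permits over the window."""
    return error_budget(slo_percent) / 100.0 * window_days * 24 * 60


def budget_remaining(slo_percent: float, total_requests: int, failed_requests: int) -> float:
    """Fraction of the budget still unspent (negative once it is overspent)."""
    allowed_failures = total_requests * error_budget(slo_percent) / 100.0
    return 1.0 - failed_requests / allowed_failures


if __name__ == "__main__":
    print(f"{allowed_downtime_minutes(99.9):.1f} minutes per 30 days")
    print(f"{budget_remaining(99.9, 1_000_000, 250):.2f} of the budget left")
```

A shared number like this gives development and operations the same data to argue from: while budget remains, releases go ahead; once it is spent, reliability work takes priority.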

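Toil

Repetitive manual work. It grows as the service grows. Keep operations at 50% or less of the workload. Running a large in-house service & having strong engineering...

Risk acceptance

The 100% mindset

SLA > SLO (nobody says "1 in 100 is fine"; we are simply told to make it zero)

Can't find data to use as indicators...

Whether a task can be automated depends on technical skill. Crap.

It gets really bad as the number of stakeholders grows.

Approaches that are easy to adopt somehow

The 50% rule

A strategy for securing time to do high-value work (rephrasing it this way makes it easier to persuade people). Operations is at 100%+ and painful; we can't even improve operations. Dealing with toil:
- Drop what doesn't need doing
- Prioritize
- Outsourcing
  - Might it be faster or more accurate if someone else does it...?
- Reorder steps
  - The order of the flow is set by the operations design, but
  - always ask: is this really right?
- Eliminate special cases
  - Irregular handling and exceptional processing are inefficient, accident-prone, and dependent on individuals
  - Specify things so they can be handled with generic procedures
  - That also helps a lot when automating (turning it into code)
- Automation
  - Automating only what survived the filtering above lets you make the right call
  - Automation can become the goal in itself
  - The code can turn into a mess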

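Walls

Training and learning costs, effort, getting buy-in... it's hard. It's easy to list the risks of changing; we need to think about the risk of not changing.

"Should change" vs. "should not change": don't overestimate either. Don't jump straight into a debate.
- Prototype: show it working and sell the benefits
- Start extremely small and build up knowledge and a track record
- Get people involved

After that...

Once you've secured the time:
- Improve further
- Change is meaningless unless it continues
- That takes skill growth, and doing it grows your skills
- Keep up with trends

xxxx

New insights into work we take for granted; both good and bad things will surface. Gather differing opinions too; they become breakthroughs. We can't just do what Google does as-is.

For service...

By sharing this with MMDU, eliminate the standoff between "100% is impossible!" and "tell us why it broke!", because honestly it's really tiresome. It's obvious that CS work and the like will keep growing; automate what can be automated. Wouldn't it be good to get rid of the contact form? We really need to think hard about the risk of not changing. What if a competitor launches something new? Team members will stagnate. SRE book reading group.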

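The accelerating frontend: PWA. Standing room is brutal. He's lost a lot of weight. I'd like to see what he presented at Inside Frontend. mizchi-san: node.js, Rtech, freee

"I'm not the React guy." If you build properly with [today's] performance-tuning common sense, it's this fast. Maybe try building one as a prototype for service. r.nikkei.com, nikkei-inside-frontend. 宍戸-san is incredible.

About PWA

The essence: a better experience for users on modern browsers. - Catch up with mobile apps using ServiceWorker. Whether ServiceWorker is implemented or not; legacy → IE, Safari. "Can I use": I forget it every time, so memorize it.

About SW

A local proxy running in the background. Can rewrite any response. Only works over HTTPS.

The assumptions of the traditional request/response model don't hold!!! You'll get burned!!! It can intercept everything. A model where ServiceWorker sits between the UI thread and the server (local proxy). Sleeps about 15 sec after a request. Can receive push events from the server even when the tab is dead. Interesting.

What it can't do

No resident processes: about 15-30 seconds. Web Budget API: memory usage is limited too (Google, Mozilla)

Requests used to go straight to the server, but now they get rewritten dynamically... more to think about. Digression into history.pushState - the thing that writes state back to the URL. On the web, sharing semantic URLs is super important.

PWA: going offline

Offline cache: divert responses into the offline cache. This seems difficult. If we update via push and divert that into the offline cache, couldn't pages load blazingly fast? Startup itself happens from the offline cache, so you get a form that doesn't depend on the server. firebaseapp looks nice.

AMP may be rebuilt with Web Packaging support!?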

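Competitors

Electron: Chrome is moving to Add to Homescreen... React Native: competes as a mobile app development environment based on web technology.

The internet is too slow. Connecting just once a day sounds nice. For things you want running at 60fps it's way too slow. I want prefetching. Beat the speed of light.

Cache the first request at the CDN. On mouseover, fetch & cache the destination in advance. Responses are returned from the SW cache. dev.to. Design for speed from the start - they've worked out the cache purge strategy on updates. Speed is the best UX.

Kill IE.

So far it's been the dream; reality starts here.

Isn't this a proxy war of mobile vs. web? Apple prefers the store, Google wants to crawl so it prefers the web, Facebook... MFI: the motivation to target mobile.

PWA: service worker is a technology where things should work with or without it. Safari 11 has onfetch but no onpush. Yikes.

Performance tuning

Find the dominant blocking factor, eliminate it, repeat. Don't guess, measure.

DevTools, Lighthouse, preact (pronounced "p-react"). The book 超速! Webページ速度改善ガイド (a guide to speeding up web pages): going to buy it.

Nothingness is fast. Weight is the weight of features; is rejecting that really right?

Heavy SQL, resolution, ads

What speed do you actually need? It isn't only display speed.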

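k8s, Oracle, mission-critical systems. The MC says "kūberunēteru"; this speaker says "kūbanētesu". hhiroshell

What do you need to think about when using it seriously??? Docker for building Ergodox firmware. Architecture, business processes.

What to do for availability when using k8s

Do microservice architecture properly
- Put countermeasures at service boundaries and integration points
- Make it loosely coupled; loose coupling brings its own problems
- Doesn't it limit the blast radius? No, it often spreads
- Cascading failures
  - A depends on B, waits for B's response, the waiting side dies too, load piles onto B and it dies, A dies, everything dies
- → Install a circuit breaker
  - A little component that just detects failure, cuts off requests, and returns an error
- Istio can handle this!!

kubecon -> pronounced "cube-con"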
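For reference, the circuit breaker idea noted above (detect failure, cut off requests, just return an error) can be sketched in-process. Istio applies the idea at the proxy layer rather than in application code, so this is only an illustration of the pattern. The class name, thresholds, and injectable clock are assumptions for the example.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when the breaker rejects a call without contacting the dependency."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        reset_timeout: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None  # None means the circuit is closed

    def call(self, func: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                # Open: fail fast instead of waiting on a dead dependency.
                raise CircuitOpenError("circuit open; request rejected")
            # Timeout elapsed: half-open, let one trial call through.
        try:
            result = func()
        except Exception:
            self._failures += 1
            # Re-open on a failed trial call, or open once the threshold is hit.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

Failing fast is what stops the chain described in the notes: callers of B get an immediate error instead of piling up while they wait, so A survives B's outage.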

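About Istio

Envoy as a sidecar container. Can be added without changing the application; easy and convenient! Let's adopt this. Canary deployment with Istio and CI/CD tools
- Istio can set the request split ratio; canary deploy to just 1%
- Done with istioctl
- What about CI/CD? Spinnaker maybe? Look into this later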
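To get a feel for what a 1% split means, here is a small sketch of percentage-based routing. It is not how Istio implements traffic weights. It is a hash-based stand-in written for illustration, and `pick_version` and the request ID format are made up.

```python
import hashlib


def pick_version(request_id: str, canary_percent: float) -> str:
    """Route a request to 'canary' or 'stable' by hashing its ID into 10,000 buckets."""
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000  # 0..9999
    # 1% corresponds to buckets 0..99.
    return "canary" if bucket < canary_percent * 100 else "stable"


if __name__ == "__main__":
    total = 100_000
    hits = sum(pick_version(f"req-{i}", 1.0) == "canary" for i in range(total))
    print(f"canary share: {hits / total:.2%}")
```

Hashing the request ID keeps the decision stable for a given ID, which is handy when debugging. A per-request random draw would give the same overall ratio.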

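Both neighbors are damn cameramen and it's seriously awful.

Things to consider properly

Version selection

Configuration

Istio management and monitoring setup <- this looks painful

There are constraints when combining with k8s

It won't work if a certain flag is turned on... You have to do this every time you create an environment, so it is a hassle.

Ensure availability of each individual service

Apply traditional failure countermeasures - SPOF -> redundancy - Fault tolerance and resiliency -> design that anticipates failure

"Service" here means -> AP, Web, DB. How micro does it go? MySQL master (????) and read replicas inside the cluster, with data on a storage service. Redundant MySQL setup: Service (ClusterIP) - distributes access across read replicas; routing still works after scaling out. Service (Headless). Route writes and reads appropriately by choosing between Service object types.

Deploy Pods with StatefulSet; when one goes down it can be brought back with the same identity, such as mounted storage and FQDN. GA since 1.9. In other words, you can build a DB setup that restarts automatically. This is powerful. Can a read replica be promoted to master? Doesn't look like it.

%% Kafka that doesn't die: distributed message queue -> Cloud Pub/Sub would probably do for us. Transparent autonomous recovery. Whose feature handles recovery and rebalancing while something is dead?????? A Kafka chat app

Are the pod names clean because they use Deployment differently...? There's a tool called chaoskube; look into it. Installed via helm install; it randomly kills pods matching labels=app=kafka every interval=1m. kubectl logs shows stdout logs, so I can use that to look at sakura.

Rebalancing load is high; cross-zone traffic is slow so it's fairly painful. What if a Node or the DC itself dies?

Reducing the skills and effort needed to adopt a service mesh

Handling load during failure recovery

Accounting for node and data center failures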

Mission-critical k8s with Oracle

CNP: Container Native Application Development Platform. Provides tools and frameworks around k8s.

CI/CD: Wercker (so Oracle acquired it)
API registry: Apiary (Swagger-like)
IaaS: OKE
Microservice frameworks: Istio/Envoy, Kafka, Fn Project (built as OSS)
Management and monitoring: Prometheus, Zipkin/OpenTracing, Vizceral
Service broker: Open Service Broker

Strong IaaS: Availability Domains, 3 regions, 1 Tb/s, low host-to-host latency

Oracle announced an SLA on performance. Others have downtime SLAs too, but publishing one for performance is impressive.

Oracle has a Terraform installer.

What would this look like for service?

Consider all of the following together

About Istio

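Build the ultimate microservice environment with k8s on GKE. The MC says "kūbenetisu"; 福田潔-san says "kūbanētisu".

"Ten-X" (10x): not 1.4x or 2.9x, but 10x or 20x through innovation, that kind of idea.

The problem that almost nobody is using GKE

Where it beats other clouds (according to customers) - Strong in big data: BigQuery, Dataflow - Cloud ML (machine learning, AI) - Containers (K8s)

kubectl is the new ssh kelseyhightower, kubecon 2017 keynote

The layer cool engineers worked on used to be the OS and the like, but that has changed. Don't worry about the low layers; kubectl is enough. Every engineer should learn kubectl; it's going to be the de facto standard.

Microservice architecture, 12-factor app, zero-ops platform

Modern requirements: usability, 24/7, agility, performance, multi-device

Platform elements that support them: self-healing, modularity, scalability, release pipeline (blue/green, canary deploys), rollback, monitoring

Set up automatic Node upgrades.

Maintenance window (beta): lets you decide when upgrades happen. Nice.

Worker nodes are VMs, so that's a risk. Will it restart them when a health check fails? Node auto-repair - with Container-Optimized OS, something like --enable-autorepair

Multi-zone clusters: creating a cluster with the defaults pins it to a single zone. Want multi-zone via additional zones. You can do something like --zone asia-x-a --node-locations asia-x-b,asia-x-c. Let's do this; it can be applied to an existing cluster too.

Regional clusters (beta): the master can span multiple zones too. Not sure about this one. No extra charge, so we pretty much have to do it.

Efficiency
- COS (Container-Optimized OS), pronounced "koss"; with Ubuntu there is no automatic upgrade
- Scale out
  - HPA: controls the number of pods
  - Cluster autoscaler: controls the number of nodes. resize --size works, and you can do --min-nodes 3 --max-nodes 10 --enable-autoscaling
- Scale up
  - VPA: controls resource allocation to Pods
  - Changing the VM instance type: cordon, drain, delete the existing node pool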
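As a reminder of how HPA picks a pod count, here is a simplified sketch of the scaling rule described in the Kubernetes documentation: desired replicas = ceil(current replicas × current metric / target metric), skipped when the ratio is within a tolerance. The function name and default bounds are assumptions for the example. Pod readiness, missing metrics, and stabilization windows are ignored.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,
    max_replicas: int = 10,
    tolerance: float = 0.1,
) -> int:
    """Simplified HPA-style calculation of the next replica count."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    ratio = current_metric / target_metric
    # Close enough to the target: leave the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 4 pods at 90% average CPU against a 60% target.
    print(desired_replicas(4, 90, 60))
```

The cluster autoscaler then handles the node side: when the extra pods no longer fit, it adds nodes within the --min-nodes / --max-nodes range.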

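Preemptible VMs - may be interrupted - cheap - any machine type. When would we use them? Can be operated by adding something like preemptible=true with nodeSelector. "Runtime" is used here to mean the execution platform. Service -> TCP/UDP LB, ingress -> HTTP(S) LB. Data stores: Spanner, Datastore, SQL. DWH: BigQuery. Messaging: Pub/Sub. Operations management: Stackdriver Logging/Monitoring. CI/CD: Container Registry, Container Builder. Read the Mercari post on the GCP blog later.

ISP, Cloud Edge

For service

Operate with node pools properly separated. Shouldn't we place sakura and the like on the separated pools??? cordon -> drain -> recreate is tedious, so let's use auto-repair.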
