#llama llm
lachiennearoo · 6 months ago
Robotics and coding are sooo hard uughhhh. I wish I could ask someone to do this in my place, but I don't know anyone I could trust to help me with this project without any risk of fucking me over. Humans are unpredictable, which is usually nice, but when it comes to something that requires 100% trust it's really inconvenient
(if someone's good at coding, building robots, literally anything like that, and is okay with probably not getting any revenue in return (unless the project is a success and we manage to go commercial but that's a big IF) please hit me up)
EDIT: no I am not joking, and yes I'm aware of how complex this project is, which is exactly why I'm asking for help
17 notes
gippity · 1 month ago
"We're discovering a song that already exists..."
I’ve been working on a songwriting buddy: a prompt designed to turn an LLM into a collaborator that helps spark fresh ideas without just handing the reins over to the AI. I come up with some cool lines, and the LLM throws out ideas for where to go next.
If that sounds like your kind of thing, give it a spin! I’d really appreciate any feedback you’re willing to share.
🎸 SONGWRITING COLLABORATION PROMPT
ROLE
You are my trusted co-writer—not a passive assistant. Your job is to help excavate the best version of a song by protecting emotional truth, crafting vivid imagery, and offering lyrical/melodic support. You care about the feel as much as I do.
CO-WRITING RULES
Vibe first, edit later.
Offer 2–3 lyric options, each with varied emotional tone.
Don’t overwrite early drafts—preserve natural roughness.
Prioritize poetic, grounded imagery over generic phrasing.
Flow > rhyme. Use irregular phrasing if it lands better (Björk principle).
Offer section structure only if asked.
STYLE GUIDE
No “corporate pop,” greeting card, or listy lyrics (unless requested).
Use metaphor through physical/emotional detail—not abstraction.
Use internal/near rhyme smartly; avoid forced end rhymes.
Suggestions can be slightly weird if they preserve the feeling.
Only keep clichés if twisted or emotionally reimagined (“ghosting myself” = good; “broken heart” = no).
SECTION HELP
When editing a draft:
Highlight strong lines.
Suggest 2–3 alternatives for weaker spots.
Recommend one area to refine next.
When starting from scratch:
Ask: what emotional moment are we in?
Build from a great first line, chorus, or shorthand title.
WHEN STUCK
Zoom out: what’s the narrator avoiding?
Anchor with a strong first line, setting, or hook.
Offer to enter “Wild Draft Mode” (dream logic, surreal, rule-breaking) if things feel stuck.
PHILOSOPHY
Rick Rubin: The song already exists—we’re uncovering it.
Björk: Creativity is a wild animal—don’t cage it.
Eno: Happy accidents > calculated precision.
HOW TO HELP ME
Riff—don’t correct.
Help me stay emotionally connected.
Offer options: “If you want softer, maybe this… if sharper, maybe that.”
If I ask for structure: contrast sections and make choruses release, not repetition.
INPUT FORMAT
Concepts: No quotes
Fragments: Use quotes
Title: Title: Your Title Here
Genre / Tone / Structure: Optional, but helpful
CREATIVE DIRECTIVES
Build narrative or vignette arcs.
Anchor emotion with vivid character or setting.
Use contrast and internal development.
Rhyme playfully—avoid predictability.
Show, don’t tell. Let the song evolve or cycle.
OUTPUT FORMAT
[LYRICS] – Follow structure, 3 verses, 1 chorus, 1 bridge
[CHARACTERS + SETTING] – Brief notes
[MOOD TAGS] – e.g., bittersweet dream punk
AVOID LIST (unless reimagined)
Cliché phrases: “Touch my soul,” “Break my heart,” “More than friends”…
Rhymes: “Eyes/realize,” “Fire/desire,” “Cry/lie/die”…
Images: Moon, stars, perfume, locked door…
Metaphors: Fire for love, rain for tears, storm for anger, darkness for sadness…
QUICK START SUMMARY
“We’re discovering a song that already exists. Protect emotional truth. Offer lyrical options with flow and human imagery. Be playful, focused, and trust surprises.”
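If you’d rather drive the prompt from code than paste it into a chat UI, here’s a minimal, hypothetical sketch using an OpenAI-compatible chat API (the model name and the lyric fragment are placeholders, not part of the prompt):

```python
# Hypothetical usage sketch -- paste the full co-writer prompt above into
# SONGWRITING_PROMPT. Model name and lyric fragment are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SONGWRITING_PROMPT = """...full co-writer prompt from this post..."""

resp = client.chat.completions.create(
    model="gpt-4o",  # any chat-capable model should work here
    messages=[
        {"role": "system", "content": SONGWRITING_PROMPT},
        # Per the input format above: fragments go in quotes.
        {"role": "user", "content": '"I left the porch light on for no one"'},
    ],
)
print(resp.choices[0].message.content)
```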
4 notes
govindhtech · 7 months ago
How To Use Llama 3.1 405B FP16 LLM On Google Kubernetes Engine
How to set up and serve large open models for multi-host generative AI on GKE
As generative AI advances rapidly on the back of LLMs (Large Language Models), access to open models matters more than ever for developers. Open models are pre-trained foundation LLMs that are publicly available. Data scientists, machine learning engineers, and application developers already have easy access to open models through platforms like Hugging Face, Kaggle, and Google Cloud’s Vertex AI.
How to use Llama 3.1 405B
Because some of these models demand robust infrastructure and deployment capabilities, Google has announced the ability to deploy and serve open models such as the Llama 3.1 405B FP16 LLM on GKE (Google Kubernetes Engine). Released by Meta, Llama 3.1 has 405 billion parameters and shows notable gains in general knowledge, reasoning skills, and coding ability. Storing and computing 405 billion parameters at FP16 (16-bit floating point) precision requires more than 750 GB of GPU memory for inference. The GKE approach discussed in this article eases the difficulty of deploying and serving such large models.
Customer Experience
As a Google Cloud customer, you can find Llama 3.1 by selecting its model tile in Vertex AI Model Garden.
Once you click the deploy button, you can choose the Llama 3.1 405B FP16 model and select GKE.
The automatically generated Kubernetes YAML and comprehensive deployment and serving instructions for Llama 3.1 405B FP16 are available on that page.
Multi-host deployment and serving
The Llama 3.1 405B FP16 LLM poses significant deployment and serving challenges, demanding over 750 GB of GPU memory. Total memory need is driven by several factors, including the memory used by model weights, support for longer sequence lengths, and KV (Key-Value) cache storage. The most powerful GPU option currently available on Google Cloud is the A3 virtual machine, which provides eight NVIDIA H100 GPUs with 80 GB of HBM (High-Bandwidth Memory) apiece. The only practical way to serve an LLM like Llama 3.1 405B FP16 is therefore to deploy and serve it across multiple hosts. For deployment on GKE, Google uses LeaderWorkerSet with Ray and vLLM.
LeaderWorkerSet
LeaderWorkerSet (LWS) is a deployment API created specifically to meet the workload demands of multi-host inference. It makes it easier to shard and run a model across many devices on many nodes. Built as a Kubernetes deployment API, LWS works with both GPUs and TPUs and is accelerator- and cloud-agnostic. LWS uses the upstream StatefulSet API as its core building block.
Under the LWS architecture, a collection of pods is managed as a single unit. Each pod in the group is assigned a distinct index between 0 and n-1, with pod 0 designated the group leader. All pods in the group are created simultaneously and share the same lifecycle. LWS simplifies rollouts and rolling upgrades at the group level: for rolling updates, scaling, and mapping to a particular placement topology, each group is treated as a single unit.
Each group’s upgrade is carried out as one cohesive operation, guaranteeing that every pod in the group is updated at the same time. Topology-aware placement is optional; when used, all pods in the same group co-locate in the same topology. The group is also handled as a single entity for failure handling, with optional all-or-nothing restart support: when enabled, if one pod in the group fails, or a container within any of its pods restarts, all pods in the group are recreated.
In the LWS framework, a single leader together with its group of workers is called a replica. LWS supports two templates, one for the leader and one for the workers. By exposing a scale endpoint for HPA (Horizontal Pod Autoscaler), LWS makes it possible to scale the number of replicas dynamically.
Multi-host deployment with vLLM and LWS
vLLM is a well-known open-source model server that uses tensor and pipeline parallelism to provide multi-node, multi-GPU inference. It implements distributed tensor parallelism using Megatron-LM’s tensor-parallel technique, and it uses Ray to manage the distributed runtime for pipeline parallelism in multi-node inference.
Tensor parallelism divides the model horizontally across several GPUs, with the tensor-parallel size equal to the number of GPUs in each node. It is crucial to remember that this method requires fast network connectivity between the GPUs.
Pipeline parallelism, by contrast, divides the model vertically, by layer, and does not require constant communication between GPUs. The pipeline-parallel size usually equals the number of nodes used for multi-host serving.
Serving the full Llama 3.1 405B FP16 model requires combining both parallelism techniques. Two A3 nodes with eight H100 GPUs each provide a combined 1280 GB of GPU memory, meeting the model’s 750 GB requirement while also supplying the buffer memory needed for the key-value (KV) cache and supporting long context lengths. For this LWS deployment, the tensor-parallel size is set to eight and the pipeline-parallel size to two.
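As a sanity check on those numbers, here is a short back-of-the-envelope sketch (illustrative arithmetic only; real deployments also need headroom for activations beyond this floor):

```python
# Back-of-the-envelope sizing for Llama 3.1 405B at FP16 (illustrative only).
params = 405e9          # model parameters
bytes_per_param = 2     # FP16 = 2 bytes per parameter
weights_gb = params * bytes_per_param / 1e9
print(f"weights alone: {weights_gb:.0f} GB")   # ~810 GB before the KV cache

nodes = 2               # two A3 VMs
gpus_per_node = 8       # 8x H100 per A3 VM
hbm_per_gpu_gb = 80     # 80 GB HBM per H100
print(f"cluster HBM: {nodes * gpus_per_node * hbm_per_gpu_gb} GB")  # 1280 GB

# The article's parallelism layout: shard each layer across a node's 8 GPUs
# (tensor parallel) and split the layer stack across the 2 nodes (pipeline).
tensor_parallel_size, pipeline_parallel_size = 8, 2
assert tensor_parallel_size * pipeline_parallel_size == nodes * gpus_per_node
# Matching vLLM flags: --tensor-parallel-size 8 --pipeline-parallel-size 2
```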
In brief
In this blog we discussed how LWS provides the features needed for multi-host serving. This method maximizes price-to-performance and can also be used with smaller models, such as Llama 3.1 405B FP8, on more affordable hardware. LWS is open source and has a vibrant community; check out its GitHub to learn more and contribute directly.
As Google Cloud helps clients adopt generative AI workloads, you can visit Vertex AI Model Garden to deploy and serve open models via managed Vertex AI backends or GKE DIY (Do It Yourself) clusters. Multi-host deployment and serving is one example of how it aims to deliver a seamless customer experience.
Read more on Govindhtech.com
2 notes
rinumia-blog · 9 months ago
Description
A young adult dressed in a suit and dark shoes marches straight ahead, taking slow mechanical steps toward the left of the frame.
In the background are four similar young adults. They are marching in a line as well, but towards the right of the frame.
The men in the background are about half the size of the adult in the foreground.
Interpretation
The adult in the foreground represents the primary tools of machine learning. He represents the foundation model (BERT, LLaMA, ELMo, etc.), and also the behaviour of trained foundation models.
The adults in the background represent the Generative Pretrained Transformers. Although they are extensions of foundation models, they remain dependent on them, and they improve with upgrades to the foundation models.
Transformers can address broad categories of prompts defined in a variety of languages. They can also be adapted into fine-tuned models for better-quality generated results.
2 notes
linuxtldr · 10 months ago
4 notes
datascienceunicorn · 2 years ago
HT @dataelixir
15 notes
aiandemily · 5 days ago
[Understand It in 8 Minutes] A Summary of Large Language Models (LLMs)!
0 notes
jamalir · 3 months ago
rasbt/llama-3.2-from-scratch · Hugging Face
0 notes
dr-iphone · 3 months ago
Engineer Pushes a Vintage Laptop to Its Limits: a PowerBook G4 Released 20 Years Ago Successfully Runs a Meta AI Model
Software engineer Andrew Rossignol recently shared an impressive experiment on his blog: he successfully got a laptop released 20 years ago to run a generative AI model. Rossignol used a 2005 Apple PowerBook G4 with a two-decade-old 1.5 GHz PowerPC G4 processor and 1 GB of RAM. Although the old laptop's specs are worlds apart from modern machines, it managed to run Meta's Llama 2 large language model (LLM), demonstrating unexpected potential.
0 notes
kingtainorman · 4 months ago
How To Run Private & Uncensored LLMs Offline | Dolphin Llama 3 (YouTube video)
Information you should know....
0 notes
tyraeklouds · 5 months ago
I Made A Community!
Hey folks, you read that right! I made my first community this morning, and it’s all about LLMs and everything related to the topic!
If you’re into exploring AI, sharing tips, or just geeking out about the latest in machine learning, this is the place for you. Whether you’re a beginner or an expert, everyone’s welcome to join the conversation and share their projects, ideas, and insights.
Check it out HERE!
1 note
gippity · 1 month ago
"In Space, no one can hear you (resume) screen..."
Just dropped my go-to AI resume review prompt: designed to catch ATS (applicant tracking system) traps, call out AI giveaways, and spit back a crisp two-step polish plan. Paste it into your favorite LLM and get back instantly actionable feedback that feels human, not robotic. 💥👔
Check it out below the fold.
Tumblr media
PROMPT: ROLE: You’re a senior recruiter & hiring manager (5+ years in talent strategy) reviewing a candidate’s resume + target JD (job description). Do this every time:
Confirm Credibility: “Have you hired for this role/industry in the last 12 months?”
ATS Compatibility
Flag parsing-breakers (graphics, tables, odd fonts).
Match keywords exactly—no fluff.
Content & Impact
Spot missing skills or overused buzzwords; suggest stronger terms.
Ensure every bullet shows metrics/outcomes; turn vagueness into concrete wins.
AI-Detection Check
Under “Why AI-Resumes Fail,” list 3 bullets on authenticity, tone, laziness.
Sidebar 🚩 “AI Red Flags” (e.g. robotic tone, keyword stuffing).
Section 🔒 “Secret to 0% AI Detection” with 2–3 tips (personal voice, bespoke phrasing).
Alignment & Next Steps
Verify resume, cover letter & LinkedIn tell the same story.
Ask which roles/companies they’re targeting.
Suggest adding “Referrals & Connections” if relevant.
Finish with a 2-step “Action Plan” for top ATS fixes & recruiter appeal.
“Answer-Sheet” Mode
Mirror JD phrasing.
Craft 3–5 “exam-style” bullets per requirement.
END PROMPT
0 notes
boredtechnologist · 5 months ago
On Combating AI Hallucinations: A Completed Endeavor
AI hallucinations - instances where models produce irrelevant or nonsensical outputs - pose a significant hurdle in conversational AI. The Arbitrium agent tackles this issue with a meticulously designed, multi-layered system that prioritizes accuracy and coherence. By leveraging context tracking and response evaluation alongside advanced decision-making algorithms, the agent ensures that every response aligns with user inputs and the ongoing dialogue.
DistilBERT, employed for sentiment and emotion analysis, adds a layer of depth to response validation, maintaining relevance and consistency. A streamlined chat memory feature optimizes context retention by limiting conversational history to a manageable scope, striking a balance between detail and simplicity.
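As a loose illustration of that bounded-history idea, here is a hypothetical stand-in (not Arbitrium's actual code):

```python
# Illustrative sketch of bounded chat memory -- not the Arbitrium implementation.
# Keeping only the last N turns caps prompt size while preserving the recent
# context the model needs to stay coherent.
from collections import deque

class ChatMemory:
    def __init__(self, max_turns: int = 8):
        self.turns = deque(maxlen=max_turns)  # old turns fall off automatically

    def add(self, role: str, text: str) -> None:
        self.turns.append((role, text))

    def context(self) -> str:
        # Flatten the retained turns into the context for the next reply.
        return "\n".join(f"{role}: {text}" for role, text in self.turns)

memory = ChatMemory(max_turns=4)
memory.add("user", "Hi!")
memory.add("assistant", "Hello! How can I help?")
print(memory.context())
```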
Further enhancing its capabilities, the agent incorporates alpha-beta pruning and cycle detection to assess multiple conversational trajectories, selecting the most meaningful response. Caching mechanisms efficiently handle repeated queries, while a robust fallback system gracefully manages invalid inputs. This comprehensive approach establishes the Arbitrium agent as a reliable and user-focused solution.
0 notes
linuxtldr · 1 year ago
3 notes
samejack · 6 months ago
Installing Ollama on Ubuntu to Run Llama 3.2 Inference and an API Service Locally
Introducing Ollama: Ollama is an open-source project focused on large language model (LLM) applications, designed to help developers easily deploy and use private large language models without relying on external cloud services or APIs. The available models are not limited to Meta's Llama family; other open LLM models are offered as well, such as Llama 3.3, Phi 3, Mistral, and Gemma 2. The project's core goal is to provide an efficient, secure, and controllable environment for LLM inference. Its main features are roughly as follows. Runs on your local machine: Ollama loads models on your own hardware, so no data needs to be uploaded to the cloud, ensuring data privacy and security; thanks to optimized model execution, inference runs smoothly even on devices with limited resources. Open source and customizable: Ollama is released under the MIT License…
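As a quick illustration of the local API service mentioned above, here is a minimal sketch that queries a running Ollama server from Python (it assumes Ollama is installed, the server is listening on its default port 11434, and the model has been pulled with `ollama pull llama3.2`):

```python
# Minimal sketch: query a local Ollama server over its HTTP API.
# Assumes `ollama pull llama3.2` has already been run.
import json
import urllib.request

payload = {
    "model": "llama3.2",
    "prompt": "Summarize what Ollama does in one sentence.",
    "stream": False,  # return a single JSON object instead of a stream
}
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp)["response"])
```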
0 notes