#ai inference | Explore Tumblr posts and blogs

jcmarchi · 11 days ago

Text

Apple opens core AI model to developers amid measured WWDC strategy

New Post has been published on https://thedigitalinsider.com/apple-opens-core-ai-model-to-developers-amid-measured-wwdc-strategy/

Apple opens core AI model to developers amid measured WWDC strategy

Apple has opened its foundational AI model to third-party developers for the first time, allowing direct access to the on-device large language model that powers Apple Intelligence. The move, announced at this week’s Worldwide Developers Conference, represents a significant shift in Apple’s traditionally closed ecosystem approach to Apple AI developer tools.

The newly accessible three-billion parameter model operates entirely on-device, reflecting Apple’s privacy-first philosophy while imposing technical limitations compared to cloud-based alternatives from competitors.

“We’re opening up access for any app to tap directly into the on-device, large language model at the core of Apple,” said Craig Federighi, Apple’s software chief, during the conference presentation, according to Reuters.

The foundation model framework enables direct access

The new Foundation Models framework allows developers to integrate Apple Intelligence features with just three lines of Swift code, providing privacy-focused AI inference at no cost. The framework includes guided generation and tool-calling capabilities built-in.

Automattic has already begun leveraging the framework in its Day One journaling app. “The Foundation Model framework has helped us rethink what’s possible with journaling,” Paul Mayne, head of Day One at Automattic said. “Now we can bring intelligence and privacy together in ways that deeply respect our users.”

Xcode 26 integrates AI assistance

Xcode 26 now embeds large language models directly into the coding experience. Developers can use ChatGPT built into Xcode without creating an account, connect API keys from other providers, or run local models on Apple silicon Macs.

The Coding Tools feature assists in the development, offering suggested actions like generating previews, creating playgrounds, or fixing code issues within the development environment.

Visual intelligence opens to third parties

Apple extended Visual Intelligence capabilities to third-party developers through enhanced App Intents. Etsy is exploring these features for product discovery, with CTO Rafe Colburn noting: “The ability to meet shoppers right on their iPhone with visual intelligence is a meaningful unlock.”

The integration allows apps to provide search results within Apple’s visual intelligence experience, potentially driving direct engagement from camera-based searches.

Market and analyst scepticism

Apple’s stock closed 1.2% lower following the conference, with analysts questioning the incremental nature of announcements. “In a moment in which the market questions Apple’s ability to take any sort of lead in the AI space, the announced features felt incremental at best,” said Thomas Monteiro, senior analyst at Investing.com.

The measured approach contrasts sharply with Apple’s more ambitious AI visions presented last year. Bob O’Donnell, chief analyst at Technalysis Research, observed: “They went from being visionary and talking about agents before a lot of other people did, to now realizing that, at the end of the day, what they need to do is deliver on what they presented a year ago.”

Technical limitations and strategic focus

The three-billion parameter on-device model represents both Apple’s commitment to privacy and its technical constraints. Unlike cloud-based models that can handle complex tasks, Apple’s on-device approach limits functionality while ensuring user data remains local.

Ben Bajarin, CEO of analyst firm Creative Strategies, noted Apple’s behind-the-scenes focus: “You could see Apple’s priority is what they’re doing on the back-end, instead of what they’re doing at the front-end, which most people don’t care about yet.”

Apple AI developer tools will be available for testing through the Apple Developer Program starting immediately, with a public beta expected next month. The company’s measured approach may disappoint those expecting revolutionary AI capabilities, but it maintains Apple’s traditional emphasis on privacy and incremental innovation over flashy demonstrations.

As the AI race intensifies, Apple’s strategy of opening its foundational tools to developers while maintaining modest consumer-facing promises suggests a company more focused on building sustainable AI infrastructure than capturing headlines with ambitious claims.

(Photo by Apple )

See also: Apple AI stresses privacy with synthetic and anonymised data

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with other leading events including Intelligent Automation Conference, BlockX, Digital Transformation Week, and Cyber Security & Cloud Expo.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

0 notes

sharon-ai · 16 days ago

Text

Rent Cloud GPUs | Affordable GPU Cloud Rental

Need powerful GPU compute without the high upfront investment? Sharon AI makes it simple to rent cloud GPUs on-demand with transparent, competitive pricing. Our platform offers affordable GPU cloud rental options designed for AI training, deep learning, scientific computing, and more.

High-Performance Compute at the Right Price

Whether you're an individual developer or managing enterprise-scale workloads, our GPU rental service delivers:

On-Demand Access: Instantly deploy top-tier GPUs—like NVIDIA A100, H100, and more—directly from your browser or CLI.

Flexible Pricing: Pay hourly or monthly with no long-term contracts. Only pay for what you use.

Global Infrastructure: Run jobs closer to your data with a globally distributed network optimized for low latency.

Easy Scaling: Start small and scale compute resources as your project grows—perfect for startups and large teams alike.

Perfect for AI, ML & Research Workloads

Our affordable GPU cloud rental solutions are built for performance-critical use cases including machine learning training, inference, video rendering, and simulation. With full control over your environment and access to high-performance GPUs, you get the power you need—when you need it.

Skip the complexity of hardware procurement. Rent cloud GPUs with Sharon AI and accelerate your compute workflows at a fraction of traditional infrastructure costs.

👉 Check pricing and get started now

#cloudstorage #cloudgpu #gpu cloud #sharon ai #ai inference #sharonai

0 notes

impact-newswire · 3 months ago

Text

Oracle and NVIDIA Collaborate to Help Enterprises Accelerate Agentic AI Inference

Oracle Database and NVIDIA AI Integrations Make It Easier for Enterprises to Quickly and Easily Harness Agentic AI Press Release – Austin, Texas and San Jose, Calif.—GTC—March 18, 2025 – Oracle and NVIDIA today announced a first-of-its-kind integration between NVIDIA accelerated computing and inference software with Oracle’s AI infrastructure and generative AI services to help organizations…

#Agentic #AI #AI Agents #AI Cloud #AI Enterprise #AI Inference #AI infrastructure #AI Training #Artificial Intelligence #Cloud

0 notes

dye-it-rouge-et-noir · 15 days ago

Text

Cheng Xin!

#three body problem #3 body problem #三体 #cheng xin #程心 #my art #cheng xin!! got the time to do digital art of her since i do draw her in my note margins or my little sketch book #i want to make an ai aa and yun tianming design later definitely!#this is my my personal design for her! i felt like a braid would be a good fit for her due to a superficial resemblance to dna #she's grown quite a bit on me and one of the things i've been thinking about recently is time in relation to her character #especially the eternalist perspective from the philosophy of time which is the idea that past-present-future all exist #by contrast to presentism which you can probably infer centers on the present instead #some aspects of eternalism definitely can be applied to cheng xin's characterization and the trilogy as a whole #(especially with death's end!)#i think about the weight of all history when it came to the swordholder scene a lot regarding this

10 notes · View notes

son1c · 2 years ago

Note

It’s so big brained of you to come up with a reason why some of eggmans bots went rogue while the others didn’t it’s just *chefs kiss* world building my beloved

i <3 inferences

#anonymous #to me it just makes sense. eggman is too good at building ai. they all have personality and soul #so he has to constantly crush them to get them to do what he wants #and in the case of specifically mecha sonic she'd been abandoned for a really long time. time to grow and #change was allowed to happen #metal doesn't get that luxury in canon because he's constantly being patched and rebooted...#so yeah i <3 inferences and i <3 text evidence

40 notes · View notes

frank-olivier · 8 months ago

Text

Bayesian Active Exploration: A New Frontier in Artificial Intelligence

The field of artificial intelligence has seen tremendous growth and advancements in recent years, with various techniques and paradigms emerging to tackle complex problems in the field of machine learning, computer vision, and natural language processing. Two of these concepts that have attracted a lot of attention are active inference and Bayesian mechanics. Although both techniques have been researched separately, their synergy has the potential to revolutionize AI by creating more efficient, accurate, and effective systems.

Traditional machine learning algorithms rely on a passive approach, where the system receives data and updates its parameters without actively influencing the data collection process. However, this approach can have limitations, especially in complex and dynamic environments. Active interference, on the other hand, allows AI systems to take an active role in selecting the most informative data points or actions to collect more relevant information. In this way, active inference allows systems to adapt to changing environments, reducing the need for labeled data and improving the efficiency of learning and decision-making.

One of the first milestones in active inference was the development of the "query by committee" algorithm by Freund et al. in 1997. This algorithm used a committee of models to determine the most meaningful data points to capture, laying the foundation for future active learning techniques. Another important milestone was the introduction of "uncertainty sampling" by Lewis and Gale in 1994, which selected data points with the highest uncertainty or ambiguity to capture more information.

Bayesian mechanics, on the other hand, provides a probabilistic framework for reasoning and decision-making under uncertainty. By modeling complex systems using probability distributions, Bayesian mechanics enables AI systems to quantify uncertainty and ambiguity, thereby making more informed decisions when faced with incomplete or noisy data. Bayesian inference, the process of updating the prior distribution using new data, is a powerful tool for learning and decision-making.

One of the first milestones in Bayesian mechanics was the development of Bayes' theorem by Thomas Bayes in 1763. This theorem provided a mathematical framework for updating the probability of a hypothesis based on new evidence. Another important milestone was the introduction of Bayesian networks by Pearl in 1988, which provided a structured approach to modeling complex systems using probability distributions.

While active inference and Bayesian mechanics each have their strengths, combining them has the potential to create a new generation of AI systems that can actively collect informative data and update their probabilistic models to make more informed decisions. The combination of active inference and Bayesian mechanics has numerous applications in AI, including robotics, computer vision, and natural language processing. In robotics, for example, active inference can be used to actively explore the environment, collect more informative data, and improve navigation and decision-making. In computer vision, active inference can be used to actively select the most informative images or viewpoints, improving object recognition or scene understanding.

Timeline:

1763: Bayes' theorem

1988: Bayesian networks

1994: Uncertainty Sampling

1997: Query by Committee algorithm

2017: Deep Bayesian Active Learning

2019: Bayesian Active Exploration

2020: Active Bayesian Inference for Deep Learning

2020: Bayesian Active Learning for Computer Vision

The synergy of active inference and Bayesian mechanics is expected to play a crucial role in shaping the next generation of AI systems. Some possible future developments in this area include:

- Combining active inference and Bayesian mechanics with other AI techniques, such as reinforcement learning and transfer learning, to create more powerful and flexible AI systems.

- Applying the synergy of active inference and Bayesian mechanics to new areas, such as healthcare, finance, and education, to improve decision-making and outcomes.

- Developing new algorithms and techniques that integrate active inference and Bayesian mechanics, such as Bayesian active learning for deep learning and Bayesian active exploration for robotics.

Dr. Sanjeev Namjosh: The Hidden Math Behind All Living Systems - On Active Inference, the Free Energy Principle, and Bayesian Mechanics (Machine Learning Street Talk, October 2024)

youtube

Saturday, October 26, 2024

#artificial intelligence #active learning #bayesian mechanics #machine learning #deep learning #robotics #computer vision #natural language processing #uncertainty quantification #decision making #probabilistic modeling #bayesian inference #active interference #ai research #intelligent systems #interview #ai assisted writing #machine art #Youtube

6 notes · View notes

vox-off · 2 months ago

Text

i plan to spend today eating popcorn and watching redditors fight over whether or not the oblivion remake is a) even real and b) being shadow dropped today, which it apparently is supposed to be

RIP in pieces to the pope or whatever i guess but this is my drama right now

#voxbox #i have less than a passing interest in the oblivion remake #love oblivion but if the leaked info is to be believed it's releasing on limited platforms - none of which i have or am inclined to acquire #just to play one game #unlike one fella who bought a whole ps5 just for this game that still has had no official acknowledgement from bethesda #i thought tumblr was queen of extreme inference from scraps but boy howdy watching reddit declaring the most emphemeral things as #solid evidence of the remake's existence has been a fuckin gas #a 500+ comment thread arguing about whether or not AI bots make grammatical errors - putting a space before a period - and such #it's been wild. who needs sportsball when you can watch the elder scrolls subreddit

3 notes · View notes

fossore · 6 months ago

Text

My cousin just started his CS PhD on something like the applications of machine learning for processing the data cities produce so he had all sorts of fun tidbits about people’s driving habits.

Did you know that people drive faster in curved roads than straight roads? And there’s a contingent that thinks there should be no speed limits?

#his advisor is working on a project to map sidewalks? with assessments on quality (like cracks etc)#generative ai = bad #but stuff like this is cool and good #inference is often useful in non-creative pursuits

2 notes · View notes

snarwin · 1 year ago

Text

Perfectly encapsulates the AI art discourse that the complaints on the Coño Culo post are "this is theft because it was made with AI" and not "this is theft because Goku is a copyrighted character."

#ai discourse #We can forgive stealing from Akira Toriyama and Shonen Jump magazine #but we draw the line at statistical inference

6 notes · View notes

peterbordes · 11 months ago

Photo

(via AI inference chip startup Groq closes $640M at $2.8B valuation to meet next-gen LPUs demand)

Groq, a leader in fast AI inference, has secured a $640M Series D round at a valuation of $2.8B. The round was led by funds and accounts managed by BlackRock Private Equity Partners with participation from both existing and new investors including Neuberger Berman, Type One Ventures, and strategic investors including Cisco Investments, Global Brain’s KDDI Open Innovation Fund III, and Samsung Catalyst Fund. The unique, vertically integrated Groq AI inference platform has generated skyrocketing demand from developers seeking exceptional speed.

#Groq #ai #cloud computing #inference #lambda #gpu #nvidia #chips #data center

1 note · View note

sharon-ai · 3 months ago

Text

Historic $500 Billion AI Infrastructure Investment Announced in the US

Is the US Spearheading a $500 Billion AI Infrastructure Boom?

In a groundbreaking move to solidify the United States as a global leader in artificial intelligence, a massive $500 billion private sector investment has been unveiled. Dubbed "Stargate," this ambitious initiative unites industry giants OpenAI, SoftBank, and Oracle to develop state-of-the-art AI infrastructure and data centers capable of supporting next-generation technologies.

Why Is Texas Emerging as the AI Capital of the US?

Texas has become the epicenter of this transformative project, with 10 data centers already under construction and more in the pipeline across the nation. Thanks to its strong infrastructure, business-friendly policies, and ample resources, the Lone Star State continues to attract high-performance computing and AI investments, reinforcing its status as a hub for technological innovation.

How Will This Investment Influence Jobs and the Economy?

The Stargate initiative is expected to generate over 100,000 jobs across the US, significantly bolstering workforce development in the AI and data center industries. From construction to ongoing operations, this investment is set to fuel economic expansion while strengthening America’s competitive edge in the rapidly evolving AI landscape.

AI’s Role in Revolutionizing Key Industries

Project leaders envision AI-driveHistoric $500 Billion AI Infrastructure Investment Announced in the US n breakthroughs across multiple sectors, with a particular focus on healthcare. Cutting-edge AI applications could redefine diagnostics and treatment, leading to unprecedented advancements in medical care. The initiative aligns with Sharon AI’s mission to drive progress in AI and high-performance computing, setting the stage for transformative technological growth.

What Opportunities Exist in AI Infrastructure Development?

For stakeholders in AI and data centerHistoric $500 Billion AI Infrastructure Investment Announced in the US ecosystems, this investment presents an array of opportunities. The emphasis on scalable, sustainable, and high-performance infrastructure aligns with Sharon AI’s vision of enabling AI-driven workloads through innovative energy solutions and advanced computing capabilities.

Sharon AI remains committed to supporting initiatives that propel AI growth and infrastructure development, particularly in forward-thinking regions like Texas. With monumental projects like Stargate shaping the future, the AI and data center landscape is on the brink of a revolutionary transformation.

#cloudstorage #sharon ai #ai inference #techinnovation #infrastructure asset management #ai

0 notes

clouds-of-wings · 1 year ago

Text

I really recommend familiarizing yourself with the way the eBay AI talks in your language if you buy there a lot. It's very useful if you can tell if a product description is generated by a bot because it often makes things up or describes things in exaggeratedly positive terms. Also a bot description is based only on the title and the product info that is already visible above the actual description (and what it thinks it can infer from these), so you know right away the description won't offer any new information and you don't have to read it. I recognize the bot by its writing style now and it helps.

You can familiarize yourself by pretending you want to sell something, writing a fake title and so on, and then just playing around with the AI.

#ebay #i really can't wait for this hype to be over #and we can start thinking rationally about what ai can actually be used for #instead of just throwing ai at everything in panic #bc a lot of people are so desperatedly afraid of missing out on a trend #that they will ruin their product in order to not seem old-fashioned #without actually understanding what they're doing #using ai product descriptions on ebay doesn't even save ebay any money #it does not communicate any additional info #and it makes product descriptions less accurate and less informative #also you can usually infer stuff about the seller from how they write their descriptions #and that goes away if they use ai #it's a clear net negative

2 notes · View notes

gigglepuffpixie · 9 months ago

Text

please stop playing with that salami

it's not that interesting

#ai is salami #Systematic Approaches to Learning Algorithms and Machine Inferences #ai #chatgpt

144K notes · View notes

themorningnewsinformer · 19 hours ago

Text

Gigabyte Aorus Master 16 AI Laptop With Intel Ultra 9 Launched in India

Introduction Gigabyte has officially unveiled its latest AI-powered gaming laptop, the Gigabyte Aorus Master 16, in India. Packed with the latest Intel Core Ultra 9 275HX CPU and the powerful Nvidia GeForce RTX 5080 Laptop GPU based on Blackwell architecture, this AI PC is designed to deliver unmatched gaming and AI computing performance. Gigabyte Aorus Master 16 Price and Availability in…

#AI agent laptop #AI gaming laptop #AI PC India #AI privacy #Amazon Gigabyte #Dolby Vision #Flipkart Gigabyte #Gigabyte Aorus Master 16 #GiMate AI #Intel Ultra 9 #LLM inference laptop #Nvidia RTX 5080 #OLED laptop #Windforce cooling #WQXGA display

0 notes

semimediapress · 12 days ago

Text

AMD acquires engineering team from Untether AI to strengthen AI chip capabilities

June 9, 2025 /SemiMedia/ — AMD has acquired the engineering team of Untether AI, a Toronto-based startup known for developing power-efficient AI inference accelerators for edge and data center applications. The deal marks another strategic step by AMD to expand its AI computing capabilities and challenge NVIDIA’s dominance in the field. In a statement, AMD said the newly integrated team will…

#AI inference accelerator #AMD AI chips #data center GPU #edge computing chips #electronic components news #Electronic components supplier #Electronic parts supplier #power-efficient AI #Untether AI acquisition

0 notes

govindhtech · 2 months ago

Text

AMD ROCm 6.4: Scalable Inference and Smarter AI Workflows

AMD ROCm 6.4: Plug-and-Play Containers, Modular Deployment, and Revolutionary Inference for Scalable AI on AMD Instinct GPUs

Modern AI workloads are larger and more sophisticated, increasing deployment simplicity and performance needs. AMD ROCm 6.4 advances AI and HPC development on AMD Instinct GPUs.

With growing support for leading AI frameworks, optimised containers, and modular infrastructure tools, ROCm software helps customers manage their AI infrastructure, develop faster, and work smarter.

Whether you're managing massive GPU clusters, training multi-billion parameter models, or spreading inference over multi-node clusters, AMD ROCm 6.4 delivers great performance with AMD Instinct GPUs.

This post presents five major AMD ROCm 6.4 improvements that directly address infrastructure teams, model developers, and AI researchers' concerns to enable AI development fast, straightforward, and scalable.

ROCm Training and Inference Containers: Instinct GPU Plug-and-Play AI

Setting up and maintaining ideal training and inference settings takes time, is error-prone, and delays iteration cycles. AMD ROCm 6.4 provides a large set of pre-optimized, ready-to-run training and inference containers for AMD Instinct GPUs.

For low-latency LLM inference, vLLM (Inference Container) supports plug-and-play open models like Gemma 3 (day-0), Llama, Mistral, Cohere, and others.

FP8 support, DeepGEMM, and simultaneous multi-head attention give SGLang (Inference Container) exceptional throughput and efficiency for DeepSeek R1 and agentic processes.

PyTorch (Training Container) makes LLM training on AMD Instinct MI300X GPUs simpler with performance-tuned variations that enable advanced attention strategies. Optimised for FLUX, Llama 2 (70B), and 3.1 (8B, 70B).1-dev.

Training Container Megatron-LM This ROCm-tuned Megatron-LM fork can train large-scale language models like Llama 3.1, Llama 2, and DeepSeek-V2-Lite.

These containers allow AI researchers faster access to turnkey settings for experimentation and model evaluation. Model developers may use pre-tuned support for the most advanced LLMs, including as DeepSeek, Gemma 3, and Llama 3.1, without spending time configuring. These containers also simplify infrastructure teams' maintenance and scale-out by deploying uniformly across development, testing, and production environments.

PyTorch for ROCm Improves: Faster Focus and Training

As training large language models (LLMs) pushes computing and memory limits, ineffective attention strategies can impede iteration and increase infrastructure costs. AMD ROCm 6.4 improves Flex Attention, TopK, and Scaled Dot-Product Attention in PyTorch.

Flex Attention: Outperforms ROCm 6.3 in LLM workloads that need advanced attention algorithms, reducing memory overhead and training time.

TopK: TopK processes are now three times faster, improving inference reaction times without compromising output quality (source).

SDPA: expanded context, smoother inference.

These improvements speed up training, reduce memory overhead, and optimise hardware consumption. As a consequence, model developers can improve bigger models faster, AI researchers can do more tests, and Instinct GPU customers see shorter time-to-train and higher infrastructure ROI.

Upgrades are pre-installed in the ROCm PyTorch container.

AMD Instinct GPU Next-Gen Inference Performance with vLLM and SGLang

Low-latency, high-throughput inference for big language models is difficult, especially when new models develop and deployment pace increases. ROCm 6.4 addresses this problem with AMD Instinct GPU-optimized vLLM and SGLang versions. Due to its strong support for popular models like Grok, DeepSeek R1, Gemma 3, and Llama 3.1 (8B, 70B, and 405B), model developers can deploy real-world inference pipelines with minimal modification or rewrite. AI researchers can get faster time-to-results on large-scale benchmarks. Infrastructure teams can ensure scaled performance, consistency, and reliability with stable, production-ready containers that get weekly updates.

Set an Instinct MI300X throughput record using SGLang and DeepSeek R1.

Day-0 compatibility for Instinct GPU deployment with vLLM Gemma 3.

These technologies create a full-stack inference environment with weekly and bi-weekly development and stable container upgrades.

Smooth Instinct GPU Cluster Management by AMD GPU Operator

Scaling and managing GPU workloads across Kubernetes clusters can cause manual driver updates, operational disruptions, and limited GPU health visibility, which can reduce performance and reliability. With AMD ROCm 6.4, the AMD GPU Operator automates GPU scheduling, driver lifecycle management, and real-time telemetry to optimise cluster operations. This allows AI and HPC administrators to confidently deploy AMD Instinct GPUs in air-gapped and secure environments with full observability, infrastructure teams to upgrade with minimal disruption, and Instinct customers to benefit from increased uptime, lower operational risk, and stronger AI infrastructure.

Some new features are:

Automatic cordon, drain, and reboot for rolling updates.

More support for Ubuntu 22.04/24.04 and Red Hat OpenShift 4.16–4.17 ensures compatibility with modern cloud and enterprise settings.

Device Metrics Exporter for real-time Prometheus health measurements.

The New Instinct GPU Driver Modularises Software

Coupled driver stacks hinder upgrade processes, reduce interoperability, and increase maintenance risk. AMD ROCm 6.4 introduces the modular Instinct GPU Driver, which isolates the kernel driver from ROCm user space.

main benefits,

Infrastructure teams may now upgrade ROCm libraries and drivers separately.

Extended compatibility to 12 months (from 6 months in earlier iterations)

More flexibility in installing ISV software, bare metal, and containers

This simplifies fleet-wide upgrades and reduces the risk of breaking changes, which benefits cloud providers, government agencies, and enterprises with strict SLAs.

AITER for Accelerated Inference Bonus Point

AITER, a high-performance inference library with drop-in, pre-optimized kernels, removes tedious tuning in AMD ROCm 6.4.

Gives:

It can decode 17 times quicker.

14X multi-head focus improvements

Double LLM inference throughput

#technology #technews #govindhtech #news #technologynews #AMD Instinct #AMD ROCm 6.4 #AMD ROCm #ROCm 6.4 #ROCm #Plug-and-Play AI on Instinct GPUs #ROCm Containers for Training and Inference #artificial intelligence

0 notes