Llama 4 Scout and GPT-4.1-nano Models In Azure AI Foundry

Microsoft Azure is offering new fine-tuning models and approaches in Azure AI Foundry to help organisations build domain-specific AI systems. Supervised Fine-Tuning (SFT) is now available for the GPT-4.1-nano and Llama 4 Scout models, and Reinforcement Fine-Tuning (RFT) is coming soon to o4-mini.
RFT with o4-mini
RFT goes beyond Azure AI Foundry's standard model fine-tuning by adding a layer of control that aligns model behaviour with complex business logic. It uses a feedback loop to apply reinforcement learning during training: developers supply a task-specific grader that scores model outputs against specified criteria, and the model is optimised against this reward signal so that its responses gradually converge on the expected behaviour. Where supervised fine-tuning teaches a model to replicate example outputs, RFT with o4-mini teaches it how to solve problems.
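To make that feedback loop concrete, here is a minimal conceptual sketch in Python: candidate answers are sampled, scored between 0 and 1 by a task-specific grader, and the scores are fed back as the reward signal. The names grade, sample, and reinforce are illustrative placeholders, not the Azure or OpenAI API.

```python
def grade(prompt: str, answer: str) -> float:
    """Task-specific grader: returns a reward between 0 and 1 (toy criterion)."""
    return 1.0 if "indemnity" in answer.lower() else 0.0

def rft_step(model, prompts):
    """One reinforcement step: sample, grade, then update toward higher reward."""
    rewards = []
    for prompt in prompts:
        answer = model.sample(prompt)          # generate a candidate response (placeholder method)
        rewards.append(grade(prompt, answer))  # grader supplies the reward signal
    model.reinforce(prompts, rewards)          # policy update against the reward (placeholder method)
    return sum(rewards) / len(rewards)         # average reward to track progress
```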
Purpose: RFT improves model decision-making in dynamic or high-stakes scenarios, bringing models closer to optimal behaviour for real-world applications by teaching them not only what to generate but why.
The Model: o4-mini will soon support RFT, making it the first compact reasoning model that can be tuned this way. It is one of OpenAI's latest multitasking models and excels at structured reasoning and chain-of-thought prompting.
RFT with o4-mini is expected to expand use cases that require contextual awareness, adaptive reasoning, and domain-specific logic while maintaining fast inference performance.
It gives developers a lightweight but powerful platform for precisely tuning high-stakes, domain-specific reasoning tasks while preserving the computational efficiency and speed needed for real-time applications. RFT-tuned models can also improve error correction and data efficiency on new prompts, requiring fewer examples to match supervised techniques.
RFT is the best fit where domain-specific behaviour, flexibility, and iterative learning matter. Consider it for Domain-Specific Operational Standards, where internal procedures deviate from industry norms; Custom Rule Implementation, where decision logic is highly specific and cannot easily be captured through static prompts; or High Decision-Making Complexity, where results depend on navigating many subcases or dynamically weighing multiple inputs.
The legal software startup DraftWise used RFT to improve its reasoning models for contract drafting and review, raising search result quality by 30%. Contoso Wellness is a fictitious example of how RFT can adapt to business rules around client engagement, such as identifying the optimal client interactions based on subtle trends.
OpenAI listed early adopters such as Accordance AI (which improved tax analysis performance by 39%), Ambience Healthcare (which improved medical coding), Harvey (which improved legal document citation extraction), Runloop (which produced valid Stripe API snippets), Milo (which improved output quality on complex calendar prompts), and SafetyKit (which improved content moderation accuracy). Partners such as ChipStack and Thomson Reuters also reported performance improvements.
RFT Usage: First, design a Python grading function that assigns a score between 0 and 1; next, create a high-quality prompt dataset; then start a training job via the API or dashboard; finally, analyse the results and iterate. A sketch of the grader and prompt dataset follows.
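Below is a minimal sketch of the two inputs those steps call for: a Python grader returning a score between 0 and 1, and a JSONL prompt dataset. The field names (output_text, reference_answer, messages) and the grading criterion are illustrative assumptions, not the exact job schema.

```python
import json
import re

def grade(sample: dict, item: dict) -> float:
    """Score one model output against the reference answer for its prompt item."""
    predicted = sample.get("output_text", "")
    expected = item.get("reference_answer", "")
    if predicted.strip().lower() == expected.strip().lower():
        return 1.0  # full credit for an exact match
    # Partial credit: fraction of reference words that also appear in the prediction.
    overlap = len(set(re.findall(r"\w+", predicted.lower()))
                  & set(re.findall(r"\w+", expected.lower())))
    return min(1.0, overlap / max(1, len(expected.split())))

# Write a small, high-quality prompt dataset in JSONL form.
items = [
    {"messages": [{"role": "user",
                   "content": "Classify this clause as indemnity or non-indemnity: ..."}],
     "reference_answer": "indemnity"},
]
with open("rft_prompts.jsonl", "w") as f:
    for item in items:
        f.write(json.dumps(item) + "\n")
```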
Pricing and Availability: Azure AI Foundry will offer RFT with o4-mini in Sweden Central and East US 2. Verified organisations can also use o4-mini for RFT through the OpenAI API. Training costs are based on time spent actively training, specifically $100 per hour of core training, and organisations that share their datasets for research receive a 50% discount on training costs.
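As a quick worked example of that pricing, assuming the quoted $100-per-hour rate and a hypothetical six-hour training job:

```python
RATE_PER_HOUR = 100.0           # USD per hour of core training time
hours_of_active_training = 6    # hypothetical job duration
base_cost = RATE_PER_HOUR * hours_of_active_training   # $600
shared_data_cost = base_cost * 0.5                     # $300 with the 50% data-sharing discount
print(base_cost, shared_data_cost)
```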
SFT for GPT-4.1-nano
SFT is the traditional fine-tuning approach now available for GPT-4.1-nano. It lets you tailor the model with company-specific language, procedures, and structured outputs: developers supply labelled datasets to train the nano model for specific use cases, as sketched below.
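A minimal sketch of what such a labelled dataset could look like follows, using chat-style records that pair a company-specific prompt with the desired structured output. The exact fields Azure AI Foundry expects are an assumption here, as are the Contoso example values.

```python
import json

records = [
    {"messages": [
        {"role": "system", "content": "You are Contoso's support assistant. Reply in JSON."},
        {"role": "user", "content": "Where is my order #1234?"},
        {"role": "assistant", "content": '{"intent": "order_status", "order_id": "1234"}'},
    ]},
]
with open("sft_train.jsonl", "w") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")  # one labelled training example per line
```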
The Model: GPT-4.1-nano supports SFT. It is a compact but powerful foundation model built for high-throughput and cost-sensitive workloads, and it is the company's fastest and cheapest model. It benchmarks well and provides a one-million-token context window.
Fine-tuning GPT-4.1-nano enables Precision at Scale (tailored responses while maintaining speed and efficiency), Enterprise-Grade Output (alignment with business processes and tone of voice), and a lightweight, deployable model (ideal for latency- and cost-sensitive scenarios). With faster inference and lower computational costs than larger models, it offers unmatched speed and affordability.
It is best suited to internal knowledge assistants that follow business rules and to customer support automation that handles thousands of tickets per hour. It also enables domain-specific categorisation, extraction, and conversational agents.
GPT-4.1-nano is an ideal distillation target thanks to its compactness, speed, and capability: larger models such as GPT-4.1 or o4 can generate the training data used to make 4.1-nano smarter, as sketched below.
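Here is a minimal sketch of that distillation idea: a larger model answers your prompts, and the answers are collected as SFT data for the nano model. It uses the OpenAI Python SDK's chat completions call; the teacher model name and the prompt are placeholder assumptions.

```python
import json
from openai import OpenAI

client = OpenAI()  # assumes an API key is configured in the environment
prompts = ["Summarise this support ticket in one sentence: ..."]

with open("distilled_train.jsonl", "w") as f:
    for prompt in prompts:
        teacher = client.chat.completions.create(
            model="gpt-4.1",  # assumed teacher model name
            messages=[{"role": "user", "content": prompt}],
        )
        answer = teacher.choices[0].message.content
        record = {"messages": [{"role": "user", "content": prompt},
                               {"role": "assistant", "content": answer}]}
        f.write(json.dumps(record) + "\n")  # labelled example for fine-tuning 4.1-nano
```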
Accessibility: Supervised Fine-Tuning of GPT-4.1-nano is now available in Azure AI Foundry in Sweden Central and North Central US. GPT-4.1 mini supports SFT via the OpenAI API on all paid API tiers, and upcoming GitHub-Azure AI Foundry connectors will also support this fine-tuning approach.
Llama 4 Scout Model
The Model: Meta's Llama 4 Scout now supports fine-tuning. It is a cutting-edge model with 17 billion active parameters, offers the industry's widest context window at 10M tokens, and can run inference on a single H100 GPU. It is a top-tier open-source model that outperforms previous Llama models.
Accessibility: Azure AI Foundry managed compute now supports GPU-based Llama 4 fine-tuning and inference. The model is available in Azure Machine Learning and the Azure AI Foundry model catalogue, and availability through these components allows more hyperparameter customisation than serverless deployment; a generic fine-tuning sketch follows.
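To illustrate what GPU-based fine-tuning with custom hyperparameters can look like, here is a minimal, generic LoRA sketch using Hugging Face Transformers, PEFT, and Datasets. It is not the Azure AI Foundry managed-compute pipeline; the model id, dataset, and hyperparameters are assumptions, so substitute a checkpoint you have access to.

```python
import torch
from datasets import Dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

MODEL_ID = "meta-llama/Llama-4-Scout-17B-16E-Instruct"  # assumed checkpoint id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # ensure a pad token exists
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)

# Attach lightweight LoRA adapters instead of updating all active parameters.
lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora)

# Tiny illustrative dataset of domain-specific text.
examples = [{"text": "Q: What is our refund window?\nA: 30 days from delivery."}]
dataset = Dataset.from_list(examples).map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True)

# Hyperparameters (learning rate, epochs, batch size) are fully customisable here.
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="llama4-scout-ft", per_device_train_batch_size=1,
                           num_train_epochs=1, learning_rate=2e-4, bf16=True),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```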
These Azure AI Foundry fine-tuning features aim to expand model customisation with efficiency, flexibility, and trust.