#Buildots | Explore Tumblr posts and blogs

jcmarchi · 9 days ago

Text

The Sequence Radar #554 : The New DeepSeek R1-0528 is Very Impressive

New Post has been published on https://thedigitalinsider.com/the-sequence-radar-554-the-new-deepseek-r1-0528-is-very-impressive/

The new model excels at math and reasoning.

Next Week in The Sequence:

In our series about evals, we discuss multiturn benchmarks. The engineering section dives into the amazing Anthropic Circuits for ML interpretability. In research, we discuss some of UC Berkeley’s recent work in LLM reasoning. Our opinion section dives into the state of AI interpretability.

📝 Editorial: The New DeepSeek R1-0528 is Very Impressive

This week, DeepSeek AI pushed the boundaries of open-source language modeling with the release of DeepSeek R1-0528. Building on the foundation of the original R1 release, this update delivers notable gains in mathematical reasoning, code generation, and long-context understanding. With improvements derived from enhanced optimization and post-training fine-tuning, R1-0528 marks a critical step toward closing the performance gap between open models and their proprietary counterparts like GPT-4 and Gemini 1.5.

At its core, DeepSeek R1-0528 preserves the 671B-parameter Mixture-of-Experts (MoE) architecture of the original R1, activating 37B parameters per forward pass. Because only a fraction of the experts run for each token, the design keeps the capacity of a very large model while holding inference cost closer to that of a much smaller one. One standout feature is its support for 64K-token context windows, enabling the model to engage with substantially larger inputs, which is ideal for technical documents, structured reasoning chains, and multi-step planning.
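To make the "only some parameters are active per forward pass" idea concrete, here is a minimal sketch of top-k expert routing, the general mechanism behind Mixture-of-Experts layers. It is not DeepSeek's implementation: the toy experts, the router logits, and the function names `route_top_k` and `moe_forward` are all assumptions made for illustration.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are renormalized so that the selected experts sum to 1.
    """
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, experts, router_logits, k):
    """Weighted sum of the outputs of only the k selected experts."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_logits, k))


# Toy setup (hypothetical): 8 experts, each a simple scaling function.
experts = [lambda x, s=s: s * x for s in range(1, 9)]
random.seed(0)
logits = [random.gauss(0, 1) for _ in experts]

selected = route_top_k(logits, k=2)
y = moe_forward(3.0, experts, logits, k=2)
print(f"experts used: {[i for i, _ in selected]} of {len(experts)}")
```

In this toy case only 2 of the 8 experts are evaluated for the input, which is the same reason a large MoE model can activate a small share of its total parameters on each token.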

In terms of capability uplift, the model shows remarkable progress in competitive benchmarks. On AIME 2025, DeepSeek R1-0528 jumped from 70% to an impressive 87.5%, showcasing an increasingly sophisticated ability to tackle complex mathematical problems. This leap highlights not just better fine-tuning, but a fundamental improvement in reasoning depth—an essential metric for models serving scientific, technical, and educational use cases.

For software engineering and development workflows, R1-0528 brings meaningful updates. Accuracy on LiveCodeBench rose from 63.5% to 73.3%, confirming improvements in structured code synthesis. The inclusion of JSON-formatted outputs and native function calling support positions the model as a strong candidate for integration into automated pipelines, copilots, and tool-augmented environments where structured outputs are non-negotiable.
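As a rough illustration of why structured outputs matter in automated pipelines, the sketch below parses a JSON tool call and dispatches it to a local function. The message shape (`name` plus `arguments`), the `get_weather` stub, and the `dispatch` helper are assumptions for this example and do not reflect DeepSeek's actual function-calling format.

```python
import json


# Hypothetical tool; the name and signature are illustrative only.
def get_weather(city: str) -> str:
    return f"Weather for {city}: unknown (stub)"


TOOLS = {"get_weather": get_weather}


def dispatch(model_output: str):
    """Parse a JSON tool call emitted by a model and invoke the matching function."""
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model output is not valid JSON: {exc}") from exc
    if not isinstance(call, dict):
        raise ValueError("expected a JSON object")
    name = call.get("name")
    args = call.get("arguments", {})
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name!r}")
    if not isinstance(args, dict):
        raise ValueError("arguments must be a JSON object")
    return TOOLS[name](**args)


result = dispatch('{"name": "get_weather", "arguments": {"city": "Paris"}}')
print(result)
```

Rejecting malformed output early, rather than passing free text downstream, is what makes a model usable inside a pipeline where every step expects machine-readable input.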

To ensure broad accessibility, DeepSeek also launched a distilled variant: R1-0528-Qwen3-8B. Despite its smaller footprint, this model surpasses Qwen3-8B on AIME 2024 by over 10%, while rivaling much larger competitors like Qwen3-235B-thinking. This reflects DeepSeek’s commitment to democratizing frontier performance, enabling developers and researchers with constrained compute resources to access state-of-the-art capabilities.

DeepSeek R1-0528 is more than just a model upgrade—it’s a statement. In an ecosystem increasingly dominated by closed systems, DeepSeek continues to advance the case for open, high-performance AI. By combining transparent research practices, scalable deployment options, and world-class performance, R1-0528 signals a future where cutting-edge AI remains accessible to the entire community—not just a privileged few.

🔎 AI Research

FLEX-Judge: THINK ONCE, JUDGE ANYWHERE

AI Lab: KAIST AI Summary: FLEX-Judge is a reasoning-first multimodal evaluator trained on just 1K text-only explanations, achieving zero-shot generalization across images, audio, video, and molecular tasks while outperforming larger commercial models. It relies on textual reasoning alone to train a judge model that generalizes across modalities without modality-specific supervision.

Learning to Reason without External Rewards

AI Lab: UC Berkeley & Yale University Summary: INTUITOR is a self-supervised reinforcement learning framework that uses the model's own self-certainty as an intrinsic reward in place of gold labels, matching supervised methods on math and outperforming them on code generation without any external feedback.

Beyond Markovian: Reflective Exploration via Bayes-Adaptive RL for LLM Reasoning

AI Lab: Google DeepMind & Northwestern University Summary: This paper introduces BARL, a novel Bayes-Adaptive RL algorithm that enables large language models to perform test-time reflective reasoning by switching strategies based on posterior beliefs over MDPs. The authors show that BARL significantly outperforms Markovian RL approaches in math reasoning tasks by improving token efficiency and adaptive exploration.

rStar-Coder: Scaling Competitive Code Reasoning with a Large-Scale Verified Dataset

AI Lab: Microsoft Research Asia Summary: The authors present rStar-Coder, a dataset of 418K competitive programming problems and 580K verified long-reasoning code solutions, which drastically boosts the performance of Qwen models on code reasoning benchmarks. Their pipeline introduces a robust input-output test case synthesis method and mutual-verification mechanism, achieving state-of-the-art performance even with smaller models.

MME-Reasoning: A Comprehensive Benchmark for Logical Reasoning in MLLMs

AI Lab: Fudan University, CUHK MMLab, Shanghai AI Lab Summary: MME-Reasoning offers a benchmark of 1,188 multimodal reasoning tasks spanning inductive, deductive, and abductive logic, revealing significant limitations in current MLLMs’ logical reasoning. The benchmark includes multiple question types and rigorous metadata annotations, exposing reasoning gaps especially in abductive tasks.

DeepResearchGym: A Free, Transparent, and Reproducible Evaluation Sandbox for Deep Research

AI Lab: Carnegie Mellon University, NOVA LINCS, INESC-ID Summary: DeepResearchGym is an open-source sandbox providing reproducible search APIs and evaluation protocols over ClueWeb22 and FineWeb for benchmarking deep research agents. It supports scalable dense retrieval and long-form response evaluation using LLM-as-a-judge assessments across dimensions like relevance and factual grounding.

Fine-Tuning Large Language Models with User-Level Differential Privacy

AI Lab: Google Research Summary: This study compares two scalable user-level differential privacy methods (ELS and ULS) for LLM fine-tuning, with a novel privacy accountant that tightens DP guarantees for ELS. Experiments show that ULS generally offers better utility under large compute budgets or strong privacy settings, while maintaining scalability to hundreds of millions of parameters and users.

🤖 AI Tech Releases

DeepSeek-R1-0528

DeepSeek released a new version of its marquee R1 model.

Anthropic Circuits

Anthropic open sourced its circuit interpretability technology.

Perplexity Labs

Perplexity released a new tool that can generate charts, spreadsheets and dashboards.

Codestral Embed

Mistral released Codestral Embed, an embedding model specialized in coding.

🛠 AI in Production

Multi-Task Learning at Netflix

Netflix shared some details about its multi-task prediction strategy for user intent.

📡AI Radar

Salesforce agreed to buy Informatica for $8 billion.

xAI and Telegram partnered to enable Grok for its users.

Netflix’s Reed Hastings joined Anthropic’s board of directors.

Grammarly raised $1 billion to accelerate sales and acquisitions.

Spott raised $3.2 million for an AI-native recruiting firm.

Buildots raised $45 million for its AI platform for construction.

Context raised $11 million to power an AI-native office suite.

Rillet raised $25 million to enable AI for mid-market accounting.

HuggingFace unveiled two new open source robots.

0 notes

er-10-media · 12 days ago

Text

Startup Buildots to bring AI to construction sites

New Post has been published on https://er10.kz/read/it-novosti/startap-buildots-vnedrit-ii-na-stroitelnyh-ploshhadkah/

Startup Buildots intends to bring artificial intelligence and computer vision to construction sites. The company has already raised investment to carry out a pilot project.

In the construction industry, managers often lose touch with what is happening on site. Among the many tasks they have to handle are cost control, communication with all stakeholders, and financial matters. Buildots intends to streamline these processes using artificial intelligence and computer vision.

The startup offers a platform that tracks construction progress by processing images captured by 360-degree cameras mounted on managers' hard hats. The system not only tracks progress but also forecasts potential risks. Teams can use a chatbot to check the status of a project and to learn about possible delay risks or pacing problems.

"With our platform, construction managers can make informed decisions based on real, measurable data. This is a significant departure from information that arrives at different times from many sources of unknown reliability," the startup says.

Buildots is not the only company applying AI in construction. Others include BeamUp, which is developing an AI-based building design platform, and Versatile, which, like Buildots, collects and analyzes data from across the construction site.

#Buildots #artificial intelligence #Startup #construction

0 notes

nnctales · 10 months ago

Text

Latest Construction News Ongoing Around the World

The construction industry is undergoing significant changes globally, driven by technological advancements, sustainability initiatives, and evolving market dynamics. This article explores the latest construction news and ongoing projects around the world, highlighting key developments, challenges, and trends shaping the future of the industry. Major Ongoing Projects in Construction News 1. Saudi…

View On WordPress

0 notes

news786hz · 11 days ago

Text

AI-Powered Hard Hats? Buildots Raises $45M to Scale Up

0 notes

newtras · 12 days ago

Text

Buildots raises $45 million to track construction progress

In the construction industry, managers can easily lose touch with what is happening on site. Among the many duties they juggle are evaluating costs and dealing with contractors and stakeholders. Buildots wants to change all that with AI and computer vision. Founded in 2018 by Roy Danon, Aviv Leibovici, and Yakir Sudry, the Chicago-based startup has opened a…

0 notes

yesgkrasnikovposts · 5 months ago

Text

AI in the service of construction: Buildots and Samet build The Novus

In Durham (North Carolina, USA), The Novus is under construction, set to be the tallest residential complex in the region. Buildots and Samet have joined forces to use advanced AI-based technologies on the project. The Buildots platform, which provides tools for construction progress tracking and AI-powered predictive analytics, helps Samet improve efficiency and…

#The Novus #architecture #design #project

0 notes

y2fear · 11 months ago

Photo

Buildots Secures $15M Investment from Intel Capital to Drive Strategic Growth

0 notes

israeleconews · 3 years ago

Text

Israeli construction startup Buildots wins a series of international innovation awards

Photo: pixabay.com. Tel Aviv-based construction technology company Buildots has won several of the industry's biggest awards. This week the company received the "Innovation of the Year" award at the Construction Computing Awards ceremony in London. This followed another award received last month at the Building Innovation Awards ceremony, also in London, where the company was…

View On WordPress

#Buildots #startup

0 notes

beurich · 3 years ago

Text

$60 million for AI startup: new solution offers real-time knowledge of the entire construction process

The company, whose end-to-end solution for construction projects is based on artificial intelligence and computer vision, is using the funds to add further features to its software. Nine months after construction technology startup Buildots successfully closed its Series B funding round with $30 million, it has now cleared the next hurdle, its Series C, with double that amount, namely $60 million…

View On WordPress

#construction process #construction technology #Buildots #real estate #AI #Startup #knowledge

0 notes

eretzyisrael · 2 years ago

Text

Artificial intelligence is coming to the aid of the construction of a new skyscraper at the heart of New York City, maximizing efficiency in terms of costs, labor and resources.

Israeli company Buildots, which turns visual data captured via helmet-mounted cameras into actionable construction insights, has teamed up with building giant Global Holdings Management Group on a location just off Columbus Circle.

More: Here

#Israel #Construction-tech #/Buildots #/News

11 notes · View notes

startupmag · 5 years ago

Photo

Buildots raises $16M to bring computer vision to construction management https://ift.tt/3gf1EtF

#Startups – TechCrunch #startup #startupnews #startuplife

1 note · View note

allisterlewisarchitect · 3 years ago

Text

Top 5 Data Driven Design news this week.

Want to know about BIM and Information Management updates this week - check out the news in #AEC this week via @wearenima @buildots @ToricLabs @AECmagazine @bim_plus #DigitalConstructionSummit2022 #AEC #BIM #datadrivendesign #informationmanagement

Welcome to the Association of Data-Driven Design roundup. We aim to share a summary of current information and news on Data-Driven Design practices within the construction industry. Please get in touch if you have any ideas, suggestions or have something to share. Sign up to the LinkedIn group here. And now a LinkedIn page here … Please follow our new YouTube channel – new content each week!…

View On WordPress

#autodesk #BIM #DATA DRIVEN DESIGN #informationmanagement #roundup

0 notes

laliteralwayslive · 4 years ago

Text

12 contech firms raise a combined $396M in investor funding

Built Technologies, Versatile, Buildots and other companies have received funding recently amid massive growth in contech investments.

from Construction Dive - Latest News https://ift.tt/3p06XV8

0 notes

beurich · 4 years ago

Text

Buildots secures 30 million US dollars in Series B funding round, with German investors also taking part

Large construction projects in the USA, Europe, and Asia, among other regions, use the AI-based "digital twin" technology of construction technology startup Buildots to close the efficiency gap in construction. As a result, the company is announcing…

View On WordPress

#construction industry #construction project #digitalization #financing #funding round #real estate #Start-Up #business

0 notes