Daily AI Digest — 2026-04-11

🔬 Deep Dives

Embodied / Python

GitHub NousResearch / hermes-agent

2026-04-11 · 4 pts · 3 comments · ⭐ 7.7k today

⭐ 54.1k total stars

The agent that grows with you

GitHub unslothai / unsloth

2026-04-11 · 16 pts · 2 comments · ⭐ 292 today

⭐ 61.0k total stars

Unsloth Studio is a web UI for training and running open models such as Gemma 4, Qwen3.5, DeepSeek, and gpt-oss locally.

arXiv OpenVLThinkerV2: A Generalist Multimodal Reasoning Model for Multi-domain Visual Tasks

2026-04-09 · 139 pts · cs.CV · cs.AI · Wenbo Hu, Xin Chen…

📄 New in cs.CV, cs.AI

Group Relative Policy Optimization (GRPO) has emerged as the de facto Reinforcement Learning (RL) objective driving recent advancements in Multimodal Large Language Models. However, extending this success to open-source multimodal generalist models remains heavily constrained by two primary challenges: the extreme variance in reward topologies across diverse visual tasks, and the inherent difficulty of balancing fine-grained perception with multi-step reasoning capabilities. Leveraging the enhanced training stability provided by G$^2$RPO, we introduce two task-level shaping mechanisms to seamlessly balance perception and reasoning.

GitHub HKUDS / DeepTutor

2026-04-11 · 2 pts · ⭐ 1.4k today

⭐ 16.2k total stars

"DeepTutor: Agent-Native Personalized Learning Assistant"

Machine / Rasbt

GitHub rasbt / machine-learning-book

2026-04-11 · 228 pts · 124 comments · ⭐ 7 today

💬 Major HN discussion (124 comments)

Code Repository for Machine Learning with PyTorch and Scikit-Learn

⚡ Quick Signals

Embodied / Python

arXiv SIM1: Physics-Aligned Simulator as Zero-Shot Data Scaler in Deformable Worlds 55 pts · 📄 New in cs.RO, cs.AI

arXiv When Numbers Speak: Aligning Textual Numerals and Visual Instances in Text-to-Video Diffusion Models 27 pts · 📄 New in cs.CV

GitHub jingyaogong / minimind 3 pts · ⭐ 183 GitHub stars today · ⭐ 46.5k total stars

arXiv ViVa: A Video-Generative Value Model for Robot Reinforcement Learning 18 pts · 📄 New in cs.RO, cs.AI

arXiv ClawBench: Can AI Agents Complete Everyday Online Tasks? 10 pts · 📄 New in cs.CL, cs.AI

HF HY-Embodied-0.5: Embodied Foundation Models for Real-World Agents 275 pts

HF AnomalyVFM -- Transforming Vision Foundation Models into Zero-Shot Anomaly Detectors 236 pts

arXiv Act Wisely: Cultivating Meta-Cognitive Tool Use in Agentic Multimodal Models 5 pts · 📄 New in cs.CV, cs.AI

arXiv MolmoWeb: Open Visual Web Agent and Open Data for the Open Web 1 pt · 📄 New in cs.CV

HN 20 years on AWS and never not my job 101 pts

RSS Why the Smarter the Model, the Worse the Forecast: One Math Paper That Demolishes the AI… 🆕 New article

Artemis / Safely

HN Artemis II safely splashes down 805 pts · 💬 Major HN discussion (261 comments) · 🔥 Trendi…

Filing / Corners

HN Filing the corners off my MacBooks 694 pts · 💬 Major HN discussion (354 comments) · 🔥 Trendi…

Dmax / Decoding

HF DMax: Aggressive Parallel Decoding for dLLMs 53 pts

Installing / Firefox

HN Installing every* Firefox extension 330 pts · 💬 Active HN discussion (36 comments)
