Daily AI Digest — 2026-04-10

🔬 Deep Dives

GitHub NousResearch / hermes-agent

2026-04-10 · 4 pts · 3 comments · ⭐ 6.5k today

⭐ 6.5k GitHub stars today · ⭐ 47.7k total stars

The agent that grows with you

GitHub huggingface / skills

2026-04-10 · 201 pts · 58 comments · ⭐ 25 today

💬 Active HN discussion (58 comments) · ⭐ 25 stars today on GitHub

Give your agents the power of the Hugging Face ecosystem

arXiv When Numbers Speak: Aligning Textual Numerals and Visual Instances in Text-to-Video Diffusion Models

2026-04-09 · 15 pts · cs.CV · Zhengyang Sun, Yu Chen…

📄 New in cs.CV

Text-to-video diffusion models have enabled open-ended video synthesis, but often struggle with generating the correct number of objects specified in a prompt. We introduce NUMINA, a training-free identify-then-guide framework for improved numerical alignment. NUMINA identifies prompt-layout inconsistencies by selecting discriminative self- and cross-attention heads to derive a countable latent layout.

arXiv SIM1: Physics-Aligned Simulator as Zero-Shot Data Scaler in Deformable Worlds

2026-04-09 · 15 pts · cs.RO · cs.AI · Yunsong Zhou, Hangxu Liu…

📄 New in cs.RO, cs.AI

Although simulation promises relief from the cost of real-world data acquisition, prevailing sim-to-real pipelines remain rooted in rigid-body abstractions, producing mismatched geometry, fragile soft dynamics, and motion primitives poorly suited for cloth interaction. To address this, we introduce SIM1, a physics-aligned real-to-sim-to-real data engine that grounds simulation in the physical world. These results validate physics-aligned simulation as scalable supervision for deformable manipulation and a practical pathway for data-efficient policy learning.

GitHub HKUDS / DeepTutor

2026-04-10 · 2 pts · ⭐ 1.3k today

⭐ 1.3k GitHub stars today · ⭐ 15.4k total stars

"DeepTutor: Agent-Native Personalized Learning Assistant"

⚡ Quick Signals

Python / Skills

arXiv ViVa: A Video-Generative Value Model for Robot Reinforcement Learning 9 pts · 📄 New in cs.RO, cs.AI

arXiv Act Wisely: Cultivating Meta-Cognitive Tool Use in Agentic Multimodal Models 1 pt · 📄 New in cs.CV, cs.AI

arXiv MolmoWeb: Open Visual Web Agent and Open Data for the Open Web 1 pt · 📄 New in cs.CV

GitHub microsoft / BitNet 2 pts · ⭐ 214 GitHub stars today · ⭐ 38.1k total stars

arXiv OpenVLThinkerV2: A Generalist Multimodal Reasoning Model for Multi-domain Visual Tasks 📄 New in cs.CV, cs.AI

arXiv RewardFlow: Generate Images by Optimizing What You Reward 📄 New in cs.CV, cs.AI

HF HY-Embodied-0.5: Embodied Foundation Models for Real-World Agents 149 pts

HN I still prefer MCP over skills 119 pts · 💬 Major HN discussion (115 comments)

HN Native Instant Space Switching on macOS 472 pts · 💬 Major HN discussion (213 comments)

HF Towards Real-world Human Behavior Simulation: Benchmarking Large Language Models on Long-horizon, Cross-scenario, Heterogeneous Behavior Tr… 13 pts

RAG / Techniques

GitHub NirDiamant / RAG_Techniques 6 pts · ⭐ 80 stars today on GitHub · ⭐ 26.7k total stars

We've / Raised

HN We've raised $17M to build what comes after Git 95 pts · 💬 Major HN discussion (183 comments)

YouTube / Locked

HN YouTube locked my accounts and I can't cancel my subscription 65 pts · 💬 Active HN discussion (49 comments)

DMax / Decoding

HF DMax: Aggressive Parallel Decoding for dLLMs 36 pts

Affiliate / Marketing

RSS AffiCraft Review 2026 – The AI Affiliate Marketing Toolkit That Creates & Ranks Content… 🆕 New article
