🔬 Deep Dives
Python / Microsoft
2026-04-05 · 254 pts · 160 comments
💬 Major HN discussion (160 comments) · ⭐ 1.2k GitHub stars today
Open Source AI Platform - an AI chat with advanced features that works with every LLM
2026-04-02 · 101 pts · cs.LG · Zhengxi Lu, Zhiyuan Yao…
📄 New in cs.LG
However, inference-time skill augmentation is fundamentally limited: retrieval noise introduces irrelevant guidance, injected skill content imposes substantial token overhead, and the model never truly acquires the knowledge it merely follows. We ask whether skills can instead be internalized into model parameters, enabling zero-shot autonomous behavior without any runtime skill retrieval. SKILL0 introduces a training-time curriculum that begins with full skill context and progressively withdraws it.
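The progressive-withdrawal idea can be pictured as a purely data-side schedule. Here is a minimal sketch assuming a linear withdrawal probability and a simple prompt template; the function names and the schedule are illustrative, not SKILL0's actual implementation:

```python
import random

def withdrawal_prob(step: int, total_steps: int) -> float:
    """Linearly ramp the chance of dropping the skill context from 0 to 1."""
    return min(1.0, step / max(1, total_steps))

def build_prompt(task: str, skill_context: str, step: int, total_steps: int) -> str:
    # Early in training the full skill document is prepended; later it is
    # withdrawn, so the model must reproduce the behavior from its weights.
    if random.random() < withdrawal_prob(step, total_steps):
        return f"Task: {task}\n"                          # zero-shot form
    return f"Skill:\n{skill_context}\n\nTask: {task}\n"   # skill-in-context form
```

Early steps mostly see the skill-in-context form; by the end of training nearly every example is zero-shot, so the scaffolding disappears while the target behavior is pushed into the parameters.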
Object / Video
2026-04-02 · 470 pts · cs.CV · cs.AI · Saman Motamed, William Harvey…
📄 New in cs.CV, cs.AI
During inference, a vision-language model identifies regions of the scene affected by the removed object. These regions are then used to guide a video diffusion model that generates physically consistent counterfactual outcomes. Experiments on both synthetic and real data show that our approach better preserves consistent scene dynamics after object removal compared to prior video object removal methods.
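As a mental model, the two-stage inference pipeline might look like the sketch below; `locate_effects` and `generate` are placeholder names for hypothetical interfaces, not the authors' API:

```python
def counterfactual_removal(video, object_name, vlm, diffusion):
    """Remove an object and synthesize its physically consistent aftermath."""
    # 1) Ask a vision-language model which regions the object influences
    #    (shadows, reflections, contact points), not just its silhouette.
    affected_masks = vlm.locate_effects(          # hypothetical VLM call
        video, query=f"regions affected by the {object_name}")
    # 2) Condition a video diffusion model on those masks so it regenerates
    #    the affected regions with consistent scene dynamics.
    return diffusion.generate(video, masks=affected_masks)  # hypothetical API
```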
Text / Perception
2026-04-02 · 261 pts · cs.CV · Zheng-Hui Huang, Zhixiang Wang…
📄 New in cs.CV
Scaling generative inverse and forward rendering to real-world scenarios is bottlenecked by the limited realism and temporal coherence of existing synthetic datasets. Furthermore, to evaluate the real-world performance of inverse rendering without ground truth, we propose a novel VLM-based assessment protocol that measures semantic, spatial, and temporal consistency. Combined with our toolkit, our forward renderer lets users edit the visual style of AAA games from G-buffers using text prompts.
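A ground-truth-free assessment of this kind could be sketched as a VLM-as-judge loop over the three consistency axes named above; the prompts and the yes/no voting scheme below are assumptions, not the paper's protocol:

```python
# Hypothetical VLM-as-judge sketch; criteria mirror the abstract's three axes.
CRITERIA = {
    "semantic": "Does the prediction preserve the scene's semantics?",
    "spatial":  "Are material and geometry boundaries aligned with the input?",
    "temporal": "Is the prediction stable across consecutive frames?",
}

def assess(frames, predictions, vlm_judge):
    """Score inverse-rendering outputs per criterion in [0, 1], no ground truth."""
    scores = {}
    for name, question in CRITERIA.items():
        votes = [vlm_judge.yes_no(frame, pred, question)  # hypothetical judge API
                 for frame, pred in zip(frames, predictions)]
        scores[name] = sum(votes) / len(votes)
    return scores
```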
2026-04-02 · 57 pts · 8 comments · cs.CV · cs.RO · Yongkang Li, Lijun Zhou…
📄 New in cs.CV, cs.RO
Vision-Language-Action (VLA) models have recently emerged in autonomous driving, with the promise of leveraging rich world knowledge to improve the cognitive capabilities of driving systems. However, adapting such models for driving tasks currently faces a critical dilemma between spatial perception and semantic reasoning. To overcome this, we propose UniDriveVLA, a Unified Driving Vision-Language-Action model based on Mixture-of-Transformers that addresses the perception-reasoning conflict via expert decoupling.
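Expert decoupling in a Mixture-of-Transformers commonly means shared attention with modality-specific feed-forward experts, so vision and language tokens can exchange information without competing for the same FFN parameters. The toy PyTorch block below illustrates that general pattern; it is a sketch of the idea, not UniDriveVLA's architecture:

```python
import torch
import torch.nn as nn

class DecoupledBlock(nn.Module):
    """Shared self-attention with per-modality feed-forward experts."""
    def __init__(self, dim: int, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.percept_ffn = nn.Sequential(      # expert for spatial perception
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
        self.reason_ffn = nn.Sequential(       # expert for semantic reasoning
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

    def forward(self, x: torch.Tensor, is_vision: torch.Tensor) -> torch.Tensor:
        # Attention is shared, so the two token streams still interact.
        x = x + self.attn(x, x, x)[0]
        # Route each token to the feed-forward expert for its modality.
        expert_out = torch.where(is_vision.unsqueeze(-1),
                                 self.percept_ffn(x), self.reason_ffn(x))
        return x + expert_out
```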
⚡ Quick Signals
Python / Microsoft
GitHub
HKUDS / LightRAG
1 pts · ⭐ 263 GitHub stars today · ⭐ 32.2k total stars
GitHub
lyogavin / airllm
2 pts · ⭐ 64 GitHub stars today · ⭐ 14.9k total stars