🔬 Deep Dives
2026-04-14 · 4 pts · 3 comments
⭐ 11.3k GitHub stars today · ⭐ 80.8k total stars
The agent that grows with you
2026-04-14 · 98 pts · 14 comments
⭐ 55 stars today on GitHub · ⭐ 16.8k total stars
The absolute trainer to light up AI agents.
2026-04-14 · 147 pts · 17 comments
⭐ 37 stars today on GitHub · ⭐ 25.1k total stars
Official code repo for the O'Reilly Book - "Hands-On Large Language Models"
2026-04-13 · 5 pts · cs.CL · cs.AI · Junlin Liu, Shengnan An…
📄 New in cs.CL, cs.AI
To address this gap, we introduce General365, a benchmark specifically designed to assess general reasoning in LLMs. By restricting background knowledge to a K-12 level, General365 explicitly decouples reasoning from specialized expertise. We envision General365 as a catalyst for advancing LLM reasoning beyond domain-specific tasks toward robust, general-purpose real-world scenarios.
2026-04-13 · 4 pts · cs.LG · cs.AI · Mihir Prabhudesai, Aryan Satpathy…
📄 New in cs.LG, cs.AI
In contrast, other sciences such as physics lack large-scale QA datasets to effectively train reasoning-capable models. We generate random scenes in physics engines, create synthetic question-answer pairs from simulated interactions, and train LLMs using reinforcement learning on this synthetic data. These results demonstrate that physics simulators can act as scalable data generators, enabling LLMs to acquire deep physical reasoning skills beyond the limitations of internet-scale QA data.
⚡ Quick Signals
Claude / Fly
GitHub
lyogavin / airllm
2 pts · ⭐ 263 GitHub stars today · ⭐ 15.9k total stars
Spam / Policy
Davinci / Resolve
Stacked / Prs
Continuous / Flow
Review / Class