Claude & Anthropic
It is really funny to me that Sonnet is still more expensive than GPT-5.4
@theo
zilliztech/claude-context: Code search MCP for Claude Code. Make entire codebase the context for any coding agent.
Show HN: Ctx – a /resume that works across Claude Code and Codex
Tell HN: Codex/Claude Code one-off credit purchases are a money sink
Claude! Stop Burning Tokens on Your Agent's Tool Output!
A Two-Stage Curator That Pays for Itself I watched Claude Code feed 108,894 bytes of seq 1 20000...
Mason – A multi agent system in a container using Claude Code
Show HN: Real-time visualization of Claude Code agent orchestration
Claude system prompt diff: what changed between Opus 4.6 and 4.7 (and I was seeing it without knowing it)
I diff Claude's public system prompts line by line between versions and map the behavior changes I was already observing in production before I knew the prompt had changed. The
How to Govern Claude Code Usage Across Engineering Teams
Claude Code is powerful; maybe too powerful to run without guardrails. I came across a case where a...
Show HN: Auto-generated titles and colors for parallel Claude Code sessions
Claude Code can read your secrets if it wanted to
Agentic & Tools
WorldDB: A Vector Graph-of-Worlds Memory Engine with Ontology-Aware Write-Time Reconciliation
Persistent memory is the bottleneck separating stateless chatbots from long-running agentic systems. Retrieval-augmented generation (RAG) over flat vector stores fragments facts into chunks, loses cro
Context Engineering for Agentic Systems: What Goes Into Your Agent's Mind
Every new generation of Large Language Models arrives with a bigger context window - and the...
Agentic Forecasting using Sequential Bayesian Updating of Linguistic Beliefs
We present BLF (Bayesian Linguistic Forecaster), an agentic system for binary forecasting that achieves state-of-the-art performance on the ForecastBench benchmark. The system is built on three ideas.
What Building with MCP Taught Me About Its Biggest Gap
I spent the last few weeks wiring up MCP at my org, stitching a handful of internal tools (GitHub,...
From 10 Failed Stacks to Production: How a Data Scientist Built a Job Board with Wasp, a Full-stack Framework for the Agentic Era
NOTE: Hireveld is currently down while Marcel works on a major refactor - but it's real, we swear!...
$ENTITY Autonomous intelligence. Zero friction. Next-gen AI agent framework built for real-time analysis and instant execution. From deep search to
@Entity_Solana
MASS-RAG: Multi-Agent Synthesis Retrieval-Augmented Generation
Large language models (LLMs) are widely used in retrieval-augmented generation (RAG) to incorporate external knowledge at inference time. However, when retrieved contexts are noisy, incomplete, or het
Users unable to load ChatGPT, Codex and API Platform
Less human AI agents, please
ChatGPT and Codex Down
Show HN: AI Coding Agent Guardrails enforced at runtime
I analysed 17 years of fast food and coffee spending using OpenAI Codex
Models & Releases
introducing gpt-5.5 openai's most powerful model yet
@haider1
GPT-5.5 might have just solved Frontend Development I asked it to create Excel… and it actually did it. Scarily accurate. The spreadsheet looks and
@intheworldofai
KIMI K2.6 IS BENCHMAXED. I ran the BridgeBench lava lamp test on four models. GLM 5.1: pixel cube in a column. GPT-5.4: static bubbles. No flow. K
@bridgemindai
GPT 5.5 Pro: "make snake except its realistic"
@jasperdevs
WHAT THE HELLL THIS HAS TO BE SPUD??? GPT 5.5 remakes blender from scratch in HTML... "remake blender in HTML identically but name it bender"
@jasperdevs
GPT-5.5 / GPT-5.5 PRO IS ALREADY BEING A/B TESTED INSIDE CHATGPT. And one of the clearest demos so far is a surprisingly accurate Windows OS clone bu
@RoundtableSpace
"GPT 5.5 Pro" remakes photoshop in one HTML file from scratch with all buttons working "remake photoshop in HTML identically" great and mid at same
@jasperdevs
GPT-5.5 MIGHT HAVE JUST CRACKED FRONTEND DEV
@RoundtableSpace
Let’s hope we see GPT-5.5 today or, at the very latest, Thursday...
@intheworldofai
MathNet: a Global Multimodal Benchmark for Mathematical Reasoning and Retrieval
Mathematical problem solving remains a challenging test of reasoning for large language and multimodal models, yet existing benchmarks are limited in size, language coverage, and task diversity. We in
🚀 Launching HumanFlow A custom LLM focused on natural, human-like text generation. Built with: • Fine-tuning • Quantization • Open-source release 🔹 H
@randhir302
OpenAI Launches GPT-5.4-Cyber with Expanded Access for Security Teams
Forensic Summary OpenAI has launched GPT-5.4-Cyber, a cybersecurity-optimised model...
Research & Papers
Industry & General
How exactly one goes about networking in conferences? [D]
So ICLR is coming and apparently the biggest value one can get from these conferences is to network. Let's take my example: I'm a PhD student looking for industry internships. Say I have located abou