What if your LLM is… a graph?
A few days ago, Petar Veličković from Google DeepMind gave one of the most interesting and thought-provoking talks I've seen in a while, "Large Language Models as Graph Neural Networks". Once you start seeing LLMs as graph neural networks, many structural oddities suddenly fall into place. For instance, OpenAI currently recommends putting the instructions at the top of a long prompt. Why? Because of the geometry of attention graphs, LLMs are counter-intuitively biased in favor of the first tokens: these travel continuously through each generation step, are repeated a lot internally, and end up "over-squashing" the later ones. Models then use a variety of internal metrics/transforms such as softmax to moderate this bias and better weight the distribution, but this is a late patch that cannot solve long-range attention deficiencies, even more so for long contexts. The most interesting aspect of the talk from an applied perspective: graph/geometric representations directly affect accuracy and robustness. As generated sequences grow and have to handle chains of complex reasoning steps, you cannot build a solid expert system when attention graphs have single points of failure. Or at least, not without extracting this information in the first place and providing more detailed accuracy metrics. I do believe LLM explainability research is largely underexploited right now, despite reportedly being a key component of LLM devops in big labs. If anything, this is literal "prompt engineering": seeing models as nearly physical structures under stress and providing the right feedback loops to make them more reliable.
·linkedin.com·
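The first-token bias described in this post can be illustrated with a simple path count. This is a minimal sketch and not code from the talk: it assumes an idealized causal attention graph in which every token in one layer feeds itself and every later token in the next layer, and it ignores attention weights, softmax, and residual connections entirely. The function name `count_paths_to_last_token` is hypothetical.

```python
def count_paths_to_last_token(num_tokens: int, num_layers: int) -> list[int]:
    """Count distinct paths from each input token to the last token's output.

    Assumption: an unweighted causal graph where token i in one layer has an
    edge to every token j >= i in the next layer.
    """
    # At the top layer, only the last token reaches itself (one empty path).
    paths = [0] * num_tokens
    paths[-1] = 1
    for _ in range(num_layers):
        # Going one layer down, token i inherits the paths of every j >= i,
        # which is a suffix sum over the current counts.
        below = [0] * num_tokens
        running = 0
        for i in range(num_tokens - 1, -1, -1):
            running += paths[i]
            below[i] = running
        paths = below
    return paths


if __name__ == "__main__":
    # Earlier tokens accumulate far more routes to the final prediction.
    print(count_paths_to_last_token(num_tokens=4, num_layers=3))
```

Under these assumptions the count for the first token grows much faster with depth than the count for the last one, which is one way to picture why early tokens can dominate and later ones get squashed.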
Adaptive Graph of Thoughts (AGoT), a test-time framework that replaces rigid prompting strategies (like Chain/Tree of Thought) with dynamic directed acyclic graphs
Dynamic Reasoning Graphs + LLMs = 🤝 Large Language Models (LLMs) often stumble on complex tasks when confined to linear reasoning. What if they could dynamically restructure their thought process like humans? A new paper introduces Adaptive Graph of Thoughts (AGoT), a test-time framework that replaces rigid prompting strategies (like Chain/Tree of Thought) with dynamic directed acyclic graphs (DAGs). Instead of forcing fixed reasoning steps, AGoT recursively decomposes problems into sub-tasks, selectively expanding only the most critical pathways. This is crucial for fields like scientific research or legal analysis, where problems demand non-linear, nested reasoning. The key innovation lies in complexity checks: AGoT assesses each reasoning node, spawning sub-graphs for intricate subtasks while resolving simpler ones directly. This mirrors how experts allocate mental effort: drilling into uncertainties while streamlining obvious steps. The framework achieved a 46.2% improvement on GPQA (a notoriously hard science QA benchmark), rivaling gains from compute-heavy fine-tuning. By unifying chain, tree, and graph paradigms, AGoT retains CoT's clarity, ToT's exploration, and GoT's flexibility without manual tuning. The result? LLMs that self-adapt their reasoning depth based on problem complexity, with no architectural changes needed. For AI practitioners, AGoT's DAG structure offers a principled interface to scale reasoning modularly.
·linkedin.com·
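The complexity check described in this post can be sketched as a small recursive routine. This is an illustration under stated assumptions, not the paper's implementation: `Node`, `solve`, and the four callables (`is_complex`, `decompose`, `answer`, `combine`) are hypothetical names, the callables stand in for LLM calls, and the sketch only builds a tree (the simplest kind of DAG) without edges between sibling nodes.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Node:
    task: str
    answer: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def solve(
    task: str,
    is_complex: Callable[[str], bool],         # stand-in for an LLM complexity check
    decompose: Callable[[str], List[str]],     # stand-in for LLM task decomposition
    answer: Callable[[str], str],              # stand-in for a direct LLM answer
    combine: Callable[[str, List[str]], str],  # stand-in for LLM synthesis
    depth: int = 0,
    max_depth: int = 2,
) -> Node:
    node = Node(task)
    if depth < max_depth and is_complex(task):
        # Complex node: spawn a nested sub-graph and synthesize its results.
        node.children = [
            solve(sub, is_complex, decompose, answer, combine, depth + 1, max_depth)
            for sub in decompose(task)
        ]
        node.answer = combine(task, [child.answer or "" for child in node.children])
    else:
        # Simple node (or depth budget exhausted): resolve directly.
        node.answer = answer(task)
    return node


# Toy stand-ins so the sketch runs without any model.
root = solve(
    "a and b",
    is_complex=lambda t: " and " in t,
    decompose=lambda t: t.split(" and "),
    answer=lambda t: f"[{t}]",
    combine=lambda t, parts: " + ".join(parts),
)
print(root.answer)
```

The `max_depth` cap is a design choice of this sketch, added so that recursion always terminates even when the complexity check keeps returning true.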