By MIT Corporate Relations
Get instant insights and key takeaways from this YouTube video by MIT Corporate Relations.
Generative Programming Paradigm
📌 The speaker introduces generative computing (or generative programming) as embedding software workflows within AI processes, contrasting it with traditional imperative programming (step-by-step instructions) and statistical machine learning (model creation via input/output examples).
🤖 Generative AI (GenAI) involves two stages: first, using inductive programming to create a generalist model, and second, programming that model using prompting (natural language instructions).
⚠️ Despite the excitement, a reported 95% of enterprise AI pilots fail, with failures often traced to long natural-language prompts that try to encode complex control flows.
Challenges with Prompt Engineering
📌 Natural-language programming (prompting) is difficult to maintain, easily broken by new constraints, and a security risk (e.g., prompts can be manipulated into revealing secrets such as passwords).
💻 The core issue is forgetting decades of computer science and software engineering best practices when using LLMs as generalist computing devices.
🚫 Traditional engineering tools like debuggers, unit testing, abstraction boundaries, and design patterns (like divide and conquer) are neglected in favor of trial-and-error prompt engineering.
Introduction to Project Malaya
📌 Malaya is an open-source Python library designed to bring software engineering principles back to LLM interaction, aiming for predictability, maintainability, composability, and security.
⚙️ Malaya's key principle is moving control flows out of the prompt and into code, resulting in much shorter prompts (ideally 1-2 lines) that are easier to reuse and port across models.
🛠️ Malaya is a way to build agents, not an agent orchestration framework (like LangChain), allowing integration with existing systems.
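The "control flow in code, short prompts" principle can be sketched in plain Python. Here `ask_llm` is a hypothetical stand-in for any chat-completion call (stubbed with canned replies so the sketch runs offline); it is not Malaya's actual API:

```python
# Sketch: branching lives in ordinary Python, and each prompt is one
# short instruction, instead of one long prompt encoding the whole workflow.

def ask_llm(prompt: str) -> str:
    # Stub: a real implementation would call a model endpoint here.
    canned = {
        "Classify this ticket as BUG or FEATURE: 'App crashes on login'":
            "BUG",
        "Write a one-line apology for a crash bug.":
            "We're sorry the app crashed; a fix is on the way.",
    }
    return canned[prompt]

def route_ticket(ticket: str) -> str:
    # The control flow is plain code; the model only answers short questions.
    label = ask_llm(f"Classify this ticket as BUG or FEATURE: '{ticket}'")
    if label == "BUG":
        return ask_llm("Write a one-line apology for a crash bug.")
    return "Thanks for the suggestion!"

print(route_ticket("App crashes on login"))
```

Because each prompt is a single short instruction, it can be unit-tested, reused, and ported across models independently of the surrounding logic.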
The Instruct-Validate-Repair (IVR) Pattern
📌 Because LLM outputs are stochastic and often incorrect, checking and repairing them must be a first-class concern, analogous to error-correcting codes in quantum computing.
🔄 The core design pattern in Malaya is Instruct-Validate-Repair (IVR), where an instruction and requirements are given, followed by automatic validation (defaulting to LLM as a judge or code checks) and subsequent repair strategies (like rejection sampling).
🔬 This pattern enforces abstraction boundaries (execution is separate from validation) and enables code reuse, structuring agent creation methodically rather than through guesswork.
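A minimal sketch of the IVR loop with rejection sampling as the repair strategy. The generic loop, the stub model, and the length-based validator are all illustrative assumptions, not Malaya's API:

```python
def instruct_validate_repair(instruct, validate, max_attempts=3):
    """Generic IVR loop: run an instruction, validate its output, and
    repair by rejection sampling (re-running) until a check passes."""
    for _ in range(max_attempts):
        output = instruct()
        if validate(output):  # validation is separate from execution
            return output
    raise RuntimeError("no valid output within the attempt budget")

def make_stub_model():
    # Stub model: fails twice, then produces a valid answer, imitating
    # LLM stochasticity deterministically for the demo.
    attempts = {"n": 0}
    def model():
        attempts["n"] += 1
        return "a rambling answer " * 20 if attempts["n"] < 3 else "short answer"
    return model

result = instruct_validate_repair(
    instruct=make_stub_model(),
    validate=lambda text: len(text) < 50,  # a code-based requirement check
)
print(result)
```

The validator here is ordinary code; per the talk, an LLM-as-judge could be dropped in behind the same `validate` boundary without touching the loop.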
Verbalized Algorithms and Efficiency
💡 Malaya allows applying classical computer science algorithms to natural language tasks via verbalized algorithms, where the LLM acts as a specialized oracle for pairwise comparisons or specific decisions.
📊 A verbalized merge sort (where the LLM only ever compares pairs) showed large gains for smaller models, allowing a 1.7-billion-parameter model to outperform a 32-billion-parameter baseline on rank correlation with the true ordering.
🚀 The library incorporates efficiency techniques such as Activated LoRA, which trains lightweight adapter modules to work directly with the base model's representations, significantly speeding up evaluation when multiple specialized functions are invoked.
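The verbalized merge sort described above can be sketched as classical merge sort with the comparison step delegated to an oracle. Here `stub_prefer` is a hypothetical stand-in for the LLM's pairwise judgment (a real oracle would prompt the model with just the two items):

```python
def merge_sort(items, prefer):
    """Verbalized merge sort: the classical algorithm, but each pairwise
    comparison is delegated to an oracle (an LLM in the real setting)."""
    if len(items) <= 1:
        return items
    mid = len(items) // 2
    left = merge_sort(items[:mid], prefer)
    right = merge_sort(items[mid:], prefer)
    merged = []
    while left and right:
        # One tiny yes/no question per comparison, not one giant prompt.
        if prefer(left[0], right[0]):
            merged.append(left.pop(0))
        else:
            merged.append(right.pop(0))
    return merged + left + right

# Stub oracle: hardcoded scores stand in for a prompt like
# "Which review sounds more positive, A or B?"
SCORE = {"terrible": 0, "okay": 1, "great": 2}
def stub_prefer(a, b):
    return SCORE[a] >= SCORE[b]

print(merge_sort(["okay", "terrible", "great"], stub_prefer))
```

Because the algorithm guarantees O(n log n) comparisons and each call is a tiny, focused prompt, a small model only needs to be reliable at pairwise judgments, not at sorting an entire list in one shot.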
Key Points & Insights
➡️ Do not put control flow in the prompt; LLMs are poor at following complex language-encoded control structures, which is a primary cause of agent failure.
➡️ Structure LLM interactions using the Instruct-Validate-Repair (IVR) pattern to manage inherent LLM unpredictability with built-in error checking and repair strategies.
➡️ Utilize Malaya to reintroduce software engineering rigor by encapsulating control logic in Python code, ensuring prompts remain short, reusable, and modular.
➡️ Consider using LLMs as oracles within established algorithms (verbalized algorithms) to achieve superior performance and robustness, especially for smaller models.
📸 Video summarized with SummaryTube.com on Dec 15, 2025, 09:15 UTC
Full video URL: youtube.com/watch?v=STKHnji79aM
Duration: 28:59