
3.4 PREDIBAG: Building Modern AI Agents in Tamgu's Prolog

Claude Roux edited this page Sep 2, 2024 · 21 revisions

PREDIBAG: Predicate-Based Agents

Version française

Resurrecting the dead is generally a risky operation. It requires either knowledge of Latin or Hebrew, or a good thunderstorm with monstrous lightning. The result, according to experts, rarely lives up to expectations. One only needs to see the failure of Dr. Frankenstein to be convinced of this.

In computing, we do this all the time...

It makes sense, if you think about it: the data we manipulate is ultimately made up of more or less the same cocktail of numerical and textual data. Even an image or a video is just a row of numbers that we project onto a screen.

Who, for example, would suspect that numpy owes everything to Fortran?

Zombie ideas in computer science, half-dead ideas that find a second life, are nothing exceptional.

Agent-Based Architecture

Let's take a recent example. AI models with billions of parameters, such as llama3-7b or mistral-7b, are rarely as powerful as the frontier models that are approaching trillions of parameters. But if we place them in an agent-based architecture (roughly, a set of different prompts, each with the specific task of generating content or judging content already generated), performance soars. These AIs are better at judging their neighbors than at doing things themselves. And to think some people still doubt that these AIs will find their place in human societies!

The goal of an agent-based architecture is to plan the different stages of an analysis, to generate possible answers, to correct or transform them iteratively.

In any case, the goal of such an architecture is to traverse a graph to find a path to the solution(s).

An Old Idea: Prolog

Traversing an analysis graph is a well-known task, one for which we even invented a language: Prolog.

Prolog was born from the work of Alain Colmerauer and Philippe Roussel at the University of Marseille in the early 1970s. It reached its peak in the 1980s and early 1990s, finding applications in natural language processing, expert systems, and AI research.

However, with the arrival of the AI winter and the rise of data-driven approaches, Prolog's star began to fade. Today, it is largely forgotten, relegated to academic circles and niche applications. But while the language is somewhat obscure, there are still many communities that continue to keep it alive in various forms.

Unification and Backtracking

A Prolog program relies on variable unification and backtracking to find the best path through an analysis graph defined by rules and facts.

Let's illustrate this with a simple example:

Example: Family Relationships

Let's consider a simple knowledge base with facts about family relationships:

parent("george", "sam").
parent("george", "andy").
parent("sam", "john").

We can define a rule to determine if someone is an ancestor of another person:

ancestor(?X, ?Y) :- parent(?X, ?Y).
ancestor(?X, ?Z) :- parent(?X, ?Y), ancestor(?Y, ?Z).

Here's how the Prolog engine works with these rules and facts:

  1. Query: We ask the engine if "george" is an ancestor of "john":

    bool b = ancestor("george", "john");

    Since we only need to prove that this predicate is true, we use a boolean as the recipient variable. On the other hand, if we wanted to find everyone for whom "george" is an ancestor:

    vector v = ancestor("george", ?Descendant);
    println(v);

    We would use a vector to store each solution.

  2. Resolution: The engine tries to match the query to the rules and facts.

    • It first tries the rule ancestor(?X, ?Y) :- parent(?X, ?Y).

      • It checks if parent("george", "john"). is true.
      • Since this fact is not in the knowledge base, this path fails.
    • The engine then backtracks and tries the second rule ancestor(?X, ?Z) :- parent(?X, ?Y), ancestor(?Y, ?Z).

      • It checks if parent("george", ?Y). is true.
      • It finds parent("george", "sam").
      • It then tries to satisfy ancestor("sam", "john").
        • It checks if parent("sam", "john"). is true.
        • It finds parent("sam", "john").
        • Since both conditions are satisfied, the query succeeds.
  3. Result: The engine confirms that "george" is an ancestor of "john".

This simple example shows how the Prolog engine uses resolution and backtracking to explore different reasoning paths to satisfy a goal. The engine matches the goal to the rules and facts in the knowledge base, using backtracking to explore alternative paths when necessary.
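To make the mechanics concrete, here is a rough Python sketch (not Tamgu) of the same resolution: a recursive search over the parent facts, where failed branches are simply abandoned, just as backtracking abandons them.

```python
# Facts: parent(X, Y) means X is a parent of Y.
PARENTS = {("george", "sam"), ("george", "andy"), ("sam", "john")}
PEOPLE = {name for pair in PARENTS for name in pair}

def ancestor(x, z):
    """ancestor(?X, ?Z) :- parent(?X, ?Z).
       ancestor(?X, ?Z) :- parent(?X, ?Y), ancestor(?Y, ?Z)."""
    if (x, z) in PARENTS:          # first rule: direct parent
        return True
    # Second rule: try every child ?Y of ?X and recurse; each
    # failed branch is abandoned, mimicking backtracking.
    return any(ancestor(y, z) for (p, y) in PARENTS if p == x)

def descendants(x):
    """Every ?Z with ancestor(x, ?Z), like a vector recipient."""
    return sorted(z for z in PEOPLE if z != x and ancestor(x, z))
```

A real Prolog engine does this more generally, of course: unification works on arbitrary terms, not just ground strings, and the search order follows the rule order in the knowledge base.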

Tamgu

The formalism above is that of a programming language developed at Naver Labs Europe (NLE), a laboratory belonging to the Korean group Naver.

Tamgu is a FIL language, one that mixes Functional, Imperative and Logical approaches in the same formalism. Furthermore, Tamgu offers libraries to use cURL and, above all, to execute Python code.

You can thus mix an inference engine à la Prolog with an imperative language that looks like a Python that would have had a long conversation with TypeScript.

It becomes possible to write a Prolog program that calls more traditional functions. However, and this is the subtlety of this mixture, it is also possible to unify variables in these functions.

Simple Example

Let's take the following example:

We have a function that executes a prompt by calling OLLAMA's REST API.

//The type ?_ allows the unification of res in an external function

function vprompt(string model, string p, ?_ res) {
    //uri holds the address of the OLLAMA server, declared elsewhere
    res = ollama.promptnostream(uri, model, p);
    return true;
}

Let's say the prompt is: Write a Python program to count prime numbers between 1 and 1000.

We can then simply call our program as follows:

create(?Prompt,?Result) :-
    vprompt("mistral-large", ?Prompt, ?GeneratedCode ),
    extractcode(?GeneratedCode, ?Code),
    execution(?Code, ?Result).
  • vprompt applies our model to the prompt and receives the response in ?GeneratedCode.
  • extractcode extracts the code from the generated response.
  • execution runs the Python code via Tamgu's Python library.
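As a rough sketch of the same generate, extract, execute pipeline in plain Python: the LLM call is stubbed out here (a real vprompt would go through OLLAMA's REST API), and extractcode is approximated by a regular expression over fenced code blocks.

```python
import re

def vprompt(model, prompt):
    # Stub for the LLM call; a real version would query an OLLAMA server.
    return ("Here is a program:\n```python\n"
            "result = sum(1 for n in range(2, 1001)\n"
            "             if all(n % d for d in range(2, int(n**0.5) + 1)))\n"
            "```")

def extractcode(text):
    # Pull the first fenced code block out of the model's answer.
    m = re.search(r"```(?:python)?\n(.*?)```", text, re.S)
    return m.group(1) if m else text

def execution(code):
    # Run the code and read back the conventional `result` variable.
    scope = {}
    exec(code, scope)
    return scope.get("result")

answer = execution(extractcode(vprompt("mistral-large",
    "Write a Python program to count prime numbers between 1 and 1000.")))
```

The Prolog rule adds what this sketch lacks: if any step fails to unify, the engine backtracks and can try another rule, another model, or another extraction strategy.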

We can already imagine more complex architectures where different Prolog rules can exchange the results of different prompts.

By the way, one very important element must be mentioned: the knowledge base presented above is dynamic, meaning that facts can be added to or removed from it at any time.

We can go even further. Classes in Tamgu are called frames, and frame instances can be compared to each other by overriding the appropriate operators.

Here's a particularly relevant example for this purpose...

Implementation of a Simple RAG System with Tamgu

These frames allow the creation of complex data structures with associated methods, which the predicate engine can use transparently. Indeed, one of the specificities of this Prolog is that when the engine has to unify external data, such as class instances, it relies on the equality operator to perform the operation.

Now, it is possible to override this equality function in a frame to implement a custom comparison between different frame instances.

Let's take the following example, in which a frame instance records the embedding of a particular sentence. We can then decide that equality between two instances means a cosine similarity above a given threshold.

When the predicate engine uses the findall operator to extract the facts in memory that are close to a given query, this comparison runs silently in the background, giving us a small custom RAG.

frame Vectordb {
    string sentence;
    fvector embedding;

    function _initial(string u) {
        sentence = u;
        // get_embedding returns the embeddings of the sentence
        // as calculated by the "command-r" model.
        embedding = get_embedding("command-r", sentence);
    }

    function string() {
        return sentence;
    }

    //This '==' function is called whenever two frame instances are compared to each other
    function ==(Vectordb e) {
        return (cosine(embedding, e.embedding) > 0.3);
    }
}

//We loop through all the facts to insert them into the knowledge base
//For each fact, a simple string, we build a Vectordb object
//that will calculate the embeddings for each sentence.
append([]).
append([?K | ?R]) :-
    ?X is Vectordb(?K),
    assertz(record(?X, ?K)),
    append(?R).

//findall will use the underlying `==` function exposed by `Vectordb`
//The unification of a non-prolog object is obtained by `equality`.
retrieve(?Query, ?Results) :-
    ?Q is Vectordb(?Query),
    findall(?K, record(?Q, ?K), ?Results).

generate_response(?Query, ?Context, ?Response) :-
    //The context and the query are concatenated into a single prompt
    vprompt("llama3", ?Context + " " + ?Query, ?Response).

rag_query(?Query, ?Response) :-
    retrieve(?Query, ?RelevantDocs),
    ?Context is "Relevant information: " + ?RelevantDocs,
    generate_response(?Query, ?Context, ?Response).

// Populate the knowledge base
// We use a boolean recipient variable to force the execution of the predicate
bool x = append(["The capital of France is Paris.",
                 "The Eiffel Tower is located in Paris.",
                 "London is the capital of the United Kingdom."]);

// Query the RAG system
// ?:- is a predicate variable that expects a complete unification of ?Response to succeed
?:- result = rag_query("Tell me about the capital of France", ?Response);
println(result);

In this example, we have extended our previous setup to create a simple RAG system:

  1. The Vectordb frame represents our integrated documents.

  2. The retrieve predicate uses our semantic similarity search to find relevant documents based on the query.

  3. generate_response uses an LLM (in this case, "llama3") to generate a response based on the query and the retrieved context.

  4. The rag_query predicate ties it all together:

    • It first retrieves the relevant documents using our semantic search.
    • It then builds a context string from these documents.
    • Finally, it generates a response using the LLM, augmented with the retrieved context.

This simple RAG system shows how the integration of frames with Tamgu's predicate engine can be used to create sophisticated AI systems. The semantic search capability, activated by our custom equality operator, enables intelligent retrieval of relevant information. This is seamlessly combined with Tamgu's ability to interface with LLMs, resulting in a system capable of generating informed responses based on a dynamically managed knowledge base.

Such a system could be easily extended to include more complex reasoning, multi-step retrieval, or dynamic updates to the knowledge base. The logical framework provided by Tamgu's predicate engine allows for a clear expression of these complex workflows, while the frame system enables efficient management of complex data structures like embeddings.
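Stripped of Tamgu specifics, the retrieval trick of treating "equality" as "cosine similarity above a threshold" can be sketched in Python. The embeddings here are tiny hand-written vectors standing in for what a model such as command-r would actually produce.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

THRESHOLD = 0.3  # same cutoff as the frame's == operator

# Toy "knowledge base": sentence -> fake embedding
DOCS = {
    "The capital of France is Paris.": [0.9, 0.1, 0.0],
    "The Eiffel Tower is located in Paris.": [0.8, 0.3, 0.1],
    "London is the capital of the United Kingdom.": [0.1, 0.1, 0.9],
}

def retrieve(query_embedding):
    """findall-style scan: keep every fact 'equal' to the query."""
    return [s for s, e in DOCS.items()
            if cosine(query_embedding, e) > THRESHOLD]
```

In Tamgu, this scan is not a separate function: it falls out of findall's normal unification once `==` is overridden, which is what makes the frame version so compact.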

Note: When calling a predicate in Tamgu, the type of the recipient variable affects how the inference engine handles the query. If a vector is used, the engine explores all possible paths and returns multiple solutions. If a ?:- (predicate variable) type is used, the engine searches for a single solution and stops after finding the first valid result. If no variable needs to be unified, a Boolean can be used to check if it is true or false.

Integrating Python into Tamgu

Additionally, Tamgu allows the execution of Python code directly within predicates. This functionality enables the integration of Python's ecosystem and computational capabilities into the logical reasoning process. Here's an example of how this works:

//We load our Python interpreter library
use('pytamgu');

//We declare a Python variable `p`
python p;

function execute_code(string c, ?_ v) {
    println("Executing code");
    string py_err;
    try {
        // The second parameter of `run` requires that
        // the final result be in `result_variable`,
        // which is a Python variable
        v = p.run(c, "result_variable");
        return true;
    }
    catch(py_err) {
        v = py_err;
        return false;
    }
}

execution(?model, ?code, ?result) :-
    println("Execution"),
    ?success is execute_code(?code, ?r),
    handle_execution(?success, ?model, ?code, ?r, ?result).

handle_execution(true, ?model, ?code, ?r, ?r) :- !,
    println("Execution successful:", ?r).

handle_execution(false, ?model, ?code, ?error, ?result) :-
    println("Execution failed. Attempting correction."),
    correct_and_retry(?model, ?code, ?error, ?result).

In this setup, we define an execute_code function that executes Python code and captures either the result or any error message. The execution predicate then uses this function in the logical flow of our agent.

What is important here is the way Tamgu allows us to specify a particular variable name ("result_variable" in this case) from which to extract the result of the Python execution. This means that we can ask an LLM to generate Python code that stores its output in a specific variable, and then extract and use this value transparently in our predicate-based reasoning.
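The "agree on a result variable name" convention that p.run relies on is easy to mimic in plain Python with exec. This is only a sketch of the idea, not of pytamgu's actual implementation:

```python
def run_and_extract(code, result_name="result_variable"):
    """Execute generated code; return (ok, value or error message)."""
    scope = {}
    try:
        exec(code, scope)
        return True, scope.get(result_name)
    except Exception as err:
        return False, str(err)

ok, value = run_and_extract("result_variable = 6 * 7")
```

The failure branch is just as useful as the success branch: the captured error message is exactly what gets fed back to the LLM in the correction step below.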

If the execution fails, our handle_execution predicate can trigger a correction process, potentially asking the LLM to fix the code based on the error message. This creates a robust and self-correcting system that combines the strengths of logic programming and modern AI.
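The correct-and-retry loop itself can be sketched as follows; fix_code here is a hypothetical stand-in for the LLM round-trip that would normally receive the error message and return repaired code.

```python
def fix_code(code, error):
    # Stand-in for an LLM round-trip: for this illustration we simply
    # patch the undefined name the error message complains about.
    return code.replace("undefined_name", "41")

def run_with_retries(code, max_attempts=3):
    for _ in range(max_attempts):
        scope = {}
        try:
            exec(code, scope)
            return scope.get("result")
        except Exception as err:
            code = fix_code(code, str(err))   # feed the error back
    raise RuntimeError("could not repair the generated code")

value = run_with_retries("result = undefined_name + 1")
```

In the Tamgu version, this loop is expressed declaratively: handle_execution(false, ...) triggers correct_and_retry, and backtracking takes care of the retries.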

(see PREDIBAG for an example of the code)

Prolog as a Language for Agent Architectures

Prolog, with its roots in logic programming, offers a robust and expressive foundation for designing agent architectures. Its unique characteristics make it a particularly well-suited choice for creating intelligent agents capable of reasoning, planning, and interacting with their environment. Here are several reasons why Prolog is an excellent choice for agent architectures:

Declarative Nature

  • Logic-based Reasoning: The declarative nature of Prolog allows developers to focus on specifying what the agent should accomplish rather than how to accomplish it. This high-level abstraction simplifies the development of complex reasoning systems.
  • Rule-based Systems: Agents can be designed using a set of rules that define their behavior. These rules can be easily modified or extended, making the system flexible and adaptable to changing requirements.

Backtracking and Search

  • Exploring Solutions: Prolog's backtracking mechanism allows agents to explore different reasoning paths. This is crucial for problem-solving and decision-making, as agents can evaluate various options and select the most appropriate one.
  • Efficient Search: Prolog's built-in search capabilities enable agents to efficiently navigate large knowledge bases, finding relevant information and drawing conclusions based on the available data.

Knowledge Representation

  • Dynamic Knowledge Bases: Prolog's ability to dynamically update and query knowledge bases makes it ideal for agents that need to maintain and manipulate complex, evolving information. This is particularly useful in domains where the environment is constantly changing.
  • Symbolic Reasoning: Prolog excels at symbolic reasoning, allowing agents to represent and manipulate abstract concepts and relationships. This is essential for tasks that require understanding and reasoning about the world.

Integration with Modern AI Techniques

  • Hybrid Systems: Prolog can be integrated with modern AI techniques, such as machine learning and large language models (LLMs). This combination allows agents to leverage both symbolic reasoning and data-driven approaches, creating more powerful and versatile systems.
  • Interfacing with External Systems: Prolog can interface with external systems and libraries, enabling agents to interact with a wide range of tools and technologies. This includes executing Python code, querying databases, and communicating with other agents.

Transparency and Explainability

  • Explainable AI: The logical structure of Prolog makes it easier to understand and explain the reasoning process of agents. This is crucial for building trust in AI systems, especially in domains where transparency and accountability are important.
  • Debugging and Verification: Prolog's declarative nature and logical foundations make it easier to debug and verify the correctness of agent behaviors. This is essential for ensuring the reliability and robustness of AI systems.

Multi-Agent Collaboration

  • Shared Knowledge Bases: Prolog's ability to maintain and query shared knowledge bases provides a framework for multi-agent collaboration. Agents can exchange information, update shared knowledge, and coordinate their actions to achieve common goals.
  • Communication Protocols: Prolog can be used to define communication protocols and interaction patterns between agents. This enables the development of complex, cooperative agent systems that can work together to solve problems.

Conclusion

Prolog-based agent architectures offer a unique blend of logical reasoning, declarative programming, and dynamic knowledge representation. Tamgu-based agents, in particular, provide a robust and expressive framework for building intelligent and adaptable AI systems. They are especially valuable in domains where explainability, dynamic knowledge representation, and multi-agent collaboration matter. As AI continues to evolve, the revitalization of Prolog through approaches like Tamgu offers exciting possibilities for the future of agent-based systems.

The legacy of Prolog, far from being confined to history, could find new life in the era of large language models and hybrid AI systems.
