Thinkers360

Building an Intelligent Flight Assistant: A Multi-Level AI Journey - Agentic and Gemini 2.5 Flash

Aug



The journey begins with the foundations of GenAI and transformer models (Level 1), where the system is initialized by configuring the LLM (specifically, gemini-2.5-flash) and embedding models. This initial setup establishes the core AI engine. Building upon this, Level 2 delves into language model behaviour and prompting, demonstrating how to craft prompts for flight-related queries. Crucially, it introduces the concept of managing "hallucinations" by adding disclaimers to responses, ensuring users understand the simulated nature of the information. The output at this stage successfully explains complex aviation concepts like ICAO codes, showcasing the LLM's ability to generate informative text.
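The Level 2 pattern of prompting plus a disclaimer can be sketched roughly as follows. This is a minimal illustration rather than the notebook's code: the function names, the prompt wording, and the disclaimer text are all assumptions, and the model call is replaced by a stand-in so the example runs offline.

```python
from typing import Callable

# Hypothetical disclaimer wording; the article does not quote the actual text.
DISCLAIMER = (
    "Note: this answer is generated by an AI model and may be inaccurate. "
    "Verify flight details with an official source."
)


def build_flight_prompt(query: str) -> str:
    """Wrap a user question in a flight-assistant instruction (assumed wording)."""
    return (
        "You are a helpful flight assistant. "
        "Answer the following question clearly and concisely.\n"
        f"Question: {query}"
    )


def answer_flight_query(query: str, generate: Callable[[str], str]) -> str:
    """Send the prompt to the model and append the disclaimer to its reply."""
    reply = generate(build_flight_prompt(query)).strip()
    return f"{reply}\n\n{DISCLAIMER}"


def fake_generate(prompt: str) -> str:
    """Stand-in for a real model call so the sketch runs without an API key."""
    return "An ICAO code is a four-letter airport identifier, e.g. CYUL."


if __name__ == "__main__":
    print(answer_flight_query("What is an ICAO code?", fake_generate))
```

With the google-generativeai package installed and configured, `generate` could plausibly be a small wrapper that returns `model.generate_content(prompt).text` for a `genai.GenerativeModel("gemini-2.5-flash")` instance; passing the model call in as a parameter keeps the disclaimer logic testable on its own.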

The system then advances to integrate external knowledge and capabilities. Level 3 introduces Retrieval-Augmented Generation (RAG), a vital technique for grounding LLM responses in factual data. By simulating the retrieval of relevant flight information from a pre-defined dataset, the system can provide contextually accurate answers to specific queries, such as details about "Air Canada flight AC123." Following this, Level 4 explores LLMOps and tool integration. Here, the AI is empowered to interact with external "tools," exemplified by a mock weather API. This allows the system to respond to queries requiring real-time data, even if the data itself is simulated, demonstrating a critical step towards practical application.
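A simulated retrieval step of the kind Level 3 describes might look like the sketch below. The flight records are invented placeholders (the article does not show its dataset), and the keyword-overlap ranking is a simple stand-in for embedding-based similarity, not necessarily the method the notebook uses.

```python
import re

# Invented sample records; the article's actual dataset is not shown.
FLIGHT_DOCS = [
    "Air Canada flight AC123 departs Montreal (YUL) at 08:00 "
    "and arrives in Toronto (YYZ) at 09:30.",
    "Sample Air flight SA456 departs Vancouver (YVR) at 13:15 "
    "and arrives in Calgary (YYC) at 15:40.",
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, docs: list[str], top_k: int = 1) -> list[str]:
    """Rank documents by shared tokens with the query; drop zero-overlap docs."""
    query_tokens = tokenize(query)
    scored = [(len(query_tokens & tokenize(doc)), doc) for doc in docs]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def build_rag_prompt(query: str, docs: list[str]) -> str:
    """Place the retrieved context ahead of the question in the prompt."""
    context = "\n".join(retrieve(query, docs)) or "No matching flight information."
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    print(build_rag_prompt("Tell me about Air Canada flight AC123", FLIGHT_DOCS))
```

The resulting prompt would then be sent to the LLM, so the model answers from the retrieved record instead of from its own recall.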

The code demonstrates a multi-level approach to building a flight planning and booking system using a large language model (LLM). It starts with the fundamental concepts of GenAI and prompting, then progressively introduces more advanced topics. The levels are structured as follows:

  • Foundations of GenAI: The code begins by setting up the environment, configuring the LLM and embedding models, and defining basic parameters like temperature.
  • Prompting: A function is created to generate flight-related responses from the LLM, which also includes a disclaimer to handle potential inaccuracies or "hallucinations."
  • Retrieval-Augmented Generation (RAG): The system simulates retrieving relevant flight information from a static data source and uses this information to enrich the prompt given to the LLM.
  • Tool Integration: It introduces the ability for the agent to use external "tools" by creating a mock function to fetch real-time weather data for a given airport code.
  • Agents and Agentic Frameworks: A basic agent is defined to handle a flight planning request, simulating a thought process to determine the first step in creating a flight itinerary.
  • Agent Memory and State: A booking assistant is created that can maintain a conversation history and keep track of key information, such as the origin, destination, and date of a flight.
  • Multi-Agent Systems: The code shows how different agents—a planning agent and a booking assistant—can collaborate to fulfill a single, comprehensive user request.
  • Evaluation and Feedback Loops: A function is implemented to evaluate the success of an agent's response, and a feedback loop is simulated to refine the response if it is deemed insufficient.
  • Safety and Alignment: A safety-oriented prompt is used to ensure the agent's responses are factual, safe, and professional, preventing it from providing harmful or non-compliant information.
  • Production Concepts: The final level conceptually discusses what would be required to deploy such a system in a real-world production environment, including topics like prompt caching, observability, and cost management.
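The Tool Integration item in the list above mentions a mock function that fetches weather for an airport code. One way such a tool and a small dispatcher could be written is shown here; the function names, the registry, and the sample readings are all hypothetical, and a real tool would call a weather service instead of a dictionary.

```python
# Invented sample readings; a real tool would query a weather service.
MOCK_WEATHER = {
    "CYUL": {"condition": "Light snow", "temperature_c": -4},
    "EGLL": {"condition": "Overcast", "temperature_c": 9},
}


def get_airport_weather(airport_code: str) -> str:
    """Return a short weather report for an ICAO code, or a fallback message."""
    code = airport_code.strip().upper()
    report = MOCK_WEATHER.get(code)
    if report is None:
        return f"Weather data is not available for {code}."
    return f"Weather at {code}: {report['condition']}, {report['temperature_c']} C."


# Hypothetical registry mapping tool names to callables the agent may invoke.
TOOLS = {"get_airport_weather": get_airport_weather}


def call_tool(name: str, argument: str) -> str:
    """Look up a tool by name and run it, guarding against unknown names."""
    tool = TOOLS.get(name)
    if tool is None:
        return f"Unknown tool: {name}"
    return tool(argument)


if __name__ == "__main__":
    print(call_tool("get_airport_weather", "cyul"))
    print(call_tool("get_airport_weather", "KJFK"))
```

Returning a plain fallback string for unknown airports mirrors the behaviour reported in the article's output, where the system states that weather data is not available for a specific airport.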

Here is a summary of the final output:

  • Level 1: Foundations of GenAI and Transformers. This level covers the foundational setup of the system. It initializes genai.GenerativeModel with the gemini-2.5-flash model and uses genai.embed_content for embeddings. The Google Generative AI client is configured successfully with a Google API key, and the model names and temperature are printed.
  • Level 2: Flight Prompting. The system successfully explains what an ICAO code is. It provides a breakdown of its purpose, format, and distinction from IATA codes, using Montreal's airport (CYUL) as an example.
  • Level 3: RAG for Flight Planning. The system uses pre-defined flight information to answer a query about "Air Canada flight AC123," including departure, arrival, and flight duration details.
  • Level 4: Tool Integration for Flight Data. The output shows a response to a weather query, indicating that weather data is not available for a specific airport. It also provides a detailed response to a query about the best month to travel to London, breaking down the pros, cons, and "vibe" for different seasons.
  • Level 5: Agentic Flight Planning. The planning agent's thought process is demonstrated in response to a flight booking request, where it identifies the need to gather more information from the user before proceeding.
  • Level 6: Agent Memory & State (Flight Booking). The booking assistant demonstrates its ability to maintain state by updating its conversation history and state variables (origin, destination, and date) as the user provides more information.
  • Level 7: Multi-Agent Flight Planning. This level illustrates collaboration between a planning agent and a booking assistant. The planning agent receives a request, formulates a plan, and then passes the plan to the booking assistant.
  • Level 8: Evaluation, Feedback Loops, and RL. This is a conceptual level where a dummy function evaluate_booking_success is used to score a response based on keywords. The output also shows a simulated feedback loop where a response is refined after an initial, insufficient response is given.
  • Level 9: Protocols, Safety, and Alignment. The output demonstrates the use of a safety_prompt to ensure the agent provides factual and safe information.
  • Level 10: Build, Operate & Deploy in Production. This is a conceptual level that outlines production-level concerns, such as prompt caching, observability, traceability (using a unique booking_id), and cost management.
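The keyword-based scoring and simulated feedback loop described for Level 8 could be sketched as below. Only the name evaluate_booking_success comes from the article; its signature, the keyword list, the threshold, and the refine step are assumptions made for illustration.

```python
from typing import Callable

# Hypothetical keyword list; the article does not say which keywords it checks.
SUCCESS_KEYWORDS = ("origin", "destination", "date")


def evaluate_booking_success(response: str, keywords=SUCCESS_KEYWORDS) -> float:
    """Score a response as the fraction of expected keywords it mentions."""
    text = response.lower()
    hits = sum(1 for keyword in keywords if keyword in text)
    return hits / len(keywords)


def refine_until_sufficient(
    response: str,
    refine: Callable[[str], str],
    threshold: float = 1.0,
    max_rounds: int = 3,
) -> tuple[str, float]:
    """Ask for a refined response until the score reaches the threshold."""
    score = evaluate_booking_success(response)
    rounds = 0
    while score < threshold and rounds < max_rounds:
        response = refine(response)
        score = evaluate_booking_success(response)
        rounds += 1
    return response, score


def fake_refine(response: str) -> str:
    """Stand-in for re-prompting the model with feedback."""
    return response + " Please confirm your origin, destination, and travel date."


if __name__ == "__main__":
    final, score = refine_until_sufficient("I can help with that.", fake_refine)
    print(score, final)
```

A substring check like this is deliberately crude; it only shows where an evaluation signal would plug into the loop, and the `max_rounds` cap keeps the loop from running indefinitely if the score never improves.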

As the system grows more sophisticated, the focus shifts to creating more autonomous and stateful components. Level 5 introduces the concept of agents and agentic frameworks, where a FlightPlannerAgent is designed to simulate intelligent planning. This agent can analyze a user's request and determine the necessary next steps, such as identifying missing information for a flight search. This agentic behaviour is further enhanced in Level 6, which focuses on agent memory, state, and orchestration. A FlightBookingAssistant is developed to maintain a continuous conversation, updating its internal state with user-provided details like origin, destination, and travel dates. This allows for more natural and coherent multi-turn interactions.
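A stateful assistant along the lines of the Level 6 description might be structured like this. The class name FlightBookingAssistant comes from the article, but the fields, the `update` method, and the reply wording are assumptions; details are passed in as keyword arguments as a stand-in for whatever extraction the notebook performs on the user's message.

```python
from dataclasses import dataclass, field


def _empty_state() -> dict:
    """Fresh state with the three fields the article says are tracked."""
    return {"origin": None, "destination": None, "date": None}


@dataclass
class FlightBookingAssistant:
    history: list = field(default_factory=list)
    state: dict = field(default_factory=_empty_state)

    def update(self, user_message: str, **details: str) -> str:
        """Record the turn, merge any known details, and report what is missing."""
        self.history.append(("user", user_message))
        for key, value in details.items():
            if key in self.state:  # ignore fields the assistant does not track
                self.state[key] = value
        missing = [key for key, value in self.state.items() if value is None]
        if missing:
            reply = "Still needed: " + ", ".join(missing) + "."
        else:
            reply = "Searching flights from {origin} to {destination} on {date}.".format(
                **self.state
            )
        self.history.append(("assistant", reply))
        return reply


if __name__ == "__main__":
    assistant = FlightBookingAssistant()
    print(assistant.update("I want to fly to London", destination="London"))
    print(assistant.update("From Montreal on 2025-09-01",
                           origin="Montreal", date="2025-09-01"))
```

Because the state persists across calls, the second turn only needs to supply what was missing from the first, which is what makes the multi-turn exchange coherent.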

The pinnacle of the system's design is reached with multi-agent systems and collaboration (Level 7). Here, a MultiAgentFlightSystem orchestrates the interaction between the PlanningAgent and the BookingAssistant. The planning agent initiates the process, formulates a preliminary plan, and then seamlessly hands it off to the booking assistant for further processing, showcasing a modular and collaborative AI architecture.

Beyond functionality, the document addresses critical aspects of AI system reliability and deployment. Level 8 delves into evaluation, feedback loops, and reinforcement learning (RL), conceptually demonstrating how a system's performance can be evaluated and refined over time through simulated feedback. Level 9 emphasizes protocols, safety, and advanced alignment, illustrating how strict safety prompts can be integrated to prevent the agent from providing harmful or non-compliant information, a crucial consideration for real-world applications. Finally, Level 10 provides a conceptual overview of building, operating, and deploying such a system in production. This level touches upon vital LLMOps considerations like prompt caching for efficiency, observability for monitoring, traceability for debugging, and cost management for optimizing resource usage.
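The Level 7 handoff from planning agent to booking assistant can be illustrated with a small sketch. The class names follow the article, but every method, the plan dictionary layout, and the rule-based planning are assumptions; in the notebook the planning step is presumably driven by the LLM.

```python
class PlanningAgent:
    """Decides the next step for a request (rule-based stand-in for an LLM)."""

    REQUIRED = ("origin", "destination", "date")

    def plan(self, request: dict) -> dict:
        missing = [name for name in self.REQUIRED if not request.get(name)]
        next_step = "ask_user" if missing else "search_flights"
        return {"request": request, "missing": missing, "next_step": next_step}


class BookingAssistant:
    """Acts on the plan produced by the planning agent."""

    def handle(self, plan: dict) -> str:
        if plan["next_step"] == "ask_user":
            return "Please provide: " + ", ".join(plan["missing"]) + "."
        req = plan["request"]
        return (
            f"Searching flights from {req['origin']} "
            f"to {req['destination']} on {req['date']}."
        )


class MultiAgentFlightSystem:
    """Orchestrator: the planner runs first, then hands its plan to the booker."""

    def __init__(self) -> None:
        self.planner = PlanningAgent()
        self.booker = BookingAssistant()

    def run(self, request: dict) -> str:
        plan = self.planner.plan(request)
        return self.booker.handle(plan)


if __name__ == "__main__":
    system = MultiAgentFlightSystem()
    print(system.run({"destination": "London"}))
    print(system.run({"origin": "Montreal", "destination": "London",
                      "date": "2025-09-01"}))
```

Keeping the plan as plain data passed between the two agents is one way to get the modularity the article describes: either agent can be replaced without the other needing to change.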

In conclusion, the Jupyter Notebook presents a compelling narrative of building a complex AI application from the ground up. It meticulously guides the reader through ten distinct levels, each adding a layer of sophistication to the flight assistant. From initial LLM configuration and intelligent prompting to robust data integration, multi-agent collaboration, and essential safety and production considerations, the document offers a holistic view of the iterative process of developing advanced Generative AI solutions.

By FRANK MORALES

Keywords: Agentic AI, Generative AI, Predictive Analytics
