Quick Tour¶
Pre-requisites¶
Note: We use OpenAI's models for the embedding model, the LLM, and the vision model. For this example to work, you will need to set the OPENAI_API_KEY variable in your local environment to a valid API key.
If you don't have an OpenAI API key, please get one here: https://platform.openai.com/docs/quickstart/developer-quickstart
Installation¶
We start by installing LaVague.
!pip install lavague
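If you are running this in Google Colab, set your OpenAI key as a Colab secret named 'OPENAI_API_KEY' (see the key icon on the left-hand side of the Colab notebook). The code below then copies it into an environment variable with the same name. Outside Colab, the variable must already be set in your environment.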
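import os

# Check if running in Google Colab
try:
    from google.colab import userdata
    IN_COLAB = True
except ImportError:
    IN_COLAB = False

if IN_COLAB:
    # Copy the Colab secret into an environment variable
    os.environ["OPENAI_API_KEY"] = userdata.get("OPENAI_API_KEY")
elif not os.getenv("OPENAI_API_KEY"):
    # Outside Colab the key must already be set; fail early with a clear message
    raise RuntimeError("OPENAI_API_KEY is not set in your environment")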
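Before moving on, it can help to confirm that the key is actually visible to Python without printing it in full. The snippet below is a minimal sketch using only the standard library; `mask_secret` is a hypothetical helper written for this illustration and is not part of LaVague.

```python
import os


def mask_secret(value, visible=4):
    """Return a display-safe version of a secret, keeping only the last few characters."""
    if not value:
        return "<not set>"
    if len(value) <= visible:
        # Too short to reveal anything safely: mask every character
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


# Shows "<not set>" if the variable is missing, otherwise a masked value
print("OPENAI_API_KEY:", mask_secret(os.environ.get("OPENAI_API_KEY")))
```

If this prints `<not set>`, revisit the setup step above before initializing the agent.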
ActionEngine¶
A WebAgent is made up of two components: an ActionEngine and a WorldModel.
Let's start by initializing an ActionEngine, which is responsible for generating automation code from text instructions and executing it.
from lavague.core import ActionEngine
from lavague.drivers.selenium import SeleniumDriver
selenium_driver = SeleniumDriver()
action_engine = ActionEngine(selenium_driver)
World model¶
Next, we will initialize our WorldModel. It is provided with examples of global objectives, along with the thought process and reasoning we want it to replicate, so that it can generate the next instruction to pass to the ActionEngine.
from lavague.core import WorldModel
world_model = WorldModel()
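To make the division of labour between the two components more concrete, here is a toy sketch of a plan-then-act loop. It is not LaVague's implementation: `ToyWorldModel`, `ToyActionEngine`, and `run_agent` are hypothetical stand-ins invented for this illustration, the "world model" simply walks a pre-written plan, and the "action engine" records instructions instead of driving a browser.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyWorldModel:
    """Hypothetical stand-in: returns the next instruction from a fixed plan."""
    plan: list

    def next_instruction(self, objective: str, steps_done: int) -> Optional[str]:
        # A real world model would reason over the objective and the page state;
        # here we walk a pre-written plan and return None once it is finished.
        if steps_done < len(self.plan):
            return self.plan[steps_done]
        return None


@dataclass
class ToyActionEngine:
    """Hypothetical stand-in: records instructions instead of driving a browser."""
    executed: list = field(default_factory=list)

    def execute(self, instruction: str) -> None:
        # A real action engine would generate automation code and run it.
        self.executed.append(instruction)


def run_agent(world_model, action_engine, objective, max_steps=10):
    """Alternate between planning and acting until the plan ends or max_steps is hit."""
    for step in range(max_steps):
        instruction = world_model.next_instruction(objective, step)
        if instruction is None:
            break
        action_engine.execute(instruction)
    return action_engine.executed


steps = run_agent(
    ToyWorldModel(plan=["Click on 'PEFT'", "Click on 'Quicktour'"]),
    ToyActionEngine(),
    objective="Go on the quicktour of PEFT",
)
print(steps)
```

The point of the sketch is only the shape of the loop: one component decides what to do next, the other carries it out, and the cycle repeats until there is nothing left to do.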
WebAgent Demo¶
We can now use these two elements to initialize a WebAgent and start playing with it!
In the following example, we show how our agent can achieve a user-defined goal: here, navigating to the quick tour of Hugging Face's PEFT framework for model fine-tuning.
from lavague.core.agents import WebAgent
agent = WebAgent(world_model, action_engine)
agent.get("https://huggingface.co/docs")
agent.run("Go on the quicktour of PEFT", display=True)
Interactive demo mode¶
You can also launch an interactive Gradio interface for using the agent with the agent.demo() method.
from lavague.core import ActionEngine, WorldModel
from lavague.core.agents import WebAgent
from lavague.drivers.selenium import SeleniumDriver

# Initialize the driver in headless mode, then the two agent components
driver = SeleniumDriver(headless=True)
action_engine = ActionEngine(driver)
world_model = WorldModel()

# Create Web Agent
agent = WebAgent(world_model, action_engine)

# Set URL
agent.get("https://huggingface.co/docs")

# Launch the agent in the Agent Gradio Demo mode
agent.demo("Go on the quicktour of PEFT")