# LaVague Testing Documentation
The `lavague-tests` command-line interface (CLI) facilitates the testing and benchmarking of LaVague across a variety of common use cases.
## Installation
Install the CLI tool via pip:

```bash
pip install lavague-tests
```
## Quick Start

### Default tests
By default, `lavague-test` will run all tests defined in `/sites` with the default OpenAI `Context`.
The `/sites` folder is already populated with several tests covering a range of popular websites and use cases.
You can test out the performance of LaVague on these tests with the following command:
**Running `lavague-test`**

Note that in order to use the default configuration, you will need to run `lavague-test` from the root of the LaVague repo.

```bash
lavague-test
```
### Command line options overview
| Option | Alias | Description | Default Value | Required |
|---|---|---|---|---|
| `--context` | `-c` | Path to a Python file containing an initialized `Context` and `TokenCounter`. | `./lavague-tests/contexts/default_context.py` | No |
| `--directory` | `-d` | Directory where the site configurations are stored. | `./lavague-tests/sites` | No |
| `--site` | `-s` | Name of the site(s) to test. Multiple sites can be specified. | All sites in `./lavague-tests/sites` | No |
| `--display` | None | If set, displays the browser during the test. | False | No |
| `--log-to-db` | `-db` | If enabled, logs test results to the default SQLite database. | False | No |
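For example, you could combine these options to display the browser during the run and log the results to the default SQLite database:

```bash
lavague-test --display -db
```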
The output of a `lavague-test` run is a report including information on which tasks succeeded or failed, the percentage of tests that succeeded, and a table outlining the token consumption and estimated total cost of these tests:
```
Result: 78 % (18 / 23) in 727.5s
      Component |   Input | Output |   Total | Cost (USD) |
----------------------------------------------------------------------
    World Model |  128688 |   3655 |  132343 |   $ 0.6983 |
  Action Engine |  256960 |   6577 |  263537 |   $ 1.3835 |
     Embeddings | 3732520 |        | 3732520 |   $ 0.4852 |
----------------------------------------------------------------------
          Total | 4118168 |  10232 | 4128400 |   $ 2.5669 |
```
### Selecting which sites to test
You can run `lavague-test` on just one or several of the sites defined in the `sites` folder by specifying them with the `--site` or `-s` option. Note that each site name must match a corresponding folder containing a test configuration file in the `lavague-tests/sites` folder.
```bash
lavague-test -s amazon.com -s youtube.com
```
## Testing LaVague with a built-in context
We provide the following configuration files for running your tests with alternative built-in configurations:
- For testing with Anthropic's Claude 3.5 Sonnet: `lavague-tests/contexts/anthropic_context.py`
- For testing with Llama3.1 405B via the Fireworks API: `lavague-tests/contexts/fireworks_context.py`
- For testing with Gemini: `lavague-tests/contexts/gemini_context.py`
You can pass one of these config files to the `lavague-test` command with the `-c` option to run your tests with that built-in configuration:

```bash
lavague-test -c lavague-tests/contexts/anthropic_context.py
```
Note that you will need to set the relevant API keys and install the relevant LaVague PyPI packages (`lavague-contexts-anthropic`, `lavague-contexts-fireworks`, or `lavague-contexts-gemini`) for these to work.
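For example, to run tests with the Anthropic context you would first install its package:

```bash
pip install lavague-contexts-anthropic
```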
## Providing a custom configuration file
To define your own custom context, you can create a `.py` file in the `lavague-tests/contexts` folder. Within this file, you should define a `context` and a `token_counter` variable.
The `context` variable should be a `Context`, which you can create with the `llm`, `mm_llm`, and `embedding` of your choice. These models should be instances of `llama_index.llms`, `llama_index.multi_modal_llms`, and `llama_index.embeddings` models, respectively.
### Custom configuration file example
```python
from lavague.core.token_counter import TokenCounter
from llama_index.llms.openai import OpenAI
from llama_index.multi_modal_llms.openai import OpenAIMultiModal
from llama_index.embeddings.openai import OpenAIEmbedding
from lavague.core.context import Context

llm_name = "gpt-4o-mini"
mm_llm_name = "gpt-4o-mini"
embedding_name = "text-embedding-3-large"

token_counter = TokenCounter()

# Initialize models
llm = OpenAI(model=llm_name)
mm_llm = OpenAIMultiModal(model=mm_llm_name)
embedding = OpenAIEmbedding(model=embedding_name)

# Initialize context
context = Context(llm, mm_llm, embedding)
```
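You can then point `lavague-test` at your custom context file with the `-c` option. For instance, assuming you saved the file above as `lavague-tests/contexts/my_context.py` (a hypothetical file name):

```bash
lavague-test -c lavague-tests/contexts/my_context.py
```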
## How to add your own tests
To add a custom test, you will need to add a folder named after the site you will test within the `sites/` directory. Each site folder should contain a `config.yml` file specifying the tasks to execute during testing.
The `config.yml` file has a `tasks` option, within which you should define the tests to be run and any configuration for these tasks:
```yaml
tasks:
  - name: Name                   # Optional display name
    max_steps: 5                 # to override global value
    n_attempts: 1                # to override global value
    url: https://example.com     # the initial task URL
    prompt: Prompt for the agent # the agent prompt
    expect:                      # the list of tests to perform on task completion, see below for details
      - <property> <operator> <value>
    user_data:                   # optional task-scoped user data to feed the Agent
      key: value
```
You can also set options such as `max_steps` or `n_attempts` globally, above the `tasks` option. These global values will be overridden by any task-specific options of the same type.
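For instance, a `config.yml` with global defaults might be laid out like this sketch (the URL and prompt are placeholders):

```yaml
max_steps: 10   # applies to every task unless a task overrides it
n_attempts: 2   # applies to every task unless a task overrides it
tasks:
  - url: https://example.com
    prompt: Prompt for the agent
    expect:
      - Status is success
```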
Available operators:
| Operator | Python operation |
|---|---|
| `is` | `operator.eq` |
| `is not` | `operator.ne` |
| `is lower than` | `operator.lt` |
| `is greater than` | `operator.gt` |
| `contains` | `operator.contains` |
| `does not contain` | `not operator.contains` |
Available properties:
| Property | Type |
|---|---|
| URL | string |
| Status | success / failure |
| Output | string |
| Steps | number |
| HTML | string |
| Tabs | string |
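As an illustration, an `expect` block can mix these properties and operators; the values below are placeholders:

```yaml
expect:
  - Status is success
  - Steps is lower than 5
  - URL contains example.com
```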
### Example of `config.yml`
```yaml
tasks:
  - name: HuggingFace navigation
    url: https://huggingface.co/docs
    prompt: Go on the quicktour of PEFT
    expect:
      - URL is https://huggingface.co/docs/peft/quicktour
      - Status is success
      - HTML contains PEFT offers parameter-efficient methods for finetuning large pretrained models
  - name: HuggingFace search
    url: https://huggingface.co
    prompt: Find the-wave-250 dataset
    expect:
      - URL is https://huggingface.co/datasets/BigAction/the-wave-250
      - Status is success
```
## Testing on your static website
To test with a static website, you can add the following options to your task within your `config.yml` (see the sketch after this list):

- `type: static`
- `directory`: the directory to serve files from. Defaults to `www`, which is expected in the same folder as your `config.yml`.
- `port`: the port to serve the files on. Defaults to `8000`.
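Following the description above, a minimal static-site task might be sketched as follows; the task name, URL, and prompt are placeholders, and the exact placement of these options may differ in your setup:

```yaml
tasks:
  - name: Static site example     # hypothetical test name
    type: static                  # serve local files instead of a live site
    directory: www                # folder containing the static files, next to config.yml
    port: 8000                    # local port to serve the files on
    url: http://localhost:8000    # hypothetical URL pointing at the local server
    prompt: Prompt for the agent
    expect:
      - Status is success
```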
## Contributing your tests
We would like to add your tests to our set of default tests, enabling us to build a wide-ranging collection of tests reflecting our community's needs and interests.
To do so, you will need to:
- Clone the LaVague repo.
- Add a test website folder to the `lavague-tests/sites` folder: for example, for a test of `yahoo.com`, you should add a `yahoo.com` folder in the `sites` directory.
- Add your `config.yml` file defining your test in this folder.
- Push your additions and create a PR to the main LaVague repo.
For more information on how to submit a PR to the repo, see our contribution guide.