Hugging Face Papers retrieval¶
This notebook shows how to use LaVague to build an agent that can outperform Gemini or ChatGPT at finding the latest trending papers on Hugging Face Papers!
We will create the agent and serve it through a Gradio interface.
The demo below shows our agent outperforming both Gemini and ChatGPT on the query "What is the most trendy recent paper on text-to-video on Hugging Face papers? Provide the date and a summary of the paper".
Pre-requisites¶
Note: We use OpenAI's models for the embedding, LLM, and vision components. For this example to work, you need to set the OPENAI_API_KEY variable in your local environment to a valid API key.
If you don't have an OpenAI API key, please get one here: https://platform.openai.com/docs/quickstart/developer-quickstart
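Before going further, it can help to confirm that the key is actually visible to the notebook process. The following is a minimal sketch, not part of LaVague: `openai_key_is_set` is a hypothetical helper introduced here for illustration, and it only checks that the variable is present and non-empty, not that the key is valid.

```python
import os


def openai_key_is_set(env=None):
    """Return True if OPENAI_API_KEY is present and non-empty.

    `env` is a hypothetical parameter for illustration: any mapping of
    environment variables. It defaults to the current process environment.
    """
    env = os.environ if env is None else env
    return bool(env.get("OPENAI_API_KEY", "").strip())


if not openai_key_is_set():
    # This only checks presence; it does not validate the key with OpenAI.
    print("OPENAI_API_KEY is not set; set it before running the cells below.")
```

In a hosted notebook, one option is to assign the key to `os.environ["OPENAI_API_KEY"]` in a cell (for example, after reading it with `getpass.getpass()`), rather than hard-coding it in the notebook.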
Installation¶
We start by installing LaVague.
!pip install lavague
Collecting lavague
Downloading lavague-1.1.3-py3-none-any.whl (8.3 kB)
Requirement already satisfied: distro<2,>=1.7.0 in /usr/lib/python3/dist-packages (from openai>=1.1.0->llama-index-core==0.10.43->llama-index<0.11.0,>=0.10.19->lavague-core<0.3.0,>=0.2.14->lavague) (1.7.0)
Requirement already satisfied: markdown-it-py>=2.2.0 in /usr/local/lib/python3.10/dist-packages (from rich<14.0.0,>=10.11.0->typer[all]<1.0,>=0.9->gradio==4.26.0->lavague-gradio<0.3.0,>=0.2.4->lavague) (3.0.0)
Collecting mypy-extensions>=0.3.0 (from typing-inspect>=0.8.0->llama-index-core==0.10.43->llama-index<0.11.0,>=0.10.19->lavague-core<0.3.0,>=0.2.14->lavague)
Downloading mypy_extensions-1.0.0-py3-none-any.whl (4.7 kB)
Requirement already satisfied: pycparser in /usr/local/lib/python3.10/dist-packages (from cffi>=1.12->cryptography>=2.5->azure-identity<2.0.0,>=1.15.0->llama-index-llms-azure-openai<0.2.0,>=0.1.8->lavague-contexts-openai<0.3.0,>=0.2.0->lavague) (2.22)
Requirement already satisfied: mdurl~=0.1 in /usr/local/lib/python3.10/dist-packages (from markdown-it-py>=2.2.0->rich<14.0.0,>=10.11.0->typer[all]<1.0,>=0.9->gradio==4.26.0->lavague-gradio<0.3.0,>=0.2.4->lavague) (0.1.2)
Building wheels for collected packages: ffmpy
Building wheel for ffmpy (setup.py) ... done
Created wheel for ffmpy: filename=ffmpy-0.3.2-py3-none-any.whl size=5584 sha256=e16ca5ce39cfe1158cd22046a9134cc5a253407096592f683988b62082feb21a
Stored in directory: /root/.cache/pip/wheels/bd/65/9a/671fc6dcde07d4418df0c592f8df512b26d7a0029c2a23dd81
Successfully built ffmpy
Installing collected packages: striprtf, pydub, ffmpy, dirtyjson, websockets, tomlkit, tld, shellingham, semantic-version, ruff, rank-bm25, python-multipart, pypdf, portalocker, packaging, outcome, orjson, mypy-extensions, lxml, jsonpointer, jedi, h11, deprecated, colorama, aiofiles, wsproto, uvicorn, typing-inspect, trio, tiktoken, starlette, marshmallow, lxml-html-clean, jsonpatch, httpcore, dateparser, courlan, azure-core, trio-websocket, langsmith, httpx, htmldate, fastapi, dataclasses-json, selenium, openai, msal, llamaindex-py-client, langchain-core, justext, gradio-client, trafilatura, msal-extensions, llama-index-legacy, llama-index-core, langchain-text-splitters, langchain-community, gradio, llama-parse, llama-index-retrievers-bm25, llama-index-readers-file, llama-index-llms-openai, llama-index-indices-managed-llama-cloud, llama-index-embeddings-openai, langchain, azure-identity, llama-index-readers-llama-parse, llama-index-multi-modal-llms-openai, llama-index-llms-azure-openai, llama-index-cli, llama-index-agent-openai, llama-index-program-openai, llama-index-multi-modal-llms-azure-openai, llama-index-question-gen-openai, llama-index, lavague-core, lavague-gradio, lavague-drivers-selenium, lavague-contexts-openai, lavague
Attempting uninstall: packaging
Found existing installation: packaging 24.0
Uninstalling packaging-24.0:
Successfully uninstalled packaging-24.0
Attempting uninstall: lxml
Found existing installation: lxml 4.9.4
Uninstalling lxml-4.9.4:
Successfully uninstalled lxml-4.9.4
Successfully installed aiofiles-23.2.1 azure-core-1.30.2 azure-identity-1.16.0 colorama-0.4.6 courlan-1.2.0 dataclasses-json-0.6.7 dateparser-1.2.0 deprecated-1.2.14 dirtyjson-1.0.8 fastapi-0.110.3 ffmpy-0.3.2 gradio-4.26.0 gradio-client-0.15.1 h11-0.14.0 htmldate-1.8.1 httpcore-1.0.5 httpx-0.27.0 jedi-0.19.1 jsonpatch-1.33 jsonpointer-2.4 justext-3.0.1 langchain-0.1.20 langchain-community-0.0.38 langchain-core-0.1.52 langchain-text-splitters-0.0.2 langsmith-0.1.75 lavague-1.1.3 lavague-contexts-openai-0.2.0 lavague-core-0.2.14 lavague-drivers-selenium-0.2.3 lavague-gradio-0.2.4 llama-index-0.10.43 llama-index-agent-openai-0.2.7 llama-index-cli-0.1.12 llama-index-core-0.10.43 llama-index-embeddings-openai-0.1.10 llama-index-indices-managed-llama-cloud-0.1.6 llama-index-legacy-0.9.48 llama-index-llms-azure-openai-0.1.8 llama-index-llms-openai-0.1.22 llama-index-multi-modal-llms-azure-openai-0.1.4 llama-index-multi-modal-llms-openai-0.1.6 llama-index-program-openai-0.1.6 llama-index-question-gen-openai-0.1.3 llama-index-readers-file-0.1.23 llama-index-readers-llama-parse-0.1.4 llama-index-retrievers-bm25-0.1.3 llama-parse-0.4.4 llamaindex-py-client-0.1.19 lxml-5.2.2 lxml-html-clean-0.1.1 marshmallow-3.21.3 msal-1.28.0 msal-extensions-1.1.0 mypy-extensions-1.0.0 openai-1.33.0 orjson-3.10.3 outcome-1.3.0.post0 packaging-23.2 portalocker-2.8.2 pydub-0.25.1 pypdf-4.2.0 python-multipart-0.0.9 rank-bm25-0.2.2 ruff-0.4.8 selenium-4.21.0 semantic-version-2.10.0 shellingham-1.5.4 starlette-0.37.2 striprtf-0.0.26 tiktoken-0.7.0 tld-0.13 tomlkit-0.12.0 trafilatura-1.10.0 trio-0.25.1 trio-websocket-0.11.1 typing-inspect-0.9.0 uvicorn-0.30.1 websockets-11.0.3 wsproto-1.2.0
In Colab, store your OpenAI API key as a secret named OPENAI_API_KEY (see the key icon on the left-hand side of the Colab notebook). The cell below then exposes it as an environment variable with the same name. Outside Colab, the key is expected to be set in your local environment already.
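import os

# Check if running in Google Colab
try:
    from google.colab import userdata
    IN_COLAB = True
except ImportError:
    IN_COLAB = False

if IN_COLAB:
    # Copy the Colab secret into the environment variable of the same name
    os.environ["OPENAI_API_KEY"] = userdata.get('OPENAI_API_KEY')
elif not os.getenv('OPENAI_API_KEY'):
    # Outside Colab the key must already be set in the local environment
    raise EnvironmentError("OPENAI_API_KEY is not set in the environment")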
Demo¶
We start by downloading extra knowledge about Hugging Face's Papers pages, which helps the WorldModel produce the right reasoning steps.
You can learn more about building Agents with LaVague in our webinar.
!wget https://raw.githubusercontent.com/lavague-ai/LaVague/main/examples/knowledge/hf_knowledge.txt
--2024-06-10 13:58:02-- https://raw.githubusercontent.com/lavague-ai/LaVague/main/examples/knowledge/hf_knowledge.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.110.133, 185.199.111.133, 185.199.108.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.110.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3968 (3.9K) [text/plain]
Saving to: ‘hf_knowledge.txt.1’
hf_knowledge.txt.1 100%[===================>] 3.88K --.-KB/s in 0s
2024-06-10 13:58:02 (41.6 MB/s) - ‘hf_knowledge.txt.1’ saved [3968/3968]
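Note that the output above saved the file as hf_knowledge.txt.1: wget adds a numeric suffix when a file with the requested name already exists, so re-running the cell leaves the original hf_knowledge.txt untouched and accumulates copies. If that is not what you want, a minimal sketch of a download that skips existing files could look like the following. The helper name `fetch_if_missing` and its `download` parameter are hypothetical and not part of LaVague; only the URL comes from the cell above.

```python
import os
import urllib.request

# URL taken from the wget cell above
KNOWLEDGE_URL = (
    "https://raw.githubusercontent.com/lavague-ai/LaVague/main/"
    "examples/knowledge/hf_knowledge.txt"
)


def fetch_if_missing(url, dest, download=urllib.request.urlretrieve):
    """Download `url` to `dest` unless `dest` already exists.

    Returns True if a download happened, False if the file was already there.
    `download` is injectable (hypothetical parameter) so the helper can be
    exercised without network access.
    """
    if os.path.exists(dest):
        return False
    download(url, dest)
    return True


# Example call (commented out so that nothing is fetched on import):
# fetch_if_missing(KNOWLEDGE_URL, "hf_knowledge.txt")
```

With this approach the file name passed to world_model.add_knowledge always refers to a single copy, at the cost of never refreshing it automatically; delete the file first if you want the latest version.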
We can now define our agent and run it to create a Gradio demo.
from lavague.drivers.selenium import SeleniumDriver
from lavague.core import ActionEngine, WorldModel
from lavague.core.agents import WebAgent

# The driver controls the browser; the action engine executes instructions through it
selenium_driver = SeleniumDriver()
action_engine = ActionEngine(selenium_driver)

# The world model decides the next step; load the Hugging Face knowledge file into it
world_model = WorldModel()
world_model.add_knowledge("hf_knowledge.txt")

agent = WebAgent(world_model, action_engine)

# Open the starting page, then launch the Gradio demo with our objective
agent.get("https://huggingface.co/papers")
agent.demo("What is the most trendy recent paper on text to video on Hugging Face papers? Provide the date and a summary of the paper")
2024-06-10 13:59:51,356 - INFO - Screenshot folder cleared
Colab notebook detected. This cell will run indefinitely so that you can see errors and logs. To turn off, set debug=False in launch().
IMPORTANT: You are using gradio version 4.26.0, however version 4.29.0 is available, please upgrade.
--------
Running on public URL: https://776719fb5f91f2162a.gradio.live
This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)
2024-06-10 14:00:04,798 - INFO - Screenshot folder cleared
2024-06-10 14:00:10,889 - INFO - Thoughts:
- The current screenshot shows the "Daily Papers" section on Hugging Face.
- The objective is to find the most trendy recent paper on text to video.
- The current screenshot does not show any paper related to text to video.
- To ensure we have complete information, we need to scan the entire page as the current screenshot seems to show only part of the content.
Next engine: Navigation Controls
Instruction: SCAN
2024-06-10 14:00:20,292 - INFO - Thoughts:
- The current screenshots show the daily papers on Hugging Face.
- The objective is to find the most trendy recent paper on text to video.
- The current screenshots do not show any paper related to text to video.
- To find the relevant paper, we need to navigate to older papers.
- The 'Previous' button allows navigating to pages containing older papers.
- Therefore, the best next step is to click on the 'Previous' button to find papers on text to video.
Next engine: Navigation Engine
Instruction: Click on the 'Previous' button.
2024-06-10 14:00:41,124 - INFO - Thoughts:
- The current screenshot shows the "Daily Papers" section on Hugging Face.
- The objective is to find the most trendy recent paper on text to video.
- The paper "ShareGPT4Video: Improving Video Understanding and Generation with Better Captions" seems relevant to the objective.
- The next step is to click on this paper to gather more details, including the date and a summary.
Next engine: Navigation Engine
Instruction: Click on the paper "ShareGPT4Video: Improving Video Understanding and Generation with Better Captions".
2024-06-10 14:01:04,979 - INFO - Thoughts:
- The current screenshot shows the paper "ShareGPT4Video: Improving Video Understanding and Generation with Better Captions" on Hugging Face.
- The objective is to provide the date and a summary of the paper.
- The paper's title and authors are visible, and the abstract provides a summary of the paper.
- The publication date is also visible.
Next engine: COMPLETE
Instruction: The most trendy recent paper on text to video on Hugging Face is "ShareGPT4Video: Improving Video Understanding and Generation with Better Captions," published on June 6. The paper presents the ShareGPT4Video system, which aims to facilitate the video generation of large-scale language models (LLMs) and the video generation of text-to-video models (T2VMs) via dense and precise captions. The system includes a high-quality dataset with dense captions and a new evaluation metric for video generation. The authors demonstrate the effectiveness of ShareGPT4Video through extensive experiments and comparisons with existing methods.
2024-06-10 14:01:05,144 - INFO - Objective reached. Stopping...
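Each reasoning step in the log above follows the same pattern: a timestamp, a list of "Thoughts", a "Next engine" and an "Instruction". If you capture this output as text, a small parser can turn it into structured steps for inspection. The sketch below is only an illustration based on the format visible in this run. The names `Step`, `STEP_RE` and `parse_steps` are hypothetical and not part of LaVague, the log format may differ in other versions, and the sample string is a shortened excerpt of the log above.

```python
import re
from typing import List, NamedTuple


class Step(NamedTuple):
    thoughts: List[str]  # bullet points listed after "Thoughts:"
    engine: str          # text after "Next engine:"
    instruction: str     # text after "Instruction:"


# One step runs from "Thoughts:" up to the next timestamp or the end of the log.
# Assumption: timestamps look like "2024-06-10 14:00:10,889 - ".
STEP_RE = re.compile(
    r"Thoughts:(?P<thoughts>.*?)"
    r"Next engine:(?P<engine>.*?)"
    r"Instruction:(?P<instruction>.*?)"
    r"(?=\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} - |\Z)",
    re.DOTALL,
)


def parse_steps(log: str) -> List[Step]:
    """Split captured log text into (thoughts, engine, instruction) steps."""
    steps = []
    for match in STEP_RE.finditer(log):
        # Heuristic: a bullet is a "-" surrounded by whitespace. A thought that
        # itself contains " - " would be split in two.
        bullets = re.split(r"\s-\s", match.group("thoughts"))
        thoughts = [b.strip() for b in bullets if b.strip()]
        steps.append(
            Step(
                thoughts,
                match.group("engine").strip(),
                match.group("instruction").strip(),
            )
        )
    return steps


# Shortened excerpt of the log above (some thoughts omitted for brevity)
SAMPLE_LOG = (
    '2024-06-10 14:00:10,889 - INFO - Thoughts: '
    '- The current screenshot shows the "Daily Papers" section on Hugging Face. '
    '- The objective is to find the most trendy recent paper on text to video. '
    'Next engine: Navigation Controls Instruction: SCAN '
    '2024-06-10 14:00:20,292 - INFO - Thoughts: '
    "- The 'Previous' button allows navigating to pages containing older papers. "
    "Next engine: Navigation Engine Instruction: Click on the 'Previous' button. "
    '2024-06-10 14:01:05,144 - INFO - Objective reached. Stopping...'
)

for step in parse_steps(SAMPLE_LOG):
    print(step.engine, "->", step.instruction)
```

Reading the steps this way makes it easier to see which engine the world model picked at each point, for example the SCAN control followed by two navigation clicks before the COMPLETE step in the run above.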