Module core.utilities.format_utils

Functions

def clean_html(html_to_clean: str, tags_to_remove: List[str] = ['style', 'svg', 'script'], attributes_to_keep: List[str] = ['id', 'href']) -> str

Clean HTML content by removing the specified tags and stripping every attribute except those listed in attributes_to_keep.

Args

html_to_clean : str
The HTML content to clean.
tags_to_remove : List[str]
List of tags to remove from the HTML content. Default is ['style', 'svg', 'script'].
attributes_to_keep : List[str]
List of attributes to keep in the HTML tags. Default is ['id', 'href'].

Returns

str
The cleaned HTML content.

Example

>>> from clean_html_for_llm import clean_html
>>> cleaned_html = clean_html('<div id="main" style="color:red">Hello <script>alert("World")</script></div>', tags_to_remove=['script'], attributes_to_keep=['id'])
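
The behaviour described above can be approximated with the standard library alone. The following is a minimal sketch, not the module's actual implementation: the names clean_html_sketch and _SketchCleaner are hypothetical, and the real function may rely on a different parser and handle edge cases differently.

```python
from html import escape
from html.parser import HTMLParser
from typing import List, Optional, Sequence, Tuple


class _SketchCleaner(HTMLParser):
    """Hypothetical helper: rebuilds HTML, skipping some tags and attributes."""

    def __init__(self, tags_to_remove: Sequence[str], attributes_to_keep: Sequence[str]):
        super().__init__(convert_charrefs=False)
        # HTMLParser reports tag and attribute names in lowercase.
        self.remove = {t.lower() for t in tags_to_remove}
        self.keep = {a.lower() for a in attributes_to_keep}
        self.skip_depth = 0  # > 0 while inside a removed tag
        self.parts: List[str] = []

    def _attrs(self, attrs: List[Tuple[str, Optional[str]]]) -> str:
        out = []
        for name, value in attrs:
            if name in self.keep:
                out.append(f" {name}" if value is None else f' {name}="{escape(value)}"')
        return "".join(out)

    def handle_starttag(self, tag, attrs):
        if tag in self.remove:
            # Assumes removed tags have a closing tag (true for style/svg/script).
            self.skip_depth += 1
        elif self.skip_depth == 0:
            self.parts.append(f"<{tag}{self._attrs(attrs)}>")

    def handle_startendtag(self, tag, attrs):
        if tag not in self.remove and self.skip_depth == 0:
            self.parts.append(f"<{tag}{self._attrs(attrs)}/>")

    def handle_endtag(self, tag):
        if tag in self.remove:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.parts.append(data)

    def handle_entityref(self, name):
        if self.skip_depth == 0:
            self.parts.append(f"&{name};")

    def handle_charref(self, name):
        if self.skip_depth == 0:
            self.parts.append(f"&#{name};")


def clean_html_sketch(
    html_to_clean: str,
    tags_to_remove: Sequence[str] = ("style", "svg", "script"),
    attributes_to_keep: Sequence[str] = ("id", "href"),
) -> str:
    """Drop the listed tags (with their content) and all unlisted attributes."""
    cleaner = _SketchCleaner(tags_to_remove, attributes_to_keep)
    cleaner.feed(html_to_clean)
    cleaner.close()
    return "".join(cleaner.parts)
```

In this sketch, comments and doctype declarations are dropped because their handlers are not overridden, and tuples replace the list defaults of the documented signature to avoid mutable default arguments.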

def extract_and_eval(string, extract_list=True)

def extract_code_from_funct(funct: Callable) -> List[str]

Extract the code lines of a function, dropping the first line (the function definition) and the last line (the return statement), and correcting the indentation of the remaining lines.
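
One way to picture this extraction is shown below. It is a minimal sketch under stated assumptions, not the module's implementation: body_lines_from_source and extract_code_sketch are hypothetical names, and the sketch assumes the definition fits on one line and the function ends with a single-line return.

```python
import inspect
import textwrap
from typing import Callable, List


def body_lines_from_source(source: str) -> List[str]:
    """Drop the first and last lines of a function's source, then dedent."""
    lines = source.splitlines()
    body = lines[1:-1]  # skip the "def ..." line and the final "return ..." line
    # textwrap.dedent removes the indentation shared by all remaining lines.
    return textwrap.dedent("\n".join(body)).splitlines()


def extract_code_sketch(funct: Callable) -> List[str]:
    """Apply the same logic to a live function object."""
    # inspect.getsource only works when the function's source file is available.
    return body_lines_from_source(inspect.getsource(funct))
```

Relative indentation inside the body (for example, under an if) is preserved; only the common leading whitespace is removed.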

def extract_imports_from_lines(lines: List[str]) -> str

Keep only the import lines from a list of Python code lines and join them into a single string.
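
The filtering step can be sketched as follows. This is an illustrative approximation: extract_imports_sketch is a hypothetical name, and both the newline separator and the stripping of leading whitespace are assumptions, since the documentation does not specify them.

```python
from typing import List


def extract_imports_sketch(lines: List[str]) -> str:
    """Keep lines that start with 'import ' or 'from ' and join them."""
    kept = [
        line.strip()
        for line in lines
        if line.strip().startswith(("import ", "from "))
    ]
    # Joining with newlines is an assumption; the real separator is not documented.
    return "\n".join(kept)
```

A simple prefix check like this would miss the continuation lines of a multi-line parenthesised import; parsing the code with the ast module would be more robust.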

def extract_next_engine(text: str, next_engines: List[str] = ['Navigation Controls', 'Python Engine', 'Navigation Engine', 'COMPLETE']) -> str

def extract_world_model_instruction(text)

def keep_assignments(code_snippet)

def return_assigned_variables(code_snippet)

Returns the variables assigned in a code snippet.

Classes

class VariableVisitor

Helper class to visit AST nodes and extract variables assigned in the code.

Ancestors

  • ast.NodeVisitor

Methods

def visit_Assign(self, node)
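
The visitor pattern described here can be illustrated with a short sketch. The names AssignedNameVisitor and assigned_variables_sketch are hypothetical, and returning an ordered list of names is an assumption; the actual VariableVisitor and return_assigned_variables may store or return their results differently.

```python
import ast
from typing import List


class AssignedNameVisitor(ast.NodeVisitor):
    """Hypothetical visitor that records names bound by plain assignments."""

    def __init__(self) -> None:
        self.names: List[str] = []

    def visit_Assign(self, node: ast.Assign) -> None:
        for target in node.targets:
            # Walk each target so tuple unpacking (a, b = ...) is covered.
            for child in ast.walk(target):
                # Only names in Store context are being assigned; in
                # "obj.attr = 1" the name obj is merely loaded.
                if isinstance(child, ast.Name) and isinstance(child.ctx, ast.Store):
                    if child.id not in self.names:
                        self.names.append(child.id)
        self.generic_visit(node)


def assigned_variables_sketch(code_snippet: str) -> List[str]:
    """Parse the snippet and return assigned names in order of appearance."""
    visitor = AssignedNameVisitor()
    visitor.visit(ast.parse(code_snippet))
    return visitor.names
```

Because only visit_Assign is defined, augmented assignments (x += 1) and annotated assignments (x: int = 1) are not recorded in this sketch; covering them would mean adding visit_AugAssign and visit_AnnAssign.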