An Engine is a system that sets up and runs specific processes to perform handler. It organizes different handler functions and uses a given prompt along with a language model to activate them. Essentially, the engine helps manage and trigger these processes to carry out actions based on the instructions it receives.

Parameters

AttributeParametersDescription
HandlerhandlerImplementation of BaseHandler, its method(s) will be executed based on the given prompts.
LLM ClientllmInterface for communicating with the large language model (LLM).
Prompt Templateprompt_templateDefines the structure and format of prompts sent to the LLM using PromptTemplate.
Tools (optional)toolsList of handler method names (as dictionaries or strings) available for use during interactions. Defaults to None. If nothing provide Engine will get it dynamically using dir(handler).
Output Parser (optional)output_parserAn optional parser to format and process the handler tools output. Defaults to None.
from superagentx.engine import Engine

walmart_engine = Engine(
    handler=walmart_ecom_handler,
    llm=llm_client,
    prompt_template=prompt_template,
    tools=None,
    output_parser=None,
)

Browser Engine

The BrowserEngine is a specialized engine designed to execute browser-based automation tasks using a set of predefined browser handlers. It coordinates a language model, a prompt template, and a browser context to perform interactive web actions such as navigation, input, clicking, scrolling, and data extraction. By interpreting instructions and invoking the appropriate browser tools, the BrowserEngine enables intelligent and automated web browsing workflows, including optional screenshot capturing and step-based execution control.

Parameters

AttributeParametersDescription
LLM ClientllmAn instance of LLMClient, which likely represents a client or interface to interact with a Large Language Model (LLM) such as GPT. This is used for generating text, answering queries, etc.
PromptTemplateprompt_templateAn instance of PromptTemplate, which defines the format or structure of the prompts that will be sent to the LLM. It helps maintain consistency and clarity in how requests are made to the model.
Browser Instance Path (optional)browser_instance_pathIf we want to reuse a browser session that was saved earlier, this tells the program where to find it. Defaults to None
Browser (optional)browserA web browser (like Chrome or Firefox) that the program can control automatically. We can use this if we need to interact with websites without doing it manually. Defaults to None
Browser Context (optional)browser_contextA separate “space” inside the browser that keeps things organized. It’s like using different tabs in the browser for different tasks or projects. Defaults to None
Tools (optional)toolsA list of special tools (like a calculator, search tool, or others) that the program can use during its tasks. Defaults to None
Max Stepsmax_stepsThe maximum number of steps the program can take before it stops. This helps prevent it from running forever or getting stuck in endless loops. Default Step is 100
Screen Shot (optional)take_screenshotA yes/no option that tells the program whether or not to take screenshots during its work. This is useful for keeping a visual record of what happens.Defaults to None
Screenshot Path (optional)screenshot_pathIf screenshots are taken, this tells the program where to save them on your computer. Defaults to None
BrowserEngine:

from superagentx.browser_engine import BrowserEngine

browser_engine = BrowserEngine(
    llm=llm_client,
    prompt_template=prompt_template,
)

Browser Configuration:

BrowserConfig controls the global launch behavior and core settings for the browser instance. It specifies parameters like headless mode, browser type, proxy settings, and remote connections. This configuration is applied when the browser starts and affects all created contexts and pages.

Browser config
from superagentx.computer_use.browser.browser import Browser, BrowserConfig

config = BrowserConfig(
    headless=False,
    disable_security=True,
    browser_type='firefox'
)

browser = Browser(config=config)

BrowserConfig Parameters:

AttributeParametersDescription
HeadlessheadlessWhen True, launches the browser without a visible UI. Useful for automation or CI environments, but some sites may detect headless mode. Default to False.
Disables Securitydisable_securityDisables standard browser security features (e.g., CORS, mixed content blocking). Use only when necessary and only with trusted content. Default to True.
extra_chromium_args (optional)extra_chromium_argsAdditional command-line flags to pass to Chromium during launch (e.g., —no-sandbox, —disable-gpu).Default is [].
chrome_instance_path (optional)chrome_instance_pathPath to a custom Chrome or Chromium executable. Useful when using a non-default installation. Default to None.
wss_url (optional)wss_urlWebSocket endpoint for connecting to a remote browser instance via the DevTools Protocol (e.g., for cloud or containerized browsers). Default to None.
cdp_url (optional)cdp_urlChrome DevTools Protocol (CDP) endpoint used for remote browser automation. Default to None.
Proxy (optional)proxyConfiguration for routing browser traffic through a proxy server. Follows Playwright-style proxy settings (e.g., server, username, password). Default to None.
New Content Config (optional)new_context_configDefault configuration for new browser contexts. Controls cookies, viewport, locale, user agent, and other context-level behaviors. Default is BrowserContextConfig().
_force_keep_browser_alive`_force_keep_browser_aliveInternal flag to keep the browser instance running across multiple tasks or scripts. Default to False.
Browse Typebrowser_typeSpecifies which browser engine to use: ‘chromium’, ‘firefox’, or ‘webkit’. In this example, ‘firefox’ is selected. Default is chromium.

Browser Context Configuration:

BrowserContextConfig defines the default environment and behavior for each new browser context. It includes settings like cookies, viewport size, user agent, locale, and download paths. This configuration ensures isolated, customizable sessions within a single browser instance.

from superagentx.computer_use.browser.context import BrowserContextConfig

config = BrowserContextConfig(
    cookies_file="path/to/cookies.json",
    wait_for_network_idle_page_load_time=3.0,
    window_width=1280,
    window_height=1100,
    locale='en-US',
    user_agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36',
    highlight_elements=True,
    viewport_expansion=500,
    allowed_domains=['google.com', 'wikipedia.org'],
)

browser = Browser()
context = BrowserContext(browser=browser, config=config)

browser_engine = BrowserEngine(
    llm=llm_client,
    prompt_template=prompt_template,
    browser_context=context

)

BrowserContextConfig Parameters:

AttributeParametersDescription
Cookies File (optional)cookies_filePath to a file containing cookies to be loaded into the browser session. Useful for persisting login states. Default to None.
Minimum Wait Page Load Timeminimum_wait_page_load_timeMinimum time (in seconds) to wait after page navigation begins before proceeding. Ensures smoother page loading. Default to 0.25
Wait For Network Idle Page Load Timewait_for_network_idle_page_load_timeAdditional wait time after the network becomes idle, simulating user behavior more closely. Default to 0.5
Maximum Wait Page Load Timemaximum_wait_page_load_timeMaximum duration to wait for a page to fully load. Acts as a timeout for slow-loading pages. Default to 5.
Wait Between Actionswait_between_actionsWait time between sequential browser actions (e.g., clicks, navigation). Helps avoid detection by anti-bot systems. Default to 0.5
Disable Security (optional)disable_securityDisables browser-level security features. Can solve cross-origin issues, but should not be used with untrusted websites. Default to True
Browser Window Size (optional)browser_window_sizeSets the browser window size. Useful for rendering consistency and testing responsive layouts. Default to {'width': 1280, 'height': 1100}
No Viewport (optional)no_viewportWhen set to True, disables Playwright’s default viewport setting and uses the system window size instead. Default to None.
Save Recording Path (optional)save_recording_pathPath to save screen recordings of the browser session, if enabled. Default to None.
Save Downloads Path (optional)save_downloads_pathDirectory path to store downloaded files from within the browser context. Default to None.
Trace Path (optional)trace_pathPath to store browser traces (used for debugging and performance analysis). Default to None.
Locale (optional)localeSets the locale of the browser context (e.g., “en-US”), which can affect how websites render language and date formats.Default to None.
User Agentuser_agentThe user agent string used for the browser session. Can be customized to mimic specific devices or browsers. Default to realistic Chrome UA string.
Highlight Elementshighlight_elementsHighlights elements being interacted with in the browser. Useful for debugging and visual confirmation during automation. Default to False
Viewport Expansionviewport_expansionExpands the viewport vertically (in pixels) to ensure that off-screen elements are brought into view. Default is 500.
Allowed Domains (optional)allowed_domainsList of domains that are permitted to load. Helps limit unwanted navigation or third-party content. Default to None.
Include Dynamic Attributesinclude_dynamic_attributesWhether to include dynamically generated attributes (like data-, aria-) during automation tasks. Default to True.
Force Keep Context Alive_force_keep_context_aliveInternal setting to keep the browser context alive beyond the usual lifecycle. Useful for chained or long-lived tasks. Default to False.