Data Extraction Actions
Actions for reading data from page elements.
getText
Get the text content of an element.
Value: string (selector)
{ "getText": "h1.title" }
Returns: string — the element's textContent
Options
| Option | Type | Default | Description |
|---|---|---|---|
| save | string | — | Cache result with this key for later use with ${key} |
| push | string | — | Push result into an array in the data cache |
| waitFor | boolean | true | Wait for element |
| timeout | number | 10000 | Max wait time |
| iframe | string \| string[] | — | Target inside iframe |
{
  "getText": ".product-price",
  "options": { "save": "price" }
}
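The save and push options and the ${key} placeholder can be pictured with a small in-memory model. This is only a sketch of the behavior described in the table, not the tool's actual code: the names `DataCache`, `saveValue`, `pushValue`, and `fillPlaceholders` are hypothetical, and it assumes that save overwrites a key, that push creates the array on first use, and that unknown placeholders are left untouched.

```typescript
// Hypothetical in-memory model of the data cache (an assumption, not the tool's code).
type DataCache = Record<string, unknown>;

// `save`: store the result under a key, overwriting any previous value.
function saveValue(cache: DataCache, key: string, value: unknown): void {
  cache[key] = value;
}

// `push`: append the result to the array under a key, creating the array if needed.
function pushValue(cache: DataCache, key: string, value: unknown): void {
  const existing = cache[key];
  if (Array.isArray(existing)) {
    existing.push(value);
  } else {
    cache[key] = [value];
  }
}

// Replace each ${key} placeholder with its cached value.
// Assumption: placeholders for unknown keys are left as-is.
function fillPlaceholders(template: string, cache: DataCache): string {
  return template.replace(/\$\{(\w+)\}/g, (match: string, key: string) =>
    Object.prototype.hasOwnProperty.call(cache, key) ? String(cache[key]) : match
  );
}

const dataCache: DataCache = {};
saveValue(dataCache, "price", "$19.99");
pushValue(dataCache, "prices", "$19.99");
pushValue(dataCache, "prices", "$29.99");
const rendered = fillPlaceholders("Current price: ${price}", dataCache);
```

Under these assumptions, save suits single values that later actions reference by key, while push suits values gathered by repeated actions.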
getHTML
Get the HTML content of an element.
Value: string (selector)
{ "getHTML": ".content" }
Returns: string — the element's innerHTML (or outerHTML if outer: true)
Options
| Option | Type | Default | Description |
|---|---|---|---|
| outer | boolean | false | Return outerHTML instead of innerHTML |
| save | string | — | Cache the result |
| push | string | — | Push into array |
{
  "getHTML": "#article",
  "options": { "outer": true, "save": "articleHTML" }
}
getAttribute
Get an attribute value from an element.
Value: [selector, attribute]
{ "getAttribute": ["a.main-link", "href"] }
Returns: string — the attribute value
{
  "getAttribute": ["img.avatar", "src"],
  "options": { "save": "avatarUrl" }
}
getValue
Get the current value of a form input, textarea, or select element.
Value: string (selector)
{ "getValue": "input[name=email]" }
Returns: string — the element's .value
{
  "getValue": "select#country",
  "options": { "save": "selectedCountry" }
}
extractAll
Extract structured data from every element that matches a container selector, producing one object per match. Useful for scraping lists, tables, and grids.
Value: string (container selector)
{
  "extractAll": ".product-card",
  "options": {
    "fieldMap": {
      "title": "h3",
      "price": ".price",
      "link": ["a", "href"],
      "image": { "selector": "img", "attribute": "src" }
    }
  }
}
Returns: array of objects
[
  { "title": "Product A", "price": "$19.99", "link": "/products/a", "image": "/img/a.jpg" },
  { "title": "Product B", "price": "$29.99", "link": "/products/b", "image": "/img/b.jpg" }
]
Field Map Syntax
The fieldMap maps field names to extraction rules:
| Format | Description | Example |
|---|---|---|
| "selector" | Get textContent of first match | "title": "h3" |
| ["selector", "attr"] | Get attribute of first match | "link": ["a", "href"] |
| { selector, attribute } | Explicit object form | "img": { "selector": "img", "attribute": "src" } |
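One way to read the three formats is as shorthand for a single shape: a selector plus an optional attribute name. The sketch below normalizes each form into that shape. The type and function names (`FieldRule`, `NormalizedRule`, `normalizeRule`) are hypothetical and only illustrate the table; the tool may handle the forms differently internally.

```typescript
// The three rule formats listed in the table.
type FieldRule =
  | string
  | [string, string]
  | { selector: string; attribute: string };

// Canonical shape: no attribute means "read textContent".
interface NormalizedRule {
  selector: string;
  attribute?: string;
}

function normalizeRule(rule: FieldRule): NormalizedRule {
  if (typeof rule === "string") {
    // "selector" form: text content of the first match.
    return { selector: rule };
  }
  if (Array.isArray(rule)) {
    // ["selector", "attr"] form: attribute of the first match.
    const [selector, attribute] = rule;
    return { selector, attribute };
  }
  // Explicit object form: copy the two fields as given.
  return { selector: rule.selector, attribute: rule.attribute };
}
```

With this reading, `["a", "href"]` and `{ "selector": "a", "attribute": "href" }` describe the same extraction.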
Options
| Option | Type | Default | Description |
|---|---|---|---|
| fieldMap | object | — | Map of field names to selectors |
| limit | number | 0 (unlimited) | Max number of items to extract |
| save | string | — | Cache the result array |
| push | string | — | Push result into array |
| waitFor | boolean | true | Wait for container elements |
| timeout | number | 10000 | Max wait time |
{
  "extractAll": "table tbody tr",
  "options": {
    "fieldMap": {
      "name": "td:nth-child(1)",
      "email": "td:nth-child(2)",
      "role": "td:nth-child(3)"
    },
    "limit": 10,
    "save": "users"
  }
}
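The interaction between fieldMap and limit can be sketched as a loop over the matched containers. This is an illustration under stated assumptions, not the tool's implementation: `ElementLike` is a hypothetical stand-in for a DOM element so the code runs outside a browser, rules are assumed to be already in `{ selector, attribute }` form, and a field whose selector matches nothing is assumed to yield null.

```typescript
// Minimal structural stand-in for a DOM element (real elements also fit this shape).
interface ElementLike {
  textContent: string | null;
  getAttribute(name: string): string | null;
  querySelector(selector: string): ElementLike | null;
}

interface ExtractRule {
  selector: string;
  attribute?: string;
}

type ExtractedRow = Record<string, string | null>;

function extractRows(
  containers: ElementLike[],
  fieldMap: Record<string, ExtractRule>,
  limit: number = 0
): ExtractedRow[] {
  // A limit of 0 means "no limit", matching the default in the options table.
  const selected = limit > 0 ? containers.slice(0, limit) : containers;
  return selected.map((container) => {
    const row: ExtractedRow = {};
    for (const field of Object.keys(fieldMap)) {
      const rule = fieldMap[field];
      const match = container.querySelector(rule.selector);
      if (match === null) {
        row[field] = null; // assumption: a missing element yields null
      } else if (rule.attribute !== undefined) {
        row[field] = match.getAttribute(rule.attribute);
      } else {
        row[field] = match.textContent;
      }
    }
    return row;
  });
}

// Usage with hand-built stubs standing in for three table rows (sample data only).
const cell = (text: string): ElementLike => ({
  textContent: text,
  getAttribute: () => null,
  querySelector: () => null,
});
const tableRow = (name: string): ElementLike => ({
  textContent: name,
  getAttribute: () => null,
  querySelector: (selector: string) => (selector === "td:nth-child(1)" ? cell(name) : null),
});
const firstTwo = extractRows(
  [tableRow("Ada"), tableRow("Grace"), tableRow("Linus")],
  { name: { selector: "td:nth-child(1)" } },
  2
);
```

Each field selector is resolved relative to its container, which is why the same fieldMap can be reused for every row.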
getSavedData
Retrieve previously cached data values.
Value: string[] (array of key names)
{ "getSavedData": ["price", "title", "users"] }
Returns: object — key-value map of cached data
{
  "price": "$19.99",
  "title": "Product A",
  "users": [...]
}
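Conceptually, getSavedData picks the requested keys out of the cache and returns them as one object. A minimal sketch follows, with hypothetical names (`SavedData`, `pickSaved`) and the assumption that keys which were never saved are simply omitted from the result; the source does not say how missing keys are reported.

```typescript
type SavedData = Record<string, unknown>;

// Return only the requested keys from the store.
// Assumption: keys that were never saved are left out of the result.
function pickSaved(store: SavedData, keys: string[]): SavedData {
  const picked: SavedData = {};
  for (const key of keys) {
    if (Object.prototype.hasOwnProperty.call(store, key)) {
      picked[key] = store[key];
    }
  }
  return picked;
}

const savedStore: SavedData = { price: "$19.99", title: "Product A" };
const subset = pickSaved(savedStore, ["price", "missing"]);
```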
clearSavedData
Clear cached data.
Value: true (clear everything) or string[] (specific keys to clear)
{ "clearSavedData": true }
{ "clearSavedData": ["price", "title"] }
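The two value forms can be modeled as one function that either empties the cache or deletes selected keys. This is a sketch only: `clearSaved` is a hypothetical name, and it assumes that clearing a key which does not exist is a harmless no-op.

```typescript
// `true` removes every key; an array removes only the listed keys.
// Assumption: listing a key that is not present does nothing.
function clearSaved(store: Record<string, unknown>, target: true | string[]): void {
  const keys = target === true ? Object.keys(store) : target;
  for (const key of keys) {
    delete store[key];
  }
}

const sampleStore: Record<string, unknown> = {
  price: "$19.99",
  title: "Product A",
  users: [],
};
clearSaved(sampleStore, ["price", "title"]); // only "users" remains
```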
Example: Scrape a Product Listing
{
  "actions": [
    { "openNewTab": "https://example.com/products" },
    { "waitForElement": ".product-card" },
    {
      "extractAll": ".product-card",
      "options": {
        "fieldMap": {
          "name": "h3.product-name",
          "price": ".price",
          "rating": ".stars",
          "url": ["a", "href"],
          "image": ["img", "src"]
        },
        "limit": 20,
        "save": "products"
      }
    },
    { "getSavedData": ["products"] }
  ]
}
Example: Collect Data Across Pages
{
  "actions": [
    { "openNewTab": "https://example.com/page/1" },
    { "getText": "h1", "options": { "save": "pageTitle" } },
    { "getText": ".result-count", "options": { "save": "totalResults" } },
    { "getAttribute": ["a.next-page", "href"], "options": { "save": "nextPageUrl" } },
    { "getSavedData": ["pageTitle", "totalResults", "nextPageUrl"] }
  ]
}
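To gather the same field from several pages, an action list like the one above could be generated programmatically, with push used in place of save so results accumulate. The sketch below is one possible approach, not a documented feature: `buildPagedActions` and the `pageTitles` key are hypothetical, the `/page/N` URL pattern is taken from the example, and it assumes that opening one tab per page is acceptable and that push appends across actions as its description suggests.

```typescript
type ActionStep = Record<string, unknown>;

// Build an action list that visits pages 1..pageCount and pushes each page's
// h1 text into one array, then returns that array with getSavedData.
function buildPagedActions(baseUrl: string, pageCount: number): { actions: ActionStep[] } {
  const actions: ActionStep[] = [];
  for (let page = 1; page <= pageCount; page++) {
    // Assumption: each page is opened in its own tab, as in the example above.
    actions.push({ openNewTab: `${baseUrl}/page/${page}` });
    actions.push({ getText: "h1", options: { push: "pageTitles" } });
  }
  actions.push({ getSavedData: ["pageTitles"] });
  return { actions };
}

const plan = buildPagedActions("https://example.com", 2);
```

Serializing `plan` with `JSON.stringify` gives a document in the same shape as the hand-written examples.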