{ "id": "DswhuYzoemjA6iNN", "meta": { "instanceId": "a1ae5c8dc6c65e674f9c3947d083abcc749ef2546dff9f4ff01de4d6a36ebfe6", "templateCredsSetupCompleted": true }, "name": "Scrape Books from URL with Dumpling AI, Clean HTML, Save to Sheets, Email as CSV", "tags": [ { "id": "TlcNkmb96fUfZ2eA", "name": "Tutorials", "createdAt": "2025-04-15T17:02:00.249Z", "updatedAt": "2025-04-15T17:02:00.249Z" } ], "nodes": [ { "id": "2e4f64a5-353c-4dd3-9822-62df795d4940", "name": "Convert to CSV File", "type": "n8n-nodes-base.convertToFile", "position": [ 1640, 340 ], "parameters": { "options": {} }, "typeVersion": 1.1 }, { "id": "472442d3-a691-4310-93f8-019579d0c473", "name": "Extract all books from the page", "type": "n8n-nodes-base.html", "position": [ 760, 340 ], "parameters": { "options": {}, "operation": "extractHtmlContent", "dataPropertyName": "content", "extractionValues": { "values": [ { "key": "books", "cssSelector": ".row > li", "returnArray": true, "returnValue": "html" } ] } }, "typeVersion": 1.2 }, { "id": "92765257-d64d-47c9-bd57-50914342138b", "name": "Sort by price", "type": "n8n-nodes-base.sort", "position": [ 1420, 340 ], "parameters": { "options": {}, "sortFieldsUi": { "sortField": [ { "order": "descending", "fieldName": "price" } ] } }, "typeVersion": 1 }, { "id": "efc2f33f-1bef-4906-b3b7-b02868080a54", "name": "Extract individual book price", "type": "n8n-nodes-base.html", "position": [ 1200, 340 ], "parameters": { "options": {}, "operation": "extractHtmlContent", "dataPropertyName": "books", "extractionValues": { "values": [ { "key": "title", "attribute": "title", "cssSelector": "h3 > a", "returnValue": "attribute" }, { "key": "price", "cssSelector": ".price_color" } ] } }, "typeVersion": 1.2 }, { "id": "74c7c3af-d63c-4b6c-95a0-15f45b19134b", "name": "Send CSV via e-mail", "type": "n8n-nodes-base.gmail", "position": [ 1860, 340 ], "webhookId": "40f2d609-52ed-40bf-b190-1f1cebbe3fb7", "parameters": { "sendTo": "", "message": "Hey, here's the scraped data from the online bookstore!", "options": { "attachmentsUi": { "attachmentsBinary": [ {} ] } }, "subject": "bookstore csv", "emailType": "text" }, "credentials": { "gmailOAuth2": { "id": "j70r3RTMED1pgN3R", "name": "Gmail account 2" } }, "typeVersion": 2.1 }, { "id": "95c7998b-ece0-4dea-b99e-97ac22fb8a59", "name": "Sticky Note3", "type": "n8n-nodes-base.stickyNote", "position": [ 140, -260 ], "parameters": { "width": 619, "height": 297, "content": "### Scrape Books from URL with Dumpling AI, Clean HTML, Save to Sheets, Email as CSV\n\nš This workflow scrapes book data from a website, turns it into a CSV, saves it, and sends it by email.\n\nš§ It starts from a Google Sheets trigger, fetches the page using DumplingAI, extracts books, sorts by price, and emails the CSV.\n\nā Make sure APIs for Gmail, Sheets & Drive are enabled in Google Cloud. Update the URL in the \"Fetch website content\" node.\n" }, "typeVersion": 1 }, { "id": "f599028a-49a9-4b85-b484-5abf1229e373", "name": "Sticky Note", "type": "n8n-nodes-base.stickyNote", "position": [ 140, 60 ], "parameters": { "color": 4, "width": 900, "height": 300, "content": "### š Trigger to Raw Book HTML\n\n1. **Google Sheets Trigger** \n Watches a sheet for new row entries. Once a new URL is added, the workflow starts.\n\n2. **Fetch Website Content (Dumpling AI)** \n Makes an HTTP POST request to Dumpling AI to scrape and return the full HTML of the target URL.\n\n3. **Extract All Books** \n Uses CSS selectors to isolate the list items (`li.row > li`) containing book entries.\n\n4. **Split Out Node** \n Breaks the array of book HTML blocks into individual items, so each book can be processed separately in the next steps.\n" }, "typeVersion": 1 }, { "id": "bc6ab72c-de03-4e79-9da0-ca12ddf31811", "name": "Sticky Note1", "type": "n8n-nodes-base.stickyNote", "position": [ 1140, 60 ], "parameters": { "color": 6, "width": 840, "height": 300, "content": "### š¦ Parse, Sort, Export & Email\n\n5. **Extract Individual Book Data** \n From each book, extract the title (`