autoscrape-labs
diff --git a/README.md b/README.md
Lines changed: 33 additions & 0 deletions
diff --git a/README_zh.md b/README_zh.md
Lines changed: 34 additions & 0 deletions
diff --git a/docs/features.md b/docs/features.md
Lines changed: 55 additions & 0 deletions
diff --git a/docs/zh/features.md b/docs/zh/features.md
Lines changed: 55 additions & 0 deletions
diff --git a/pydoll/browser/chromium/base.py b/pydoll/browser/chromium/base.py
Lines changed: 14 additions & 4 deletions
diff --git a/pydoll/browser/options.py b/pydoll/browser/options.py
Lines changed: 1 addition & 4 deletions
@@ -97,6 +97,39 @@ await tab.request.get('https://api.example.com/data', headers=headers)
 
 This opens up incredible possibilities for automation scenarios where you need both browser interaction AND API efficiency!
 
+### New expect_download() context manager — robust file downloads made easy!
+Tired of fighting with flaky download flows, missing files, or racy event listeners? Meet `tab.expect_download()`, a delightful, reliable way to handle file downloads.
+
+- Automatically sets the browser’s download behavior
+- Works with your own directory or a temporary folder (auto-cleaned!)
+- Waits for completion with a timeout (so your tests don’t hang)
+- Gives you a handy handle to read bytes/base64 or check `file_path`
+
+Tiny example that just works:
+
+```python
+import asyncio
+from pathlib import Path
+from pydoll.browser import Chrome
+
+async def download_report():
+    async with Chrome() as browser:
+        tab = await browser.start()
+        await tab.go_to('https://example.com/reports')
+
+        target_dir = Path('/tmp/my-downloads')
+        async with tab.expect_download(keep_file_at=target_dir, timeout=10) as download:
+            # Trigger the download in the page (button/link/etc.)
+            await (await tab.find(text='Download latest report')).click()
+            # Wait until finished and read the content
+            data = await download.read_bytes()
+            print(f"Downloaded {len(data)} bytes to: {download.file_path}")
+
+asyncio.run(download_report())
+```
+
+Want zero-hassle cleanup? Omit `keep_file_at` and we’ll create a temp folder and remove it automatically after the context exits. Perfect for tests.
+
 ### Total browser control with custom preferences! (thanks to [@LucasAlvws](https://github.com/LucasAlvws))
 Want to completely customize how Chrome behaves? **Now you can control EVERYTHING!**<br>
 The new `browser_preferences` system gives you access to hundreds of internal Chrome settings that were previously impossible to change programmatically. We're talking about deep browser customization that goes way beyond command-line flags!
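
Note on the README hunk above: it mentions reading a download as base64. The minimal sketch below shows how a caller could turn such a string back into raw bytes and write it to disk. It assumes the string is standard base64 of the file content, which the diff does not confirm. `save_base64_payload` and the sample payload are hypothetical names used only for illustration.

```python
import base64
import tempfile
from pathlib import Path


def save_base64_payload(payload_b64: str, destination: Path) -> int:
    """Decode a base64 string and write the raw bytes to destination.

    Hypothetical helper: it stands in for whatever a caller might do with
    the string returned by a read_base64-style accessor.
    """
    # validate=True rejects characters outside the base64 alphabet.
    raw = base64.b64decode(payload_b64, validate=True)
    destination.parent.mkdir(parents=True, exist_ok=True)
    destination.write_bytes(raw)
    return len(raw)


if __name__ == '__main__':
    # Sample payload invented for the demo; not taken from the PR.
    sample = base64.b64encode(b'%PDF-1.7 sample').decode('ascii')
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / 'report.pdf'
        written = save_base64_payload(sample, target)
        print(f'wrote {written} bytes to {target.name}')
```

Returning the byte count gives the caller a cheap sanity check against an expected file size.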
 
@@ -195,6 +195,40 @@ options = ChromiumOptions()
 options.start_timeout = 20  # 等待 20 秒
 ```
 
+### 新的 expect_download() 上下文管理器 —— 稳健、优雅的文件下载！
+还在为不稳定的下载流程、丢失的文件或混乱的事件监听而头疼吗？`tab.expect_download()` 来了：一种可靠、简洁的下载方式。
+
+- 自动配置浏览器下载行为
+- 支持自定义下载目录或临时目录（自动清理！）
+- 内置超时等待，防止任务卡住
+- 提供便捷句柄：读取字节/BASE64，获取 `file_path`
+
+一个“开箱即用”的小示例：
+
+```python
+import asyncio
+from pathlib import Path
+from pydoll.browser import Chrome
+
+async def download_report():
+    async with Chrome() as browser:
+        tab = await browser.start()
+        await tab.go_to('https://example.com/reports')
+
+        target_dir = Path('/tmp/my-downloads')
+        async with tab.expect_download(keep_file_at=target_dir, timeout=10) as dl:
+            # 触发页面上的下载（按钮/链接等）
+            await (await tab.find(text='Download latest report')).click()
+
+            # 等待完成并读取内容
+            data = await dl.read_bytes()
+            print(f"已下载 {len(data)} 字节，保存至: {dl.file_path}")
+
+asyncio.run(download_report())
+```
+
+想要“零成本清理”？不传 `keep_file_at` 即可——我们会创建临时目录，并在上下文退出后自动清理。对测试场景非常友好。
+
 ## 📦 安装
 
 ```bash
 
@@ -209,6 +209,61 @@ asyncio.run(background_bypass_example())
 
 Access websites that actively block automation tools without using third-party captcha solving services. This native captcha handling makes Pydoll suitable for automating previously inaccessible websites.
 
+## Reliable Download Handling with expect_download
+
+The `tab.expect_download()` context manager provides a robust, event-driven way to capture file downloads.
+
+- Configures browser download behavior for you
+- Supports persistent target directory (`keep_file_at`) or temporary directory with auto-cleanup
+- Exposes a `_DownloadHandle` with convenience methods
+- Includes timeout protection to avoid indefinite waits
+
+### API Overview
+
+```python
+async with tab.expect_download(
+    keep_file_at: Optional[str | Path] = None,
+    timeout: Optional[float] = None,
+) as handle:
+    ... # trigger download action in page
+```
+
+- `keep_file_at`: Target directory to keep the downloaded file. If `None`, a temporary directory is created and removed automatically when the context exits.
+- `timeout`: Maximum seconds to wait for completion (defaults to 60 if not provided).
+
+`handle` exposes:
+
+- `handle.file_path: Optional[str]` — final resolved path after completion
+- `await handle.read_bytes() -> bytes`
+- `await handle.read_base64() -> str`
+- `await handle.wait_started(timeout: Optional[float] = None) -> None`
+- `await handle.wait_finished(timeout: Optional[float] = None) -> None`
+
+### Usage Examples
+
+Persist file in a specific directory:
+
+```python
+async with tab.expect_download(keep_file_at='/tmp/dl', timeout=15) as dl:
+    await (await tab.find(text='Export CSV')).click()
+    data = await dl.read_bytes()
+    print('Saved at:', dl.file_path)
+```
+
+Use a temporary directory (auto-cleanup) for tests:
+
+```python
+async with tab.expect_download() as dl:
+    await (await tab.find(text='Download PDF')).click()
+    pdf_b64 = await dl.read_base64()
+    # temp directory is cleaned automatically when leaving the context
+```
+
+Notes:
+
+- When the page emits no completion event within the configured `timeout`, a `DownloadTimeout` exception is raised.
+- If the browser does not provide a `filePath`, the manager falls back to the suggested filename in the chosen directory.
+
 ## Multi-Tab Management
 
 Pydoll provides sophisticated tab management capabilities with a singleton pattern that ensures efficient resource usage and prevents duplicate Tab instances for the same browser tab.
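
Note on the docs/features.md hunk above: the two behaviours it describes, temporary-directory cleanup and timeout protection, can be sketched without a browser. The code below is an illustration under stated assumptions, not pydoll's implementation. `SketchDownloadHandle`, `simulate_completion`, and `expect_download_sketch` are hypothetical names, and the `DownloadTimeout` class here is a local stand-in for the exception named in the notes.

```python
import asyncio
import shutil
import tempfile
from contextlib import asynccontextmanager
from pathlib import Path
from typing import AsyncIterator, Optional, Union


class DownloadTimeout(Exception):
    """Local stand-in for the exception named in the docs."""


class SketchDownloadHandle:
    """Hypothetical handle; not pydoll's `_DownloadHandle`."""

    def __init__(self, directory: Path, timeout: float) -> None:
        self.directory = directory
        self.timeout = timeout
        self.file_path: Optional[str] = None
        self._finished = asyncio.Event()

    def simulate_completion(self, filename: str, content: bytes) -> None:
        # Pretend the browser reported that the download completed.
        path = self.directory / filename
        path.write_bytes(content)
        self.file_path = str(path)
        self._finished.set()

    async def wait_finished(self, timeout: Optional[float] = None) -> None:
        limit = self.timeout if timeout is None else timeout
        try:
            await asyncio.wait_for(self._finished.wait(), limit)
        except asyncio.TimeoutError:
            raise DownloadTimeout(f'download not finished after {limit}s') from None

    async def read_bytes(self) -> bytes:
        # Reading implies waiting, so callers cannot see a partial file.
        await self.wait_finished()
        assert self.file_path is not None
        return Path(self.file_path).read_bytes()


@asynccontextmanager
async def expect_download_sketch(
    keep_file_at: Optional[Union[str, Path]] = None,
    timeout: float = 60.0,
) -> AsyncIterator[SketchDownloadHandle]:
    use_temp = keep_file_at is None
    directory = Path(tempfile.mkdtemp()) if use_temp else Path(keep_file_at)
    directory.mkdir(parents=True, exist_ok=True)
    try:
        yield SketchDownloadHandle(directory, timeout)
    finally:
        if use_temp:
            # Temporary directories are removed on exit; kept ones are not.
            shutil.rmtree(directory, ignore_errors=True)
```

Because cleanup sits in the `finally` branch, the temporary directory is removed even when the body raises, which is why any data needed later has to be read before the context exits.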
 
@@ -212,6 +212,61 @@ asyncio.run(background_bypass_example())
 
 无需使用第三方验证码服务，即可访问屏蔽自动化工具的网站。
 
+## 可靠的下载处理：expect_download
+
+`tab.expect_download()` 提供稳健的、基于事件的文件下载捕获方式。
+
+- 自动为您配置浏览器下载行为
+- 支持持久化目录（`keep_file_at`），或使用临时目录并在退出上下文后自动清理
+- 提供 `_DownloadHandle` 便捷接口
+- 内置超时保护，避免无限等待
+
+### API 概览
+
+```python
+async with tab.expect_download(
+    keep_file_at: Optional[str | Path] = None,
+    timeout: Optional[float] = None,
+) as handle:
+    ... # 在页面中触发下载
+```
+
+- `keep_file_at`：指定持久化目录。若为 `None`，则使用临时目录并在退出上下文后自动清理。
+- `timeout`：完成等待的最大秒数（未提供时默认 60）。
+
+`handle` 提供：
+
+- `handle.file_path: Optional[str]` — 完成后解析出的最终文件路径
+- `await handle.read_bytes() -> bytes`
+- `await handle.read_base64() -> str`
+- `await handle.wait_started(timeout: Optional[float] = None) -> None`
+- `await handle.wait_finished(timeout: Optional[float] = None) -> None`
+
+### 使用示例
+
+在指定目录中持久化下载文件：
+
+```python
+async with tab.expect_download(keep_file_at='/tmp/dl', timeout=15) as dl:
+    await (await tab.find(text='Export CSV')).click()
+    data = await dl.read_bytes()
+    print('Saved at:', dl.file_path)
+```
+
+用于测试的临时目录（自动清理）：
+
+```python
+async with tab.expect_download() as dl:
+    await (await tab.find(text='Download PDF')).click()
+    pdf_b64 = await dl.read_base64()
+    # 退出上下文后临时目录会被自动清理
+```
+
+注意：
+
+- 如果在配置的 `timeout` 内页面未发出完成事件，将抛出 `DownloadTimeout` 异常。
+- 如果浏览器未提供 `filePath`，管理器将回退到使用建议文件名并写入选定目录。
+
 ## 多标签页管理
 
 Pydoll 采用单例模式提供完善的标签页管理功能，确保资源高效利用，并防止同一浏览器标签页出现重复的标签页实例。
 
@@ -8,7 +8,7 @@
 from functools import partial
 from random import randint
 from tempfile import TemporaryDirectory
-from typing import Any, Callable, Optional
+from typing import Any, Awaitable, Callable, Optional, overload
 
 from pydoll.browser.interfaces import BrowserOptionsManager
 from pydoll.browser.managers import (
@@ -118,9 +118,9 @@ async def start(self, headless: bool = False) -> Tab:
         if headless:
             warnings.warn(
                 "The 'headless' parameter is deprecated and will be removed in a future version. "
-                "Use `options.headless = True` instead.",
+                'Use `options.headless = True` instead.',
                 DeprecationWarning,
-                stacklevel=2
+                stacklevel=2,
             )
             self.options.headless = headless
 
@@ -378,9 +378,15 @@ async def reset_permissions(self, browser_context_id: Optional[str] = None):
         """Reset all permissions to defaults and restore prompting behavior."""
         return await self._execute_command(BrowserCommands.reset_permissions(browser_context_id))
 
+    @overload
     async def on(
         self, event_name: str, callback: Callable[[Any], Any], temporary: bool = False
-    ) -> int:
+    ) -> int: ...
+    @overload
+    async def on(
+        self, event_name: str, callback: Callable[[Any], Awaitable[Any]], temporary: bool = False
+    ) -> int: ...
+    async def on(self, event_name, callback, temporary: bool = False) -> int:
         """
         Register CDP event listener at browser level.
 
@@ -409,6 +415,10 @@ async def callback_wrapper(event):
             event_name, function_to_register, temporary
         )
 
+    async def remove_callback(self, callback_id: int):
+        """Remove callback from browser."""
+        return await self._connection_handler.remove_callback(callback_id)
+
     async def enable_fetch_events(
         self,
         handle_auth_requests: bool = False,
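
Note on the base.py hunks above: `on` now advertises both plain and awaitable-returning callbacks, and `remove_callback` takes the id that `on` returns. The sketch below shows one way such dual dispatch and id-based removal could work. It is a simplified illustration, not pydoll's connection handler. `CallbackRegistry` and `emit` are hypothetical names, and a single `Union`-typed parameter is used here in place of the overloads.

```python
import asyncio
import inspect
from typing import Any, Awaitable, Callable, Dict, Tuple, Union

# Either a plain callable or one that returns an awaitable.
Callback = Union[Callable[[Any], Any], Callable[[Any], Awaitable[Any]]]


class CallbackRegistry:
    """Hypothetical in-memory registry used only for illustration."""

    def __init__(self) -> None:
        self._callbacks: Dict[int, Tuple[str, Callback, bool]] = {}
        self._next_id = 0

    def on(self, event_name: str, callback: Callback, temporary: bool = False) -> int:
        # Return an id so the caller can remove this callback later.
        self._next_id += 1
        self._callbacks[self._next_id] = (event_name, callback, temporary)
        return self._next_id

    def remove_callback(self, callback_id: int) -> bool:
        # True if something was removed, False if the id was unknown.
        return self._callbacks.pop(callback_id, None) is not None

    async def emit(self, event_name: str, event: Any) -> None:
        # Iterate over a snapshot so callbacks can be removed while emitting.
        for callback_id, (name, callback, temporary) in list(self._callbacks.items()):
            if name != event_name:
                continue
            result = callback(event)
            if inspect.isawaitable(result):
                await result  # async callbacks are awaited, sync ones are not
            if temporary:
                self._callbacks.pop(callback_id, None)
```

Checking the return value with `inspect.isawaitable` keeps one code path for both callback kinds, at the cost of running callbacks one after another.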
 
@@ -312,10 +312,7 @@ def headless(self) -> bool:
     def headless(self, headless: bool):
         self._headless = headless
         has_argument = '--headless' in self.arguments
-        methods_map = {
-            True: self.add_argument,
-            False: self.remove_argument
-        }
+        methods_map = {True: self.add_argument, False: self.remove_argument}
         if headless == has_argument:
             return
         methods_map[headless]('--headless')
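
Note on the options.py hunk above: the setter keeps the `--headless` argument in sync with the flag and returns early when nothing needs to change. The sketch below reproduces that toggle in isolation. `OptionsSketch` is a hypothetical stand-in whose `add_argument` and `remove_argument` are simplified to plain list operations, and may differ from the real `ChromiumOptions` methods.

```python
from typing import List


class OptionsSketch:
    """Hypothetical stand-in for the options class in the diff."""

    def __init__(self) -> None:
        self.arguments: List[str] = []
        self._headless = False

    def add_argument(self, argument: str) -> None:
        self.arguments.append(argument)

    def remove_argument(self, argument: str) -> None:
        self.arguments.remove(argument)

    @property
    def headless(self) -> bool:
        return self._headless

    @headless.setter
    def headless(self, headless: bool) -> None:
        self._headless = headless
        has_argument = '--headless' in self.arguments
        methods_map = {True: self.add_argument, False: self.remove_argument}
        if headless == has_argument:
            return  # already in the requested state, nothing to do
        methods_map[headless]('--headless')


if __name__ == '__main__':
    options = OptionsSketch()
    options.headless = True
    options.headless = True  # second assignment is a no-op
    print(options.arguments)
    options.headless = False
    print(options.arguments)
```

The early return is what makes repeated assignments safe: without it, setting `True` twice would add the argument twice, and setting `False` on a fresh object would try to remove an argument that is not present.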