Add example documentation for the epub and html workflows

2025-10-15 08:41:44 +08:00
parent e89c1174ed
commit d0fb9e4405
3 changed files with 277 additions and 78 deletions
--- a/README.md
+++ b/README.md
@@ -20,18 +20,26 @@
  A lightweight local file translation tool based on Large Language Models
 </p>
- ✅ **Multiple Format Support**: Translates various files including `pdf`, `docx`, `xlsx`, `md`, `txt`, `json`, `epub`, `srt`, `ass`, and more.
+- ✅ **Multiple Format Support**: Translates various files including `pdf`, `docx`, `xlsx`, `md`, `txt`, `json`, `epub`,
  `srt`, `ass`, and more.
 - ✅ **Automatic Glossary Generation**: Supports automatic generation of glossaries for term alignment.
- ✅ **PDF Table, Formula, and Code Recognition**: Recognizes and translates tables, formulas, and code often found in academic papers, powered by `docling` and `mineru` PDF parsing engines.
+- ✅ **PDF Table, Formula, and Code Recognition**: Recognizes and translates tables, formulas, and code often found in
- ✅ **JSON Translation**: Supports specifying values to be translated in JSON using JSON paths (following `jsonpath-ng` syntax).
+  academic papers, powered by `docling` and `mineru` PDF parsing engines.
- ✅ **Word/Excel Format Preservation**: Translates `docx` and `xlsx` files while preserving their original formatting (does not yet support `doc` or `xls` files).
+- ✅ **JSON Translation**: Supports specifying values to be translated in JSON using JSON paths (following `jsonpath-ng`
- ✅ **Multi-AI Platform Support**: Compatible with most AI platforms, enabling high-performance, concurrent AI translation with custom prompts.
+  syntax).
- ✅ **Asynchronous Support**: Designed for high-performance scenarios with full asynchronous support, offering service interfaces for parallel tasks.
+- ✅ **Word/Excel Format Preservation**: Translates `docx` and `xlsx` files while preserving their original formatting (
  does not yet support `doc` or `xls` files).
 - ✅ **Multi-AI Platform Support**: Compatible with most AI platforms, enabling high-performance, concurrent AI
  translation with custom prompts.
 - ✅ **Asynchronous Support**: Designed for high-performance scenarios with full asynchronous support, offering service
  interfaces for parallel tasks.
 - ✅ **LAN and Multi-user Support**: Can be used by multiple people simultaneously on a local area network.
 - ✅ **Interactive Web Interface**: Provides an out-of-the-box Web UI and RESTful API for easy integration and use.
- ✅ **Small, Multi-platform Standalone Packages**: Windows and Mac standalone packages under 40MB (for versions not using the `docling` local PDF parser).
+- ✅ **Small, Multi-platform Standalone Packages**: Windows and Mac standalone packages under 40MB (for versions not
  using the `docling` local PDF parser).
-> When translating `pdf` files, they are first converted to Markdown, which will **cause the original layout to be lost**. Users with strict layout requirements should take note.
+> When translating `pdf` files, they are first converted to Markdown, which will **cause the original layout to be lost
 **. Users with strict layout requirements should take note.
 > QQ Discussion Group: 1047781902
@@ -46,10 +54,14 @@
 ## All-in-One Packages
-For users who want to get started quickly, we provide all-in-one packages on [GitHub Releases](https://github.com/xunbu/docutranslate/releases). Simply download, unzip, and enter your AI platform API Key to begin.
+For users who want to get started quickly, we provide all-in-one packages
 on [GitHub Releases](https://github.com/xunbu/docutranslate/releases). Simply download, unzip, and enter your AI
 platform API Key to begin.
- **DocuTranslate**: Standard version, uses the online `minerU` engine to parse PDF documents. Choose this version if you don't need local PDF parsing (recommended).
+- **DocuTranslate**: Standard version, uses the online `minerU` engine to parse PDF documents. Choose this version if
- **DocuTranslate_full**: Full version, includes the built-in `docling` local PDF parsing engine. Choose this version if you need local PDF parsing.
+  you don't need local PDF parsing (recommended).
 - **DocuTranslate_full**: Full version, includes the built-in `docling` local PDF parsing engine. Choose this version if
  you need local PDF parsing.
 ## Installation
@@ -90,31 +102,35 @@ uv sync
 ## Core Concept: Workflow
-The core of the new DocuTranslate is the **Workflow**. Each workflow is a complete, end-to-end translation pipeline designed for a specific file type. Instead of interacting with a single large class, you select and configure a workflow based on your file type.
+The core of the new DocuTranslate is the **Workflow**. Each workflow is a complete, end-to-end translation pipeline
 designed for a specific file type. Instead of interacting with a single large class, you select and configure a workflow
 based on your file type.
 **The basic usage flow is as follows:**
-1.  **Select a Workflow**: Choose a workflow based on your input file type (e.g., PDF/Word or TXT), such as `MarkdownBasedWorkflow` or `TXTWorkflow`.
+1. **Select a Workflow**: Choose a workflow based on your input file type (e.g., PDF/Word or TXT), such as
-2.  **Build Configuration**: Create the corresponding configuration object for the selected workflow (e.g., `MarkdownBasedWorkflowConfig`). This object contains all necessary sub-configurations, such as:
+   `MarkdownBasedWorkflow` or `TXTWorkflow`.
-    *   **Converter Config**: Defines how to convert the original file (like a PDF) to Markdown.
+2. **Build Configuration**: Create the corresponding configuration object for the selected workflow (e.g.,
-    *   **Translator Config**: Defines which LLM, API-Key, target language, etc., to use.
+   `MarkdownBasedWorkflowConfig`). This object contains all necessary sub-configurations, such as:
-    *   **Exporter Config**: Defines specific options for the output format (like HTML).
+    * **Converter Config**: Defines how to convert the original file (like a PDF) to Markdown.
-3.  **Instantiate the Workflow**: Create an instance of the workflow using the configuration object.
+    * **Translator Config**: Defines which LLM, API-Key, target language, etc., to use.
-4.  **Execute Translation**: Call the workflow's `.read_*()` and `.translate()` / `.translate_async()` methods.
+    * **Exporter Config**: Defines specific options for the output format (like HTML).
-5.  **Export/Save Results**: Call the `.export_to_*()` or `.save_as_*()` methods to get or save the translation results.
+3. **Instantiate the Workflow**: Create an instance of the workflow using the configuration object.
 4. **Execute Translation**: Call the workflow's `.read_*()` and `.translate()` / `.translate_async()` methods.
 5. **Export/Save Results**: Call the `.export_to_*()` or `.save_as_*()` methods to get or save the translation results.
 ## Available Workflows
-| Workflow                    | Use Case                                                        | Input Formats                                | Output Formats             | Core Config Class             |
+| Workflow                    | Use Case                                                                                               | Input Formats                                | Output Formats         | Core Config Class             |
-|:----------------------------|:----------------------------------------------------------------|:---------------------------------------------|:---------------------------|:------------------------------|
+|:----------------------------|:-------------------------------------------------------------------------------------------------------|:---------------------------------------------|:-----------------------|:------------------------------|
-| **`MarkdownBasedWorkflow`** | Processes rich text documents like PDF, Word, images. Flow: `File -> Markdown -> Translate -> Export`. | `.pdf`, `.docx`, `.md`, `.png`, `.jpg`, etc. | `.md`, `.zip`, `.html`     | `MarkdownBasedWorkflowConfig` |
+| **`MarkdownBasedWorkflow`** | Processes rich text documents like PDF, Word, images. Flow: `File -> Markdown -> Translate -> Export`. | `.pdf`, `.docx`, `.md`, `.png`, `.jpg`, etc. | `.md`, `.zip`, `.html` | `MarkdownBasedWorkflowConfig` |
-| **`TXTWorkflow`**           | Processes plain text documents. Flow: `txt -> Translate -> Export`. | `.txt` and other plain text formats          | `.txt`, `.html`            | `TXTWorkflowConfig`           |
+| **`TXTWorkflow`**           | Processes plain text documents. Flow: `txt -> Translate -> Export`.                                    | `.txt` and other plain text formats          | `.txt`, `.html`        | `TXTWorkflowConfig`           |
-| **`JsonWorkflow`**          | Processes JSON files. Flow: `json -> Translate -> Export`.          | `.json`                                      | `.json`, `.html`           | `JsonWorkflowConfig`          |
+| **`JsonWorkflow`**          | Processes JSON files. Flow: `json -> Translate -> Export`.                                             | `.json`                                      | `.json`, `.html`       | `JsonWorkflowConfig`          |
-| **`DocxWorkflow`**          | Processes docx files. Flow: `docx -> Translate -> Export`.          | `.docx`                                      | `.docx`, `.html`           | `DocxWorkflowConfig`          |
+| **`DocxWorkflow`**          | Processes docx files. Flow: `docx -> Translate -> Export`.                                             | `.docx`                                      | `.docx`, `.html`       | `DocxWorkflowConfig`          |
-| **`XlsxWorkflow`**          | Processes xlsx files. Flow: `xlsx -> Translate -> Export`.          | `.xlsx`, `.csv`                              | `.xlsx`, `.html`           | `XlsxWorkflowConfig`          |
+| **`XlsxWorkflow`**          | Processes xlsx files. Flow: `xlsx -> Translate -> Export`.                                             | `.xlsx`, `.csv`                              | `.xlsx`, `.html`       | `XlsxWorkflowConfig`          |
-| **`SrtWorkflow`**           | Processes srt files. Flow: `srt -> Translate -> Export`.            | `.srt`                                       | `.srt`, `.html`            | `SrtWorkflowConfig`           |
+| **`SrtWorkflow`**           | Processes srt files. Flow: `srt -> Translate -> Export`.                                               | `.srt`                                       | `.srt`, `.html`        | `SrtWorkflowConfig`           |
-| **`EpubWorkflow`**          | Processes epub files. Flow: `epub -> Translate -> Export`.          | `.epub`                                      | `.epub`, `.html`           | `EpubWorkflowConfig`          |
+| **`EpubWorkflow`**          | Processes epub files. Flow: `epub -> Translate -> Export`.                                             | `.epub`                                      | `.epub`, `.html`       | `EpubWorkflowConfig`          |
-| **`HtmlWorkflow`**          | Processes html files. Flow: `html -> Translate -> Export`.          | `.html`, `.htm`                              | `.html`                    | `HtmlWorkflowConfig`          |
+| **`HtmlWorkflow`**          | Processes html files. Flow: `html -> Translate -> Export`.                                             | `.html`, `.htm`                              | `.html`                | `HtmlWorkflowConfig`          |
 > You can export to PDF format in the interactive interface.
@@ -136,14 +152,16 @@ export DOCUTRANSLATE_PORT=8011
 docutranslate -i
 ```
-   **Interactive Interface**: After starting the service, visit `http://127.0.0.1:8010` (or your specified port) in your browser.
+- **Interactive Interface**: After starting the service, visit `http://127.0.0.1:8010` (or your specified port) in your
-   **API Documentation**: The complete API documentation (Swagger UI) is available at `http://127.0.0.1:8010/docs`.
+  browser.
 - **API Documentation**: The complete API documentation (Swagger UI) is available at `http://127.0.0.1:8010/docs`.
 ## Usage
 ### Example 1: Translate a PDF file (using `MarkdownBasedWorkflow`)
-This is the most common use case. We will use the `minerU` engine to convert the PDF to Markdown and then use an LLM for translation. This example uses the asynchronous method.
+This is the most common use case. We will use the `minerU` engine to convert the PDF to Markdown and then use an LLM for
 translation. This example uses the asynchronous method.
 ```python
 import asyncio
@@ -210,7 +228,8 @@ if __name__ == "__main__":
 ### Example 2: Translate a TXT file (using `TXTWorkflow`)
-For plain text files, the process is simpler as it doesn't require a document parsing (conversion) step. This example uses the asynchronous method.
+For plain text files, the process is simpler as it doesn't require a document parsing (conversion) step. This example
 uses the asynchronous method.
 ```python
 import asyncio
@@ -257,7 +276,8 @@ if __name__ == "__main__":
 ### Example 3: Translate a JSON file (using `JsonWorkflow`)
-This example uses the asynchronous method. The `json_paths` item in `JsonTranslatorConfig` needs to specify the JSON paths to be translated (conforming to the `jsonpath-ng` syntax). Only values matching these paths will be translated.
+This example uses the asynchronous method. The `json_paths` item in `JsonTranslatorConfig` needs to specify the JSON
 paths to be translated (conforming to the `jsonpath-ng` syntax). Only values matching these paths will be translated.
 ```python
 import asyncio
@@ -404,32 +424,87 @@ if __name__ == "__main__":
    asyncio.run(main())
 ```
 ### Example 5: Configuration Items for Other Workflows (Using `HtmlWorkflow`, `EpubWorkflow`)
 Here is an example using asynchronous mode.
 ```python
 # HtmlWorkflow
 from docutranslate.translator.ai_translator.html_translator import HtmlTranslatorConfig
 from docutranslate.workflow.html_workflow import HtmlWorkflowConfig, HtmlWorkflow
 async def html():
    # 1. Create translator configuration
    translator_config = HtmlTranslatorConfig(
        base_url="https://api.openai.com/v1/",
        api_key="YOUR_OPENAI_API_KEY",
        model_id="gpt-4o",
        to_lang="Chinese",
        insert_mode="replace",  # Options: "replace", "append", "prepend"
        separator="\n",  # Separator used for "append" and "prepend" modes
    )
    # 2. Create main workflow configuration
    workflow_config = HtmlWorkflowConfig(
        translator_config=translator_config,
    )
    workflow_html = HtmlWorkflow(config=workflow_config)
 # EpubWorkflow
 from docutranslate.exporter.epub.epub2html_exporter import Epub2HTMLExporterConfig
 from docutranslate.translator.ai_translator.epub_translator import EpubTranslatorConfig
 from docutranslate.workflow.epub_workflow import EpubWorkflowConfig, EpubWorkflow
 async def epub():
    # 1. Create translator configuration
    translator_config = EpubTranslatorConfig(
        base_url="https://api.openai.com/v1/",
        api_key="YOUR_OPENAI_API_KEY",
        model_id="gpt-4o",
        to_lang="Chinese",
        insert_mode="replace",  # Options: "replace", "append", "prepend"
        separator="\n",  # Separator used for "append" and "prepend" modes
    )
    # 2. Create main workflow configuration
    workflow_config = EpubWorkflowConfig(
        translator_config=translator_config,
        html_exporter_config=Epub2HTMLExporterConfig(cdn=True),
    )
    workflow_epub = EpubWorkflow(config=workflow_config)
 ```
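
The `insert_mode` and `separator` options shown above decide where the translated text goes relative to the original. The sketch below is a standalone illustration of one plausible reading of the three modes. `combine` is a hypothetical helper, not part of DocuTranslate, and the ordering used for `"append"` and `"prepend"` is an assumption rather than confirmed library behavior.

```python
def combine(original: str, translated: str,
            insert_mode: str = "replace", separator: str = "\n") -> str:
    """Hypothetical helper: place translated text relative to the original."""
    if insert_mode == "replace":
        # Assumed: the translation takes the place of the original text.
        return translated
    if insert_mode == "append":
        # Assumed: the translation follows the original, joined by the separator.
        return original + separator + translated
    if insert_mode == "prepend":
        # Assumed: the translation precedes the original, joined by the separator.
        return translated + separator + original
    raise ValueError(f"unknown insert_mode: {insert_mode!r}")
```

Under this reading, `"append"` and `"prepend"` would produce bilingual output, and `separator` only matters for those two modes.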
 ## Prerequisites and Configuration Details
 ### 1. Get a Large Model API Key
-The translation feature relies on large language models. You need to obtain a `base_url`, `api_key`, and `model_id` from the respective AI platform.
+The translation feature relies on large language models. You need to obtain a `base_url`, `api_key`, and `model_id` from
 the respective AI platform.
-> Recommended models: Volcengine's `doubao-seed-1-6-flash` and `doubao-seed-1-6` series, Zhipu's `glm-4-flash`, Alibaba Cloud's `qwen-plus` and `qwen-flash`, Deepseek's `deepseek-chat`, etc.
+> Recommended models: Volcengine's `doubao-seed-1-6-flash` and `doubao-seed-1-6` series, Zhipu's `glm-4-flash`, Alibaba
 > Cloud's `qwen-plus` and `qwen-flash`, Deepseek's `deepseek-chat`, etc.
 > [302.AI](https://share.302.ai/BgRLAe)👈 Register through this link to enjoy a $1 free credit
-| Platform Name       | Get API Key                                                                              | Base URL                                                 |
+| Platform Name                 | Get API Key                                                                                   | Base URL                                                   |
-|:--------------------|:-----------------------------------------------------------------------------------------|:---------------------------------------------------------|
+|:------------------------------|:----------------------------------------------------------------------------------------------|:-----------------------------------------------------------|
-| ollama              |                                                                                          | `http://127.0.0.1:11434/v1`                              |
+| ollama                        |                                                                                               | `http://127.0.0.1:11434/v1`                                |
-| lm studio           |                                                                                          | `http://127.0.0.1:1234/v1`                               |
+| lm studio                     |                                                                                               | `http://127.0.0.1:1234/v1`                                 |
-| 302.AI     | [Click to get](https://share.302.ai/BgRLAe)                                                   | `https://api.302.ai/v1`                                      |
+| 302.AI                        | [Click to get](https://share.302.ai/BgRLAe)                                                   | `https://api.302.ai/v1`                                    |
-| openrouter          | [Click to get](https://openrouter.ai/settings/keys)                                      | `https://openrouter.ai/api/v1`                           |
+| openrouter                    | [Click to get](https://openrouter.ai/settings/keys)                                           | `https://openrouter.ai/api/v1`                             |
-| openai              | [Click to get](https://platform.openai.com/api-keys)                                     | `https://api.openai.com/v1/`                             |
+| openai                        | [Click to get](https://platform.openai.com/api-keys)                                          | `https://api.openai.com/v1/`                               |
-| gemini              | [Click to get](https://aistudio.google.com/u/0/apikey)                                   | `https://generativelanguage.googleapis.com/v1beta/openai/` |
+| gemini                        | [Click to get](https://aistudio.google.com/u/0/apikey)                                        | `https://generativelanguage.googleapis.com/v1beta/openai/` |
-| deepseek            | [Click to get](https://platform.deepseek.com/api_keys)                                   | `https://api.deepseek.com/v1`                            |
+| deepseek                      | [Click to get](https://platform.deepseek.com/api_keys)                                        | `https://api.deepseek.com/v1`                              |
-| Zhipu AI (智谱ai)     | [Click to get](https://open.bigmodel.cn/usercenter/apikeys)                                | `https://open.bigmodel.cn/api/paas/v4`                   |
+| Zhipu AI (智谱ai)               | [Click to get](https://open.bigmodel.cn/usercenter/apikeys)                                   | `https://open.bigmodel.cn/api/paas/v4`                     |
-| Tencent Hunyuan (腾讯混元) | [Click to get](https://console.cloud.tencent.com/hunyuan/api-key)                          | `https://api.hunyuan.cloud.tencent.com/v1`               |
+| Tencent Hunyuan (腾讯混元)        | [Click to get](https://console.cloud.tencent.com/hunyuan/api-key)                             | `https://api.hunyuan.cloud.tencent.com/v1`                 |
-| Alibaba Cloud Bailian (阿里云百炼) | [Click to get](https://bailian.console.aliyun.com/?tab=model#/api-key)                     | `https://dashscope.aliyuncs.com/compatible-mode/v1`      |
+| Alibaba Cloud Bailian (阿里云百炼) | [Click to get](https://bailian.console.aliyun.com/?tab=model#/api-key)                        | `https://dashscope.aliyuncs.com/compatible-mode/v1`        |
-| Volcengine (火山引擎) | [Click to get](https://console.volcengine.com/ark/region:ark+cn-beijing/apiKey?apikey=%7B%7D) | `https://ark.cn-beijing.volces.com/api/v3`               |
+| Volcengine (火山引擎)             | [Click to get](https://console.volcengine.com/ark/region:ark+cn-beijing/apiKey?apikey=%7B%7D) | `https://ark.cn-beijing.volces.com/api/v3`                 |
-| SiliconFlow (硅基流动) | [Click to get](https://cloud.siliconflow.cn/account/ak)                                    | `https://api.siliconflow.cn/v1`                          |
+| SiliconFlow (硅基流动)            | [Click to get](https://cloud.siliconflow.cn/account/ak)                                       | `https://api.siliconflow.cn/v1`                            |
-| DMXAPI              | [Click to get](https://www.dmxapi.cn/token)                                                | `https://www.dmxapi.cn/v1`                               |
+| DMXAPI                        | [Click to get](https://www.dmxapi.cn/token)                                                   | `https://www.dmxapi.cn/v1`                                 |
-| Juguang AI (聚光AI)   | [Click to get](https://ai.juguang.chat/console/token)                                      | `https://ai.juguang.chat/v1`                             |
+| Juguang AI (聚光AI)             | [Click to get](https://ai.juguang.chat/console/token)                                         | `https://ai.juguang.chat/v1`                               |
 ### 2. PDF Parsing Engine (ignore if not translating PDFs)
@@ -437,37 +512,45 @@ The translation feature relies on large language models. You need to obtain a `b
 If you choose `mineru` as your document parsing engine (`convert_engine="mineru"`), you need to apply for a free token.
-1.  Visit the [minerU official website](https://mineru.net/apiManage/docs) to register and apply for an API.
+1. Visit the [minerU official website](https://mineru.net/apiManage/docs) to register and apply for an API.
-2.  Create a new API Token in the [API Token Management interface](https://mineru.net/apiManage/token).
+2. Create a new API Token in the [API Token Management interface](https://mineru.net/apiManage/token).
 > **Note**: minerU Tokens are valid for 14 days. Please create a new one after expiration.
 #### 2.2. docling Engine Configuration (Local PDF parsing)
-If you choose `docling` as your document parsing engine (`convert_engine="docling"`), it will download the required models from Hugging Face upon first use.
+If you choose `docling` as your document parsing engine (`convert_engine="docling"`), it will download the required
 models from Hugging Face upon first use.
-> A better option is to download `docling_artifact.zip` from [GitHub Releases](https://github.com/xunbu/docutranslate/releases) and extract it to your working directory.
+> A better option is to download `docling_artifact.zip`
 > from [GitHub Releases](https://github.com/xunbu/docutranslate/releases) and extract it to your working directory.
 **Solutions for network issues when downloading `docling` models:**
-1.  **Set a Hugging Face mirror (Recommended)**:
+1. **Set a Hugging Face mirror (Recommended)**:
-    *   **Method A (Environment Variable)**: Set the system environment variable `HF_ENDPOINT` and restart your IDE or terminal.
+    * **Method A (Environment Variable)**: Set the system environment variable `HF_ENDPOINT` and restart your IDE or
-        ```
+      terminal.
-        HF_ENDPOINT=https://hf-mirror.com
+      ```
-        ```
+      HF_ENDPOINT=https://hf-mirror.com
-*   **Method B (Set in code)**: Add the following code at the beginning of your Python script.
+      ```
 * **Method B (Set in code)**: Add the following code at the beginning of your Python script.
 ```python
 import os
-    
+
 os.environ['HF_ENDPOINT'] = 'https://hf-mirror.com'
 ```
-2.  **Offline Usage (Download the model package in advance)**:
+
-    *   Download `docling_artifact.zip` from [GitHub Releases](https://github.com/xunbu/docutranslate/releases).
+2. **Offline Usage (Download the model package in advance)**:
-    *   Extract it into your project directory.
+    * Download `docling_artifact.zip` from [GitHub Releases](https://github.com/xunbu/docutranslate/releases).
-*   Specify the model path in your configuration (if the model is not in the same directory as the script):
+    * Extract it into your project directory.
 * Specify the model path in your configuration (if the model is not in the same directory as the script):
 ```python
 from docutranslate.converter.x2md.converter_docling import ConverterDoclingConfig
-    
+
 converter_config = ConverterDoclingConfig(
    artifact="./docling_artifact",  # Path to the extracted folder
    code_ocr=True,
@@ -478,7 +561,8 @@ converter_config = ConverterDoclingConfig(
 ## FAQ
 **Q: Why is the translated text still in the original language?**  
-A: Check the logs for errors. It's usually due to an overdue payment on the AI platform or network issues (check if you need to enable the system proxy).
+A: Check the logs for errors. It's usually due to an overdue payment on the AI platform or network issues (check if you
 need to enable the system proxy).
 **Q: Port 8010 is already in use. What should I do?**  
 A: Use the `-p` parameter to specify a new port, or set the `DOCUTRANSLATE_PORT` environment variable.
@@ -487,18 +571,25 @@ A: Use the `-p` parameter to specify a new port, or set the `DOCUTRANSLATE_PORT`
 A: Yes. Please use the `mineru` parsing engine, which has powerful OCR capabilities.
 **Q: Why is the first PDF translation very slow?**  
-A: If you are using the `docling` engine, it needs to download models from Hugging Face on its first run. Please refer to the "Network Issues Solutions" section above to speed up this process.
+A: If you are using the `docling` engine, it needs to download models from Hugging Face on its first run. Please refer
 to the "Network Issues Solutions" section above to speed up this process.
 **Q: Can I use it in an intranet (offline) environment?**  
 A: Yes. You need to meet the following conditions:
-1.  **Local LLM**: Deploy a language model locally using tools like [Ollama](https://ollama.com/) or [LM Studio](https://lmstudio.ai/), and fill in the local model's `base_url` in `TranslatorConfig`.
+
-2.  **Local PDF Parsing Engine** (only for parsing PDFs): Use the `docling` engine and download the model package in advance as described in the "Offline Usage" section above.
+1. **Local LLM**: Deploy a language model locally using tools like [Ollama](https://ollama.com/)
   or [LM Studio](https://lmstudio.ai/), and fill in the local model's `base_url` in `TranslatorConfig`.
 2. **Local PDF Parsing Engine** (only for parsing PDFs): Use the `docling` engine and download the model package in
   advance as described in the "Offline Usage" section above.
 **Q: How does the PDF parsing cache mechanism work?**  
-A: `MarkdownBasedWorkflow` automatically caches the results of document parsing (file-to-Markdown conversion) to avoid repetitive, time-consuming parsing. The cache is stored in memory by default and records the last 10 parses. You can change the cache size using the `DOCUTRANSLATE_CACHE_NUM` environment variable.
+A: `MarkdownBasedWorkflow` automatically caches the results of document parsing (file-to-Markdown conversion) to avoid
 repetitive, time-consuming parsing. The cache is stored in memory by default and records the last 10 parses. You can
 change the cache size using the `DOCUTRANSLATE_CACHE_NUM` environment variable.
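
A bounded in-memory cache of this kind can be sketched as follows. This is not DocuTranslate's actual implementation: `RecentParseCache` is a hypothetical class, and both the eviction policy (least recently used first) and the way `DOCUTRANSLATE_CACHE_NUM` is parsed are assumptions made for illustration.

```python
import os
from collections import OrderedDict


class RecentParseCache:
    """Hypothetical in-memory cache that keeps only the most recent N parses."""

    def __init__(self, capacity=None):
        if capacity is None:
            # Assumption: the variable holds a plain integer; fall back to 10.
            capacity = int(os.environ.get("DOCUTRANSLATE_CACHE_NUM", "10"))
        self.capacity = capacity
        self._entries = OrderedDict()

    def get(self, key):
        """Return the cached Markdown for key, or None on a miss."""
        if key not in self._entries:
            return None
        # Mark the entry as recently used so it is evicted last.
        self._entries.move_to_end(key)
        return self._entries[key]

    def put(self, key, markdown):
        """Store a parse result, evicting the oldest entries when over capacity."""
        self._entries[key] = markdown
        self._entries.move_to_end(key)
        while len(self._entries) > self.capacity:
            self._entries.popitem(last=False)
```

In this sketch the key would be something that identifies the input file and parser settings, such as a content hash, so that re-translating the same document skips the parsing step.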
 **Q: How can I make the software use a proxy?**  
-A: By default, the software does not use the system proxy. You can enable it by setting `system_proxy_enable=True` in `TranslatorConfig`.
+A: By default, the software does not use the system proxy. You can enable it by setting `system_proxy_enable=True` in
 `TranslatorConfig`.
 ## Star History
--- a/README_JP.md
+++ b/README_JP.md
@@ -411,13 +411,67 @@ if __name__ == "__main__":
    asyncio.run(main())
 ```
 ### 例 5: その他のワークフローの設定項目 (`HtmlWorkflow`、`EpubWorkflow` の使用)
 以下は非同期モードの使用例です。
 ```python
 # HtmlWorkflow
 from docutranslate.translator.ai_translator.html_translator import HtmlTranslatorConfig
 from docutranslate.workflow.html_workflow import HtmlWorkflowConfig, HtmlWorkflow
 async def html():
    # 1. 翻訳機の設定を作成
    translator_config = HtmlTranslatorConfig(
        base_url="https://api.openai.com/v1/",
        api_key="YOUR_OPENAI_API_KEY",
        model_id="gpt-4o",
        to_lang="中国語",
        insert_mode="replace",  # 選択肢: "replace", "append", "prepend"
        separator="\n",  # "append", "prepend" モードで使用される区切り文字
    )
    # 2. メインワークフローの設定を作成
    workflow_config = HtmlWorkflowConfig(
        translator_config=translator_config,
    )
    workflow_html = HtmlWorkflow(config=workflow_config)
 # EpubWorkflow
 from docutranslate.exporter.epub.epub2html_exporter import Epub2HTMLExporterConfig
 from docutranslate.translator.ai_translator.epub_translator import EpubTranslatorConfig
 from docutranslate.workflow.epub_workflow import EpubWorkflowConfig, EpubWorkflow
 async def epub():
    # 1. 翻訳機の設定を作成
    translator_config = EpubTranslatorConfig(
        base_url="https://api.openai.com/v1/",
        api_key="YOUR_OPENAI_API_KEY",
        model_id="gpt-4o",
        to_lang="中国語",
        insert_mode="replace",  # 選択肢: "replace", "append", "prepend"
        separator="\n",  # "append", "prepend" モードで使用される区切り文字
    )
    # 2. メインワークフローの設定を作成
    workflow_config = EpubWorkflowConfig(
        translator_config=translator_config,
        html_exporter_config=Epub2HTMLExporterConfig(cdn=True),
    )
    workflow_epub = EpubWorkflow(config=workflow_config)
 ```
 ## 前提条件と設定詳細
 ### 1. 大規模モデルAPIキーの取得
 翻訳機能は大規模言語モデルに依存しているため、対応するAIプラットフォームから`base_url`、`api_key`、`model_id`を取得する必要があります。
-> 推奨モデル：火山引擎の`doubao-seed-1-6-flash`、`doubao-seed-1-6`シリーズ、智譜の`glm-4-flash`、阿里雲の`qwen-plus`、 `qwen-flash`、deepseekの`deepseek-chat`など。
+> 推奨モデル：火山引擎の`doubao-seed-1-6-flash`、`doubao-seed-1-6`シリーズ、智譜の`glm-4-flash`、阿里雲の`qwen-plus`、
 `qwen-flash`、deepseekの`deepseek-chat`など。
 > [302.AI](https://share.302.ai/BgRLAe)👈 このリンクから登録で1ドル分の無料クレジットを提供
@@ -425,7 +479,7 @@ if __name__ == "__main__":
 |:-----------|:---------------------------------------------------------------------------------------------|:-----------------------------------------------------------|
 | ollama     |                                                                                              | `http://127.0.0.1:11434/v1`                                |
 | lm studio  |                                                                                              | `http://127.0.0.1:1234/v1`                                 |
-| 302.AI     | [ここをクリックして取得](https://share.302.ai/BgRLAe)                                                   | `https://api.302.ai/v1`                                      |
+| 302.AI     | [ここをクリックして取得](https://share.302.ai/BgRLAe)                                                   | `https://api.302.ai/v1`                                    |
 | openrouter | [ここをクリックして取得](https://openrouter.ai/settings/keys)                                           | `https://openrouter.ai/api/v1`                             |
 | openai     | [ここをクリックして取得](https://platform.openai.com/api-keys)                                          | `https://api.openai.com/v1/`                               |
 | gemini     | [ここをクリックして取得](https://aistudio.google.com/u/0/apikey)                                        | `https://generativelanguage.googleapis.com/v1beta/openai/` |
--- a/README_ZH.md
+++ b/README_ZH.md
@@ -406,13 +406,67 @@ if __name__ == "__main__":
    asyncio.run(main())
 ```
 ### 示例 5: 其它workflow的配置项(使用 `HtmlWorkflow`、`EpubWorkflow`)
 这里以异步方式为例。
 ```python
 # HtmlWorkflow
 from docutranslate.translator.ai_translator.html_translator import HtmlTranslatorConfig
 from docutranslate.workflow.html_workflow import HtmlWorkflowConfig, HtmlWorkflow
 async def html():
    # 1. 构建翻译器配置
    translator_config = HtmlTranslatorConfig(
        base_url="https://api.openai.com/v1/",
        api_key="YOUR_OPENAI_API_KEY",
        model_id="gpt-4o",
        to_lang="中文",
        insert_mode="replace",  # 备选项 "replace", "append", "prepend"
        separator="\n",  # "append", "prepend"模式时使用的分隔符
    )
    # 2. 构建主工作流配置
    workflow_config = HtmlWorkflowConfig(
        translator_config=translator_config,
    )
    workflow_html = HtmlWorkflow(config=workflow_config)
 # EpubWorkflow
 from docutranslate.exporter.epub.epub2html_exporter import Epub2HTMLExporterConfig
 from docutranslate.translator.ai_translator.epub_translator import EpubTranslatorConfig
 from docutranslate.workflow.epub_workflow import EpubWorkflowConfig, EpubWorkflow
 async def epub():
    # 1. 构建翻译器配置
    translator_config = EpubTranslatorConfig(
        base_url="https://api.openai.com/v1/",
        api_key="YOUR_OPENAI_API_KEY",
        model_id="gpt-4o",
        to_lang="中文",
        insert_mode="replace",  # 备选项 "replace", "append", "prepend"
        separator="\n",  # "append", "prepend"模式时使用的分隔符
    )
    # 2. 构建主工作流配置
    workflow_config = EpubWorkflowConfig(
        translator_config=translator_config,
        html_exporter_config=Epub2HTMLExporterConfig(cdn=True),
    )
    workflow_epub = EpubWorkflow(config=workflow_config)
 ```
 ## 前置条件与配置详解
 ### 1. 获取大模型 API Key
 翻译功能依赖于大型语言模型，您需要从相应的 AI 平台获取 `base_url`, `api_key` 和 `model_id`。
-> 推荐模型：火山引擎的`doubao-seed-1-6-flash`、`doubao-seed-1-6`系列、智谱的`glm-4-flash`，阿里云的 `qwen-plus`、`qwen-flash`，deepseek的`deepseek-chat`等。
+> 推荐模型：火山引擎的`doubao-seed-1-6-flash`、`doubao-seed-1-6`系列、智谱的`glm-4-flash`，阿里云的 `qwen-plus`、`qwen-flash`
 > ，deepseek的`deepseek-chat`等。
 > [302.AI](https://share.302.ai/BgRLAe)👈从该链接注册可享1美元免费额度