Use this skill for the patent status batch query program. The project is a local Streamlit app that reads Excel workbooks, queries CNIPA patent status pages through Playwright or an Edge CDP session, matches results, and exports an annotated Excel workbook.
Keep private data out of generated code, docs, and commits. Never include real case names, applicant names, account credentials, cookies, local storage tokens, browser profiles, logs, screenshots, or generated Excel outputs in the skill or repository.
Core entry points:
app.py: Streamlit UI and main workflow.
requirements.txt: Python dependencies.
.env.example: safe runtime configuration template.
一键打开网站.bat: root launcher for the website.
开国知局网页.bat and scripts/start_edge_debug.bat: launch Edge with CDP debugging for manual login.
scripts/start_streamlit_8502.bat: start Streamlit on http://127.0.0.1:8502.
Package modules:
patent_status_tool/config.py: runtime paths and environment variables.
patent_status_tool/browser_client.py: browser/CDP orchestration.
patent_status_tool/query_service.py: batch query flow.
patent_status_tool/excel_io.py: workbook parsing and export.
patent_status_tool/matcher.py: hit selection logic.
patent_status_tool/status.py: status normalization.
patent_status_tool/selectors.py: CNIPA DOM selectors.
patent_status_tool/models.py: data structures.
patent_status_tool/result_cache.py: JSONL checkpoint cache.
patent_status_tool/logger.py: logging setup.
Runtime-only paths:
.env
.session/
logs/
output/
edge-cdp-profile/
.venv/
__pycache__/
Treat runtime-only paths as sensitive or generated. Keep them ignored and do not upload them.
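As a rough illustration of keeping these paths ignored, the sketch below checks the text of a .gitignore file for the runtime-only entries listed above. The helper name missing_ignore_entries is hypothetical and not part of the project. It compares lines literally and does not implement full gitignore pattern semantics (wildcards, negation), so treat it as a quick sanity check only.

```python
# Runtime-only paths taken from the list above.
RUNTIME_ONLY = [
    ".env", ".session/", "logs/", "output/",
    "edge-cdp-profile/", ".venv/", "__pycache__/",
]


def missing_ignore_entries(gitignore_text, required=RUNTIME_ONLY):
    """Return the required entries that have no literal matching line.

    Hypothetical helper: only exact lines are compared, after trimming
    whitespace and leading/trailing slashes.
    """
    present = set()
    for line in gitignore_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Treat "logs", "logs/" and "/logs/" as the same entry.
        present.add(line.strip("/"))
    return [entry for entry in required if entry.strip("/") not in present]
```

Calling it with the contents of the repository's .gitignore returns an empty list when every runtime-only path has its own line.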
app.py, the relevant patent_status_tool/*.py modules, and the launch scripts.
.gitignore before adding files.
rg for searches.
.xlsx from a local path or upload fallback.
output/.
patent_status_tool/selectors.py when the CNIPA DOM changes.
.\.venv\Scripts\python.exe -m py_compile app.py scripts\inspect_workbook_sheets.py patent_status_tool\config.py patent_status_tool\excel_io.py
.\一键打开网站.bat
.\开国知局网页.bat
.gitignore excludes .env, .session/, logs/, output/, edge-cdp-profile/, .venv/, caches, and workbook files.
Supported required column aliases:
专利名称, 名称, 案件名称
撰写时间, 撰写日, 交底书时间, 起草时间
Supported optional column aliases:
申请号, 申请号/专利号, 专利号, 申请编号
申请人, 专利申请人, 申请(专利权)人, 申请人/专利权人
Sheet/status behavior:
复审中, 答复, 已完成, 已消化, or 审查.
非正常, 驳回, or 授权; query other named rows.
申请日 >= 撰写时间; unresolved ambiguity goes to manual review.
The exported workbook should preserve the original sheet data and append result columns:
申请号
申请人
当前状态
规范化状态
申请日
匹配说明
是否需人工核对
查询时间
原始命中数量
Also create a 人工复核 sheet for ambiguous, failed, or manually reviewable rows.
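The export behavior described above can be sketched in plain Python, without any Excel library, as follows. The function name annotate, the list-of-lists row representation, and the use of a boolean under 是否需人工核对 are assumptions made for illustration; the real logic lives in patent_status_tool/excel_io.py and may differ.

```python
# Result columns appended to each exported row, in the order listed above.
RESULT_COLUMNS = [
    "申请号", "申请人", "当前状态", "规范化状态", "申请日",
    "匹配说明", "是否需人工核对", "查询时间", "原始命中数量",
]


def annotate(header, rows, results):
    """Append result columns to every row and collect rows for the 人工复核 sheet.

    `results` holds one dict per input row, keyed by result column name.
    A truthy value under 是否需人工核对 marks the row for manual review
    (an assumed representation, not confirmed by the project).
    """
    out_header = list(header) + RESULT_COLUMNS
    out_rows, review_rows = [], []
    for row, result in zip(rows, results):
        # Copy the original cells so the input data is preserved untouched.
        new_row = list(row) + [result.get(col, "") for col in RESULT_COLUMNS]
        out_rows.append(new_row)
        if result.get("是否需人工核对"):
            review_rows.append(new_row)
    return out_header, out_rows, review_rows
```

Appending to copies rather than editing rows in place keeps the original sheet data intact, which matches the requirement that the export preserve it.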
.env; use .env.example for defaults.
sample_input.xlsx, 示例专利名称, and 示例申请人.
Prefer updating root launch scripts and documenting the new flow in README.md. Keep launch scripts relative to %~dp0 or %CD%; do not use absolute user paths.
Edit patent_status_tool/selectors.py. Keep selector names aligned with query concepts such as query_application_no_input, query_input, query_button, result_rows, result_status, and detail fields.
Edit patent_status_tool/excel_io.py and preserve the return contract used by app.py. If load_inputs() changes shape, update all callers including scripts in scripts/.
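When touching workbook parsing, it can help to picture how the required and optional column aliases given earlier in this document might map onto a header row. The sketch below is one possible approach, not the project's actual load_inputs() code: the canonical field names (name, drafted_at, application_no, applicant), the function resolve_columns, and the rule that earlier aliases win are all assumptions.

```python
# Alias lists copied from this document; the canonical keys are hypothetical.
REQUIRED_ALIASES = {
    "name": ["专利名称", "名称", "案件名称"],
    "drafted_at": ["撰写时间", "撰写日", "交底书时间", "起草时间"],
}
OPTIONAL_ALIASES = {
    "application_no": ["申请号", "申请号/专利号", "专利号", "申请编号"],
    "applicant": ["申请人", "专利申请人", "申请(专利权)人", "申请人/专利权人"],
}


def resolve_columns(header):
    """Map canonical field names to column indexes in a header row.

    Raises ValueError when a required field has no matching alias.
    Optional fields are simply absent from the result when not found.
    """
    cleaned = [str(cell).strip() if cell is not None else "" for cell in header]
    mapping = {}
    for field, aliases in {**REQUIRED_ALIASES, **OPTIONAL_ALIASES}.items():
        for alias in aliases:  # assumed priority: first listed alias wins
            if alias in cleaned:
                mapping[field] = cleaned.index(alias)
                break
    missing = [field for field in REQUIRED_ALIASES if field not in mapping]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    return mapping
```

Returning a plain dict of indexes keeps the shape easy to compare before and after a change, which is useful when checking that callers in app.py and scripts/ still agree with it.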
Edit patent_status_tool/matcher.py and, when needed, patent_status_tool/status.py. Preserve application-number priority and manual-review behavior unless the user explicitly asks for a rule change.
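The priorities mentioned above (application number first, then the 申请日 >= 撰写时间 rule, with unresolved ambiguity sent to manual review) could be sketched roughly as follows. The Hit dataclass, the select_hit function, the note strings, and the fallback to the date filter when the application number does not match are all illustrative assumptions, not the contents of patent_status_tool/matcher.py.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Hit:
    """Hypothetical stand-in for one query result row."""
    application_no: str
    filing_date: Optional[date]  # 申请日


def select_hit(hits, drafted_at, application_no=None):
    """Return (hit, needs_manual_review, note) for one input row.

    `drafted_at` is the 撰写时间 value as a date and is assumed to be present.
    """
    if not hits:
        return None, True, "no hits"
    # 1. An application number on the input row takes priority.
    if application_no:
        exact = [h for h in hits if h.application_no == application_no]
        if len(exact) == 1:
            return exact[0], False, "matched by application number"
    # 2. Otherwise keep hits filed on or after the drafting date.
    candidates = [h for h in hits if h.filing_date and h.filing_date >= drafted_at]
    if len(candidates) == 1:
        return candidates[0], False, "matched by filing date"
    # 3. Zero or several candidates remain: leave the decision to a person.
    return None, True, f"{len(candidates)} candidates after date filter"
```

Returning the review flag alongside the hit makes it harder for a later rule change to silently drop the manual-review path.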
Prefer .gitignore plus content cleanup. If destructive deletion is blocked, clear sensitive file contents and tell the user which directories are still present but ignored.