
Patent Status Query Skill

Overview

Use this skill for the patent status batch query program. The project is a local Streamlit app that reads Excel workbooks, queries CNIPA patent status pages through Playwright or an Edge CDP session, matches results, and exports an annotated Excel workbook.

Keep private data out of generated code, docs, and commits. Never include real case names, applicant names, account credentials, cookies, local storage tokens, browser profiles, logs, screenshots, or generated Excel outputs in the skill or repository.

Project Map

Core entry points:

app.py: Streamlit UI and main workflow.
requirements.txt: Python dependencies.
.env.example: safe runtime configuration template.
一键打开网站.bat: root launcher for the website.
开国知局网页.bat and scripts/start_edge_debug.bat: launch Edge with CDP debugging for manual login.
scripts/start_streamlit_8502.bat: start Streamlit on http://127.0.0.1:8502.

Package modules:

patent_status_tool/config.py: runtime paths and environment variables.
patent_status_tool/browser_client.py: browser/CDP orchestration.
patent_status_tool/query_service.py: batch query flow.
patent_status_tool/excel_io.py: workbook parsing and export.
patent_status_tool/matcher.py: hit selection logic.
patent_status_tool/status.py: status normalization.
patent_status_tool/selectors.py: CNIPA DOM selectors.
patent_status_tool/models.py: data structures.
patent_status_tool/result_cache.py: JSONL checkpoint cache.
patent_status_tool/logger.py: logging setup.

Runtime-only paths:

.env
.session/
logs/
output/
edge-cdp-profile/
.venv/
__pycache__/

Treat runtime-only paths as sensitive or generated. Keep them ignored and do not upload them.

Workflow

Inspect local context before changing behavior.

Read app.py, the relevant patent_status_tool/*.py modules, and the launch scripts.
Check .gitignore before adding files.
Use rg for searches.

Preserve the data flow.

Load .xlsx from a local path or upload fallback.
Detect name, writing date, optional application number, optional applicant, and status columns.
Prefer application-number queries when an input row already has an application number.
Use patent-name queries only when there is no application number.
Export results and a manual-review sheet to output/.
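The application-number priority described above can be sketched as a small pure function. This is a minimal illustration, not the project's actual code: InputRow and choose_query are hypothetical names, and the real row structure lives in patent_status_tool/models.py.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class InputRow:
    """Hypothetical stand-in for the row structure defined in models.py."""
    name: str
    application_no: Optional[str] = None


def choose_query(row: InputRow) -> Tuple[str, str]:
    """Return (mode, term): application number first, patent name as fallback."""
    app_no = (row.application_no or "").strip()
    if app_no:
        return ("application_no", app_no)
    # Only rows without an application number fall back to a name query.
    return ("name", row.name.strip())
```

A whitespace-only application number is treated as missing here, which is an assumption about how blank Excel cells should behave.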

Keep CNIPA automation robust.

Update selectors in patent_status_tool/selectors.py when the CNIPA DOM changes.
Prefer stable labels, roles, text anchors, or business-semantic selectors over brittle positional selectors.
Use the CDP Edge flow when login, captcha, or manual verification is needed.
Do not assume the live CNIPA page can be queried without user login.
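As a rough sketch of the CDP Edge flow, the snippet below attaches Playwright to an Edge instance that the user has already started and logged in to. The port 9222 and the helper names cdp_endpoint and list_open_pages are assumptions for illustration; check scripts/start_edge_debug.bat for the port the project actually uses.

```python
def cdp_endpoint(host: str = "127.0.0.1", port: int = 9222) -> str:
    """Build the CDP HTTP endpoint. Port 9222 is an assumed default."""
    return f"http://{host}:{port}"


def list_open_pages(endpoint: str) -> list:
    """Attach to an already-running Edge and return the URLs of its open tabs."""
    # Imported lazily so this module still loads where Playwright is absent.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.connect_over_cdp(endpoint)
        # A CDP-attached browser exposes the user's existing contexts, so the
        # manual login session is reused instead of being recreated.
        return [page.url for ctx in browser.contexts for page in ctx.pages]
```

Calling list_open_pages(cdp_endpoint()) only succeeds while Edge is running with remote debugging enabled, so treat it as a live check rather than a unit test.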

Validate changes locally.

At minimum run:

.\.venv\Scripts\python.exe -m py_compile app.py scripts\inspect_workbook_sheets.py patent_status_tool\config.py patent_status_tool\excel_io.py

For UI launch checks, use:

.\一键打开网站.bat

For manual CNIPA login/CDP checks, use:

.\开国知局网页.bat

Prepare for GitHub safely.

Confirm .gitignore excludes .env, .session/, logs/, output/, edge-cdp-profile/, .venv/, caches, and workbook files.
Remove or redact hardcoded local paths, personal names, company names, real case names, and internal workbook filenames.
Search for private paths, browser tokens, real account display names, personal/company identifiers, and workbook names before upload.
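One way to run the .gitignore confirmation above is a small pre-upload check. The sketch below compares literal entries only and does not implement full gitignore pattern semantics; the *.xlsx pattern for workbook files and the name missing_ignores are assumptions.

```python
from typing import List, Sequence

# Entries taken from the runtime-only paths; "*.xlsx" is an assumed pattern
# for "workbook files".
REQUIRED_IGNORES = (
    ".env", ".session/", "logs/", "output/",
    "edge-cdp-profile/", ".venv/", "__pycache__/", "*.xlsx",
)


def missing_ignores(gitignore_text: str,
                    required: Sequence[str] = REQUIRED_IGNORES) -> List[str]:
    """Return required entries that have no literal match in .gitignore."""
    entries = set()
    for line in gitignore_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            # Ignore leading/trailing slashes so "logs" and "/logs/" both count.
            entries.add(line.strip("/"))
    return [item for item in required if item.strip("/") not in entries]
```

For example, missing_ignores(Path(".gitignore").read_text(encoding="utf-8")) (with pathlib.Path imported) lists what still needs to be added before upload.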

Excel Rules

Supported required column aliases:

Name: 专利名称, 名称, 案件名称
Writing date: 撰写时间, 撰写日, 交底书时间, 起草时间

Supported optional column aliases:

Application number: 申请号, 申请号/专利号, 专利号, 申请编号
Applicant: 申请人, 专利申请人, 申请(专利权)人, 申请人/专利权人
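The alias lists above map naturally onto a lookup table. The following is a minimal sketch of header detection, assuming exact matching after trimming whitespace; detect_columns and the field keys are hypothetical and may differ from what patent_status_tool/excel_io.py does.

```python
from typing import Dict, Sequence

COLUMN_ALIASES = {
    "name": ["专利名称", "名称", "案件名称"],
    "writing_date": ["撰写时间", "撰写日", "交底书时间", "起草时间"],
    "application_no": ["申请号", "申请号/专利号", "专利号", "申请编号"],
    "applicant": ["申请人", "专利申请人", "申请(专利权)人", "申请人/专利权人"],
}
REQUIRED_FIELDS = ("name", "writing_date")


def detect_columns(headers: Sequence[object]) -> Dict[str, int]:
    """Map each recognized field to the index of its first matching header."""
    found: Dict[str, int] = {}
    for index, header in enumerate(headers):
        text = str(header or "").strip()
        for field, aliases in COLUMN_ALIASES.items():
            if field not in found and text in aliases:
                found[field] = index
    missing = [field for field in REQUIRED_FIELDS if field not in found]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    return found
```

Optional fields are simply absent from the result when no alias matches, so callers can test for "application_no" in the returned mapping.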

Sheet/status behavior:

In invention sheets, query rows whose status includes 复审中, 答复, 已完成, 已消化, or 审查.
In utility-model sheets, skip rows whose status in column L includes 非正常, 驳回, or 授权; query all other rows that have a name.
When a name query still returns multiple hits, keep only those whose application date (申请日) is on or after the writing date (撰写时间); if the ambiguity remains, send the row to manual review.
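The sheet and status rules above can be expressed as two small functions. This is a sketch under stated assumptions: the sheet kinds "invention" and "utility_model", the hit dictionaries, and the names should_query and resolve are all hypothetical, and the real logic belongs to patent_status_tool/matcher.py.

```python
from datetime import date
from typing import List, Optional, Tuple

INVENTION_QUERY_KEYWORDS = ("复审中", "答复", "已完成", "已消化", "审查")
UTILITY_SKIP_KEYWORDS = ("非正常", "驳回", "授权")


def should_query(sheet_kind: str, name: Optional[str], status: Optional[str]) -> bool:
    """Decide whether a row is queried, following the sheet rules."""
    status = status or ""
    if sheet_kind == "invention":
        return any(word in status for word in INVENTION_QUERY_KEYWORDS)
    if sheet_kind == "utility_model":
        if any(word in status for word in UTILITY_SKIP_KEYWORDS):
            return False
        return bool((name or "").strip())  # only named rows are queried
    raise ValueError(f"unknown sheet kind: {sheet_kind}")


def resolve(hits: List[dict], writing_date: date) -> Tuple[Optional[dict], bool]:
    """Return (chosen_hit, needs_manual_review) for a name query."""
    if len(hits) == 1:
        return hits[0], False
    # Keep hits whose application date is on or after the writing date.
    remaining = [h for h in hits if h["application_date"] >= writing_date]
    if len(remaining) == 1:
        return remaining[0], False
    return None, True  # zero hits or unresolved ambiguity
```

Returning a flag instead of raising keeps ambiguous rows in the batch so they can be written to the manual-review sheet later.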

Output Rules

The exported workbook should preserve the original sheet data and append result columns:

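申请号 (application number)
申请人 (applicant)
当前状态 (current status)
规范化状态 (normalized status)
申请日 (application date)
匹配说明 (match notes)
是否需人工核对 (needs manual check)
查询时间 (query time)
原始命中数量 (raw hit count)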

Also create a 人工复核 (manual review) sheet for rows that are ambiguous, failed, or otherwise flagged for manual review.
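To illustrate the append-only export, here is a library-free sketch that works on rows as plain lists. The function name append_results, the results mapping keyed by body-row index, and the flag value "是" are assumptions; the real export in patent_status_tool/excel_io.py writes an Excel workbook.

```python
from typing import Dict, List, Tuple

RESULT_COLUMNS = [
    "申请号", "申请人", "当前状态", "规范化状态", "申请日",
    "匹配说明", "是否需人工核对", "查询时间", "原始命中数量",
]
REVIEW_FLAG = "是"  # assumed marker for rows that need a manual check


def append_results(rows: List[list],
                   results: Dict[int, dict]) -> Tuple[List[list], List[list]]:
    """Return (annotated_rows, review_rows) without mutating the input."""
    header, *body = rows
    new_header = list(header) + RESULT_COLUMNS
    annotated = [new_header]
    review = [new_header]
    for index, row in enumerate(body):
        result = results.get(index, {})
        # Original cells come first and are never modified.
        extended = list(row) + [result.get(col, "") for col in RESULT_COLUMNS]
        annotated.append(extended)
        if result.get("是否需人工核对") == REVIEW_FLAG:
            review.append(extended)
    return annotated, review
```

The second return value holds the rows destined for the 人工复核 sheet, with the same header as the main sheet.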

Safety Rules

Do not commit .env; use .env.example for defaults.
Do not commit browser storage, cookies, localStorage, screenshots, HTML captures, logs, output workbooks, or user case data.
Do not hardcode absolute user-profile paths or project-specific private workbook names.
Do not store CNIPA usernames or passwords in code or docs.
When creating examples, use placeholders such as sample_input.xlsx, 示例专利名称, and 示例申请人.
If a requested change requires real CNIPA access, clearly distinguish code validation from live-query validation.

Common Tasks

Add or update launch behavior

Prefer updating the root launch scripts and documenting the new flow in README.md. Resolve paths in launch scripts relative to %~dp0 or %CD%; do not use absolute user paths.

Update selectors

Edit patent_status_tool/selectors.py. Keep selector names aligned with query concepts such as query_application_no_input, query_input, query_button, result_rows, result_status, and detail fields.
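A quick audit can help keep the selector table complete and free of positional selectors. The sketch below is hypothetical: audit_selectors is not part of the project, the regular expression covers only a few common positional forms, and the required names are taken from the concepts listed above (detail fields are left out because their names are not given).

```python
import re
from typing import Dict, List, Tuple

REQUIRED_SELECTORS = (
    "query_application_no_input", "query_input", "query_button",
    "result_rows", "result_status",
)
# Matches :nth-child(...), :nth-of-type(...), Playwright ">> nth=" and
# XPath-style numeric indexes such as div[2].
POSITIONAL = re.compile(r":nth-(?:child|of-type)\(|>> nth=|\[\d+\]")


def audit_selectors(selectors: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Return (missing_names, brittle_names) for a selector mapping."""
    missing = [name for name in REQUIRED_SELECTORS if name not in selectors]
    brittle = [name for name, value in selectors.items()
               if POSITIONAL.search(value)]
    return missing, brittle
```

The selector strings passed in are whatever selectors.py defines; the audit inspects only their text and never touches the live CNIPA page.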

Change Excel parsing

Edit patent_status_tool/excel_io.py and preserve the return contract used by app.py. If load_inputs() changes shape, update all callers including scripts in scripts/.

Change matching rules

Edit patent_status_tool/matcher.py and, when needed, patent_status_tool/status.py. Preserve application-number priority and manual-review behavior unless the user explicitly asks for a rule change.

Clean for upload

Prefer .gitignore plus content cleanup. If destructive deletion is blocked, clear sensitive file contents and tell the user which directories are still present but ignored.

Version History

1 version in total

v1.0.0 Initial release (current)

2026-06-04 17:16
