Still keying in ID card information by hand? Typing names, ID numbers, and addresses line by line strains the eyes and invites mistakes, and entering a hundred or more records a day is enough to make your wrists ache. I found this tool online, written by a developer on the 吾爱 forum. It was open-sourced at V1.0.3 and has since been updated to V1.0.5. It targets three core problems: slow recognition, wrong fields, and dependence on a GPU or a network connection. It runs purely on the CPU, works offline, and runs directly on Win10/11. I am sharing it here for anyone with the same need; the download link is included below.

## Version history: every update targets a real pain point

From the first release to V1.0.5, the tool has kept refining the core experience, and in particular has fixed the key problems users reported.

**V1.0.3 (open-source release): full basic feature set, efficiency first**

- Core upgrades: faster startup (opening the program dropped from 10 seconds to 3 seconds) and more recognized fields, filling in commonly missed ones such as ethnicity, date of birth, and validity period.
- Practical features: image renaming (optional; after recognition, images are renamed automatically as "name - ID number") and Excel export (optional; either write only the image path or embed the image directly).
- Open source: the core logic has been released. Anyone with development skills can download the source (available as an attachment) and customize it for their own business scenario.

**V1.0.4: small usability annoyances fixed**

- Local model: the model ships with the program, so the first launch no longer downloads a model package of several hundred MB. It works out of the box.
- Terminal window fix: the black console window that could not be closed after startup is gone. The interface is cleaner and there is no need to kill the process by hand.

**V1.0.5: another accuracy upgrade**

- Field matching: the matching algorithm for recognized fields was rewritten. Recognition accuracy for core fields such as name, ID number, and address improved by 15%, and two specific problems were fixed: addresses containing rare characters being misrecognized, and a trailing X in the ID number being recognized as x.
- No dependency changes: it still runs purely on the CPU, needs no GPU setup, and copes on older machines.

The tool is already packaged. Link: https://pan.quark.cn/s/423f3761c712

## Source code

```python
# Add the following code to gui.py
import os
import re
import traceback

# cv2, numpy and PIL are only needed by the unfinished preprocessing code
# that is commented out further down.
import cv2
import numpy as np
from PIL import Image
from PySide6.QtCore import QThread, Signal
from openpyxl.workbook import Workbook
from openpyxl.drawing.image import Image as XLImage
from openpyxl.styles import Alignment


# Worker thread that actually runs the OCR.
# The model loaded earlier has to be passed in.
class OCRWorker(QThread):
    # Signals used to notify the main thread of progress and results
    progress_updated = Signal(int, int)  # current progress, total count
    finished_signal = Signal()           # processing finished
    error_occurred = Signal(str)         # error message

    def __init__(self, file_paths, export_options, ocr):
        super().__init__()
        self.file_paths = file_paths
        self.export_options = export_options
        self.ocr = ocr
        self._should_terminate = False  # termination flag

    def run(self):
        try:
            # Process all files
            self.process_files(self.file_paths)
        except Exception as e:
            error_msg = f"Error during processing: {str(e)}\n{traceback.format_exc()}"
            self.error_occurred.emit(error_msg)

    def process_files(self, file_paths):
        wb = Workbook()
        ws = wb.active
        # Header row; the keys match the labels printed on the ID card
        ws.append(['图片', '姓名', '性别', '民族', '出生日期', '住址', '身份证号', '有效期限'])

        row_idx = 2
        total_files = len(self.file_paths)
        processed_count = 0

        for i, path in enumerate(file_paths):
            # Check whether termination was requested
            if self._should_terminate:
                print("Termination requested, saving the data processed so far...")
                break

            # Emit a progress update
            self.progress_updated.emit(i + 1, total_files)

            info = self.extract_info_from_image(path)

            # Check whether the thread was interrupted
            if self.isInterruptionRequested():
                break

            if info:
                ws.cell(row=row_idx, column=2, value=info['姓名'])
                ws.cell(row=row_idx, column=3, value=info['性别'])
                ws.cell(row=row_idx, column=4, value=info['民族'])
                ws.cell(row=row_idx, column=5, value=info['出生日期'])
                ws.cell(row=row_idx, column=6, value=info['住址'])
                ws.cell(row=row_idx, column=7, value=info['身份证号'])
                ws.cell(row=row_idx, column=8, value=info['有效期限'])

                # The export option decides how the image is handled
                export_option = self.export_options.get('export_option', 'image_path')

                # Rename the file if rename options are configured
                if self.should_rename_file(info):
                    new_path = self.rename_file(path, info)
                    # Point the image path at the renamed file
                    if export_option == 'image_path':
                        ws.cell(row=row_idx, column=1, value=new_path)
                    # Update the image path in the returned data
                    info['图片路径'] = new_path

                if export_option == 'image_file':
                    # Embed the image file directly
                    try:
                        img = XLImage(info['图片路径'])
                        img.width = 500
                        img.height = 300
                        ws.row_dimensions[row_idx].height = img.height
                        ws.add_image(img, f'A{row_idx}')
                        ws.column_dimensions['A'].width = img.width * 0.14
                    except Exception as e:
                        print(f"Could not insert image {path}: {e}")
                else:
                    # Store only the image path
                    ws.cell(row=row_idx, column=1, value=info['图片路径'])

                for col in range(1, 9):
                    cell = ws.cell(row=row_idx, column=col)
                    cell.alignment = Alignment(horizontal='center', vertical='center')

                row_idx += 1
                processed_count += 1

        for col in range(1, 9):
            header_cell = ws.cell(row=1, column=col)
            header_cell.alignment = Alignment(horizontal='center', vertical='center')

        output_path = '身份证识别结果.xlsx'
        wb.save(output_path)

        if self._should_terminate:
            print(f"Processing terminated, {processed_count}/{total_files} files done, "
                  f"results saved to {output_path}")
        else:
            print(f"Processing finished, {processed_count} files processed, "
                  f"results saved to {output_path}")

        # Emit the finished signal
        self.finished_signal.emit()

    def extract_info_from_image(self, image_path):
        """Extract the fields from an image (optimized text processing)."""
        try:
            # Check that the file exists and is readable
            # if not os.path.exists(image_path):
            #     raise FileNotFoundError(f"Image file does not exist: {image_path}")
            #
            # if not os.access(image_path, os.R_OK):
            #     raise PermissionError(f"No permission to read image file: {image_path}")

            # # Check whether the ID card image needs preprocessing
            # if self.export_options.get('preprocess_id_card', True):
            #     processed_image_path = self.preprocess_id_card_image(image_path)
            # else:
            #     processed_image_path = image_path
            #
            # result = self.ocr.ocr(processed_image_path, cls=True)

            result = self.ocr.ocr(image_path, cls=True)

            # 1. Concatenate all recognized text first
            all_text = ''
            for res in result:
                if not res:
                    # Nothing was detected on this page
                    continue
                for line in res:
                    text = line[1][0]
                    if text:
                        all_text += text

            # 2. Remove the "中华人民共和国居民身份证" title
            all_text = re.sub(r'中华人民共和国居民身份证', '', all_text)

            # 3. Remove all spaces and other whitespace characters
            all_text = re.sub(r'\s+', '', all_text)

            # 4. Insert a newline before each key field label
            keywords = ['姓名', '性别', '民族', '出生', '住址', '公民身份号码', '签发机关', '有效期限']
            for keyword in keywords:
                all_text = re.sub(f'({keyword})', r'\n\1', all_text)

            print(f"Processed text: {all_text}")

            # Initialize the extraction results
            name = gender = nation = birth = address = id_number = expire = ''

            # Extract each field
            # ID number: match 17 digits plus one check character (digit or X)
            id_match = re.search(r'[\d]{17}[\dXx]', all_text)
            if id_match:
                id_number = id_match.group().strip()
                # Remove the ID number so it does not interfere with other fields
                all_text = all_text.replace(id_match.group(), '')

            # Name
            name_match = re.search(r'姓名(.+?)(?=\n|$)', all_text)
            if name_match:
                name = name_match.group(1).strip()

            # Gender
            gender_match = re.search(r'性别(男|女)', all_text)
            if gender_match:
                gender = gender_match.group(1).strip()

            # Ethnicity
            nation_match = re.search(r'民族(.+?)(?=\n|$)', all_text)
            if nation_match:
                nation = nation_match.group(1).strip()

            # Date of birth
            birth_match = re.search(r'出生(.+?)(?=\n|$)', all_text)
            if birth_match:
                birth = birth_match.group(1).strip()

            # Address
            address_match = re.search(r'住址(.+?)(?=\n|$)', all_text)
            if address_match:
                address = address_match.group(1).strip()

            # Validity period
            expire_match = re.search(r'有效期限(.+?)(?=\n|$)', all_text)
            if expire_match:
                expire = expire_match.group(1).strip()

            data = {
                '姓名': name,
                '性别': gender,
                '民族': nation,
                '出生日期': birth,
                '住址': address,
                '身份证号': id_number,
                '有效期限': expire,
                '图片路径': image_path
            }
            print(f"data {data}")
            return data

        except Exception as e:
            print(f"Failed to process {image_path}: {e}")
            return None

    def should_rename_file(self, info):
        """Check whether the file needs to be renamed."""
        rename_options = self.export_options.get('rename_options', [])
        return len(rename_options) > 0

    def rename_file(self, original_path, info):
        """Rename the file according to the configuration."""
        if not self.should_rename_file(info):
            return original_path

        rename_options = self.export_options.get('rename_options', [])
        separator = self.export_options.get('separator', '_')

        # Build the parts of the new file name
        name_parts = []
        for option in rename_options:
            if option == 'name' and info.get('姓名'):
                name_parts.append(info['姓名'])
            elif option == 'id' and info.get('身份证号'):
                name_parts.append(info['身份证号'])
            elif option == 'nation' and info.get('民族'):
                name_parts.append(info['民族'])
            elif option == 'sex' and info.get('性别'):
                name_parts.append(info['性别'])
            elif option == 'address' and info.get('住址'):
                name_parts.append(info['住址'])

        if not name_parts:
            return original_path

        # Build the new file name
        new_name = separator.join(name_parts)

        # Keep the original file extension
        dir_name = os.path.dirname(original_path)
        file_ext = os.path.splitext(original_path)[1]
        new_path = os.path.join(dir_name, new_name + file_ext)

        # Rename the file
        try:
            os.rename(original_path, new_path)
            return new_path
        except Exception as e:
            print(f"Failed to rename file {original_path} - {new_path}: {e}")
            return original_path

    # Grayscale processing that turns the photo into a scan-like image.
    # The code below is not finished yet, do not use it.
    # def preprocess_id_card_image(self, image_path):
    #     """Deskew and crop the ID card image and convert it to a black-and-white scan."""
    #     try:
    #         # Read the image
    #         img = cv2.imread(image_path)
    #         if img is None:
    #             return image_path
    #
    #         # 1. Convert to grayscale
    #         gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    #
    #         # 2. Use a median filter instead of Gaussian blur
    #         denoised = cv2.medianBlur(gray, 3)
    #
    #         # 3. Apply an adaptive threshold
    #         binary = cv2.adaptiveThreshold(
    #             denoised, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
    #             cv2.THRESH_BINARY, 15, 3
    #         )
    #
    #         # 4. Optional: light smoothing
    #         smoothed = cv2.medianBlur(binary, 1)
    #
    #         # 5. Save the processed image
    #         dir_name = os.path.dirname(image_path)
    #         file_name = os.path.splitext(os.path.basename(image_path))[0]
    #         file_ext = os.path.splitext(image_path)[1]
    #         processed_path = os.path.join(dir_name, f"{file_name}_processed{file_ext}")
    #
    #         cv2.imwrite(processed_path, smoothed)
    #
    #         return processed_path
    #     except Exception as e:
    #         print(f"ID card image preprocessing failed {image_path}: {e}")
    #         return image_path
    #
    # def order_points(self, pts):
    #     """Order four points: top-left, top-right, bottom-right, bottom-left."""
    #     rect = np.zeros((4, 2), dtype="float32")
    #
    #     # Coordinate sums
    #     s = pts.sum(axis=1)
    #     rect[0] = pts[np.argmin(s)]  # top-left has the smallest sum
    #     rect[2] = pts[np.argmax(s)]  # bottom-right has the largest sum
    #
    #     # Coordinate differences
    #     diff = np.diff(pts, axis=1)
    #     rect[1] = pts[np.argmin(diff)]  # top-right has the smallest difference
    #     rect[3] = pts[np.argmax(diff)]  # bottom-left has the largest difference
    #
    #     return rect
    #
    # def four_point_transform(self, image, pts):
    #     """Four-point perspective transform."""
    #     # Get the ordered coordinates
    #     rect = self.order_points(pts)
    #     (tl, tr, br, bl) = rect
    #
    #     # Compute the width and height of the new image
    #     width_a = np.sqrt(((br[0] - bl[0]) ** 2) + ((br[1] - bl[1]) ** 2))
    #     width_b = np.sqrt(((tr[0] - tl[0]) ** 2) + ((tr[1] - tl[1]) ** 2))
    #     max_width = max(int(width_a), int(width_b))
    #
    #     height_a = np.sqrt(((tr[0] - br[0]) ** 2) + ((tr[1] - br[1]) ** 2))
    #     height_b = np.sqrt(((tl[0] - bl[0]) ** 2) + ((tl[1] - bl[1]) ** 2))
    #     max_height = max(int(height_a), int(height_b))
    #
    #     # Destination points
    #     dst = np.array([
    #         [0, 0],
    #         [max_width - 1, 0],
    #         [max_width - 1, max_height - 1],
    #         [0, max_height - 1]], dtype="float32")
    #
    #     # Compute the perspective transform matrix and apply it
    #     M = cv2.getPerspectiveTransform(rect, dst)
    #     warped = cv2.warpPerspective(image, M, (max_width, max_height))
    #
    #     return warped

    # Interruption handling: do not kill the thread directly here, because the
    # thread could exit before the Excel file has been fully written.
    # We should make sure the Excel file gets saved.
    def request_termination(self):
        """Request that processing stop."""
        self._should_terminate = True
```
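To see the field-splitting idea from `extract_info_from_image` in isolation, here is a minimal standalone sketch that needs neither an OCR model nor Qt. The function name `parse_id_card_text` and the sample string are made up for illustration; the sample uses a widely circulated fictitious ID number, not real data.

```python
import re

# Field labels as printed on the card; a newline is inserted before each one
KEYWORDS = ["姓名", "性别", "民族", "出生", "住址", "公民身份号码", "签发机关", "有效期限"]

# (label on the card, key in the result dict)
LINE_FIELDS = [("姓名", "姓名"), ("民族", "民族"), ("出生", "出生日期"),
               ("住址", "住址"), ("有效期限", "有效期限")]


def parse_id_card_text(raw: str) -> dict:
    """Hypothetical helper: split concatenated OCR text into fields."""
    text = raw.replace("中华人民共和国居民身份证", "")
    text = re.sub(r"\s+", "", text)
    for keyword in KEYWORDS:
        text = re.sub(f"({keyword})", r"\n\1", text)

    fields = {}

    # Take the ID number out first so its digits cannot leak into other fields
    id_match = re.search(r"\d{17}[\dXx]", text)
    if id_match:
        fields["身份证号"] = id_match.group()
        text = text.replace(id_match.group(), "")

    # Each remaining field runs from its label to the end of the line
    for label, key in LINE_FIELDS:
        match = re.search(rf"{label}(.+?)(?=\n|$)", text)
        if match:
            fields[key] = match.group(1)

    gender_match = re.search(r"性别(男|女)", text)
    if gender_match:
        fields["性别"] = gender_match.group(1)

    return fields


sample = ("姓名张三性别男民族汉出生1949年12月31日"
          "住址北京市朝阳区某某路1号公民身份号码11010519491231002X")
print(parse_id_card_text(sample))
```

Because the approach relies on the printed labels, a value that happens to contain one of the keywords would be split in the wrong place; that is a limitation of this sketch and presumably of the original logic as well.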
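The V1.0.5 notes mention a trailing X being recognized as x. The source shown above accepts both cases but does not normalize them, so one possible add-on is to uppercase the result and verify the check character defined by the national standard GB 11643-1999. This is a sketch of that idea, not part of the original tool; the names `normalize_id_number` and `is_valid_id_number` are invented here.

```python
import re

# Weights and check characters from GB 11643-1999 (ISO 7064 MOD 11-2)
WEIGHTS = (7, 9, 10, 5, 8, 4, 2, 1, 6, 3, 7, 9, 10, 5, 8, 4, 2)
CHECK_CHARS = "10X98765432"


def normalize_id_number(raw: str) -> str:
    """Hypothetical helper: trim whitespace and uppercase a trailing x."""
    return raw.strip().upper()


def is_valid_id_number(raw: str) -> bool:
    """Return True if the 18-character ID number has a correct check character."""
    id_number = normalize_id_number(raw)
    if not re.fullmatch(r"[0-9]{17}[0-9X]", id_number):
        return False
    total = sum(int(digit) * weight for digit, weight in zip(id_number[:17], WEIGHTS))
    return CHECK_CHARS[total % 11] == id_number[17]


# A widely circulated fictitious example number
print(normalize_id_number("11010519491231002x"))
print(is_valid_id_number("11010519491231002x"))
```

A failed check is a cheap signal that OCR misread a digit, so a row could be flagged for manual review instead of being trusted silently.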
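The comment above `request_termination` explains why the worker is asked to stop rather than killed: the loop has to reach the save step. The same cooperative pattern can be sketched with the standard library alone. Here `threading.Thread` and `threading.Event` stand in for `QThread` and the boolean flag, and `BatchWorker`, `items`, and `saved` are hypothetical names used only for this illustration.

```python
import threading


class BatchWorker(threading.Thread):
    """Hypothetical worker that stops cooperatively and always saves."""

    def __init__(self, items):
        super().__init__()
        self.items = list(items)
        self.done = []
        self.saved = None
        self._stop_requested = threading.Event()

    def request_termination(self):
        # Only set a flag; the loop decides when it is safe to stop
        self._stop_requested.set()

    def run(self):
        for item in self.items:
            if self._stop_requested.is_set():
                break
            self.done.append(item.upper())  # stands in for OCR on one file
        # The save step runs whether or not the loop was cut short
        self.saved = list(self.done)        # stands in for wb.save(...)


worker = BatchWorker(["a", "b", "c"])
worker.start()
worker.join()
print(worker.saved)
```

Calling `request_termination()` at any point leaves `saved` holding whatever was finished so far, which mirrors how the original loop breaks out and then still calls `wb.save`.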