Problems converting HTML to PDF in Python (the pdfkit / wkhtmltopdf approach)

The code is at the end. First, the problems I ran into:

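1. Error: Python OSError: wkhtmltopdf reported an error:Exit with code 1 due to network error:ProtocolUnknownE

Cause: wkhtmltopdf has local file access disabled, so the call fails as soon as the HTML refers to a file on the local disk.

Solution: enable local file access from the Python program by adding "enable-local-file-access": True to the options passed to pdfkit.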
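
A minimal sketch of that fix, for illustration only. The helper names build_options and html_file_to_pdf are hypothetical and not part of pdfkit. The sketch assumes that an empty string marks a flag that takes no value, the same form the 'quiet' option uses in the code at the end of this post; the original code passes True instead.

```python
from typing import Optional


def build_options() -> dict:
    """Options for pdfkit; each key maps to a wkhtmltopdf command-line flag."""
    return {
        "encoding": "UTF-8",
        "page-size": "A4",
        # Flag without a value: allows wkhtmltopdf to read local files
        # (images, stylesheets) referenced by the HTML being rendered.
        "enable-local-file-access": "",
    }


def html_file_to_pdf(html_path: str, pdf_path: str,
                     wkhtmltopdf_path: Optional[str] = None) -> None:
    """Render a local HTML file to PDF (hypothetical helper)."""
    # Third-party package (pip install pdfkit); imported here so that
    # build_options() stays usable on machines without pdfkit.
    import pdfkit

    # Only build an explicit configuration when a binary path is given;
    # otherwise pdfkit looks for wkhtmltopdf on its own.
    config = (pdfkit.configuration(wkhtmltopdf=wkhtmltopdf_path)
              if wkhtmltopdf_path else None)
    pdfkit.from_file(html_path, pdf_path, options=build_options(),
                     configuration=config)
```

Calling html_file_to_pdf("0file.html", "out.pdf") requires wkhtmltopdf to be installed; on Windows the path to wkhtmltopdf.exe can be passed as the third argument.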

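2. PDFs generated with pdfkit.from_string show no images, only blank space

Cause: I never worked this out. It involves downloading the image files, which is awkward to add.

Solution: save the HTML to a local file, download the images to the local disk as well, and change each image address in the HTML to the absolute path of the saved copy.

Then generate the PDF with pdfkit.from_file("0file.html", filename, configuration=self.config, options=self.options).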
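
The image step can be sketched as a standalone function. This is an illustration under assumptions, not the code used in this post: localize_images and download are hypothetical names, the sketch fetches with the standard library's urllib instead of requests, it only handles src attributes written with double quotes, and it keeps the plain absolute path in src as described above.

```python
import os
import re
import uuid
from typing import Callable
from urllib.parse import urlsplit
from urllib.request import urlopen


def download(url: str) -> bytes:
    """Default downloader: fetch the raw bytes of one image."""
    with urlopen(url, timeout=30) as resp:
        return resp.read()


def localize_images(html: str, pic_dir: str,
                    fetch: Callable[[str], bytes] = download) -> str:
    """Save each <img> source under pic_dir and point src at the local copy."""
    os.makedirs(pic_dir, exist_ok=True)
    for src in set(re.findall(r'<img[^>]*?\ssrc="([^"]+)"', html)):
        # Protocol-relative URLs ("//host/a.png") need a scheme before fetching.
        url = "http:" + src if src.startswith("//") else src
        # Keep the original extension when the URL path has one.
        ext = os.path.splitext(urlsplit(url).path)[1] or ".jpg"
        local = os.path.abspath(os.path.join(pic_dir, uuid.uuid4().hex + ext))
        try:
            data = fetch(url)
        except OSError as exc:  # urllib's URLError is a subclass of OSError
            print(f"skipping {url}: {exc}")  # the original src stays in place
            continue
        with open(local, "wb") as f:
            f.write(data)
        html = html.replace(f'src="{src}"', f'src="{local}"')
    return html
```

The returned HTML can then be written to a file and passed to pdfkit.from_file. The fetch parameter exists so a different downloader (for example one built on requests, with headers or a delay between calls) can be swapped in.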

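3. Chinese text comes out garbled

Cause: wkhtmltopdf.exe cannot find the font.

Solution: either copy the font into the tool's directory, or simply strip the "font-family:" declarations from the HTML.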
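
The second option can be sketched as a small helper. The name strip_font_family is hypothetical, and the pattern is an assumption that differs slightly from the regular expression in the code at the end of this post: it also stops at a double quote or a closing brace, so a declaration without a trailing semicolon does not swallow the markup after it. It assumes font names in the HTML are not wrapped in double quotes.

```python
import re

# Matches "font-family: ...", up to and including the next semicolon, or up to
# the end of the style attribute (") or CSS rule (}) when there is no semicolon.
FONT_FAMILY = re.compile(r'font-family\s*:[^;"}]*;?', re.IGNORECASE)


def strip_font_family(html: str) -> str:
    """Remove font-family declarations so the renderer uses a default font."""
    return FONT_FAMILY.sub("", html)


if __name__ == "__main__":
    sample = '<p style="color:red; font-family: SimSun; font-size:12px">text</p>'
    print(strip_font_family(sample))
```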

The key code that resolves the problems above:

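    import os
    import re
    import time
    import uuid

    import pdfkit
    import requests

    # In the class constructor: wkhtmltopdf options and the path to the executable.
    self.options = {
        'encoding': "UTF-8",
        'quiet': '',
        "enable-local-file-access": True,  # problem 1
        'page-size': 'A4',
        # 'footer-right': '[page]/[topage]',
    }
    self.config = pdfkit.configuration(wkhtmltopdf='C:\\Program Files\\wkhtmltopdf\\bin\\wkhtmltopdf.exe')

    # In the method that converts one page (node is the element to export, title is the document title).
    content = node.get_attribute('innerHTML')
    # Problem 3: drop the font-family declarations.
    content = re.sub(r'font-family:\s*[^;]+;', '', content)

    # Problem 2: download every image and map its original src to the local copy.
    image_links = re.findall(r'img.*?src="(.*?)"', content)
    imgMap = {}
    for img in image_links:
        if img:
            img_url = img
            if not img.startswith('http'):
                img_url = 'http:' + img  # protocol-relative URL such as //host/a.jpg
            print("Downloading image: " + img_url)
            name = str(uuid.uuid1()).replace('-', '') + '.jpg'
            # self.pic_path should be an absolute directory path.
            filename = os.path.join(self.pic_path, name)
            try:
                response = requests.get(img_url)
                with open(filename, 'wb') as f:
                    f.write(response.content)
                imgMap[img] = filename
                time.sleep(5)  # pause between downloads
            except Exception as ex:
                print(str(ex))

    # Point each image address at the downloaded file.
    for key, value in imgMap.items():
        content = content.replace(key, value)

    # Save the rewritten HTML, then render it with from_file instead of from_string.
    with open("0file.html", 'wb') as f:
        f.write(content.encode())
    filename = os.path.join(self.base_path, title + ".pdf")
    pdfkit.from_file("0file.html", filename, configuration=self.config,
                     options=self.options)  # all ok!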