The code is at the end; first, the problems I ran into:
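1. Error: Python OSError: wkhtmltopdf reported an error: Exit with code 1 due to network error: ProtocolUnknownE
Cause: access to local files is blocked (wkhtmltopdf disables it by default since 0.12.6), so the call fails as soon as the page references a local file.
Fix: enable local file access from the Python program by adding "enable-local-file-access": True to the options.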
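2. The PDF generated by pdfkit.from_string does not show images; they come out blank.
Cause: I never figured it out. It involves downloading and attaching the files, which is fairly cumbersome.
Fix: save the HTML to a local file, download the images to the local disk as well, change the image addresses in the HTML to the absolute local paths of the saved copies,
then generate the PDF with pdfkit.from_file("0file.html", filename, configuration=self.config, options=self.options).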
3. Garbled Chinese text.
Cause: wkhtmltopdf.exe cannot find the font.
Fix: either copy the font into the tool's directory, or simply delete the "font-family:" settings.
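As a rough illustration of the second fix in item 3, the sketch below strips font-family declarations from inline styles with a regular expression. The helper name strip_font_family and the sample HTML are made up for this example. The pattern assumes that style attributes are wrapped in double quotes and that any quoted font names inside them use single quotes, so treat it as a sketch rather than a CSS-aware solution.

```python
import re

# Hypothetical helper: remove every inline font-family declaration.
# Assumes style="..." uses double quotes, so the match stops at ';' or '"'
# and cannot run on into the next attribute or tag.
FONT_FAMILY_RE = re.compile(r'font-family\s*:[^;"]*;?', re.IGNORECASE)


def strip_font_family(html: str) -> str:
    """Return html with all font-family declarations removed."""
    return FONT_FAMILY_RE.sub('', html)


sample = (
    '<p style="font-family: SimSun; color: red">a</p>'
    "<span style=\"font-family:'Microsoft YaHei'\">b</span>"
)
cleaned = strip_font_family(sample)
print(cleaned)
```

Stopping the match at a double quote matters when a declaration is the last one in a style attribute and has no trailing semicolon; a pattern that insists on a semicolon would keep matching into the following markup.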
The key code that resolves the problems above:
import os
import re
import time
import uuid

import pdfkit
import requests

# This fragment runs inside a method; self.pic_path and self.base_path are output directories.
self.options = {
    'encoding': "UTF-8",
    'quiet': '',
    # Let wkhtmltopdf read local files (problem 1).
    "enable-local-file-access": True,
    'page-size': 'A4',
    # 'footer-right': '[page]/[topage]',
}
self.config = pdfkit.configuration(wkhtmltopdf='C:\\Program Files\\wkhtmltopdf\\bin\\wkhtmltopdf.exe')

content = node.get_attribute('innerHTML')
# Drop font-family settings so text is not rendered with a missing font (problem 3).
content = re.sub(r'font-family:\s*[^;]+;', '', content)

# Download every image and remember where it was saved (problem 2).
image_links = re.findall(r'img.*?src="(.*?)"', content)
img_map = {}
for img in image_links:
    if not img:
        continue
    # Protocol-relative addresses ("//host/path") need a scheme.
    img_url = img if img.startswith('http') else 'http:' + img
    print("Downloading image: " + img_url)
    img_path = os.path.join(self.pic_path, uuid.uuid1().hex + '.jpg')
    try:
        response = requests.get(img_url, timeout=30)
        response.raise_for_status()  # do not save an error page as an image
        with open(img_path, 'wb') as f:
            f.write(response.content)
        img_map[img] = os.path.abspath(img_path)
        time.sleep(5)  # pause between downloads
    except Exception as ex:
        print(str(ex))

# Point each image address at its local copy.
for key, value in img_map.items():
    content = content.replace(key, value)

with open("0file.html", 'wb') as f:
    f.write(content.encode())

filename = os.path.join(self.base_path, title + ".pdf")
pdfkit.from_file("0file.html", filename, configuration=self.config,
                 options=self.options)  # all ok!
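
The image handling above can also be written as two small pure functions, which makes it easy to check without touching the network. This is only a sketch under stated assumptions: the names normalize_img_url and localize_images are invented here, protocol-relative addresses are given an https: scheme (the code above uses http:), and the local copies are referenced as file:// URIs built with pathlib's Path.as_uri() rather than as raw Windows paths. Whether a file:// URI or a plain absolute path works better may depend on the wkhtmltopdf build, so it is worth trying both.

```python
import re
from pathlib import Path

# Matches the src attribute of an <img> tag; the \s keeps data-src from matching.
IMG_SRC_RE = re.compile(r'<img[^>]*?\ssrc="([^"]*)"', re.IGNORECASE)


def normalize_img_url(src: str) -> str:
    """Give protocol-relative URLs ("//host/a.jpg") an explicit scheme."""
    return "https:" + src if src.startswith("//") else src


def localize_images(html: str, local_paths: dict) -> str:
    """Replace each src listed in local_paths with a file:// URI.

    local_paths maps the original src text to the path of the downloaded copy.
    """
    def swap(match):
        src = match.group(1)
        if src not in local_paths:
            return match.group(0)  # not downloaded: leave the tag untouched
        uri = Path(local_paths[src]).resolve().as_uri()
        tag = match.group(0)
        start = match.start(1) - match.start(0)
        end = match.end(1) - match.start(0)
        return tag[:start] + uri + tag[end:]  # swap only the src value

    return IMG_SRC_RE.sub(swap, html)


html = '<p><img class="pic" src="//example.com/a.jpg"><img src="b.png"></p>'
result = localize_images(html, {"//example.com/a.jpg": "pics/a.jpg"})
print(normalize_img_url("//example.com/a.jpg"))
print(result)
```

Replacing only the matched src value, instead of calling str.replace on the whole document, avoids accidentally rewriting the same address where it appears in a link or in body text.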