How to Batch-Generate docx Documents from a Template with Python

Source: 迪士尼在逃公主, 2021-08-18 14:39:04, views: 3140

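Many Word documents share most of their content and differ only in a few fields: report cards, admission letters, and so on. Creating them one by one by hand is a lot of work, so it would be convenient to generate the docx files in bulk from a template. Python can do exactly that, and this article walks through how to batch-generate docx documents from a template.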

1. Requirements

Generate docx documents in bulk from a template. Concretely: read the data from an Excel file, then use Python to produce one docx document per record.

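2. Preparation

Prepare the Excel data:

This is a table of the students' Chinese, math, and English scores, saved as score.xls. The code below assumes that the first row is a header and that the columns are name, class, Chinese, math, and English, in that order.

Prepare the template:

This is the score notice sent to each student's parents, saved as template.doc. The code below expects the placeholder words name, class, language, math, and English in the second and third paragraphs. Note that python-docx reads only the .docx (Office Open XML) format, so the template must actually be saved in that format in Word, whatever the file is named.

Before running the code, install the third-party libraries docxtpl and xlrd with pip (the code imports the docx package, i.e. python-docx, which is installed as a dependency of docxtpl):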

pip install docxtpl
pip install xlrd

Then put the xls file, the template, and the Python script in the same directory.

3. Implementation

First open the xls file and read the data:

workbook = xlrd.open_workbook(sheet_path)

Then get the first sheet from the workbook:

sheet = workbook.sheet_by_index(0)

Then iterate over the rows of the sheet, skipping the header row, and store the data in a list of dictionaries:

tables = []
for num in range(1, sheet.nrows):
    stu = {}
    stu['name'] = sheet.cell_value(num, 0)
    stu['class'] = sheet.cell_value(num, 1)
    stu['language'] = sheet.cell_value(num, 2)
    stu['math'] = sheet.cell_value(num, 3)
    stu['English'] = sheet.cell_value(num, 4)
    tables.append(stu)
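
One detail worth knowing at this step: xlrd returns numeric cells as floats, so a score entered as 95 comes back as 95.0 and would show up that way in the notice. Below is a minimal sketch of a helper that tidies this up. The name format_cell is hypothetical; it is not part of xlrd or of the original code.

```python
def format_cell(value):
    """Return a cell value as display text, dropping '.0' on whole numbers."""
    # xlrd hands back numeric cells as float, e.g. 95.0 for a cell showing 95.
    if isinstance(value, float) and value.is_integer():
        return str(int(value))
    return str(value)


# Sample values standing in for what sheet.cell_value() might return.
print(format_cell(95.0))     # 95
print(format_cell(88.5))     # 88.5
print(format_cell("Alice"))  # Alice
```

It could be applied where the dictionary is filled, for example stu['math'] = format_cell(sheet.cell_value(num, 3)).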

Next, write the data from the list into docx documents. This could also be done while reading, i.e. generate one document as soon as each row has been read.
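
The read-one-row, write-one-document variant can be sketched with a generator. The function name iter_students and the FIELDS tuple are assumptions made for illustration, and plain lists stand in for spreadsheet rows so that the snippet runs without an Excel file.

```python
FIELDS = ("name", "class", "language", "math", "English")


def iter_students(rows, fields=FIELDS):
    """Yield one dict per row, pairing each cell with its field name."""
    for row in rows:
        yield dict(zip(fields, row))


# Stand-in data; with xlrd the rows could come from
# (sheet.row_values(i) for i in range(1, sheet.nrows)).
sample_rows = [
    ["Alice", "Class 3", 92.0, 88.0, 95.0],
    ["Bob", "Class 3", 75.0, 81.0, 69.0],
]

for stu in iter_students(sample_rows):
    # A document for this student could be generated here,
    # before the next row is read.
    print(stu["name"], stu["math"])
```

Because the generator yields one record at a time, the full list of students never has to be held in memory, which only matters for very large sheets.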

First open the template at the given path as a document:

document = Document(word_path)

Then replace the placeholders paragraph by paragraph with regular-expression substitution:

paragraphs = document.paragraphs
# str() guards against numeric cells, which xlrd returns as floats
text = re.sub('name', str(stu['name']), paragraphs[1].text)
paragraphs[1].text = text
text = re.sub('name', str(stu['name']), paragraphs[2].text)
text = re.sub('class', str(stu['class']), text)
text = re.sub('language', str(stu['language']), text)
text = re.sub('math', str(stu['math']), text)
text = re.sub('English', str(stu['English']), text)
paragraphs[2].text = text
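
Chained re.sub calls carry two subtle risks: each later pass also scans the text inserted by earlier passes (a student whose name happened to contain "math" would be altered), and re.sub interprets backslashes in the replacement string. The following is a sketch of a single-pass alternative, not the method used in this article; fill_placeholders is a hypothetical helper name.

```python
import re


def fill_placeholders(text, values):
    """Replace every placeholder key found in text in a single pass."""
    # Longest keys first, so a key that is a prefix of another cannot shadow it.
    keys = sorted(values, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    # A function replacement is inserted literally: no backslash processing,
    # and inserted values are never rescanned for further placeholders.
    return pattern.sub(lambda m: str(values[m.group(0)]), text)


stu = {"name": "Alice", "class": "Class 3",
       "language": 92, "math": 88, "English": 95}
line = "name of class scored language, math and English."
print(fill_placeholders(line, stu))
# Alice of Class 3 scored 92, 88 and 95.
```

With this helper, each paragraph would need only one call, for example paragraphs[2].text = fill_placeholders(paragraphs[2].text, stu).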

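If formatting does not matter, this is all that is needed. However, assigning to paragraph.text replaces the paragraph's runs, so the replaced text falls back to the default body style. The next step restores the desired formatting.

Loop over the paragraphs whose formatting needs fixing and set the font size and typeface:

for paragraph in paragraphs[1:]:
    for run in paragraph.runs:
        run.font.size = Pt(16)
        run.font.name = "宋體"  # Song typeface (SimSun)
        # font.name covers the ascii/hAnsi fonts; East Asian text needs w:eastAsia too
        r = run._element.rPr.rFonts
        r.set(qn("w:eastAsia"), "宋體")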

Finally, save the file:

document.save(os.path.join(path, "{}的成績通知單.docx".format(stu['name'])))
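
Joining path pieces with a hand-written backslash is fragile, because Python treats a backslash inside an ordinary string literal as the start of an escape sequence. Here is a small sketch that builds the output path with pathlib instead; the helper name notice_path is an assumption, not something from the original code.

```python
from pathlib import Path


def notice_path(directory, student_name):
    """Build the output file path for one student's score notice."""
    return Path(directory) / "{}的成績通知單.docx".format(student_name)


out = notice_path(Path.cwd(), "Alice")
print(out.name)
```

The result can be passed to the save call as str(notice_path(path, stu['name'])), and the separator is chosen correctly on both Windows and Unix-like systems.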

Complete code:

import os
import re

import xlrd
from docx import Document
from docx.oxml.ns import qn
from docx.shared import Pt

path = os.getcwd()

# Read the spreadsheet
sheet_path = os.path.join(path, "score.xls")
workbook = xlrd.open_workbook(sheet_path)
sheet = workbook.sheet_by_index(0)
tables = []
for num in range(1, sheet.nrows):  # row 0 is the header
    stu = {}
    stu['name'] = sheet.cell_value(num, 0)
    stu['class'] = sheet.cell_value(num, 1)
    stu['language'] = sheet.cell_value(num, 2)
    stu['math'] = sheet.cell_value(num, 3)
    stu['English'] = sheet.cell_value(num, 4)
    tables.append(stu)
print(tables)

# Write the documents
# The template must be in .docx format for python-docx to open it
word_path = os.path.join(path, "template.doc")
for stu in tables:
    # Reopen the template for every student so each notice starts clean
    document = Document(word_path)
    paragraphs = document.paragraphs
    # str() guards against numeric cells, which xlrd returns as floats
    text = re.sub('name', str(stu['name']), paragraphs[1].text)
    paragraphs[1].text = text
    text = re.sub('name', str(stu['name']), paragraphs[2].text)
    text = re.sub('class', str(stu['class']), text)
    text = re.sub('language', str(stu['language']), text)
    text = re.sub('math', str(stu['math']), text)
    text = re.sub('English', str(stu['English']), text)
    paragraphs[2].text = text
    # Assigning paragraph.text drops the run formatting, so restore it
    for paragraph in paragraphs[1:]:
        for run in paragraph.runs:
            run.font.size = Pt(16)
            run.font.name = "宋體"  # Song typeface (SimSun)
            r = run._element.rPr.rFonts
            r.set(qn("w:eastAsia"), "宋體")
    document.save(os.path.join(path, "{}的成績通知單.docx".format(stu['name'])))