import os

import pdfplumber


def extract_tables(pdf_path, output_folder):
    # Create the output folder if it does not exist
    os.makedirs(output_folder, exist_ok=True)
    with pdfplumber.open(pdf_path) as pdf:
        for pi, page in enumerate(pdf.pages):
            tables = page.extract_tables()
            # Process the tables found on this page
            print("page:", pi, ", table size:", len(tables))
            for i, table in enumerate(tables):
                out_path = os.path.join(output_folder, 'table_{}_{}.txt'.format(pi, i + 1))
                with open(out_path, 'w', encoding='utf-8') as f:
                    for row in table:
                        # Empty cells come back as None; write them as empty strings
                        row = [x if x is not None else '' for x in row]
                        f.write('\t'.join(row) + '\n')
Extract the tables from a PDF file
Using camelot
Using pdfplumber
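The issue lists camelot as an option but gives no code for it. The following is a minimal sketch of what a camelot counterpart could look like, not the author's implementation. The function name `extract_tables_camelot`, the helper `table_filename`, and the choice of `flavor="lattice"` are assumptions made for illustration. The sketch relies on `camelot.read_pdf` returning a list of tables that each expose a pandas DataFrame as `.df` and a page number as `.page`.

```python
import os


def table_filename(page, index):
    """Hypothetical helper: build a name such as 'table_1_2.txt'."""
    return "table_{}_{}.txt".format(page, index)


def extract_tables_camelot(pdf_path, output_folder, flavor="lattice"):
    """Write every table camelot finds in pdf_path as a tab-separated file.

    Assumption: 'lattice' suits tables with ruled cell borders; 'stream'
    may work better for tables separated only by whitespace.
    """
    import camelot  # imported lazily so the helper above works without it

    # Create the output folder if it does not exist
    os.makedirs(output_folder, exist_ok=True)

    tables = camelot.read_pdf(pdf_path, pages="all", flavor=flavor)
    written = []
    for i, table in enumerate(tables):
        out_path = os.path.join(output_folder, table_filename(table.page, i + 1))
        # table.df is a pandas DataFrame holding the extracted cells
        table.df.to_csv(out_path, sep="\t", index=False, header=False)
        written.append(out_path)
    return written
```

Calling `extract_tables_camelot("input.pdf", "out")` with a hypothetical `input.pdf` should produce one file per detected table. Note that camelot reports page numbers starting at 1 and, in this sketch, the table index runs across the whole document rather than restarting on each page.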