Any way to extract images from the pdf itself on the front end? #165
First off, I just want to say thank you so much for being so responsive, TaTo. I love when devs really support their work. You've helped me a lot with all of my questions. My question today is whether there is a way to extract an image from a pdf on the front end. I have a requirement where users are creating SOPs (standard operating procedures) while referencing a pdf. Sometimes the pdfs contain very helpful images that they would like to include inside their SOP. Is there any way that you know of to expose the images so they can select them? If I could figure out how to get the binary data for the image, I could convert it to base64 and display it easily in my WYSIWYG editor for the SOP they are working on.
Replies: 1 comment 1 reply
Based on: https://stackoverflow.com/questions/18680261/extract-images-from-pdf-file-with-javascript
I did a small test, and this seems to work only when raster images are embedded in the document (e.g. 9.pdf):
<script setup lang="ts">
import pdf14 from "@samples/42.pdf";
import { VuePDF, usePDF } from "@tato30/vue-pdf";
import * as PDFJS from "pdfjs-dist";
const { pdf } = usePDF(pdf14);
// Collect every raster image painted on the given page and log a PNG object URL for each.
function getPageImages(page: number) {
  pdf.value?.promise.then(async (document) => {
    const pageProxy = await document.getPage(page);
    const ops = await pageProxy.getOperatorList();
    // Gather the object ids of all image XObjects painted on this page.
    const objs: string[] = [];
    for (let i = 0; i < ops.fnArray.length; i++) {
      if (ops.fnArray[i] === PDFJS.OPS.paintImageXObject) {
        objs.push(ops.argsArray[i][0]);
      }
    }
    objs.forEach((val) => {
      // The callback runs once the image object has been resolved.
      pageProxy.objs.get(val, async (obj) => {
        const bitmap = await createImageBitmap(obj.bitmap);
        // Draw the bitmap onto an offscreen canvas so it can be encoded as PNG.
        const ocanvas = new OffscreenCanvas(bitmap.width, bitmap.height);
        ocanvas.getContext("bitmaprenderer")!.transferFromImageBitmap(bitmap);
        const blob = await ocanvas.convertToBlob({ type: "image/png" });
        console.log(URL.createObjectURL(blob));
      });
    });
  });
}
</script>
<template>
  <div>
    <VuePDF :pdf="pdf" :page="1" @loaded="getPageImages(1)" />
  </div>
</template>
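Since the question mentions converting the image data to base64 for a WYSIWYG editor, here is a rough sketch of that last step. It assumes you already have the `blob` produced by `convertToBlob` in the test above. The names `BlobLike`, `bytesToDataUrl` and `blobToDataUrl` are hypothetical helpers made up for illustration; they are not part of vue-pdf or pdfjs-dist.

```typescript
// Hypothetical minimal shape of a Blob: just the two members this sketch needs.
interface BlobLike {
  type: string;
  arrayBuffer(): Promise<ArrayBuffer>;
}

// Encode raw bytes as a base64 data URL, e.g. for an <img src="..."> in an editor.
function bytesToDataUrl(bytes: Uint8Array, mimeType: string): string {
  // btoa expects a "binary string" with one character per byte.
  let binary = "";
  bytes.forEach((b) => {
    binary += String.fromCharCode(b);
  });
  return "data:" + mimeType + ";base64," + btoa(binary);
}

// Read the blob's bytes and turn them into a data URL.
async function blobToDataUrl(blob: BlobLike): Promise<string> {
  const buffer = await blob.arrayBuffer();
  return bytesToDataUrl(new Uint8Array(buffer), blob.type);
}
```

In the test above you could then replace the `console.log(URL.createObjectURL(blob))` line with something like `console.log(await blobToDataUrl(blob))`. If the editor accepts object URLs, `URL.createObjectURL` is probably cheaper for large images, since base64 grows the payload by roughly a third.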