
How to upload and read pdf file?


I am having trouble sending a user-uploaded PDF file to a Python back-end that reads the data and then passes it on to another back-end for use in a RAG pipeline.

I’m currently using FormData, but I cannot get my code to work. Most tutorials online require the PDF filename, which I am not sure is available if I do not have the file locally.

Here is what I have been working on. ChatGPT gave me the Python code for opening and processing the file, but it doesn’t know how to navigate the FormData object.

@app.route('/uploads', methods=['POST', 'GET'])
def uploadspdf():
    text = ""
    if 'file' not in request.files:
        return jsonify({"error": "No file part"}), 400

    file = request.files['file']
    if file.filename == '':
        return jsonify({"error": "No selected file"}), 400
 
    with open(file, 'rb') as pdf_file:
        pdf_reader = PyPDF2.PdfReader(pdf_file)
        for page_num in range(len(pdf_reader.pages)):
            page = pdf_reader.pages[page_num]
            text += page.extract_text()
    return jsonify({"content": "\n".join(text)}), 200

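One detail in the route's last line can be checked separately from the upload problem: `text` is a single string, and `str.join` iterates over its argument, so `"\n".join(text)` would put a newline between every character. A minimal sketch with made-up page strings (the `pages` list is a hypothetical stand-in for the values returned by `page.extract_text()`):

```python
# Hypothetical per-page strings standing in for page.extract_text() results.
pages = ["Hi", "Yo"]

# What the route does: concatenate into one string, then join that string.
text = ""
for page_text in pages:
    text += page_text
print(repr("\n".join(text)))    # 'H\ni\nY\no'

# Collecting the pages in a list and joining the list keeps each page intact.
print(repr("\n".join(pages)))   # 'Hi\nYo'
```

Appending each page's text to a list and joining once at the end is probably what was intended here.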

JS frontend

const formData = new FormData();
formData.append('pdf', pdfFile);
console.log("formdata ", formData)

try {
  const res1 = await axios.post("http://localhost:5010/uploads", formData, {
    headers: {
      'Content-Type': 'multipart/form-data',
      'Access-Control-Allow-Origin': '*',
    },
  })
  console.log("Flask server response: ", res1.data);
} catch (error) {
  console.log("erro herr ", error)
}

I can get simple responses from the Flask server if I just return some text in the route, but I get a 400 Bad Request when I try to use the uploaded file.
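A likely source of the 400 is visible in the two snippets: the route checks for a field named `'file'`, while the frontend appends the PDF under `'pdf'`, so the `"No file part"` branch would fire. The sketch below imitates that first check with a plain dict standing in for Flask's `request.files`; `find_upload` is a hypothetical helper for illustration, not part of Flask.

```python
def find_upload(files, field="file"):
    """Mimic the route's first check: return (upload, error_message)."""
    if field not in files:
        return None, "No file part"
    return files[field], None


# The frontend in the question appends the PDF under the key 'pdf'.
# A dict with placeholder bytes stands in for request.files here.
sent_by_frontend = {"pdf": b"placeholder bytes"}

# The route looks for 'file', so the lookup fails and a 400 is returned.
upload, error = find_upload(sent_by_frontend)
print(error)    # No file part

# Once both sides use the same field name, the upload is found.
upload, error = find_upload(sent_by_frontend, "pdf")
print(error)    # None
```

If this is the cause, either `formData.append('file', pdfFile)` on the frontend or `request.files['pdf']` in the route should make the two sides agree.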

I have this button that takes the user uploaded pdf:

<Button variant="contained" component="label" className="w-full mb-4"  sx={{
    backgroundColor: 'green',
    border: '2px solid green',
    color: 'white',
    '&:hover': {
        backgroundColor: 'darkgreen',
        border: '2px solid darkgreen',
    },
}}>
    Upload PDF
    <input type="file" hidden accept=".pdf" onChange={handleFileChange}/>
</Button>

A non-empty FormData object is being sent to the back-end, but I can’t figure out how to get the PDF data out of it.

Is using formData the way to go? Should I change my overall approach?
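On the filename concern: FormData is a common way to send file uploads, and the object Flask returns from `request.files[...]` is already an open file-like object, so no path or filename should be needed. Passing it to `open()`, as the route does, would fail because `open()` expects a path. A minimal sketch, assuming PyPDF2 is installed as in the question and that its `PdfReader` accepts a binary stream; `pdf_text_from_upload` and `make_reader` are hypothetical names, and the reader is passed in as a parameter only so the function can be tried without a real PDF.

```python
import io


def pdf_text_from_upload(upload, make_reader):
    """Extract text from an uploaded PDF without touching the filesystem.

    upload:      file-like object with .read(), e.g. request.files['file']
    make_reader: callable that builds a PDF reader from a binary stream,
                 e.g. PyPDF2.PdfReader (assumed installed, as in the question)
    """
    # Copy the uploaded bytes into a seekable in-memory buffer.
    buffer = io.BytesIO(upload.read())
    reader = make_reader(buffer)
    # Collect one string per page, then join them with newlines.
    return "\n".join(page.extract_text() for page in reader.pages)


# Inside the Flask route this would be called roughly as:
#     text = pdf_text_from_upload(request.files["file"], PyPDF2.PdfReader)
#     return jsonify({"content": text}), 200
```

The resulting string could then be handed to the RAG pipeline in the same request, with no temporary file on disk.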


