I’m trying to use cURL to extract an embedded PDF located on a site that requires a username/password login (WordPress cookie). The actual file is located here, and the login is located here.
I was trying to follow the steps here and here, but I am still getting a broken PDF as a result. Here’s what I’ve tried:
- Get cURL command to log into server
  - Load login page for website and open Network pane of Developer Tools. In Firefox, right click the page, choose ‘Inspect Element (Q)’ and click on the Network tab
  - Go to login form, enter username, password and log in
  - After you have logged in, go back to Network pane and scroll to the top to find the POST entry
  - Right click and choose Copy -> Copy as cURL
  - Paste this to a text editor
- Save session cookie:
  - Using the pasted info from #1, remove the entry
    -H 'Cookie: <somestuff>'
  - Insert
    curl -c login_cookie.txt
    at the beginning (replacing the leading curl) and run the code - this saves ‘login_cookie.txt’ to the folder
*The issue I have at this point is that login_cookie.txt does not appear to have worked. When I open the .txt file, I see "fakesessid" and a lot of FALSE and TRUE entries that I can’t figure out.
- Get cURL command to the actual PDF
  - Go to the actual webpage where the PDF is located
  - Inspect element, go to Network tab, look for the XHR file (size is ~equal to the size of PDF I want downloaded)
  - Right click and choose Copy -> Copy as cURL
  - Paste this into text editor
  - Remove the entry
    -H 'Cookie: <somestuff>'
  - Insert
    curl -b login_cookie.txt
    at the beginning (replacing the leading curl)
  - Insert
    --output filename.pdf
    at the end and run code
*At this point of the process, I do see a PDF downloaded, but the PDF is corrupt.
What am I doing wrong? My exact commands (minus password info) are below:
Code to save login cookies:
curl -c login_cookie.txt 'https://fddexchange.com/wp-login.php' \
-H 'accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7' \
-H 'accept-language: en-US,en;q=0.9,zh-CN;q=0.8,zh;q=0.7' \
-H 'cache-control: max-age=0' \
-H 'content-type: application/x-www-form-urlencoded' \
-H 'origin: https://fddexchange.com' \
-H 'priority: u=0, i' \
-H 'referer: https://fddexchange.com/login-2/' \
-H 'sec-ch-ua: "Google Chrome";v="129", "Not=A?Brand";v="8", "Chromium";v="129"' \
-H 'sec-ch-ua-mobile: ?1' \
-H 'sec-ch-ua-platform: "Android"' \
-H 'sec-fetch-dest: document' \
-H 'sec-fetch-mode: navigate' \
-H 'sec-fetch-site: same-origin' \
-H 'sec-fetch-user: ?1' \
-H 'upgrade-insecure-requests: 1' \
-H 'user-agent: Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Mobile Safari/537.36' \
--data-raw 'pmpro_login_form_used=1&log=REDACTEDUSERNAME&pwd=REDACTEDPWORD&wp-submit=Log+In&redirect_to='
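For reference, the file that `curl -c` writes is a tab-separated Netscape-format cookie jar, which is where the TRUE/FALSE columns come from: each line holds domain, include-subdomains flag, path, secure flag, expiry, name and value. Below is a rough sketch for listing what ended up in the jar. The sample jar contents are made up for illustration (including the assumption that "fakesessid" is a cookie name and that the site uses WordPress's usual `wordpress_logged_in_` prefix for its login cookie), so substitute the real login_cookie.txt.

```shell
#!/bin/sh
# Hypothetical sample jar in Netscape cookie format; names and values are invented.
jar=$(mktemp)
printf '# Netscape HTTP Cookie File\n' > "$jar"
printf '#HttpOnly_fddexchange.com\tFALSE\t/\tTRUE\t0\twordpress_logged_in_HYPOTHETICAL\tsomevalue\n' >> "$jar"
printf 'fddexchange.com\tFALSE\t/\tFALSE\t0\tfakesessid\tabc123\n' >> "$jar"

# Print name=value plus the two TRUE/FALSE flags for every cookie line.
list_cookies() {
  awk -F '\t' '
    /^#HttpOnly_/ { sub(/^#HttpOnly_/, "") }   # HttpOnly cookies carry this prefix
    /^#/ || NF < 7 { next }                     # skip comments and blank lines
    { printf "%s=%s subdomains=%s secure=%s\n", $6, $7, $2, $4 }
  ' "$1"
}

list_cookies "$jar"
```

Running `list_cookies login_cookie.txt` on the real file should show whether the jar contains anything besides "fakesessid". If no login cookie is listed, the login request itself may not have succeeded.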
Code to download PDF:
curl 'https://fddexchange.com/?title=Dogtopia%20Complete%20FDD%20April%2015%202024&index=1&pdfID=86513&pdfemb-serveurl=https%3A%2F%2Ffddexchange.com%2Fwp-content%2Fuploads%2Fsecurepdfs%2F2024%2F05%2FDogtopia-Complete-FDD-April-15-2024.pdf&pdfemb-nonce=baee332b39' \
-H 'accept: */*' \
-H 'accept-language: en-US,en;q=0.9,zh-CN;q=0.8,zh;q=0.7' \
-H 'priority: u=1, i' \
-H 'referer: https://fddexchange.com/?title=Dogtopia%20Complete%20FDD%20April%2015%202024&index=1&pdfID=86513&pdfemb-serveurl=https%3A%2F%2Ffddexchange.com%2Fwp-content%2Fuploads%2Fsecurepdfs%2F2024%2F05%2FDogtopia-Complete-FDD-April-15-2024.pdf' \
-H 'sec-ch-ua: "Google Chrome";v="129", "Not=A?Brand";v="8", "Chromium";v="129"' \
-H 'sec-ch-ua-mobile: ?1' \
-H 'sec-ch-ua-platform: "Android"' \
-H 'sec-fetch-dest: empty' \
-H 'sec-fetch-mode: cors' \
-H 'sec-fetch-site: same-origin' \
-H 'user-agent: Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Mobile Safari/537.36'
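One way to narrow down the "corrupt PDF" symptom is to check whether the saved file is a PDF at all: a real PDF starts with the bytes `%PDF-`. The sketch below is only a diagnostic, not a fix. The helper name `is_pdf` and the two sample files are invented for the demo.

```shell
#!/bin/sh
# Return 0 when the file starts with the PDF signature "%PDF-".
is_pdf() {
  [ "$(head -c 5 "$1")" = "%PDF-" ]
}

# Demo on two hypothetical local files: one with a PDF header, one with HTML.
tmpdir=$(mktemp -d)
printf '%%PDF-1.7\n' > "$tmpdir/real.pdf"
printf '<!DOCTYPE html><html>login</html>\n' > "$tmpdir/fake.pdf"

for f in "$tmpdir/real.pdf" "$tmpdir/fake.pdf"; do
  if is_pdf "$f"; then
    echo "$(basename "$f"): looks like a PDF"
  else
    # Show the first bytes so it is clear what the server actually returned.
    echo "$(basename "$f"): not a PDF, starts with: $(head -c 15 "$f")"
  fi
done
```

After a download, `is_pdf filename.pdf` gives a quick answer. If the check fails, opening the file in a text editor may show an HTML login or error page, which would point to the session cookie not being accepted.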