Python Web-scraping Csrf Token Issue
Solution 1:
I believe the issue here is that <input> elements must have a name attribute to be submitted via POST or GET. Since your token is in a name-less <input> element, MechanicalSoup leaves it out of the submitted data, just as a browser would.
From the W3C HTML 4.01 specification:
Every successful control has its control name paired with its current value as part of the submitted form data set. A successful control must be defined within a FORM element and must have a control name.
...
A control's "control name" is given by its name attribute.
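To make the rule concrete, here is a minimal sketch that uses only Python's standard library. The sample HTML, the field names, and the helper form_data_set are made up for illustration. It is also a simplification: the real "successful control" rules cover disabled controls, checkboxes, and so on. It only shows that an <input> without a name never makes it into the data set:

```python
from html.parser import HTMLParser

# Hypothetical login form: the CSRF token <input> has an id but no name.
SAMPLE_FORM = """
<form id="loginForm" method="post">
  <input type="text" name="username" value="alice">
  <input type="password" name="password" value="secret">
  <input type="hidden" id="csrf" value="abc123">
</form>
"""


class NamedInputCollector(HTMLParser):
    """Collect name/value pairs from <input> elements that have a name."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        if name:  # inputs without a name are skipped
            self.fields[name] = attributes.get("value", "")


def form_data_set(html):
    """Return the simplified form data set for the given HTML."""
    collector = NamedInputCollector()
    collector.feed(html)
    return collector.fields


print(form_data_set(SAMPLE_FORM))
# {'username': 'alice', 'password': 'secret'}
```

The hidden token is absent from the result because it has no control name to be paired with.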
Perhaps there is some JavaScript that is handling the CSRF token.
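If that is the case, a script would have to do by hand what the page's JavaScript does. The following is only a sketch under assumptions: the element id "csrf", the header name X-CSRF-Token, and the sample HTML are all hypothetical, so check the site's actual requests in your browser's developer tools before relying on any of them:

```python
from html.parser import HTMLParser

# Hypothetical page fragment: the token sits in an <input> with an id but no name.
SAMPLE_PAGE = (
    '<form id="loginForm">'
    '<input type="hidden" id="csrf" value="abc123">'
    "</form>"
)


class InputValueById(HTMLParser):
    """Remember the value attribute of the first <input> with a given id."""

    def __init__(self, element_id):
        super().__init__()
        self.element_id = element_id
        self.value = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        is_target = tag == "input" and attributes.get("id") == self.element_id
        if is_target and self.value is None:
            self.value = attributes.get("value")


def extract_token(html, element_id="csrf"):
    """Return the token value, or None if no matching <input> is found."""
    finder = InputValueById(element_id)
    finder.feed(html)
    return finder.value


token = extract_token(SAMPLE_PAGE)
# Assumption: the site expects the token in a request header with this name.
extra_headers = {"X-CSRF-Token": token}
print(extra_headers)
```

With MechanicalSoup, headers like these could then be attached to the underlying requests session via browser.session.headers.update(extra_headers) before submitting the form.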
For a similar discussion, see "Does form data still transfer if the input tag has no name?"
Regarding your usage of MechanicalSoup, the classes StatefulBrowser and Form would simplify your script. For example, if you just had to open the page and input a username and password:
import mechanicalsoup
# These values are filled by the user
url = ""
username = ""
password = ""
# Open the page
browser = mechanicalsoup.StatefulBrowser(raise_on_404=True)
browser.open(url)
# Fill in the form values
form = browser.select_form('form[id=loginForm]')
form['username'] = username
form['password'] = password
# Submit the form and print the resulting page text
response = browser.submit_selected()
print(response.text)