Some data (forums, accounts, dashboards) is only visible after logging in.
With Python Requests you can:
- Start a session to keep cookies and headers
- Log in with a POST request using your credentials
- Reuse the session to access protected pages
This allows you to collect information that isn’t available to anonymous visitors while keeping the login state active across requests.
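To see what a `Session` actually carries between requests, you can inspect a prepared request without touching the network. A small sketch (the user-agent string and cookie value are made up for illustration; real session cookies are set by the server at login):

```python
import requests

# A Session keeps cookies and default headers between requests.
session = requests.Session()
session.headers.update({'User-Agent': 'my-scraper/1.0'})  # sent on every request
session.cookies.set('sessionid', 'abc123')                # normally set by the server

# prepare_request shows exactly what would go over the wire,
# without making a network call.
req = requests.Request('GET', 'https://example.com/dashboard')
prepared = session.prepare_request(req)

print(prepared.headers['User-Agent'])  # my-scraper/1.0
print(prepared.headers.get('Cookie'))  # sessionid=abc123
```

Every request made through `session` gets these headers and cookies automatically, which is what keeps the login state alive.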
Use browser DevTools (F12 → Elements) to find:
- Authentication endpoint (usually from the `action` attribute in `<form>`).
- HTTP method (often `POST`).
- Field names (e.g., `username`, `password`).
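The same details can also be pulled out of the page programmatically with BeautifulSoup. A sketch, using a stand-in snippet in place of the real login page's HTML:

```python
from bs4 import BeautifulSoup

# Stand-in for the HTML returned by a GET of the login page
html = """
<form action="/authenticate" method="post">
  <input type="text" name="username">
  <input type="password" name="password">
  <button type="submit">Login</button>
</form>
"""

soup = BeautifulSoup(html, 'html.parser')
form = soup.find('form')

print(form['action'])                    # authentication endpoint
print(form.get('method', 'get').upper()) # HTTP method
# input field names to use as keys in the credentials dict
print([i['name'] for i in form.find_all('input') if i.has_attr('name')])
```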
Install the required packages:

```shell
pip install requests beautifulsoup4
```

Send a POST with credentials to authenticate.
```python
import requests
from bs4 import BeautifulSoup

# Create a session object
session = requests.Session()

# Add login data
login_url = 'https://practice.expandtesting.com/authenticate'
credentials = {
    'username': 'practice',
    'password': 'SuperSecretPassword!'
}

# Send POST request
response = session.post(login_url, data=credentials)

if response.ok:
    print("Login successful!")
else:
    print("Login failed!")
```

Reuse the session to fetch protected content.
```python
data_url = 'https://practice.expandtesting.com/secure'
data_page = session.get(data_url)

if data_page.ok:
    print("Data retrieved successfully!")
    soup = BeautifulSoup(data_page.text, 'html.parser')
    first_heading = soup.find('h1')  # first heading on the protected page
    print("First heading:", first_heading.text)
else:
    print("Failed to retrieve data.")
```

Common obstacles:
- CSRF tokens: you may need an initial GET to extract the token from headers or HTML.
- CAPTCHAs: switch IPs, use automation, or CAPTCHA-solving services.
- 2FA (two-factor authentication): use throwaway accounts with 2FA disabled, or handle custom flows.
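For the CSRF case, the token extraction can be sketched as a small helper. The field name `csrf_token` is an assumption; sites use names like `_token` or `authenticity_token` as well, so check the form in DevTools:

```python
from bs4 import BeautifulSoup

def extract_csrf_token(html, field_name='csrf_token'):
    # CSRF tokens usually sit in a hidden <input> inside the login form.
    soup = BeautifulSoup(html, 'html.parser')
    tag = soup.find('input', {'name': field_name})
    return tag['value'] if tag else None

# Typical flow (sketch, reusing the session from the example above):
# login_page = session.get(login_url)          # initial GET also sets cookies
# token = extract_csrf_token(login_page.text)
# credentials['csrf_token'] = token            # send the token back with the POST
# session.post(login_url, data=credentials)
```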
Putting it all together:

```python
import requests
from bs4 import BeautifulSoup

# Log in once, then reuse the session for protected pages
session = requests.Session()

login_url = 'https://practice.expandtesting.com/authenticate'
credentials = {
    'username': 'practice',
    'password': 'SuperSecretPassword!'
}

response = session.post(login_url, data=credentials)
if response.ok:
    print("Login successful!")
else:
    print("Login failed!")

data_url = 'https://practice.expandtesting.com/secure'
data_page = session.get(data_url)
if data_page.ok:
    print("Data retrieved successfully!")
    soup = BeautifulSoup(data_page.text, 'html.parser')
    first_heading = soup.find('h1')
    print("First heading:", first_heading.text)
else:
    print("Failed to retrieve data.")
```

Tips:
- Always check legality (ToS).
- For JS-heavy or CAPTCHA-protected logins, switch to Selenium.
- Sessions store cookies, so always reuse the same session for authenticated requests.
