What do you know about user agent support in Python’s requests library? If the answer is little or nothing, this article was written for you. Read on to learn how to set and rotate user agents in requests.
Whenever a client sends a request to a web server, it is expected to identify itself by name. Many clients also include the operating system they run on and the version of the software. This identification is known as the user agent, and it is sent in the User-Agent header of the HTTP request. Whether the client is a browser, a bot, or an official app created by the website’s owner, it is expected to send this header.
Every popular client has its own user agent string. This matters to you as a bot developer, because your bot sends this header too. If you do not set it yourself, your HTTP library will set one for you.
The focus of this article is the user agent in Python’s requests library, which by default sends a generic user agent string that tells your target the client is a bot. If you need to scrape popular targets on the web, you should change it to something else. This article is a guide on how to change your Python Requests user agent string by faking it.
Faking User Agent Using Python Requests — An Overview
Python Requests is the de facto third-party library for HTTP requests because of its simplicity, ease of use, robustness, and good error handling. One of its features is that it lets you set any user agent string you like. Using this, you can identify yourself as a popular web browser, or even as Googlebot, to avoid getting blocked. However, requests does not require you to set a user agent header. If you do not set one, it sends a default user agent. Below is the default user agent header for requests.
python-requests/x.y.z
The x.y.z is a placeholder for the version of requests you are using. There is nothing wrong with using this default. However, if your target is a tough one, you will get blocked. Amazon, for instance, blocks this user agent by default, even on your first request. Changing your user agent to that of a popular, widely accepted client, such as a web browser, is one of the things you should do if you want to avoid getting blocked when web scraping or doing other kinds of botting. Conveniently, it is very easy to do with requests and costs nothing.
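To check the exact default string your own installation would send, you can ask requests for it directly. This is a minimal sketch that assumes requests is already installed. It does not send any request.

```python
import requests

# The string requests sends when you do not set a User-Agent yourself.
default_ua = requests.utils.default_user_agent()
print(default_ua)

# A new Session carries the same value in its default headers.
session = requests.Session()
print(session.headers["User-Agent"])
```

The version part of the output depends on the release of requests you have installed.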
How to Set User Agent in Python Requests
Setting a user agent in Python’s requests is easy. All you need to do is add it as a key-value pair in a dictionary and pass that dictionary to the headers parameter of your request. Let me walk you through the method, including how to get a user agent string to use. In this guide, I will use my browser’s user agent.
Step 1: Visit the User Agent String API page of Httpbin (https://httpbin.org/user-agent) using your web browser. I got the response below.
{ "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1 Safari/605.1.15" }
Step 2: As you can see above, the response is a JSON object, which is typical of an API. Inside it is the user agent string: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1 Safari/605.1.15. This tells the web server I am using a Mac computer with the Safari browser.
Step 3: The next thing to do is to install requests using the pip command below.
pip install requests
Step 4: Then import requests, create the headers dictionary with the user agent as a key-value pair, and pass the dictionary to the headers parameter of the request.
import requests

# Send the browser's user agent string instead of the requests default.
headers = {
    "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1 Safari/605.1.15"
}

# httpbin echoes back the user agent it received.
user_agent = requests.get("https://httpbin.org/user-agent", headers=headers).json()
print(user_agent)
As you can see above, I requested my user agent from the same httpbin API and got the same value as I did with my browser. With this, I can say I have successfully faked the user agent in Python requests, making it seem as if I were using my browser when in reality I am using Python requests.
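If you make many requests, you may prefer to set the user agent once on a Session instead of passing headers on every call. The sketch below reuses the Safari string from the steps above. It builds the request without sending it, so you can inspect the outgoing header offline. The name SAFARI_UA is only an illustrative choice.

```python
import requests

# User agent string from the walkthrough above (Safari on macOS).
SAFARI_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
    "AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1 Safari/605.1.15"
)

session = requests.Session()
# Headers set on the session are merged into every request it makes.
session.headers.update({"User-Agent": SAFARI_UA})

# Build the request without sending it, to inspect the outgoing headers.
prepared = session.prepare_request(
    requests.Request("GET", "https://httpbin.org/user-agent")
)
print(prepared.headers["User-Agent"])
```

Calling session.get("https://httpbin.org/user-agent") on the same session would then send this header over the network.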
How to Rotate User Agent String in Python Requests
It is good to change your user agent, but when web scraping, it is even better to have a bunch of user agent strings and rotate randomly among them. In this section, I will show you how to rotate a user agent string in Python requests.
Step 1: First, you need a bunch of user-agent strings. Go to the User Agent library on Device Atlas (https://deviceatlas.com/blog/list-of-user-agent-strings) and get a bunch of them. I will use only 3 in this example.
Step 2: Copy the ones you choose into a notepad, as you will need them later.
Step 3: Go to your IDE and create a new Python file.
Step 4: Import requests and the random module
import requests
import random
Step 5: Load the user agents you copied as a list variable
user_agent_strings = [
    'Mozilla/5.0 (Linux; Android 13; SM-S908B) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/112.0.0.0 Mobile Safari/537.36',
    'Mozilla/5.0 (Linux; Android 10; K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Mobile Safari/537.36',
    'Mozilla/5.0 (iPhone12,1; U; CPU iPhone OS 13_0 like Mac OS X) AppleWebKit/602.1.50 (KHTML, like Gecko) Version/10.0 Mobile/15E148 Safari/602.1',
]
Step 6: Use the random function to choose just one and assign it as a user agent in the header variable
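# Pick a random index into the list, then use that string as the user agent.
random_number = random.randint(0, len(user_agent_strings) - 1)
headers = {"User-Agent": user_agent_strings[random_number]}
Step 7: Pass the headers dictionary to the headers parameter of the request
# httpbin echoes back the user agent it received.
user_agent = requests.get("https://httpbin.org/user-agent", headers=headers).json()
print(user_agent)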
Step 8: Below is the full code
import requests
import random

# Pool of user agent strings to rotate among.
user_agent_strings = [
    'Mozilla/5.0 (Linux; Android 13; SM-S908B) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/112.0.0.0 Mobile Safari/537.36',
    'Mozilla/5.0 (Linux; Android 10; K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Mobile Safari/537.36',
    'Mozilla/5.0 (iPhone12,1; U; CPU iPhone OS 13_0 like Mac OS X) AppleWebKit/602.1.50 (KHTML, like Gecko) Version/10.0 Mobile/15E148 Safari/602.1',
]

# Pick a random index into the list, then use that string as the user agent.
random_number = random.randint(0, len(user_agent_strings) - 1)
headers = {"User-Agent": user_agent_strings[random_number]}

# httpbin echoes back the user agent it received.
user_agent = requests.get("https://httpbin.org/user-agent", headers=headers).json()
print(user_agent)
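The script above picks one user agent per run. To rotate on every request, one option is to wrap the pick in a small helper and call it each time you build headers. The sketch below uses random.choice from the standard library. The names USER_AGENTS and random_headers are illustrative, not part of requests.

```python
import random

# Example pool, reusing the three strings from the steps above.
USER_AGENTS = [
    "Mozilla/5.0 (Linux; Android 13; SM-S908B) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/112.0.0.0 Mobile Safari/537.36",
    "Mozilla/5.0 (Linux; Android 10; K) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/114.0.0.0 Mobile Safari/537.36",
    "Mozilla/5.0 (iPhone12,1; U; CPU iPhone OS 13_0 like Mac OS X) "
    "AppleWebKit/602.1.50 (KHTML, like Gecko) Version/10.0 Mobile/15E148 Safari/602.1",
]


def random_headers(pool=USER_AGENTS):
    """Return a headers dict with a user agent picked at random from pool."""
    return {"User-Agent": random.choice(pool)}


# Each call makes a fresh pick, so consecutive requests can differ.
for _ in range(3):
    print(random_headers())
```

In a scraper, you would pass headers=random_headers() to each requests.get call so that every request gets its own pick.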
FAQs About Python Requests User Agent
Q. Is the User-Agent Header the Only Header I Need to Set?
The user agent is just one of the headers you can set. On most websites, setting the user agent alone is enough. But some websites go further and check whether the other headers a browser normally sends are present, and a request without them looks suspicious. Other headers you may need to set include accept, accept-encoding, and accept-language. You can always check your browser’s developer tools to see the headers and values your browser is sending.
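As a rough illustration, the headers named above can be bundled with the user agent in one dictionary. The values in this sketch are assumptions chosen to look like a typical browser. Copy the real ones from your own browser’s developer tools. The function name browser_like_headers is hypothetical.

```python
def browser_like_headers(user_agent):
    """Return a headers dict that pairs the user agent with common browser headers.

    The Accept* values below are illustrative assumptions, not values any
    particular site requires.
    """
    return {
        "User-Agent": user_agent,
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
        "Accept-Encoding": "gzip, deflate",
        "Accept-Language": "en-US,en;q=0.9",
    }


headers = browser_like_headers(
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/14.1 Safari/605.1.15"
)
for name, value in headers.items():
    print(f"{name}: {value}")
```

The resulting dictionary can be passed to the headers parameter of requests.get in the same way as the single-header dictionary shown earlier.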
Q. Will Setting HTTP Headers Prevent Blocking?
Just setting HTTP headers in Python requests will not protect you against blocking on its own. Websites use a good number of techniques to detect and block bots, and request headers are just a small slice of the pie, because they can be spoofed so easily. To actually avoid blocks, you will need to rotate IPs using proxies and have a captcha solver at hand if your target uses captchas to protect its system. For some websites, you will also need to use an automated browser.
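For the proxy part, requests accepts a proxies mapping on a session or on an individual call. The sketch below pairs a randomly chosen proxy with a user agent. The proxy URLs are hypothetical placeholders, and new_session is an illustrative helper, not a requests function. Nothing is sent over the network here.

```python
import random

import requests

# Hypothetical proxy endpoints; replace them with ones from your provider.
PROXIES = [
    "http://user:password@proxy1.example.com:8080",
    "http://user:password@proxy2.example.com:8080",
]


def new_session(user_agent):
    """Create a session that uses one randomly chosen proxy and the given user agent."""
    proxy = random.choice(PROXIES)
    session = requests.Session()
    # Route both plain and TLS traffic through the same proxy.
    session.proxies.update({"http": proxy, "https": proxy})
    session.headers.update({"User-Agent": user_agent})
    return session


session = new_session(
    "Mozilla/5.0 (Linux; Android 10; K) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/114.0.0.0 Mobile Safari/537.36"
)
print(session.proxies)
```

Creating a fresh session per batch of requests is one simple way to change both the IP and the user agent together.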
Q. Is Python Requests Good for Scraping?
As far as scraping in Python is concerned, requests is one of the great tools out there. It is built on top of urllib3 and offers a much simpler interface than the standard library’s urllib. It supports proxies, error handling, and other advanced features. However, there are tasks requests is not good for, because it does not execute JavaScript. If you need to render and execute JavaScript to access your data, a browser automation tool such as Selenium is a better fit for you as a Python developer.
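To illustrate the error handling mentioned above, here is a minimal sketch of a fetch helper that sets a timeout and catches the base exception class of requests. The function name fetch_text and the 10-second timeout are assumptions made for this example.

```python
import requests


def fetch_text(url, headers=None, timeout=10):
    """Return the response body as text, or None if the request fails."""
    try:
        response = requests.get(url, headers=headers, timeout=timeout)
        # Turn 4xx and 5xx responses into exceptions.
        response.raise_for_status()
    except requests.RequestException as exc:
        # Covers connection errors, timeouts, invalid URLs, and HTTP errors.
        print(f"Request failed: {exc}")
        return None
    return response.text


# A URL without a scheme is rejected before any network traffic happens.
print(fetch_text("not-a-valid-url"))
```

Returning None on failure keeps the calling loop simple. A real scraper might retry or switch user agent and proxy instead.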
Conclusion
From the above, you can see how easy it is to set the user agent header and fake it using Python’s requests. While faking a user agent is good for hiding your client, you also need to know that it can break your bot if you switch to a user agent for which your target serves a different page layout, for example a mobile user agent that receives the mobile version of a site. So, you should understand the user agents you use to avoid breaking your code.