Planning to use Selenium for automated testing or web scraping? Depending on your project requirements, you might need proxies. Read on to discover our top Selenium proxy picks.
The importance of Selenium cannot be overemphasized. When it is not being used for automated testing, web scrapers use it to scrape data from JavaScript-heavy websites. In both of these areas, proxies are often needed.
In some instances, you can get away without proxies; in others, proxies are a must unless you are ready to pay for more expensive alternatives. This article discusses the proxies you can use with Selenium so that it works effectively.
Before discussing the proxies, we will give an overview of Selenium and explain why you might need proxies for it. You will also learn how to set up Selenium to work with proxies.
What is Selenium?
Selenium is a browser automation tool. With it, a browser can be driven to carry out tasks such as visiting a website, filling in forms, and anything else you can do with a browser by hand. It is used mainly for automated testing.
It is also used for web scraping, since it can render web pages and read their content. Selenium supports a good number of browsers, including Chrome, Internet Explorer, and Firefox. Older versions of Selenium also support headless browsers such as PhantomJS.
Its language support is another reason it is popular among developers: it provides bindings for Python, Java, JavaScript, C#, and Ruby.
Why Do You Need Proxies for Selenium?
Proxies are not always a must; whether you need them depends on the project. As stated earlier, Selenium is used for automated testing and web scraping. For automated testing, you do not need proxies unless you are testing localization.
Suppose, for instance, you are developing sites for different regions and want to test whether the language shown to visitors in each region is the language used there. Aside from localization, there is little reason to use proxies for automated testing.
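A localization check of the kind described above can be reduced to comparing the language a page declares with the language expected for a region. The sketch below is only an illustration: the `EXPECTED_LANG` table and the helper names are hypothetical, and it assumes the site declares its language in the `lang` attribute of the `<html>` tag, which not every site does.

```python
from html.parser import HTMLParser

# Hypothetical mapping of proxy region -> language the site should serve there.
EXPECTED_LANG = {"de": "de", "fr": "fr", "us": "en"}


class _LangParser(HTMLParser):
    """Records the lang attribute of the first <html> tag it sees."""

    def __init__(self):
        super().__init__()
        self.lang = None

    def handle_starttag(self, tag, attrs):
        if tag == "html" and self.lang is None:
            self.lang = dict(attrs).get("lang")


def declared_language(page_source):
    """Return the primary language subtag declared by the page, or None."""
    parser = _LangParser()
    parser.feed(page_source)
    if not parser.lang:
        return None
    # Reduce a tag such as "de-DE" to its primary subtag "de".
    return parser.lang.split("-")[0].lower()


def is_localized(page_source, region):
    """True if the page declares the language expected for the region."""
    return declared_language(page_source) == EXPECTED_LANG[region]
```

With Selenium, `page_source` would be `driver.page_source` taken from a browser started behind a proxy located in the region under test.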
For web scraping, proxies are likewise required when you need localized web content. They are also required when you send many requests to a website in a short period of time, because a site that sees too many requests from one IP address will often block it.
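One common way to spread requests over several IP addresses is to give each new browser session the next proxy from a pool. The following is a minimal sketch, assuming a hypothetical `PROXY_POOL` of placeholder addresses and Chrome's `--proxy-server` flag, which is covered in the setup section later in this article.

```python
from itertools import cycle

# Placeholder addresses from a documentation range; substitute your own proxies.
PROXY_POOL = ["203.0.113.10:3128", "203.0.113.11:3128", "203.0.113.12:3128"]


def proxy_arguments(pool):
    """Yield a Chrome --proxy-server argument per session, round-robin."""
    for address in cycle(pool):
        yield "--proxy-server=http://%s" % address


rotation = proxy_arguments(PROXY_POOL)
first_session_arg = next(rotation)   # argument for the first browser session
second_session_arg = next(rotation)  # argument for the second browser session
```

Each value would be passed to `add_argument` on the browser options before a new driver is started, so consecutive sessions leave through different proxies.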
Where to find the Best Selenium Proxies
Strictly speaking, there are no "best proxies for Selenium", because Selenium itself does not require proxies. The site you intend to use Selenium on determines which proxies you should use. For this reason, our recommendations cover both the datacenter and residential proxy categories.
Residential Proxies for Selenium
Residential proxies are the proxies of choice for Selenium WebDriver because, unlike datacenter proxies, they are not easily detected. They route clients' requests through residential IPs, and websites trust these IPs more than datacenter IPs. Residential proxies are good for accessing complex sites such as Instagram, Google, and YouTube, among others. Some residential proxy providers for Selenium are discussed below.
Luminati
- IP Pool Size: Over 40 million
- Locations: All countries in the world
- Concurrency Allowed: Unlimited
- Bandwidth Allowed: Starts at 40GB
- Cost: Starts at $500 monthly for 40GB
Luminati is arguably the best residential proxy provider on the market. It is the largest proxy network in the world, with over 40 million residential IP addresses in its pool. Two things make Luminati residential proxies a good fit for Selenium. The more important one is that Luminati has proxies in every country and in most cities around the world, so you can target specific locations, which is what content localization testing with Selenium needs. The second is its high-rotating proxies, which assign you a different IP address after every web request; this makes you harder to block and suits web scraping.
Smartproxy
- IP Pool Size: Over 10 million
- Locations: 195 locations across the globe
- Concurrency Allowed: Unlimited
- Bandwidth Allowed: Starts at 5GB
- Cost: Starts at $75 monthly for 5GB
Smartproxy is another residential proxy service whose premium proxies suit websites with smart anti-spam systems as well as content localization testing with Selenium. Like Luminati, Smartproxy has good location coverage, with proxies in about 195 countries and in over 8 major cities around the world. It offers high-rotating proxies as well. Smartproxy is the provider of choice for those who want premium proxies on a small budget: $75 buys 5GB, whereas Luminati's entry plan costs $500.
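The two entry plans can also be compared per gigabyte rather than by entry price. The short calculation below uses only the plan figures listed in this article ($500 for 40GB and $75 for 5GB); the `ENTRY_PLANS` table and `cost_per_gb` helper are illustrative names.

```python
# Entry plans as listed in this article: (monthly price in USD, bandwidth in GB).
ENTRY_PLANS = {
    "Luminati": (500, 40),
    "Smartproxy": (75, 5),
}


def cost_per_gb(price_usd, bandwidth_gb):
    """Price of one gigabyte on a metered plan."""
    return price_usd / bandwidth_gb


for provider, (price, gigabytes) in ENTRY_PLANS.items():
    print("%s: $%.2f per GB, $%d minimum" % (provider, cost_per_gb(price, gigabytes), price))
```

On these figures Smartproxy is cheaper to get started with, while Luminati's entry plan works out cheaper per gigabyte.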
Stormproxies
- IP Pool Size: 40,000
- Locations: the US and EU region only
- Concurrency Allowed: only one device per port
- Cost: Starts at $50 monthly for 10 ports
Luminati and Smartproxy have one problem in common: their bandwidth is exhaustible. That is, their proxies are metered, and once you have consumed the bandwidth allocated to you, you cannot use the proxies again until you pay for more. Stormproxies residential proxies come with unlimited bandwidth. However, for the sake of performance, the number of threads you can run is limited. Stormproxies residential proxies are well suited to web scraping and can be used with Selenium to access a good number of sites.
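With a port-based plan, a scraper has to make sure that no two workers share a port at the same time. The sketch below assumes the limit means one concurrent worker per port; `GATEWAY`, `PORTS`, and the `fetch` stub are hypothetical placeholders, not Stormproxies' real endpoints.

```python
import queue
from concurrent.futures import ThreadPoolExecutor

# Hypothetical gateway and ports; a 10-port plan would list ten entries.
GATEWAY = "gateway.example.com"
PORTS = [9001, 9002, 9003]

free_ports = queue.Queue()
for port in PORTS:
    free_ports.put(port)


def fetch(url):
    """Borrow a port, do the work, and hand the port back."""
    port = free_ports.get()  # blocks until a port is free
    try:
        proxy = "%s:%d" % (GATEWAY, port)
        # A real worker would start a browser with this proxy and load `url`.
        return url, proxy
    finally:
        free_ports.put(port)


urls = ["https://example.com/page/%d" % n for n in range(6)]
# Never run more workers than there are ports.
with ThreadPoolExecutor(max_workers=len(PORTS)) as pool:
    results = list(pool.map(fetch, urls))
```

The queue acts as a lease on each port, so the thread pool size and the number of ports can be changed independently without two workers ever holding the same port.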
Datacenter Proxies for Selenium
Datacenter proxies are the cheapest proxies on the market. They use IP addresses owned by data centers, and because of that they are easily detected and banned. Some providers' proxies, however, have proven able to evade detection and bans. A few of these are discussed below.
MyPrivateProxy
- Locations: US and EU region only
- Concurrency Allowed: Up to 100 threads
- Bandwidth Allowed: Unlimited
- Cost: $1.49 per proxy for a month
MyPrivateProxy is arguably the best datacenter proxy provider on the market. Its proxies are among the fastest, and they are also secure and reliable. With MyPrivateProxy datacenter proxies, you can use Selenium to scrape non-localized web content. Because MyPrivateProxy supports only a few locations, it is not a good provider for automated localization testing, but it works well for web scraping. Some of MyPrivateProxy's data centers are powered by green energy sources. Its proxies are quite cheap.
Highproxies
- Locations: 10 countries
- Concurrency Allowed: Unlimited
- Bandwidth Allowed: Unlimited
- Cost: $1.40 per proxy for a month
Highproxies datacenter proxies can be a good choice for both web scraping and automated localization testing. Unlike MyPrivateProxy, Highproxies has proxies in a good number of countries, including the United States, Canada, Italy, Israel, Spain, Germany, France, the Netherlands, Japan, and Australia. Highproxies performs well in terms of speed, reliability, and security. Its datacenter proxies are not easily detected, so websites do not block them easily. Like MyPrivateProxy, Highproxies can be used on some complex websites such as Facebook and Twitter without any problem. They are, however, not the cheapest datacenter proxies on the list.
InstantProxies
- Locations: Worldwide
- Concurrency Allowed: Unlimited
- Bandwidth Allowed: Unlimited
- Cost: $1.00 per proxy for a month
MyPrivateProxy datacenter proxies, as noted above, are cheap. InstantProxies are cheaper still: $10 gets you 10 proxies. InstantProxies supports a good number of locations but does not let you choose the location yourself. Before selling proxies to you, InstantProxies tests them to make sure they work, so your time is not wasted. As with MyPrivateProxy, its proxies are best for web scraping rather than for automated localization testing with Selenium.
How to Set Up Proxies on Selenium
One question developers often have is how to set up proxies on Selenium. Because Selenium supports a variety of browsers and programming languages, the answer varies with the browser and language you use.
Selenium proxy setting for Chrome browser
In this section, we look at how to set up Selenium with proxies when driving the popular Chrome browser from Python.
The code below shows how to set up a proxy on Selenium for Chrome.
from selenium import webdriver

PROXY = "21.65.32.65:3124"  # proxy address in ip:port form

chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--proxy-server=%s' % PROXY)

# options= replaces the deprecated chrome_options= keyword
chrome = webdriver.Chrome(options=chrome_options)
chrome.get("https://whatismyipaddress.com")
The last line of the code opens the WhatIsMyIPAddress website, so you can confirm that Chrome is using your preferred proxy.
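Rather than checking the page by eye, the exit IP can also be checked in code. The sketch below assumes an IP echo endpoint such as https://httpbin.org/ip, which returns JSON of the form `{"origin": "<ip>"}`. The helper names are hypothetical, and because the browser may wrap the JSON in HTML, the value is pulled out with a regular expression.

```python
import re

ORIGIN_PATTERN = re.compile(r'"origin"\s*:\s*"([^"]+)"')


def reported_ip(page_source):
    """Extract the IP address an echo service such as httpbin.org/ip reports."""
    match = ORIGIN_PATTERN.search(page_source)
    if match is None:
        return None
    # Behind a chain of proxies the value can be a comma-separated list.
    return match.group(1).split(",")[0].strip()


def uses_proxy(page_source, proxy):
    """True if the reported IP equals the host part of a "host:port" proxy."""
    return reported_ip(page_source) == proxy.split(":")[0]
```

This comparison only makes sense for proxies whose exit IP equals the address you connect to. For rotating gateways, it is safer to check that the reported IP differs from your own.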
You can also add more options:
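from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service

proxy = "21.65.32.65:3124"  # placeholder proxy in ip:port form
ua = "Mozilla/5.0 (X11; Linux x86_64)"  # placeholder user-agent string

ops = Options()
# Optional flags, useful on a headless server:
# ops.add_argument('--headless')
# ops.add_argument('--no-sandbox')
# ops.add_argument('--disable-dev-shm-usage')
# ops.add_argument('--disable-gpu')
print('--proxy-server=http://%s' % proxy)
ops.add_argument('--user-agent=%s' % ua)
ops.add_argument('--proxy-server=http://%s' % proxy)

# Selenium 4 style; Selenium 3 took executable_path= and chrome_options= here
driver = webdriver.Chrome(service=Service(r"/root/chromedriver"), options=ops)
driver.delete_all_cookies()
driver.maximize_window()
driver.get("https://whatismyipaddress.com")
print(driver.page_source)
driver.quit()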
For the proxy itself, the lines that matter are:
opt.add_argument("--proxy-server=http://ip:port")
browser = webdriver.Chrome(options=opt)
Selenium proxy setting for Firefox
For Firefox, the proxy is described with a Proxy object:
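from selenium import webdriver
from selenium.webdriver.common.proxy import Proxy, ProxyType
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.firefox.service import Service

my_proxy = "21.65.32.65:3124"  # placeholder proxy in ip:port form

proxy = Proxy({
    'proxyType': ProxyType.MANUAL,
    'httpProxy': my_proxy,
    'sslProxy': my_proxy,  # needed so that https:// pages also use the proxy
    'noProxy': ''
})

# Selenium 4 style; Selenium 3 took proxy= and executable_path= directly
opts = Options()
opts.proxy = proxy
driver = webdriver.Firefox(service=Service(r"/root/geckodriver"), options=opts)
driver.delete_all_cookies()
driver.maximize_window()
driver.get("https://whatismyipaddress.com")
print(driver.page_source)
driver.quit()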
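Firefox also exposes its manual proxy settings as preferences, which is another way to configure the same thing. The helper below is a hypothetical sketch that maps an ip:port string to the relevant `network.proxy.*` preferences; it only builds the dictionary and does not start a browser.

```python
def firefox_proxy_preferences(proxy):
    """Map a "host:port" string to Firefox's manual-proxy preferences."""
    host, port = proxy.rsplit(":", 1)
    return {
        "network.proxy.type": 1,  # 1 = manual proxy configuration
        "network.proxy.http": host,
        "network.proxy.http_port": int(port),
        "network.proxy.ssl": host,
        "network.proxy.ssl_port": int(port),
    }


prefs = firefox_proxy_preferences("21.65.32.65:3124")
```

Each pair could then be applied with `set_preference(name, value)` on a Firefox `Options` object before the driver is created.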
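Selenium private proxy setting
Private proxies need authentication with a username and password. Chrome has no command-line flag for proxy credentials, so the code below packages them into a small extension that answers the proxy's authentication prompt: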
from selenium import webdriver


def create_proxyauth_extension(proxy_host, proxy_port, proxy_username, proxy_password,
                               scheme='http', plugin_path=None):
    """Proxy Auth Extension

    args:
        proxy_host (str): domain or ip address, ie proxy.domain.com
        proxy_port (int): port
        proxy_username (str): auth username
        proxy_password (str): auth password
    kwargs:
        scheme (str): proxy scheme, default http
        plugin_path (str): absolute path of the extension

    return str -> plugin_path
    """
    import string
    import zipfile

    if plugin_path is None:
        plugin_path = 'd:/webdriver/vimm_chrome_proxyauth_plugin.zip'

    # Manifest V2 extension; recent Chrome releases are phasing out
    # Manifest V2 and may refuse to load it.
    manifest_json = """
    {
        "version": "1.0.0",
        "manifest_version": 2,
        "name": "Chrome Proxy",
        "permissions": [
            "proxy",
            "tabs",
            "unlimitedStorage",
            "storage",
            "<all_urls>",
            "webRequest",
            "webRequestBlocking"
        ],
        "background": {
            "scripts": ["background.js"]
        },
        "minimum_chrome_version":"22.0.0"
    }
    """

    # Background script: route traffic through the proxy and answer its
    # authentication challenge with the supplied credentials.
    background_js = string.Template(
        """
        var config = {
            mode: "fixed_servers",
            rules: {
                singleProxy: {
                    scheme: "${scheme}",
                    host: "${host}",
                    port: parseInt(${port})
                },
                bypassList: ["foobar.com"]
            }
        };

        chrome.proxy.settings.set({value: config, scope: "regular"}, function() {});

        function callbackFn(details) {
            return {
                authCredentials: {
                    username: "${username}",
                    password: "${password}"
                }
            };
        }

        chrome.webRequest.onAuthRequired.addListener(
            callbackFn,
            {urls: ["<all_urls>"]},
            ['blocking']
        );
        """
    ).substitute(
        host=proxy_host,
        port=proxy_port,
        username=proxy_username,
        password=proxy_password,
        scheme=scheme,
    )

    # Package both files as a zip archive that Chrome can load as an extension.
    with zipfile.ZipFile(plugin_path, 'w') as zp:
        zp.writestr("manifest.json", manifest_json)
        zp.writestr("background.js", background_js)

    return plugin_path


proxyauth_plugin_path = create_proxyauth_extension(
    proxy_host="proxy.crawlera.com",
    proxy_port=8010,
    proxy_username="77409f72fe0c4a3e8413654411de0380",
    proxy_password=""
)

co = webdriver.ChromeOptions()
co.add_argument("--start-maximized")
co.add_extension(proxyauth_plugin_path)

# options= replaces the deprecated chrome_options= keyword
driver = webdriver.Chrome(options=co)
driver.get("http://httpbin.org/get")
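Providers often hand out private proxies as a single URL of the form scheme://user:password@host:port. The helper below is a hypothetical convenience, not part of the code above: it splits such a URL into the keyword arguments that `create_proxyauth_extension` expects. The example URL is a placeholder, and percent-encoded credentials are not decoded in this sketch.

```python
from urllib.parse import urlsplit


def proxy_url_to_kwargs(proxy_url):
    """Turn scheme://user:pass@host:port into create_proxyauth_extension kwargs."""
    parts = urlsplit(proxy_url)
    if not parts.hostname or not parts.port:
        raise ValueError("proxy URL needs a host and a port: %r" % proxy_url)
    return {
        "proxy_host": parts.hostname,
        "proxy_port": parts.port,
        "proxy_username": parts.username or "",
        "proxy_password": parts.password or "",
        "scheme": parts.scheme or "http",
    }


kwargs = proxy_url_to_kwargs("http://user:secret@proxy.example.com:8010")
```

The result could then be passed on as `create_proxyauth_extension(**kwargs)`.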
Conclusion
Selenium is one of the tools available for automated testing and for scraping JavaScript-heavy websites. Depending on what you need Selenium for, you might need proxies. The proxies discussed above are some of the best options available to you.