
Python Webkit With Proxy Support

I am writing a Python script to scrape a web page. I have created a WebKit WebView object and used the open method to load the URL, but I want to load the URL through a proxy instead. How can I do that?
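For context, a minimal sketch of the setup described above (assuming PyGTK and the old pywebkitgtk bindings; the URL is a placeholder), before any proxy is configured:

import gtk, webkit

w = gtk.Window()
s = gtk.ScrolledWindow()
v = webkit.WebView()   # the WebKit WebView object
s.add(v)
w.add(s)
w.show_all()

v.open('http://www.example.com')  # loads directly, with no proxy
gtk.main()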

Solution 1:

Try the code snippet below (reference from url):

import gtk, webkit
import ctypes

# Load the underlying C libraries so the SoupSession proxy can be set,
# since pywebkitgtk does not expose it directly.
libgobject = ctypes.CDLL('/usr/lib/libgobject-2.0.so.0')
libsoup = ctypes.CDLL('/usr/lib/libsoup-2.4.so.1')
libwebkit = ctypes.CDLL('/usr/lib/libwebkit-1.0.so')

# Treat the returned pointers as void* so they are not truncated on 64-bit.
libsoup.soup_uri_new.restype = ctypes.c_void_p
libwebkit.webkit_get_default_session.restype = ctypes.c_void_p

proxy_uri = libsoup.soup_uri_new('http://127.0.0.1:8000')  # set your proxy url

# Apply the proxy to WebKit's default SoupSession.
session = libwebkit.webkit_get_default_session()
libgobject.g_object_set(ctypes.c_void_p(session), "proxy-uri", ctypes.c_void_p(proxy_uri), None)

w = gtk.Window()
s = gtk.ScrolledWindow()
v = webkit.WebView()
s.add(v)
w.add(s)
w.show_all()

v.open('http://www.google.com')
gtk.main()

Hope it helps.


Solution 2:

You can set an application-wide proxy with QNetworkProxy.setApplicationProxy if you're on PyQt (see the sketch after the snippet below), or use this snippet if you're using PyGI:

from gi.repository import WebKit
from gi.repository import Soup

proxy_uri = Soup.URI.new("http://127.0.0.1:8080")  # set your proxy url

# Apply the proxy to WebKit's default SoupSession.
session = WebKit.get_default_session()
session.set_property("proxy-uri", proxy_uri)
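
For the PyQt route, a minimal sketch (assuming PyQt4 with QtWebKit; the proxy host, port, and target URL are placeholders):

import sys
from PyQt4.QtCore import QUrl
from PyQt4.QtGui import QApplication
from PyQt4.QtNetwork import QNetworkProxy
from PyQt4.QtWebKit import QWebView

app = QApplication(sys.argv)

# Register an application-wide HTTP proxy before any requests are made.
proxy = QNetworkProxy(QNetworkProxy.HttpProxy, "127.0.0.1", 8080)
QNetworkProxy.setApplicationProxy(proxy)

view = QWebView()
view.load(QUrl("http://www.google.com"))
view.show()

sys.exit(app.exec_())

Every QNetworkAccessManager in the process (including the one QWebView uses) then routes its requests through that proxy.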

References:
PyGI
PyQt


Solution 3:

How about a solution that's already made?

PyPhantomJS is a minimalistic, headless, WebKit-based, JavaScript-driven tool. It is written in PyQt4 and Python. It runs on Linux, Windows, and Mac OS X.

It gives you access to a full headless WebKit browser, controllable via scripts written in JavaScript, with the ability to do various things, among which are screen scraping* and proxy support. It is driven from the command line.

You can see the API here.

* When I say screen scraping, I mean you can either scrape page content or save page renders to a file. There's even a screen-scraping JS library already written here.

