Setting up a proxy with Python requests

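I'm trying to write a web server (a proxy?) so that I can make a request to http://localhost:8080/foo/bar and have it transparently return the response from https://www.gyford.com/foo/bar.

The Python script below works for web pages themselves, but fails to return some types of files (e.g. https://www.gyford.com/static/hines/js/site-340675b4c7.min.js). If I request that file manually while this server is running, like this: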

import requests
r = requests.get('http://localhost:8080/static/hines/js/site-340675b4c7.min.js')

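then I get:

('Received response with content-encoding: gzip, but failed to decode it.', error('Error -3 while decompressing data: incorrect header check',))

So I guess I need to handle gzipped files differently, but I can't figure out how.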

from http.server import HTTPServer, BaseHTTPRequestHandler
import requests

HOST_NAME = 'localhost'
PORT_NUMBER = 8080
TARGET_DOMAIN = 'www.gyford.com'

class MyHandler(BaseHTTPRequestHandler):

    def do_GET(self):
        host_domain = '{}:{}'.format(HOST_NAME, PORT_NUMBER)

        host = self.headers.get('Host').replace(host_domain, TARGET_DOMAIN)

        url = ''.join(['https://', host, self.path])

        r = requests.get(url)

        self.send_response(r.status_code)

        for k,v in r.headers.items():
            self.send_header(k, v)

        self.end_headers()

        self.wfile.write( bytes(r.text, 'UTF-8') )

if __name__ == '__main__':
    server_class = HTTPServer
    httpd = server_class((HOST_NAME, PORT_NUMBER), MyHandler)
    try:
        httpd.serve_forever()
    except KeyboardInterrupt:
        pass
    httpd.server_close()

EDIT: Here is the output of print(r.headers):

{'Connection': 'keep-alive', 'Server': 'gunicorn/19.7.1', 'Date': 'Wed, 26 Sep 2018 13:43:43 GMT', 'Content-Type': 'application/javascript; charset="utf-8"', 'Cache-Control': 'max-age=60, public', 'Access-Control-Allow-Origin': '*', 'Vary': 'Accept-Encoding', 'Last-Modified': 'Thu, 20 Sep 2018 16:11:29 GMT', 'Etag': '"5ba3c6b1-6be"', 'Content-Length': '771', 'Content-Encoding': 'gzip', 'Via': '1.1 vegur'}

1 Answer

Question: I need to handle gzipped files differently.

I wonder how this worked for the web page itself, but I assume some magic handling by the browser.

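What you are doing: r = requests.get(url)
You fetch the URL content, and requests automatically decodes the gzip and deflate content encodings. self.wfile.write(bytes(r.text, 'UTF-8'))
You then write the decoded r.text, re-encoded as bytes, which no longer matches the Content-Encoding: gzip header you forward to the client.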
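
The mismatch can be reproduced offline. The following is a minimal sketch using only the standard library, with no network access; the sample payload and the `client_decode` helper are hypothetical stand-ins for the upstream file and for a client that trusts the `Content-Encoding: gzip` header.

```python
import gzip
import zlib

original = b"console.log('hello');"      # made-up sample payload
on_the_wire = gzip.compress(original)     # what the upstream server sends

# requests decompresses the body for you, so r.text holds plain text.
decoded_text = gzip.decompress(on_the_wire).decode('utf-8')

# The handler in the question writes this out, still labelled as gzip.
body_sent_by_proxy = bytes(decoded_text, 'UTF-8')


def client_decode(body):
    """Hypothetical client that trusts Content-Encoding: gzip."""
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    return zlib.decompress(body, 16 + zlib.MAX_WBITS)


failure = None
try:
    client_decode(body_sent_by_proxy)
except zlib.error as exc:
    failure = exc                         # plain text is not a gzip stream

# The untouched compressed bytes, by contrast, decode without trouble.
round_trip = client_decode(on_the_wire)
```

Here `failure` ends up holding a `zlib.error`, while `round_trip` equals the original payload, which is why forwarding the untouched bytes is the safer option.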

Change the following:
Read and write the body as a raw stream of bytes; this does not transform the response content.
You can also use it for other data, such as "html" requests.
```
r = requests.get(url, stream=True)
    ...
    self.wfile.write(r.raw.read())
```
Note from docs.python-requests.org:
Read the chapter on Raw Response Content.
If you want to transfer very large data, you have to read it in chunks.
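
Reading in chunks could look roughly like the sketch below. It is not a definitive implementation: `copy_stream`, `StreamingHandler`, `CHUNK_SIZE` and `SKIP_HEADERS` are names invented for illustration, and dropping the hop-by-hop headers is an extra assumption on top of the answer, made because the raw body comes back already de-chunked and the connection to the client is closed after each response.

```python
import io
from http.server import BaseHTTPRequestHandler

TARGET_DOMAIN = 'www.gyford.com'   # taken from the question
CHUNK_SIZE = 64 * 1024             # assumed chunk size, tune as needed

# Assumption: these describe the upstream connection, not the one to our client.
SKIP_HEADERS = {'connection', 'keep-alive', 'transfer-encoding'}


def copy_stream(src, dst, chunk_size=CHUNK_SIZE):
    """Copy src to dst in fixed-size chunks; return the number of bytes copied."""
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        total += len(chunk)
    return total


class StreamingHandler(BaseHTTPRequestHandler):
    """Hypothetical variant of MyHandler that forwards the raw bytes."""

    def do_GET(self):
        import requests  # third-party; imported here so the helper stays stdlib-only

        url = 'https://{}{}'.format(TARGET_DOMAIN, self.path)
        r = requests.get(url, stream=True)   # stream=True defers reading the body

        self.send_response(r.status_code)
        for k, v in r.headers.items():
            if k.lower() not in SKIP_HEADERS:
                self.send_header(k, v)
        self.end_headers()

        # r.raw yields the still-compressed bytes, matching Content-Encoding.
        copy_stream(r.raw, self.wfile)


# Quick offline check of the helper with in-memory streams.
source = io.BytesIO(b'\x1f\x8b' + b'\x00' * 100000)
sink = io.BytesIO()
copied = copy_stream(source, sink)
```

To try it against the real site, `StreamingHandler` can be passed to `HTTPServer` in place of `MyHandler` from the question.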

Note: These are the default headers that python-requests sends. An 'Accept-Encoding': 'gzip, deflate' header is already present, so no action is needed on the client side.
{'headers': {'Accept': '*/*',
'User-Agent': 'python-requests/2.11.1',
'Accept-Encoding': 'gzip, deflate',
'Connection': 'close',
'Host': 'httpbin.org'}
}
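
Because requests asks for gzip by default, the upstream body will usually arrive compressed. If you would rather forward the already-decoded body (r.content, as bytes) instead of the raw stream, the headers have to be adjusted to match. This is an alternative to the approach above rather than part of it; `headers_for_decoded_body` is a hypothetical helper and the sample values are only illustrative.

```python
def headers_for_decoded_body(upstream_headers, body):
    """Return headers that describe `body` after it has been decompressed."""
    dropped = {'content-encoding', 'content-length', 'transfer-encoding'}
    headers = {k: v for k, v in upstream_headers.items()
               if k.lower() not in dropped}
    # The decoded body has a different length than the compressed one.
    headers['Content-Length'] = str(len(body))
    return headers


# Illustrative values modelled on the headers shown in the question.
upstream = {
    'Content-Type': 'application/javascript; charset="utf-8"',
    'Content-Length': '771',
    'Content-Encoding': 'gzip',
}
body = b"console.log('hi');"   # stands in for r.content
fixed = headers_for_decoded_body(upstream, body)
```

The trade-off is that the proxy then sends uncompressed data to its own client, which costs bandwidth but keeps the headers and body consistent.
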
Answered on 2024-04-27T23:50:40+08:00
