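The HTTP protocol does not prescribe an encoding for data submitted via POST. The server reads the Content-Type field of the request headers to determine the encoding, then parses the body accordingly. Common encodings include:
- application/x-www-form-urlencoded # submits data as an HTML form; the most common and familiar
- application/json # submits data as a JSON string
- multipart/form-data # uploads files
Below, requests is used to send a POST request in each of these three encodings.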
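As a quick offline check of the three encodings listed above, the following minimal sketch prepares one request of each kind without sending it and reads back the Content-Type that requests would attach. The URL, the field names, and the helper content_type_for are assumptions made for illustration only.

```python
import requests

# Hypothetical endpoint; nothing is sent because the requests are only prepared.
URL = "http://example.invalid/post"


def content_type_for(**kwargs):
    """Prepare a POST request offline and return its Content-Type header."""
    prepared = requests.Request("POST", URL, **kwargs).prepare()
    return prepared.headers.get("Content-Type")


form_type = content_type_for(data={"wd": "python"})
json_type = content_type_for(json={"wd": "python"})
file_type = content_type_for(files={"file": ("a.txt", b"hello")})

print(form_type)  # application/x-www-form-urlencoded
print(json_type)  # application/json
print(file_type)  # multipart/form-data; boundary=<random value>
```

Which keyword argument is used (data, json, or files) is what selects the encoding; the header does not have to be set by hand in these three cases.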
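1. Submitting a form
Form submission with requests is typically used for website logins, to post a username and password. Taking /post as an example: to send a POST request as a form, build the request parameters as a dictionary and pass it to the data parameter of requests.post(). (The site echoes back the content of the submitted request; the "Content-Type": "application/x-www-form-urlencoded" in the output confirms that the data was submitted as a form.) The code is as follows: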
# -*- coding: utf-8 -*-
import requests


def get_html(url, key_value, retry=2):
    """POST key_value as form data; retry up to `retry` times on failure."""
    try:
        # headers is the module-level dict defined in the main block below
        r = requests.post(url=url, headers=headers, data=key_value, timeout=5)
    except Exception as e:
        print(e)
        if retry > 0:
            # Pass key_value along and return the result of the retry
            return get_html(url, key_value, retry - 1)
        return None
    else:
        return r.text


if __name__ == "__main__":
    # Custom request headers
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36',
    }
    url = '/post'
    kw = {'wd': ''}
    html = get_html(url, kw)
    print(html)
Output:
{
  "args": {},
  "data": "",
  "files": {},
  "form": {
    "wd": ""
  },
  "headers": {
    "Accept": "*/*",
    "Accept-Encoding": "gzip, deflate",
    "Content-Length": "19",
    "Content-Type": "application/x-www-form-urlencoded",
    "Host": "",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36"
  },
  "json": null,
  "origin": "223.72.81.198, 223.72.81.198",
  "url": "/post"
}
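To see what the form encoding does to the dictionary, the sketch below prepares a request offline and prints the encoded body and the Content-Length that requests computes for it. The endpoint and the field values are made up for illustration and are not the ones used in the run above.

```python
import requests

# Hypothetical endpoint and form fields, used only to build the request offline.
prepared = requests.Request(
    "POST",
    "http://example.invalid/post",
    data={"wd": "python requests", "page": 1},
).prepare()

form_body = prepared.body
form_length = prepared.headers["Content-Length"]

print(form_body)    # wd=python+requests&page=1
print(form_length)  # 25
```

Each key/value pair becomes key=value, pairs are joined with &, and the space is encoded as +, which is the same format a browser produces for an HTML form.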
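2. Submitting a JSON string
Submitting a JSON string (shown as the request payload when capturing traffic in a browser) is mainly used in AJAX requests that load data dynamically.
You can encode a dict with json.dumps() and pass the result to the data parameter, or pass the dict directly via the json parameter, in which case it is encoded automatically and the Content-Type does not need to be declared explicitly in the request headers. The json parameter was added in version 2.4.2. The code is as follows: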
# -*- coding: utf-8 -*-
import requests
import json


def get_html(url, key_value, retry=2):
    """POST an already-serialized JSON string through the data parameter."""
    try:
        r = requests.post(url=url, headers=headers, data=key_value, timeout=5)
    except Exception as e:
        print(e)
        if retry > 0:
            # Pass key_value along and return the result of the retry
            return get_html(url, key_value, retry - 1)
        return None
    else:
        return r.text


def get_html_json(url, key_value, retry=2):
    """POST a dict through the json parameter; requests serializes it."""
    try:
        r = requests.post(url=url, headers=headers, json=key_value, timeout=5)
    except Exception as e:
        print(e)
        if retry > 0:
            return get_html_json(url, key_value, retry - 1)
        return None
    else:
        return r.text


if __name__ == "__main__":
    # Custom request headers
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36',
        'Content-Type': 'application/json; charset=UTF-8',
    }
    url = '/xxx/xxx'
    kw = {'domain': ''}
    # Serialize with json.dumps and send through data
    html = get_html(url, json.dumps(kw))
    # Pass the dict through the json parameter
    html_json = get_html_json(url, kw)
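The practical difference between the two approaches can be checked offline. In this minimal sketch (the endpoint and payload are hypothetical), a string passed through data gets no Content-Type at all, so the header has to be set by hand, while the json parameter adds it automatically.

```python
import json
import requests

URL = "http://example.invalid/xxx"   # hypothetical endpoint, never contacted
payload = {"domain": "example.com"}  # hypothetical payload

# Option 1: serialize manually; requests treats the string as an opaque body.
manual = requests.Request("POST", URL, data=json.dumps(payload)).prepare()

# Option 2: let requests serialize the dict and label the body.
auto = requests.Request("POST", URL, json=payload).prepare()

manual_type = manual.headers.get("Content-Type")
auto_type = auto.headers.get("Content-Type")

print(manual_type)  # None
print(auto_type)    # application/json
```

Both bodies decode to the same JSON document; only the header handling differs. If a server rejects the first variant, a missing Content-Type header is a likely cause.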
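3. Uploading a file
File uploads are rarely needed in web scraping. The Content-Type is multipart/form-data; to send a multipart POST request, pass a file to the files parameter of requests.post(). Again taking /post as an example, the code is as follows: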
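# -*- coding: utf-8 -*-
import requests


def get_html(url, files, retry=2):
    """POST files as multipart/form-data; retry up to `retry` times on failure."""
    try:
        # Use the files parameter (not data) so the body is sent as multipart
        r = requests.post(url=url, headers=headers, files=files, timeout=5)
    except Exception as e:
        print(e)
        if retry > 0:
            # Rewind the file objects so the retry sends the full content
            for f in files.values():
                f.seek(0)
            return get_html(url, files, retry - 1)
        return None
    else:
        return r.text


if __name__ == "__main__":
    # Custom request headers
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36',
    }
    url = '/post'
    # Open in binary mode and close the file when done
    with open('ajax.png', 'rb') as f:
        files = {'file': f}
        html = get_html(url, files)
    print(html)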
Output:
{
  "args": {},
  "data": "",
  "files": {
    "file": "data:application/octet-stream;base64,...too long, omitted..."
  },
  "form": {},
  "headers": {
    "Accept": "*/*",
    "Accept-Encoding": "gzip, deflate",
    "Content-Length": "68870",
    "Content-Type": "multipart/form-data; boundary=66f5b203f18f79960ac438c59af481b0",
    "Host": "",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36"
  },
  "json": null,
  "origin": "223.72.72.67, 223.72.72.67",
  "url": "/post"
}
Warning
Open files in binary mode. Requests may try to provide the Content-Length header for you, and when it does, the value is set to the number of bytes in the file. Opening the file in text mode may cause errors.
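The following sketch illustrates the byte-count point offline. It uses an in-memory binary stream as a stand-in for open('ajax.png', 'rb'), with made-up bytes and a hypothetical endpoint, and also shows the tuple form of the files parameter, which lets you set the filename and content type explicitly.

```python
import io
import requests

# Stand-in for open('ajax.png', 'rb'): an in-memory binary stream with made-up bytes.
fake_png = io.BytesIO(b"\x89PNG\r\n\x1a\n" + b"\x00" * 16)

prepared = requests.Request(
    "POST",
    "http://example.invalid/post",  # hypothetical endpoint, never contacted
    # (filename, file object, content type)
    files={"file": ("ajax.png", fake_png, "image/png")},
).prepare()

upload_type = prepared.headers["Content-Type"]
upload_length = int(prepared.headers["Content-Length"])

print(upload_type.split(";")[0])            # multipart/form-data
print(upload_length == len(prepared.body))  # True
```

Because the stream yields bytes, the Content-Length that requests computes matches the size of the multipart body exactly.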