PHP cURL Scraping: Garbled Output and Empty Results

Date: 2021-07-19 17:48:33

Backend Development | PHP Tutorials

The PHP script is saved in GB2312 encoding:

<?php

$url = "";   // GB2312-encoded page (URL missing in the source)
//$url = ""; // GB2312-encoded page
//$url = ""; // GB2312-encoded page

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the response as a string instead of printing it
curl_setopt($ch, CURLOPT_TIMEOUT, 1);           // options only take effect when set before curl_exec()
$ret = curl_exec($ch);
curl_close($ch);

echo $ret;

?>

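Scraping one site works fine, but scraping another returns an empty result, and scraping a third returns garbled text.

What is going on, and how can I fix it? Can anyone help? Thanks in advance! I don't have many points left to offer, sorry.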

Replies (Solutions)

NetEase restricts this kind of access, so the page cannot be scraped. Sohu may restrict it too.

fopen or file_get_contents works, but file_get_contents is prone to timing out, and when it does the script stops executing.
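One way to keep file_get_contents from hanging is to pass it a stream context with an explicit timeout. The sketch below is only an illustration and is not code from the thread: the helper name fetch_with_timeout, the 5-second default, and the User-Agent string are all assumptions.

```php
<?php
// Hypothetical helper (not from the thread): fetch a URL with
// file_get_contents() while bounding how long the read may take.
function fetch_with_timeout($url, $timeoutSeconds = 5.0, $userAgent = 'Mozilla/5.0 (compatible; ExampleFetcher/1.0)')
{
    $context = stream_context_create(array(
        'http' => array(
            'method'     => 'GET',
            'timeout'    => $timeoutSeconds, // read timeout in seconds
            'user_agent' => $userAgent,      // sent as the User-Agent header
        ),
    ));

    // @ silences the warning raised on failure; callers test for false instead.
    return @file_get_contents($url, false, $context);
}
```

A caller would compare the result against false (with ===) before using it, so a timeout becomes an ordinary error branch rather than a stalled script.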

Never mind the rest, I'm just here for the points. OP, remember to award them all.

// PHP 5.4+ ships gzdecode() in the zlib extension, so this fallback is only
// defined when the built-in is missing (it must appear before the calls below).
if (!function_exists('gzdecode')) {
    function gzdecode($data) {
        $len = strlen($data);
        if ($len < 18 || strcmp(substr($data, 0, 2), "\x1f\x8b")) {
            return null; // Not GZIP format (see RFC 1952)
        }
        $method = ord(substr($data, 2, 1)); // Compression method
        $flags  = ord(substr($data, 3, 1)); // Flags
        if (($flags & 31) != $flags) {
            // Reserved bits are set -- NOT ALLOWED by RFC 1952
            return null;
        }
        // NOTE: $mtime may be negative (PHP integer limitations)
        $mtime = unpack("V", substr($data, 4, 4));
        $mtime = $mtime[1];
        $xfl = substr($data, 8, 1);
        $os  = substr($data, 9, 1);
        $headerlen = 10;
        $extralen = 0;
        $extra = "";
        if ($flags & 4) {
            // 2-byte length-prefixed EXTRA data in header
            if ($len - $headerlen - 2 < 8) {
                return false; // Invalid format
            }
            $extralen = unpack("v", substr($data, $headerlen, 2));
            $extralen = $extralen[1];
            if ($len - $headerlen - 2 - $extralen < 8) {
                return false; // Invalid format
            }
            $extra = substr($data, $headerlen + 2, $extralen);
            $headerlen += 2 + $extralen;
        }
        $filenamelen = 0;
        $filename = "";
        if ($flags & 8) {
            // C-style string file NAME data in header
            if ($len - $headerlen - 1 < 8) {
                return false; // Invalid format
            }
            $filenamelen = strpos(substr($data, $headerlen), chr(0));
            if ($filenamelen === false || $len - $headerlen - $filenamelen - 1 < 8) {
                return false; // Invalid format
            }
            $filename = substr($data, $headerlen, $filenamelen);
            $headerlen += $filenamelen + 1;
        }
        $commentlen = 0;
        $comment = "";
        if ($flags & 16) {
            // C-style string COMMENT data in header
            if ($len - $headerlen - 1 < 8) {
                return false; // Invalid format
            }
            $commentlen = strpos(substr($data, $headerlen), chr(0));
            if ($commentlen === false || $len - $headerlen - $commentlen - 1 < 8) {
                return false; // Invalid header format
            }
            $comment = substr($data, $headerlen, $commentlen);
            $headerlen += $commentlen + 1;
        }
        $headercrc = "";
        if ($flags & 2) {
            // FHCRC: 2 bytes (lowest order) of the header's CRC32 are present
            if ($len - $headerlen - 2 < 8) {
                return false; // Invalid format
            }
            $calccrc = crc32(substr($data, 0, $headerlen)) & 0xffff;
            $headercrc = unpack("v", substr($data, $headerlen, 2));
            $headercrc = $headercrc[1];
            if ($headercrc != $calccrc) {
                return false; // Bad header CRC
            }
            $headerlen += 2;
        }
        // GZIP FOOTER - these may be negative due to PHP's integer limitations
        $datacrc = unpack("V", substr($data, -8, 4));
        $datacrc = $datacrc[1];
        $isize = unpack("V", substr($data, -4));
        $isize = $isize[1];
        // Perform the decompression:
        $bodylen = $len - $headerlen - 8;
        if ($bodylen < 0) {
            return false; // Invalid format
        }
        $body = substr($data, $headerlen, $bodylen);
        $data = "";
        if ($bodylen > 0) {
            switch ($method) {
                case 8:
                    // Currently the only supported compression method:
                    $data = gzinflate($body);
                    break;
                default:
                    // Unknown compression method
                    return false;
            }
        } else {
            // I'm not sure if zero-byte body content is allowed.
            // Allow it for now... Do nothing...
        }
        // Verify decompressed size and CRC32:
        // NOTE: This may fail with large data sizes depending on how
        // PHP's integer limitations affect strlen(), since $isize
        // may be negative for large sizes.
        if ($isize != strlen($data) || crc32($data) != $datacrc) {
            // Bad format! Length or CRC doesn't match!
            return false;
        }
        return $data;
    }
}

// First snippet: send a browser-like User-Agent (URL missing in the source).
$curl = curl_init('');
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SV1; .NET CLR 1.1.4322)');
$html = curl_exec($curl);
var_dump($html);

// Second snippet: this response is gzip-compressed, so decompress it before use
// (URL missing in the source).
$curl = curl_init('');
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SV1; .NET CLR 1.1.4322)');
$html = curl_exec($curl);
//$html = strstr($html, '<');
$html = gzdecode($html);
var_dump($html);
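On current PHP versions the hand-written decoder above is rarely needed: the zlib extension has provided gzdecode() since PHP 5.4, and cURL can decompress the response itself when CURLOPT_ENCODING is set. The sketch below shows both routes under stated assumptions: fetch_decoded and maybe_gunzip are hypothetical names, and the 10-second timeout is an arbitrary choice.

```php
<?php
// Hypothetical helper: let cURL negotiate gzip/deflate and undo it itself.
function fetch_decoded($url, $userAgent)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, array(
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_USERAGENT      => $userAgent,
        CURLOPT_ENCODING       => '', // empty string: advertise every encoding cURL supports and decode the reply
        CURLOPT_TIMEOUT        => 10, // assumption: 10 seconds; set before curl_exec()
    ));

    // Returns the decoded body, or false on failure.
    // The handle is released when $ch goes out of scope.
    return curl_exec($ch);
}

// Hypothetical helper: decompress only when the data starts with the
// gzip magic bytes 0x1f 0x8b; anything else is returned unchanged.
function maybe_gunzip($data)
{
    if (strlen($data) >= 2 && substr($data, 0, 2) === "\x1f\x8b") {
        $decoded = @gzdecode($data); // built-in since PHP 5.4 (zlib extension)
        return $decoded === false ? $data : $decoded;
    }
    return $data;
}
```

With CURLOPT_ENCODING set, the body returned by curl_exec() is already decompressed, so maybe_gunzip is only useful for data obtained some other way.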

Many thanks, young5335. You get all the points. Sadly that is all I have; I couldn't give more even if I wanted to.

curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SV1; .NET CLR 1.1.4322)');

Out of all that code, this one line was the most useful, and it solved the problem.
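For anyone debugging a similar case, it can help to record why a request came back empty rather than only echoing the body. The sketch below is an assumption-laden illustration, not the poster's final code: fetch_page and convert_charset are hypothetical names, and the charset conversion is included only because a charset mismatch is another common cause of garbled text, which the thread does not confirm was involved here.

```php
<?php
// Hypothetical helper: fetch a page with a browser-like User-Agent and keep
// the diagnostics that explain an empty result.
function fetch_page($url, $userAgent)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, array(
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_USERAGENT      => $userAgent,
        CURLOPT_FOLLOWLOCATION => true, // follow redirects
        CURLOPT_TIMEOUT        => 10,   // assumption: 10 seconds
    ));
    $body = curl_exec($ch);

    return array(
        'body'      => $body === false ? '' : $body,
        'http_code' => curl_getinfo($ch, CURLINFO_HTTP_CODE), // 0 if no response arrived
        'error'     => curl_error($ch),                       // empty string when no error occurred
    );
}

// Hypothetical helper: convert text from one charset to another, using
// mbstring when it is loaded and iconv otherwise.
function convert_charset($text, $from, $to)
{
    if (function_exists('mb_convert_encoding')) {
        return mb_convert_encoding($text, $to, $from);
    }
    return iconv($from, $to, $text);
}
```

A script saved in a GB2312-compatible encoding that fetches a UTF-8 page could, for instance, call convert_charset($page['body'], 'UTF-8', 'GBK') before echoing, and inspect $page['http_code'] and $page['error'] whenever the body is empty.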
