Using a custom filter in an HttpModule to capture or modify the response output of every request on a site

Date: 2023-08-28 21:47:16

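The project required recording the raw data of every request made to the site, including the following items:

1. The client's IP address

2. The page path requested by the client

3. The request headers sent by the client

4. The body content returned by the server.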

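Analyzing this before writing any code, the first three items looked easy. To capture the body returned by the server, I planned to use the Output and OutputStream properties of the HttpResponse object.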
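As a minimal sketch of the first three items, the helper below formats the client IP, the requested path and the request headers into one log entry. The class name RequestLogFormatter and the entry layout are assumptions made for illustration; they are not part of the original project.

```csharp
using System;
using System.Collections.Specialized;
using System.Text;

public static class RequestLogFormatter
{
    // Builds one log entry from the client IP, the requested path and the request headers.
    public static string Format(string clientIp, string path, NameValueCollection headers)
    {
        StringBuilder sb = new StringBuilder();
        sb.AppendLine("IP: " + clientIp);
        sb.AppendLine("Path: " + path);
        foreach (string name in headers.AllKeys)
        {
            sb.AppendLine(name + ": " + headers[name]);
        }
        return sb.ToString();
    }
}

// Inside an HttpModule or a page (System.Web), the inputs would come from HttpRequest:
//   HttpRequest request = application.Request;
//   string entry = RequestLogFormatter.Format(
//       request.UserHostAddress, request.Path, request.Headers);
```

Note that behind a proxy or load balancer UserHostAddress reports the address of the proxy, so the value may need to be combined with whatever forwarding header the deployment uses.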

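During actual coding, however, it turned out that Output and OutputStream do not let you simply read the data back out as I had imagined. I spent nearly two days trying everything I could think of, but could only append content, never read it.

Searching online, I learned that for an HttpResponse object the only way to modify the content about to be output is through a filter.

The filter must derive from the Stream class and override its abstract members. Evidently my earlier attempts to pull the data out through HttpWriter, TextWriter, Stream and HttpStream were entirely misguided.

Now I was confident that the server's response could be captured, so let's get to it!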

1. First, build a simple filter.

The code is as follows:

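using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

/// <summary>
/// Event arguments carrying the raw data, so that the complete output
/// can be passed along by an event once it has been captured.
/// </summary>
public class RawDataEventArgs : EventArgs
{
    private string sourceCode;

    public RawDataEventArgs(string sourceCode)
    {
        this.sourceCode = sourceCode;
    }

    public string SourceCode
    {
        get { return sourceCode; }
        set { sourceCode = value; }
    }
}

// The custom filter
public class RawFilter : Stream
{
    // Matches the end-of-page marker </html>
    private static readonly Regex Eof = new Regex("</html>", RegexOptions.IgnoreCase);

    private readonly Stream responseStream;
    private readonly StringBuilder responseHtml = new StringBuilder();
    // A stateful decoder keeps multi-byte characters intact when they are split across Write calls
    private readonly Decoder decoder = Encoding.UTF8.GetDecoder();
    private long position;

    /// <summary>
    /// Raised once the raw data has been captured.
    /// </summary>
    public event EventHandler<RawDataEventArgs> OnRawDataRecordedEvent;

    public RawFilter(Stream inputStream)
    {
        responseStream = inputStream;
    }

    // Stream members
    #region Filter Overrides

    public override bool CanRead { get { return true; } }

    public override bool CanSeek { get { return true; } }

    public override bool CanWrite { get { return true; } }

    public override void Close()
    {
        // If </html> never arrived, pass on whatever is still buffered so that no output is lost
        SendBufferedContent();
        responseStream.Close();
    }

    public override void Flush()
    {
        responseStream.Flush();
    }

    public override long Length { get { return 0; } }

    public override long Position
    {
        get { return position; }
        set { position = value; }
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        return responseStream.Read(buffer, offset, count);
    }

    public override long Seek(long offset, SeekOrigin origin)
    {
        return responseStream.Seek(offset, origin);
    }

    public override void SetLength(long length)
    {
        responseStream.SetLength(length);
    }

    #endregion

    // The key point: whenever HttpResponse outputs content it calls this method,
    // so this is where the data has to be captured.
    public override void Write(byte[] buffer, int offset, int count)
    {
        char[] chars = new char[decoder.GetCharCount(buffer, offset, count)];
        int charCount = decoder.GetChars(buffer, offset, count, chars, 0);
        string strBuffer = new string(chars, 0, charCount);
        responseHtml.Append(strBuffer);

        // Use a regular expression to check whether the end-of-page marker </html> has arrived.
        // Until it does, the page is not complete and the content simply keeps accumulating.
        if (Eof.IsMatch(strBuffer))
        {
            // The page output is complete: hand over the captured content
            SendBufferedContent();
        }
    }

    private void SendBufferedContent()
    {
        if (responseHtml.Length == 0)
        {
            return;
        }

        string finalHtml = responseHtml.ToString();
        responseHtml.Length = 0;

        // Raise the data-captured event (only if someone has subscribed)
        EventHandler<RawDataEventArgs> handler = OnRawDataRecordedEvent;
        if (handler != null)
        {
            handler(this, new RawDataEventArgs(finalHtml));
        }

        // Pass the content on to the response stream
        byte[] data = Encoding.UTF8.GetBytes(finalHtml);
        responseStream.Write(data, 0, data.Length);
    }
}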
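One detail worth checking in any filter that decodes text inside Write: a response may arrive in several Write calls, and a multi-byte UTF-8 character (common in Chinese pages) can straddle two of them. The sketch below compares decoding each chunk on its own with decoding through a single stateful Decoder. ChunkDecodingDemo and its method names are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Text;

public static class ChunkDecodingDemo
{
    // Decodes each chunk independently; a character split across chunks is damaged.
    public static string DecodeStateless(byte[][] chunks)
    {
        StringBuilder sb = new StringBuilder();
        foreach (byte[] chunk in chunks)
        {
            sb.Append(Encoding.UTF8.GetString(chunk, 0, chunk.Length));
        }
        return sb.ToString();
    }

    // Uses one Decoder for all chunks; it remembers incomplete byte sequences between calls.
    public static string DecodeStateful(byte[][] chunks)
    {
        Decoder decoder = Encoding.UTF8.GetDecoder();
        StringBuilder sb = new StringBuilder();
        foreach (byte[] chunk in chunks)
        {
            char[] chars = new char[decoder.GetCharCount(chunk, 0, chunk.Length)];
            int charCount = decoder.GetChars(chunk, 0, chunk.Length, chars, 0);
            sb.Append(chars, 0, charCount);
        }
        return sb.ToString();
    }
}
```

Feeding both methods the three UTF-8 bytes of one Chinese character split as two bytes plus one byte shows the difference: only the stateful version returns the original character.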

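That completes the filter definition. Next, the filter has to be installed on the HttpResponse object.

To capture the output of every aspx page on the site, we can define an HttpModule to do this.

The code is as follows: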

using System;
using System.IO;
using System.Web;

public class HttpRawDataModule : IHttpModule
{
    #region IHttpModule Members

    public void Dispose()
    {
    }

    public void Init(HttpApplication context)
    {
        // Subscribe to the event; the filter is installed after the handler has finished executing
        context.ReleaseRequestState += new EventHandler(context_ReleaseRequestState);
    }

    #endregion

    /// <summary>
    /// Processing of this HTTP request by its handler has finished.
    /// </summary>
    /// <param name="sender"></param>
    /// <param name="e"></param>
    void context_ReleaseRequestState(object sender, EventArgs e)
    {
        HttpApplication application = (HttpApplication)sender;

        // Only intercept ASPX pages. In testing, images on the site tended to show up as
        // a broken-image X over WAP without this check (likely because the filter treats
        // every response as UTF-8 HTML text).
        string extension = Path.GetExtension(application.Request.CurrentExecutionFilePath);
        if (string.Equals(extension, ".aspx", StringComparison.OrdinalIgnoreCase))
        {
            // Create the filter around the existing one
            RawFilter filter = new RawFilter(application.Response.Filter);

            // Subscribe to the filter event
            filter.OnRawDataRecordedEvent += new EventHandler<RawDataEventArgs>(filter_OnRawDataRecordedEvent);

            // Install the filter
            application.Response.Filter = filter;
        }
    }

    /// <summary>
    /// Once the raw data has been captured, store it in the database.
    /// </summary>
    /// <param name="sender"></param>
    /// <param name="e"></param>
    void filter_OnRawDataRecordedEvent(object sender, RawDataEventArgs e)
    {
        string allcode = e.SourceCode;
        WapSite.SiteDataClass wapdata = new WapSite.SiteDataClass();
        wapdata.WriteRawDataLog(allcode);
    }
}

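The HttpModule is ready and installs the filter. What remains is to configure the httpModules section in the configuration file, so that the custom HttpModule is added to the HTTP processing pipeline.

Add the following section to Web.config: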

<system.web>
  <httpModules>
    <add name="RawDataModule" type="HttpRawDataModule"/>
  </httpModules>
</system.web>
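The httpModules section under system.web is read by IIS 6 and by the classic pipeline mode. Assuming the site instead runs on IIS 7 or later in integrated pipeline mode (the original post does not say which), the same module would be registered under system.webServer, roughly like this:

```xml
<configuration>
  <system.webServer>
    <modules>
      <add name="RawDataModule" type="HttpRawDataModule" />
    </modules>
  </system.webServer>
</configuration>
```

If the module is compiled into its own assembly rather than placed in App_Code, the type attribute also needs the namespace and assembly name.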

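The test succeeded: the HTML content that the server sends to the client is captured accurately.

Inside the filter, the content about to be output can also be processed as a string in any way you like.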
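To illustrate modifying the output rather than just recording it, here is a minimal sketch of a filter that buffers the whole response as bytes, applies a string transformation once when the stream is closed, and forwards the result. This is a different technique from waiting for the closing html tag: TransformFilter is a hypothetical name, UTF-8 output is assumed, and the entire page is held in memory until the end.

```csharp
using System;
using System.IO;
using System.Text;

// Buffers everything written to it, applies a string transformation once on Close,
// then forwards the result to the wrapped stream.
public class TransformFilter : Stream
{
    private readonly Stream inner;
    private readonly Func<string, string> transform;
    private readonly MemoryStream buffer = new MemoryStream();
    private bool closed;

    public TransformFilter(Stream inner, Func<string, string> transform)
    {
        this.inner = inner;
        this.transform = transform;
    }

    public override bool CanRead { get { return false; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return true; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override int Read(byte[] bytes, int offset, int count) { throw new NotSupportedException(); }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }

    // Nothing is forwarded on Flush: data is held back until the whole response is available.
    public override void Flush() { }

    public override void Write(byte[] bytes, int offset, int count)
    {
        buffer.Write(bytes, offset, count);
    }

    public override void Close()
    {
        if (!closed)
        {
            closed = true;
            // Assumption: the response is UTF-8 text
            string html = Encoding.UTF8.GetString(buffer.ToArray());
            byte[] data = Encoding.UTF8.GetBytes(transform(html));
            inner.Write(data, 0, data.Length);
            inner.Flush();
            inner.Close();
        }
        base.Close();
    }
}
```

In a module it could be installed with something like application.Response.Filter = new TransformFilter(application.Response.Filter, html => html.Replace("</body>", "<!-- filtered --></body>")); the replacement shown is only an example.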

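Modifying and capturing a site's output this way can be switched on and off at any time by editing the configuration file, which makes the approach flexible and reusable.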

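I remember seeing many sites that need to generate static pages do so by having code issue an HttpWebRequest to the site itself and recording the returned markup. I don't know whether the method described here would be easier to write. For example, when static pages are needed, no matter who makes the request, the server checks whether it already has a static page; if not, it generates one, and then redirects to it. This is only a starting point; I hope readers will explore the idea further on their own.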
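The static-page idea could be sketched roughly as follows: captured HTML is saved to a file keyed by the request path, and later requests check for that file first. StaticPageStore, its file-naming scheme and the directory layout are all assumptions for illustration; a real site would also need to handle query strings, name collisions and cache invalidation.

```csharp
using System;
using System.IO;
using System.Text;

// Stores captured HTML as files so that later requests can be answered from disk.
public class StaticPageStore
{
    private readonly string rootDirectory;

    public StaticPageStore(string rootDirectory)
    {
        this.rootDirectory = rootDirectory;
        Directory.CreateDirectory(rootDirectory);
    }

    // Maps a request path such as "/news/list.aspx" to a flat file name.
    // Simplification: every character that is not a letter or digit becomes '_'.
    public string GetFilePath(string requestPath)
    {
        StringBuilder name = new StringBuilder();
        foreach (char c in requestPath.Trim('/'))
        {
            name.Append(char.IsLetterOrDigit(c) ? c : '_');
        }
        return Path.Combine(rootDirectory, name.ToString() + ".html");
    }

    // Returns true and the stored HTML if a static page already exists.
    public bool TryLoad(string requestPath, out string html)
    {
        string file = GetFilePath(requestPath);
        if (File.Exists(file))
        {
            html = File.ReadAllText(file, Encoding.UTF8);
            return true;
        }
        html = null;
        return false;
    }

    // Saves the HTML captured by the filter as the static page for this path.
    public void Save(string requestPath, string html)
    {
        File.WriteAllText(GetFilePath(requestPath), html, Encoding.UTF8);
    }
}
```

The filter's data-captured event would be a natural place to call Save, and an early pipeline event in the module could call TryLoad before the page executes.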

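This also brings to mind a less benign scenario: someone who has broken into a server running IIS could upload a custom filter and a custom HttpModule assembly, edit the target site's configuration file to activate them, and easily steal what clients send and receive. Whether the change to the configuration file would be easy to notice, I don't know.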

^_^
