Overview
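There are already plenty of write-ups online about how WebKit loads resources, but reading them left me with only a half-understanding, and whatever I did pick up faded within a few days. So I decided to walk through the resource loading flow against the code myself to make it stick.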
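First, a look at MemoryCache
MemoryCache maintains the list of all cached resources. It is a hash table whose key is the resource URL and whose value is a CachedResource*. If you know Memcached (used for server-side caching), you can carry its concept over directly: both are in-memory key-value caches.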
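To make the hash-table idea concrete, here is a minimal sketch of a URL-keyed cache. It is not WebKit's MemoryCache: `TinyMemoryCache` and `FakeResource` are hypothetical names, a plain std::string stands in for the URL type, and eviction and size accounting are left out entirely.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for CachedResource: just a URL and the bytes.
struct FakeResource {
    std::string url;
    std::string data;
};

// Hypothetical, simplified cache: URL string -> resource.
class TinyMemoryCache {
public:
    // Store a resource under its URL; an existing entry is replaced.
    void add(std::shared_ptr<FakeResource> resource)
    {
        assert(resource && !resource->url.empty());
        const std::string key = resource->url;
        m_resources[key] = std::move(resource);
    }

    // Return the cached resource, or nullptr on a cache miss.
    std::shared_ptr<FakeResource> resourceForURL(const std::string& url) const
    {
        auto it = m_resources.find(url);
        if (it == m_resources.end())
            return nullptr;
        return it->second;
    }

    void remove(const std::string& url) { m_resources.erase(url); }

private:
    std::unordered_map<std::string, std::shared_ptr<FakeResource>> m_resources;
};
```

A caller adds a resource once and then looks it up by URL; a miss returns nullptr, which is the signal to go and load the resource.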
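The resources an HTML page can pull in fall mainly into the following types:
1. HTML pages
2. Font files
3. Images
4. CSS Shaders
5. Video, audio, and subtitles
6. Scripts
7. CSS style sheets
8. XSL style sheets
9. SVG: used to draw SVG 2D graphics
WebKit has a class for each of these nine resource types, and all of them derive from CachedResource.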
CachedResource has the following main subclasses, each representing a concrete kind of resource:
CachedRawResource
CachedFont
CachedImage
CachedShader
CachedTextTrack
CachedScript
CachedCSSStyleSheet
CachedXSLStyleSheet
CachedSVGDocument
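During loading, the first class the outside world touches is CachedResourceLoader. It checks MemoryCache first. If the resource is found there it is returned directly; otherwise a ResourceRequest is built and ResourceLoader is invoked to fetch the resource from the network.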
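The three kinds of loader in WebKit:
1. ResourceLoader: fetches resources from the network stack or the local disk
2. CachedResourceLoader: fetches resources from MemoryCache
3. Specialized loaders such as FontLoader, LinkLoader, and ImageLoader: these live inside an Element and are what the element's users deal with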
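The "cache first, network second" layering can be sketched in a few lines. This is only an illustration under my own assumptions: `CachingLoader` and its `Fetch` callback are hypothetical, the callback stands in for the ResourceLoader path, and an unordered_map stands in for MemoryCache.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical two-level lookup: memory cache first, "network" second.
class CachingLoader {
public:
    using Fetch = std::function<std::string(const std::string& url)>;

    explicit CachingLoader(Fetch fetch)
        : m_fetch(std::move(fetch))
    {
        assert(static_cast<bool>(m_fetch));
    }

    // Returns the body for url, fetching it only on a cache miss.
    const std::string& request(const std::string& url)
    {
        auto it = m_cache.find(url);
        if (it != m_cache.end()) {
            ++cacheHits;
            return it->second;
        }
        ++networkLoads;
        return m_cache.emplace(url, m_fetch(url)).first->second;
    }

    int cacheHits = 0;
    int networkLoads = 0;

private:
    Fetch m_fetch;
    std::unordered_map<std::string, std::string> m_cache;
};
```

Requesting the same URL twice invokes the fetch callback once; the second request is served from the map.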
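How ImageLoader loads an image, end to end:
When an HTMLImageElement is created, an ImageLoader instance is created along with it to load the image. Starting from updateFromElement(), let's see how this ImageLoader instance, m_imageLoader, is used: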
void ImageLoader::updateFromElement()
{
    Document* document = m_element->document();
    CachedResourceHandle<CachedImage> newImage = 0;
    if (!attr.isNull() && !stripLeadingAndTrailingHTMLSpaces(attr).isEmpty()) {
        CachedResourceRequest request(ResourceRequest(document->completeURL(sourceURI(attr))));
        ……
        if (m_loadManually) {
            ……
        } else {
            // Get the CachedResourceLoader from the document and call its requestImage(); see requestImage() below.
            newImage = document->cachedResourceLoader()->requestImage(request);
        }
        // If we do not have an image here, it means that a cross-site
        // violation occurred, or that the image was blocked via Content
        // Security Policy, or the page is being dismissed. Trigger an
        // error event if the page is not being dismissed.
        if (!newImage && !pageIsBeingDismissed(document)) {
            m_failedLoadURL = attr;
            m_hasPendingErrorEvent = true;
            errorEventSender().dispatchEventSoon(this);
        } else
            clearFailedLoadURL();
    } else if (!attr.isNull()) {
        ……
    }
}
Most of the code is omitted; we only look at how the image resource is requested. Every resource request in CachedResourceLoader eventually reaches requestResource:
CachedResourceHandle<CachedImage> CachedResourceLoader::requestImage(CachedResourceRequest& request)
{
    ……
    return static_cast<CachedImage*>(requestResource(CachedResource::ImageResource, request).get());
}
CachedResourceHandle<CachedResource> CachedResourceLoader::requestResource(CachedResource::Type type, CachedResourceRequest& request)
{
    KURL url = request.resourceRequest().url();
    ……
    // Look the resource up in MemoryCache.
    resource = memoryCache()->resourceForRequest(request.resourceRequest());
    // Decide what to do next.
    const RevalidationPolicy policy = determineRevalidationPolicy(type, request.mutableResourceRequest(), request.forPreload(), resource.get(), request.defer());
    switch (policy) {
    case Reload:
        memoryCache()->remove(resource.get());
        // Fall through
    case Load:
        resource = loadResource(type, request, request.charset());
        break;
    case Revalidate:
        resource = revalidateResource(request, resource.get());
        break;
    case Use:
        if (!shouldContinueAfterNotifyingLoadedFromMemoryCache(resource.get()))
            return 0;
        memoryCache()->resourceAccessed(resource.get());
        break;
    }
    if (!resource)
        return 0;
    if (!request.forPreload() || policy != Use)
        resource->setLoadPriority(request.priority());
    if ((policy != Use || resource->stillNeedsLoad()) && CachedResourceRequest::NoDefer == request.defer()) {
        // Load the resource.
        resource->load(this, request.options());
        // We don't support immediate loads, but we do support immediate failure.
        if (resource->errorOccurred()) {
            if (resource->inCache())
                memoryCache()->remove(resource.get());
            return 0;
        }
    }
    ……
    return resource;
}
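The four-way switch in requestResource is easier to see in isolation. The sketch below is a toy model under my own assumptions: `SketchLoader` and its string "actions" are hypothetical, and the policy is passed in by the caller instead of being computed by determineRevalidationPolicy. It only shows the control flow, in particular Reload evicting the entry and then falling through into Load.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

enum class Policy { Use, Revalidate, Reload, Load };

// Hypothetical loader: URL -> body, plus a log of what it did.
struct SketchLoader {
    std::unordered_map<std::string, std::string> cache;
    std::vector<std::string> actions;

    // Returns true if the caller ends up with an entry for url.
    bool request(const std::string& url, Policy policy)
    {
        assert(!url.empty());
        switch (policy) {
        case Policy::Reload:
            cache.erase(url);                 // evict the stale entry first
            actions.push_back("evict");
            // fall through
        case Policy::Load:
            cache[url] = "<fresh>";           // stands in for loadResource()
            actions.push_back("load");
            break;
        case Policy::Revalidate:
            actions.push_back("revalidate");  // stands in for revalidateResource()
            break;
        case Policy::Use:
            actions.push_back("use");         // cache hit, nothing to fetch
            break;
        }
        return cache.count(url) > 0;
    }
};
```

Calling request(url, Policy::Reload) on a cached URL logs "evict" followed by "load", which mirrors the Reload case above running into the Load case.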
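For CachedResourceLoader::requestResource(…) we set aside the case where the resource is already in MemoryCache and only follow the path where it has to be requested from the network stack. Loading eventually reaches resource->load(this, request.options());, which here should resolve to CachedImage::load(…). Note the two arguments: the first is this, i.e. the CachedResourceLoader pointer; the second carries some options that we can ignore for now.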
void CachedImage::load(CachedResourceLoader* cachedResourceLoader, const ResourceLoaderOptions& options)
{
    if (!cachedResourceLoader || cachedResourceLoader->autoLoadImages())
        CachedResource::load(cachedResourceLoader, options);
    else
        setLoading(false);
}
This in turn calls CachedResource::load(…). In that function I only care about its longest line:
m_loader = platformStrategies()->loaderStrategy()->resourceLoadScheduler()->scheduleSubresourceLoad(cachedResourceLoader->frame(), this, request, request.priority(), options);
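Searching the whole code base turns up two implementations of scheduleSubresourceLoad: one in WebCore and one in the WebProcess of WebKit2. I am not analysing WebKit2's multi-process model here, so we go straight to the WebCore implementation: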
PassRefPtr<SubresourceLoader> ResourceLoadScheduler::scheduleSubresourceLoad(Frame* frame, CachedResource* resource, const ResourceRequest& request, ResourceLoadPriority priority, const ResourceLoaderOptions& options)
{
    RefPtr<SubresourceLoader> loader = SubresourceLoader::create(frame, resource, request, options);
    if (loader)
        scheduleLoad(loader.get(), priority);
    return loader.release();
}
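This function is simple: create the loader, schedule it, done. In effect it puts the resource request into a queue and then decides whether to dispatch it immediately or let it wait its turn. Here is scheduleLoad: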
void ResourceLoadScheduler::scheduleLoad(ResourceLoader* resourceLoader, ResourceLoadPriority priority)
{
    ……
    HostInformation* host = hostForURL(resourceLoader->url(), CreateIfNotFound);
    bool hadRequests = host->hasRequests();
    host->schedule(resourceLoader, priority);
    if (priority > ResourceLoadPriorityLow || !resourceLoader->url().protocolIsInHTTPFamily() || (priority == ResourceLoadPriorityLow && !hadRequests)) {
        // Try to request important resources immediately.
        servePendingRequests(host, priority);
        return;
    }
    notifyDidScheduleResourceRequest(resourceLoader);
    // Handle asynchronously so early low priority requests don't
    // get scheduled before later high priority ones.
    scheduleServePendingRequests();
}
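The condition in the middle of scheduleLoad packs three cases into one expression. Pulled out into a standalone predicate it reads as below. This is a sketch: the `Priority` enum is a hypothetical stand-in for ResourceLoadPriority, and the number of requests already queued for the host replaces the hadRequests flag.

```cpp
#include <cassert>

// Hypothetical priority scale; the real type is ResourceLoadPriority.
enum class Priority { VeryLow = 0, Low, Medium, High, VeryHigh };

// Mirrors the condition in scheduleLoad(): true means "serve right now",
// false means "leave it queued and serve it asynchronously later".
bool shouldServeImmediately(Priority priority, bool isHTTPFamily, int requestsAlreadyQueuedForHost)
{
    assert(requestsAlreadyQueuedForHost >= 0);
    if (priority > Priority::Low)
        return true;   // important resource
    if (!isHTTPFamily)
        return true;   // non-HTTP URLs take the immediate path
    // A Low priority request is served at once only if the host was idle.
    return priority == Priority::Low && requestsAlreadyQueuedForHost == 0;
}
```

Note that a VeryLow request over HTTP is always deferred, even when the host has nothing else queued.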
Whether the request is dispatched immediately or a little later, it always ends up in ResourceLoadScheduler::servePendingRequests:
void ResourceLoadScheduler::servePendingRequests(HostInformation* host, ResourceLoadPriority minimumPriority)
{
    LOG(ResourceLoading, "ResourceLoadScheduler::servePendingRequests HostInformation.m_name='%s'", host->name().latin1().data());
    for (int priority = ResourceLoadPriorityHighest; priority >= minimumPriority; --priority) {
        HostInformation::RequestQueue& requestsPending = host->requestsPending(ResourceLoadPriority(priority));
        while (!requestsPending.isEmpty()) {
            RefPtr<ResourceLoader> resourceLoader = requestsPending.first();
            ……
            resourceLoader->start();
        }
    }
}
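The loop above drains one queue per priority level, from the highest level down to minimumPriority. Here is a small self-contained model of that idea. `HostQueues` and the integer priority scale are hypothetical, request names stand in for ResourceLoader objects, and whatever the elided part of the real loop does is not modelled.

```cpp
#include <array>
#include <cassert>
#include <deque>
#include <string>
#include <vector>

// Hypothetical five-level scale, lowest (0) to highest (4).
constexpr int kLowestPriority = 0;
constexpr int kHighestPriority = 4;

struct HostQueues {
    // One FIFO queue of pending request names per priority level.
    std::array<std::deque<std::string>, kHighestPriority + 1> pending;

    void schedule(const std::string& request, int priority)
    {
        assert(priority >= kLowestPriority && priority <= kHighestPriority);
        pending[priority].push_back(request);
    }

    // Start everything at or above minimumPriority, highest level first.
    // Returns the requests in the order they were started.
    std::vector<std::string> servePendingRequests(int minimumPriority)
    {
        assert(minimumPriority >= kLowestPriority);
        std::vector<std::string> started;
        for (int priority = kHighestPriority; priority >= minimumPriority; --priority) {
            auto& queue = pending[priority];
            while (!queue.empty()) {
                started.push_back(queue.front()); // stands in for resourceLoader->start()
                queue.pop_front();
            }
        }
        return started;
    }
};
```

Requests below minimumPriority stay in their queues, which is how a low-priority request can keep waiting while more important ones go out.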
-------------------------------From here on, the actual request to the network begins.-------------------------------------------
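ResourceLoader is the class that ResourceLoadScheduler uses directly; it ties ResourceLoadScheduler and ResourceHandle together. ResourceLoadScheduler dispatches the resource request jobs by priority, and each job is wrapped in a ResourceLoader instance. When a ResourceLoader is dispatched, its ResourceLoader::start() method is called, and ResourceLoader::start() creates the ResourceHandle, as in the following code: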
void ResourceLoader::start()
{
    ASSERT(!m_handle);
    ASSERT(!m_request.isNull());
    ASSERT(m_deferredRequest.isNull());
#if ENABLE(WEB_ARCHIVE) || ENABLE(MHTML)
    if (m_documentLoader->scheduleArchiveLoad(this, m_request))
        return;
#endif
    if (m_documentLoader->applicationCacheHost()->maybeLoadResource(this, m_request, m_request.url()))
        return;
    if (m_defersLoading) {
        m_deferredRequest = m_request;
        return;
    }
    // The ResourceHandle instance is created here.
    if (!m_reachedTerminalState)
        m_handle = ResourceHandle::create(m_frame->loader()->networkingContext(), m_request, this, m_defersLoading, m_options.sniffContent == SniffContent);
}
A ResourceHandle starts running as soon as it is created. Here is its create function:
PassRefPtr<ResourceHandle> ResourceHandle::create(NetworkingContext* context, const ResourceRequest& request, ResourceHandleClient* client, bool defersLoading, bool shouldContentSniff)
{
    BuiltinResourceHandleConstructorMap::iterator protocolMapItem = builtinResourceHandleConstructorMap().find(request.url().protocol());
    if (protocolMapItem != builtinResourceHandleConstructorMap().end())
        return protocolMapItem->value(request, client);
    RefPtr<ResourceHandle> newHandle(adoptRef(new ResourceHandle(context, request, client, defersLoading, shouldContentSniff)));
    if (newHandle->d->m_scheduledFailureType != NoFailure)
        return newHandle.release();
    // newHandle's start() method is called here.
    if (newHandle->start())
        return newHandle.release();
    return 0;
}
Now straight to ResourceHandle's start() method, which is implemented in ResourceHandleQt.cpp:
bool ResourceHandle::start()
{
    printf("ResourceHandle::start\n");
    // If NetworkingContext is invalid then we are no longer attached to a Page,
    // this must be an attempted load from an unload event handler, so let's just block it.
    if (d->m_context && !d->m_context->isValid())
        return false;
    if (!d->m_user.isEmpty() || !d->m_pass.isEmpty()) {
        // If credentials were specified for this request, add them to the url,
        // so that they will be passed to QNetworkRequest.
        KURL urlWithCredentials(firstRequest().url());
        urlWithCredentials.setUser(d->m_user);
        urlWithCredentials.setPass(d->m_pass);
        d->m_firstRequest.setURL(urlWithCredentials);
    }
    ResourceHandleInternal *d = getInternal();
    // The next line creates the QNetworkReplyHandler instance and passes AsynchronousLoad, i.e. an asynchronous load.
    d->m_job = new QNetworkReplyHandler(this, QNetworkReplyHandler::AsynchronousLoad, d->m_defersLoading);
    return true;
}
Here is the QNetworkReplyHandler constructor:
QNetworkReplyHandler::QNetworkReplyHandler(ResourceHandle* handle, LoadType loadType, bool deferred)
    : QObject(0)
    , m_resourceHandle(handle)
    , m_loadType(loadType)
    , m_redirectionTries(gMaxRedirections)
    , m_queue(this, deferred)
{
    const ResourceRequest &r = m_resourceHandle->firstRequest();
    if (r.httpMethod() == "GET")
        m_method = QNetworkAccessManager::GetOperation;
    else if (r.httpMethod() == "HEAD")
        m_method = QNetworkAccessManager::HeadOperation;
    else if (r.httpMethod() == "POST")
        m_method = QNetworkAccessManager::PostOperation;
    else if (r.httpMethod() == "PUT")
        m_method = QNetworkAccessManager::PutOperation;
    else if (r.httpMethod() == "DELETE")
        m_method = QNetworkAccessManager::DeleteOperation;
    else
        m_method = QNetworkAccessManager::CustomOperation;
    m_request = r.toNetworkRequest(m_resourceHandle->getInternal()->m_context.get());
    // Note this line: the argument is a pointer to a member function.
    m_queue.push(&QNetworkReplyHandler::start);
}
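I will not worry about what m_queue is for now. Since the argument pushed here is a function, that function is bound to be called at some point, so I will simply assume it runs right after it enters the queue and jump to it: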
void QNetworkReplyHandler::start()
{
    printf("QNetworkReplyHandler::start\n");
    ResourceHandleInternal* d = m_resourceHandle->getInternal();
    if (!d || !d->m_context)
        return;
    // ### The request is sent here.
    QNetworkReply* reply = sendNetworkRequest(d->m_context->networkAccessManager(), d->m_firstRequest);
    if (!reply)
        return;
    // ### QNetworkReplyWrapper registers the data-ready callbacks.
    m_replyWrapper = adoptPtr(new QNetworkReplyWrapper(&m_queue, reply, m_resourceHandle->shouldContentSniff() && d->m_context->mimeSniffingEnabled(), this));
    if (m_loadType == SynchronousLoad) {
        m_replyWrapper->synchronousLoad();
        // If supported, a synchronous request will be finished at this point, no need to hook up the signals.
        // For a synchronous load everything is done here, so just return.
        return;
    }
    // The asynchronous case continues here: start a timeout timer and, if needed, connect an upload-progress callback.
    double timeoutInSeconds = d->m_firstRequest.timeoutInterval();
    if (timeoutInSeconds > 0 && timeoutInSeconds < (INT_MAX / 1000))
        m_timeoutTimer.start(timeoutInSeconds * 1000, this);
    if (m_resourceHandle->firstRequest().reportUploadProgress())
        connect(m_replyWrapper->reply(), SIGNAL(uploadProgress(qint64, qint64)), this, SLOT(uploadProgress(qint64, qint64)));
}
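Coming back to m_queue for a moment: the constructor pushed a pointer to the member function start() onto it, together with a deferred flag. I have not read the real queue class, so the following is only my guess at the general shape of such a queue, with hypothetical names (`CallQueue`, `Handler`): calls are stored as member-function pointers and run in order, unless delivery is currently deferred.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <vector>

class Handler;

// Hypothetical call queue: stores pointers to Handler member functions and
// runs them in FIFO order while delivery is not deferred.
class CallQueue {
public:
    using Call = void (Handler::*)();

    CallQueue(Handler* handler, bool deferred)
        : m_handler(handler), m_deferred(deferred)
    {
        assert(handler);
    }

    void push(Call call) { m_calls.push_back(call); flush(); }
    void setDeferred(bool deferred) { m_deferred = deferred; flush(); }

private:
    void flush(); // defined below, once Handler is a complete type

    Handler* m_handler;
    bool m_deferred;
    std::deque<Call> m_calls;
};

class Handler {
public:
    explicit Handler(bool deferred)
        : m_queue(this, deferred)
    {
        m_queue.push(&Handler::start); // queued now, run as soon as allowed
    }

    void start() { log.push_back("start"); }
    void resume() { m_queue.setDeferred(false); }

    std::vector<std::string> log;

private:
    CallQueue m_queue;
};

inline void CallQueue::flush()
{
    while (!m_deferred && !m_calls.empty()) {
        Call call = m_calls.front();
        m_calls.pop_front();
        (m_handler->*call)(); // invoke the stored member function
    }
}
```

With deferred set to false, start() runs during construction, which matches the assumption I made above; with deferred set to true it runs only after resume().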
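What exactly does sendNetworkRequest do? I don't know, because I am not familiar with QNetworkAccessManager, so I will skip the details and simply assume that it sends the HTTP request and returns right away (the request is asynchronous).
So the request is out; how does the data come back? Right after sendNetworkRequest() returns, a QNetworkReplyWrapper instance is created, and QNetworkReplyWrapper exists specifically to handle the reply data. It is enough to know that it handles the reply; I will not go into its internals.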
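After a long journey the data finally arrives, and void QNetworkReplyHandler::forwardData() gets called; that is how we know data has come in. Once all the data has been received, void QNetworkReplyHandler::finish() is called. Here is forwardData():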
void QNetworkReplyHandler::forwardData()
{
    ASSERT(m_replyWrapper && m_replyWrapper->reply() && !wasAborted() && !m_replyWrapper->wasRedirected());
    printf("QNetworkReplyHandler::forwardData\n");
    ResourceHandleClient* client = m_resourceHandle->client();
    if (!client)
        return;
    qint64 bytesAvailable = m_replyWrapper->reply()->bytesAvailable();
    char* buffer = new char[8128 + 1]; // smaller than 8192 to fit within 8k including overhead.
    while (bytesAvailable > 0 && !m_queue.deferSignals()) {
        buffer[8128] = 0x00;
        qint64 readSize = m_replyWrapper->reply()->read(buffer, 8128);
        if (readSize <= 0)
            break;
        bytesAvailable -= readSize;
        // Debug output I added here; it prints the content of the HTML document we requested.
        // Terminate at readSize so that %s does not run past the bytes just read.
        buffer[readSize] = 0x00;
        printf("bytesAvailable = %lld, readSize %lld\n", static_cast<long long>(bytesAvailable), static_cast<long long>(readSize));
        printf("%s\n", buffer);
        printf("didReceiveData %lld\n", static_cast<long long>(readSize));
        // FIXME:
        // -1 means we do not provide any data about transfer size to inspector so it would use
        // Content-Length headers or content size to show transfer size.
        // The data is reported here. Who is this client? Go back and look at void ResourceLoader::start().
        client->didReceiveData(m_resourceHandle, buffer, readSize, -1);
    }
    delete[] buffer;
    if (bytesAvailable > 0)
        m_queue.requeue(&QNetworkReplyHandler::forwardData);
}
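Stripped of the Qt types, forwardData is a loop that reads a fixed-size chunk and hands it to a client callback until nothing is left. The sketch below shows that pattern with standard C++ only. It is my own simplification: a std::istream stands in for QNetworkReply, a std::function stands in for the client, and the deferral and requeue logic is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads everything available from source in chunks of at most chunkSize
// bytes and hands each chunk to didReceiveData.
// Returns the total number of bytes delivered.
std::size_t forwardData(std::istream& source, std::size_t chunkSize,
                        const std::function<void(const char*, std::size_t)>& didReceiveData)
{
    assert(chunkSize > 0);
    std::vector<char> buffer(chunkSize); // freed automatically, no delete[] needed
    std::size_t total = 0;
    for (;;) {
        source.read(buffer.data(), static_cast<std::streamsize>(chunkSize));
        const std::size_t readSize = static_cast<std::size_t>(source.gcount());
        if (readSize == 0)
            break; // nothing more to read
        didReceiveData(buffer.data(), readSize);
        total += readSize;
    }
    return total;
}
```

The callback receives a pointer and a length, never a null-terminated string, so the last, shorter chunk is handled the same way as the full ones.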
In void ResourceLoader::start(), the third argument passed to ResourceHandle::create is this, and that argument becomes the client of the ResourceHandle being created. Now that we know who the client is, look at its didReceiveData() method. ResourceLoader has two didReceiveData() overloads, but both end up in didReceiveDataOrBuffer():
void ResourceLoader::didReceiveDataOrBuffer(const char* data, int length, PassRefPtr<SharedBuffer> prpBuffer, long long encodedDataLength, DataPayloadType dataPayloadType)
{
    // This method should only get data+length *OR* a SharedBuffer.
    ASSERT(!prpBuffer || (!data && !length));
    // Protect this in this delegate method since the additional processing can do
    // anything including possibly derefing this; one example of this is Radar 3266216.
    RefPtr<ResourceLoader> protector(this);
    RefPtr<SharedBuffer> buffer = prpBuffer;
    // Store the data in the buffer first.
    addDataOrBuffer(data, length, buffer.get(), dataPayloadType);
    // FIXME: If we get a resource with more than 2B bytes, this code won't do the right thing.
    // However, with today's computers and networking speeds, this won't happen in practice.
    // Could be an issue with a giant local file.
    // Report the data.
    if (m_options.sendLoadCallbacks == SendCallbacks && m_frame)
        frameLoader()->notifier()->didReceiveData(this, buffer ? buffer->data() : data, buffer ? buffer->size() : length, static_cast<int>(encodedDataLength));
}
How is the data handed upward?
frameLoader()->notifier()->didReceiveData(…)
frameLoader()->notifier() returns a ResourceLoadNotifier, so let's find ResourceLoadNotifier::didReceiveData(…):
void ResourceLoadNotifier::didReceiveData(ResourceLoader* loader, const char* data, int dataLength, int encodedDataLength)
{
    // Update the progress.
    if (Page* page = m_frame->page())
        page->progress()->incrementProgress(loader->identifier(), data, dataLength);
    // Keep passing the data on.
    dispatchDidReceiveData(loader->documentLoader(), loader->identifier(), data, dataLength, encodedDataLength);
}
We are still inside ResourceLoadNotifier, so keep following, into ResourceLoadNotifier::dispatchDidReceiveData:
void ResourceLoadNotifier::dispatchDidReceiveData(DocumentLoader* loader, unsigned long identifier, const char* data, int dataLength, int encodedDataLength)
{
    m_frame->loader()->client()->dispatchDidReceiveContentLength(loader, identifier, dataLength);
    InspectorInstrumentation::didReceiveData(m_frame, identifier, data, dataLength, encodedDataLength);
}
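InspectorInstrumentation::didReceiveData can be ignored. At this point something looks odd: why does m_frame->loader()->client()->dispatchDidReceiveContentLength(loader, identifier, dataLength); pass only the length and not the data argument? Is this not where the data is really reported? I checked void FrameLoaderClientQt::dispatchDidReceiveContentLength and found that it is an empty implementation, so the data is indeed not reported here. Then where is it reported? I must have taken a wrong turn somewhere above.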
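Verification shows that when QNetworkReplyHandler::forwardData(…) calls client->didReceiveData(…), the client is not a plain ResourceLoader but a SubresourceLoader. SubresourceLoader is a subclass of ResourceLoader. ResourceLoader acts as the base class: it implements most of the common functionality but does not hand the data to the resource. That happens in SubresourceLoader::didReceiveDataOrBuffer(…), which also calls the parent's ResourceLoader::didReceiveDataOrBuffer(…), because storing the data, reporting progress, and the inspector hooks all live in the parent class and the subclass does not need to care about them.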
void SubresourceLoader::didReceiveDataOrBuffer(const char* data, int length, PassRefPtr<SharedBuffer> prpBuffer, long long encodedDataLength, DataPayloadType dataPayloadType)
{
    // 1. Check the arguments.
    ……
    printf("SubresourceLoader::didReceiveDataOrBuffer\n");
    // Reference the object in this method since the additional processing can do
    // anything including removing the last reference to this object; one example of this is 3266216.
    RefPtr<SubresourceLoader> protect(this);
    RefPtr<SharedBuffer> buffer = prpBuffer;
    // 2. Call the parent class's didReceiveDataOrBuffer.
    ResourceLoader::didReceiveDataOrBuffer(data, length, buffer, encodedDataLength, dataPayloadType);
    // 3. Deliver the data.
    if (!m_loadingMultipartContent) {
        if (ResourceBuffer* resourceData = this->resourceData())
            m_resource->addDataBuffer(resourceData);
        else
            m_resource->addData(buffer ? buffer->data() : data, buffer ? buffer->size() : length);
    }
}
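The reason I went down the wrong path earlier is plain virtual dispatch: the handle only holds a client pointer, and the override that runs is the one of the most derived class. A minimal model of the "subclass calls the base, then does its own delivery" pattern follows. The names (`HandleClient`, `BaseLoader`, `SubLoader`) are hypothetical stand-ins, not the WebKit classes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical client interface, standing in for ResourceHandleClient.
class HandleClient {
public:
    virtual ~HandleClient() = default;
    virtual void didReceiveData(const char* data, int length) = 0;
};

// Base loader: only buffers the bytes (the shared bookkeeping).
class BaseLoader : public HandleClient {
public:
    void didReceiveData(const char* data, int length) override
    {
        assert(data && length >= 0);
        buffered.append(data, static_cast<std::size_t>(length));
    }

    std::string buffered;
};

// Subclass: reuses the base behaviour, then forwards the bytes to the resource.
class SubLoader : public BaseLoader {
public:
    void didReceiveData(const char* data, int length) override
    {
        BaseLoader::didReceiveData(data, length);
        deliveredToResource.append(data, static_cast<std::size_t>(length));
    }

    std::string deliveredToResource;
};
```

Calling didReceiveData through a HandleClient* that points at a SubLoader fills both buffered and deliveredToResource, even though the caller only knows the interface type.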
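Now we can keep following the data upward. The last few lines of SubresourceLoader::didReceiveDataOrBuffer call m_resource->addDataBuffer / addData. m_resource is a pointer to CachedResource, which is a base class, so the actual object must be one of its subclasses; which one depends on the resource being requested. My test page is a blank page, so the subclass should be CachedRawResource, and the call should land in CachedRawResource::addDataBuffer(…).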
Here is the chain of calls going upward from there:
CachedRawResource::addDataBuffer
CachedRawResource::notifyClientsDataWasReceived
{
    while (CachedRawResourceClient* c = w.next())
        c->dataReceived(this, data, length);
}
// c here is the DocumentLoader
DocumentLoader::dataReceived
DocumentLoader::commitLoad
FrameLoaderClientQt::committedLoad
DocumentLoader::commitData
DocumentWriter::addData
DecodedDataDocumentParser::appendBytes
HTMLDocumentParser::append
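From HTMLDocumentParser::append onward we are in parsing territory.
Oops. I realize my analysis has drifted: I started out analysing how ImageLoader loads image data, but when it came to delivering the data I followed an HTML page instead. My train of thought got a bit muddled, but no matter: they are all resources and the request flow is much the same. When I have time I will analyse how an Image gets displayed and walk through the data delivery for it as well.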
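The first hop of that chain, the resource fanning a chunk out to its registered clients, is an observer pattern. A small sketch of it, with hypothetical names (`RawResource`, `RawResourceClient`, `DocumentSink`) and a std::vector in place of WebKit's client walker:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

class RawResource;

// Hypothetical observer interface, standing in for CachedRawResourceClient.
class RawResourceClient {
public:
    virtual ~RawResourceClient() = default;
    virtual void dataReceived(RawResource& resource, const char* data, int length) = 0;
};

class RawResource {
public:
    void addClient(RawResourceClient* client)
    {
        assert(client);
        m_clients.push_back(client);
    }

    // Called by the loader when a chunk arrives; fans it out to every client.
    void addData(const char* data, int length)
    {
        for (RawResourceClient* client : m_clients)
            client->dataReceived(*this, data, length);
    }

private:
    std::vector<RawResourceClient*> m_clients;
};

// Stands in for DocumentLoader: collects the bytes it would pass to the parser.
class DocumentSink : public RawResourceClient {
public:
    void dataReceived(RawResource&, const char* data, int length) override
    {
        document.append(data, static_cast<std::size_t>(length));
    }

    std::string document;
};
```

Every registered client sees every chunk in arrival order, so a sink that appends them ends up with the complete document text.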