Saving the TensorRT Serialization Result

Overview

The main task today was saving the serialized result to disk. This work relied on the community forums and on information provided by NVIDIA.

The main reference pages are:

https://github.com/dusty-nv/jetson-inference/blob/master/tensorNet.cpp#L244

https://devtalk.nvidia.com/default/topic/1030534/jetson-tx2/segmentation-fault-core-dumped-while-doing-tensorrt-optimization-of-lenet/

https://devtalk.nvidia.com/default/topic/1038294/tensorrt/saving-a-serialized-model-to-disk/

The main code is pasted below:



	// cache_path is the path where the serialized engine is stored
	std::stringstream gieModelStream;
	gieModelStream.seekg(0, gieModelStream.beg);

	std::ifstream cache( cache_path );

	if( !cache )
	{
		// no cached serialization found -- parse and build the network for the
		// first time, producing the serialized TensorRT model in gieModelStream
		if( !ProfileModel(prototxt_path, model_path, output_blobs, maxBatchSize, gieModelStream) )
		{
			printf("failed to load %s\n", model_path);
			return 0;
		}

		// write the serialized engine to disk so later runs can reuse it
		std::ofstream outFile;
		outFile.open(cache_path);
		outFile << gieModelStream.rdbuf();
		outFile.close();
		gieModelStream.seekg(0, gieModelStream.beg);
		printf("completed writing cache to %s\n", cache_path);
	}
	else
	{
		// a cached serialization exists -- load it directly,
		// skipping parsing and re-serialization
		printf("loading network profile from cache... %s\n", cache_path);
		gieModelStream << cache.rdbuf();
		cache.close();

		// test for half FP16 support
		nvinfer1::IBuilder* builder = CREATE_INFER_BUILDER(gLogger);

		if( builder != NULL )
		{
			// decide whether this run will use half precision (FP16)
			mEnableFP16 = !mOverride16 && builder->platformHasFastFp16();
			printf("platform %s FP16 support.\n", mEnableFP16 ? "has" : "does not have");
			builder->destroy();
		}
	}

	/*
	 * create runtime inference engine execution context
	 */
	nvinfer1::IRuntime* infer = CREATE_INFER_RUNTIME(gLogger);

	if( !infer )
	{
		printf(LOG_GIE "failed to create InferRuntime\n");
		return 0;
	}

	// support for stringstream deserialization was deprecated in TensorRT v2
	// instead, read the stringstream into a memory buffer and pass that to TRT.
	gieModelStream.seekg(0, std::ios::end);
	const int modelSize = gieModelStream.tellg();
	gieModelStream.seekg(0, std::ios::beg);

	void* modelMem = malloc(modelSize);

	if( !modelMem )
	{
		printf(LOG_GIE "failed to allocate %i bytes to deserialize model\n", modelSize);
		return 0;
	}

	gieModelStream.read((char*)modelMem, modelSize);
	nvinfer1::ICudaEngine* engine = infer->deserializeCudaEngine(modelMem, modelSize, NULL);
	free(modelMem);

	if( !engine )
	{
		printf(LOG_GIE "failed to create CUDA engine\n");
		return 0;
	}

	nvinfer1::IExecutionContext* context = engine->createExecutionContext();

	if( !context )
	{
		printf(LOG_GIE "failed to create execution context\n");
		return 0;
	}

	if( mEnableDebug )
	{
		printf(LOG_GIE "enabling context debug sync.\n");
		context->setDebugSync(true);
	}

	if( mEnableProfiler )
		context->setProfiler(&gProfiler);

	mInfer   = infer;
	mEngine  = engine;
	mContext = context;
The code above is excerpted from https://github.com/dusty-nv/jetson-inference/blob/master/tensorNet.cpp#L244, with the essential functionality annotated in comments. If anything is still unclear, leave a comment below or consult the referenced source directly.
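The listing calls ProfileModel(), whose body is not shown here. For context, here is a minimal sketch, assuming the helper follows the legacy (pre-TensorRT 8) Caffe-parser workflow that the rest of the code targets: parse the network, build an engine, serialize it, and write the bytes into gieModelStream so the caller can dump them to cache_path. The name ProfileModelSketch, the explicit logger parameter, and the 16 MB workspace size are illustrative choices, not taken from the original source.

	// Minimal sketch (not the exact jetson-inference implementation) of a
	// ProfileModel-style helper, using the legacy TensorRT API seen above.
	#include <sstream>
	#include <string>
	#include <vector>
	#include "NvInfer.h"
	#include "NvCaffeParser.h"

	static bool ProfileModelSketch( const char* prototxt_path, const char* model_path,
	                                const std::vector<std::string>& output_blobs,
	                                unsigned int maxBatchSize, nvinfer1::ILogger& logger,
	                                std::stringstream& gieModelStream )
	{
		// create the builder and an empty network definition
		nvinfer1::IBuilder* builder = nvinfer1::createInferBuilder(logger);
		nvinfer1::INetworkDefinition* network = builder->createNetwork();

		// parse the Caffe prototxt and weights into the network
		nvcaffeparser1::ICaffeParser* parser = nvcaffeparser1::createCaffeParser();
		const nvcaffeparser1::IBlobNameToTensor* blobNameToTensor =
			parser->parse(prototxt_path, model_path, *network, nvinfer1::DataType::kFLOAT);

		if( !blobNameToTensor )
			return false;

		// mark the requested output blobs
		for( const std::string& name : output_blobs )
		{
			nvinfer1::ITensor* tensor = blobNameToTensor->find(name.c_str());
			if( tensor != NULL )
				network->markOutput(*tensor);
		}

		// build the optimized engine
		builder->setMaxBatchSize(maxBatchSize);
		builder->setMaxWorkspaceSize(16 << 20);   // 16 MB, an arbitrary example value
		nvinfer1::ICudaEngine* engine = builder->buildCudaEngine(*network);

		if( !engine )
			return false;

		// serialize the engine into the stringstream; the caller writes it to disk
		nvinfer1::IHostMemory* serialized = engine->serialize();
		gieModelStream.write((const char*)serialized->data(), serialized->size());

		// clean up (legacy destroy() API)
		serialized->destroy();
		engine->destroy();
		network->destroy();
		parser->destroy();
		builder->destroy();
		return true;
	}

The key point is that engine->serialize() returns an IHostMemory blob, and the file written to cache_path is exactly those bytes; that is why the deserialization path above can read the file straight into a memory buffer and hand it to deserializeCudaEngine().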

Conclusion

That covers saving the TensorRT serialization result; hopefully it helps you resolve the same issue in your own development.
