Overview
(1) To help you practice strategies for machine learning, this week we’ll present another scenario and ask how you would act. We think this “simulator” of working on a machine learning project will give you a taste of what leading a machine learning project could be like!
You are employed by a startup building self-driving cars. You are in charge of detecting road signs (stop sign, pedestrian crossing sign, construction ahead sign) and traffic signals (red and green lights) in images. The goal is to recognize which of these objects appear in each image. As an example, the above image contains a pedestrian crossing sign and red traffic lights.
Your 100,000 labeled images are taken using the front-facing camera of your car. This is also the distribution of data you care most about doing well on. You think you might be able to get a much larger dataset off the internet, which could be helpful for training even if the distribution of internet data is not the same.
You are just getting started on this project. What is the first thing you do? Assume each of the steps below would take about an equal amount of time (a few days).
[A]Spend a few days getting the internet data, so that you understand better what data is available.
[B]Spend a few days collecting more data using the front-facing camera of your car, to better understand how much data per unit time you can collect.
[C]Spend a few days checking what is human-level performance for these tasks so that you can get an accurate estimate of Bayes error.
[D]Spend a few days training a basic model and see what mistakes it makes.
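Answer: D
Explanation: Get a basic model running first, then evaluate its performance and use the results to decide where to optimize next. See video 2.3, "Build your first system quickly, then iterate," for details.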
(2)Your goal is to detect road signs (stop sign, pedestrian crossing sign, construction ahead sign) and traffic signals (red and green lights) in images. The goal is to recognize which of these objects appear in each image. You plan to use a deep neural network with ReLU units in the hidden layers.
A softmax activation would be a good choice for the output layer because this is a multi-task learning problem. True/False?
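Answer: False
Explanation: Softmax outputs a single probability distribution over mutually exclusive classes, so it cannot flag several objects in the same image and is unsuitable for this multi-task learning problem. Independent sigmoid outputs, one per object type, fit better.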
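To make the difference concrete, here is a minimal sketch in plain Python. The logits are made-up values for an image that contains both a stop sign and a red light; they are an assumption for illustration, not output from any real model.

```python
import math

def softmax(logits):
    """Normalize logits into one probability distribution that sums to 1."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(logits):
    """Score each label independently; the outputs need not sum to 1."""
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]

# Hypothetical logits for: stop sign, pedestrian crossing, construction ahead,
# red light, green light.
logits = [4.0, -3.0, -3.0, 4.0, -3.0]

softmax_probs = softmax(logits)
sigmoid_probs = sigmoid(logits)

# With softmax, the two objects that are present must share the probability mass.
print([round(p, 3) for p in softmax_probs])
# With sigmoid, both objects can score close to 1 at the same time.
print([round(p, 3) for p in sigmoid_probs])
```

With softmax, neither present object can score much above 0.5, which is why per-label sigmoid units are the usual choice when several labels can be true at once.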
(3)You are carrying out error analysis and counting up what errors the algorithm makes. Which of these datasets do you think you should manually go through and carefully examine, one image at a time?
[A]500 images on which the algorithm made a mistake.
[B]500 randomly chosen images.
[C]10,000 images on which the algorithm made a mistake.
[D]10,000 randomly chosen images.
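Answer: A
Explanation: Look through the misclassified images to find what they have in common, then use that to improve the algorithm. Because inspecting images by hand is time-consuming, about 500 such images is a reasonable number.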
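Drawing that review sample could look like the sketch below. The helper name `sample_mistakes` and the toy labels are hypothetical; in practice `y_true` and `y_pred` would hold the dev-set labels and the model's predictions.

```python
import random

def sample_mistakes(y_true, y_pred, k=500, seed=0):
    """Return up to k indices of examples the model got wrong, chosen at random."""
    mistakes = [i for i, (t, p) in enumerate(zip(y_true, y_pred)) if t != p]
    rng = random.Random(seed)
    return rng.sample(mistakes, min(k, len(mistakes)))

# Toy data: 2,000 binary labels, with every third prediction flipped.
y_true = [0, 1] * 1000
y_pred = [1 - t if i % 3 == 0 else t for i, t in enumerate(y_true)]

to_review = sample_mistakes(y_true, y_pred, k=500)
print(len(to_review))
```

Each sampled index can then be opened by hand and tagged with a suspected cause (fog, raindrops, wrong label, and so on).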
(4)After working on the data for several weeks, your team ends up with the following data:
- 100,000 labeled images taken using the front-facing camera of your car.
- 900,000 labeled images of roads downloaded from the internet.
- Each image’s labels precisely indicate the presence of any specific road signs and traffic signals or combinations of them. For example
$y^{(i)} = \left( \begin{array}{l} 1\\ 0\\ 0\\ 1\\ 0 \end{array} \right)$ means the image contains a stop sign and a red traffic light.
Because this is a multi-task learning problem, you need to have all your $y^{(i)}$ vectors fully labeled. If one example is equal to $\left( \begin{array}{l} 0\\ ?\\ 1\\ 1\\ ? \end{array} \right)$ then the learning algorithm will not be able to use that example. True/False?
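Answer: False
Explanation: In multi-task learning, an example can still be used when only some of its labels are present: simply leave the ? entries out of the loss. See video 2.8, "Multi-task learning," for details.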
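A minimal sketch of such a masked loss follows, using `None` to stand for a "?" entry. The function name `masked_bce` and the predicted probabilities are assumptions made for illustration.

```python
import math

def masked_bce(y_true, y_prob):
    """Binary cross-entropy summed over labeled entries only; None marks a '?' label."""
    loss = 0.0
    n_labeled = 0
    for y, p in zip(y_true, y_prob):
        if y is None:  # unlabeled entry: contributes nothing to the loss
            continue
        loss += -(y * math.log(p) + (1 - y) * math.log(1 - p))
        n_labeled += 1
    return loss, n_labeled

# The partially labeled example from the question: (0, ?, 1, 1, ?).
y_partial = [0, None, 1, 1, None]
# Hypothetical model outputs, one probability per label.
y_prob = [0.1, 0.7, 0.8, 0.9, 0.2]

loss, n_labeled = masked_bce(y_partial, y_prob)
print(round(loss, 4), n_labeled)
```

Only the three labeled entries contribute, so the example still provides a training signal for those three tasks.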
(5)The distribution of data you care about contains images from your car’s front-facing camera, which come from a different distribution than the images you were able to find and download off the internet. How should you split the dataset into train/dev/test sets?
[A]Mix all the 100,000 images with the 900,000 images you found online. Shuffle everything. Split the 1,000,000 images dataset into 600,000 for the training set, 200,000 for the dev set and 200,000 for the test set.
[B]Choose the training set to be the 900,000 images from the internet along with 80,000 images from your car’s front-facing camera. The 20,000 remaining images will be split equally in dev and test sets.
[C]Mix all the 100,000 images with the 900,000 images you found online. Shuffle everything. Split the 1,000,000 images dataset into 980,000 for the training set, 10,000 for the dev set and 10,000 for the test set.
[D]Choose the training set to be the 900,000 images from the internet along with 20,000 images from your car’s front-facing camera. The 80,000 remaining images will be split equally in dev and test sets.
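Answer: B
Explanation: The dev and test sets should consist of the data you care most about. Options A and C mix everything together, so internet images dominate the dev and test sets, and are therefore wrong. 10,000 images each are enough for dev and test, so putting the remaining 80,000 car images into the training set (option B) should work better than option D.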
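The split in option B can be sketched with index lists. The integer IDs stand in for image files, and the seed is arbitrary; both are assumptions for illustration.

```python
import random

N_INTERNET = 900_000  # internet images: used for training only
N_CAR = 100_000       # front-facing camera images: the target distribution

rng = random.Random(0)
car_ids = list(range(N_CAR))
rng.shuffle(car_ids)

# Option B: 80,000 car images join the training set; the remaining 20,000
# are split equally between dev and test.
car_train = car_ids[:80_000]
dev = car_ids[80_000:90_000]
test = car_ids[90_000:]

train_size = N_INTERNET + len(car_train)
print(train_size, len(dev), len(test))
```

Dev and test contain only car images, so they measure performance on the distribution that matters, while the training set still benefits from all the internet data.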
(6)Assume you’ve finally chosen the following split of the data:
Dataset | Contains | Error of the algorithm |
---|---|---|
Training | 940,000 images randomly picked from (900,000 internet images + 60,000 car’s front-facing camera images) | 8.8% |
Training-Dev | 20,000 images randomly picked from (900,000 internet images + 60,000 car’s front-facing camera images) | 9.1% |
Dev | 20,000 images from your car’s front-facing camera | 14.3% |
Test | 20,000 images from your car’s front-facing camera | 14.8% |
You also know that human-level error on the road sign and traffic signals classification task is around 0.5%. Which of the following are True? (Check all that apply)
[A]Your algorithm overfits the dev set because the error of the dev and test sets are very close.
[B]You have a large variance problem because your model is not generalizing well to data from the same training distribution but that it has never seen before.
[C]You have a large data-mismatch problem because your model does a lot better on the training-dev set than on the dev set.
[D]You have a large avoidable-bias problem because your training error is quite a bit higher than the human-level error.
[E]You have a large variance problem because your training error is quite a bit higher than the human-level error.
Answer: C, D
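The answer follows from the gaps between consecutive error rates. The sketch below works through the arithmetic with the numbers from the table; the dictionary keys are just labels chosen for this example.

```python
# Error rates in percent, taken from the table and the question text.
errors = {"human": 0.5, "train": 8.8, "train_dev": 9.1, "dev": 14.3, "test": 14.8}

gaps = {
    "avoidable bias": errors["train"] - errors["human"],         # training vs. human level
    "variance": errors["train_dev"] - errors["train"],           # unseen data, same distribution
    "data mismatch": errors["dev"] - errors["train_dev"],        # different distribution
    "overfitting to dev set": errors["test"] - errors["dev"],    # dev vs. test
}

for name, gap in gaps.items():
    print(f"{name}: {gap:.1f}%")
```

Avoidable bias (8.3 points) and data mismatch (5.2 points) dominate, while variance (0.3) and the dev/test gap (0.5) are small, which matches options C and D.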
(7)Based on the table from the previous question, a friend thinks that the training data distribution is much easier than the dev/test distribution. What do you think?
[A]Your friend is right. (i.e., Bayes error for the training data distribution is probably lower than for the dev/test distribution)
[B]Your friend is wrong. (i.e., Bayes error for the training data distribution is probably higher than for the dev/test distribution)
[C]There’s insufficient information to tell if your friend is right or wrong.
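Answer: C
Explanation: If two datasets are equally hard, the training error will usually be lower than the dev error. If the test error were lower than the training error, the test set would be the easier one. If this is unclear, see 论智's answer on Zhihu to the question "In machine learning, how do you explain a test-set error that is lower than the training-set error?"
Here the model was trained mostly on the training distribution, so its lower error there may simply reflect that rather than an easier task. The relative difficulty of the training and dev distributions therefore cannot be determined from this table.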
(8)You decide to focus on the dev set and check by hand what the errors are due to. Here is a table summarizing your discoveries:
Error category | Fraction of dev set |
---|---|
Overall dev set error | 14.3% |
Errors due to incorrectly labeled data | 4.1% |
Errors due to foggy pictures | 8.0% |
Errors due to rain drops stuck on your car's front-facing camera | 2.2% |
Errors due to other causes | 1.0% |
In this table, 4.1%, 8.0%, etc. are a fraction of the total dev set (not just examples your algorithm mislabeled). I.e. about 8.0/14.3=56% of your errors are due to foggy pictures.
The results from this analysis imply that the team’s highest priority should be to bring more foggy pictures into the training set so as to address the 8.0% of errors in that category. True/False?
[A]True because it is the largest category of errors. As discussed in lecture, we should prioritize the largest category of error to avoid wasting the team’s time.
[B]True because it is greater than the other error categories added together (8.0 > 4.1+2.2+1.0).
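[C]False because this would depend on how easy it is to add this data and how much your team thinks it’ll help.
[D]False because data augmentation (synthesizing foggy images from clean/non-foggy images) is more efficient.
Answer: C
Explanation: Even though many of the errors come from foggy images, foggy data may be hard to obtain while data for the other categories is easy to get, in which case it can make sense to tackle those first.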
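The arithmetic behind the error-analysis table can be sketched as follows. Each category's percentage is both its share of the dev set and the most that fixing it completely could take off the dev error; the variable names are chosen for this example.

```python
overall_error = 14.3  # percent of the dev set

# Percent of the dev set attributed to each cause, from the table.
categories = {
    "incorrectly labeled data": 4.1,
    "foggy pictures": 8.0,
    "raindrops on the camera": 2.2,
    "other causes": 1.0,
}

# Share of all errors that each category accounts for.
shares = {name: pct / overall_error for name, pct in categories.items()}

for name, pct in categories.items():
    print(f"{name}: ceiling {pct:.1f} points, {shares[name]:.0%} of errors")
```

These ceilings say how much each fix could help at best; they say nothing about how costly each fix is, which is why the largest category is not automatically the top priority. The categories add up to 15.3%, slightly more than 14.3%, presumably because some images fall into more than one category.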
(9)You can buy a specially designed windshield wiper that helps wipe off some of the raindrops on the front-facing camera. Based on the table from the previous question, which of the following statements do you agree with?
[A] 2.2% would be a reasonable estimate of the maximum amount this windshield wiper could improve performance.
[B] 2.2% would be a reasonable estimate of the minimum amount this windshield wiper could improve performance.
[C] 2.2% would be a reasonable estimate of how much this windshield wiper will improve performance.
[D] 2.2% would be a reasonable estimate of how much this windshield wiper could worsen performance in the worst case.
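Answer: A
Explanation: Raindrops account for 2.2% of errors, but for other reasons the wiper will not necessarily eliminate all of them, so 2.2% is an upper bound on the improvement.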
(10)You decide to use data augmentation to address foggy images. You find 1,000 pictures of fog off the internet, and “add” them to clean images to synthesize foggy days, like this:
Which of the following statements do you agree with?
[A]So long as the synthesized fog looks realistic to the human eye, you can be confident that the synthesized data is accurately capturing the distribution of real foggy images (or a subset of it), since human vision is very accurate for the problem you’re solving.
[B]There is little risk of overfitting to the 1,000 pictures of fog so long as you are combining them with a much larger (>>1,000) set of clean/non-foggy images.
[C]Adding synthesized images that look like real foggy pictures taken from the front-facing camera of your car to training dataset won’t help the model improve because it will introduce avoidable-bias.
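Answer: A
Explanation: Using 1,000 fog pictures to synthesize far more than 1,000 images means the fog covers only a small subset of the whole space, which carries a significant risk of overfitting, so B is wrong. Adding foggy images to the training set improves the model's robustness, so C is wrong. Using 1,000 fog pictures to synthesize a small number of images carries no overfitting risk, because each fog picture is used only a few times, so A is correct.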
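The quiz does not say how the fog pictures are "added" to the clean images. One simple possibility, shown here purely as an assumption, is an alpha blend of the two images, using tiny grayscale images stored as nested lists.

```python
def blend_fog(clean, fog, alpha=0.5):
    """Alpha-blend a fog image over a clean image; pixels are floats in [0, 1]."""
    return [
        [(1 - alpha) * c + alpha * f for c, f in zip(clean_row, fog_row)]
        for clean_row, fog_row in zip(clean, fog)
    ]

# Toy 2x2 grayscale images (hypothetical pixel values).
clean = [[0.0, 0.2], [0.8, 1.0]]
fog = [[0.9, 0.9], [0.9, 0.9]]

foggy = blend_fog(clean, fog, alpha=0.5)
print(foggy)
```

The blended image keeps the scene's structure but with reduced contrast. Because every synthesized image reuses one of the same 1,000 fog pictures, how many clean images each fog picture is paired with matters for the overfitting risk discussed above.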
(11)After working further on the problem, you’ve decided to correct the incorrectly labeled data on the dev set. Which of these statements do you agree with? (Check all that apply).
[A]You should also correct the incorrectly labeled data in the test set, so that the dev and test sets continue to come from the same distribution.
[B]You should correct incorrectly labeled data in the training set as well so as to avoid your training set now being even more different from your dev set.
[C]You should not correct the incorrectly labeled data in the test set, so that the dev and test sets continue to come from the same distribution.
[D]You should not correct incorrectly labeled data in the training set as it is not worth the time.
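Answer: A, D
Explanation: The dev and test sets must come from the same distribution, so incorrect labels in the test set need to be corrected too. Deep learning algorithms are quite robust to random errors in the training set, a slightly different distribution between the training set and the dev/test sets is acceptable, and the training set is far larger than the dev and test sets, so correcting its labels is not worth the time.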
(12)So far your algorithm only recognizes red and green traffic lights. One of your colleagues in the startup is starting to work on recognizing a yellow traffic light. (Some countries call it an orange light rather than a yellow light; we’ll use the US convention of calling it yellow.) Images containing yellow lights are quite rare, and she doesn’t have enough data to build a good model. She hopes you can help her out using transfer learning.
What do you tell your colleague?
[A]She should try using weights pre-trained on your dataset, and fine-tuning further with the yellow-light dataset.
[B]If she has 10,000 images of yellow lights, randomly sample 10,000 images from your dataset and put your data and hers together. This prevents your dataset from “swamping” the yellow lights dataset.
[C]You cannot help her because the distribution of data you have is different from hers, and is also lacking the yellow label.
[D]Recommend that she try multi-task learning instead of transfer learning using all the data.
Answer: A
Keyword: transfer learning.
Explanation: You have trained your model on a huge dataset, and she has a small dataset. Although your labels are different, the parameters of your model have been trained to recognize many characteristics of road and traffic images, which will be useful for her problem. This is a perfect case for transfer learning: she can start with a model with the same architecture as yours, change what comes after the last hidden layer, and initialize the rest with your trained parameters.
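The weight-reuse step described in the explanation can be sketched without any deep learning library. The layer names, layer sizes, and the helpers `init_layer` and `build_transfer_model` are all hypothetical, and the "pretrained" weights here are random placeholders rather than trained values.

```python
import copy
import random

def init_layer(n_in, n_out, rng):
    """Create an n_out x n_in weight matrix with small random values."""
    return [[rng.gauss(0.0, 0.01) for _ in range(n_in)] for _ in range(n_out)]

rng = random.Random(0)

# Stand-in for your trained network: two hidden layers and a 5-label output layer.
pretrained = {
    "hidden1": init_layer(8, 4, rng),
    "hidden2": init_layer(4, 4, rng),
    "output": init_layer(4, 5, rng),
}

def build_transfer_model(pretrained, n_new_outputs, rng):
    """Copy every layer except the output layer, then attach a fresh output layer."""
    model = {name: copy.deepcopy(w) for name, w in pretrained.items() if name != "output"}
    n_in = len(pretrained["output"][0])  # width of the last hidden layer
    model["output"] = init_layer(n_in, n_new_outputs, rng)
    return model

# One output unit for the new "yellow light" label.
yellow_model = build_transfer_model(pretrained, n_new_outputs=1, rng=rng)
print({name: (len(w), len(w[0])) for name, w in yellow_model.items()})
```

Fine-tuning would then continue training `yellow_model` on the small yellow-light dataset, either updating all layers or only the new output layer.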
(13)Another colleague wants to use microphones placed outside the car to better hear if there are other vehicles around you. For example, if there is a police vehicle behind you, you would be able to hear its siren. However, they don’t have much data to train this audio system. How can you help?
[A]Transfer learning from your vision dataset could help your colleague get going faster. Multi-task learning seems significantly less promising.
[B]Multi-task learning from your vision dataset could help your colleague get going faster. Transfer learning seems significantly less promising.
[C]Either transfer learning or multi-task learning could help your colleague get going faster.
[D]Neither transfer learning nor multi-task learning seems promising.
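Answer: D
Explanation: The problem your colleague is trying to solve is an audio problem, which has nothing in common with images.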
(14)To recognize red and green lights, you have been using this approach:
- (A) Input an image (x) to a neural network and have it directly learn a mapping to make a prediction as to whether there’s a red light and/or green light (y).
A teammate proposes a different, two-step approach:
- (B) In this two-step approach, you would first (i) detect the traffic light in the image (if any), then (ii) determine the color of the illuminated lamp in the traffic light.
Between these two, Approach B is more of an end-to-end approach because it has distinct steps for the input end and the output end. True/False?
Answer: False
Explanation: Approach A is the end-to-end approach, because it has no intermediate steps: x is mapped directly to y.
(15)Approach A (in the question above) tends to be more promising than approach B if you have a _______ (fill in the blank).
[A]Large training set.
[B]Multi-task learning problem.
[C]Large bias problem.
[D]Problem with a high Bayes error.
Answer: A
Explanation: End-to-end deep learning needs a large amount of data to work well.