Overview
Automation of Data Science

With data science becoming more and more popular, companies are working out how many data scientists they need on a team to build a successful product or answer a business question. While focused on hiring data scientists, they have most likely noticed that, instead of hiring people to perform data science, they could pay for a platform or find other ways to employ data science at their company. Ultimately, data science can be automated, just like most technical processes, which is a bit of inception. The question then becomes: should it be automated, and how well does data science perform when it is automated by a tool or platform? I will discuss these questions below by highlighting the pros and cons of automated data science and/or machine learning.

Like most things in life, moderation is key, so eliminating your human data scientists and replacing them with a tool will probably lead to some chaos and confusion, at least at first. Just as an online platform can teach many people to become successful in an academic area, so can automated data science platforms. Data science can be learned by a human from a machine. But when you automate data science this early in the history of the field (yes, I know it is not as new a field as many people think), you can run into some serious problems. On the other hand, you can also run into some awesome pros.

There are pros and cons to everything, and automated data science is no exception.
I am not going to detail the specific tools or companies whose main product is data science automation, but you can expect some of these pros and cons to apply to some of those tools.

Pros

Easy to Use

The main function of automated data science platforms is to make it easier for users to implement data science in their business. Someone with a background in data analytics or product management could therefore expect to use a platform easily to, say, categorize images.

Cheaper

Hiring a data scientist can cost a company well over $100,000 in salary and onboarding costs, whereas an automated platform could cost significantly less than even one data scientist. It is important to note that some companies employ far more than one data scientist.

Powerful

Data science is widely known as a powerful tool in itself, one that can significantly impact a company or business. Data science and machine learning have led to countless products and have served nearly every human in some way. Used your phone today? Was it an iPhone? Did you use Face ID? Then you have probably already used machine learning without even realizing it (unless you are a data scientist and know it already). Maybe you used Netflix's recommendation algorithm, which suggested a show or movie. These are some examples of the everyday machine learning you will encounter. There are countless more, and a company can truly benefit from the power of data science in its business, whether internally or externally.

Cons

I am going to highlight the cons next, as I believe they are more important and outweigh the pros (as of now; this could change quickly).

Hard to Explain

The cons are where it gets tricky. These issues can seriously hurt a company when a user operates the platform incorrectly or misinterprets the model and its results. It can be hard to explain the results of a complicated data science model. Now imagine you are not a data scientist and have no academic background in the various types of machine learning algorithms. You will have to explain the platform's model results and (sometimes) implement its suggestions or predictions within your company's integrations, which could prove time-consuming and difficult.

Misleading Results

Since you did not build the model yourself, you may be unaware of parameters that need to be tuned. You might also not know that you need an elbow plot to find the optimal number of clusters for an unsupervised segmentation algorithm. All of these complications of not understanding the model from scratch could lead to results that do not make much sense. Perhaps you used logistic regression to predict the temperature for the next few months, only to realize later that, despite its misleading name, the algorithm is best used as a classification model. These small nuances can add up and lead to some serious mistakes.
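
To make the elbow-plot point concrete, here is a minimal, dependency-free sketch of the idea: run k-means for several values of k, record the within-cluster sum of squares (often called inertia), and look for the k where the curve bends. The toy data, the helper names (`farthest_point_init`, `kmeans_inertia`, `elbow_by_second_difference`), and the second-difference heuristic are all assumptions made for illustration; an automated platform may choose k in a different way, and in practice you would more likely rely on an established library and inspect the plot yourself.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points, k):
    """Deterministic seeding (an illustrative choice): start at the first
    point, then repeatedly add the point farthest from the chosen centroids."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    return centroids


def kmeans_inertia(points, k, iterations=20):
    """Run plain Lloyd-style k-means and return the final inertia
    (sum of squared distances from each point to its nearest centroid)."""
    centroids = farthest_point_init(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: squared_distance(p, centroids[i]),
            )
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster
            else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return sum(min(squared_distance(p, c) for c in centroids) for p in points)


def elbow_by_second_difference(inertias):
    """Hypothetical heuristic: pick the k where the inertia curve bends most,
    measured by the discrete second difference. Expects consecutive k values."""
    ks = sorted(inertias)
    interior = ks[1:-1]  # the bend cannot be measured at the endpoints
    return max(
        interior,
        key=lambda k: inertias[k - 1] - 2 * inertias[k] + inertias[k + 1],
    )


# Toy data (made up): three tight, well-separated groups of customers.
points = [
    (1.0, 1.0), (1.2, 0.8), (0.8, 1.2),
    (5.0, 5.0), (5.2, 4.8), (4.8, 5.2),
    (9.0, 1.0), (9.2, 0.8), (8.8, 1.2),
]

inertias = {k: kmeans_inertia(points, k) for k in range(1, 6)}
for k, value in inertias.items():
    print(f"k={k}: inertia={value:.2f}")
print("suggested k:", elbow_by_second_difference(inertias))
```

On this toy data the inertia drops sharply up to k = 3 and barely changes afterwards, which is the "elbow". Real data is rarely this clean, which is exactly why someone who understands the method should look at the curve rather than trust a single automatically chosen number.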

Ultimately, whether data science will be completely automated depends on the use case. Sure, use an automated data science platform if you already have a data analyst on your team. Or use the automated solution for predictions that do no harm if they are wrong. Categorizing clothes incorrectly is not the worst thing that can happen, but when you are in the health or finance industry and you misclassify a disease or large sums of money, the harm is undeniable.

Figure out what kind of company you are and what your goals are, weigh the pros and cons, and from there decide whether automated data science is right for you. That being said, data science is already being automated to a degree, and in the future it will face platforms that try to automate the entire process.

I hope this article sparks some interesting discussion. Of course, I am biased and prefer to keep data scientists around; however, I know how much of data science is already automated simply by importing popular, prebuilt libraries. The solution may be the human-in-the-loop method: automate what you can, and then provide checks and balances to account for model error.

Feel free to comment down below. Thank you for reading!

Translated from: https://towardsdatascience.com/will-data-science-become-automated-407f32270de6
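
The human-in-the-loop idea from the summary can be sketched in a few lines: accept predictions automatically only when the model is confident, and send everything else to a person. This is a minimal illustration under stated assumptions, not how any particular platform works; the `Prediction` class, the `route_predictions` function, the example items, and the 0.9 threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    item_id: str
    label: str
    confidence: float  # model's estimated probability for `label`, in [0, 1]


def route_predictions(predictions, threshold=0.9):
    """Split predictions into those accepted automatically and those
    queued for human review, based on a confidence threshold."""
    auto_accepted, needs_review = [], []
    for p in predictions:
        if p.confidence >= threshold:
            auto_accepted.append(p)
        else:
            needs_review.append(p)
    return auto_accepted, needs_review


# Made-up example predictions from an automated platform.
predictions = [
    Prediction("shirt-001", "t-shirt", 0.97),
    Prediction("scan-042", "benign", 0.71),
    Prediction("loan-317", "approve", 0.88),
]

accepted, review_queue = route_predictions(predictions, threshold=0.9)
print("auto-accepted:", [p.item_id for p in accepted])
print("human review:", [p.item_id for p in review_queue])
```

A sensible threshold depends on the cost of a mistake: a clothing category can tolerate a low bar, while a medical or financial decision might warrant sending every prediction to a reviewer regardless of confidence.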