Understanding MPP (Massively Parallel Processing) in databases, by 专注雪碧 (posted on 靠谱客).

Overview

  • Overview

    While storage and computing power have come a long way in the last several decades, the unfortunate reality is that they haven’t kept up with modern data storage and analysis needs.

    MPP databases address this problem by distributing the required processing across several nodes, so that large datasets can be analyzed efficiently.

    MPP databases are usually columnar: values of the same column are stored together, so an analytical query only has to read the columns it actually uses and can be processed faster.
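
    To illustrate why a columnar layout helps analytical queries, here is a minimal sketch in Python. The table, its column names, and both helper functions are hypothetical and only model the two layouts in memory; a real MPP database does this on disk with compression and many other optimizations.

```python
# Hypothetical sales table, stored two ways (illustration only).

# Row-oriented layout: every record keeps all of its fields together.
rows = [
    {"order_id": 1, "region": "east", "amount": 120.0},
    {"order_id": 2, "region": "west", "amount": 80.0},
    {"order_id": 3, "region": "east", "amount": 50.0},
]

# Column-oriented layout: each column is stored as its own contiguous list.
columns = {
    "order_id": [1, 2, 3],
    "region": ["east", "west", "east"],
    "amount": [120.0, 80.0, 50.0],
}


def total_amount_row_store(table):
    """Walk every record, even though only one field is needed."""
    return sum(record["amount"] for record in table)


def total_amount_column_store(table):
    """Read just the 'amount' column and ignore the others entirely."""
    return sum(table["amount"])


row_total = total_amount_row_store(rows)
column_total = total_amount_column_store(columns)
```

    Both functions return the same total. The difference is how much data each one has to touch: the row version visits whole records, while the column version reads a single list.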

  • Massively Parallel

    Massively parallel is the term for using a large number of computer processors (or separate computers) to simultaneously perform a set of coordinated computations in parallel.

  • Massively Parallel Processing

    Massively Parallel Processing (MPP) is a processing architecture designed to handle the coordinated processing of program operations by multiple processors.

    This coordinated processing can work on different parts of a program, with each processor using its own operating system and memory. This allows MPP databases to handle massive amounts of data and provide much faster analytics based on large datasets.

    The intuition is simple: splitting a simple but large task into multiple buckets and processing those buckets at the same time is much faster than one person working alone, no matter how skilled that person is.

    Simply put, an MPP database is a type of database or data warehouse where the data and processing power are split up among several different nodes (servers), with one leader node and one or many compute nodes.

    MPP databases can scale horizontally by adding more compute nodes, rather than having to upgrade to more and more expensive individual servers (scaling vertically).
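
    One way to picture how data is split among nodes, and why adding nodes reduces the work per node, is hash distribution. The sketch below is an assumption for illustration: the text above does not say which distribution scheme a given MPP database uses, and `node_for_key`, `rows_per_node`, and the `customer-N` keys are hypothetical names.

```python
import zlib
from collections import Counter


def node_for_key(key: str, node_count: int) -> int:
    """Map a row key to a node index.

    crc32 is used because it is stable across runs, unlike Python's
    built-in hash() for strings.
    """
    return zlib.crc32(key.encode("utf-8")) % node_count


def rows_per_node(keys, node_count):
    """Count how many rows each node would hold for the given keys."""
    counts = Counter(node_for_key(key, node_count) for key in keys)
    return [counts.get(node, 0) for node in range(node_count)]


# Hypothetical row keys.
keys = [f"customer-{i}" for i in range(10_000)]

# Scaling horizontally: the same rows spread over more and more nodes.
for node_count in (2, 4, 8):
    load = rows_per_node(keys, node_count)
    print(f"{node_count} nodes -> largest share: {max(load)} rows")
```

    Note that a plain modulo scheme like this one moves most rows when the node count changes; production systems typically use more careful redistribution strategies.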

  • One Leader Node

    The leader node tells all the other nodes what to do and sorts out the final tally.

  • Many Compute Nodes

    The compute nodes deal with the data itself: they run the queries and, in a word-counting analogy, count up the words.
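
    The division of labor between the leader node and the compute nodes can be sketched as a small word count in Python. This is only a toy model under stated assumptions: threads in one process stand in for separate servers, and the names `compute_node_count_words`, `leader_word_count`, and `documents` are made up for this example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def compute_node_count_words(lines):
    """Work done by one compute node: count words in its own slice of data."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def leader_word_count(lines, node_count=3):
    """Leader node: split the data, dispatch the slices, merge the results."""
    # Round-robin split so that each node gets a similar share of the lines.
    slices = [lines[i::node_count] for i in range(node_count)]

    # Threads stand in for compute nodes here (an assumption of this sketch).
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partials = list(pool.map(compute_node_count_words, slices))

    # The leader sorts out the final tally by merging the partial counts.
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


documents = [
    "MPP splits work across nodes",
    "nodes work in parallel",
    "the leader merges partial results",
]
result = leader_word_count(documents)
```

    Each compute node only ever sees its own slice, and the leader only ever sees the small partial counts, which is what lets the approach scale to data that does not fit on a single server.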

  • Approaches to MPP

    There are several types of MPP database architecture, each with its own benefits:

    • Grid computing

      The processing power of many computers in distributed, diverse administrative domains is used opportunistically whenever a computer is available.

      Grid computing uses multiple computers in distributed networks and draws on their resources opportunistically, based on their availability. This reduces the cost of server space, but it also limits bandwidth and capacity at peak times or when there are too many requests.

    • Computer clustering

      Computer clustering links the available processing power into nodes that can connect with each other to handle multiple tasks at once.

  • Massively Parallel (Processor) at the hardware level of a single computer

    Graphics cards, containing multiple Graphics Processing Units (GPUs), are self-contained parallel computational devices that can be housed in conventional desktop and laptop computers. They can be thought of as prototypes of the next generation of many-core processors.

  • Summary

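    After this survey, MPP turns out to have two meanings: Massively Parallel Processing and Massively Parallel Processor.

    The former is an architecture for databases, at the level of application logic.

    The latter is an architecture at the level of computer hardware, built by combining many processors.

    Massively Parallel is the root term for both: it means processing a large amount of data in parallel, which is a form of horizontal scaling.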

