Fundamentals of Big Data Technology and Application, Chapters 11 and 12: Event-Stream OLAP with Druid and the Event Data-Stream Engine Flink
How Druid Works

When discussing how Druid works, the essential point is how it handles large volumes of data for real-time analytics. Druid is a high-performance, column-oriented, distributed data store designed for OLAP queries over time-series data.
From an architectural perspective, a Druid cluster consists of coordinator nodes, historical nodes, broker nodes, and real-time nodes. The coordinator maintains the global state of the cluster; historical nodes serve queries over data that has been persisted to deep storage; broker nodes receive queries from clients and route them to the appropriate nodes in the cluster; and real-time nodes ingest and index new data as it arrives.
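A quick way to see these node roles on a running quickstart deployment is to probe each process's HTTP status endpoint. This is only a sketch: the ports below (8081 coordinator, 8082 broker, 8090 overlord) are the usual quickstart defaults assumed in this material and may differ in your configuration.
# Probe the node types of a local quickstart cluster (ports are assumptions)
curl -s http://localhost:8081/status   # coordinator
curl -s http://localhost:8082/status   # broker
curl -s http://localhost:8090/status   # overlord / indexing service
Each call returns a small JSON document describing the process, which confirms that the corresponding role is up.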
Common Uses of Druid

Druid is an open-source data storage and analytics system for processing large volumes of real-time data. It can store, query, and analyze large-scale data sets, including log data, metrics, and event data. Common uses of Druid include:
1. Data storage: Druid stores large amounts of data in a distributed environment, supports fast ingestion and querying, and offers high fault tolerance and scalability.
2. Data querying: Druid can quickly query and filter large data sets, with support for an expressive query language and aggregation functions such as GROUP BY, SUM, and AVG (a sample query is sketched at the end of this section).
3. Real-time processing: Druid ingests and analyzes data streams as they arrive, supports real-time aggregation and filtering, and can respond quickly to business needs.
4. Data visualization: Druid integrates with a rich set of visualization tools and plugins that present data as charts and dashboards, helping users understand and analyze their data.
5. Data integration: Druid integrates with other storage and processing systems such as Hadoop, Kafka, and MySQL, allowing flexible data pipelines.
Taken together, Druid is a powerful and flexible storage and analytics system that helps organizations process and analyze large volumes of real-time data quickly, improving the efficiency and accuracy of data processing and decision making.
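To make item 2 concrete, the sketch below posts a minimal native groupBy query to a Druid broker with curl. The data source name wikipedia, the column names page and added, and the broker address localhost:8082 are illustrative assumptions; substitute your own.
curl -X POST -H 'Content-Type: application/json' http://localhost:8082/druid/v2/?pretty -d '{
  "queryType": "groupBy",
  "dataSource": "wikipedia",
  "granularity": "day",
  "dimensions": ["page"],
  "aggregations": [
    { "type": "count", "name": "edits" },
    { "type": "longSum", "name": "added", "fieldName": "added" }
  ],
  "intervals": ["2013-01-01/2013-01-08"]
}'
This counts edits and sums the added metric per page per day over one week, the native-query equivalent of a SQL GROUP BY with COUNT and SUM.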
Druid, a (Near) Real-Time Analytics Database: Columnar Storage and Efficient Compression

Druid is an open-source, distributed, column-store system that is particularly well suited to (near) real-time analytics over big data, and it is highly available. It is relatively lightweight, very well documented, and easy to get started with.
Druid vs. other systems
Druid vs. Impala/Shark: the comparison largely comes down to what kind of system you need to build. Druid is designed for (1) an always-on service, (2) ingesting data in real time, and (3) handling slice-and-dice ad-hoc queries.
Query speed: Druid is a column store whose data is compressed into its index structures. Compression increases how much data fits in RAM, so more data can be served from fast memory, and the index structure means that when filters are added to a query, Druid does less work and answers faster. Impala/Shark can be thought of as a caching layer on top of HDFS; beyond caching, they do not fundamentally improve query speed.
Data ingestion: Druid can ingest data in real time. Impala/Shark sit on HDFS or another backing store, which limits how quickly new data becomes available.
Query model: Druid supports time-series and groupBy-style queries but does not support joins. Impala/Shark support SQL-style queries.
Druid vs. Elasticsearch: Elasticsearch (ES) is a search server based on Apache Lucene. It provides full-text search and access to raw, event-level data, and it also offers analytics and aggregation support. Published comparisons suggest that ES spends more resources on ingestion and aggregation than Druid does. Druid focuses on OLAP workflows: it is optimized for high performance (fast aggregation and ingestion) at lower cost and supports a wide range of analytical operations, while offering only basic search over structured event data.
Segment: an important unit of data in Druid is the segment, which Druid generates from raw data (in batch or in real time) using bitmap indexing.
Answers to the Exercises in Lin Ziyu's Big Data: Principles and Technologies (Complete)

Answers to the end-of-chapter exercises in Lin Ziyu's textbook, covering: Chapter 1, Overview of Big Data; Chapter 2, the Hadoop Processing Architecture; Chapter 3, the Hadoop Distributed File System; Chapter 4, the Distributed Database HBase; Chapter 5, NoSQL Databases; Chapter 6, Cloud Databases; Chapter 7, MapReduce; Chapter 8, Stream Computing; Chapter 9, Graph Computing; Chapter 10, Data Visualization.
Chapter 1: Overview of Big Data
1. Describe the three waves of informatization in the history of information technology and what each involved.
The first wave, around 1980: personal computers became widespread, bringing computing into businesses and households. Representative companies: Intel, AMD, IBM, Apple, Microsoft, Lenovo, Dell, HP, and others.
The second wave, around 1995: the Internet era began. Representative companies: Yahoo, Google, Alibaba, Baidu, Tencent.
The third wave, around 2010: cloud computing, big data, and the Internet of Things developed rapidly, and a new group of market-leading companies emerged.
2. Describe the stages that data generation has gone through.
Three stages: the operational-system stage, in which data is produced as a by-product of business operations and recorded in databases; the user-generated-content stage of the Web 2.0 era; and the sensing-system stage, in which IoT devices automatically produce large volumes of data at all times.
3. Describe the four basic characteristics of big data.
Large data volume (Volume), many data types (Variety), high processing speed (Velocity), and low value density (Value).
4. Describe the "data explosion" of the big data era.
The Moore's law of big data: the data produced by human society has been growing at roughly 50% per year, that is, doubling about every two years.
5. What four paradigms has scientific research gone through?
Experiment, such as Galileo's Leaning Tower of Pisa experiment; theory, which uses mathematics, geometry, physics, and other theory to build models and solutions, for example Newton's three laws of motion; computation, which designs algorithms and runs the corresponding programs on computers; and data, which is data-centric, discovering and solving problems from the data itself.
6. Describe the important influence of big data on ways of thinking.
Whole samples rather than sampling; efficiency rather than exactness; correlation rather than causation.
7. How does big-data decision making differ from traditional decision making based on data warehouses?
Data warehouses are built on relational databases and are therefore significantly limited in the data types and data volumes they can handle.
Druid: A Powerful Tool for Real-Time Big Data Analytics

Introducing Druid
• Developed at Metamarkets in 2011, open-sourced in 2012 • Initially used for advertising and programmatic analytics • 150+ contributors • Typical deployments:
• 30 billion events per day (iPinYou) • 1 billion events per minute (Jolata) • User-behavior analytics (Toutiao) • Real-time ad analytics (Xiaomi) • Performance monitoring and analysis (OneAPM) • and more
(Source: Druid.io)
SQL interfaces for Druid: SQL4D (srikalyc), PlyQL (imply.io), Calcite (Apache)
Extended data sources: Kafka, HDFS, Storm, S3
Druid in industry: analytics for a programmatic advertising platform (from the book 《Druid实时大数据分析》, Real-Time Big Data Analysis with Druid)
Druid in practice: user-behavior analytics at Toutiao (from the 4th China Druid User Group meetup)
The evolution of data analytics
Druid architecture
Druid's LSM-tree-like design
Some advanced Druid features
• Approximate histograms and quantiles • Sketch-based estimation (Data Sketches) • Geospatial indexing and queries • Router nodes • The Kafka indexing service
The Druid data-analytics ecosystem
Analytics platforms: Imply (imply.io)
Agenda
• About iPinYou • The flourishing landscape of big data analytics • History and development • Architecture • Technical strengths • Applications • Other analytics tools
iPinYou: China's leader in programmatic marketing
59.8% market share in brand programmatic advertising; a leading independent third-party ad-tech company known for innovation and execution
The iPinYou big data platform: 1.5 PB of data in total, processing about 1 PB of data per day
A Brief Introduction to Druid

What is Druid? Druid is an open-source big data system designed for OLAP query workloads. It provides low-latency data ingestion and real-time querying. Druid is written in Java, serves HTTP REST endpoints via Jetty, and offers client libraries for languages such as Java and Python. It runs as a cluster and uses ZooKeeper for node management and event monitoring.
Characteristics of Druid:
- Druid is organized around time series: data is stored in batches partitioned by time, which makes it a natural fit for time-based statistical analysis.
- Druid divides columns into three kinds: the timestamp, dimension columns, and metric columns.
- Druid does not support multi-table joins.
- Data in Druid is usually low-level aggregates that have been pre-computed by another framework, such as Spark.
- For the query types it is designed for, Druid can filter and aggregate across billions of rows with only sub-second latency.
- Druid scales horizontally: the more query nodes there are, the larger the data volume it can serve and the faster the responses.
- The query types Druid handles best are relatively simple; its query model is constrained, and some common SQL-style operations (such as groupBy) run only moderately fast.
- Druid supports low-latency ingestion, and data is queryable in real time, but it does not support row-level updates.
Why Druid is fast:
- On ingestion, data is partitioned by time into segments, and Druid can roll data up along the time axis with low latency, so time-based aggregation is very efficient.
- Data is stored by column, and every dimension column is indexed, so filtering on column values is very efficient.
- The Broker and Historical nodes used for querying support multiple levels of caching; each segment is scanned by its own thread, queries run concurrently across threads within a Historical node and across Historical processes, and the Broker merges the partial results.
High availability:
1. Metadata store down: new segments cannot be discovered, but existing data is unaffected.
2. Indexing service down: new tasks cannot run and new data cannot be ingested, but queries are unaffected.
3. A Broker down: that Broker cannot serve queries, but other Brokers continue to serve and ingestion is unaffected.
4. A Historical node down: the Coordinator reassigns its segments to other nodes.
5. Coordinator down: segments are neither loaded nor dropped until a new leader is elected.
6. ZooKeeper down: new tasks cannot run and new data cannot be ingested; Brokers can still answer from their caches.
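To illustrate the three kinds of columns, here is a trimmed-down dataSchema of the sort used in a Druid ingestion spec. Only the overall shape (timestampSpec, dimensionsSpec, metricsSpec, granularitySpec) follows Druid's format; the data source name pageviews and all column names are hypothetical, and ioConfig/tuningConfig are omitted.
{
  "dataSchema": {
    "dataSource": "pageviews",
    "parser": {
      "type": "string",
      "parseSpec": {
        "format": "json",
        "timestampSpec": { "column": "time", "format": "auto" },
        "dimensionsSpec": { "dimensions": ["url", "user", "city"] }
      }
    },
    "metricsSpec": [
      { "type": "count", "name": "views" },
      { "type": "longSum", "name": "bytes", "fieldName": "bytes" }
    ],
    "granularitySpec": { "segmentGranularity": "day", "queryGranularity": "none" }
  }
}
The timestamp column drives segment partitioning and roll-up, the dimension columns are indexed for filtering and grouping, and the metric columns are pre-aggregated according to metricsSpec.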
Big Data Fundamentals Paper Library 074: Druid, A Real-Time Analytical Data Store (SIGMOD 2014)

DruidA Real-time Analytical Data StoreFangjin Y angMetamarkets Group,Inc. fangjin@Eric Tschetterecheddar@Xavier LéautéMetamarkets Group,Inc.xavier@Nelson Ray ncray86@Gian MerlinoMetamarkets Group,Inc.gian@Deep GanguliMetamarkets Group,Inc.deep@ABSTRACTDruid is an open source1data store designed for real-time exploratory analytics on large data sets.The system combines a column-oriented storage layout,a distributed,shared-nothing architecture,and an advanced indexing structure to allow for the arbitrary exploration of billion-row tables with sub-second latencies.In this paper,we describe Druid’s architecture,and detail how it supports fast aggre-gations,flexiblefilters,and low latency data ingestion.Categories and Subject DescriptorsH.2.4[Database Management]:Systems—Distributed databasesKeywordsdistributed;real-time;fault-tolerant;highly available;open source; analytics;column-oriented;OLAP1.INTRODUCTIONIn recent years,the proliferation of internet technology has cre-ated a surge in machine-generated events.Individually,these events contain minimal useful information and are of low value.Given the time and resources required to extract meaning from large collec-tions of events,many companies were willing to discard this data in-stead.Although infrastructure has been built to handle event-based data(e.g.IBM’s Netezza[37],HP’s Vertica[5],and EMC’s Green-plum[29]),they are largely sold at high price points and are only targeted towards those companies who can afford the offering.A few years ago,Google introduced MapReduce[11]as their mechanism of leveraging commodity hardware to index the inter-net and analyze logs.The Hadoop[36]project soon followed and was largely patterned after the insights that came out of the original MapReduce paper.Hadoop is currently deployed in many orga-nizations to store and analyze large amounts of log data.Hadoop has contributed much to helping companies convert their low-value 1http://druid.io/https:///metamx/druidPermission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on thefirst page.Copyrights for components of this work owned by others than the author(s)must be honored.Abstracting with credit is permitted.To copy otherwise,or republish,to post on servers or to redistribute to lists,requires prior specific permission and/or a fee.Request permissions from permissions@.SIGMOD’14,June22–27,2014,Snowbird,UT,USA.Copyright is held by the owner/author(s).Publication rights licensed to ACM.ACM978-1-4503-2376-5/14/06...$15.00./10.1145/2588555.2595631.event streams into high-value aggregates for a variety of applica-tions such as business intelligence and A-B testing.As with many great systems,Hadoop has opened our eyes to a new space of problems.Specifically,Hadoop excels at storing and providing access to large amounts of data,however,it does not make any performance guarantees around how quickly that data can be accessed.Furthermore,although Hadoop is a highly available system,performance degrades under heavy concurrent stly, while Hadoop works well for storing data,it is not optimized for in-gesting data and making that data immediately readable.Early on in the development of the Metamarkets product,we ran into each of these issues and came to the realization that Hadoop is a great back-office,batch processing,and data warehousing sys-tem.However,as a company 
that has product-level guarantees around query performance and data availability in a highly concur-rent environment(1000+users),Hadoop wasn’t going to meet our needs.We explored different solutions in the space,and after trying both Relational Database Management Systems and NoSQL archi-tectures,we came to the conclusion that there was nothing in the open source world that could be fully leveraged for our require-ments.We ended up creating Druid,an open source,distributed, column-oriented,real-time analytical data store.In many ways, Druid shares similarities with other OLAP systems[30,35,22],in-teractive query systems[28],main-memory databases[14],as well as widely known distributed data stores[7,12,23].The distribution and query model also borrow ideas from current generation search infrastructure[25,3,4].This paper describes the architecture of Druid,explores the vari-ous design decisions made in creating an always-on production sys-tem that powers a hosted service,and attempts to help inform any-one who faces a similar problem about a potential method of solving it.Druid is deployed in production at several technology compa-nies2.The structure of the paper is as follows:wefirst describe the problem in Section2.Next,we detail system architecture from the point of view of how dataflows through the system in Section 3.We then discuss how and why data gets converted into a binary format in Section4.We briefly describe the query API in Section5 and present performance results in stly,we leave off with our lessons from running Druid in production in Section7,and related work in Section8.2.PROBLEM DEFINITIONDruid was originally designed to solve problems around ingest-ing and exploring large quantities of transactional events(log data). This form of timeseries data is commonly found in OLAP work-2http://druid.io/druid.htmlTimestamp Page Username Gender City Characters Added Characters Removed 2011-01-01T01:00:00Z Justin Bieber Boxer Male San Francisco1800252011-01-01T01:00:00Z Justin Bieber Reach Male Waterloo2912422011-01-01T02:00:00Z Ke$ha Helz Male Calgary1953172011-01-01T02:00:00Z Ke$ha Xeno Male Taiyuan3194170Table1:Sample Druid data for edits that have occurred on Wikipedia.flows and the nature of the data tends to be very append heavy.For example,consider the data shown in Table1.Table1contains data for edits that have occurred on Wikipedia.Each time a user edits a page in Wikipedia,an event is generated that contains metadata about the edit.This metadata is comprised of3distinct compo-nents.First,there is a timestamp column indicating when the edit was made.Next,there are a set dimension columns indicating var-ious attributes about the edit such as the page that was edited,the user who made the edit,and the location of the user.Finally,there are a set of metric columns that contain values(usually numeric) that can be aggregated,such as the number of characters added or removed in an edit.Our goal is to rapidly compute drill-downs and aggregates over this data.We want to answer questions like“How many edits were made on the page Justin Bieber from males in San Francisco?”and “What is the average number of characters that were added by peo-ple from Calgary over the span of a month?”.We also want queries over any arbitrary combination of dimensions to return with sub-second latencies.The need for Druid was facilitated by the fact that existing open source Relational Database Management Systems(RDBMS)and NoSQL key/value stores were unable to provide a low latency data ingestion and query 
platform for interactive applications[40].In the early days of Metamarkets,we were focused on building a hosted dashboard that would allow users to arbitrarily explore and visualize event streams.The data store powering the dashboard needed to return queries fast enough that the data visualizations built on top of it could provide users with an interactive experience.In addition to the query latency needs,the system had to be multi-tenant and highly available.The Metamarkets product is used in a highly concurrent environment.Downtime is costly and many busi-nesses cannot afford to wait if a system is unavailable in the face of software upgrades or network failure.Downtime for startups,who often lack proper internal operations management,can determine business success or failure.Finally,another challenge that Metamarkets faced in its early days was to allow users and alerting systems to be able to make business decisions in“real-time”.The time from when an event is created to when that event is queryable determines how fast inter-ested parties are able to react to potentially catastrophic situations in their systems.Popular open source data warehousing systems such as Hadoop were unable to provide the sub-second data ingestion latencies we required.The problems of data exploration,ingestion,and availability span multiple industries.Since Druid was open sourced in October2012, it been deployed as a video,network monitoring,operations mon-itoring,and online advertising analytics platform at multiple com-panies.3.ARCHITECTUREA Druid cluster consists of different types of nodes and each node type is designed to perform a specific set of things.We believe this design separates concerns and simplifies the complexity of the overall system.The different node types operate fairly independent of each other and there is minimal interaction among them.Hence, intra-cluster communication failures have minimal impact on data availability.To solve complex data analysis problems,the different node types come together to form a fully working system.The name Druid comes from the Druid class in many role-playing games:it is a shape-shifter,capable of taking on many different forms to fulfill various different roles in a group.The composition of andflow of data in a Druid cluster are shown in Figure1.3.1Real-time NodesReal-time nodes encapsulate the functionality to ingest and query event streams.Events indexed via these nodes are immediately available for querying.The nodes are only concerned with events for some small time range and periodically hand off immutable batches of events they have collected over this small time range to other nodes in the Druid cluster that are specialized in dealing with batches of immutable events.Real-time nodes leverage Zookeeper [19]for coordination with the rest of the Druid cluster.The nodes announce their online state and the data they serve in Zookeeper. 
Real-time nodes maintain an in-memory index buffer for all in-coming events.These indexes are incrementally populated as events are ingested and the indexes are also directly queryable.Druid be-haves as a row store for queries on events that exist in this JVM heap-based buffer.To avoid heap overflow problems,real-time nodes persist their in-memory indexes to disk either periodically or after some maximum row limit is reached.This persist process converts data stored in the in-memory buffer to a column oriented storage format described in Section4.Each persisted index is im-mutable and real-time nodes load persisted indexes into off-heap memory such that they can still be queried.This process is de-scribed in detail in[33]and is illustrated in Figure2.On a periodic basis,each real-time node will schedule a back-ground task that searches for all locally persisted indexes.The task merges these indexes together and builds an immutable block of data that contains all the events that have been ingested by a real-time node for some span of time.We refer to this block of data as a“segment”.During the handoff stage,a real-time node uploads this segment to a permanent backup storage,typically a distributed file system such as S3[12]or HDFS[36],which Druid refers to as “deep storage”.The ingest,persist,merge,and handoff steps are fluid;there is no data loss during any of the processes.Figure3illustrates the operations of a real-time node.The node starts at13:37and will only accept events for the current hour or the next hour.When events are ingested,the node announces that it is serving a segment of data for an interval from13:00to14:00.Every 10minutes(the persist period is configurable),the node willflush and persist its in-memory buffer to disk.Near the end of the hour, the node will likely see events for14:00to15:00.When this occurs, the node prepares to serve data for the next hour and creates a new in-memory index.The node then announces that it is also serving a segment from14:00to15:00.The node does not immediately merge persisted indexes from13:00to14:00,instead it waits for a configurable window period for straggling events from13:00toimmutable segment and hands the segment off.Once this segment is loaded and queryable somewhere else in the Druid cluster,the real-time nodeflushes all information about the data it collected for 13:00to14:00and unannounces it is serving this data.3.1.1Availability and ScalabilityReal-time nodes are a consumer of data and require a correspond-ing producer to provide the data monly,for data dura-bility purposes,a message bus such as Kafka[21]sits between the producer and the real-time node as shown in Figure4.Real-time nodes ingest data by reading events from the message bus.The time from event creation to event consumption is ordinarily on the order of hundreds of milliseconds.The purpose of the message bus in Figure4is two-fold.First,the message bus acts as a buffer for incoming events.A message bus such as Kafka maintains positional offsets indicating how far a con-sumer(a real-time node)has read in an event stream.Consumers can programmatically update these offsets.Real-time nodes update the nodes.The nodes have no knowledge of one another and are operationally simple;they only know how to load,drop,and serve immutable segments.Similar to real-time nodes,historical nodes announce their on-line state and the data they are serving in Zookeeper.Instructions to load and drop segments are sent over Zookeeper and contain infor-mation about where the segment is 
located in deep storage and how to decompress and process the segment.Before a historical node downloads a particular segment from deep storage,itfirst checks a local cache that maintains information about what segments already exist on the node.If information about a segment is not present in the cache,the historical node will proceed to download the segment from deep storage.This process is shown in Figure5.Once pro-cessing is complete,the segment is announced in Zookeeper.At this point,the segment is queryable.The local cache also allows for historical nodes to be quickly updated and restarted.On startup, the node examines its cache and immediately serves whatever data itfinds.Figure5:Historical nodes download immutable segments from deep storage.Segments must be loaded in memory before they can be queried.data,however,because the queries are served over HTTP,histori-cal nodes are still able to respond to query requests for the data they are currently serving.This means that Zookeeper outages do not impact current data availability on historical nodes.3.3Broker NodesBroker nodes act as query routers to historical and real-time nodes. Broker nodes understand the metadata published in Zookeeper about what segments are queryable and where those segments are located. Broker nodes route incoming queries such that the queries hit the right historical or real-time nodes.Broker nodes also merge partial results from historical and real-time nodes before returning afinal consolidated result to the caller.3.3.1CachingBroker nodes contain a cache with a LRU[31,20]invalidation strategy.The cache can use local heap memory or an external dis-tributed key/value store such as Memcached[16].Each time a bro-ker node receives a query,itfirst maps the query to a set of seg-ments.Results for certain segments may already exist in the cache and there is no need to recompute them.For any results that do not exist in the cache,the broker node will forward the query to the correct historical and real-time nodes.Once historical nodes return their results,the broker will cache these results on a per segment ba-sis for future use.This process is illustrated in Figure6.Real-time data is never cached and hence requests for real-time data will al-ways be forwarded to real-time nodes.Real-time data is perpetually changing and caching the results is unreliable.The cache also acts as an additional level of data durability.In the event that all historical nodes fail,it is still possible to query results if those results already exist in the cache.3.3.2AvailabilityIn the event of a total Zookeeper outage,data is still queryable. 
If broker nodes are unable to communicate to Zookeeper,they use their last known view of the cluster and continue to forward queries to real-time and historical nodes.Broker nodes make the assump-tion that the structure of the cluster is the same as it was before the outage.In practice,this availability model has allowed our Druid cluster to continue serving queries for a significant period of time while we diagnosed Zookeeper outages.3.4Coordinator NodesDruid coordinator nodes are primarily in charge of data manage-ment and distribution on historical nodes.The coordinator nodes tell historical nodes to load new data,drop outdated data,replicate data,and move data to load balance.Druid uses a multi-version concurrency control swapping protocol for managing immutable segments in order to maintain stable views.If any immutable seg-ment contains data that is wholly obsoleted by newer segments,the outdated segment is dropped from the cluster.Coordinator nodes undergo a leader-election process that determines a single node that runs the coordinator functionality.The remaining coordinator nodes act as redundant backups.A coordinator node runs periodically to determine the current state of the cluster.It makes decisions by comparing the expected state of the cluster with the actual state of the cluster at the time of the run.As with all Druid nodes,coordinator nodes maintain a Zookeeper connection for current cluster information.Coordinator nodes also maintain a connection to a MySQL database that con-tains additional operational parameters and configurations.One of the key pieces of information located in the MySQL database is a table that contains a list of all segments that should be served by historical nodes.This table can be updated by any service that cre-ates segments,for example,real-time nodes.The MySQL database also contains a rule table that governs how segments are created, destroyed,and replicated in the cluster.3.4.1RulesRules govern how historical segments are loaded and dropped from the cluster.Rules indicate how segments should be assigned to different historical node tiers and how many replicates of a segment should exist in each tier.Rules may also indicate when segments should be dropped entirely from the cluster.Rules are usually set for a period of time.For example,a user may use rules to load the most recent one month’s worth of segments into a“hot”cluster,the most recent one year’s worth of segments into a“cold”cluster,and drop any segments that are older.The coordinator nodes load a set of rules from a rule table in the MySQL database.Rules may be specific to a certain data source and/or a default set of rules may be configured.The coordinator node will cycle through all available segments and match each seg-ment with thefirst rule that applies to it.3.4.2Load BalancingIn a typical production environment,queries often hit dozens or even hundreds of segments.Since each historical node has limited resources,segments must be distributed among the cluster to en-sure that the cluster load is not too imbalanced.Determining opti-mal load distribution requires some knowledge about query patterns and speeds.Typically,queries cover recent segments spanning con-tiguous time intervals for a single data source.On average,queries that access smaller segments are faster.These query patterns suggest replicating recent historical seg-ments at a higher rate,spreading out large segments that are close in time to different historical nodes,and co-locating segments from different data 
sources.To optimally distribute and balance seg-ments among the cluster,we developed a cost-based optimization procedure that takes into account the segment data source,recency, and size.The exact details of the algorithm are beyond the scope of this paper and may be discussed in future literature.3.4.3ReplicationCoordinator nodes may tell different historical nodes to load a copy of the same segment.The number of replicates in each tier of the historical compute cluster is fully configurable.Setups that require high levels of fault tolerance can be configured to have a high number of replicas.Replicated segments are treated the same as the originals and follow the same load distribution algorithm.By replicating segments,single historical node failures are transparent in the Druid cluster.We use this property for software upgrades. We can seamlessly take a historical node offline,update it,bring it back up,and repeat the process for every historical node in a cluster. Over the last two years,we have never taken downtime in our Druid cluster for software upgrades.3.4.4AvailabilityDruid coordinator nodes have Zookeeper and MySQL as external dependencies.Coordinator nodes rely on Zookeeper to determine what historical nodes already exist in the cluster.If Zookeeper be-comes unavailable,the coordinator will no longer be able to send instructions to assign,balance,and drop segments.However,these operations do not affect data availability at all.The design principle for responding to MySQL and Zookeeper failures is the same:if an external dependency responsible for co-ordination fails,the cluster maintains the status quo.Druid uses MySQL to store operational management information and segment metadata information about what segments should exist in the clus-ter.If MySQL goes down,this information becomes unavailable to coordinator nodes.However,this does not mean data itself is un-available.If coordinator nodes cannot communicate to MySQL, they will cease to assign new segments and drop outdated ones. Broker,historical,and real-time nodes are still queryable during MySQL outages.4.STORAGE FORMATData tables in Druid(called data sources)are collections of times-tamped events and partitioned into a set of segments,where each segment is typically5–10million rows.Formally,we define a seg-ment as a collection of rows of data that span some period of time. 
Segments represent the fundamental storage unit in Druid and repli-cation and distribution are done at a segment level.with results computed on historical and real-time nodes.performance degradations [1].Druid has multiple column types to represent various data for-mats.Depending on the column type,different compression meth-ods are used to reduce the cost of storing a column in memory and on disk.In the example given in Table 1,the page,user,gender,and city columns only contain strings.Storing strings directly is unnecessarily costly and string columns can be dictionary encoded instead.Dictionary encoding is a common method to compress data and has been used in other data stores such as PowerDrill [17].In the example in Table 1,we can map each page to a unique integer identifier.Justin Bieber ->0Ke$ha ->1This mapping allows us to represent the page column as an in-teger array where the array indices correspond to the rows of theoriginal data set.For the page column,we can represent the unique pages as follows:[0,0,1,1]The resulting integer array lends itself very well to compression methods.Generic compression algorithms on top of encodings are extremely common in column-stores.Druid uses the LZF [24]com-pression algorithm.Similar compression methods can be applied to numeric columns.For example,the characters added and characters removed columns in Table 1can also be expressed as individual arrays.Characters Added ->[1800,2912,1953,3194]Characters Removed ->[25,42,17,170]In this case,we compress the raw values as opposed to their dic-tionary representations.Figure 7:Integer array size versus Concise set size.Indices for Filtering Datamany real world OLAP workflows,queries are issued for the results of some set of metrics where some set of di-specifications are met.An example query is:“How many edits were done by users in San Francisco who are also This query is filtering the Wikipedia data set in Table 1based a Boolean expression of dimension values.In many real world sets,dimension columns contain strings and metric columns contain numeric values.Druid creates additional lookup indices for string columns such that only those rows that pertain to a particular query filter are ever scanned.Let us consider the page column in Table 1.For each unique page in Table 1,we can form some representation indicating in which table rows a particular page is seen.We can store this information in a binary array where the array indices represent our rows.If a particular page is seen in a certain row,that array index is marked as 1.For example:Justin Bieber ->rows [0,1]->[1][1][0][0]Ke$ha ->rows [2,3]->[0][0][1][1]Justin Bieber is seen in rows 0and 1.This mapping of col-umn values to row indices forms an inverted index [39].To know which rows contain Justin Bieber or Ke$ha ,we can OR together the two arrays.[0][1][0][1]OR [1][0][1][0]=[1][1][1][1]This approach of performing Boolean operations on large bitmap sets is commonly used in search engines.Bitmap indices for OLAP workloads is described in detail in [32].Bitmap compression al-gorithms are a well-defined area of research [2,44,42]and often utilize run-length encoding.Druid opted to use the Concise algo-rithm [10].Figure 7illustrates the number of bytes using Concise compression versus using an integer array.The results were gen-erated on a cc2.8xlarge system with a single thread,2G heap,512m young gen,and a forced GC between each run.The data set is a single day’s worth of data collected from the Twitter garden hose [41]data stream.The data set 
contains 2,272,295rows and12dimensions of varying cardinality.As an additional comparison, we also resorted the data set rows to maximize compression.In the unsorted case,the total Concise size was53,451,144bytes and the total integer array size was127,248,520bytes.Overall, Concise compressed sets are about42%smaller than integer ar-rays.In the sorted case,the total Concise compressed size was 43,832,884bytes and the total integer array size was127,248,520 bytes.What is interesting to note is that after sorting,global com-pression only increased minimally.4.2Storage EngineDruid’s persistence components allows for different storage en-gines to be plugged in,similar to Dynamo[12].These storage en-gines may store data in an entirely in-memory structure such as the JVM heap or in memory-mapped structures.The ability to swap storage engines allows for Druid to be configured depending on a particular application’s specifications.An in-memory storage en-gine may be operationally more expensive than a memory-mapped storage engine but could be a better alternative if performance is critical.By default,a memory-mapped storage engine is used. When using a memory-mapped storage engine,Druid relies on the operating system to page segments in and out of memory.Given that segments can only be scanned if they are loaded in memory,a memory-mapped storage engine allows recent segments to retain in memory whereas segments that are never queried are paged out. The main drawback with using the memory-mapped storage engine is when a query requires more segments to be paged into memory than a given node has capacity for.In this case,query performance will suffer from the cost of paging segments in and out of memory.5.QUERY APIDruid has its own query language and accepts queries as POST requests.Broker,historical,and real-time nodes all share the same query API.The body of the POST request is a JSON object containing key-value pairs specifying various query parameters.A typical query will contain the data source name,the granularity of the result data, time range of interest,the type of request,and the metrics to ag-gregate over.The result will also be a JSON object containing the aggregated metrics over the time period.Most query types will also support afilter set.Afilter set is a Boolean expression of dimension name and value pairs.Any num-ber and combination of dimensions and values may be specified. When afilter set is provided,only the subset of the data that per-tains to thefilter set will be scanned.The ability to handle complex nestedfilter sets is what enables Druid to drill into data at any depth. 
The exact query syntax depends on the query type and the infor-mation requested.A sample count query over a week of data is asfollows:{"queryType":"timeseries", "dataSource":"wikipedia","intervals":"2013-01-01/2013-01-08","filter":{"type":"selector","dimension":"page","value":"Ke$ha"},"granularity":"day","aggregations":[{"type":"count","name":"rows"}]}The query shown above will return a count of the number of rows in the Wikipedia data source from2013-01-01to2013-01-08,fil-tered for only those rows where the value of the“page”dimension is equal to“Ke$ha”.The results will be bucketed by day and will be a JSON array of the following form:[{"timestamp":"2012-01-01T00:00:00.000Z","result":{"rows":393298}},{"timestamp":"2012-01-02T00:00:00.000Z","result":{"rows":382932}},...{"timestamp":"2012-01-07T00:00:00.000Z","result":{"rows":1337}}]Druid supports many types of aggregations including sums on floating-point and integer types,minimums,maximums,and com-plex aggregations such as cardinality estimation and approximate quantile estimation.The results of aggregations can be combined in mathematical expressions to form other aggregations.It is be-yond the scope of this paper to fully describe the query API but more information can be found online3.As of this writing,a join query for Druid is not yet implemented. This has been a function of engineering resource allocation and use case decisions more than a decision driven by technical merit.In-deed,Druid’s storage format would allow for the implementation of joins(there is no loss offidelity for columns included as dimen-sions)and the implementation of them has been a conversation that we have every few months.To date,we have made the choice that the implementation cost is not worth the investment for our organi-zation.The reasons for this decision are generally two-fold.1.Scaling join queries has been,in our professional experience,a constant bottleneck of working with distributed databases.2.The incremental gains in functionality are perceived to beof less value than the anticipated problems with managinghighly concurrent,join-heavy workloads.A join query is essentially the merging of two or more streams of data based on a shared set of keys.The primary high-level strate-gies for join queries we are aware of are a hash-based strategy or a sorted-merge strategy.The hash-based strategy requires that all but one data set be available as something that looks like a hash table, a lookup operation is then performed on this hash table for every row in the“primary”stream.The sorted-merge strategy assumes that each stream is sorted by the join key and thus allows for the in-cremental joining of the streams.Each of these strategies,however, requires the materialization of some number of the streams either in sorted order or in a hash table form.When all sides of the join are significantly large tables(>1bil-lion records),materializing the pre-join streams requires complex distributed memory management.The complexity of the memory management is only amplified by the fact that we are targeting highly concurrent,multitenant workloads.This is,as far as we are aware, an active academic research problem that we would be willing to help resolve in a scalable manner.6.PERFORMANCEDruid runs in production at several organizations,and to demon-strate its performance,we have chosen to share some real world numbers for the main production cluster running at Metamarkets as of early2014.For comparison with other databases we also include results from synthetic workloads 
on TPC-H data.3http://druid.io/docs/latest/Querying.html。
Understanding and Using Druid Data Sources

1. What is a Druid data source?
Druid is an open-source distributed data storage and processing system, originally developed at Metamarkets and open-sourced in 2012. It is used mainly for real-time analytical queries, supports complex filtering and aggregation (joins are not supported, as noted earlier), and is suited to large-scale real-time data processing. In the big data field, Druid is widely used for log analytics, real-time monitoring, and business intelligence.
2. Core concepts
1. Segments: the segment is the most basic unit of data in Druid, analogous to a block of a database table; it contains the data for a specific time range, together with its indexes, metadata, and the raw data.
2. Data source: a data source is the abstraction over a set of segments; it can correspond to a table, a folder, or data in a database, and it identifies a data set stored in the system.
3. Dimensions: dimensions are the attributes or fields used to slice and filter data, such as time, location, or user ID. In Druid, dimensions are the key elements for grouping and filtering.
4. Metrics: metrics are the measures that are aggregated and computed over the data, such as totals, averages, or maximums. In Druid, metrics are usually used together with dimensions to produce charts and reports (see the query sketch after this list).
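As a sketch of how dimensions and metrics appear together in a query, the following native topN query ranks cities by event count. The data source app_logs, the dimension city, and the metric names are hypothetical; only the structure follows Druid's native topN query type.
curl -X POST -H 'Content-Type: application/json' http://localhost:8082/druid/v2/?pretty -d '{
  "queryType": "topN",
  "dataSource": "app_logs",
  "dimension": "city",
  "metric": "events",
  "threshold": 10,
  "granularity": "all",
  "aggregations": [ { "type": "longSum", "name": "events", "fieldName": "count" } ],
  "intervals": ["2016-01-01/2016-02-01"]
}'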
3. Use cases
1. Log analytics: Druid can process large volumes of log data in real time and deliver fast, accurate results through queries and aggregations. Typical scenarios include network traffic analysis, user-behavior analysis, and system performance monitoring.
2. Real-time monitoring: in monitoring systems, Druid can analyze and display system metrics, log data, and alerts in real time. Aggregating across multiple dimensions makes it possible to spot anomalies and problems promptly.
3. Business intelligence: Druid is widely used in enterprise data warehouses and business-intelligence platforms.
(The prerequisites were covered in earlier chapters and are not repeated here.)
Starting Druid: from the Druid root directory, run bin/init. Druid creates a var directory containing two subdirectories: druid, which holds the local environment's Hadoop temporary files, caches, and task temporary files, and tmp, which holds other temporary files.
IV. Druid in a Single-Machine Environment
Installing Druid
Download and install Druid with the following commands:
curl -O http://static.druid.io/artifacts/releases/druid-0.9.1.1-bin.tar.gz
tar -xzvf druid-0.9.1.1-bin.tar.gz -C /hadoop/
cd /hadoop/druid-0.9.1.1
III. The Druid Cluster
A Druid cluster is made up of many nodes with different roles:
- Historical Nodes: the backbone of a Druid cluster; they load segments onto local storage and serve them for cluster queries.
- Broker Nodes: the nodes that clients and applications query; they balance incoming queries and then gather and merge the query results.
- Coordinator Nodes: manage which segments are placed on which Historical Nodes.
- Real-time processing: real-time ingestion can be done either by standalone real-time nodes or by the indexing service.
- Overlord Nodes: mainly used for the batch indexing service.
- ZooKeeper: used for intra-cluster communication.
- Metadata Storage: stores metadata such as segment and configuration information.
The sketch after this list shows how ZooKeeper, metadata storage, and deep storage are configured.
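These external dependencies are wired together in Druid's common configuration file. The excerpt below is a sketch using the single-machine defaults assumed by the quickstart configuration (Derby for metadata, local disk for deep storage); the exact values are illustrative.
# conf-quickstart/druid/_common/common.runtime.properties (excerpt)
druid.zk.service.host=localhost              # ZooKeeper, used for intra-cluster coordination
druid.metadata.storage.type=derby            # metadata storage (MySQL or PostgreSQL in production)
druid.storage.type=local                     # deep storage (HDFS or S3 in production)
druid.storage.storageDirectory=var/druid/segments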
Once a task has been submitted, you can watch its progress in the overlord console at http://localhost:8090/console.html; when its status shows SUCCESS, the task has completed.
IV. Druid in a Single-Machine Environment
Loading streaming data
Download and unpack Tranquility:
curl -O http://static.druid.io/tranquility/releases/tranquility-distribution-0.8.0.tgz
tar -xzvf tranquility-distribution-0.8.0.tgz
cd tranquility-distribution-0.8.0
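After unpacking, the Tranquility server is started with a configuration file. In the Druid 0.9 quickstart this is the server.json shipped under Druid's conf-quickstart directory; the relative path below is an assumption based on the layout used in this chapter.
bin/tranquility server -configFile ../druid-0.9.1.1/conf-quickstart/tranquility/server.json
Events can then be posted to the Tranquility server over HTTP, and it handles handing the resulting segments off to the Druid cluster.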
IV. Druid in a Single-Machine Environment
Batch-loading data
Once the services are running, data can be loaded into Druid and queried.
Submit an ingestion task to Druid that points at the data file to load, quickstart/wikiticker-2015-09-12-sampled.json. From the Druid root directory run:
curl -X POST -H 'Content-Type: application/json' -d @quickstart/wikiticker-index.json localhost:8090/druid/indexer/v1/task
2. Data visualization: Druid is a good fit behind user-facing analytics applications, and many open-source tools can visualize Druid data, such as Pivot, Caravel, and Metabase.
3. Query components: many query components are available, including SQL engines and client libraries for languages such as Python and Ruby.
Chapter 12
The Event Data-Stream Engine Flink
From Fundamentals of Big Data Technology and Application, a 21st-century higher-education textbook series on cloud computing and big data, Posts & Telecom Press
Starting the Druid services
On a single machine, all of the Druid service processes can be started on one host, using five terminals in the Druid root directory:
1. java `cat conf-quickstart/druid/historical/jvm.config | xargs` -cp "conf-quickstart/druid/_common:conf-quickstart/druid/historical:lib/*" io.druid.cli.Main server historical
2. java `cat conf-quickstart/druid/broker/jvm.config | xargs` -cp "conf-quickstart/druid/_common:conf-quickstart/druid/broker:lib/*" io.druid.cli.Main server broker
3. java `cat conf-quickstart/druid/coordinator/jvm.config | xargs` -cp "conf-quickstart/druid/_common:conf-quickstart/druid/coordinator:lib/*" io.druid.cli.Main server coordinator
4. java `cat conf-quickstart/druid/overlord/jvm.config | xargs` -cp "conf-quickstart/druid/_common:conf-quickstart/druid/overlord:lib/*" io.druid.cli.Main server overlord
5. java `cat conf-quickstart/druid/middleManager/jvm.config | xargs` -cp "conf-quickstart/druid/_common:conf-quickstart/druid/middleManager:lib/*" io.druid.cli.Main server middleManager
Chapter 11
Event-Stream OLAP with Druid
From Fundamentals of Big Data Technology and Application, a 21st-century higher-education textbook series on cloud computing and big data
Learning objectives
Become familiar with setting up Druid in a single-machine environment.
Learn how to load data into Druid and query it.
Contents
1. Introduction to Druid
2. Where Druid is used
3. The Druid cluster
4. Druid in a single-machine environment
II. The Basic Architecture of Flink
Flink's architecture is similar to Spark's: it follows a master-slave design. A Flink system contains three main kinds of processes:
(1) The JobManager is the coordinator of a Flink system. It receives Flink jobs, schedules the execution of the tasks that make up each job, collects job status information, and manages the TaskManagers (the slave nodes) in the cluster.
(2) The TaskManager is also an actor; it is the worker that actually performs the computation, executing a set of tasks of a Flink job.
(3) The Client reads the JobManager address from the configuration of the user-submitted Flink program, establishes a connection to the JobManager, and submits the Flink job to it.
(A configuration sketch for this setup follows.)
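In a standalone deployment this master-slave split appears directly in conf/flink-conf.yaml. A minimal sketch with illustrative values (adjust the address and slot count for your own cluster):
# conf/flink-conf.yaml (excerpt)
jobmanager.rpc.address: localhost        # where TaskManagers and the Client find the JobManager
jobmanager.rpc.port: 6123
taskmanager.numberOfTaskSlots: 1         # parallel task slots offered by each TaskManager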
III. Installing Flink on a Single Machine
Installing Flink
(1) Install JDK 1.7 or later.
(2) From the download page on the Flink website, choose a Flink package that matches your Hadoop version, then download and unpack it.
(3) Start Flink locally in single-machine mode by running bin/start-local.sh from the Flink directory.
Check the relevant logs under the log directory to confirm that Flink is running correctly:
tail log/flink-*-jobmanager-*.log
Then open http://localhost:8081/ in a browser. Flink listens on port 8081 by default, so make sure no other process is occupying that port. The management web UI shown in the figure below should appear.
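To run a first example on this local setup (the "Running a first Flink example" item in the chapter outline), the Flink distribution ships sample jobs under the examples directory. The path below matches typical Flink 1.x packages but varies by version, so treat it as a sketch:
# Submit the bundled batch WordCount job to the local cluster
bin/flink run ./examples/batch/WordCount.jar
The job's progress and results can also be observed in the web UI at http://localhost:8081/.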
IV. Druid in a Single-Machine Environment
Querying data
1. Querying Druid directly: Druid offers a rich JSON-based query interface. Among the bundled examples, quickstart/wikiticker-top-pages.json is a topN query. Post it to the broker at http://localhost:8082/druid/v2/ with curl, setting the request header Content-Type: application/json and appending ?pretty so that the returned JSON is formatted for reading.
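The corresponding command, as a sketch run from the Druid root directory (host and port follow the quickstart defaults used earlier in this chapter):
curl -X POST -H 'Content-Type: application/json' -d @quickstart/wikiticker-top-pages.json http://localhost:8082/druid/v2/?pretty
The response is a JSON array of the top pages ranked by the query's metric.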
Learning objectives
Understand the roles of the main processes in a Flink system.
Become familiar with basic Flink operations.
Contents
1. Flink overview
2. The basic architecture of Flink
3. Installing Flink on a single machine
4. Running a first Flink example
5. Deploying a Flink cluster
I. Flink Overview
Apache Flink is an open-source platform for distributed batch and stream data processing, and it is now a top-level Apache project. Like Spark, its main advantage is running machine-learning and other algorithms in memory, which makes it very fast; in addition, Flink supports iterative computation.