A Distributed Algorithm for Joins in Sensor Networks
Hadoop期末复习题库

一个程序中的MapTask的个数由什么决定?Cc)A、输入的总文件数B、客户端程序设置的mapTask的个数C、FilelnputFormat.getSplits(JobContext job)计算出的逻辑切片的数量D、输入的总文件大小/数据块大小关于SecondaryNameN o de 哪项是正确的?Cc)A. 它是NameNod哟热备B. 它对内存没有要求C. 它的目的是帮助NameNod始`并编辑日志,减少NameNod妇动时间D. Secondary N a meN o de应与NameNod画署到一个节点HBase中的批量加载底层使用(a)实现。
A、MapReduceB、HiveC、CoprocessorD、Bloom FilterDFS检查点(CheckPoint)的作用是可以减少下面哪个组件的启动时间 C b ) A. SecondaryNameNode B. NameNode C. DataNode D. JoumalNode如下哪一个命令可以帮助你知道shell命令的用法Cc)。
A、manB、pwdC、helpD、more解压.tar.gz结尾的HBase压缩包使用的Linux命令是Ca)。
A、tar-zxvfB、tar-zxC、tar--sD、tar11fYARNW翡面默认占用哪个端口? C b )A、50070B、8088C、50090D、9000Flume的Agent包含以下那些组件?(ac )A. SourceB. ZNodeC. ChannelD. Sink面描述HBase的Region的内部结构不正确的是? C d )A. 每个Store由一个MemStore和0至多个StoreFile组成B. Region由一个或者多个Store组成C. MemStore存储在内存中,StoreFile存储在HDFS每个Store保存一个Column关于HDF漠群中的DataNode的描述正确的是?(bed )A. 一个DataNode上存储一个数据块的多个副本B. 存储客户端上传的数据的数据块C. 响应客户端的所有读写数据请求,为客户端的存储和读取数据提供支撑D. 当Datanode读取数据块的时候,会计算它的校验和(checksum), 如果计算后的校验和,与数据块创建时值不一样,说明该数据块巳经损坏下面关千使用Hive的描述中正确的是? C bd )A. Hive支持数据删除和修改B. Hive 中的join查询只支持等值链接,不支持非等值连接C. Hive 中的join查询支持左外连接,不支持右外连接D. Hive默认仓库路径为/user/hive/warehouse/的NameNode负责管理文件系统的命名空间,将所有的文件和文件夹的元数据保存在一个文件系统树中,这些信息也会在硬盘上保存成以下文件:()。
Alexandru Coman    Mario A. Nascimento
Department of Computing Science
University of Alberta, Canada
{acoman|mn}@cs.ualberta.ca

Abstract

Given their autonomy, flexibility and large range of functionality, wireless sensor networks can be used as an effective and discrete means for monitoring data in many domains. Typical sensor nodes are very constrained, in particular regarding their energy and memory resources. Thus, any query processing solution over these devices should consider their limitations. We investigate the problem of processing join queries within a sensor network. Due to the limited memory at nodes, joins are typically processed in a distributed manner over a set of nodes. Previous approaches have either assumed that the join processing nodes have sufficient memory to buffer the subset of the join relations assigned to them, or that the amount of available memory at nodes is known in advance. These assumptions are not realistic for most scenarios. In this context we propose and investigate DIJ, a distributed algorithm for join processing that considers the memory limitations at nodes and does not make a priori assumptions on the available memory at the processing nodes. At the same time, our algorithm still aims at minimizing the energy cost of query processing.

1. Introduction

Recent technological advances, decreasing production costs and increasing capabilities have made sensor networks suitable for many applications, including environmental monitoring, warehouse management and battlefield surveillance. Despite the relative novelty and small number of real-life deployments, sensor networks are considered a highly promising technology that will change the way we interact with our environment [13]. Typical sensor networks will be formed by a large number of small, radio-enabled, sensing nodes. Each node is capable of observing the environment, storing the observed values, processing them and exchanging them with other nodes over the wireless network. While these capabilities are expected to grow rapidly in the near future, the energy source, be it either a battery or some sort of energy harvesting [8], is likely to remain the main limitation of these devices. Hence, energy-efficient data processing and networking protocols must be developed in order to make the long-term use of such devices practical. Our focus is on energy-efficient processing of queries, joins in particular, over sensor networks. We study this problem in an environment where each sensor node is only aware of the existence of the other sensor nodes located within its wireless communication range, and the query can be introduced in the network at any node.

Users query the sensor network to retrieve the collected data on the monitored environment. The most popular form for expressing queries in a sensor network is using an SQL-like declarative language [6]. The data collected in the sensor network can be seen as one relation distributed over the sensor nodes, called the sensor relation in the following. The queries typically accept one or more of the following operators [6,9]: selection, projection, union, grouping and aggregations. We note that the join operation in sensor networks has been mostly neglected in the literature.

A scenario where join queries are important is as follows. A National Parks administration is interested in long-term monitoring of the animals in the managed park. A sensor network is deployed over the park, with the task of monitoring the animals (e.g., using RFID sensing). Park rangers patrol the park and, upon observing certain patterns, query the sensor network through mobile devices to find information of interest. For instance, upon finding two animals killed in region A, respectively B, the rangers need to find what animals, possibly ill with rabies, have killed them. The ranger would issue the query "What animals have been in both region A and B between times T1 and T2?". If joins cannot be processed in-network, then two, possibly long, lists of animal IDs appearing in each region will be retrieved and joined at the user's device. On the other hand, if the join is processed in-network, only possibly very few animal IDs are retrieved, substantially reducing the communication cost.
In this paper we focus on the processing of the join operator in sensor networks. Since the energy required for communication is three to four orders of magnitude higher than the energy required by sensing and computation [9], it is important to minimize the energy cost of communication during query processing. Recently, a few works have addressed in-network processing of join queries. Bonfils and Bonnet [3] investigate placing a correlation operator at a node in the network. Pandit and Gupta [11] propose two algorithms for processing a range-join operator in the network and Yu et al. [16] propose an algorithm for processing equi-joins. These works study the self-join problem where subsets of the sensor relation are joined. Abadi et al. [1] propose several solutions for the join with an external relation, where the sensor relation is joined with a relation stored at the user's device. Coman et al. [5] study the cost of several join processing solutions with respect to the location of the network region where the join is performed. Most previous solutions either assume that nodes have sufficient memory to buffer the partition of the join relations assigned to them for processing, or that the amount of memory available at each node is known in advance and the assigned data partitions can be set accordingly. These assumptions are unrealistic for most scenarios. It is well known that sensor networks are very constrained on main memory, and the energy cost of using their flash storage (for those devices that have it) is rather prohibitive to be used for data buffering during query processing. In addition, in large-scale sensor networks, it is not feasible for the sensor nodes or the user station to be aware of up-to-date information on memory availability of all network nodes.

In this paper our contributions are three-fold. First, we analyze the requirements of a distributed in-network join processing algorithm. Second, to our knowledge, this is the first work to develop and discuss in detail a distributed algorithm for in-network join processing. Third, based on the presented algorithm, we develop a cost model that can be used to select the most efficient join plan during the execution of the query. Our join algorithm is general in the sense that it can be used with different types of joins, including semi-joins, with minor modifications to the presented algorithm and cost model. As well, our algorithm can be used within the core of other previously proposed join solutions for relaxing their assumptions on memory availability.

2. Background

In our work we consider a sensor network formed by thousands of fixed nodes. Each node has several sensing units (e.g., temperature, RFID reader), a processor, a few kilobytes of main memory for buffer and data processing, a few megabytes of flash storage for long-term storage of sensor observations, a fixed-range wireless radio, and it is battery operated. These characteristics encompass a wide range of sensor node hardware, making our work independent of a particular sensor platform. Further on, we consider that each node is aware of its location, which is periodically refreshed through GPS or a localization algorithm [14] to account for any variation in a node's position due to environmental hazards. Each node is aware of the nodes located within its wireless range, which form its 1-hop neighbourhood. A node communicates with nodes other than its 1-hop neighbours using multi-hop routing over the wireless network. As sensor nodes are not designed for user interaction, users query the sensor network through personal devices, which introduce the query in the network through one of the nodes in their vicinity.
We consider a sensor network deployment where nodes acquire observations periodically and the observations are stored locally for future querying. The data stored at the sensor nodes forms a virtual relation over all nodes, denoted R*. As nodes store the acquired data locally, each node holds the values of the observations recorded by its sensing units and the time when each recording was performed.

We analyze the self-join processing problem in sensor networks, i.e., the joined relations are spatially and temporally constrained subsets of the sensor relation R*. We impose no restrictions on the join condition, that is, any tuple from a relation could match any tuple of the other relation. For instance, the query "What animals have been in both regions R_A and R_B between times T1 and T2?" (from our example in Section 1) can be expressed in pseudo-SQL as:

SELECT S.animalID
FROM R* as S, R* as T
WHERE S.location IN Region R_A
  AND T.location IN Region R_B
  AND S.time IN TimeRange [T1, T2]
  AND T.time IN TimeRange [T1, T2]
  AND S.animalID = T.animalID

Let us denote by A the subset of R* restricted to Region R_A and by B the subset of R* restricted to Region R_B.
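As an illustration of how the relations A and B arise from R*, the sketch below shows how a single node might apply the query's region and time predicates to its locally stored observations; the record layout and field names are assumptions made for this example, not part of the paper.

```python
from dataclasses import dataclass

@dataclass
class Observation:
    # Hypothetical layout of one locally stored tuple of R*.
    animal_id: str
    x: float
    y: float
    time: float

def local_restriction(storage, region, t1, t2):
    """Return this node's fragment of A (or B): the locally stored
    observations that fall inside the queried region and time window."""
    x_min, y_min, x_max, y_max = region
    return [o for o in storage
            if x_min <= o.x <= x_max and y_min <= o.y <= y_max
            and t1 <= o.time <= t2]

# Example: this node contributes one tuple to relation A (region R_A).
storage = [Observation("elk-17", 1.0, 2.0, 15.0),
           Observation("elk-09", 9.0, 9.0, 40.0)]
fragment_of_A = local_restriction(storage, (0.0, 0.0, 5.0, 5.0), 10.0, 50.0)
```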
The query may also contain other operators ops (selection, projection, etc.) on each tuple of R* or on the result of the join. As our focus is on join processing, we consider the relations A and B as the resulting relations after the query operators that can be applied individually on each node's relation have been applied. We assume operators that can be processed locally by each sensor node on its stored relation and thus they do not involve any communication. We denote with J the result of the join of relations A and B, including any operators on the join result required by the query: J = ops_J(A ⋈ B). We assume operators on the join result can be processed in a pipelined fashion immediately following the join of two tuples. A general query tree and the notations we use are shown in Figure 1.

Figure 1. Query tree and notations

3. DIJ: A Distributed Join Processing Algorithm for Sensor Networks

Join processing in sensor networks is a highly complex operation due to the distributed nature of the processing and the limited memory available at nodes. We discuss some of the requirements of an effective and efficient join processing algorithm for sensor networks, namely: distributed processing, memory management and synchronized communication.

• Distributed processing. In large-scale sensor networks the join operation must be processed in a distributed manner using localized knowledge. For most queries no single node can buffer all the data required for the join. In addition, no node (or user station) has global network knowledge to find the optimal join strategy. As nodes have information only about their neighbourhood, the challenge is to take correct and consistent decisions among nodes with respect to processing the join. For instance, when the join operation is evaluated over a group of nodes, each node in the group must route and buffer tuples such that each pair of join tuples is evaluated exactly once in the join.

• Memory management. Each node participating in the join must have sufficient memory to buffer the tuples that it joins and the resulting tuples. For some join queries the join relations are larger than the available memory of a single node. Typically, several nodes must collaborate to process the join operator, pooling their memory and processing resources together. A join processing algorithm should pool these resources together and allocate tasks and data among the participating nodes such that the efficiency of the processing is maximized.

• Synchronized data flow. Inter-node communication must be synchronized such that a node does not receive new tuples to process when its memory is full. Otherwise, the node would have to drop some of the buffered or new tuples, which is unacceptable as it may invalidate the result of the join. Thus, each node must fully process the join tuples it holds before receiving any new tuples. A similar problem occurs also for the nodes routing the data. A parent node routing data for multiple children may not be able to buffer all received data before it can forward it. Thus, a join processing algorithm should carefully consider the flow of data during its execution.

In this work we propose a distributed join processing algorithm which considers the above requirements. In our presentation we focus on the join between two restrictions (A and B) of the R* relation, where the join condition is general (theta-join). Thus, every pair of tuples from relations A and B must be verified against the join condition. Relations A and B are located within regions R_A and R_B and they are joined in-network in a join region R_J. Techniques for finding the location of the join region have been presented elsewhere [4,5,16] and are orthogonal to our problem. In fact, our algorithm is general with respect to the join relations and their locations and could be used within the core of other previously proposed join solutions (e.g., [5]), including solutions using semi-joins (e.g., [16]). For clarity of presentation we describe our join algorithm in the context of the Mediated Join [5] solution.
The Mediated Join solution works as follows: relations A and B are sent to the join region (R_J), where they are joined, and the resulting relation J is transmitted to the query originator node. (Recall that a query can be posed at any node of the network.) Figure 2 shows in overview the query processing steps and the data flow.

Figure 2. Mediated Join - data flow

The Mediated Join seems straightforward based on this description, but there are several issues that must be carefully addressed in the low-level sensor implementation to ensure the correctness of the query result, e.g.:

• How to ensure that both relations A and B are transmitted to the same region R_J?
• How large should region R_J be to have sufficient resources, i.e., memory at nodes, to process the join?
• How should A and B be transmitted such that the join is processed correctly at the nodes in R_J?
• How to process the join in R_J such that the join is processed correctly using minimum resources?

We now describe in detail DIJ, our join processing algorithm addressing these questions. The steps of DIJ are:

1. Multicast the query from originator node O to the nodes in R_A and R_B. Designate the nodes closest to the centres C_A and C_B of the regions R_A, respectively R_B, as regional coordinators. Designate the coordinator location C_J for join region R_J. Disseminate the information about the coordinators along with the query.
2. Construct routing trees in regions R_A and R_B rooted at their respective coordinators C_A and C_B.
3. Collect information on the number of query-relevant tuples for each region at the corresponding coordinators. Each coordinator sends this information to coordinator C_J of the join region R_J.
4. Construct the join region. C_J constructs R_J so that it has sufficient memory space at its nodes to buffer A.
5. Distribute A over R_J.
   (a) C_J asks C_A to start sending packets with tuples. Once C_J receives A's tuples, it forwards them to a node in R_J with available memory.
   (b) Upon receiving a request for data from C_J, C_A asks for relevant tuples from its children in the routing tree. The process is repeated by all internal tree nodes until all relevant tuples have been forwarded up in the tree.
6. Broadcast B over R_J.
   (a) Once C_J receives a signal from C_A that it has no more packets (i.e., tuples) to send, C_J asks for one packet with tuples from C_B. When the packet is received, it is broadcast to the nodes in R_J.
   (b) Each node in R_J joins the tuples in the packet received from B with its local partition of A, sending the resulting tuples to O. Once the join is complete, each node asks for another packet of B's tuples from C_J.
   (c) Upon receiving a request for tuples from C_J, C_B asks for a number of join tuples from its children in the routing tree. The process is repeated by all internal tree nodes if they cannot satisfy the request alone.
   (d) Once C_J receives requests for B's tuples from all nodes in R_J, Step 6 is repeated, unless C_B signals that it has no more packets (i.e., tuples) to send.
over the nodes in R J and relation B is broadcast over the nodes in R J.The steps above are symmetrical if the roles of A and B are switched, however the actual order does matter in terms of query cost. In Section4we explore this issue and show how to deter-mine which relation should be distributed and which should be broadcast in order to minimize the cost of the processing the join operator.Steps1-3of DIJ are typical to in-network query pro-cessing and do not present particular challenges.In Step4, the join coordinator C J must request and pool together the memory of other nodes in its vicinity for allocating relation A to these nodes(in Step5a).This is a non-trivial task as C J does not have information about the nodes in its vicin-ity(except its1-hop neighbours).Steps5and6also pose a challenge,that is,how to control theflow of tuples effi-ciently without buffer overflows,ensuring correct execution of the join.We detail these steps in the following.3.1.Constructing the join region(Step4)Once node C J receives the size of the join relations A and B from C A and C B(in Step1),it mustfind the nodes in its vicinity where to buffer relation A.DIJ uses the fol-lowing heuristic for this task,called k-hop-pooling: If C J alone does not have sufficient memory tobuffer relation A,C J asks its1-hop neighbours toreport how much memory they have available forprocessing the query.If relation A is smaller thanthe total memory available at the1-hop neigh-bours,C J stops the memory search.Otherwise,C J asks its2-hop neighbours to report their avail-able memory.This process is repeated for k-hops,where k represents the number of hops such thatthe total memory available at the nodes up to khops away from C J plus the memory available atC J is sufficient to buffer relation A.An interesting question is how much memory should a node allocate for processing a particular query.If the sensor network processes only one join query at a time(e.g.,there is a central point that controls the insertion of join queries in the network),then nodes can allocate all the memory they have available for processing the join.However,if nodes al-locate all their memory for a query,but several join queries are processed simultaneously in the network,it may happen that a coordinator C J will notfind any nodes with available memory in its immediate vicinity,forcing it to use farther away nodes during processing,and,thus,consuming more energy.For networks where multiple queries may coexist in the network,nodes should allocate only a part of their avail-able memory for a certain query,reserving the rest for other queries.How to actually best allocate the memory of an individual node is orthogonal to our problem.In this work we assume that nodes report as available only the memorythey are willing to use for processing the requested query.Figure3shows a possible memory allocation scheme at anode.3.2Distributing A over R J(Step5)In this step two tasks are carried out concurrently:C Arequests and gathers relevant tuples(grouped in data pack-ets)from R A,and C J distributes the packets received fromC A over R J.Once the set of k-hop neighbours that will buffer A hasbeen constructed,C J asks for relation A from C A,packetby packet,and distributes each packet of A’s tuples in around-robin fashion to its neighbours,ordered by their hopdistance to C J.When deciding to which node to send a newpacket with A’s tuples,a straightforward packet allocationstrategy would be for C J to pick a node from its list andsend to it all new packets with A’s 
tuples until its allocatedmemory is full.This strategy has two disadvantages.As allpackets use the same route(for most routing algorithms)toget to their destination node,their delivery will be delayed ifthere is a delay on one of the links in the route.Also,con-secutive packets may contain tuples with values such thatthey all(or many of them)will join with the same tuple inB.In this case,the node holding all these tuples will gener-ate many result tuples that have to be transmitted,delayingthe processing of the join.The hop-based round-robin al-location also ensures that all k-hop neighbours have a fairchance of having some free memory at the end of the allo-cation process,memory that can be used for other queries.Once node C A receives a request for tuples from C J,ithas to gather relevant tuples from R A.If C A would simplybroadcast the tuple request in the routing tree constructedover R A,nodes in R A will start sending these tuples to-ward C A.As each internal tree node has(likely)severalchildren,it should receive and buffer many packages beforebeing able to send these packages out.Some nodes maynot be able to handle such a dataflow due to lack of bufferspace,possibly dropping some of the packets.To ensurethat no packages are lost due to lack of buffer space,wepropose aflow synchronization scheme where each nodewill only buffer one package.In this scheme,the requestfor A’s tuples is transmitted one link at a time.Each nodein the routing tree is in one of the following states duringthe synchronized tupleflow(Figure4):•Wait for a tuple request from the parent node(or C J in the case of C A)in the routing tree constructed in Step2.•Send local tuples(from the local storage or receive buffer)to the parent node.•If buffer space has been freed and there are relevant tu-ples available at the children nodes in the routing tree,Figure3.Memory allocation schemeFigure4.A node’s states during tuple routingrequest tuples from a child node that still has tuples to send.Figure5shows the routing tree for a region and the information maintained in each node of the tree as tuples are routed from either R A or R B to R J.Note that the number of tuples that each child node will pro-vide has been collected as part of Step3.•Receive tuples from child,buffer the tuples and update the number of tuples that the child still has available.Once a node has forwarded to its parent all of A’s tuples from its routing sub-tree,it can free all buffers used for pro-cessing the query.{local: 2 tuples}{local: 2 tuples{local: 0 tuples}{local: 3 tuples}{local: 3 tuples}{local: 2 tuples{local: 3 tuplesN5: 8 tuplesN6: 5 tuples}N3: 0 tuplesN1: 2 tuplesN2: 3 tuples}N4: 3 tuples}N7N5N1N2N3N6N4Figure5.Join tuples information at nodes3.3.Broadcasting B over R J(Step6)The collection of B’s tuples proceeds much like the collection of A’s tuples,with one important difference. 
Whereas C A gathers and sends all of the relevant tuples of A as a a result of a single tuple request from C J,C B only sends one packet with tuples for each request it re-ceives from C J.This way,C J can broadcast such a packet of tuples to all nodes in R J,wait until all nodes fully pro-cess the local joins and send the results,and then request a new packet of tuples from R B when each node in the join region R J is ready to receive and join a new set of tuples.4.Selecting the relation to be distributedIn the previous discussions we have assumed for clarity of presentation that relation A is distributed over the nodes in region R J and B is broadcast over the nodes in the re-gion.An interesting question is which of the two join rela-tion should be distributed and whether the choice makes a major difference in cost.Let us focusfirst on which of the two join relation should be distributed and,subsequently,which should be incre-mentally broadcast.To decide on this matter,the query optimizer has to estimate the cost of the two options(i.e., distribute A or B)and compare their costs to decide which alternative is more energy efficient.For generality,we de-rive in the following a cost model for processing the join by distributing relation R d and broadcasting relation R b.The actual relations A and B can then be substituted into R d and R b(or vice-versa)to estimate the processing costs.Considering the steps of DIJ,the cost of query process-ing can be decomposed into a sum of components,with one component associated to each step.Several of these com-ponents are independent of the choice of the relation that is distributed.Thus,they do not affect the decision of which relation to distribute and do not need to be derived.For instance,we have the cost for disseminating the query in regions A and B(Step1)and the cost for constructing the routing tree over regions R A and R B(Step2).These costs are identical when processing the join by distributing A or B and do not affect the decision.The steps that have differ-ent costs when A or B are the distributed relation R d are the construction of the join region R J(Step4),the distribution of the relation R d(Step5a)and the broadcast of the relation R b(Step6a).Note that we are only interested in differences in the communication cost between the two alternatives. 
4.1.Constructing the join region(Step4)As discussed in Section3.1,we use the k-hop-pooling strategy to construct the join region R J.In each round of memory allocation,C J broadcasts its request for memory in a hop-wise increasing fashion,until sufficient nodes with the required buffer space are located.During a round h,each node within h-hops from C J broadcast the memory request and its1-hop neighbours re-ceive the request message.Thus,the total energy cost is:E memreq4=k−1h=0(E t N h n M r+E r N h n N1n M r),where N h n represents the average number of nodes within h hops from a node,E t and E r represents the energy required to transmit,respectively receive,one bit of information and M r represents the size of the memory request message(in bits).N h n is a network-dependent value independent of our technique and it is derived in the Appendix.When a node receives a memory request message for the first time,it allocates buffer space in its memory and sends the memory information to C J.The nodes located h-hops away from C J perform two tasks:they send their own mem-ory information to the nodes located h−1hops away;and they forward the information they have received from the nodes located between h+1and k hops away from C J.If we denote by M i the size of the memory information for one node,the total energy cost of collecting the information on available memory is:E meminfo4=kh=1((E t+E r)(N h n−N h−1n)M i+(E t+E r)(N k n−N h n)M i)=(E t+E r)(kN k n−k−1h=1N h n)M iNote that(N h n−N h−1n)represents the number of nodesh-hops away and(N k n−N h−1n)represents the number of nodes located more than h and up to k hops away from C J. The total energy cost of the fourth step of DIJ is:E4=E memreq4+E meminfo4.Note that the costs of Step4do not depend on the join re-lations directly,but through k which determines the size of the join region R J and it is determined by the size of the join relation R d.Let B s be the average size(in bits)of the buffer space that each node in R J can allocate for processing the query. The minimum number of nodes that must be used to store relation R d in region R J is||R d||B s,where||R||denotes the size(in bits)of relation R.Since nodes are added to R J in groups based on their hop distance,k is the lowest number of hops such that the nodes within k hops from C J have sufficient buffer space to buffer R d:k={min h|N h n B s≥||R d||}.。