Parallel Computers


Overview of Parallel Computing (English edition)


Hardware
Distributed Memory
Distributed memory systems require a communication network to connect inter-processor memory. Processors have their own local memory and operate independently. Memory addresses in one processor do not map to another processor, so there is no concept of a global address space across all processors. Data exchange between processors is managed by the programmer, not by the hardware.
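The programmer-managed data exchange described above can be sketched in a few lines of Python. This is only a simulation under stated assumptions: the `Node` class, its `send`/`recv` methods, and `run_exchange` are hypothetical names, threads stand in for independent processors, and a queue stands in for the network. Each node keeps a private dictionary as its local memory, and the only way a value crosses between nodes is an explicit message.

```python
import queue
import threading


class Node:
    """One simulated processor with private local memory and an inbox."""

    def __init__(self, rank):
        self.rank = rank
        self.local = {}             # private memory: no other node reads it directly
        self.inbox = queue.Queue()  # stands in for the network link into this node

    def send(self, dest, tag, value):
        # Explicit communication: the value is placed in the destination's inbox.
        dest.inbox.put((self.rank, tag, value))

    def recv(self):
        # Blocks until a message arrives.
        return self.inbox.get()


def run_exchange():
    a, b = Node(0), Node(1)
    a.local["x"] = 10   # the name "x" on node 0 ...
    b.local["x"] = 32   # ... is unrelated to "x" on node 1

    def work_a():
        a.send(b, "x", a.local["x"])          # ship a's value to b
        _, _, total = a.recv()                # wait for b's reply
        a.local["total"] = total

    def work_b():
        _, _, remote_x = b.recv()             # receive a's value
        b.local["total"] = b.local["x"] + remote_x
        b.send(a, "total", b.local["total"])  # send the result back

    threads = [threading.Thread(target=work_a), threading.Thread(target=work_b)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return a.local, b.local


if __name__ == "__main__":
    print(run_exchange())
```

On a real distributed memory machine the same pattern is usually written with a message-passing library; the point here is only that every transfer appears in the program text.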
Hardware
von Neumann Architecture
In 1945, the Hungarian mathematician John von Neumann proposed this organization for computer hardware: memory, a Control Unit, an Arithmetic Unit, and Input/Output. The Control Unit fetches instructions/data from memory, decodes the instructions, and then sequentially coordinates operations to accomplish the programmed task. The Arithmetic Unit performs basic arithmetic operations, while Input/Output is the interface to the human operator.
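The fetch, decode, and execute sequence described above can be made concrete with a toy machine. This is a minimal sketch, not a model of any real instruction set: the four opcodes (`LOAD`, `ADD`, `STORE`, `HALT`) and the functions `run` and `demo` are invented for illustration. Instructions and data share one memory, which is the defining feature of the organization.

```python
def run(memory):
    """Execute the program stored in `memory` and return the final memory."""
    pc = 0    # program counter (control unit state)
    acc = 0   # accumulator (arithmetic unit state)
    while True:
        op, arg = memory[pc]      # fetch the next instruction
        pc += 1
        if op == "LOAD":          # decode, then execute
            acc = memory[arg]
        elif op == "ADD":
            acc = acc + memory[arg]
        elif op == "STORE":
            memory[arg] = acc
        elif op == "HALT":
            return memory
        else:
            raise ValueError(f"unknown opcode {op!r}")


def demo():
    memory = [
        ("LOAD", 4),     # 0: acc <- memory[4]
        ("ADD", 5),      # 1: acc <- acc + memory[5]
        ("STORE", 6),    # 2: memory[6] <- acc
        ("HALT", None),  # 3: stop
        2,               # 4: data
        3,               # 5: data
        0,               # 6: the result is written here
    ]
    return run(memory)[6]


if __name__ == "__main__":
    print(demo())
```

The loop handles one instruction at a time, in order, which is the sequential behavior that parallel architectures set out to relax.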

Computer English: Collected Answers to After-Class Exercises


课后题答案.doc第六章Dbcbbaacbd jachgidefb7Ababdadcdb iefjabgdch8Aacacacddc gajidbhcfe9Cbadcdbbbd gbaihecjdf第10章Aadbacdcbd ghfabcjdei计算机专业英语+单词+部分习题计算机专业英语(2008影印版)高等教育出版社共10页KEY TERMS第一单元application software应用软件basic application基本应用软件communication device通信设备compact disc (CD)光盘computer competency计算机能力Connectivity连通性Data数据database file数据库文件desktop computer台式计算机device driver磁盘驱动程序digital versatile disc(DVD)数字多用途光盘digital video disc(DVD)数字多用途光盘document file文档文件end user终端用户floppy disk软盘handheld computer手持计算机hard disk硬盘Hardware硬件High definition高清Information信息information system信息系统information technology信息技术input device输入设备Internet因特网Keyboard键盘mainframe computer大型机Memory内存Microcomputer微型机Microprocessor微处理器midrange computer中型机Minicomputer小型计算机Modem调制解调器Monitor监视器Mouse鼠标Network网络notebook computer笔记本电脑operating system操作系统optical disk光盘output device输出设备palm computer掌上电脑Peoplepersonal digital assistant(PDA)个人数字助理presentation file演示文稿primary storage主存Printer打印机Procedure规程Program程序random access memory随机存储器secondary storage device辅存Software软件specialized application专门应用软件Supercomputer巨型机system software系统软件system unit系统单元tablet PC平板电脑Utility实用程序wireless revolution无线革命worksheet file工作表第三单元analytical graph分析图application software应用软件Autocontent Wizard内容提示向导basic applications基础应用软件bulleted list项目符号列表business suite商业套装软件Button按键Cell单元格character effect字效Chart图表Column列Computer trainer计算机培训员Contextual tab上下文标签Database数据库database management system (DBMS)数据库管理系统database manager数据库管理员Design template设计模板dialog box对话框Document文件Editing编辑Field字段find and replace查找和替换Font字体font size字号Form窗体Format格式Formula公式Function函数Galleries图库grammar checker语法检查器graphical user interface (GUI)图形用户界面home software家庭软件home suite家庭套装软件Icons图标integrated package集成组件Label标签master slide母板Menu菜单menu bar菜单栏numbered list编号列表numeric entry数值型输入personal software个人软件personal suite个人套装软件Pointer指针presentation graphic图形演示文稿productivity suite生产力套装软件Query查询Range范围Recalculation重算Record记录relational 
database关系型数据Report报表Ribbons功能区、格式栏Row行Sheet工作表Slide幻灯片software suite软件套装Sort排序specialized applications专用应用程序specialized suite专用套装软件speech recognition语音识别spelling checker拼写检查器spreadsheet电子表格system software系统软件Table表格text entry文本输入Thesaurus[θis?:r?s]分类词汇集Toolbar工具栏user interface用户界面utility suite实用套装软件what-if analysis变化分析Window窗口word processor文字处理软件word wrap字回行workbook file工作簿Worksheet工作表第四单元Animation动画artificial intelligence (AI)人工智能artificial reality虚拟现实audio editing software音频编辑软件bitmap image位图Blog博客Buttons按键clip art剪辑图Desktop publisher桌面发布desktop publishing program桌面印刷系统软件drawing program绘图程序expert systems专家系统Flash动画fuzzy logic模糊逻辑graphical map框图graphics suite集成图HTML editors HTML编辑器illustration program绘图程序Image editors图像编辑器image gallery图库immersive experience沉浸式体验industrial robots工业机器人Interactivity交互性knowledge bases知识库knowledge-based system知识库系统Link链接mobile robot移动式遥控装置Morphing渐变Multimedia多媒体multimedia authoring programs多媒体编辑程序page layout program页面布局程序perception systems robot感知系统机器人Photo editors图像编辑器Pixel[piks?l]像素raster image光栅图像Robot机器人Robotics机器人学stock photographs照片库story boards故事版Vector[vekt?]矢量vector illustration矢量图vector image矢量图象video editing software视频编辑软件virtual environments虚拟环境virtual reality虚拟现实virtual reality modeling language (VRML)虚拟现实建模语言virtual reality wall虚拟现实墙VR虚拟现实Web authoring网络编程Web authoring program网络编辑程序Web log网络日志Web page editor网页编辑器Add Printer Wizard添加打印机向导Antivirus program反病毒程序Backup备份backup program备份程序Booting启动、引导cold boot冷启动computer support specialist计算机支持专家Dashboard widgets仪表盘Desktop桌面desktop operating system桌面操作系统device driver磁盘驱动程序diagnostic program诊断程序dialog box对话框Disk Cleanup磁盘清理Disk Defragmenter磁盘碎片整理器Driver驱动器embedded operating systems嵌入式操作系统File文件file compression program文件压缩程序Folder文件夹Fragmented碎片化graphical user interface (GUI)图形用户界面Help帮助Icon图标language translator语言编译器leopard[lep?d]雪豹操作系统LinuxMac OS Mac操作系统Mac OS XMenu菜单Multitasking多任务处理network operating systems(NOS)网络操作系统network server网络服务器One Button Checkup一键修复operating 
system操作系统Platform平台Pointer指针Sectors[sekt?]扇区software environment软件环境Spotlight聚光灯stand-alone operating system独立操作系统system software系统软件Tiger老虎操作系统troubleshooting program故障检修程序Uninstall program卸载程序UNIXuser interface用户界面Utility实用程序utility suite实用套装软件Virus[vai?r?s]病毒warm boot热启动Window视窗Windows视窗操作系统Windows Update Windows更新Windows VistaWindows XP第六单元AC adapter 交流适配器Accelerated graphics port(AGP):图形加速端口Arithmetic-logic unit(ALU):算术逻辑单元Arithmetic operation:算术运算ASCII美国标准信息交换码Binary coding schemes:二进制编码制Bit:位Bus:总线Bus line:总线Byte:字节Cable:电缆Cache memory:高速缓存carrier package 封装物Central processing unit (CPU):中央处理器Chip:芯片Clock speed时钟速度Complementary metal-oxide semiconductor:互补金属氧化物半导体Computer technician计算机工程师Control unit:控制单元Coprocessor协处理器Desktop system unit:桌面系统单元Digital数字的Dual-core chips双核芯片EBCDIC:扩展二进制编码的十进制交换码Expansion bus扩展总线Expansion card扩展卡Expansion slot扩展槽FireWire port:火线接口Flash memory闪存Graphics card图形适配卡Graphics coprocessor图形协处理器Handheld computer system unit 手持计算机系统单元Industry standard architecture(ISA)工业标准结构Infrared Data Association(IrDA)红外线传输模组Integrated circuit:集成电路Laptop computer膝式计算机Logical operation逻辑运算Microprocessor:微处理器Motherboard:主板Musical instrument digital interface(MIDI)乐器数字接口Network adapter card网络适配卡Network interface card(NIC)网络接口卡Notebook system unit:笔记本Parallel ports:并行端口Parallel processing并行处理Pc card: :个人计算机插卡PCI Express(PCIe)Peripheral component interconnect (PCI):外围部件互联Personal digital assistant (PDA) 个人数字助理Plug and play:即插即用Port:端口Power supply unit 供电设备Processor:处理器RAM cache: RAM高速缓存Random-access memory (RAM):随机存储器Read-only memory (ROM):只读存储器RFID tag射频识别标签Semiconductor:半导体serial ATA(SATA)串行A TA接口规范Serial ports:串行端口Silicon chip:硅芯片Slot:插槽Smart card:智能卡sound card声卡System board:系统板System cabinet:主机System clock:系统时钟System unit:系统单元tablet PC平板式电脑tablet PC system unit平板式电脑系统单元TV tuner card:电视调频卡Unicode:统一字符编码标准Universal serial bus (USB):通用串行总线Universal serial bus (USB) port:通用串行总线端口Virtual memory:虚拟存储器Word:字第七单元active-matrix monitor有源矩阵显示器bar code条形码bar 
code reader条形码阅读器cathode ray tube monitor (CRT)阴极射线管显示器Clarity清晰度combination key组合键cordless mouse无线鼠标data projector数据投影仪digital camera数码照相机Digital media player数字媒体播放器Digital music player数码音乐播放器digital video camera数码影像摄录机dot pitch点距dot-matrix printer针式打印机dots-per-inch (dpi)点每英寸dual-scan monitor双向扫描显示器dumb terminal哑终端e-book电子图书阅读器ergonomic keyboard人体工程学键盘Fax machine传真机flat-panel monitor平面显示器Flatbed scanner平板扫描仪flexible keyboard可变形键盘handwriting recognition software手写识别软件Headphones耳机high-definition television (HDTV)高清电视ink-jet printer喷墨打印机intelligent terminal智能终端Internet telephone网络电话Internet telephony网络电话IP Telephony IP电话Joystick游戏杆Keyboard键盘laser printer激光打印机light pen光笔Liquid crystal display(LCD)液晶显示器Magnetic card reader磁卡阅读器magnetic-ink character recognition (MICR)磁性墨水字符识别mechanical mouse机械鼠标Monitor显示器Mouse鼠标mouse pointer鼠标指针multifunction device (MFD)多功能设备network terminal网络终端numeric keypad数字小键盘optical-character recognition (OCR)光学字符识别optical-mark recognition (OMR)光学标记识别optical mouse光电鼠标Optical scanner光电扫描仪passive-matrix monitor无源矩阵显示器PDA keyboard PDA键盘personal laser printer个人激光打印机photo printer照片打印机picture elements 有效像素Pixel像素Pixel pitch像素间距platform scanner平版式扫描仪Plotter绘图仪pointing stick触控点portable printer便携式打印机portable scanner便携式扫描仪Printer打印机Radio frequency card reader射频卡阅读器Radio frequency identification(RFID)射频识别refresh rate刷新率Resolution分辨率roller ball滚动球shared laser printer共享激光打印机Speakers扬声器Stylus[stail?s]输入笔Technical writer技术文档编写员telephony[tilef?ni]电话Terminal终端thermal printer[θ?:m?l]热敏打印机thin client瘦客户端thin film transistor monitor (TFT)薄膜晶体管显示器toggle key[t?ɡl]切换键touch pad触控板touch screen触摸屏Trackball轨迹球traditional keyboard传统键盘Universal Product Code (UPC)同一产品编码voice-over IP (VoIP)网络电话voice recognition system语音识别系统wand reader棒式阅读器WebCam摄像头wheel button滚动键wireless keyboard无线键盘wireless mouse无线鼠标第八单元access speed存取速度Blu-Ray(BD)蓝光Capacity容量CD (compact disc)光盘CD-R (CD-recordable)可录式CDCD-ROM (compact disc-read only memory)光盘库CD-RW (compact disc 
rewritable)可重写CDCylinder[silind?]柱面Density密度direct access直接存取disk caching磁盘缓存DVD(digital versatile disc or digital video disc)DVD player DVD播放器DVD- R (DVD recordable)可录式DVDDVD +R (DVD recordable)可录式DVDDVD-RAM(DVD random-access memory)DVD随机存取器DVD-ROM(DVD random-read-only memory)DVD只读存储器DVD-ROM jukeboxDVD-RW (DVD rewritable)可重写DVDEnterprise storage system企业存储系统erasable optical disk可擦光盘file compression文件压缩file decompression文件解压缩File server文件服务器flash memory card闪存卡floppy disk软盘Floppy disk cartridge软盘盒floppy disk drive (FDD)软磁盘驱动器hard disk硬盘hard-disk cartridge硬盘盒hard-disk pack硬盘组HD DVD(high-definition DVD)高清DVDhead crash磁头碰撞Hi def(high definition)高清high capacity disk高容量磁盘internal hard disk内置硬盘Internet hard drive网络硬盘驱动器Label标签Land平地magnetic tape磁带magnetic tape reel磁带盒magnetic tape streamer磁带条Media多媒体optical disk光盘optical disk drive光盘驱动器Organizational Internet storage组织性网络存储PC Card hard disk PC卡硬盘Pit坑primary storage主存RAID system磁碟阵列系统Redundant array of inexpensive disks(RAID)廉价磁盘冗余阵列secondary storage辅存Sector扇区sequential access顺序存取Shutter滑盖Software engineer软件工程师solid-state storage固态存储器storage devices存储装置tape cartridge盒式带Track轨道USB drive USB驱动器write-protection notch写入保护缺口第九单元3G cellular networkanalog signal 模拟信号asymmetric digital subscriber line(ADSL)非对称数字用户线路Backbone中枢Bandwidth带宽base station基址bits per second位/秒Bluetooth 蓝牙Broadband宽带broadcast radio无线广播Bus总线bus network总线网络cable modem电缆调制解调器cellular service无线服务Client 客户client/server network system客户/服务网络系统coaxial cable同轴电缆communication channel 信道communication system 通信系统computer network计算机网络Connectivity连通性Demodulation 解调dial-up service拨号服务digital signal数字信号digital subscriber line (DSL)数字用户线路distributed data processing分布式数据处理系统distributed processing分布处理domain name server (DNS)域名服务Ethernet以太网external modem外置调制解调器Extranet外联网fiber-optic cable 光纤电缆Firewall防火墙global positioning system (GPS)全球卫星定位系统hierarchical network树型网络home network家庭网络host computer主机Hub集线器Infrared红外线internal modem 内置式调制解调器Intranet内联网IP address (Internet Protocol 
address)IP地址local area network (LAN)局域网low bandwidth低频带宽medium band 中频波段metropolitan area network (MAN) 城域网Microwave微波Modem调制解调器Modulation调制network administrator网络管理员network architecture网络体系结构network gateway 网关network hub 网络集线器network interface card (NIC)网络接口卡network operating system (NOS)网络操作系统Node 节点Packet 数据包PC card modem PC卡调制解调器peer-to-peer network system 对等网络系统Polling 轮流检测Protocol协议proxy server代理服务器ring network环型网络Satellite卫星satellite/air connection service卫星互连服务Server服务器star network 星型网络Strategy策略T1, T2, T3, T4 linestelephone line电话线terminal network 终端网络time-sharing system并发式系统Topology拓扑结构transfer rate传输率TCP/IP (transmission control protocol/Internet protocol)传输控制协议/因特网协议voiceband声音带宽wide area network (W AN)广域网Wi-FI (wireless fidelity)无限保真wireless LAN (WLAN)无线局域网wireless modem无线调制解调器wireless receiver无线接收器课后习题答案:Ch1: Ch6:bbabd,dacdd; eichafgbdj. dbcbb,aacbd; jachgidefb.Ch3: Ch7:dcbdd,abccb; jachbdiegf. Ababd,adcdb; iefjabgdch.Ch4: Ch8:aaaba,bcbab; igdecfhbja. dacac,acddc; gajidbhcfe.Ch5: Ch9:cdcaa,cbbac; gdfbghaeic. 
abadc,dbbbd; gbaidecjhf.中英文对照的ERP专业词汇介绍:B2C、B2B、ASP、APS、BOM、C/S、CAD、CAM、CPC、EDI、GUI、ISO、MIS、PM、SCM、SQL、TQM、line item、planned capacity、rated capacity、virtual warehouse……1 ABM Activity-based Management 基于作业活动管理2 AO Application Outsourcing 应用程序外包3 APICS American Production and Inventory Control Society,Inc 美国生产与库存管理协会4 APICS Applied Manufacturing Education Series 实用制造管理系列培训教材5 APO Advanced Planning and Optimization 先进计划及优化技术6 APS Advanced Planning and Scheduling 高级计划与排程技术7 ASP Application Service/Software Provider 应用服务/软件供应商8 ATO Assemble To Order 定货组装9 ATP Available To Promise 可供销售量(可签约量)10 B2B Business to Business 企业对企业(电子商务)11 B2C Business to Consumer 企业对消费者(电子商务)12 B2G Business to Government 企业对政府(电子商务)13 B2R Business to Retailer 企业对经销商(电子商务)14 BIS Business Intelligence System 商业智能系统15 BOM Bill Of Materials 物料清单16 BOR Bill Of Resource 资源清单17 BPR Business Process Reengineering 业务/企业流程重组18 BPM Business Process Management 业务/企业流程管理19 BPS Business Process Standard 业务/企业流程标准20 C/S Client/Server(C/S)\Browser/Server(B/S) 客户机/服务器\浏览器/服务器21 CAD Computer-Aided Design 计算机辅助设计22 CAID Computer-Aided Industrial Design 计算机辅助工艺设计23 CAM Computer-Aided Manufacturing 计算机辅助制造24 CAPP Computer-Aided Process Planning 计算机辅助工艺设计25 CASE Computer-Aided Software Engineering 计算机辅助软件工程26 CC Collaborative Commerce 协同商务27 CIMS Computer Integrated Manufacturing System 计算机集成制造系统28 CMM Capability Maturity Model 能力成熟度模型29 COMMS Customer Oriented Manufacturing Management System 面向客户制造管理系统30 CORBA Common Object Request Broker Architecture 通用对象请求代理结构31 CPC Collaborative Product Commerce 协同产品商务32 CPIM Certified Production and Inventory Management 生产与库存管理认证资格33 CPM Critical Path Method 关键线路法34 CRM Customer Relationship Management 客户关系管理35 CRP capacity requirements planning 能力需求计划36 CTI Computer Telephony Integration 电脑电话集成(呼叫中心)37 CTP Capable to Promise 可承诺的能力38 DCOM Distributed Component Object Model 分布式组件对象模型39 DCS Distributed Control System 分布式控制系统40 DMRP Distributed MRP 分布式MRP41 DRP Distribution 
Resource Planning 分销资源计划42 DSS Decision Support System 决策支持系统43 DTF Demand Time Fence 需求时界44 DTP Delivery to Promise 可承诺的交货时间45 EAI Enterprise Application Integration 企业应用集成46 EAM Enterprise Assets Management 企业资源管理47 ECM Enterprise Commerce Management 企业商务管理48 ECO Engineering Change Order 工程变更订单49 EDI Electronic Data Interchange 电子数据交换50 EDP Electronic Data Processing 电子数据处理51 EEA Extended Enterprise Applications 扩展企业应用系统52 EIP Enterprise Information Portal 企业信息门户53 EIS Executive Information System 高层领导信息系统54 EOI Economic Order Interval 经济定货周期55 EOQ Economic Order Quantity 经济订货批量(经济批量法)56 EPA Enterprise Proficiency Analysis 企业绩效分析57 ERP Enterprise Resource Planning 企业资源计划58 ERM Enterprise Resource Management 企业资源管理59 ETO Engineer To Order 专项设计,按订单设计60 FAS Final Assembly Schedule 最终装配计划61 FCS Finite Capacity Scheduling 有限能力计划62 FMS Flexible Manufacturing System 柔性制造系统63 FOQ Fixed Order Quantity 固定定货批量法64 GL General Ledger 总账65 GUI Graphical User Interface 图形用户界面66 HRM Human Resource Management 人力资源管理67 HRP Human Resource Planning 人力资源计划68 IE Industry Engineering/Internet Exploration 工业工程/浏览器69 ISO International Standard Organization 国际标准化组织70 ISP Internet Service Provider 互联网服务提供商71 ISPE International Society for Productivity Enhancement 国际生产力促进会72 IT/GT Information/Group Technology 信息/成组技术73 JIT Just In Time 准时制造/准时制生产74 KPA Key Process Areas 关键过程域75 KPI Key Performance Indicators 关键业绩指标76 LP Lean Production 精益生产77 MES Manufacturing Executive System 制造执行系统78 MIS Management Information System 管理信息系统79 MPS Master Production Schedule 主生产计划80 MRP Material Requirements Planning 物料需求计划81 MRPII Manufacturing Resource Planning 制造资源计划82 MTO Make To Order 定货(订货)生产83 MTS Make To Stock 现货(备货)生产84 OA Office Automation 办公自动化85 OEM Original Equipment Manufacturing 原始设备制造商86 OPT Optimized Production Technology 最优生产技术87 OPT Optimized Production Timetable 最优生产时刻表88 PADIS Production And Decision Information System 生产和决策管理信息系统89 PDM Product Data Management 产品数据管理90 PERT Program 
Evaluation Research Technology 计划评审技术91 PLM Production Lifecycle Management 产品生命周期管理92 PM Project Management 项目管理93 POQ Period Order Quantity 周期定量法94 PRM Partner Relationship Management 合作伙伴关系管理95 PTF Planned Time Fence 计划时界96 PTX Private Trade Exchange 自用交易网站97 RCCP Rough-Cut Capacity Planning 粗能力计划98 RDBM Relational Data Base Management 关系数据库管理99 RPM Rapid Prototype Manufacturing 快速原形制造100 RRP Resource Requirements Planning 资源需求计划101 SCM Supply Chain Management 供应链管理102 SCP Supply Chain Partnership 供应链合作伙伴关系103 SFA Sales Force Automation 销售自动化104 SMED Single-Minute Exchange Of Dies 快速换模法105 SOP Sales And Operation Planning 销售与运作规划106 SQL Structure Query Language 结构化查询语言107 TCO Total Cost Ownership 总体运营成本108 TEI Total Enterprise Integration 全面企业集成109 TOC Theory Of Constraints/Constraints managemant 约束理论/约束管理110 TPM Total Productive Maintenance 全员生产力维护111 TQC Total Quality Control 全面质量控制112 TQM Total Quality Management 全面质量管理113 WBS Work Breakdown System 工作分解系统114 XML eXtensible Markup Language 可扩展标记语言115 ABC Classification(Activity Based Classification) ABC分类法116 ABC costing 作业成本法117 ABC inventory control ABC 库存控制118 abnormal demand 反常需求119 acquisition cost ,ordering cost 定货费120 action message 行为/活动(措施)信息121 action report flag 活动报告标志122 activity cost pool 作业成本集123 activity-based costing(ABC) 作业基准成本法/业务成本法124 actual capacity 实际能力125 adjust on hand 调整现有库存量126 advanced manufacturing technology 先进制造技术127 advanced pricing 高级定价系统128 AM Agile Manufacturing 敏捷制造129 alternative routing 替代工序(工艺路线)130 Anticipated Delay Report 拖期预报131 anticipation inventory 预期储备132 apportionment code 分摊码133 assembly parts list 装配零件表134 automated storage/retrieval system 自动仓储/检索系统135 Automatic Rescheduling 计划自动重排136 available inventory 可达到库存137 available material 可用物料138 available stock 达到库存139 available work 可利用工时140 average inventory 平均库存141 back order 欠交(脱期)订单142 back scheduling 倒排(序)计划/倒序排产?143 base currency 本位币144 batch number 批号145 batch process 批流程146 batch production 批量生产147 
benchmarking 标杆瞄准(管理)148 bill of labor 工时清单149 bill of lading 提货单150 branch warehouse 分库151 bucketless system 无时段系统152 business framework 业务框架153 business plan 经营规划154 capacity level 能力利用水平155 capacity load 能力负荷156 capacity management 能力管理157 carrying cost 保管费158 carrying cost rate 保管费率159 cellular manufacturing 单元式制造160 change route 修改工序161 change structure 修改产品结构162 check point 检查点163 closed loop MRP 闭环MRP164 Common Route Code(ID) 通用工序标识165 component-based development 组件(构件)开发技术166 concurrent engineering 并行(同步)工程167 conference room pilot 会议室模拟168 configuration code 配置代码169 continuous improvement 进取不懈170 continuous process 连续流程171 cost driver 作业成本发生因素172 cost driver rate 作业成本发生因素单位费用173 cost of stockout 短缺损失174 cost roll-up 成本滚动计算法175 crew size 班组规模176 critical part 急需零件177 critical ratio 紧迫系数178 critical work center 关键工作中心179 CLT Cumulative Lead Time 累计提前期180 current run hour 现有运转工时181 current run quantity 现有运转数量182 customer care 客户关怀183 customer deliver lead time 客户交货提前期184 customer loyalty 客户忠诚度185 customer order number 客户订单号186 customer satisfaction 客户满意度187 customer status 客户状况188 cycle counting 周期盘点189 DM Data Mining 数据挖掘190 Data Warehouse 数据仓库191 days offset 偏置天数192 dead load 空负荷193 demand cycle 需求周期194 demand forecasting 需求预测195 demand management 需求管理196 Deming circle 戴明环197 demonstrated capacity 实际能力198 discrete manufacturing 离散型生产199 dispatch to 调度200 DRP Distribution Requirements Planning 分销需求计划201 drop shipment 直运202 dunning letter 催款信203 ECO workbench ECO工作台204 employee enrolled 在册员工205 employee tax id 员工税号206 end item 最终产品207 engineering change mode flag 工程变更方式标志208 engineering change notice 工程变更通知209 equipment distribution 设备分配210 equipment management 设备管理211 exception control 例外控制212 excess material analysis 呆滞物料分析213 expedite code 急送代码214 external integration 外部集成215 fabrication order 加工订单216 factory order 工厂订单217 fast path method 快速路径法218 fill backorder 补足欠交219 final assembly lead time 总装提前期220 final goods 成品221 finite forward scheduling 
有限顺排计划222 finite loading 有限排负荷223 firm planned order 确认的计划订单224 firm planned time fence 确认计划需求时界225 FPR Fixed Period Requirements 定期用量法226 fixed quantity 固定数量法227 fixed time 固定时间法228 floor stock 作业现场库存229 flow shop 流水车间230 focus forecasting 调焦预测231 forward scheduling 顺排计划232 freeze code 冻结码233 freeze space 冷冻区234 frozen order 冻结订单235 gross requirements 毛需求236 hedge inventory 囤积库存237 in process inventory 在制品库存238 in stock 在库239 incrementing 增值240 indirect cost 间接成本241 indirect labor 间接人工242 infinite loading 无限排负荷243 input/output control 投入/产出控制244 inspection ID 检验标识245 integrity 完整性246 inter companies 公司内部间247 interplant demands 厂际需求量248 inventory carry rate 库存周转率249 inventory cycle time 库存周期250 inventory issue 库存发放251 inventory location type 仓库库位类型252 inventory scrap 库存报废量253 inventory transfers 库存转移254 inventory turns/turnover 库存(资金)周转次数255 invoice address 发票地址256 invoice amount gross 发票金额257 invoice schedule 发票清单258 issue cycle 发放周期259 issue order 发送订单260 issue parts 发放零件261 issue policy 发放策略262 item availability 项目可供量263 item description 项目说明264 item number 项目编号265 item record 项目记录266 item remark 项目备注267 item status 项目状态268 job shop 加工车间269 job step 作业步骤270 kit item 配套件项目271 labor hour 人工工时272 late days 延迟天数273 lead time 提前期274 lead time level 提前期水平275 lead time offset days 提前期偏置(补偿)天数276 least slack per operation 最小单个工序平均时差277 line item 单项产品278 live pilot 应用模拟279 load leveling 负荷量280 load report 负荷报告281 location code 仓位代码282 location remarks 仓位备注283 location status 仓位状况284 lot for lot 按需定货(因需定量法/缺补法)285 lot ID 批量标识286 lot number 批量编号287 lot number traceability 批号跟踪288 lot size 批量289 lot size inventory 批量库存290 lot sizing 批量规划291 low level code 低层(位)码292 machine capacity 机器能力293 machine hours 机时294 machine loading 机器加载295 maintenance ,repair,and operating supplies 维护修理操作物料296 make or buy decision 外购或自制决策297 management by exception 例外管理法298 manufacturing cycle time 制造周期时间299 manufacturing lead time 制造提前期300 manufacturing standards 制造标准301 master scheduler 
主生产计划员302 material 物料303 material available 物料可用量304 material cost 物料成本305 material issues and receipts 物料发放和接收306 material management 物料管理307 material manager 物料经理308 material master,item master 物料主文件309 material review board 物料核定机构310 measure of velocity 生产速率水平311 memory-based processing speed 基于存储的处理速度312 minimum balance 最小库存余量313 Modern Materials Handling 现代物料搬运314 month to date 月累计315 move time , transit time 传递时间316 MSP book flag MPS登录标志317 multi-currency 多币制318 multi-facility 多场所319 multi-level 多级320 multi-plant management 多工厂管理321 multiple location 多重仓位322 net change 净改变法323 net change MRP 净改变式MRP324 net requirements 净需求325 new location 新仓位326 new parent 新组件327 new warehouse 新仓库328 next code 后续编码329 next number 后续编号330 No action report 不活动报告331 non-nettable 不可动用量332 on demand 急需的333 on-hand balance 现有库存量334 on hold 挂起335 on time 准时336 open amount 未清金额337 open order 未结订单/开放订单338 order activity rules 订单活动规则339 order address 订单地址340 order entry 订单输入341 order point 定货点342 order point system 定货点法343 order policy 定货策略344 order promising 定货承诺345 order remarks 定货备注346 ordered by 定货者347 overflow location 超量库位348 overhead apportionment/allocation 间接费分配349 overhead rate,burden factor,absorption rate 间接费率350 owner's equity 所有者权益351 parent item 母件352 part bills 零件清单353 part lot 零件批次354 part number 零件编号355 people involvement 全员参治356 performance measurement 业绩评价357 physical inventory 实际库存358 picking 领料/提货359 planned capacity 计划能力360 planned order 计划订单361 planned order receipts 计划产出量362 planned order releases 计划投入量363 planning horizon 计划期/计划展望期364 point of use 使用点365 Policy and procedure 工作准则与工作规程366 price adjustments 价格调整367 price invoice 发票价格368 price level 物价水平369 price purchase order 采购订单价格370 priority planning 优先计划371 processing manufacturing 流程制造372 product control 产品控制373 product family 产品系列374 product mix 产品搭配组合375 production activity control 生产作业控制376 production cycle 生产周期377 production line 产品线378 production rate 产品率379 production tree 产品结构树380 PAB Projected 
Available Balance 预计可用库存(量) 381 purchase order tracking 采购订单跟踪382 quantity allocation 已分配量383 quantity at location 仓位数量384 quantity backorder 欠交数量385 quantity completion 完成数量386 quantity demand 需求量387 quantity gross 毛需求量388 quantity in 进货数量389 quantity on hand 现有数量390 quantity scrapped 废品数量391 quantity shipped 发货数量392 queue time 排队时间393 rated capacity 额定能力394 receipt document 收款单据395 reference number 参考号396 regenerated MRP 重生成式MRP397 released order 下达订单398 reorder point 再订购点399 repetitive manufacturing 重复式生产(制造)400 replacement parts 替换零件401 required capacity 需求能力402 requisition orders 请购单403 rescheduling assumption 重排假设404 resupply order 补库单405 rework bill 返工单406 roll up 上滚407 rough cut resource planning 粗资源计划408 rounding amount 舍入金额409 run time 加工(运行)时间410 safety lead time 安全提前期411 safety stock 安全库存412 safety time 保险期413 sales order 销售订单414 scheduled receipts 计划接收量(预计入库量/预期到货量) 415 seasonal stock 季节储备416 send part 发送零件417 service and support 服务和支持418 service parts 维修件419 set up time 准备时间420 ship address 发运地址421 ship contact 发运单联系人422 ship order 发货单423 shop calendar 工厂日历(车间日历)424 shop floor control 车间作业管理(控制)425 shop order , work order 车间订单426 shrink factor 损耗因子(系数)427 single level where used 单层物料反查表428 standard cost system 标准成本体系429 standard hours 标准工时430 standard product cost 标准产品成本431 standard set up hour 标准机器设置工时432 standard unit run hour 标准单位运转工时433 standard wage rate 标准工资率434 status code 状态代码435 stores control 库存控制436 suggested work order 建议工作单437 supply chain 供应链438 synchronous manufacturing 同步制造/同期生产439 time bucket 时段(时间段)。

Slides1


Web site: /
Parallel Programming Platforms
Implicit Parallelism: Trends in Microprocessor Architectures
Executing multiple instructions in a single clock cycle.
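To illustrate what executing multiple instructions in a single clock cycle buys, here is a toy issue model. It is a sketch under simplifying assumptions that are not from the slides: instructions issue in program order, every result is available one cycle after issue, and an instruction cannot issue in the same cycle as an earlier instruction it depends on. The function `count_cycles` and the register names are hypothetical.

```python
def count_cycles(program, width=2):
    """Cycles needed to issue `program` in order, at most `width` per cycle.

    Each instruction is a pair (dest, sources). An instruction waits for the
    next cycle if an instruction already issued in the current cycle writes
    one of its sources or its destination.
    """
    cycles = 0
    i = 0
    while i < len(program):
        written = set()   # registers written by instructions issued this cycle
        issued = 0
        while i < len(program) and issued < width:
            dest, sources = program[i]
            if written & set(sources) or dest in written:
                break     # depends on this cycle's results: wait a cycle
            written.add(dest)
            issued += 1
            i += 1
        cycles += 1
    return cycles


independent = [
    ("r1", ("a", "b")),    # r1 = a + b
    ("r2", ("c", "d")),    # r2 = c + d, independent of r1
    ("r3", ("r1", "r2")),  # r3 = r1 * r2, needs both earlier results
    ("r4", ("e", "f")),    # r4 = e + f, independent
]
chain = [
    ("r1", ("a",)),
    ("r2", ("r1",)),       # each instruction needs the previous result
    ("r3", ("r2",)),
]

if __name__ == "__main__":
    print(count_cycles(independent, width=1), count_cycles(independent, width=2))
    print(count_cycles(chain, width=2))
```

In this model the wider issue width halves the cycle count for the first program but does nothing for the dependent chain, so the benefit depends on how much independent work the instruction stream contains.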
Scope of Parallel Computing
Applications in Engineering
Scientific Applications
Commercial Applications
Applications in Computer Systems
Everywhere
Course Content
Introduction to parallel architectures and the basic theoretical principles of parallel algorithms and programming, including some parallel programming tools.
Practice: hands-on parallel programming on shared-memory and message-passing parallel architectures.
The key question is how to use a very large number of transistors to achieve increasing rates of computation.
The Memory/Disk Speed Argument
Parallel platforms yield better memory system performance.

02_2 Parallel Computers (System Architecture)


[Figure: processors (P), each paired with a memory (M)]
2019/2/23
Different memory organizations for building parallel machines
Multiprocessors (single address space, shared memory):
UMA (central memory): PVP (Cray T90); SMP (Intel SHV, SunFire, DEC 8400, SGI PowerChallenge, IBM R60, etc.)
COMA: (KSR-1, DDM)
CC-NUMA: (Stanford Dash, SGI Origin 2000, Sequent NUMA-Q, HP/Convex Exemplar)
NCC-NUMA: (Cray T3E)
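MPP (Massively Parallel Processor)
▪ Processing nodes are built from microprocessors
▪ Memory is physically distributed across the system
▪ The interconnection network offers high communication bandwidth and low latency (specially designed and custom-built)
▪ The system scales to hundreds, thousands, or even tens of thousands of processors
▪ Asynchronous MIMD: the processes that make up a program each have their own address space and interact by message passing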
Origin3000 and Altix3000
[Figure: Origin3000 and Altix3000 systems]
Memory Access Models for Parallel Computers

UMA / NUMA / COMA / CC-NUMA / NORMA
Memory Access Models for Parallel Computers (1)

UMA (Uniform Memory Access) is short for the uniform memory access model. Its characteristics are:
[Figure: nodes 1 to N, each with processors/caches (P/C), memory (Mem), and I/O, connected by a bus or crossbar switch]

Parallel Computing: Parallel Programming (Zhang Daqiang, Tongji University)

partial sums
▪ Minimizing the cost of communication improved speedup
- Moved students (“processors”) closer together (or let them shout)
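The partial-sums exercise mentioned above can be sketched in Python. This is an illustration under assumptions, not the classroom demo itself: `split` and `parallel_sum` are hypothetical helper names, and threads stand in for the students. Each worker sums its own chunk independently, and the only values that have to be communicated are the partial sums, one per worker.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Split data into `parts` contiguous chunks of near-equal size."""
    base, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for p in range(parts):
        size = base + (1 if p < extra else 0)
        chunks.append(data[start:start + size])
        start += size
    return chunks


def parallel_sum(data, workers=4):
    """Return (total, partial_sums) computed with `workers` threads."""
    chunks = split(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(sum, chunks))  # independent local work
    # Communication step: only one number per worker is combined here.
    return sum(partial_sums), partial_sums


if __name__ == "__main__":
    print(parallel_sum(list(range(1, 101)), workers=4))
```

Keeping the communicated data down to one number per worker is the code-level counterpart of moving the students closer together: less time is spent exchanging values relative to computing them.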
Tongji 217004301
Course theme 2: Parallel computer hardware implementation: how parallel computers work
▪ Mechanisms used to implement abstractions efficiently
- Some students (“processors”) ran out of work to do (went idle), while others were still working on their assigned task
▪ Improving the distribution of work
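The effect of work distribution can be shown with a small deterministic model. The sketch below is an assumption-laden illustration: the task costs are made-up numbers, and `static_makespan` and `dynamic_makespan` are hypothetical names. It compares handing each worker one contiguous block of tasks against letting whichever worker is free first take the next task.

```python
import heapq


def static_makespan(task_costs, workers):
    """One contiguous block per worker; finish time is the slowest block."""
    base, extra = divmod(len(task_costs), workers)
    loads, start = [], 0
    for w in range(workers):
        size = base + (1 if w < extra else 0)
        loads.append(sum(task_costs[start:start + size]))
        start += size
    return max(loads)


def dynamic_makespan(task_costs, workers):
    """Each task goes to the worker that becomes free first (shared queue)."""
    free_at = [0] * workers           # time at which each worker is next idle
    heapq.heapify(free_at)
    for cost in task_costs:
        earliest = heapq.heappop(free_at)
        heapq.heappush(free_at, earliest + cost)
    return max(free_at)


if __name__ == "__main__":
    costs = [8, 7, 6, 5, 1, 1, 1, 1]   # hypothetical, uneven task costs
    print(static_makespan(costs, 2), dynamic_makespan(costs, 2))
```

With these costs the block assignment leaves one worker idle for most of the run, while the shared-queue assignment finishes in the ideal time of half the total work.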
Everyone needs to understand
Tunes
“Long Time Coming” (A Change is
“I’d spent all winter break waiting to write some parallel code, and when I got back in front of a machine I was so jacked I ended up just spawning pthreads all night long.”
- Leela James, on the inspiration for “Long Time Coming”

Parallel Computing Toolbox


Parallel computing with MATLAB. You can use Parallel Computing Toolbox to run applications on a multicore desktop with local workers available in the toolbox, take advantage of GPUs, and scale up to a cluster (with MATLAB Distributed Computing Server).

Programming Parallel Applications

Parallel Computing Toolbox provides several high-level programming constructs that let you convert your applications to take advantage of computers equipped with multicore processors and GPUs. Constructs such as parallel for-loops (parfor) and special array types for distributed processing and for GPU computing simplify parallel code development by abstracting away the complexity of managing computations and data between your MATLAB session and the computing resource you are using.

You can run the same application on a variety of computing resources without reprogramming it. The parallel constructs function in the same way, regardless of the resource on which your application runs: a multicore desktop (using the toolbox) or a larger resource such as a computer cluster (using the toolbox with MATLAB Distributed Computing Server).

Using Built-In Parallel Algorithms in Other MathWorks Products

Key functions in several MathWorks products have built-in parallel algorithms. In the presence of Parallel Computing Toolbox, these functions can distribute computations across available parallel computing resources, allowing you to speed up not just your MATLAB and Simulink based analysis or simulation tasks but also code generation for large Simulink models. You do not have to write any parallel code to take advantage of these functions.

Using built-in parallel algorithms in MathWorks products. Built-in parallel algorithms can speed up MATLAB and Simulink computations as well as code generation from Simulink models.

Speeding Up Task-Parallel Applications

You can speed up some applications by organizing them into independent tasks (units of work) and executing multiple tasks concurrently.
This class of task-parallel applications includes simulations for design optimization, BER testing, Monte Carlo simulations, and repetitive analysis on a large number of data files.

The toolbox offers parfor, a parallel for-loop construct that can automatically distribute independent tasks to multiple MATLAB workers (MATLAB computational engines running independently of your desktop MATLAB session). This construct automatically detects the presence of workers and reverts to serial behavior if none are present. You can also set up task execution using other methods, such as manipulating task objects in the toolbox.

Using parallel for-loops for a task-parallel application. You can use parallel for-loops in MATLAB scripts and functions and execute them both interactively and offline.

Speeding Up MATLAB Computations with GPUs

Parallel Computing Toolbox provides GPUArray, a special array type with several associated functions that lets you perform computations on CUDA-enabled NVIDIA GPUs directly from MATLAB. Functions include fft, element-wise operations, and several linear algebra operations such as lu and mldivide (the backslash operator, \). The toolbox also provides a mechanism that lets you use your existing CUDA-based GPU kernels directly from MATLAB.

GPU computing with MATLAB. Using GPUArrays and GPU-enabled MATLAB functions helps speed up MATLAB operations without low-level CUDA programming.

Scaling Up to Clusters, Grids, and Clouds Using MATLAB Distributed Computing Server

Parallel Computing Toolbox provides the ability to run MATLAB workers locally on your multicore desktop to execute your parallel applications, allowing you to fully use the computational power of your desktop.
Using the toolbox in conjunction with MATLAB Distributed Computing Server, you can run your applications on large-scale computing resources such as computer clusters or grid and cloud computing resources.

Running a gene regulation model on a cluster using MATLAB Distributed Computing Server. The server enables applications developed using Parallel Computing Toolbox to harness computer clusters for large problems.

Listening to the World’s Oceans: Searching for Marine Mammals by Detecting and Classifying Terabytes of Bioacoustic Data in Clouds of Noise
This session describes how Cornell University Bioacoustics Research Program data scientists use MATLAB to develop high-performance computing software to process and analyze terabytes of acoustic data.

Implementing Data-Parallel Applications using the Toolbox and MATLAB Distributed Computing Server

Distributed arrays in Parallel Computing Toolbox are special arrays that hold several times the amount of data that your desktop computer’s memory (RAM) can hold. Distributed arrays apportion the data across several MATLAB worker processes running on a computer cluster (using MATLAB Distributed Computing Server). As a result, with distributed arrays you can overcome the memory limits of your desktop computer and solve problems that require manipulating very large matrices.

With over 150 functions available for working with distributed arrays, you can interact with these arrays as you would with MATLAB arrays and manipulate data available remotely on workers without low-level MPI programming. Available functions include linear algebra routines based on ScaLAPACK, such as mldivide (the backslash operator, \), lu, and chol, and functions for moving distributed data to and from MAT-files.

For fine-grained control over your parallelization scheme, the toolbox provides the single program multiple data (spmd) construct and several message-passing routines based on an MPI standard library (MPICH2).
The spmd construct lets you designate sections of your code to run concurrently across workers participating in a parallel computation. During program execution, spmd automatically transfers data and code used within its body to the workers and, once the execution is complete, brings results back to the MATLAB client session. Message-passing functions for send, receive, broadcast, barrier, and probe operations are available.

Programming with distributed arrays. Distributed arrays and parallel algorithms let you create data-parallel MATLAB programs with minimal changes to your code and without programming in MPI.

Running Parallel Applications Interactively and as Batch Jobs

You can execute parallel applications interactively and in batch using Parallel Computing Toolbox. Using the parpool command, you can connect your MATLAB session to a pool of MATLAB workers that can run either locally on your desktop (using the toolbox) or on a computer cluster (using MATLAB Distributed Computing Server) to set up a dedicated interactive parallel execution environment. You can execute parallel applications from the MATLAB prompt on these workers and retrieve results immediately as computations finish, just as you would in any MATLAB session.

Running applications interactively is suitable when execution time is relatively short. When your applications need to run for a long time, you can use the toolbox to set them up to run as batch jobs. This enables you to free your MATLAB session for other activities while you execute large MATLAB and Simulink applications.

While your application executes in batch, you can shut down your MATLAB session and retrieve results later. The toolbox provides several mechanisms to manage offline execution of parallel programs, such as the batch function and job and task objects.
Both the batch function and the job and task objects can be used to offload the execution of serial MATLAB and Simulink applications from a desktop MATLAB session.

Running parallel applications interactively and as batch jobs. You can run applications on your workstation using twelve workers available with the toolbox, or on a computer cluster using more workers available with MATLAB Distributed Computing Server.

Simulating 3D HF in a Parallel Computing Environment


Bruce J. Carter, Anthony R. Ingraffea, Gerd Heber
Cornell Fracture Group/Computational Materials Institute
Cornell Theory Center, Cornell University, Ithaca, NY

Three Dimensional and Advanced Hydraulic Fracture Workshop
Pacific Rocks 2000: Rock Around the Rim
Fourth North American Rock Mechanics Symposium
Seattle, Washington, USA
July 29, 2000

Summary

Three-dimensional hydraulic fracture (HF) models have been under development since the late 1970s. The Cornell Fracture Group (CFG), in collaboration with Schlumberger-Dowell and Schlumberger Cambridge Research, worked on the development of a fully-3D HF simulator from 1988 through 1996. This software, called HYFRANC3D, has seen limited use in the petroleum industry, primarily because of the significant time and effort required to simulate the growth of a fully-3D HF. However, increasing computer speeds and decreasing computer costs have made the use of 3D models more practical. Parallel computing platforms are aiding this process by drastically reducing the time-to-solution. The Cornell Theory Center (CTC) now has two industry standard Dell/Intel PC clusters running the Microsoft Windows 2000 operating system. The authors are part of a group of researchers from computer science, physics, and engineering who are taking advantage of these clusters to perform large scale 3D finite element based fracture simulations. The goal of the project is to perform a thousand steps of crack growth, involving a million degrees of freedom (DOF) per step, in an hour. The long term goal is to extend the general framework to specific applications such as HF.

Introduction

Parallel computers are not required for 3D hydraulic fracture (HF) simulation, but they sure are nice to have. Multiple processors, each with their own memory, allow much larger problems to be solved much faster by avoiding disk access and by simultaneously solving many small parts of the whole model.
Parallel computers, in the past, were specially designed, large, expensive machines. Today, however, a relatively inexpensive parallel computer can be built from a cluster of industry standard PCs. The Cornell Theory Center (CTC) has shifted from an IBM SP2 to a cluster of Dell/Intel PCs, and the Cornell Fracture Group (CFG) software is being ported to this new environment.

The CFG [19], in collaboration with Schlumberger [30], worked on a serial version of a fully-3D HF simulator from 1988 through 1996. This software, called HYFRANC3D and available from the CFG, is based on a general purpose 3D fracture analysis software package that includes OSM, FRANC3D, and BES. OSM is a program for building geometric models. FRANC3D is a pre- and post-processor for simulating crack growth. BES is a linear elastic boundary element code. HYFRANC3D builds upon FRANC3D by adding a module for simulating fluid flow in the 3D fractures.

The software has seen limited use in the petroleum industry, primarily because of the significant time and effort required to model a fully-3D HF. Most HF simulators used in industry are either 2D or pseudo-3D (planar 3D cracks). This is because the time-to-solution for these simulators is seconds or minutes, which allows for active control of an HF stimulation in the field. A fully-3D, linear elastic, boundary element solution of a complex geometry with one or more arbitrarily shaped cracks can take hours (or days) on a single computer processor using BES. A BES analysis is required for each step of crack growth, meaning that a reasonable number of crack growth steps (20 to 30) could take weeks. That is just too much time, even in a research environment.

To decrease the time-to-solution, BES was ported to the CTC [10] IBM SP2 in 1996 [1]. The time-to-solution decreased significantly, as a single analysis could be obtained in an hour or less using 32 to 64 processors.
However, very few companies have a large IBM SP2 in their engineering field offices, meaning that the HYFRANC3D software was still only useful as a research tool. BES now has been ported to the CTC’s new Dell/Intel PC cluster, which makes parallel computations more accessible. To further reduce computing time, parallel FE simulations are being explored.

In 1998, the NSF funded the CCISE: Crack Propagation on Teraflop Computers [9] project at Cornell. This project combines researchers from computer science, physics, and engineering with a common goal of simulating 3D fracture growth very quickly. The project is based on the development of several 3D finite element mesh generators and several iterative solvers in a parallel computing environment. The initial software was developed on the CTC IBM SP2. However, it now has been ported to the new cluster of PCs. Although this software is not yet ready to solve 3D HF simulations, it has the potential to make fully 3D HF simulation widely accessible.

The HYFRANC3D software is briefly described along with the past and present computing environments at the CTC. The recent progress in 3D finite element based fracture simulation is then described along with comparisons of the past and present parallel computing platforms and software. Extension to HF simulation lies ahead.

HYFRANC3D and Accompanying Software

The FRANC3D software is described by Carter et al. [2]. The HYFRANC3D software is described by Carter et al. [3]. A brief overview of the numerical algorithm is provided here. Although the fluid flow equations are not the central theme, the flow is tightly coupled to the elastic solution and the manner in which the elastic solution is obtained. The parallel implementation of the fluid flow equations should follow the parallelization of the more time consuming elastic solution.

The HF governing equations consist of linear elasticity, lubrication theory for flow, and mass conservation.
The boundary conditions include far-field stresses, fluid flux in and out of the crack, zero crack aperture at the crack front, and the requirement that the fluid velocity at the crack front equals the crack speed. The elasticity equation relates the fracture width to the fluid pressure:

$$w(x) = \frac{4}{\pi E'} \int_0^L \log\left| \frac{(L^2-x^2)^{1/2} + (L^2-y^2)^{1/2}}{(L^2-x^2)^{1/2} - (L^2-y^2)^{1/2}} \right| \left[ p(y) - \sigma_3 \right] dy$$

The lubrication equation relates the pressure gradient and fracture width with the fluid flux:

$$q = -\frac{n}{2n+1} \left( \frac{1}{2^{n+1} K'} \right)^{1/n} \left( \frac{\partial p}{\partial x} \right)^{1/n} w^{(2n+1)/n}$$

The continuity equation imposes mass conservation and implies that the fluid volume in the fracture should equal the fracture volume:

$$\frac{\partial w}{\partial t} + \frac{\partial q}{\partial x} = 0$$

where
q is the fluid flux
w is the width of the fracture
L is the fracture half length
p is the fluid pressure in the fracture
n and K′ are the fluid viscosity parameters
E′ = E/(1−ν²) is the effective Young's modulus
E is the Young's modulus
ν is Poisson's ratio
x is the distance along the fracture from the fluid inlet
σ3 is the far-field stress normal to the fracture
V is the crack front speed

Combining these equations gives expressions for the pressure p and the width w near the crack front.
[Two equations giving p and w as functions of ξ, together with the definition of the characteristic pressure p_h, could not be recovered from the source text.]
The remaining quantities are

$$L_h = V \left( \frac{K'}{E'} \right)^{1/n'}, \qquad \alpha(n') = \frac{2}{2+n'}, \qquad \xi = L - x$$

For a Newtonian fluid, n = 1 and the pressure has a 1/3 singularity. The stress ahead of the crack has the same order of singularity. The solution has been dubbed LEHF to differentiate it from standard linear elastic fracture mechanics (LEFM). Physically, the pressure cannot be singular, which implies that a fluid lag exists at the crack front.

Figure 1. Crack front region with fluid lag.

For a finite element formulation, the near-crack front behavior is captured by special crack front elements where the analytical equations hold.
The crack front element captures the zone, which includes the fluid lag zone, between the zone where the lubrication equation holds and the immediate crack tip zone where LEFM holds. This avoids having a large number of elements near the crack front and assumes that the behavior is dominated by the “intermediate” region. The assumptions are that there is no sizeable effect of fracture toughness, there is a small fluid lag, there is a sharp pressure drop at the crack front, and there is no effect of pore pressure. The bulk of the fracture is modeled using collapsed solid elements with the lubrication approximation. The fluid flow elements correspond exactly to the BE elements. Thus, limiting the number of elements near the crack front also reduces the elastic solution time.

The finite element formulation leads to a set of n fluid flow equations and n structural equations with n unknown nodal widths and n unknown nodal fluid pressures. The resulting system of equations is given by:

$$\sum_{j=1}^{n} \frac{\hat{w}_j}{\Delta t} \int_{\Omega_i} N_i N_j \, d\Omega + \sum_{j=1}^{n} \hat{p}_j \, \frac{1}{12\mu} \int_{\Omega_i} \Big( \sum_{k=1}^{n} N_k \hat{w}_k \Big)^{3} \operatorname{grad} N_i \cdot \operatorname{grad} N_j \, d\Omega$$
$$= \sum_{j=1}^{n} \frac{\hat{w}_j(t_n)}{\Delta t} \int_{\Omega_i} N_i N_j \, d\Omega - \int_{\Gamma_i} N_i \sum_{k=1}^{l} M_k \hat{\beta}_k \hat{V}_k^{4/3} \, d\Gamma + Q(t) N_i(O) + \frac{1}{12\mu} \int_{\Omega_i} \Big( \sum_{k} N_k \hat{w}_k \Big)^{3} \sum_{j=m+1}^{n} \operatorname{grad} N_i \cdot \operatorname{grad} N_j \, \hat{B}_j \hat{V}_j^{1/3} \, d\Omega$$

where

$$\hat{B}_j = E'^{2/3} \left( \frac{\mu}{3 \hat{\rho}_j} \right)^{1/3} \quad \text{and} \quad \beta = 2 \cdot 3^{7/6} \left( \frac{\mu}{E'} \right)^{1/3} \rho^{2/3}$$

These latter terms account for the fluid flow in the special crack front elements. The relationship between fluid pressure and crack aperture is provided by an elastic influence matrix that is generated by BES. The final solution is a set of crack surface displacements that can be correlated to the stress intensity factors to determine the crack advance and resulting new crack shape.

Figure 2. Finite element mesh of a radial crack from a wellbore.

The fluid flow analysis relies upon an elastic influence matrix that relates the displacements and tractions everywhere in the model to an applied unit pressure at each of the nodes on the crack surface.
Historically, the CFG has used BES to generate this matrix by solving for multiple right hand sides. The elastic analysis and the generation of the influence matrix are the most time consuming and computationally expensive portion of an HF simulation. Therefore, this is the part of the code that has been parallelized first.

Parallel Computing Environment at the Cornell Theory Center

The CTC was an NSF National Supercomputer Center from 1985 to 1998. The parallel computing platform during that era was a massively parallel IBM Scalable POWERparallel Systems SP2. The current SP2 configuration has only 32 thin nodes with 120 MHz POWER2 Super Chips (P2SC), 256 MB RAM per node, and TB3 switching fabric (150 MB/s peak hardware bandwidth), each running AIX 4.2.1. This system will be phased out later this year.

In August 1999, the CTC installed a cluster of 64 Dell PowerEdge servers [5], each with four Intel [28] Pentium® III Xeon 500 MHz processors and running the Microsoft [26] Windows® NT operating system. Dubbed AC3 Velocity (V1), each server has 2 MB of L2 cache per processor, 4 GB of RAM, and 54 GB of hard disk space, with TCP and Giganet interconnects. Due to its success, and due to increasing demands, a second cluster was installed in April 2000. Dubbed Velocity+ (V+), the new cluster consists of 64 nodes, each with dual Pentium III 733 MHz processors, 256 KB L2 cache, 2 GB of RAM, and 27 GB of hard disk space, with full 64-way Giganet [6] interconnect, running Microsoft Windows® 2000 Advanced Server.

The move to cluster computing is a dramatic shift from the IBM SP2. The AC3 cluster is designed to illustrate the feasibility and performance of a cluster of industry standard computer parts. Processors and network switches have evolved to a point where it is possible to build inexpensive, high speed, low latency computer clusters that rival the much more expensive, specially designed, parallel architectures.
Finally, new software, including the release of the new Microsoft Windows® 2000 operating system and robust middleware such as MPI Pro [7] for message passing, has led to a stable software environment that is uniform from desktop to cluster. This enables software development on an inexpensive desktop PC and immediate scalability on the cluster. Of course, one of the drawbacks is that large amounts of software developed on the AIX operating system of the SP2 had to be ported to the Microsoft Windows operating system. This has proven to be less difficult than first imagined, but is not completed yet.

Parallel BEM Software

The boundary element code, BES, was initially ported to the IBM SP2 in 1996 and has recently been ported to the Velocity cluster. The software is a mixture of C and Fortran and uses MPI for message passing. There are two phases to a BE solution, integration and solution. The model is discretized and integration is performed at collocation points (usually at nodes). The collocation points can be evenly distributed amongst a set of processors, as the integrations can be performed independently, making this phase embarrassingly parallel. The result of the integrations is a dense unsymmetric matrix. A QR solver is used to solve the system of equations. Although the solver does operate in parallel, it is not as efficient as the integration phase. Figure 3 shows the parallel speed up of both components on the IBM SP2. A 97% efficiency is achieved with an almost linear scaling for the integrations, but the solver does not scale nearly as well, especially for large problems.

It is possible to use other solvers; in fact, the serial version of BES employs a Gauss elimination direct solver and an iterative solver with a variety of preconditioners. The iterative solver is more efficient for most problems than either of the direct solvers. However, the iterative solver has not been extended to the parallel version.
The reason is that iterative solvers are not well suited for the multiple right hand sides that HF simulations require. This issue needs to be addressed, especially if the FE software described next is extended to HF.

Figure 3. Parallel speed up of BES on the IBM SP2. (Axes: number of processors; speed-up in CPU time.)

Parallel FEM Software

The CFG and a team of computer scientists and physicists are working on the design and implementation of a highly parallel finite element code for 3D fracture analysis. The main challenges in implementing such a code are the following.

1) The simulation requires the solution of many finite-element problems (one for each crack step) involving on the order of a million degrees of freedom (DOF). In contrast to matrices arising from the use of boundary elements (BE), finite-element (FE) matrices are sparse.
2) As cracks grow, the problem geometry and sometimes the topology changes.
3) Because of evolving geometry, repeated volume meshing is necessary. Since it is not practical to manipulate large numbers of meshes by hand, the meshes must come with certain quality guarantees.

Figure 4 shows a flow diagram of the code. During pre-processing, a solid model is created, problem-specific boundary conditions (displacements, tractions, etc.) are imposed, and flaws (cracks) are introduced. In the next step, a volume mesh is created, and linear elasticity equations for the displacements are formulated and solved. An error estimator is used to determine whether the desired accuracy has been reached, or whether refinement is necessary. Once the solution has converged, stress intensities are extracted, and the crack is advanced for one time step. The entire process must be repeated for a number of time (crack growth) steps. Finally, the results are fed back into a fracture analysis tool for post-processing and life prediction.

The most time consuming module in this code is the sparse linear system solver.
One can choose between iterative and direct methods. For the iterative solvers, one can select from a variety of preconditioners (ICC, EBE, SPAI [32], etc.). The modules for mesh generation (QMG [25], DMESH [31], and JMESH [19]), finite-element formulation, preconditioning, and h/p-adaptation are implemented almost exclusively in C and C++ with MPI for message passing. A variety of third party packages for solving and mesh partitioning are used, including BLAS/LAPACK [12], PETSc [13], BlockSolve95 [13], ParMETIS [15], PSPASES [14], and JANUS [33].

Figure 4: Block diagram of a finite-element crack growth simulation.

Example Simulations and Parallel Performance

Two example simulations illustrate the parallel performance of both the BE and FE solution techniques. The first example shows the decreased time-to-solution for parallel compared to serial BE simulations. It also shows a further decrease in solution time for FE compared to BE simulations. The second example better illustrates the performance of the Velocity cluster compared to the IBM SP2 using only the FE solver.

Example 1

The first example is a half-symmetry model of a vertical wellbore with a crack aligned with the borehole axis. This is a typical tutorial HF example; the model is shown in Figure 5. A BE surface mesh is shown in Figure 6; this mesh is rather coarse, but is suitable for a tutorial.

Figure 5. Wellbore model and boundary conditions.

Figure 6. BE surface mesh of wellbore model.

An HF simulation normally involves both an elastic BE analysis and a fluid flow analysis in HYFRANC3D. The results from HYFRANC3D include crack opening displacements (COD), fluid pressures, and flow rates. Figure 7 shows the COD and flow rate arrows for the first and second flow analyses from the tutorial. Since the elastic solution time is generally much greater than the flow analysis time, the examples concentrate on the elastic solution times rather than the flow simulations.

Figure 7. COD contours with flow rate arrows for the first two flow analyses.
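The elastic solution times that the examples concentrate on are usually compared through parallel speedup and efficiency. The following is a minimal Python sketch, assuming the conventional definitions S(p) = T(1)/T(p) and E(p) = S(p)/p, which the paper does not spell out. The function name is hypothetical, and the sample timings are the V+ values reported in Table 1 for the 4038-element BE model with 120 right hand sides.

```python
def speedup_and_efficiency(minutes_by_procs):
    """Map each processor count p to (speedup, efficiency) relative to p = 1.

    Assumes the conventional definitions: speedup = T(1) / T(p) and
    efficiency = speedup / p. The mapping must contain a 1-processor time.
    """
    t1 = minutes_by_procs[1]
    return {
        p: (t1 / t, t1 / t / p)
        for p, t in sorted(minutes_by_procs.items())
    }


# V+ timings in minutes from Table 1 (4038 BE + 120 rhs, one processor per node).
v_plus = {1: 141, 2: 77, 4: 48, 8: 32}

for procs, (s, e) in speedup_and_efficiency(v_plus).items():
    print(f"{procs} processors: speedup {s:.2f}, efficiency {e:.0%}")
```

On these numbers the 8-processor run is about 4.4 times faster than the serial run, which is roughly 55% efficiency.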
Elastic solution times are provided in Table 1 for two mesh densities for the initial crack model only. The solution times for serial and parallel BE analyses, including the solution of multiple right hand sides (rhs), are compared. The parallel solution times on both the SP2 and the V+ cluster are reported. Only one processor per node of the V+ cluster is used, and TCP (100 Mb/s Fast Ethernet) is used for communication. The FE analyses do not include multiple rhs, and their contribution to the FE time-to-solution is unknown.

Table 1: Time-to-solution for the initial crack model of a wellbore.

Wellbore Model   1658 BE     1658 BE + 26 rhs   4038 BE     4038 BE + 120 rhs   14762 FE     35656 FE
                 2541 dofs   2567 dofs          6171 dofs   6291 dofs           62622 dofs   151312 dofs
V+ (1x1)         14 min      14 min             126 min     141 min             4 min        12 min
V+ (2x1)         7 min       7 min              68 min      77 min              ----         6 min
V+ (4x1)         ----        5 min              41 min      48 min              ----         3 min
V+ (8x1)         ----        4 min              26 min      32 min              ----         1.65 min
SP2 (2)          ----        13 min             ----        NEM*                ----         ----
SP2 (4)          ----        8 min              ----        66 min              ----         ----
SP2 (8)          ----        6 min              ----        44 min              ----         ----
*NEM: not enough memory

The results show a reasonable speedup from 1 to 8 processors for the BE analyses on both the SP2 and the V+ cluster. The cost of solving for multiple right hand sides is significant only for larger models where the number of rhs is large. The V+ cluster is clearly faster than the relatively old SP2. A further decrease in time-to-solution is expected for the cluster when using the Giganet switch for communication, especially for larger models. The FE meshes are created using JMESH, an advancing front tetrahedral mesher, starting from the BE surface meshes. The FE solution time is significantly less than the BE solution time for similar accuracy in displacements. To compare the solution accuracy, the stress intensity factors (SIF) along the crack front are compared for the refined BE and FE meshes under static pressure loading.
The average mode I SIF for the middle portion of the crack is 1390 and 1470 psi√in for the BE and FE models, respectively, a difference of about 5%. The value at the exact center differs by only 0.8%.

Example 2

The second example is based on grout pressure control in a cracked concrete dam [4]. A model of the doubly-curved concrete arch dam with cracks on the upstream face is shown in Figure 8. Cement grout is to be pumped into the crack to repair it without causing further crack growth by over-pressurization. This model is used here as a test case for the parallel FE solver; thus, the loading is static and no multiple rhs are required.

On the V1 cluster, there are 4 processors in each node. It is possible to use only a subset of the processors in a given node. In the illustrations below, the notations V1 (#x1), V1 (#x2), and V1 (#x4) refer to experiments where 1, 2, and 4 processors per node are used, respectively.

Figure 8. Doubly-curved concrete arch dam.

Table 2 shows the timing results for the FE solution of the initial crack model on the SP2 and the V1 cluster. All execution times are in seconds; note that the total time includes the time for formulation and assembly of the matrix, which is not shown here, but is generally less than 10% of the total time. Figure 9 shows the relative speed up on both computers as the number of processors increases. The V1 cluster delivers better scalability than the SP2 for this FE solver. Furthermore, for large processor numbers, the V1 cluster delivers better absolute performance. There are two reasons for this: (i) the processors on the V1 are a generation ahead of the processors on the relatively old SP2, and (ii) the Giganet switch provides lower latency than the SP2’s TB3 fabric.
When the number of processors is large, these factors become important because the cost of performing global reductions in the conjugate gradient code dominates performance.

Table 2: FE execution times for the initial crack model of a dam, with 86,325 tetrahedral elements and 401,124 dofs.

Concrete Dam Model   # Processors   # Iterations   Time/Iteration (s)   Total Time (s)
SP2                  16             3549           0.48                 1772.67
V1 (4x4)             16             3372           0.53                 1821.11
V1 (8x2)             16             3485           0.43                 1534.14
V1 (16x1)            16             3719           0.39                 1483.42
SP2                  32             3374           0.27                 948.54
V1 (8x4)             32             3548           0.27                 975.42
V1 (16x2)            32             3549           0.22                 797.52
V1 (32x1)            32             3556           0.20                 728.03
SP2                  64             3548           0.16                 594.11
V1 (16x4)            64             3539           0.15                 540.89
V1 (32x2)            64             3548           0.12                 437.71
V1 (64x1)            64             3540           0.12                 433.80

Conclusions

Parallel computing is becoming more widely accessible due to the development of industry standard PC-based clusters that now rival specially designed parallel architectures in terms of speed and performance while remaining relatively inexpensive. HF simulation in 3D requires parallel computations to decrease the time-to-solution. The CFG has a long history in the use of boundary elements for simulating crack growth. The parallel implementation of the 3D BE software dramatically decreases the time-to-solution. However, the 3D FE technology has more potential in terms of modeling a wider class of problems with faster solution techniques. FE based fracture simulation using iterative solvers shows almost linear scaling in parallel and holds great promise for further decreasing the time-to-solution for HF simulation.

A simple tutorial HF example takes about 3 hours per step of crack growth using the BE code for stress analysis on a single 400 MHz processor. This time-to-solution is reduced to 32 minutes on 8 processors of the V+ cluster. This time is expected to decrease once the solver is tuned for the new architecture. The parallel FE solution time is an order of magnitude less, requiring less than 2 minutes.
The FE solution does not include multiple right hand sides, however, and this is an issue that still needs to be addressed.

Figure 9: Relative speedup of the FE solver on the SP2 and V1 cluster. (Plot title: ConcDam_1; axes: processors (16, 32, 64); relative speedup.)

Acknowledgments

The authors would like to acknowledge the past and present financial support of NSF (EIA-9972853; EIA-9726388; CMS-9625406) and Schlumberger. This research was conducted using the resources of the Cornell Theory Center, which receives funding from Cornell University, New York State, federal agencies, and corporate partners. The intellectual support of the other members of the CCISE/CPTC and CFG teams is also gratefully acknowledged.

References

1. Shah, K.R., Carter, B.J. and Ingraffea, A.R. (1997) Hydraulic fracturing simulation in parallel computing environment, Int. J. Rock Mech. & Min. Sci., Vol 34, No. 3-4, Paper No. 282, presented at the 36th US Rock Mechanics Symposium, Columbia University, New York, July 1997.
2. Carter, B.J., Wawrzynek, P.A. and Ingraffea, A.R. (2000) Automated 3D Crack Growth Simulation, Gallagher Special Issue of Int. J. Num. Meth. Engng., Vol 47, pp. 229-253.
3. Carter, B.J., Desroches, J., Ingraffea, A.R. and Wawrzynek, P.A. (2000) Simulating Fully 3D Hydraulic Fracturing, in "Modeling in Geomechanics", Ed. Zaman, Booker, and Gioda, Wiley Publishers, 730p.
4. Ingraffea, A.R., Carter, B.J. and Wawrzynek, P.A. (1995) Application of computational fracture mechanics to repair of large concrete structures, FRAMCOS Vol 3, Proc. 2nd Int. Conf. Fracture Mechanics of Concrete Structures, Zurich, Switzerland, p. 1721-1734.
5. /us/en/hied/topics/vectors_1998-wpvia.htm
6. /
7. /
8. /performance/perf-win-lin.html
9. /Research/CrackProp/
10. /
11. /UserDoc/Cluster
12. /
13. /division/software/
14. /~mjoshi/pspases/
15. /~karypis/metis/
16. /
17. /~glc5/mpidc/
18. /
19. /
20. http://wwwwissrech.iam.uni-bonn.de/research/projects/parnass2/
21. /cplant/
22. /beowulf/index.html
23. /
24. /cygwin/
25. /home/vavasis/qmg-home.html
26.
27. /fortran/
28.
29. /vtune/compilers/fortran/
30.
31. /Info/People/chew/chew.html
32. http://www.sam.math.ethz.ch/~grote/spai/
33. http://www.first.gmd.de/~jens/janus/

Overview of Parallel Computing


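What Is Parallel Computing

Parallel computing means decomposing an application, on a parallel machine, into multiple subtasks that are assigned to different processors. The processors cooperate with one another and execute the subtasks in parallel, in order to solve the problem faster or to solve a larger problem.

Three basic conditions must therefore be met for parallel computing to succeed:

(1) A parallel machine. A parallel machine contains two or more processors, which are connected by an interconnection network and communicate with each other through it.

(2) An application with parallelism. That is, the application can be decomposed into multiple subtasks that can execute in parallel. The process of decomposing an application into multiple subtasks is called parallel algorithm design.

(3) Parallel programming. Using the parallel programming environment provided by the parallel machine, the parallel algorithm is implemented as a parallel program, which is then run to solve the application problem in parallel.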
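The three conditions can be illustrated in miniature. The following is a minimal Python sketch, assuming a toy application (a sum of squares) and hypothetical helper names. It uses the standard library's ThreadPoolExecutor as a stand-in for a parallel machine. In CPython, threads do not speed up CPU-bound work like this, so the sketch shows the decomposition pattern rather than a real speedup. Swapping in ProcessPoolExecutor with a module-level worker function would be the usual next step.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Parallel algorithm design: cut the input into at most `parts` chunks."""
    size = max(1, -(-len(data) // parts))  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def sum_of_squares(chunk):
    """One subtask: it needs only its own chunk, so subtasks are independent."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, workers=4):
    """Parallel programming: run the subtasks on a pool and combine the results."""
    chunks = split(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(sum_of_squares, chunks))
    return sum(partial_sums)


print(parallel_sum_of_squares(list(range(10)), workers=3))
```

The only communication between subtasks is the final combination of partial sums, which is why this kind of problem parallelizes easily.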

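Main Research Goals and Content of Parallel Computing

For a specific application problem, parallel computing is adopted mainly for two purposes:

(1) To solve the problem faster.

(2) To solve larger problems.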

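A parallel machine is made up of three elements:

• Nodes. Each node consists of multiple processors and can perform input/output (I/O) directly.

• Interconnect network. All nodes are connected to one another and communicate through the interconnect network.

• Memory. Memory consists of multiple memory modules, which may either be placed symmetrically with the nodes on the two sides of the interconnect network or be located inside the individual nodes.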

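Parallel Programming Models

1. Shared memory model

a) In the shared memory programming model, tasks share a common address space, which they read and write asynchronously.

b) Access to the shared memory may be controlled with mechanisms such as locks or semaphores.

c) An advantage of this model for the programmer is that data has no notion of ownership, so communication of data between tasks does not have to be specified explicitly. Program development is simplified accordingly.

d) An important disadvantage in terms of performance is that data locality becomes difficult to understand and manage.

2. Threads model

In the threads model of parallel programming, a single process can have multiple concurrent execution paths.

3. Message passing model

The message passing model has the following three characteristics:

1) A set of tasks use their own local memory during computation. Multiple tasks can reside on the same physical processor, or be spread across any number of processors.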
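The difference between the shared memory and message passing models can be sketched with Python's standard library. This is an illustrative sketch with hypothetical function names. Both versions use threads, so the second only imitates message passing: each task keeps its data in local variables and communicates solely by placing a message in a queue, whereas a real message passing program would typically run separate processes, for example under MPI.

```python
import queue
import threading


def shared_memory_count(n_tasks, increments):
    """Shared memory model: all tasks update one counter, guarded by a lock."""
    counter = {"value": 0}
    lock = threading.Lock()

    def work():
        for _ in range(increments):
            with lock:  # access control on the shared address space
                counter["value"] += 1

    threads = [threading.Thread(target=work) for _ in range(n_tasks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]


def message_passing_count(n_tasks, increments):
    """Message passing model: each task counts locally, then sends its result."""
    inbox = queue.Queue()

    def work():
        local = 0  # task-local memory, never touched by other tasks
        for _ in range(increments):
            local += 1
        inbox.put(local)  # explicit communication of the result

    threads = [threading.Thread(target=work) for _ in range(n_tasks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(inbox.get() for _ in range(n_tasks))


print(shared_memory_count(4, 1000), message_passing_count(4, 1000))
```

The first version needs the lock on every update. The second needs no lock, but the programmer has to state the communication explicitly.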


TR-CS-97-06
RMSIM: a Serial Simulator for Reconfigurable Mesh Parallel Computers
M. Manzur Murshed and Richard P. Brent
April 1997
Joint Computer Science Technical Report Series
Department of Computer Science
Faculty of Engineering and Information Technology
Computer Sciences Laboratory
Research School of Information Sciences and Engineering
This technical report series is published jointly by the Department of Computer Science, Faculty of Engineering and Information Technology, and the Computer Sciences Laboratory, Research School of Information Sciences and Engineering, The Australian National University.
Please direct correspondence regarding this series to:
Technical Reports
Department of Computer Science
Faculty of Engineering and Information Technology
The Australian National University
Canberra ACT 0200
Australia
or send email to:
Technical.Reports@.au
A list of technical reports, including some abstracts and copies of some full reports, may be found at:
.au/techreports/
Recent reports in this series:
TR-CS-97-05 Beat Fischer. Collocation and filtering - a data smoothing method in surveying engineering and geodesy. March 1997.
TR-CS-97-04 Stephen Fenwick and Chris Johnson. HeROD flavoured oct-trees: Scientific computation with a multicomputer persistent object store. February 1997.
TR-CS-97-03 Brendan D. McKay. Knight’s tours of an 8 × 8 chessboard. February 1997.
TR-CS-97-02 Xun Qu and Jeffrey X. Yu. Mobile file filtering. February 1997.
TR-CS-97-01 Peter Arbenz and Markus Hegland. The stable parallel solution of general narrow banded linear systems. January 1997.
TR-CS-96-09 Ralph Back, Jim Grundy, and Joakim von Wright. Structured calculational proof. November 1996.