Engineering radix sort
Lecture 3: Sorting Problems (3)

Decision-tree model
A decision tree can model the execution of any comparison sort:
• One tree for each input size n.
• View the algorithm as splitting whenever it compares two elements.
• The tree contains the comparisons along all possible instruction traces.
• The running time of the algorithm = the length of the path taken.
• Worst-case running time = height of tree.
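As a rough illustration of "worst-case running time = height of tree", the sketch below measures the height of the decision tree of one particular comparison sort (insertion sort, chosen here only as an example) by trying every permutation of n elements. It also computes ⌈lg n!⌉, the standard lower bound that follows from the tree needing at least n! leaves. That bound is a well-known consequence of the model rather than something stated on the slide, and the function names are made up for this example.

```python
from itertools import permutations
from math import ceil, factorial, log2


def insertion_sort_comparisons(items):
    """Sort a copy of items by insertion sort and return the number of comparisons."""
    a = list(items)
    comparisons = 0
    for j in range(1, len(a)):
        i = j
        while i > 0:
            comparisons += 1          # one comparison = one step down the decision tree
            if a[i - 1] <= a[i]:
                break
            a[i - 1], a[i] = a[i], a[i - 1]
            i -= 1
    return comparisons


def tree_height(n):
    """Height of insertion sort's decision tree: the longest path over all inputs of size n."""
    return max(insertion_sort_comparisons(p) for p in permutations(range(n)))


def lower_bound(n):
    """A binary tree with at least n! leaves has height at least ceil(lg n!)."""
    return ceil(log2(factorial(n)))
```

For n = 4 the tree of insertion sort has height 6, while the lower bound is 5, so insertion sort is not optimal in comparisons even for four elements.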
[Figure: arrays A, B, C and C′ in the counting-sort example.]
for i ← 2 to k
    do C[i] ← C[i] + C[i-1]    // C[i] = |{key ≤ i}|
Loop 4
[Figure: arrays A, B, C and C′ in the counting-sort example during loop 4.]
for j ← n downto 1
    do B[C[A[j]]] ← A[j]
       C[A[j]] ← C[A[j]] - 1
Counting sort
for i ← 1 to k
    do C[i] ← 0
for j ← 1 to n
    do C[A[j]] ← C[A[j]] + 1
for i ← 2 to k
    do C[i] ← C[i] + C[i-1]
for j ← n downto 1
    do B[C[A[j]]] ← A[j]
       C[A[j]] ← C[A[j]] - 1
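The four loops of counting sort can be sketched in Python as follows. This is a minimal sketch that assumes the keys are integers in the range 1..k, as in the pseudocode; the function name and the 0-based output array are choices made for this example.

```python
def counting_sort(a, k):
    """Stable counting sort of integers in the range 1..k; returns a new list."""
    n = len(a)
    c = [0] * (k + 1)                 # loop 1: zero the counters (c[0] is unused)
    for key in a:                     # loop 2: c[i] = number of keys equal to i
        c[key] += 1
    for i in range(2, k + 1):         # loop 3: c[i] = number of keys <= i
        c[i] += c[i - 1]
    b = [None] * n
    for j in range(n - 1, -1, -1):    # loop 4: scan right to left to keep the sort stable
        key = a[j]
        b[c[key] - 1] = key           # c[key] is a 1-based position, so subtract 1
        c[key] -= 1
    return b
```

The running time is Θ(n + k): two passes over the n input elements and two over the k counters.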
Common Algorithms for the 408 Exam and Their English Abbreviations

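Common algorithms for the 408 exam and their English names are listed below.
1. Sorting algorithms:
* 插入排序: Insertion Sort
* 快速排序: Quick Sort
* 归并排序: Merge Sort
* 堆排序: Heap Sort
2. Search algorithms:
* 线性搜索: Linear Search
* 二分搜索: Binary Search
* 网格搜索: Grid Search
3. Graph algorithms:
* 最短路径算法 (single-source shortest paths): Dijkstra's algorithm
* A*算法: A-Star Algorithm
* 最优路径算法 (all-pairs shortest paths): Floyd-Warshall algorithm
4. 最小生成树算法 (minimum spanning tree): Prim's algorithm and Kruskal's algorithm.
5. 动态规划: Dynamic Programming (DP).
6. 分治策略: Divide and Conquer (D&C).
7. 贪心算法: Greedy Algorithm.
8. 回溯算法: Backtracking.
9. 分支决策算法: Decision Tree.
English abbreviations:
1. Sorting algorithms:
* IS = Insertion Sort (插入排序)
* QS = Quick Sort (快速排序)
* MS = Merge Sort (归并排序)
* HS = Heap Sort (堆排序)
2. Search algorithms:
* LS = Linear Search (线性搜索)
* BS = Binary Search (二分搜索)
* GS = Grid Search (网格搜索). Grid search is a parameter-search method and does not usually appear in algorithm courses.
3. Graph algorithms:
* Dijkstra = Dijkstra's Algorithm (迪杰斯特拉算法)
* A* = A-Star Algorithm (A星算法)
* FW = Floyd-Warshall Algorithm (弗洛伊德-沃舍尔算法)
These abbreviations are informal shorthand used when studying graph theory or computer science.
4. 最短路径算法: Shortest Path Algorithm.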
NSFOCUS Written Test

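Aptitude test
a) As a technician, what is the first thing to do when you take a customer's call? What standard phrasing should you use?
b) As a technician, what do you consider essential to bring on a business trip? (At least three items, not counting a laptop.)
c) Two arithmetic questions. One asks for the sum of six decimal numbers, with multiple-choice answers. The other deserves a detailed account, because I still have not figured it out.
d) The question reads: Mike's and Todd's salaries differ by $21. Mike's salary is $20 more than Todd's. What is Mike's salary? What is Todd's? (At first I thought the question was misprinted. When I looked it up afterwards, it turned out to be online: an interview question once put to a Microsoft IT specialist.)
Career goals
a) In English, describe why you chose NSFOCUS. What are your short-term and long-term career goals? What do you want to achieve?
b) Scenario: suppose you meet someone from NSFOCUS HR in an elevator. How do you leave a strong impression in 30 seconds?
(Although most of the questions are basic, it is clear that you cannot answer them well without a solid foundation.)
Technical questions
Question 1: Describe the TCP three-way handshake and write out the relationship between SYN and ACK.
Three-way handshake: the first segment of the handshake has the SYN code bit set and sequence number x, indicating the start of a handshake. On receiving this segment, the receiver sends a segment back to the sender with the SYN and ACK code bits set, sequence number y, and acknowledgment number x+1. On receiving this segment, the sender knows that TCP data transfer can begin, so it sends the receiver an ACK segment, indicating that the connection is established on both sides.
Client --> SYN, seq = J, ack = 0 --> Server
Client <-- SYN and ACK, seq = K, ack = J + 1 <-- Server
Client --> ACK, seq = J + 1, ack = K + 1 --> Server
Here a is the initiator and b is the receiver: a sends a SYN packet to b, b returns a [SYN,ACK] to a, and a then returns an ACK packet to b.
Data exchange, a → b:
a finishes sending data (PSH,ACK): aseq = x, aack = y, datalen = alen (= z)
b receives it and sends (ACK): bseq = aack, back = aseq + alen, datalen = blen
b finishes sending data (PSH,ACK): bseq = bseq + blen, back = back, datalen = blen
a confirms receipt (ACK): aseq = back, aack = bseq + blen
Note: the PSH flag tells the receiver to pass the data to the application layer as soon as possible.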
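The sequence and acknowledgment numbers of the three-way handshake can be traced with a small sketch. It only models the arithmetic (each side acknowledges the other's initial sequence number plus one). It is not a network implementation, it ignores the 32-bit wraparound of real TCP sequence numbers, and the function name and tuple layout are assumptions made for this example.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three handshake segments as (sender, flags, seq, ack) tuples."""
    syn = ("client", "SYN", client_isn, 0)
    # The server acknowledges the client's SYN, which consumes one sequence number.
    syn_ack = ("server", "SYN+ACK", server_isn, client_isn + 1)
    # The client acknowledges the server's SYN in the same way.
    ack = ("client", "ACK", client_isn + 1, server_isn + 1)
    return [syn, syn_ack, ack]


if __name__ == "__main__":
    for segment in three_way_handshake(client_isn=100, server_isn=300):
        print(segment)
```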
Common Algorithm Terms: Chinese-English Glossary

The following Chinese-English glossary of common algorithm terms is provided for reference:
1. Algorithm 算法
2. Data structure 数据结构
3. Array 数组
4. Stack 栈
5. Queue 队列
6. Linked list 链表
7. Tree 树
8. Binary tree 二叉树
9. Graph 图
10. Hash table 哈希表
11. Sorting algorithm 排序算法
12. Bubble sort 冒泡排序
13. Insertion sort 插入排序
14. Selection sort 选择排序
15. Merge sort 归并排序
16. Quick sort 快速排序
17. Binary search 二分查找
18. Depth-first search (DFS) 深度优先搜索
19. Breadth-first search (BFS) 广度优先搜索
20. Dijkstra's algorithm 迪杰斯特拉算法
21. Prim's algorithm 普里姆算法
22. Greedy algorithm 贪心算法
23. Dynamic programming 动态规划
24. Recursion 递归
25. Backtracking 回溯
29. Big O notation 大O符号
30. Worst case scenario 最坏情况
31. Best case scenario 最好情况
32. Average case scenario 平均情况
33. Asymptotic analysis 渐近分析
34. Brute force 暴力解法
35. Heuristic algorithm 启发式算法
36. Randomized algorithm 随机算法
37. Divide and conquer 分治法
38. Memoization 记忆化
39. Online algorithm 在线算法
40. Offline algorithm 离线算法
41. Random access 随机访问
42. Sequential access 顺序访问
45. In-place algorithm 原地算法
46. Stable algorithm 稳定算法
47. Unstable algorithm 不稳定算法
48. Exact algorithm 精确算法
49. Approximation algorithm 近似算法
These terms cover many aspects of algorithms and data structures, from basic data structures to sorting algorithms, search algorithms, graph algorithms and so on.
Common Algorithm Terms: Chinese-English Glossary

算法常用术语中英对照Data Structures 基本数据结构Dictionaries 字典Priority Queues 堆Graph Data Structures 图Set Data Structures 集合Kd-Trees 线段树Numerical Problems 数值问题Solving Linear Equations 线性方程组Bandwidth Reduction 带宽压缩Matrix Multiplication 矩阵乘法Determinants and Permanents 行列式Constrained and Unconstrained Optimization 最值问题Linear Programming 线性规划Random Number Generation 随机数生成Factoring and Primality Testing 因子分解/质数判定Arbitrary Precision Arithmetic 高精度计算Knapsack Problem 背包问题Discrete Fourier Transform 离散Fourier变换Combinatorial Problems 组合问题Sorting 排序Searching 查找Median and Selection 中位数Generating Permutations 排列生成Generating Subsets 子集生成Generating Partitions 划分生成Generating Graphs 图的生成Calendrical Calculations 日期Job Scheduling 工程安排Satisfiability 可满足性Graph Problems -- polynomial 图论-多项式算法Connected Components 连通分支Topological Sorting 拓扑排序Minimum Spanning Tree 最小生成树Shortest Path 最短路径Transitive Closure and Reduction 传递闭包Matching 匹配Eulerian Cycle / Chinese Postman Euler回路/中国邮路Edge and Vertex Connectivity 割边/割点Network Flow 网络流Drawing Graphs Nicely 图的描绘Drawing Trees 树的描绘Planarity Detection and Embedding 平面性检测和嵌入Graph Problems -- hard 图论-NP问题Clique 最大团Independent Set 独立集Vertex Cover 点覆盖Traveling Salesman Problem 旅行商问题Hamiltonian Cycle Hamilton回路Graph Partition 图的划分Vertex Coloring 点染色Edge Coloring 边染色Graph Isomorphism 同构Steiner Tree Steiner树Feedback Edge/Vertex Set 最大无环子图Computational Geometry 计算几何Convex Hull 凸包Triangulation 三角剖分Voronoi Diagrams Voronoi图Nearest Neighbor Search 最近点对查询Range Search 围查询Point Location 位置查询Intersection Detection 碰撞测试Bin Packing 装箱问题Medial-Axis Transformation 中轴变换Polygon Partitioning 多边形分割Simplifying Polygons 多边形化简Shape Similarity 相似多边形Motion Planning 运动规划Maintaining Line Arrangements 平面分割Minkowski Sum Minkowski和Set and String Problems 集合与串的问题Set Cover 集合覆盖Set Packing 集合配置String Matching 模式匹配Approximate String Matching 模糊匹配Text Compression 压缩Cryptography 密码Finite State Machine Minimization 有穷自动机简化Longest Common Substring 最长公共子串Shortest Common Superstring 最短公共父串robustness 鲁棒性rate 
of convergence 收敛速度********************************************************************* 数据结构基本英语词汇数据抽象data abstraction数据元素data element数据对象data object数据项data item数据类型data type抽象数据类型abstract data type逻辑结构logical structure物理结构phyical structure线性结构linear structure非线性结构nonlinear structure基本数据类型atomic data type固定聚合数据类型fixed-aggregate data type 可变聚合数据类型variable-aggregate data type 线性表linear list栈stack队列queue串string数组array树tree图grabh查找,线索searching更新updating排序(分类) sorting插入insertion删除deletion前趋predecessor后继successor直接前趋immediate predecessor直接后继immediate successor双端列表deque(double-ended queue) 循环队列cirular queue指针pointer先进先出表(队列)first-in first-out list 后进先出表(队列)last-in first-out list 栈底bottom栈定top压入push弹出pop队头front队尾rear上溢overflow下溢underflow数组array矩阵matrix多维数组multi-dimentional array以行为主的顺序分配row major order 以列为主的顺序分配column major order 三角矩阵truangular matrix对称矩阵symmetric matrix稀疏矩阵sparse matrix转置矩阵transposed matrix链表linked list线性链表linear linked list单链表single linked list多重链表multilinked list循环链表circular linked list双向链表doubly linked list十字链表orthogonal list广义表generalized list链link指针域pointer field链域link field头结点head node头指针head pointer尾指针tail pointer串string空白(空格)串blank string空串(零串)null string子串substring树tree子树subtree森林forest根root叶子leaf结点node深度depth层次level双亲parents孩子children兄弟brother祖先ancestor子descentdant二叉树binary tree平衡二叉树banlanced binary tree 满二叉树full binary tree完全二叉树complete binary tree遍历二叉树traversing binary tree 二叉排序树binary sort tree二叉查找树binary search tree线索二叉树threaded binary tree 哈夫曼树Huffman tree有序数ordered tree无序数unordered tree判定树decision tree双链树doubly linked tree数字查找树digital search tree树的遍历traversal of tree先序遍历preorder traversal中序遍历inorder traversal后序遍历postorder traversal图graph子图subgraph有向图digraph(directed graph)无向图undigraph(undirected graph) 完全图complete graph连通图connected graph非连通图unconnected graph强连通图strongly connected graph 弱连通图weakly connected graph 加权图weighted graph有向无环图directed acyclic graph 稀疏图spares graph稠密图dense graph重连通图biconnected graph二部图bipartite 
graph边edge顶点vertex弧arc路径path回路(环)cycle弧头head弧尾tail源点source终点destination汇点sink权weight连接点articulation point初始结点initial node终端结点terminal node相邻边adjacent edge相邻顶点adjacent vertex关联边incident edge入度indegree出度outdegree最短路径shortest path有序对ordered pair无序对unordered pair简单路径simple path简单回路simple cycle连通分量connected component邻接矩阵adjacency matrix邻接表adjacency list邻接多重表adjacency multilist遍历图traversing graph生成树spanning tree最小(代价)生成树minimum(cost)spanning tree 生成森林spanning forest拓扑排序topological sort偏序partical order拓扑有序topological orderAOV网activity on vertex networkAOE网activity on edge network关键路径critical path匹配matching最大匹配maximum matching增广路径augmenting path增广路径图augmenting path graph查找searching线性查找(顺序查找)linear search (sequential search) 二分查找binary search分块查找block search散列查找hash search平均查找长度average search length散列表hash table散列函数hash funticion直接定址法immediately allocating method数字分析法digital analysis method平方取中法mid-square method折叠法folding method除法division method随机数法random number method排序sort部排序internal sort外部排序external sort插入排序insertion sort随小增量排序diminishing increment sort选择排序selection sort堆排序heap sort快速排序quick sort归并排序merge sort基数排序radix sort外部排序external sort平衡归并排序balance merging sort二路平衡归并排序balance two-way merging sort 多步归并排序ployphase merging sort置换选择排序replacement selection sort文件file主文件master file顺序文件sequential file索引文件indexed file索引顺序文件indexed sequential file索引非顺序文件indexed non-sequential file直接存取文件direct access file多重链表文件multilist file 倒排文件inverted file目录结构directory structure 树型索引tree index。
11 Algorithm Study Summary

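I. Introduction to Algorithms
(1) Overview
An algorithm is an accurate and complete description of a solution to a problem: a sequence of clear instructions that describes, in a systematic way, the strategy for solving it. In other words, for input that meets a given specification, an algorithm produces the required output in a finite amount of time. If an algorithm is flawed, or unsuited to a problem, executing it will not solve that problem. Different algorithms may complete the same task with different time, space or efficiency. The quality of an algorithm can be measured by its space complexity and time complexity.
The instructions in an algorithm describe a computation which, when executed, starts from an initial state and a (possibly empty) initial input, passes through a finite sequence of well-defined states, and finally produces output and halts in a final state. The transition from one state to the next is not necessarily deterministic: some algorithms, including randomized algorithms, incorporate random input.
The concept of a formal algorithm stems partly from attempts to solve the decision problem posed by Hilbert, and took shape in later attempts to define effective calculability or effective method. These attempts include the recursive functions proposed by Kurt Gödel, Jacques Herbrand and Stephen Cole Kleene in 1930, 1934 and 1935 respectively, Alonzo Church's λ-calculus of 1936, Emil Leon Post's Formulation 1 of 1936, and Alan Turing's Turing machine of 1937. Even today, an intuitive idea is often hard to define as a formal algorithm.
(2) The five properties of an algorithm
An algorithm should have the following five important properties:
1. Finiteness: the algorithm must terminate after a finite number of steps.
2. Definiteness: every step of the algorithm must be precisely defined.
3. Input: an algorithm has zero or more inputs, which describe the initial state of the objects it operates on. Zero inputs means that the algorithm itself fixes the initial conditions.
4. Output: an algorithm has one or more outputs, which reflect the result of processing the input data. An algorithm without output is meaningless.
5. Effectiveness: every computation step in the algorithm can be broken down into basic executable operations, so that each step can be completed in a finite amount of time (also called feasibility).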
Basic English Vocabulary for Data Structures

The following are some common basic English terms for data structures:
1. Data structure - 数据结构
2. Array - 数组
3. Linked list - 链表
4. Stack - 栈
5. Queue - 队列
6. Tree - 树
7. Binary tree - 二叉树
8. Binary search tree - 二叉搜索树
9. AVL tree - 平衡二叉树
10. Heap - 堆
11. Graph - 图
12. Hash table - 哈希表
13. Set - 集合
14. Bag/Stack - 背包/堆栈
15. Priority queue - 优先队列
16. Graph traversal - 图遍历
17. Depth-first search (DFS) - 深度优先搜索
18. Breadth-first search (BFS) - 广度优先搜索
19. Sorting algorithm - 排序算法
20. Bubble sort - 冒泡排序
21. Insertion sort - 插入排序
22. Selection sort - 选择排序
23. Merge sort - 归并排序
24. Quick sort - 快速排序
25. Hashing - 哈希算法
26. Search algorithm - 搜索算法
27. Linear search - 线性搜索
28. Binary search - 二分搜索
29. Graph algorithms - 图算法
30. Dijkstra's algorithm - 迪杰斯特拉算法
31. Prim's algorithm - 普里姆算法
32. Kruskal's algorithm - 克鲁斯克尔算法
33. Dynamic programming - 动态规划
Chinese-English Glossary of Natural Language Processing and Computational Linguistics Terms, Part 3 (Computer English Vocabulary)

multilingual processing system 多语讯息处理系统multilingual translation 多语翻译multimedia 多媒体multi-media communication 多媒体通讯multiple inheritance 多重继承multistate logic 多态逻辑mutation 语音转换mutual exclusion 互斥mutual information 相互讯息nativist position 语法天生假说natural language 自然语言natural language processing (nlp) 自然语言处理natural language understanding 自然语言理解negation 否定negative sentence 否定句neologism 新词语nested structure 崁套结构network 网络neural network 类神经网络neurolinguistics 神经语言学neutralization 中立化n-gram n-连词n-gram modeling n-连词模型nlp (natural language processing) 自然语言处理node 节点nominalization 名物化nonce 暂用的non-finite 非限定non-finite clause 非限定式子句non-monotonic reasoning 非单调推理normal distribution 常态分布noun 名词noun phrase 名词组np (noun phrase) completeness 名词组完全性object 宾语{语言学}/对象{信息科学}object oriented programming 对象导向程序设计[面向对向的程序设计]official language 官方语言one-place predicate 一元述语on-line dictionary 线上查询词典 [联机词点]onomatopoeia 拟声词onset 节首音ontogeny 个体发生ontology 本体论open set 开放集operand 操作数 [操作对象]optimization 最佳化 [最优化]overgeneralization 过度概化overgeneration 过度衍生paradigmatic relation 聚合关系paralanguage 附语言parallel construction 并列结构parallel corpus 平行语料库parallel distributed processing (pdp) 平行分布处理paraphrase 转述 [释意;意译;同意互训]parole 言语parser 剖析器 [句法剖析程序]parsing 剖析part of speech (pos) 词类particle 语助词part-of relation part-of 关系part-of-speech tagging 词类标注pattern recognition 型样识别p-c (predicate-complement) insertion 述补中插pdp (parallel distributed processing) 平行分布处理perception 知觉perceptron 感觉器 [感知器]perceptual strategy 感知策略performative 行为句periphrasis 用独立词表达perlocutionary 语效性的permutation 移位petri net grammar petri 网语法philology 语文学phone 语音phoneme 音素phonemic analysis 因素分析phonemic stratum 音素层phonetics 语音学phonogram 音标phonology 声韵学 [音位学;广义语音学] phonotactics 音位排列理论phrasal verb 词组动词 [短语动词]phrase 词组 [短语]phrase marker 词组标记 [短语标记]pitch 音调pitch contour 调形变化pivot grammar 枢轴语法pivotal construction 承轴结构plausibility function 可能性函数pm (phrase marker) 词组标记 [短语标记] polysemy 多义性pos-tagging 词类标记postposition 方位词pp (preposition phrase) attachment 介词依附pragmatics 
语用学precedence grammar 优先级语法precision 精确度predicate 述词predicate calculus 述词计算predicate logic 述词逻辑 [谓词逻辑]predicate-argument structure 述词论元结构prefix 前缀premodification 前置修饰preposition 介词prescriptive linguistics 规定语言学 [规范语言学] presentative sentence 引介句presupposition 前提principle of compositionality 语意合成性原理privative 二元对立的probabilistic parser 概率句法剖析程序problem solving 解决问题program 程序programming language 程序设计语言 [程序设计语言] proofreading system 校对系统proper name 专有名词prosody 节律prototype 原型pseudo-cleft sentence 准分裂句psycholinguistics 心理语言学punctuation 标点符号pushdown automata 下推自动机pushdown transducer 下推转换器qualification 后置修饰quantification 量化quantifier 范域词quantitative linguistics 计量语言学question answering system 问答系统queue 队列radical 字根 [词干;词根;部首;偏旁]radix of tuple 元组数基random access 随机存取rationalism 理性论rationalist (position) 理性论立场 [唯理论观点]reading laboratory 阅读实验室real time 实时real time control 实时控制 [实时控制]recursive transition network 递归转移网络reduplication 重叠词 [重复]reference 指涉referent 指称对象referential indices 指针referring expression 指涉词 [指示短语]register 缓存器[寄存器]{信息科学}/调高{语音学}/语言的场合层级{社会语言学}regular language 正规语言 [正则语言]relational database 关系型数据库 [关系数据库]relative clause 关系子句relaxation method 松弛法relevance 相关性restricted logic grammar 受限逻辑语法resumptive pronouns 复指代词retroactive inhibition 逆抑制rewriting rule 重写规则rheme 述位rhetorical structure 修辞结构rhetorics 修辞学robust 强健性robust processing 强健性处理robustness 强健性schema 基朴school grammar 教学语法scope 范域 [作用域;范围]script 脚本search mechanism 检索机制search space 检索空间searching route 检索路径 [搜索路径]second order predicate 二阶述词segmentation 分词segmentation marker 分段标志selectional restriction 选择限制semantic field 语意场semantic frame 语意架构semantic network 语意网络semantic representation 语意表征 [语义表示] semantic representation language 语意表征语言semantic restriction 语意限制semantic structure 语意结构semantics 语意学sememe 意素semiotics 符号学sender 发送者sensorimotor stage 感觉运动期sensory information 感官讯息 [感觉信息]sentence 句子sentence generator 句子产生器 [句子生成程序]sentence pattern 句型separation of homonyms 同音词区分sequence 序列serial order learning 顺序学习serial 
verb construction 连动结构set oriented semantic network 集合导向型语意网络 [面向集合型语意网络]sgml (standard generalized markup language) 结构化通用标记语言shift-reduce parsing 替换简化式剖析short term memory 短程记忆sign 信号signal processing technology 信号处理技术simple word 单纯词situation 情境situation semantics 情境语意学situational type 情境类型social context 社会环境sociolinguistics 社会语言学software engineering 软件工程 [软件工程]sort 排序speaker-independent speech recognition 非特定语者语音识别spectrum 频谱speech 口语speech act assignment 言语行为指定speech continuum 言语连续体speech disorder 语言失序 [言语缺失]speech recognition 语音辨识speech retrieval 语音检索speech situation 言谈情境 [言语情境]speech synthesis 语音合成speech translation system 语音翻译系统speech understanding system 语音理解系统spreading activation model 扩散激发模型standard deviation 标准差standard generalized markup language 标准通用标示语言start-bound complement 接头词state of affairs algebra 事态代数state transition diagram 状态转移图statement kernel 句核static attribute list 静态属性表statistical analysis 统计分析statistical linguistics 统计语言学statistical significance 统计意义stem 词干stimulus-response theory 刺激反应理论stochastic approach to parsing 概率式句法剖析 [句法剖析的随机方法]stop 爆破音stratificational grammar 阶层语法 [层级语法]string 字符串[串;字符串]string manipulation language 字符串操作语言string matching 字符串匹配 [字符串]structural ambiguity 结构歧义structural linguistics 结构语言学structural relation 结构关系structural transfer 结构转换structuralism 结构主义structure 结构structure sharing representation 结构共享表征subcategorization 次类划分 [下位范畴化] subjunctive 假设的sublanguage 子语言subordinate 从属关系subordinate clause 从属子句 [从句;子句] subordination 从属substitution rule 代换规则 [置换规则] substrate 底层语言suffix 后缀superordinate 上位的superstratum 上层语言suppletion 异型[不规则词型变化] suprasegmental 超音段的syllabification 音节划分syllable 音节syllable structure constraint 音节结构限制symbolization and verbalization 符号化与字句化synchronic 同步的synonym 同义词syntactic category 句法类别syntactic constituent 句法成分syntactic rule 语法规律 [句法规则]syntactic semantics 句法语意学syntagm 句段syntagmatic 组合关系 [结构段的;组合的] syntax 句法systemic grammar 系统语法tag 标记target language 目标语言 [目标语言]task sharing 课题分享 [任务共享] tautology 套套逻辑 
[恒真式;重言式;同义反复] taxonomical hierarchy 分类阶层 [分类层次] telescopic compound 套装合并template 模板temporal inference 循序推理 [时序推理] temporal logic 时间逻辑 [时序逻辑] temporal marker 时貌标记tense 时态terminology 术语text 文本text analyzing 文本分析text coherence 文本一致性text generation 文本生成 [篇章生成]text linguistics 文本语言学text planning 文本规划text proofreading 文本校对text retrieval 文本检索text structure 文本结构 [篇章结构]text summarization 文本自动摘要 [篇章摘要] text understanding 文本理解text-to-speech 文本转语音thematic role 题旨角色thematic structure 题旨结构theorem 定理thesaurus 同义词辞典theta role 题旨角色theta-grid 题旨网格token 实类 [标记项]tone 音调tone language 音调语言tone sandhi 连调变换top-down 由上而下 [自顶向下]topic 主题topicalization 主题化 [话题化]trace 痕迹trace theory 痕迹理论training 训练transaction 异动 [处理单位]transcription 转写 [抄写;速记翻译]transducer 转换器transfer 转移transfer approach 转换方法transfer framework 转换框架transformation 变形 [转换]transformational grammar 变形语法 [转换语法] transitional state term set 转移状态项集合transitivity 及物性translation 翻译translation equivalence 翻译等值性translation memory 翻译记忆transparency 透明性tree 树状结构 [树]tree adjoining grammar 树形加接语法 [树连接语法] treebank 树图数据库[语法关系树库]trigram 三连词t-score t-数turing machine 杜林机 [图灵机]turing test 杜林测试 [图灵试验]type 类型type/token node 标记类型/实类节点type-feature structure 类型特征结构typology 类型学ultimate constituent 终端成分unbounded dependency 无界限依存underlying form 基底型式underlying structure 基底结构unification 连并 [合一]unification-based grammar 连并为本的语法 [基于合一的语法] universal grammar 普遍性语法universal instantiation 普遍例式universal quantifier 全称范域词unknown word 未知词 [未定义词]unrestricted grammar 非限制型语法usage flag 使用旗标user interface 使用者界面 [用户界面]valence grammar 结合价语法valence theory 结合价理论valency 结合价variance 变异数 [方差]verb 动词verb phrase 动词组 [动词短语]verb resultative compound 动补复合词verbal association 词语联想verbal phrase 动词组verbal production 言语生成vernacular 本地话v-o construction (verb-object) 动宾结构vocabulary 字汇vocabulary entry 词条vocal track 声道vocative 呼格voice recognition 声音辨识 [语音识别]vowel 元音vowel harmony 元音和谐 [元音和谐]waveform 波形weak verb 弱化动词whorfian hypothesis whorfian 假说word 词word frequency 词频word frequency distribution 
词频分布word order 词序word segmentation 分词word segmentation standard for chinese 中文分词规范word segmentation unit 分词单位 [切词单位]word set 词集working memory 工作记忆 [工作存储区]world knowledge 世界知识writing system 书写系统x-bar theory x标杠理论 ["x"阶理论]zipf's law 利夫规律 [齐普夫定律]。
Engineering Radix Sort
Peter M. McIlroy
Keith Bostic
Computer Science Research Group
University of California at Berkeley
M. Douglas McIlroy
AT&T Bell Laboratories
ABSTRACT
Radix sorting methods have excellent asymptotic performance on string data, for which comparison is not a unit-time operation. Attractive for use in large byte-addressable memories, these methods have nevertheless long been eclipsed by more easily programmed algorithms. Three ways to sort strings by bytes left to right (a stable list sort, a stable two-array sort, and an in-place "American flag" sort) are illustrated with practical C programs. For heavy-duty sorting, all three perform comparably, usually running at least twice as fast as a good quicksort. We recommend American flag sort for general use.
1. Introduction
For sorting strings you can't beat radix sort, or so the theory says. The idea is simple. Deal the strings into piles by their first letters. One pile gets all the empty strings. The next gets all the strings that begin with A-; another gets B- strings, and so on. Split these piles recursively on second and further letters until the strings end. When there are no more piles to split, pick up all the piles in order. The strings are sorted.
In theory radix sort is perfectly efficient. It looks at just enough letters in each string to distinguish it from all the rest. There is no way to inspect fewer letters and still be sure that the strings are properly sorted. But this theory doesn't tell the whole story: it's hard to keep track of the piles.
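The pile-dealing idea can be sketched directly, before any of the bookkeeping concerns are addressed. The version below is a naive illustration and not one of the paper's three C programs: it allocates a fresh list for every pile, and it orders the pile labels with Python's built-in sorted as a convenience. The function name and the depth parameter are assumptions of this sketch.

```python
def radix_sort(strings, depth=0):
    """Sort strings by dealing them into piles on the character at position depth."""
    if len(strings) <= 1:
        return list(strings)
    finished = []                     # the pile of strings that end at this position
    piles = {}                        # one pile per character seen at this position
    for s in strings:
        if len(s) == depth:
            finished.append(s)
        else:
            piles.setdefault(s[depth], []).append(s)
    result = finished                 # ended strings come before all longer ones
    for ch in sorted(piles):          # pick up the piles in alphabetical order
        result.extend(radix_sort(piles[ch], depth + 1))
    return result
```

A pile of one string is returned without looking at any more of its characters, which is the sense in which the method inspects just enough letters to tell the strings apart.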
Our main concern is bookkeeping, which can make or break radix sorting as a practical method. The paper may be read as a thorough answer to exercises posed in Knuth chapters 5.2 and 5.2.5, where the general plan is laid out. [1] Knuth also describes the other classical sorting methods that we refer to: radix exchange, quicksort, insertion sort, Shell sort, and little-endian radix sort.
1.1. Radix exchange
For a binary alphabet, radix sorting specializes to the simple method of radix exchange. [2] Split the strings into three piles: the empty strings, those that begin with 0, and those that begin with 1. For classical radix exchange assume further that the strings are all the same length. Then there is no pile for empty strings and splitting can be done as in quicksort, with a bit test instead of quicksort's comparison to decide which pile a string belongs in.
Program 1.1 sorts the part of array A that runs from A[lo] to A[hi−1]. All the strings in this range have the same b-bit prefix, say x-. The function split moves strings with prefix x0- to the beginning of the range, from A[lo] through A[mid−1], and strings with prefix x1- to the end, from A[mid] through A[hi−1].

Program 1.1
RadixExchange(A, lo, hi, b) =
    if hi - lo ≤ 1
        then return
    if b ≥ length(A[lo])
        then return
    mid := Split(A, lo, hi, b)
    RadixExchange(A, lo, mid, b+1)
    RadixExchange(A, mid, hi, b+1)

To sort an n-element array, call
    RadixExchange(A, 0, n, 0)

When strings can have different lengths, a full three-way split is needed, as in Program 1.2. [3] The pile of finished strings, with value x, say, begins at A[lo]; the x0- pile begins at A[i0]; the x1- pile begins at A[i1].
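The two-way version, Program 1.1, can be tried out with the sketch below, which represents each key as a Python string of '0' and '1' characters, all of the same length. The paper does not give the body of Split, so the quicksort-style partition here is an assumption, and the snake_case names are chosen for this example.

```python
def split(a, lo, hi, b):
    """Partition a[lo:hi] in place on bit b and return the index of the first '1' string."""
    i, j = lo, hi - 1
    while i <= j:
        if a[i][b] == "0":
            i += 1                    # already in the 0- pile
        else:
            a[i], a[j] = a[j], a[i]   # move the string to the 1- pile at the end
            j -= 1
    return i


def radix_exchange(a, lo, hi, b):
    """Sort a[lo:hi], whose strings all share the same b-bit prefix."""
    if hi - lo <= 1:
        return
    if b >= len(a[lo]):               # all bits examined: the strings are equal
        return
    mid = split(a, lo, hi, b)
    radix_exchange(a, lo, mid, b + 1)
    radix_exchange(a, mid, hi, b + 1)
```

As in the text, an n-element list is sorted by calling radix_exchange(a, 0, len(a), 0). Like quicksort's partition, this split is not stable.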
Program 1.2.
RadixExchange(A, lo, hi, b) =
    if hi - lo ≤ 1
        then return
    (i0, i1) := Split3(A, lo, hi, b)
    RadixExchange(A, i0, i1, b+1)
    RadixExchange(A, i1, hi, b+1)
Three-way splitting is the famous problem of the Dutch national flag: separate three mixed colors into bands like the red, white and blue of the flag. [4] For us, the three colors are ∅ (no bit), 0 and 1. A recipe for splitting is given in Figure 1.1 and Program 1.3. The index i0 points to the beginning of the 0- pile, i1 points just beyond the end of the 0- pile, and i2 points to the beginning of the 1- pile. The notation A[i].b denotes the bth bit, counted from 0, in string A[i]. When Split3 finishes, i1 points to the beginning of the 1- pile as desired. The test for ∅ is figurative; it stands for a test for end of string.
[Figure: three diagrams of the array divided into the regions ∅ | 0 | ? | 1 by the indexes lo, i0, i1, i2 and hi, one diagram for each possible value (∅, 0 or 1) of the first unknown string.]
Figure 1.1. How split3 works. The four parts of the array hold strings known to have ended (∅), strings known to have 0 in the selected position, unknown strings, and strings known to have 1 there. Repeatedly look at the selected position of the first unknown string (the shaded box). Update according to the matching diagram.
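The three-way split described above and in Figure 1.1 can be sketched as follows, again with keys represented as Python strings of '0' and '1' characters, now of varying length. This is a reading of the figure's description and is not the paper's Program 1.3; the end-of-string test is written as a length check, and the name radix_exchange3 is chosen here to mark the three-way variant of Program 1.2.

```python
def split3(a, lo, hi, b):
    """Dutch-national-flag split of a[lo:hi] on bit b.

    Invariant: a[lo:i0] has ended, a[i0:i1] has '0' at b,
    a[i1:i2] is unknown, and a[i2:hi] has '1' at b.
    """
    i0, i1, i2 = lo, lo, hi
    while i1 < i2:
        s = a[i1]                     # the first unknown string
        if len(s) <= b:               # the string has ended: the "no bit" color
            a[i0], a[i1] = a[i1], a[i0]
            i0 += 1
            i1 += 1
        elif s[b] == "0":
            i1 += 1
        else:
            i2 -= 1
            a[i1], a[i2] = a[i2], a[i1]
    return i0, i1                     # i1 now marks the start of the 1- pile


def radix_exchange3(a, lo, hi, b):
    """Sort a[lo:hi], whose strings share a b-bit prefix and may differ in length."""
    if hi - lo <= 1:
        return
    i0, i1 = split3(a, lo, hi, b)
    radix_exchange3(a, i0, i1, b + 1)
    radix_exchange3(a, i1, hi, b + 1)
```

No explicit length test is needed in radix_exchange3: once every string in a range has ended, split3 returns i0 = i1 = hi and both recursive calls see empty ranges.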