Foreign Literature Translation: An Introduction to Database Management System

Big Data: Translated Foreign-Language References and Literature Review

(This document is bilingual: it contains the English original and its Chinese translation.)

Original text:

Data Mining and Data Publishing

Data mining is the extraction of interesting patterns or knowledge from huge amounts of data. The initial idea of privacy-preserving data mining (PPDM) was to extend traditional data mining techniques to work with data modified to mask sensitive information. The key issues were how to modify the data and how to recover the data mining result from the modified data. Privacy-preserving data mining considers the problem of running data mining algorithms on confidential data that is not supposed to be revealed even to the party running the algorithm. In contrast, privacy-preserving data publishing (PPDP) is not necessarily tied to a specific data mining task, and the data mining task may be unknown at the time of data publishing. PPDP studies how to transform raw data into a version that is immunized against privacy attacks but still supports effective data mining tasks. Privacy preservation for both data mining (PPDM) and data publishing (PPDP) has become increasingly popular because it allows privacy-sensitive data to be shared for analysis purposes. One well-studied approach is the k-anonymity model [1], which in turn led to other models such as confidence bounding, l-diversity, t-closeness, and (α,k)-anonymity. In particular, all known mechanisms try to minimize information loss, and such an attempt provides a loophole for attacks. The aim of this paper is to present a survey of the most common attack techniques against anonymization-based PPDM and PPDP and to explain their effects on data privacy.

Although data mining is potentially useful, many data holders are reluctant to provide their data for data mining for fear of violating individual privacy. In recent years, studies have been carried out to ensure that the sensitive information of individuals cannot be identified easily.

Anonymity Models

k-anonymization techniques have been the focus of intense research in the last few years.
In order to ensure anonymization of data while at the same time minimizing the information loss resulting from data modifications, several extended models have been proposed, which are discussed as follows.

1. k-Anonymity

k-anonymity is one of the most classic models. It is a technique that prevents joining attacks by generalizing and/or suppressing portions of the released microdata so that no individual can be uniquely distinguished from a group of size k. A data set is k-anonymous (k ≥ 1) if each record in the data set is indistinguishable from at least (k − 1) other records within the same data set. The larger the value of k, the better the privacy is protected. k-anonymity can ensure that individuals cannot be uniquely identified by linking attacks.

2. Extending Models

k-anonymity does not provide sufficient protection against attribute disclosure. The notion of l-diversity attempts to solve this problem by requiring that each equivalence class has at least l well-represented values for each sensitive attribute. l-diversity has some advantages over k-anonymity, because a k-anonymous data set permits strong attacks when the sensitive attributes lack diversity. In this model, an equivalence class is said to have l-diversity if there are at least l well-represented values for the sensitive attribute. Because there are semantic relationships among the attribute values, and different values have very different levels of sensitivity, further models constrain the distribution of sensitive values. In the (α,k)-anonymity model, after anonymization, the frequency (as a fraction) of a sensitive value in any equivalence class is no more than α.

3. Related Research Areas

Several polls show that the public has an increased sense of privacy loss. Since data mining is often a key component of information systems, homeland security systems, and monitoring and surveillance systems, it gives the wrong impression that data mining is a technique for privacy intrusion.
This lack of trust has become an obstacle to the benefit of the technology. For example, the potentially beneficial data mining research project Terrorism Information Awareness (TIA) was terminated by the US Congress due to its controversial procedures of collecting, sharing, and analyzing the trails left by individuals. Motivated by the privacy concerns about data mining tools, a research area called privacy-preserving data mining (PPDM) emerged in 2000. The initial idea of PPDM was to extend traditional data mining techniques to work with data modified to mask sensitive information. The key issues were how to modify the data and how to recover the data mining result from the modified data. The solutions were often tightly coupled with the data mining algorithms under consideration. In contrast, privacy-preserving data publishing (PPDP) is not necessarily tied to a specific data mining task, and the data mining task is sometimes unknown at the time of data publishing. Furthermore, some PPDP solutions emphasize preserving data truthfulness at the record level, but PPDM solutions often do not preserve this property.

PPDP differs from PPDM in several major ways:

1) PPDP focuses on techniques for publishing data, not techniques for data mining. In fact, it is expected that standard data mining techniques are applied to the published data. In contrast, the data holder in PPDM needs to randomize the data in such a way that data mining results can be recovered from the randomized data. To do so, the data holder must understand the data mining tasks and algorithms involved. This level of involvement is not expected of the data holder in PPDP, who usually is not an expert in data mining.

2) Neither randomization nor encryption preserves the truthfulness of values at the record level; therefore, the released data are basically meaningless to the recipients.
In such a case, the data holder in PPDM may consider releasing the data mining results rather than the scrambled data.

3) PPDP primarily “anonymizes” the data by hiding the identity of record owners, whereas PPDM seeks to directly hide the sensitive data. Excellent surveys and books on randomization and cryptographic techniques for PPDM can be found in the existing literature.

A family of research work called privacy-preserving distributed data mining (PPDDM) aims at performing some data mining task on a set of private databases owned by different parties. It follows the principle of Secure Multiparty Computation (SMC) and prohibits any data sharing other than the final data mining result. Clifton et al. present a suite of SMC operations, such as secure sum, secure set union, secure size of set intersection, and scalar product, that are useful for many data mining tasks. In contrast, PPDP does not perform the actual data mining task but is concerned with how to publish the data so that the anonymized data are useful for data mining. We can say that PPDP protects privacy at the data level while PPDDM protects privacy at the process level. They address different privacy models and data mining scenarios.

In the field of statistical disclosure control (SDC), research focuses on privacy-preserving publishing methods for statistical tables. SDC focuses on three types of disclosure, namely identity disclosure, attribute disclosure, and inferential disclosure. Identity disclosure occurs if an adversary can identify a respondent from the published data. Revealing that an individual is a respondent of a data collection may or may not violate confidentiality requirements. Attribute disclosure occurs when confidential information about a respondent is revealed and can be attributed to the respondent. Attribute disclosure is the primary concern of most statistical agencies in deciding whether to publish tabular data.
Inferential disclosure occurs when individual information can be inferred with high confidence from statistical information of the published data.

Some other works of SDC focus on the study of the non-interactive query model, in which the data recipients can submit one query to the system. This type of non-interactive query model may not fully address the information needs of data recipients because, in some cases, it is very difficult for a data recipient to accurately construct a query for a data mining task in one shot. Consequently, there is a series of studies on the interactive query model, in which the data recipients, including adversaries, can submit a sequence of queries based on previously received query results. The database server is responsible for keeping track of all queries of each user and determining whether or not the currently received query has violated the privacy requirement with respect to all previous queries. One limitation of any interactive privacy-preserving query system is that it can only answer a sublinear number of queries in total; otherwise, an adversary (or a group of corrupted data recipients) will be able to reconstruct all but 1 − o(1) fraction of the original data, which is a very strong violation of privacy. When the maximum number of queries is reached, the query service must be closed to avoid a privacy leak. In the case of the non-interactive query model, the adversary can issue only one query and, therefore, the non-interactive query model cannot achieve the same degree of privacy defined by the interactive model. One may consider that privacy-preserving data publishing is a special case of the non-interactive query model.

This paper presents a survey of the most common attack techniques against anonymization-based PPDM and PPDP and explains their effects on data privacy.
k-anonymity protects the identity of respondents and reduces linking attacks. In the case of a homogeneity attack, however, a simple k-anonymity model fails, and a concept that prevents this attack is needed; the solution is l-diversity. All tuples are arranged in a well-represented form, so that an adversary is diverted to l places or to l sensitive attributes. l-diversity is limited in the case of a background knowledge attack, because no one can predict the knowledge level of an adversary. It is observed that generalization and suppression are also applied to attributes that do not need this extent of privacy, which reduces the precision of the published table. e-NSTAM (extended Sensitive Tuples Anonymity Method) is applied to sensitive tuples only and reduces information loss, but this method also fails in the case of multiple sensitive tuples. Generalization with suppression is also a cause of data loss, because suppression withholds values that do not satisfy the k factor. Future work in this direction can include defining a new privacy measure along with l-diversity for multiple sensitive attributes, and we will focus on generalizing attributes without suppression, using other techniques for achieving k-anonymity, because suppression reduces the precision of the published table.

Translation: Data Mining and Data Publishing. Data mining is the extraction of a large number of interesting patterns or knowledge from large amounts of data.
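
The k-anonymity, l-diversity, and α-frequency conditions discussed in the paper above can be illustrated with a minimal Python sketch. The function names, the sample table, and its attribute names (age, zip, disease) are assumptions made for illustration and do not come from the paper. "Well-represented" is read here in its simplest form, as l distinct sensitive values per equivalence class; other readings exist.

```python
from collections import Counter, defaultdict


def equivalence_classes(records, quasi_identifiers):
    """Group records that share the same quasi-identifier values."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[attr] for attr in quasi_identifiers)
        groups[key].append(record)
    return groups


def is_k_anonymous(records, quasi_identifiers, k):
    """Every record is indistinguishable from at least k - 1 others."""
    groups = equivalence_classes(records, quasi_identifiers)
    return all(len(group) >= k for group in groups.values())


def is_l_diverse(records, quasi_identifiers, sensitive, l):
    """Distinct l-diversity: each class holds at least l sensitive values."""
    groups = equivalence_classes(records, quasi_identifiers)
    return all(
        len({record[sensitive] for record in group}) >= l
        for group in groups.values()
    )


def max_sensitive_frequency(records, quasi_identifiers, sensitive):
    """Largest fraction any one sensitive value takes within a class.

    A table meets the alpha condition when this value is <= alpha.
    """
    groups = equivalence_classes(records, quasi_identifiers)
    return max(
        (
            max(Counter(r[sensitive] for r in group).values()) / len(group)
            for group in groups.values()
        ),
        default=0.0,
    )


# Hypothetical released table, already generalized on age and zip.
released = [
    {"age": "20-29", "zip": "130**", "disease": "flu"},
    {"age": "20-29", "zip": "130**", "disease": "flu"},
    {"age": "30-39", "zip": "148**", "disease": "cancer"},
    {"age": "30-39", "zip": "148**", "disease": "flu"},
]
QI = ["age", "zip"]

print(is_k_anonymous(released, QI, 2))                   # True
print(is_l_diverse(released, QI, "disease", 2))          # False
print(max_sensitive_frequency(released, QI, "disease"))  # 1.0
```

The sample table is 2-anonymous, yet the first equivalence class contains only "flu", so anyone known to be in that class is exposed. This is the homogeneity attack that motivates l-diversity.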

Computer Science (Java): Foreign Literature Translation, English Source Text

English original:

Title: Business Applications of Java.
Author: Erbschloe, Michael, Business Applications of Java -- Research Starters Business, 2008
Database: Research Starters - Business

Business Applications of Java

This article examines the growing use of Java technology in business applications. The history of Java is briefly reviewed along with the impact of open standards on the growth of the World Wide Web. Key components and concepts of the Java programming language are explained, including the Java Virtual Machine. Examples of how Java is being used by e-commerce leaders are provided, along with an explanation of how Java is used to develop data warehousing, data mining, and industrial automation applications. The concept of metadata modeling and the use of Extensible Markup Language (XML) are also explained.

Keywords: Application Programming Interfaces (APIs); Enterprise JavaBeans (EJB); Extensible Markup Language (XML); HyperText Markup Language (HTML); HyperText Transfer Protocol (HTTP); Java Authentication and Authorization Service (JAAS); Java Cryptography Architecture (JCA); Java Cryptography Extension (JCE); Java Programming Language; Java Virtual Machine (JVM); Java 2 Platform, Enterprise Edition (J2EE); Metadata

Overview

Open standards have driven the e-business revolution. Networking protocol standards such as Transmission Control Protocol/Internet Protocol (TCP/IP) and HyperText Transfer Protocol (HTTP), together with the HyperText Markup Language (HTML) Web standards, have enabled universal communication via the Internet and the World Wide Web. As e-business continues to develop, various computing technologies help to drive its evolution.

The Java programming language and platform have emerged as major technologies for performing e-business functions. Java programming standards have enabled portability of applications and the reuse of application components across computing platforms.
Sun Microsystems' Java Community Process continues to be a strong base for the growth of the Java infrastructure and language standards. This growth of open standards creates new opportunities for designers and developers of applications and services (Smith, 2001).

Creation of Java Technology

Java technology was created as a computer programming tool in a small, secret effort called "the Green Project" at Sun Microsystems in 1991. The Green Team, fully staffed at 13 people and led by James Gosling, locked themselves away in an anonymous office on Sand Hill Road in Menlo Park, cut off from all regular communications with Sun, and worked around the clock for 18 months. Their initial conclusion was that at least one significant trend would be the convergence of digitally controlled consumer devices and computers. A device-independent programming language code-named "Oak" was the result.

To demonstrate how this new language could power the future of digital devices, the Green Team developed an interactive, handheld home-entertainment device controller targeted at the digital cable television industry. But the idea was too far ahead of its time, and the digital cable television industry wasn't ready for the leap forward that Java technology offered. As it turned out, the Internet was ready for Java technology, and just in time for its initial public introduction in 1995, the team was able to announce that the Netscape Navigator Internet browser would incorporate Java technology ("Learn about Java," 2007).

Applications of Java

Java uses many familiar programming concepts and constructs and allows portability by providing a common interface through an external Java Virtual Machine (JVM). A virtual machine is a self-contained operating environment, created by a software layer that behaves as if it were a separate computer.
Benefits of creating virtual machines include better exploitation of powerful computing resources and isolation of applications to prevent cross-corruption and improve security (Matlis, 2006).

The JVM allows computing devices with limited processors or memory to handle more advanced applications by calling up software instructions inside the JVM to perform most of the work. This also reduces the size and complexity of Java applications because many of the core functions and processing instructions are built into the JVM. As a result, software developers no longer need to re-create the same application for every operating system. Java also provides security by instructing the application to interact with the virtual machine, which serves as a barrier between applications and the core system, effectively protecting systems from malicious code.

Among other things, Java is tailor-made for the growing Internet because it makes it easy to develop new, dynamic applications that can make the most of the Internet's power and capabilities. Java is now an open standard, meaning that no single entity controls its development and the tools for writing programs in the language are available to everyone. The power of open standards like Java is the ability to break down barriers and speed up progress.

Today, you can find Java technology in networks and devices that range from the Internet and scientific supercomputers to laptops and cell phones, from Wall Street market simulators to home game players and credit cards. There are over 3 million Java developers, and there are now several versions of the code. Most large corporations have in-house Java developers. In addition, the majority of key software vendors use Java in their commercial applications (Lazaridis, 2003).

Applications

Java on the World Wide Web

Java has found a place on some of the most popular websites in the world, and the use of Java continues to grow.
Java applications not only provide unique user interfaces, they also help to power the backend of websites. Two e-commerce giants that everybody is probably familiar with (eBay and Amazon) have been Java pioneers on the World Wide Web.

eBay

Founded in 1995, eBay enables e-commerce on a local, national, and international basis with an array of Web sites, including the eBay marketplaces, PayPal, and Skype, that bring together millions of buyers and sellers every day. You can find it on eBay, even if you didn't know it existed. On a typical day, more than 100 million items are listed on eBay in tens of thousands of categories. Recent listings have included a tunnel boring machine from the Chunnel project, a cup of water that once belonged to Elvis, and the Volkswagen that Pope Benedict XVI owned before he moved up to the Popemobile. More than one hundred million items are available at any given time, from the massive to the miniature, the magical to the mundane, on eBay, the world's largest online marketplace.

eBay uses Java almost everywhere. To address some security issues, eBay chose Sun Microsystems' Java System Identity Manager as the platform for revamping its identity management system. The task at hand was to provide identity management for more than 12,000 eBay employees and contractors.

Now more than a thousand eBay software developers work daily with Java applications. Java's inherent portability allows eBay to move to new hardware to take advantage of new technology, packaging, or pricing, without having to rewrite Java code ("eBay drives explosive growth," 2007).

Amazon

Amazon (a large seller of books, CDs, and other products) has created a Web Service application that enables users to browse its product catalog and place orders. It uses a Java application that searches the Amazon catalog for books whose subject matches a user-selected topic.
The application displays ten books that match the chosen topic and shows the author name, book title, list price, Amazon discount price, and the cover icon. The user may optionally view one review per displayed title and make a buying decision (Stearns & Garishakurthi, 2003).

Java in Data Warehousing & Mining

Although many companies currently benefit from data warehousing to support corporate decision making, new business intelligence approaches continue to emerge that can be powered by Java technology. Applications such as data warehousing, data mining, Enterprise Information Portals (EIPs), and Knowledge Management Systems (which can all comprise a business intelligence application) are able to provide insight into customer retention, purchasing patterns, and even future buying behavior.

These applications can tell not only what has happened but also why, and what may happen given certain business conditions, allowing "what if" scenarios to be explored. As a result of this information growth, people at all levels inside the enterprise, as well as suppliers, customers, and others in the value chain, are clamoring for subsets of the vast stores of information, such as billing, shipping, and inventory information, to help them make business decisions. While collecting and storing vast amounts of data is one thing, utilizing and deploying that data throughout the organization is another.

The technical challenges inherent in integrating disparate data formats, platforms, and applications are significant. However, emerging standards such as the Application Programming Interfaces (APIs) that comprise the Java platform, as well as Extensible Markup Language (XML) technologies, can facilitate the interchange of data and the development of next-generation data warehousing and business intelligence applications.
While Java technology has been used extensively for client-side access and to address presentation-layer challenges, it is rapidly emerging as a significant tool for developing scalable server-side programs. The Java 2 Platform, Enterprise Edition (J2EE) provides the object, transaction, and security support for building such systems.

Metadata Issues

One of the key issues that business intelligence developers must solve is that of incompatible metadata formats. Metadata can be defined as information about data or simply "data about data." In practice, metadata is what most tools, databases, applications, and other information processes use to define, relate, and manipulate data objects within their own environments. It defines the structure and meaning of data objects managed by an application so that the application knows how to process requests or jobs involving those data objects. Developers can use this schema to create views for users. Also, users can browse the schema to better understand the structure and function of the database tables before launching a query.

To address the metadata issue, a group of companies (including Unisys, Oracle, IBM, SAS Institute, Hyperion, Inline Software, and Sun) have joined to develop the Java Metadata Interface (JMI) API. The JMI API permits the access and manipulation of metadata in Java with standard metadata services. JMI is based on the Meta Object Facility (MOF) specification from the Object Management Group (OMG). The MOF provides a model and a set of interfaces for the creation, storage, access, and interchange of metadata and metamodels (higher-level abstractions of metadata). Metamodel and metadata interchange is done via XML and uses the XML Metadata Interchange (XMI) specification, also from the OMG.
JMI leverages Java technology to create an end-to-end framework for data warehousing and business intelligence solutions.

Enterprise JavaBeans

A key tool provided by J2EE is Enterprise JavaBeans (EJB), an architecture for the development of component-based distributed business applications. Applications written using the EJB architecture are scalable, transactional, secure, and multi-user aware. These applications may be written once and then deployed on any server platform that supports J2EE. The EJB architecture makes it easy for developers to write components, since they do not need to understand or deal with complex, system-level details such as thread management, resource pooling, and transaction and security management. This allows for role-based development in which component assemblers, platform providers, and application assemblers can each focus on their own area of responsibility, further simplifying application development.

EJBs in the Travel Industry

A case study from the travel industry helps to illustrate how such applications could function. A travel company amasses a great deal of information about its operations in various applications distributed throughout multiple departments. Flight, hotel, and automobile reservation information is located in a database accessed by travel agents worldwide. Another application contains information that must be updated with credit and billing history from a financial services company. Data is periodically extracted from the travel reservation system databases to spreadsheets for use in future sales and marketing analysis.

Utilizing J2EE, the company could consolidate application development within an EJB container, which can run on a variety of hardware and software platforms, allowing existing databases and applications to coexist with newly developed ones.
EJBs can be developed to model various data sets important to the travel reservation business, including information about customer, hotel, car rental agency, and other attributes.

Data Storage & Access

Data stored in existing applications can be accessed with specialized connectors. Integration and interoperability of these data sources is further enabled by the metadata repository that contains metamodels of the data contained in the sources, which then can be accessed and interchanged uniformly via the JMI API. These metamodels capture the essential structure and semantics of business components, allowing them to be accessed and queried via the JMI API or to be interchanged via XML. Through all of these processes, the J2EE infrastructure ensures the security and integrity of the data through transaction management and propagation and the underlying security architecture.

To consolidate historical information for analysis of sales and marketing trends, a data warehouse is often the best solution. In this example, data can be extracted from the operational systems with a variety of Extract, Transform and Load (ETL) tools. The metamodels allow EJBs designed for filtering, transformation, and consolidation of data to operate uniformly on data from diverse data sources, as the bean is able to query the metamodel to identify and extract the pertinent fields. Queries and reports can be run against the data warehouse that contains information from numerous sources in a consistent, enterprise-wide fashion through the use of the JMI API (Mosher & Oh, 2007).

Java in Industrial Settings

Many people know Java only as a tool on the World Wide Web that enables sites to perform some of their fancier functions such as interactivity and animation. However, the actual uses for Java are much more widespread. Since Java is an object-oriented language like C++, the time needed for application development is minimal.
Java also encourages good software engineering practices with clear separation of interfaces and implementations as well as easy exception handling.

In addition, Java's automatic memory management and lack of pointers remove some leading causes of programming errors. Most importantly, application developers do not need to create different versions of the software for different platforms. The advantages available through Java have even found their way into hardware. The emerging new Java devices are streamlined systems that exploit network servers for much of their processing power, storage, content, and administration.

Benefits of Java

The benefits of Java translate across many industries, and some are specific to the control and automation environment. For example, many plant-floor applications use relatively simple equipment; upgrading to PCs would be expensive and undesirable. Java's ability to run on any platform enables the organization to make use of the existing equipment while enhancing the application.

Integration

With few exceptions, applications running on the factory floor were never intended to exchange information with systems in the executive office, but managers have recently discovered the need for that type of information. Before Java, that often meant bringing together data from systems written on different platforms in different languages at different times. Integration was usually done on a piecemeal basis, resulting in a system that, once it worked, was unique to the two applications it was tying together. Additional integration required developing a brand new system from scratch, raising the cost of integration.

Java makes system integration relatively easy. Foxboro Controls Inc., for example, used Java to make its dynamic-performance-monitor software package Internet-ready. This software provides senior executives with strategic information about a plant's operation.
The dynamic performance monitor takes data from instruments throughout the plant and performs various mathematical and statistical calculations on them, resulting in information (usually financial) that a manager can more readily absorb and use.

Scalability

Another benefit of Java in the industrial environment is its scalability. In a plant, embedded applications such as automated data collection and machine diagnostics provide critical data regarding production-line readiness or operation efficiency. These data form a critical ingredient for applications that examine the health of a production line or run. Users of these devices can take advantage of the benefits of Java without changing or upgrading hardware. For example, operations and maintenance personnel could carry a handheld, wireless, embedded-Java device anywhere in the plant to monitor production status or problems.

Even when internal compatibility is not an issue, companies often face difficulties when suppliers with whom they share information have incompatible systems. This becomes more of a problem as supply-chain management takes on a more critical role which requires manufacturers to interact more with offshore suppliers and clients. The greatest efficiency comes when all systems can communicate with each other and share information seamlessly. Since Java is so ubiquitous, it often solves these problems (Paula, 1997).

Dynamic Web Page Development

Java has been used by both large and small organizations for a wide variety of applications beyond consumer-oriented websites. Sandia, a multiprogram laboratory of the U.S. Department of Energy's National Nuclear Security Administration, has developed a unique Java application. The lab was tasked with developing an enterprise-wide inventory tracking and equipment maintenance system that provides dynamic Web pages.
The developers selected Java Studio Enterprise 7 for the project because of its Application Framework technology and Web Graphical User Interface (GUI) components, which allow the system to be indexed by an expandable catalog. The flexibility, scalability, and portability of Java helped to reduce development time and costs (Garcia, 2004).

Issue

Java Security for E-Business Applications

To support the expansion of their computing boundaries, businesses have deployed Web application servers (WAS). A WAS differs from a traditional Web server because it provides a more flexible foundation for dynamic transactions and objects, partly through the exploitation of Java technology. Traditional Web servers remain constrained to servicing standard HTTP requests, returning the contents of static HTML pages and images or the output from executed Common Gateway Interface (CGI) scripts.

An administrator can configure a WAS with policies based on security specifications for Java servlets and manage authentication and authorization with Java Authentication and Authorization Service (JAAS) modules. An authentication and authorization service can be written in Java code or can interface to an existing authentication or authorization infrastructure. For a cryptography-based security infrastructure, the security server may exploit the Java Cryptography Architecture (JCA) and Java Cryptography Extension (JCE). To present the user with a usable interaction with the WAS environment, the Web server can readily employ a form of "single sign-on" to avoid redundant authentication requests. A single sign-on preserves user authentication across multiple HTTP requests so that the user is not prompted many times for authentication data (i.e., user ID and password).

Based on the security policies, JAAS can be employed to handle the authentication process with the identity of the Java client. After successful authentication, the WAS security collaborator consults with the security server.
The WAS environment authentication requirements can be fairly complex. In a given deployment environment, not all applications or solutions may originate from the same vendor. In addition, these applications may be running on different operating systems. Although Java is often the language of choice for portability between platforms, it needs to marry its security features with those of the containing environment.

Authentication & Authorization

Authentication and authorization are key elements in any secure information handling system. Since the inception of Java technology, many of the authentication and authorization issues have concerned downloadable code running in Web browsers. In many ways, this was the correct set of issues to address, since the client's system needs to be protected from mobile code obtained from arbitrary sites on the Internet. As Java technology moved from a client-centric Web technology to a server-side scripting and integration technology, it required additional authentication and authorization technologies.

The kind of proof required for authentication may depend on the security requirements of a particular computing resource or specific enterprise security policies. To provide such flexibility, the JAAS authentication framework is based on the concept of configurable authenticators. This architecture allows system administrators to configure, or plug in, the appropriate authenticators to meet the security requirements of the deployed application. The JAAS architecture also allows applications to remain independent from underlying authentication mechanisms. So, as new authenticators become available or as current authentication services are updated, system administrators can easily replace authenticators without having to modify or recompile existing applications.

At the end of a successful authentication, a request is associated with a user in the WAS user registry.
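As a language-neutral illustration of the configurable-authenticator idea described above, the following Python sketch keeps the application's login call independent of the authentication mechanism behind it. It is not the JAAS API: the names `make_password_authenticator`, `make_token_authenticator`, and `login`, and the sample credentials, are hypothetical, and JAAS features such as login-module control flags are not modelled.

```python
import hashlib
import hmac
from typing import Callable, Dict, List, Optional

# An authenticator takes (user_id, secret) and returns the user ID on
# success or None on failure. This type is an assumption of the sketch.
Authenticator = Callable[[str, str], Optional[str]]


def make_password_authenticator(digests: Dict[str, str]) -> Authenticator:
    """Check a password against stored SHA-256 hex digests (demo only;
    a real system would use a salted, deliberately slow password hash)."""
    def authenticate(user_id: str, secret: str) -> Optional[str]:
        expected = digests.get(user_id, "")
        actual = hashlib.sha256(secret.encode("utf-8")).hexdigest()
        return user_id if hmac.compare_digest(expected, actual) else None
    return authenticate


def make_token_authenticator(tokens: Dict[str, str]) -> Authenticator:
    """Check a pre-issued token for the user (hypothetical second mechanism)."""
    def authenticate(user_id: str, secret: str) -> Optional[str]:
        expected = tokens.get(user_id)
        if expected is None:
            return None
        same = hmac.compare_digest(expected.encode("utf-8"), secret.encode("utf-8"))
        return user_id if same else None
    return authenticate


def login(authenticators: List[Authenticator], user_id: str, secret: str) -> Optional[str]:
    """Application-facing entry point: tries each configured authenticator
    in order and returns the authenticated user ID, or None."""
    for authenticate in authenticators:
        principal = authenticate(user_id, secret)
        if principal is not None:
            return principal
    return None


# Hypothetical configuration that an administrator could change freely.
PASSWORDS = {"alice": hashlib.sha256(b"s3cret").hexdigest()}
TOKENS = {"alice": "tok-123"}

if __name__ == "__main__":
    configured = [make_password_authenticator(PASSWORDS)]
    print(login(configured, "alice", "s3cret"))
    # Swapping the mechanism changes only the configuration, not login().
    configured = [make_token_authenticator(TOKENS)]
    print(login(configured, "alice", "tok-123"))
```

The point of the design is that `login` never changes when the list of authenticators does, which mirrors the claim that authenticators can be replaced without modifying or recompiling the application.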
After a successful authentication, the WAS consults security policies to determine if the user has the required permissions to complete the requested action on the servlet. This policy can be enforced using the WAS configuration (declarative security) or by the servlet itself (programmatic security), or a combination of both.

The WAS environment pulls together many different technologies to service the enterprise. Because of the heterogeneous nature of the client and server entities, Java technology is a good choice for both administrators and developers. However, to service the diverse security needs of these entities and their tasks, many Java security technologies must be used, not only at a primary level between client and server entities, but also at a secondary level, from served objects. By using a synergistic mix of the various Java security technologies, administrators and developers can make not only their Web application servers secure, but their WAS environments secure as well (Koved, 2001).

Conclusion

Open standards have driven the e-business revolution. As e-business continues to develop, various computing technologies help to drive its evolution. The Java programming language and platform have emerged as major technologies for performing e-business functions. Java programming standards have enabled portability of applications and the reuse of application components. Java uses many familiar concepts and constructs and allows portability by providing a common interface through an external Java Virtual Machine (JVM). Today, you can find Java technology in networks and devices that range from the Internet and scientific supercomputers to laptops and cell phones, from Wall Street market simulators to home game players and credit cards.

Java has found a place on some of the most popular websites in the world. Java applications not only provide unique user interfaces, they also help to power the backend of websites.
While Java technology has been used extensively for client-side access and in the presentation layer, it is also emerging as a significant tool for developing scalable server-side programs.

Since Java is an object-oriented language like C++, the time needed for application development is minimal. Java also encourages good software engineering practices with clear separation of interfaces and implementations as well as easy exception handling. Java's automatic memory management and lack of pointers remove some leading causes of programming errors. The advantages available through Java have also found their way into hardware. The emerging new Java devices are streamlined systems that exploit network servers for much of their processing power, storage, content, and administration.

Chinese translation: Title: Business Applications of Java.

Data Mining Technology: Foreign Literature in Chinese-English Parallel Translation


Foreign literature in Chinese-English parallel translation (the document contains the English original and the Chinese translation)

English original

Introduction to Data Mining

Abstract: Microsoft® SQL Server™ 2005 provides an integrated environment for creating and working with data mining models. This tutorial uses four scenarios (targeted mailing, forecasting, market basket, and sequence clustering) to demonstrate how to use the mining model algorithms, mining model viewers, and data mining tools that are included in this release of SQL Server.

Introduction

The data mining tutorial is designed to walk you through the process of creating data mining models in Microsoft SQL Server 2005. The data mining algorithms and tools in SQL Server 2005 make it easy to build a comprehensive solution for a variety of projects, including market basket analysis, forecasting analysis, and targeted mailing analysis. The scenarios for these solutions are explained in greater detail later in the tutorial.

The most visible components in SQL Server 2005 are the workspaces that you use to create and work with data mining models. The online analytical processing (OLAP) and data mining tools are consolidated into two working environments: Business Intelligence Development Studio and SQL Server Management Studio. Using Business Intelligence Development Studio, you can develop an Analysis Services project disconnected from the server. When the project is ready, you can deploy it to the server. You can also work directly against the server. The main function of SQL Server Management Studio is to manage the server. Each environment is described in more detail later in this introduction. For more information on choosing between the two environments, see "Choosing Between SQL Server Management Studio and Business Intelligence Development Studio" in SQL Server Books Online.

All of the data mining tools exist in the data mining editor.
Using the editor you can manage mining models, create new models, view models, compare models, and create predictions based on existing models.

After you build a mining model, you will want to explore it, looking for interesting patterns and rules. Each mining model viewer in the editor is customized to explore models built with a specific algorithm. For more information about the viewers, see "Viewing a Data Mining Model" in SQL Server Books Online.

Often your project will contain several mining models, so before you can use a model to create predictions, you need to be able to determine which model is the most accurate. For this reason, the editor contains a model comparison tool called the Mining Accuracy Chart tab. Using this tool you can compare the predictive accuracy of your models and determine the best model.

To create predictions, you will use the Data Mining Extensions (DMX) language. DMX extends SQL, containing commands to create, modify, and predict against mining models. For more information about DMX, see "Data Mining Extensions (DMX) Reference" in SQL Server Books Online. Because creating a prediction can be complicated, the data mining editor contains a tool called Prediction Query Builder, which allows you to build queries using a graphical interface. You can also view the DMX code that is generated by the query builder.

Just as important as the tools that you use to work with and create data mining models are the mechanics by which they are created. The key to creating a mining model is the data mining algorithm. The algorithm finds patterns in the data that you pass it, and it translates them into a mining model; it is the engine behind the process.

Some of the most important steps in creating a data mining solution are consolidating, cleaning, and preparing the data to be used to create the mining models.
SQL Server 2005 includes the Data Transformation Services (DTS) working environment, which contains tools that you can use to clean, validate, and prepare your data. For more information on using DTS in conjunction with a data mining solution, see "DTS Data Mining Tasks and Transformations" in SQL Server Books Online.

In order to demonstrate the SQL Server data mining features, this tutorial uses a new sample database called AdventureWorksDW. The database is included with SQL Server 2005, and it supports OLAP and data mining functionality. To make the sample database available, you need to select it at installation time in the "Advanced" dialog for component selection.

Adventure Works

AdventureWorksDW is based on a fictional bicycle manufacturing company named Adventure Works Cycles. Adventure Works produces and distributes metal and composite bicycles to North American, European, and Asian commercial markets. The base of operations is located in Bothell, Washington, with 500 employees, and several regional sales teams are located throughout the company's market base.

Adventure Works sells products wholesale to specialty shops and to individuals through the Internet. For the data mining exercises, you will work with the AdventureWorksDW Internet sales tables, which contain realistic patterns that work well for data mining exercises.

For more information on Adventure Works Cycles, see "Sample Databases and Business Scenarios" in SQL Server Books Online.

Database Details

The Internet sales schema contains information about 9,242 customers.
These customers live in six countries, which are combined into three regions:
North America (83%)
Europe (12%)
Australia (7%)

The database contains data for three fiscal years: 2002, 2003, and 2004.

The products in the database are broken down by subcategory, model, and product.

Business Intelligence Development Studio

Business Intelligence Development Studio is a set of tools designed for creating business intelligence projects. Because Business Intelligence Development Studio was created as an IDE environment in which you can create a complete solution, you work disconnected from the server. You can change your data mining objects as much as you want, but the changes are not reflected on the server until after you deploy the project.

Working in an IDE is beneficial for the following reasons:

The Analysis Services project is the entry point for a business intelligence solution. An Analysis Services project encapsulates mining models and OLAP cubes, along with supplemental objects that make up the Analysis Services database. From Business Intelligence Development Studio, you can create and edit Analysis Services objects within a project and deploy the project to the appropriate Analysis Services server or servers.

If you are working with an existing Analysis Services project, you can also use Business Intelligence Development Studio to work connected to the server. In this way, changes are reflected directly on the server without having to deploy the solution.

SQL Server Management Studio

SQL Server Management Studio is a collection of administrative and scripting tools for working with Microsoft SQL Server components.
This workspace differs from Business Intelligence Development Studio in that you are working in a connected environment where actions are propagated to the server as soon as you save your work.

After the data has been cleaned and prepared for data mining, most of the tasks associated with creating a data mining solution are performed within Business Intelligence Development Studio. Using the Business Intelligence Development Studio tools, you develop and test the data mining solution, using an iterative process to determine which models work best for a given situation. When the developer is satisfied with the solution, it is deployed to an Analysis Services server. From this point, the focus shifts from development to maintenance and use, and thus to SQL Server Management Studio. Using SQL Server Management Studio, you can administer your database and perform some of the same functions as in Business Intelligence Development Studio, such as viewing mining models and creating predictions from them.

Data Transformation Services

Data Transformation Services (DTS) comprises the Extract, Transform, and Load (ETL) tools in SQL Server 2005. These tools can be used to perform some of the most important tasks in data mining: cleaning and preparing the data for model creation. In data mining, you typically perform repetitive data transformations to clean the data before using the data to train a mining model. Using the tasks and transformations in DTS, you can combine data preparation and model creation into a single DTS package.

DTS also provides DTS Designer to help you easily build and run packages containing all of the tasks and transformations. Using DTS Designer, you can deploy the packages to a server and run them on a regularly scheduled basis.
This is useful if, for example, you collect data weekly and want to perform the same cleaning transformations each time in an automated fashion.

You can work with a Data Transformation project and an Analysis Services project together as part of a business intelligence solution by adding each project to a solution in Business Intelligence Development Studio.

Mining Model Algorithms

Data mining algorithms are the foundation from which mining models are created. The variety of algorithms included in SQL Server 2005 allows you to perform many types of analysis.

For more specific information about the algorithms and how they can be adjusted using parameters, see "Data Mining Algorithms" in SQL Server Books Online.

Microsoft Decision Trees

The Microsoft Decision Trees algorithm supports both classification and regression, and it works well for predictive modeling. Using the algorithm, you can predict both discrete and continuous attributes.

In building a model, the algorithm examines how each input attribute in the dataset affects the result of the predicted attribute, and then it uses the input attributes with the strongest relationship to create a series of splits, called nodes. As new nodes are added to the model, a tree structure begins to form. The top node of the tree describes the breakdown of the predicted attribute over the overall population. Each additional node is created based on the distribution of states of the predicted attribute as compared to the input attributes. If an input attribute is seen to cause the predicted attribute to favor one state over another, a new node is added to the model. The model continues to grow until none of the remaining attributes create a split that provides an improved prediction over the existing node.
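The split-selection step described above can be sketched in a few lines. The example below scores each candidate input attribute by information gain, the criterion this tutorial later attributes to the Microsoft Decision Trees algorithm, and picks the attribute with the strongest relationship to the predicted attribute. It is a minimal sketch and not Microsoft's implementation: the function names, the toy `CUSTOMERS` rows, and the attribute names are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of `target` before the split minus the weighted entropy after
    splitting `rows` on each distinct value of `attribute`."""
    before = entropy([row[target] for row in rows])
    after = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        after += len(subset) / len(rows) * entropy(subset)
    return before - after


def best_split(rows, attributes, target):
    """Return the attribute with the highest gain and all computed gains."""
    gains = {a: information_gain(rows, a, target) for a in attributes}
    return max(gains, key=gains.get), gains


# Hypothetical toy data: does a customer buy a bike?
CUSTOMERS = [
    {"commute": "short", "owns_car": "no", "buys_bike": "yes"},
    {"commute": "short", "owns_car": "yes", "buys_bike": "yes"},
    {"commute": "long", "owns_car": "yes", "buys_bike": "no"},
    {"commute": "long", "owns_car": "no", "buys_bike": "no"},
]

if __name__ == "__main__":
    attribute, gains = best_split(CUSTOMERS, ["commute", "owns_car"], "buys_bike")
    print(attribute, gains)
```

In this toy data `commute` separates buyers from non-buyers completely, so it gains one full bit, while `owns_car` leaves both branches evenly mixed and gains nothing. A tree builder would split on `commute` and stop, because no remaining attribute improves the prediction.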
The model seeks to find a combination of attributes and their states that creates a disproportionate distribution of states in the predicted attribute, therefore allowing you to predict the outcome of the predicted attribute.

Microsoft Clustering

The Microsoft Clustering algorithm uses iterative techniques to group records from a dataset into clusters containing similar characteristics. Using these clusters, you can explore the data, learning more about the relationships that exist, which may not be easy to derive logically through casual observation. Additionally, you can create predictions from the clustering model created by the algorithm. For example, consider a group of people who live in the same neighborhood, drive the same kind of car, eat the same kind of food, and buy a similar version of a product. This is a cluster of data. Another cluster may include people who go to the same restaurants, have similar salaries, and vacation twice a year outside the country. Observing how these clusters are distributed, you can better understand how the records in a dataset interact, as well as how that interaction affects the outcome of a predicted attribute.

Microsoft Naïve Bayes

The Microsoft Naïve Bayes algorithm quickly builds mining models that can be used for classification and prediction. It calculates probabilities for each possible state of the input attribute, given each state of the predictable attribute, which can later be used to predict an outcome of the predicted attribute based on the known input attributes. The probabilities used to generate the model are calculated and stored during the processing of the cube. The algorithm supports only discrete or discretized attributes, and it considers all input attributes to be independent. The Microsoft Naïve Bayes algorithm produces a simple mining model that can be considered a starting point in the data mining process.
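To make the Naïve Bayes description concrete, the sketch below counts, for each state of the predictable attribute, how often each state of every input attribute occurs, and then multiplies those conditional probabilities under the independence assumption to score a new case. It is a textbook formulation, not Microsoft's implementation: the function names and the toy `ROWS` data are hypothetical, and no smoothing is applied, so an input state never seen with a class gives that class a probability of zero.

```python
from collections import Counter, defaultdict


def train(rows, inputs, target):
    """Count class frequencies and per-class input-state frequencies."""
    class_counts = Counter(row[target] for row in rows)
    # state_counts[(attribute, class)][state] = number of matching rows
    state_counts = defaultdict(Counter)
    for row in rows:
        for attribute in inputs:
            state_counts[(attribute, row[target])][row[attribute]] += 1
    return class_counts, state_counts


def predict(model, inputs, case):
    """Return normalized class probabilities for `case` (a dict of inputs)."""
    class_counts, state_counts = model
    total = sum(class_counts.values())
    scores = {}
    for cls, n in class_counts.items():
        score = n / total  # prior probability of the class
        for attribute in inputs:
            # P(input state | class), treating inputs as independent
            score *= state_counts[(attribute, cls)][case[attribute]] / n
        scores[cls] = score
    norm = sum(scores.values())
    return {cls: s / norm for cls, s in scores.items()} if norm else scores


# Hypothetical discrete training data.
ROWS = [
    {"commute": "short", "owns_car": "no", "buys_bike": "yes"},
    {"commute": "short", "owns_car": "yes", "buys_bike": "yes"},
    {"commute": "long", "owns_car": "yes", "buys_bike": "no"},
    {"commute": "long", "owns_car": "no", "buys_bike": "no"},
    {"commute": "short", "owns_car": "yes", "buys_bike": "no"},
]
INPUTS = ["commute", "owns_car"]

if __name__ == "__main__":
    model = train(ROWS, INPUTS, "buys_bike")
    print(predict(model, INPUTS, {"commute": "short", "owns_car": "no"}))
```

Because training is only counting, the model is cheap to build, which is consistent with the tutorial's remark that Naïve Bayes is a good starting point for exploring data.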
Because most of the calculations used in creating the model are generated during cube processing, results are returned quickly. This makes the model a good option for exploring the data and for discovering how various input attributes are distributed in the different states of the predicted attribute.

Microsoft Time Series

The Microsoft Time Series algorithm creates models that can be used to predict continuous variables over time from both OLAP and relational data sources. For example, you can use the Microsoft Time Series algorithm to predict sales and profits based on the historical data in a cube.

Using the algorithm, you can choose one or more variables to predict, but they must be continuous. You can have only one case series for each model. The case series identifies the location in a series, such as the date when looking at sales over a length of several months or years.

A case may contain a set of variables (for example, sales at different stores). The Microsoft Time Series algorithm can use cross-variable correlations in its predictions. For example, prior sales at one store may be useful in predicting current sales at another store.

Microsoft Neural Network

In Microsoft SQL Server 2005 Analysis Services, the Microsoft Neural Network algorithm creates classification and regression mining models by constructing a multilayer perceptron network of neurons. Similar to the Microsoft Decision Trees algorithm provider, given each state of the predictable attribute, the algorithm calculates probabilities for each possible state of the input attribute. The algorithm provider processes the entire set of cases, iteratively comparing the predicted classification of the cases with the known actual classification of the cases. The errors from the initial classification of the first iteration of the entire set of cases are fed back into the network and used to modify the network's performance for the next iteration, and so on.
You can later use these probabilities to predict an outcome of the predicted attribute, based on the input attributes. One of the primary differences between this algorithm and the Microsoft Decision Trees algorithm, however, is that its learning process optimizes network parameters to minimize the error, while the Microsoft Decision Trees algorithm splits rules in order to maximize information gain. The algorithm supports the prediction of both discrete and continuous attributes.

Microsoft Linear Regression

The Microsoft Linear Regression algorithm is a particular configuration of the Microsoft Decision Trees algorithm, obtained by disabling splits (the whole regression formula is built in a single root node). The algorithm supports the prediction of continuous attributes.

Microsoft Logistic Regression

The Microsoft Logistic Regression algorithm is a particular configuration of the Microsoft Neural Network algorithm, obtained by eliminating the hidden layer. The algorithm supports the prediction of both discrete and continuous attributes.

Chinese translation

Introduction to Data Mining

Abstract: Microsoft® SQL Server™ 2005 provides an integrated environment for creating and working with data mining models.

Classic Computer Science Textbooks


1 Preface

2 Mathematics

3 Data Structures & Algorithms

4 Compiler

5 Operating System

6 Database

7 C

8 C++

9 Object-Oriented

10 Software Engineering

11 UNIX Programming

12 UNIX Administration

13 Networks

14 Windows Programming

15 Other (*)

Mathematics

Title (English): Discrete Mathematics and Its Applications (Fifth Edition)

Title (Chinese): 离散数学及其应用(第五版)

Original author: Kenneth H. Rosen

Title (English): Concrete Mathematics: A Foundation for Computer Science (Second Edition)

Title (Chinese): 具体数学:计算机科学基础(第2版)

Original authors: Ronald L. Graham / Donald E. Knuth / Oren Patashnik

Data Structures & Algorithms

Title (English): Data Structures and Algorithm Analysis in C, Second Edition

Title (Chinese): 数据结构与算法分析--C语言描述(第二版)

Original author: Mark Allen Weiss

Title (English): Data Structures & Program Design in C (Second Edition)

Title (Chinese): 数据结构与程序设计C语言描述(第二版)

Original authors: Robert Kruse / C. L. Tondo / Bruce Leung

Computer Science Foreign Literature Translation (English Source Text): Database Systems


Source text

Database Systems

1. Introduction to Database System

Today, more than at any previous time, the success of an organization depends on its ability to acquire accurate and timely data about its operation, to manage this data effectively, and to use it to analyze and guide its activities. Phrases such as the information superhighway have become ubiquitous, and information processing is a rapidly growing multibillion dollar industry.

The amount of information available to us is literally exploding, and the value of data as an organizational asset is being widely recognized.

This paradox drives the need for increasingly powerful and flexible data management systems.

A database is a collection of data, typically describing the activities of one or more related organizations. For example, a university database might contain information about the following:
● Entities such as students, faculty, courses, and classrooms.

Source text for translation: Database

Database

A database consists of an organized collection of data for one or more uses, typically in digital form. One way of classifying databases involves the type of their contents, for example: bibliographic, document-text, statistical. Digital databases are managed using database management systems, which store database contents, allowing data creation and maintenance, and search and other access.

Architecture

Database architecture consists of three levels: external, conceptual and internal. Clearly separating the three levels was a major feature of the relational database model that dominates 21st century databases.

The external level defines how users understand the organization of the data. A single database can have any number of views at the external level. The internal level defines how the data is physically stored and processed by the computing system. Internal architecture is concerned with cost, performance, scalability and other operational matters. The conceptual level is a level of indirection between internal and external. It provides a common view of the database that is uncomplicated by details of how the data is stored or managed, and that can unify the various external views into a coherent whole.

Database management systems

A database management system (DBMS) consists of software that operates databases, providing storage, access, security, backup and other facilities. Database management systems can be categorized according to the database model that they support, such as relational or XML; the type(s) of computer they support, such as a server cluster or a mobile phone; the query language(s) that access the database, such as SQL or XQuery; and performance trade-offs, such as maximum scale or maximum speed. Some DBMS cover more than one entry in these categories, e.g., supporting multiple query languages.

Components of DBMS

Most DBMS as of 2009 implement a relational model.
Other DBMS, such as object DBMS, offer specific features for more specialized requirements. Their components are similar, but not identical.

RDBMS components

• Sublanguages: Relational DBMS (RDBMS) include Data Definition Language (DDL) for defining the structure of the database, Data Control Language (DCL) for defining security/access controls, and Data Manipulation Language (DML) for querying and updating data.
• Interface drivers: These drivers are code libraries that provide methods to prepare statements, execute statements, fetch results, etc. Examples include ODBC, JDBC, MySQL/PHP, FireBird/Python.
• SQL engine: This component interprets and executes the DDL, DCL, and DML statements. It includes three major components (compiler, optimizer, and executor).
• Transaction engine: Ensures that multiple SQL statements either succeed or fail as a group, according to application dictates.
• Relational engine: Relational objects such as Table, Index, and Referential integrity constraints are implemented in this component.
• Storage engine: This component stores and retrieves data from secondary storage, as well as managing transaction commit and rollback, backup and recovery, etc.

ODBMS components

Object DBMS (ODBMS) have transaction and storage components that are analogous to those in an RDBMS. Some ODBMS handle DDL, DCL and update tasks differently. Instead of using sublanguages, they provide APIs for these purposes. They typically include a sublanguage and accompanying engine for processing queries with interpretive statements analogous to but not the same as SQL. Example object query languages are OQL, LINQ, JDOQL, JPAQL and others. The query engine returns collections of objects instead of relational rows.

Types

Operational database

These databases store detailed data about the operations of an organization. They are typically organized by subject matter and process relatively high volumes of updates using transactions.
Essentially every major organization on earth uses such databases. Examples include customer databases that record contact, credit, and demographic information about a business' customers; personnel databases that hold information such as salary, benefits, and skills data about employees; manufacturing databases that record details about product components and parts inventory; and financial databases that keep track of the organization's money, accounting and financial dealings.

Data warehouse

Data warehouses archive historical data from operational databases and often from external sources such as market research firms. Often operational data undergoes transformation on its way into the warehouse, getting summarized, anonymized, reclassified, etc. The warehouse becomes the central source of data for use by managers and other end-users who may not have access to operational data. For example, sales data might be aggregated to weekly totals and converted from internal product codes to use UPC codes so that it can be compared with ACNielsen data.

Analytical database

Analysts may do their work directly against a data warehouse, or create a separate analytic database for Online Analytical Processing. For example, a company might extract sales records for analyzing the effectiveness of advertising and other sales promotions at an aggregate level.

Distributed database

These are databases of local work-groups and departments at regional offices, branch offices, manufacturing plants and other work sites. These databases can include segments of both common operational and common user databases, as well as data generated and used only at a user's own site.

End-user database

These databases consist of data developed by individual end-users.
Examples of these are collections of documents in spreadsheets, word processing and downloaded files, or even a personal baseball card collection.

External database

These databases contain data collected for use across multiple organizations, either freely or via subscription. The Internet Movie Database is one example.

Hypermedia databases

The World Wide Web can be thought of as a database, albeit one spread across millions of independent computing systems. Web browsers "process" this data one page at a time, while web crawlers and other software provide the equivalent of database indexes to support search and other activities.

Models

Post-relational database models

Products offering a more general data model than the relational model are sometimes classified as post-relational. Alternate terms include "hybrid database", "Object-enhanced RDBMS" and others. The data model in such products incorporates relations but is not constrained by E.F. Codd's Information Principle, which requires that all information in the database must be cast explicitly in terms of values in relations and in no other way.

Some of these extensions to the relational model integrate concepts from technologies that pre-date the relational model. For example, they allow representation of a directed graph with trees on the nodes.

Some post-relational products extend relational systems with non-relational features. Others arrived in much the same place by adding relational features to pre-relational systems. Paradoxically, this allows products that are historically pre-relational, such as PICK and MUMPS, to make a plausible claim to be post-relational.

Object database models

In recent years, the object-oriented paradigm has been applied in areas such as engineering and spatial databases, telecommunications and in various scientific domains. The conglomeration of object-oriented programming and database technology led to this new kind of database.
These databases attempt to bring the database world and the application-programming world closer together, in particular by ensuring that the database uses the same type system as the application program. This aims to avoid the overhead (sometimes referred to as the impedance mismatch) of converting information between its representation in the database (for example as rows in tables) and its representation in the application program (typically as objects). At the same time, object databases attempt to introduce key ideas of object programming, such as encapsulation and polymorphism, into the world of databases.

A variety of ways have been tried for storing objects in a database. Some products have approached the problem from the application-programming side, by making the objects manipulated by the program persistent. This also typically requires the addition of some kind of query language, since conventional programming languages do not provide language-level functionality for finding objects based on their information content. Others have attacked the problem from the database end, by defining an object-oriented data model for the database, and defining a database programming language that allows full programming capabilities as well as traditional query facilities.

Storage structures

Databases may store relational tables/indexes in memory or on hard disk in one of many forms:
• ordered/unordered flat files
• ISAM
• heaps
• hash buckets
• logically-blocked files
• B+ trees

The most commonly used are B+ trees and ISAM.

Object databases use a range of storage mechanisms. Some use virtual memory-mapped files to make the native language (C++, Java etc.) objects persistent. This can be highly efficient but it can make multi-language access more difficult. Others disassemble objects into fixed- and varying-length components that are then clustered in fixed-sized blocks on disk and reassembled into the appropriate format on either the client or server address space.
Another popular technique involves storing the objects in tuples (much like a relational database) which the database server then reassembles into objects for the client.

Other techniques include clustering by category (such as grouping data by month, or location), storing pre-computed query results, known as materialized views, and partitioning data by range (e.g., a data range) or by hash.

Memory management and storage topology can be important design choices for database designers as well. Just as normalization is used to reduce storage requirements and improve database designs, conversely denormalization is often used to reduce join complexity and reduce query execution time.

Indexing

Indexing is a technique for improving database performance. The many types of index share the common property that they eliminate the need to examine every entry when running a query. In large databases, this can reduce query time/cost by orders of magnitude. The simplest form of index is a sorted list of values that can be searched using a binary search with an adjacent reference to the location of the entry, analogous to the index in the back of a book. The same data can have multiple indexes (an employee database could be indexed by last name and hire date).

Indexes affect performance, but not results. Database designers can add or remove indexes without changing application logic, reducing maintenance costs as the database grows and database usage evolves.

Given a particular query, the DBMS's query optimizer is responsible for devising the most efficient strategy for finding matching data. The optimizer decides which index or indexes to use, how to combine data from different parts of the database, how to provide data in the order requested, etc.

Indexes can speed up data access, but they consume space in the database, and must be updated each time the data are altered. Indexes therefore can speed data access but slow data maintenance.
These two properties determine whether a given index is worth the cost.

Transactions

Most DBMS provide some form of support for transactions, which allow multiple data items to be updated in a consistent fashion, such that updates that are part of a transaction succeed or fail in unison. The so-called ACID rules, summarized here, characterize this behavior:
• Atomicity: Either all the data changes in a transaction must happen, or none of them. The transaction must be completed, or else it must be undone (rolled back).
• Consistency: Every transaction must preserve the declared consistency rules for the database.
• Isolation: Two concurrent transactions cannot interfere with one another. Intermediate results within one transaction must remain invisible to other transactions. The most extreme form of isolation is serializability, meaning that transactions that take place concurrently could instead be performed in some series, without affecting the ultimate result.
• Durability: Completed transactions cannot be aborted later or their results discarded. They must persist through (for instance) DBMS restarts.

In practice, many DBMSs allow the selective relaxation of these rules to balance perfect behavior with optimum performance.

Replication

Database replication involves maintaining multiple copies of a database on different computers, to allow more users to access it, or to allow a secondary site to immediately take over if the primary site stops working. Some DBMS piggyback replication on top of their transaction logging facility, applying the primary's log to the secondary in near real-time.
Database clustering is a related concept for handling larger databases and user communities by employing a cluster of multiple computers to host a single database that can use replication as part of its approach.

Security

Database security denotes the system, processes, and procedures that protect a database from unauthorized activity.

DBMSs usually enforce security through access control, auditing, and encryption:
• Access control manages who can connect to the database via authentication and what they can do via authorization.
• Auditing records information about database activity: who, what, when, and possibly where.
• Encryption protects data at the lowest possible level by storing and possibly transmitting data in an unreadable form. The DBMS encrypts data when it is added to the database and decrypts it when returning query results. This process can occur on the client side of a network connection to prevent unauthorized access at the point of use.

Confidentiality

Law and regulation govern the release of information from some databases, protecting medical history, driving records, telephone logs, etc.

In the United Kingdom, database privacy regulation falls under the Office of the Information Commissioner. Organizations based in the United Kingdom and holding personal data in digital format such as databases must register with the Office.

Locking

When a transaction modifies a resource, the DBMS stops other transactions from also modifying it, typically by locking it. Locks also provide one method of ensuring that data does not change while a transaction is reading it or even that it doesn't change until a transaction that once read it has completed.

Granularity

Locks can be coarse, covering an entire database; fine-grained, covering a single data item; or intermediate, covering a collection of data such as all the rows in an RDBMS table.

Lock types

Locks can be shared or exclusive, and can lock out readers and/or writers.
Locks can be created implicitly by the DBMS when a transaction performs an operation, or explicitly at the transaction's request.

Shared locks allow multiple transactions to lock the same resource. The lock persists until all such transactions complete. Exclusive locks are held by a single transaction and prevent other transactions from locking the same resource.

Read locks are usually shared, and prevent other transactions from modifying the resource. Write locks are exclusive, and prevent other transactions from modifying the resource. On some systems, write locks also prevent other transactions from reading the resource.

The DBMS implicitly locks data when it is updated, and may also do so when it is read. Transactions explicitly lock data to ensure that they can complete without a deadlock or other complication. Explicit locks may be useful for some administrative tasks.

Locking can significantly affect database performance, especially with large and complex transactions in highly concurrent environments.

Isolation

Isolation refers to the ability of one transaction to see the results of other transactions. Greater isolation typically reduces performance and/or concurrency, leading DBMSs to provide administrative options to reduce isolation. For example, in a database that analyzes trends rather than looking at low-level detail, increased performance might justify allowing readers to see uncommitted changes ("dirty reads").

Deadlocks

Deadlocks occur when two transactions each require data that the other has already locked exclusively. Deadlock detection is performed by the DBMS, which then aborts one of the transactions and allows the other to complete.

From: Wikipedia, the free encyclopedia.
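The atomicity rule from the Transactions section above (either all changes in a transaction happen, or none of them) can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The account table, its CHECK constraint, and the transfer and balances helpers are hypothetical names chosen for this illustration; they do not come from the article. The sketch credits the destination first on purpose, so that a failed debit leaves a partial change behind that the rollback has to undo.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as a single transaction.

    Returns True if the transaction committed, False if it was rolled back.
    """
    try:
        # Using the connection as a context manager commits on success and
        # rolls back if an exception escapes the block.
        with conn:
            # Credit first: this statement succeeds on its own.
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # Debit second: this violates the CHECK constraint on overdraft.
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The credit above has been undone together with the failed debit.
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM account"))


# Hypothetical in-memory database used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
with conn:
    conn.executemany(
        "INSERT INTO account VALUES (?, ?)",
        [("alice", 100), ("bob", 50)],
    )
```

Calling transfer(conn, "alice", "bob", 30) commits both updates, while transfer(conn, "alice", "bob", 500) fails on the debit and leaves both balances as they were before the call. The CHECK constraint also plays the role of a declared consistency rule in the sense of the Consistency bullet.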

Database Foreign-Language References and Translation

SQL ALL-IN-ONE DESK REFERENCE FOR DUMMIES

Data Files and Databases

I. Irreducible complexity

Any software system that performs a useful function is going to be complex. The more valuable the function, the more complex its implementation will be. Regardless of how the data is stored, the complexity remains. The only question is where that complexity resides. Any non-trivial computer application has two major components: the program and the data. Although an application's level of complexity depends on the task to be performed, developers have some control over the location of that complexity. The complexity may reside primarily in the program part of the overall system, or it may reside in the data part.

Operations on the data can be fast. Because the program interacts directly with the data, with no DBMS in the middle, well-designed applications can run as fast as the hardware permits. What could be better? A data organization that minimizes storage requirements and at the same time maximizes speed of operation seems like the best of all possible worlds. But wait a minute. Flat file systems came into use in the 1940s. We have known about them for a long time, and yet today they have been almost entirely replaced by database systems. What's up with that? Perhaps it is the not-so-beneficial consequences...

Information Systems and Database Development: Foreign-Language Literature with Chinese-English Parallel Translation

(The document contains the English original and the Chinese translation.)

Information System Development and Database Development

In many organizations, database development begins with enterprise data modeling, which determines the scope and general content of the organization's databases. This step usually occurs during an organization's information system planning. Its aim is to create an overall description or explanation of the organization's data, not to design a specific database. A specific database provides data for one or more information systems, whereas the enterprise data model (which may span a number of databases) describes the scope of the data the organization maintains. In enterprise data modeling, you review the current systems, analyze the nature of the business areas to be supported, describe the required data at a high level of abstraction, and plan one or more database development projects. Figure 1 shows part of the enterprise data model of Pine Valley Furniture Company.

1.1 Information System Architecture

A high-level data model is only one part of the overall information system architecture (ISA), the blueprint for an organization's information systems. During information system planning, you can build an enterprise data model as part of the overall information system architecture.
According to Zachman (1987) and Sowa and Zachman (1992), an information system architecture consists of the following six key components:
● Data.
● Processes that manipulate data (these can be shown with data flow diagrams, object models with methods, or other notations).
● Networks, which transport data within the organization and between the organization and its main business partners (they can be shown with network connection and topology diagrams).
● People, who carry out processes and are the source and receiver of data and information (in process models they appear as the senders and receivers of data).
● Events and points in time at which processes are performed (they can be shown with state transition diagrams and other means).
● Reasons for events and rules that govern data processing (often shown as text, although some diagramming tools for rules exist, such as decision tables).

1.2 Information Engineering

Information systems planners develop an information system architecture by following a specific information system planning methodology. Information engineering is one popular, formal methodology. It is a data-oriented methodology for creating and maintaining information systems. Because information engineering is data-oriented, a concise explanation of it is very helpful as you begin to understand how databases are identified and defined. Information engineering follows a top-down planning approach, in which specific information systems are derived from a broad understanding of information needs (for example, we need data about customers, products, suppliers, salespeople and processing centers), rather than from merging many detailed information requests (such as an order entry screen or a sales summary report by territory).
Top-down planning gives developers a more comprehensive view when planning information systems, offers an integrated way to consider system components, improves understanding of the relationship between information systems and business objectives, and deepens understanding of the impact of information systems across the whole organization.

Information engineering includes four steps: planning, analysis, design and implementation. The planning phase of information engineering produces an information system architecture, including the enterprise data model.

1.3 Information System Planning

The objective of information systems planning is to align information technology closely with the organization's business strategy. Such alignment is important for getting the most benefit from investments in information systems and technology. As the table describes, the planning phase of the information engineering approach includes three steps, which are discussed in the following three sections.

1. Identifying the key planning factors

The key planning factors are organizational objectives, critical success factors and problem areas. The purpose of identifying these factors is to establish the planning context and to link information systems planning to strategic business planning. Table 2 shows a number of possible key planning factors for Pine Valley Furniture Company. These factors help information systems managers set priorities among requests for new information systems and databases. For example, given the problem area of imprecise sales forecasts, information systems managers may store additional historical sales data, new market research data and new product test data in the organization's databases.

2. Identifying the organization's planning targets

The organization's planning targets define the scope of the business, and that scope limits subsequent systems analysis and the places where information systems may change.
The five key planning targets are as follows:
● Organizational units: the various departments of the organization.
● Organizational locations: the places where business operations take place.
● Business functions: related groups of business processes that support the organization's mission. Business functions are different from organizational units; in fact, one function can be assigned to several organizational units (for example, product development may be the joint responsibility of the production and sales departments).
● Entity types: the major categories of data about the people, places and things managed by the organization.
● Information systems: the application software and supporting procedures that process sets of data.

3. Building an enterprise model

A comprehensive enterprise model includes a functional decomposition model of each business function, the enterprise data model and various planning matrices. Functional decomposition is the process of breaking the organization's functions down into progressively greater detail; it is a classical approach used in systems analysis to simplify problems, focus attention and identify components. An example of the functional decomposition of the ordering function at Pine Valley Furniture Company appears in Figure 2. Many databases are needed to handle the full set of business and support functions, so a specific database is likely to support only a subset of them (as shown in Figure 2). A complete, high-level enterprise view is nevertheless very helpful for reducing data redundancy and making data more meaningful.

An enterprise data model is described using a specific notation. Apart from the graphical depiction of entity types, a complete enterprise data model should also include a description of each entity type and a summary of statements about how the business operates, known as business rules.
Business rules determine the validity of the data.

An enterprise data model includes not only the entity types but also the relationships between data entities, as well as the relationships among various other planning targets. A common form for showing the relationships between planning targets is a matrix. Planning matrices serve an important function because they allow business requirements to be stated clearly without explicitly modeling the database. Planning matrices are often derived from business rules; they help set development priorities, order development activities and schedule those activities from a top-down, enterprise-wide view. Many types of planning matrix are available. Common ones are:
● Location-to-function: shows which business functions are performed at which business locations.
● Unit-to-function: shows which business functions are performed by, or are the responsibility of, which business units.
● Information system-to-data entity: explains how each information system interacts with each data entity (for example, whether each system creates, retrieves, updates and deletes data in each entity).
● Supporting function-to-data entity: identifies which data are captured, used, updated and deleted within each function.
● Information system-to-objective: shows which information systems support each business objective.

Take the function-to-data entity matrix as an example.
Such a matrix can be used for a variety of purposes, including the following three:
1) Identify gaps: show which data entities are not used by any function and which functions do not use any entity.
2) Spot missing entities: the staff involved in each function can examine the matrix to identify any entities that may have been missed.
3) Prioritize development: if a function has high priority for system development (perhaps because it relates to important organizational objectives), then the entities used by that area have high priority in database development.

Hoffer, George and Valacich (2002) give a more complete description of how to use planning matrices in information engineering and systems planning.

2 Database development process

Information systems planning based on information engineering is one source of database development projects. These new database development projects usually aim to meet the organization's strategic needs, such as improving customer support, improving product and inventory management, or producing more accurate sales forecasts. However, many database development projects arise bottom-up: information system users who need specific information to do their work request a project, or other information systems professionals see that the organization needs to improve its data management and start one. Even in the bottom-up case, an enterprise data model is necessary in order to understand whether the existing databases can provide the required data and, if not, which new databases, data entities and attributes should be added to the organization's current data resources. Whether it stems from strategic needs or from operational information needs, each database development project normally concentrates on one database.
Some projects concentrate only on the definition, design and implementation of a database, as a basis for subsequent information systems development. In most cases, however, the database and its associated information processing functions are developed together as part of a complete information systems development project.

2.1 System Development Life Cycle

The traditional process that guides information system development projects is the system development life cycle (SDLC). The SDLC is the complete set of steps that a team of information systems professionals in an organization, including database designers and programmers, follows to specify, develop, maintain and replace information systems. The process is often likened to a waterfall because each step flows into the adjacent next step; that is, the information system specification is developed piece by piece, and the output of each piece is the input to the next. As shown in the figure, however, these steps are not purely linear: the steps overlap in time (so steps can be managed in parallel), and it is possible to go back to earlier steps when previous decisions need to be reconsidered. (So water can flow back up the waterfall!)

Figure 4 gives concise notes on the purpose and deliverables of each stage of the system development life cycle. Every stage of the life cycle includes activities related to database development, so database management issues run through the entire system development process. In Figure 5 we repeat the seven stages of the system development life cycle and outline the database development activities common at each stage.
Please note that there is no one-to-one correspondence between system development life cycle stages and database development steps: conceptual data modeling takes place in two life cycle stages.

Enterprise Modeling

The database development process begins with enterprise modeling (part of the project identification and selection stage of the system development life cycle), which sets the scope and general content of the organization's databases. Enterprise modeling takes place during information systems planning and other activities, which determine which parts of the information systems need to be changed and strengthened and outline the scope of data for the entire organization. In this step, you review the current databases and information systems, analyze the nature of the business area that is the subject of the development project, and describe in very general terms the data needed by each information system under development. A project moves on to the next step only if it is expected to achieve the organization's goals.

Conceptual Data Modeling

For an information system project that has already begun, the conceptual data modeling phase analyzes all the data requirements of the information system. It is divided into two stages. First, during the project initiation and planning stage, a diagram similar to Figure 1 is created. Other documents are produced at the same time to outline the scope of the data required by the specific development project, without considering the existing databases. Only high-level categories of data (entities) and major relationships are included at this point. Then, in the analysis stage of the system development life cycle, a detailed data model is produced that identifies all the organizational data the information system must manage: it defines all data attributes, lists all categories of data, represents all business relationships between data entities, and specifies all data integrity rules.
In the analysis phase, the conceptual data model (later also called the conceptual schema) is also checked for consistency with the other kinds of models used to explain other aspects of the target information system, such as processing steps, rules for processing data and the timing of events. However, even such a detailed conceptual data model is only preliminary, because later life cycle activities, such as designing transactions, reports, displays and queries, may reveal missing elements or mistakes. Therefore, conceptual data modeling is often said to be done in a top-down manner, driven by a general understanding of the business area rather than by specific information processing activities.

3. Logical Database Design

Logical database design approaches database development from two perspectives. First, the conceptual data model is transformed into a standard notation based on relational database theory: relations. Then, as each computer program of the information system is designed (including its input and output formats), the transactions, reports, displays and queries the database supports are examined in detail. In this so-called bottom-up analysis, you verify exactly which data need to be maintained in the database and the nature of the data needed by each transaction, report and so on.

The analysis of each individual report, transaction and so on considers a specific, limited but complete view of the database. As reports, transactions and so on are analyzed, it may be necessary to change the conceptual data model. Especially in large projects, different teams of analysts and system developers may work independently on different programs or sets of programs, and the details of their work may not all be revealed until well into the logical design stage.
In that case, the logical database design step must combine the original conceptual data model and these independent user views into one comprehensive design. Additional information processing requirements may also be identified during logical information systems design, and these new requirements must then be integrated into the logical database design identified earlier.

The final step of logical database design is to transform the combined and reconciled data specifications into basic, or atomic, elements according to well-established rules for well-structured data specifications. For most of today's databases these rules come from relational database theory and a process called normalization. The result of this step is a complete picture of the database that makes no reference to any particular database management system for managing the data. With the logical database design complete, work begins on specifying in detail the logic of the computer programs and queries needed to maintain and report the database contents.

4. Physical Database Design and Definition
The physical database design and definition step decides how the database is organized in computer storage (usually disk), defines the physical structures of the database management system, and outlines the programs that process transactions and produce the expected management information and decision support reports. The goal of this step is to design a database that handles all data processing against it efficiently and securely. Physical database design must therefore be closely coordinated with the design of the other physical aspects of the information system, including programs, computer hardware, operating systems, and data communications networks.

5. Database Implementation
In the database implementation step, the programs that process the database are written, tested, and installed.
Designers can write these programs in standard programming languages (such as COBOL, C, or Visual Basic), in dedicated database processing languages (such as SQL), or in special-purpose nonprocedural languages that produce fixed-format reports and displays, possibly including charts. Also during implementation, all database documentation is completed, users are trained, and procedures are put in place for supporting the users of the information system (and database). The last step is to load data from existing information sources (files and databases of legacy applications, plus the new data now needed). Loading is often done by first unloading data from existing files and databases into an intermediate format (such as binary or text files) and then loading the intermediate data into the new database. Finally, the database and its associated applications are put into production so that actual users can maintain and retrieve data. During production, the database is backed up regularly and recovered if it is damaged or corrupted.

6. Database Maintenance
The database evolves during database maintenance. In this step, characteristics of the database structure are added, deleted, or changed in order to meet changing business conditions, to correct errors in the database design, or to improve the processing speed of database applications. A database may also have to be rebuilt if a program or computer system failure corrupts or destroys it. This is usually the longest step of the database development process, because it lasts throughout the life of the database and its associated applications. Each evolution of the database can be regarded as an abbreviated database development process in which conceptual data modeling, logical and physical database design, and database implementation recur to deal with the proposed changes.

2.2 Other Approaches to Information Systems Development
The systems development life cycle, or a slight variation of it, is often used to guide the development of information systems and databases.
The systems development life cycle is a methodical, highly structured approach that includes many checks and balances to ensure that every step produces accurate results and that the new or replacement information system is consistent with the existing systems it must communicate with or share data definitions with. It is often criticized for the long time it takes to obtain a working system, since a working system is produced only at the end of the whole process. Organizations therefore increasingly use rapid application development (RAD) methods, which rapidly repeat the analysis, design, and implementation steps in an iterative process until it converges on the system the user wants. RAD methods presuppose that the database already exists, so they mainly suit applications that retrieve data rather than applications that create and modify databases.

One of the most widely used RAD methods is prototyping. Prototyping is an iterative systems development process in which analysts and users work closely together, continually revising the system until all the requirements have been converted into a working system. Figure 6 shows the prototyping process; its annotations briefly describe the database development activities in each prototyping phase. Typically, only a rough attempt at conceptual data modeling is made when the information system problem is identified. During development of the initial prototype, the displays and reports the user wants are designed while any new database requirements are identified and a database for the prototype is defined. This is usually a new database that copies portions of existing systems, possibly with some new content added.
When new content is needed, it usually comes from external data sources such as market research data, general economic indicators, or industry standards.

Database implementation and maintenance activities are repeated as each new version of the prototype is produced. Security and integrity controls are usually minimal, because the emphasis at this stage is on producing a usable prototype version as quickly as possible. Documentation likewise tends to be deferred until the end of the project, and users are trained only when the system is delivered for use. Finally, once an acceptable prototype has been built, developers and users decide whether the final prototype and its database can be delivered and put into use. If the system (including the database) is too inefficient, the system and database are reprogrammed and reorganized to reach the expected performance.

As visual programming tools (such as Visual Basic, Java, Visual C++, and fourth-generation languages) become more popular and make it easy to change the interface between user and system, prototyping is becoming the systems development methodology of choice. With prototyping it is fairly easy to change the content and layout of the reports and displays that users see. New database requirements may be identified in the process, so the existing databases used by the evolving application may have to be modified.
Prototyping can even be used for a system that needs a new database. In that case, sample data are acquired to build and rebuild the database prototype as the system requirements change during the iterative development process.

3 A Three-Level Architecture Model for Database Development
The explanation of the database development process earlier in this article mentioned several different but related views, or models, of a database that are built during a systems development project:
● Conceptual model (built during the analysis phase).
● External model, or user view (built during the analysis and logical design phases).
● Physical model, or internal model (built during the physical design phase).
Figure 7 shows the relationship among these three views. It is important to remember that they are all views or models of the same organizational database. In other words, each organizational database has one physical model, one conceptual model, and one or more user views. The three-level architecture thus defines a single database by looking at the same set of data in different ways.

The conceptual model is a specification of the complete structure of the database that is independent of any technology. It defines the whole database without reference to how the data are stored in the computer's secondary memory. The conceptual model is usually depicted graphically, using entity-relationship (E-R) diagrams or object-modeling notation; this kind of conceptual model is what we have called a data model. In addition, the specifications of the conceptual model are stored as metadata in the database or a data dictionary.

The physical model contains the specifications for how the data of the conceptual model are stored in the computer's secondary memory.
What matters to the analyst and the database designer is the definition of the physical database (the physical model), which gives the database technology the full specifications it needs to allocate and manage the physical secondary memory space in which the data are stored and accessed.

Database development and database technology are built on the distinction among these three models. A person's role in a database development project may involve work on only one of the three views. For example, a beginner may design the external models used by one or more programs, while an experienced developer designs the physical or conceptual model. The design issues differ considerably from level to level.

4 A Three-Tier Architecture for Database Location
Evidently, all good things in databases come in threes!
When you design a database, you choose where to store the data. These choices are made during physical database design. Databases can be classified as personal databases, workgroup databases, department databases, enterprise databases, and Internet databases. Personal databases are often designed and developed by end users themselves, with only training and advice from database professionals, and contain only the data that interest the individual end user. Sometimes a personal database is extracted from a workgroup or enterprise database, in which case database professionals usually write the extraction routines that create the local database. Workgroup and department databases are often developed jointly by end users, systems professionals in the business units, and central database professionals. This collaboration is necessary because many issues must be weighed in designing a shared database: processing speed, ease of use, differences in data definitions, and the like.
Because enterprise and Internet databases are large and have a broad impact, they are normally developed by professionally trained database specialists, often in a centralized database development group.

1. Client tier
A desktop or notebook computer, also called the presentation tier, which specializes in managing the user interface and data local to the system. Web scripting tasks can be carried out on this tier.
2. Server / Web server tier
Handles the HTTP protocol and scripting tasks, performs calculations, and provides access to data. This tier is also called the process services tier.
3. Enterprise server (minicomputer or mainframe) tier
Performs complex calculations and manages the merging of data from multiple sources across the organization. This tier is also called the data services tier.

Within an organization, the tiered architecture of databases and information systems is related to the concept of a client/server architecture for distributed computing. In a client/server architecture, which is based on a local area network environment, database software on a server (called a database server or database engine) carries out the database commands sent from client workstations, while the application on each client concentrates on user interface functions. In practice, the whole conceptual database (along with the application routines that process it) can be distributed across the local PC workstation, an intermediate server (workgroup or department), and a central server (department or enterprise), either as one distributed database or as separate but related physical databases.
In short, the reasons for using a client/server architecture are:
● It lets multiple processors work on the same application at the same time, which improves application response time and data processing speed.
● It can exploit the best data processing features of each computer platform (for example, the advanced user interfaces of PCs versus the computing speed of minicomputers and mainframes).
● Client technologies can be mixed (personal computers with Intel or Motorola processors, network computers, information kiosks, and so on) while still sharing common data. In addition, the technology at any tier can be changed with only a small effect on the system modules at the other tiers.
● Processing can be performed close to the source of the data being processed, which improves response time and reduces network traffic.

An Introduction to A Portrait of the Artist as a Young Man


Rembrandt would fix his gaze in the mirror to the
world behind the mirror as he painted. Joyce’s use of the title may suggest the following chiasmic structure:
image. Portrait is riddled with chiasmus (“Apologise, pull out his eyes, pull out his eyes, Apologise.”). Even the structure of the novel itself is chiasmic: parts two and four both end with images of women; parts three and five end with men. In the center of the middle chapter lies the passage: “The preacher took a chainless watch from a pocket within his soutane and, having considered its dial for a moment in silence, placed it silently before him on the table.”


and mother but Uncle Charles was older than Dante” “The Vances lived in number seven. They had a different father and mother. They were Eileen’s father and mother. ” “When I grow up I’ll be a father, when she grows up, she’ll be a mother.” “When he was grown up, he was going to marry Eileen” Why does Joyce highlight this relationship-oriented development?

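Translation

An Introduction to Database Management Systems
Raghu Ramakrishnan

A database (sometimes spelled "data base"), also called an electronic database, is a collection of data or information specially organized so that a computer can search and retrieve it quickly. Databases are structured so that, with the support of various data-processing commands, the storage, retrieval, modification, and deletion of data are simplified. A database can be stored on magnetic disk, magnetic tape, optical disk, or another secondary storage device.

A database consists of a file or a set of files. The information in these files can be broken down into records, each of which contains one or more fields. The field is the basic unit of data access. A database describes entities, and a field usually holds information about one attribute of an entity. Using keywords and various sorting commands, users can query, rearrange, group, or select the fields of many records in order to retrieve a particular class of data or to generate reports.

All but the simplest databases contain complex data relationships and linkages. The system software package that handles the complex tasks of creating, accessing, and maintaining database records is called a database management system (DBMS). The programs in a DBMS package establish an interface between the database and its users. (These users may be application programmers, managers and others who need information, and various operating system programs.)

A DBMS can organize, process, and present data elements selected from the database. This capability lets decision makers search, probe, and query the contents of the database to answer nonrecurring, unplanned questions that regular reports do not cover. The questions may at first be vague and/or poorly defined, but people can browse the database until they have the information they need. In short, the DBMS "manages" the stored data items and assembles the needed items from the common database to answer the queries of nonprogrammers.

A DBMS has three main parts: (1) a storage subsystem, which stores and retrieves data in files; (2) a modeling and manipulation subsystem, which provides the means to organize data and to add, delete, maintain, and update it; and (3) an interface between users and the DBMS.

Several important trends are increasing the value and usefulness of database management systems:
1. Managers need up-to-date information to make effective decisions.
2. Customers need increasingly sophisticated information services and more current information about their orders, invoices, and accounts.
3. Users find that with database systems they can develop custom applications in a fraction of the time that traditional programming languages require.
4. Businesses have discovered the strategic value of information and use database systems to stay ahead of their competitors.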

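Database Models

A database model describes a way to structure and manipulate the data in a database. The structural part of the model specifies how data are represented (for example as trees or tables); the manipulative part specifies the operations for adding, deleting, displaying, maintaining, printing, searching, selecting, sorting, and updating data.

Hierarchical Model

The first database management systems used the hierarchical model, that is, they arranged data records into a tree structure. Some records are roots, and every other record has exactly one parent record. The tree is designed to reflect the order in which the data are used: the record at the root is accessed first, then the records beneath the root, and so on.

The hierarchical model was developed because hierarchical relationships are common in business applications. A familiar example is an organization chart: senior managers are at the top, middle managers at lower levels, and operational employees at the bottom. Note that in a strict hierarchy each level of management may have many employees or levels of employees beneath it, but each employee has only one manager. Hierarchical data are characterized by this one-to-many relationship among data.

In the hierarchical approach, every relationship is explicitly defined when the database is created. Each record in a hierarchical database can contain only one key field, and only one relationship is allowed between any two fields. Because data do not always follow such a strict hierarchy, this can cause problems.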
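The tree structure described for the hierarchical model can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Record` class, the `access_order` function, and the sample organization chart are hypothetical names invented here, not part of any particular hierarchical DBMS.

```python
from collections import deque


class Record:
    """A record in a hierarchical database (hypothetical sketch)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # exactly one parent; None only for the root
        self.children = []        # one-to-many: a parent may have many children
        if parent is not None:
            parent.children.append(self)


def access_order(root):
    """Return record names root first, then one level below, and so on."""
    order, queue = [], deque([root])
    while queue:
        record = queue.popleft()
        order.append(record.name)
        queue.extend(record.children)
    return order


# A tiny organization chart: every employee has exactly one manager.
top = Record("top management")
mid_a = Record("middle manager A", parent=top)
mid_b = Record("middle manager B", parent=top)
Record("employee 1", parent=mid_a)
Record("employee 2", parent=mid_b)

print(access_order(top))
# ['top management', 'middle manager A', 'middle manager B', 'employee 1', 'employee 2']
```

Because each record stores a single `parent`, a relationship in which one employee reports to two managers cannot be represented, which is the kind of limitation the text points out.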

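Relational Model

A major breakthrough in database research came in 1970, when E. F. Codd proposed a fundamentally different approach to database management that uses tables as its data structure, called the relational model.

The relational database is the most widely used database structure. Data are organized into related tables, and each table is made up of rows, called records, and columns, called fields. Each record contains the field values for one specific item. For example, in a table of employee information, a record contains values for fields such as a person's name and address.

Structured Query Language (SQL) is a query language for manipulating data in a relational database. It is nonprocedural, or declarative: the user need only give an English-like description that specifies the operation and the record or combination of records concerned. A query optimizer translates this description into a procedure that carries out the database operation.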
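To make the idea of records, fields, and a declarative query concrete, here is a minimal sketch using Python's standard `sqlite3` module with an in-memory database. The `employee` table, its `name` and `address` columns, and the sample rows are assumptions made up for illustration.

```python
import sqlite3

# In-memory relational database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# One table: each row is a record, each column is a field.
conn.execute("CREATE TABLE employee (name TEXT, address TEXT)")
conn.executemany(
    "INSERT INTO employee (name, address) VALUES (?, ?)",
    [
        ("Lee", "12 Elm Street"),
        ("Garcia", "4 Oak Avenue"),
        ("Chen", "9 Pine Road"),
    ],
)

# The query states WHAT is wanted; the engine decides HOW to find it.
rows = conn.execute(
    "SELECT name, address FROM employee WHERE name = ?", ("Lee",)
).fetchall()
conn.close()

print(rows)  # [('Lee', '12 Elm Street')]
```

The `SELECT` statement never says how to scan the table or in what order to compare rows; choosing that procedure is the job the text assigns to the query optimizer.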

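Network Model

The network model creates relationships among data through a linked-list structure in which a subordinate record can be linked to more than one parent record. The links that tie records together are called pointers; a pointer is a storage address that indicates where a record is stored. With the network approach, a subordinate record can be linked to a key record and can at the same time itself serve as a key record linked to other sets of subordinate records.

Historically the network model had a performance advantage over other models, but today that advantage matters only in high-volume, high-speed processing such as automatic teller machine networks and airline reservation systems.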
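The pointer-based linking of the network model can be sketched as follows. This is a hypothetical illustration: Python object references stand in for storage addresses, and the `Record` class, the `link` helper, and the supplier/part/shipment data are invented for the example.

```python
class Record:
    """A record in a network database (hypothetical sketch)."""

    def __init__(self, name):
        self.name = name
        self.owners = []    # pointers to parent records (may be more than one)
        self.members = []   # pointers to subordinate records


def link(owner, member):
    """Connect a parent record and a subordinate record in both directions."""
    owner.members.append(member)
    member.owners.append(owner)


supplier_a = Record("supplier A")
supplier_b = Record("supplier B")
part = Record("part 17")
shipment = Record("shipment 1")

link(supplier_a, part)   # the part is subordinate to two different parents
link(supplier_b, part)
link(part, shipment)     # and is itself the parent of another record

print([o.name for o in part.owners])    # ['supplier A', 'supplier B']
print([m.name for m in part.members])   # ['shipment 1']
```

Unlike a strict hierarchy, `part` here has two parents while also acting as a parent itself, which matches the description of a subordinate record that is simultaneously a key record.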

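Both hierarchical and network databases are application specific. If a new application is developed, keeping the databases of different applications consistent can be very difficult. For example, suppose a pension application is developed that needs to access employee data that the payroll application also accesses. The data are the same, yet a new database must still be created.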

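Object Model

The newest approach to database management uses the object model, in which records are represented by entities called objects that can both store data and provide methods or procedures to perform specific tasks.

The query language used with the object model is the same object-oriented programming language used to develop the database application. This can cause problems, because there is no simple, uniform query language such as SQL. The object model is relatively new, and only a few examples of object-oriented databases exist. It has attracted attention because developers who choose an object-oriented programming language want a database built on an object model.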

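Distributed Databases

Similarly, a distributed database is one whose parts are stored on physically separate computers. One goal of a distributed database is to let users access information without regard to where it is stored. Note that once users and data are separated, communication and networking begin to play an important role.

Distributed databases require software that resides partly on the mainframe. This software bridges the mainframe and personal computers and resolves incompatible data formats. Ideally, the databases on the mainframe appear as one large information warehouse, while most of the processing is done on personal computers.

One drawback of distributed database systems is that they are often based on a mainframe-centric model, in which the mainframe acts as the master and terminals and personal computers act as slaves. This approach does have advantages: because the database is under centralized control, the data integrity and security problems mentioned earlier are resolved. Today's personal computers, departmental computers, and distributed processing, however, require computers and applications to communicate with one another on an equal, or peer-to-peer, basis. For databases, the client/server model provides the framework for distribution.

One way to take advantage of interconnected computers running database applications is to divide the application into mutually independent parts. A client is an end user or a computer program that requests resources over a network; a server is a computer running software that fulfills those requests. When the requested resources are data in a database, the client/server model provides the framework for a distributed database.

A file server is software that provides access to files over a network; a dedicated file server is a single computer designated as a file server. This is useful, for example, when files are large and need fast access, in which case a minicomputer or mainframe is used as the file server. A distributed file server spreads the files across different computers instead of storing them centrally on a dedicated file server.

The latter kind of file server can store and retrieve files on other computers and eliminates duplicate files on each computer. A major drawback, however, is that every read and write request has to travel across the network, and problems can arise when files are updated. Suppose one user requests an item of data from a file and modifies it while another user requests the same item and modifies it as well. The solution to this problem is called data locking: the first request makes the other requests wait until it has completed. Other users can read the data but cannot change it.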
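The locking behavior described above can be sketched with Python's standard `threading` module. This is a simplified single-process analogy, not how a networked file server actually implements locks; the `LockedRecord` class and the `deposit` function are hypothetical names introduced for illustration.

```python
import threading


class LockedRecord:
    """A shared record whose updates are serialized by a lock (sketch)."""

    def __init__(self, value):
        self._value = value
        self._write_lock = threading.Lock()

    def read(self):
        # Readers are not blocked; they simply see the current value.
        return self._value

    def update(self, change):
        # A second writer waits here until the first writer has finished.
        with self._write_lock:
            self._value = change(self._value)


balance = LockedRecord(100)


def deposit():
    """Simulate one user applying 1000 small updates to the same record."""
    for _ in range(1000):
        balance.update(lambda v: v + 1)


users = [threading.Thread(target=deposit) for _ in range(2)]
for user in users:
    user.start()
for user in users:
    user.join()

print(balance.read())  # 2100
```

Because each read-modify-write happens while the lock is held, neither user's changes overwrite the other's, so all 2000 increments are applied to the starting value of 100.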

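A database server is software that services database requests over a network. For example, suppose a user types a data query on a personal computer. If the application is designed on the client/server model, the query language component on the personal computer sends the query over the network to the database server and asks to be notified when the data are found.

There are also many examples of distributed databases in engineering. Sun's Network File System (NFS), for example, is used in computer-aided engineering applications to distribute data among the hard disks of a network of Sun workstations.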

English original

An Introduction to Database Management System
Raghu Ramakrishnan

A database (sometimes spelled data base), also called an electronic database, is any collection of data or information that is specially organized for rapid search and retrieval by a computer. Databases are structured to facilitate the storage, retrieval, modification, and deletion of data in conjunction with various data-processing operations. Databases can be stored on magnetic disk or tape, optical disk, or some other secondary storage device.

A database consists of a file or a set of files. The information in these files may be broken down into records, each of which consists of one or more fields. Fields are the basic units of data storage, and each field typically contains information pertaining to one aspect or attribute of the entity described by the database. Using keywords and various sorting commands, users can rapidly search, rearrange, group, and select the fields in many records to retrieve or create reports on particular aggregates of data.

Complex data relationships and linkages may be found in all but the simplest databases. The system software package that handles the difficult tasks associated with creating, accessing, and maintaining database records is called a database management system (DBMS). The programs in a DBMS package establish an interface between the database itself and the users of the database. (These users may be applications programmers, managers and others with information needs, and various OS programs.)

A DBMS can organize, process, and present selected data elements from the database. This capability enables decision makers to search, probe, and query database contents in order to extract answers to nonrecurring and unplanned questions that aren’t available in regular reports. These questions might initially be vague and/or poorly defined, but people can “browse” through the database until they have the needed information.
In short, the DBMS will “manage” the stored data items and assemble the needed items from the common database in response to the queries of those who aren’t programmers.

A database management system (DBMS) is composed of three major parts: (1) a storage subsystem that stores and retrieves data in files; (2) a modeling and manipulation subsystem that provides the means with which to organize the data and to add, delete, maintain, and update the data; and (3) an interface between the DBMS and its users.

Several major trends are emerging that enhance the value and usefulness of database management systems:
1. Managers, who require more up-to-date information to make effective decisions.
2. Customers, who demand increasingly sophisticated information services and more current information about the status of their orders, invoices, and accounts.
3. Users, who find that they can develop custom applications with database systems in a fraction of the time it takes to use traditional programming languages.
4. Organizations that discover information has a strategic value; they utilize their database systems to gain an edge over their competitors.

The Database Model
A data model describes a way to structure and manipulate the data in a database. The structural part of the model specifies how data should be represented (such as trees, tables, and so on). The manipulative part of the model specifies the operations with which to add, delete, display, maintain, print, search, select, sort, and update the data.

Hierarchical Model
The first database management systems used a hierarchical model; that is, they arranged records into a tree structure. Some records are root records and all others have unique parent records.
The structure of the tree is designed to reflect the order in which the data will be used; that is, the record at the root of a tree will be accessed first, then records one level below the root, and so on.

The hierarchical model was developed because hierarchical relationships are commonly found in business applications. As you may know, an organization chart often describes a hierarchical relationship: top management is at the highest level, middle management at lower levels, and operational employees at the lowest levels. Note that within a strict hierarchy, each level of management may have many employees or levels of employees beneath it, but each employee has only one manager. Hierarchical data are characterized by this one-to-many relationship among data.

In the hierarchical approach, each relationship must be explicitly defined when the database is created. Each record in a hierarchical database can contain only one key field, and only one relationship is allowed between any two fields. This can create a problem because data do not always conform to such a strict hierarchy.

Relational Model
A major breakthrough in database research occurred in 1970 when E. F. Codd proposed a fundamentally different approach to database management called the relational model, which uses a table as its data structure.

The relational database is the most widely used database structure. Data are organized into related tables. Each table is made up of rows, called records, and columns, called fields. Each record contains fields of data about some specific item. For example, in a table containing information on employees, a record would contain fields of data such as a person’s last name, first name, and street address.

Structured query language (SQL) is a query language for manipulating data in a relational database. It is nonprocedural, or declarative: the user need only give an English-like description that specifies the operation and the record or combination of records concerned.
A query optimizer translates the description into a procedure to perform the database manipulation.

Network Model
The network model creates relationships among data through a linked-list structure in which subordinate records can be linked to more than one parent record. This approach combines records with links, which are called pointers. The pointers are addresses that indicate the location of a record. With the network approach, a subordinate record can be linked to a key record and at the same time itself be a key record linked to other sets of subordinate records. The network model historically has had a performance advantage over other database models. Today, such performance characteristics are important only in high-volume, high-speed transaction processing such as automatic teller machine networks or airline reservation systems.

Both hierarchical and network databases are application specific. If a new application is developed, maintaining the consistency of databases in different applications can be very difficult. For example, suppose a new pension application is developed that needs employee data also used by the payroll application. The data are the same, but a new database must be created.

Object Model
The newest approach to database management uses an object model, in which records are represented by entities called objects that can both store data and provide methods or procedures to perform specific tasks.

The query language used for the object model is the same object-oriented programming language used to develop the database application. This can create problems because there is no simple, uniform query language such as SQL. The object model is relatively new, and only a few examples of object-oriented databases exist. It has attracted attention because developers who choose an object-oriented programming language want a database based on an object-oriented model.

Distributed Database
Similarly, a distributed database is one in which different parts of the database reside on physically separated computers.
One goal of distributed databases is the access of information without regard to where the data might be stored. Keep in mind that once the users and their data are separated, communication and networking concepts come into play.

Distributed databases require software that resides partially in the larger computer. This software bridges the gap between personal and large computers and resolves the problems of incompatible data formats. Ideally, it would make the mainframe databases appear to be large libraries of information, with most of the processing accomplished on the personal computer.

A drawback to some distributed systems is that they are often based on what is called a mainframe-centric model, in which the larger host computer is seen as the master and the terminal or personal computer is seen as a slave. There are some advantages to this approach. With databases under centralized control, many of the problems of data integrity that we mentioned earlier are solved. But today’s personal computers, departmental computers, and distributed processing require computers and their applications to communicate with each other on a more equal or peer-to-peer basis. In a database, the client/server model provides the framework for distributing databases.

One way to take advantage of many connected computers running database applications is to distribute the application into cooperating parts that are independent of one another. A client is an end user or computer program that requests resources across a network. A server is a computer running software that fulfills those requests across a network. When the resources are data in a database, the client/server model provides the framework for distributing databases.

A file server is software that provides access to files across a network. A dedicated file server is a single computer dedicated to being a file server. This is useful, for example, if the files are large and require fast access.
In such cases, a minicomputer or mainframe would be used as a file server. A distributed file server spreads the files around on individual computers instead of placing them on one dedicated computer.

Advantages of the latter server include the ability to store and retrieve files on other computers and the elimination of duplicate files on each computer. A major disadvantage, however, is that individual read/write requests are moved across the network, and problems can arise when updating files. Suppose a user requests a record from a file and changes it while another user requests the same record and changes it too. The solution to this problem is called record locking, which means that the first request makes other requests wait until the first request is satisfied. Other users may be able to read the record, but they will not be able to change it.

A database server is software that services requests to a database across a network. For example, suppose a user types in a query for data on his or her personal computer. If the application is designed with the client/server model in mind, the query language part on the personal computer simply sends the query across the network to the database server and requests to be notified when the data are found.

Examples of distributed database systems can be found in the engineering world. Sun’s Network File System (NFS), for example, is used in computer-aided engineering applications to distribute data among the hard disks in a network of Sun workstations.
