Accurate Dependability Analysis of CAN-based Networked Systems
Whole-Process Quality Control Measures for Economic Census Data

When conducting an economic census, assuring data quality is paramount to producing precise and dependable results. A comprehensive approach to overseeing data quality throughout the entire process is indispensable for mitigating errors and bias. This requires implementing control measures at every stage of the census, from data collection through processing and analysis. By adhering to a rigorous quality control plan, organizations can instill confidence in the integrity of their economic census data.

The first step in ensuring good-quality economic census data is to establish clear and consistent methods for collecting it. That means giving staff proper training so they understand how important it is to obtain accurate and complete information. Standard forms and rules help cut down on mistakes and inconsistencies in how the data is collected, and digital tools can make collection easier and quicker. Data collection should also be monitored continuously, with regular checks to fix any problems that come up.
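As an illustration of such standard rules, the short sketch below shows how simple, rule-based validation checks might be applied to collected census records. It is a minimal example only; the field names and the rules themselves are hypothetical, not taken from any actual census form.

# Minimal sketch of rule-based validation for census records.
# All field names and thresholds are hypothetical, for illustration only.

def validate_record(record):
    """Return a list of problems found in one establishment record."""
    problems = []
    if not record.get("establishment_id"):
        problems.append("missing establishment_id")
    employees = record.get("employees")
    if employees is None or employees < 0:
        problems.append("employee count missing or negative")
    revenue = record.get("annual_revenue")
    if revenue is not None and revenue < 0:
        problems.append("negative revenue")
    # Cross-field consistency check: payroll should not exceed revenue
    payroll = record.get("payroll")
    if payroll is not None and revenue is not None and payroll > revenue:
        problems.append("payroll exceeds revenue")
    return problems

records = [
    {"establishment_id": "E001", "employees": 12, "annual_revenue": 500000, "payroll": 300000},
    {"establishment_id": "", "employees": -3, "annual_revenue": 80000, "payroll": 100000},
]
for r in records:
    print(r.get("establishment_id") or "<no id>", validate_record(r))

Checks like these would typically run as data arrives, so that problems can be sent back to field staff for correction while re-collection is still feasible.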
2007: Common-Cause Failures and Imperfect Coverage

IEEE Transactions on Reliability, vol. 56, no. 1, March 2007

Reliability Evaluation of Phased-Mission Systems With Imperfect Fault Coverage and Common-Cause Failures

Liudong Xing, Member, IEEE

Abstract: This paper proposes efficient methods to assess the reliability of phased-mission systems (PMS) considering both imperfect fault coverage (IPC) and common-cause failures (CCF). The IPC introduces multimode failures that must be considered in the accurate reliability analysis of PMS. Another difficulty in analysis is to allow for multiple CCF that can affect different subsets of system components, and which can occur s-dependently. Our methodology for resolving the above difficulties is to separate the consideration of both IPC and CCF from the combinatorics of the binary decision diagram-based solution, and to adjust the input and output of the program to generate the reliability of PMS with IPC and CCF. According to the separation order, two equivalent approaches are developed. The applications and advantages of the approaches are illustrated through examples. PMS without IPC and/or CCF appear as special cases of the approaches.

Index Terms: Binary decision diagram, common-cause failure, fault tree, imperfect fault coverage, phased-mission reliability.

Acronyms (singular and plural of an acronym are always spelled the same):
- ACS: Attitude Control System
- BDD: Binary Decision Diagram
- CC: Common Cause
- CCE: CC Event
- CCF: CC Failure
- CCG: CC Group
- CDS: Command and Data handling System
- GPMS-CPR: a PMS approach considering IPC [1]
- HGA: High-Gain Antenna
- ite: if-then-else representation
- IPC: ImPerfect fault Coverage
- IPCM: IPC Model (Fig. 4)
- MBDD: Multi-state BDD
- MDD: Multiple-valued Decision Diagram
- MOI: Mars Orbit Insertion
- OS: Orbiting Sample
- PMS: Phased-Mission System
- PMS-BDD: a BDD-based PMS approach [2]
- RAN: Rendezvous And Navigation
- SA: Solar Array

Notation (definitions; the symbols were lost in extraction):
- number of phases in the PMS
- set of all components in all phases of the PMS
- number of CC in a given phase of the PMS
- total number of CC in the PMS
- duration of a given phase of the PMS
- an elementary CC occurring in a given phase (all CC in a phase are indexed by subscript)
- a CCE defined over the CC; the CCE space
- a subset containing all components that are affected by a CCE, i.e., a group of components in a phase caused to fail by the same common cause
- a subset containing all components in the above set, plus all those components in all later phases of the PMS
- a component in the PMS
- state indicator variable of a component in a given phase
- modified failure function of a component in a given phase, used in the PMS-BDD evaluation
- event: one or more components fail uncovered
- event: no component fails uncovered
- event: the PMS fails given that one of the above occurs
- unreliability of the PMS ignoring IPC
- Pr(transient restoration) in the IPCM
- Pr(permanent coverage) in the IPCM
- Pr(single point failure) in the IPCM
- unreliability of the PMS
- unreliability of a reduced PMS

(Manuscript received June 26, 2006; revised July 25, 2006; accepted August 3, 2006. Associate Editor: J. P. Kharoufeh. The author is with the Department of Electrical and Computer Engineering, University of Massachusetts, Dartmouth, MA 02747 USA. Digital Object Identifier 10.1109/TR.2006.890900)

I. INTRODUCTION

The operation of missions encountered in aerospace, nuclear power, and many other applications often involves several different tasks or phases that must be accomplished in sequence. Systems used in these missions are usually called phased-mission systems (PMS).
A classic example is an aircraft flight that involves take-off, ascent, level-flight, descent, and landing phases. During each mission phase, the system has to accomplish a specified task, and may be subject to different stresses as well as different dependability requirements. Thus, system configuration, success criteria, and component failure behavior may change from phase to phase [1]. This dynamic behavior usually requires a distinct model for each phase of the mission in the reliability analysis. Further complicating the analysis are s-dependencies across the phases for a given component. For example, the state of a component at the beginning of a new phase is identical to its state at the end of the previous phase [2]. The consideration of these dynamics and dependencies poses unique challenges to existing analysis methods. Considerable research effort has been expended on the reliability analysis of PMS; efficient combinatorial methods (see, e.g., [1]-[4]), state-space-oriented approaches based on Markov chains and/or Petri nets (see, e.g., [5]-[8]), as well as a modular solution based on binary decision diagrams (BDD) and Markov chains [9] have been developed.

PMS, especially those devoted to safety-critical applications such as aerospace and nuclear power, are typically designed with sufficient redundancies and automatic recovery mechanisms to be tolerant of errors that may occur. However, the recovery mechanisms can fail, such that the system cannot adequately detect, locate, and recover from a fault occurring in the system. This uncovered fault can propagate through the system, and may lead to an overall system failure, despite the presence of sufficient redundancies. This phenomenon is known as imperfect fault coverage (IPC) [10]. IPC introduces additional failure modes that must be considered for accurate reliability analysis of fault-tolerant PMS. Specifically, the analysis must allow for two failure modes, covered failure and uncovered failure, as well as an operation mode, rather than the traditional binary designation of failure and operation.

At this point, it is relevant to mention that recently more attention has been focused on the study of multi-state systems, in which both the system and its components exhibit multiple performance levels varying from perfect operation to complete failure [11], [12]. Systems with multimode failures (for example, caused by the IPC described above) are also widely referred to as multi-state systems [13]. Various approaches have been proposed for the analysis of multi-state systems; examples include the universal moment generating function-based methods [14], [15], and the BDD-based methods [1], [13], [16]. Xing & Dugan also proposed efficient approaches based on multi-state BDD (MBDD) [17] and multiple-valued decision diagrams (MDD) [18] for the reliability analysis of PMS with multimode failures. It is an interesting direction for future research to study general multi-state PMS with multiple performance levels using MBDD or MDD methods.

The analysis becomes even more complicated when considering that components can be subject to common-cause failures (CCF) during any phase of the mission. CCF are multiple dependent component failures within a system that are a direct result of a common cause (CC) [19], such as extreme environmental conditions or human errors.
It has been shown by many studies that the presence of CCF tends to increase a system's joint failure probabilities, and thus contributes significantly to the overall unreliability of a system subject to CCF [19]. Therefore, it is crucial that CCF be modeled and analyzed appropriately. Considerable research effort has been expended on the study of CCF for system reliability analysis; refer to [20] for a discussion of various approaches, their contributions, and their limitations concerning the analysis of non-PMS. We find many of the same limitations in the CCF models developed for the reliability analysis of PMS in the literature; for example, the method presented in [21] allows at most one CC to affect a component, and CCF to exist only among s-identical components. In this paper, we seek to address the limitations of existing CCF models by using a more general CCF model which allows for multiple CC that can affect different subsets of system components, and which can occur s-dependently.

Because failure to consider either IPC or CCF results in overestimated system reliability, it is important to incorporate both IPC and CCF into the system reliability analysis. A great deal of work has been done to study IPC (see, e.g., [1], [10], [17], [22], [23]) and CCF (see, e.g., [19], [21], [24]-[27]) separately. To the best of our knowledge, little work [28], [29] has considered both IPC and CCF in solving reliability problems, and only for non-PMS. Moreover, the existing work [28], [29] shares a restrictive assumption that a single elementary CC affects all the system components. In this paper, we relax this restriction by employing our general CCF model described in Section II-B, and we present efficient, separable BDD-based approaches for analyzing the reliability of PMS with both IPC and CCF.

The remainder of the paper is organized as follows. Section II presents an overview of the problem to be solved, as well as assumptions. Section III depicts an illustrative example PMS subject to IPC and CCF. Section IV presents a separable approach to the reliability analysis of PMS with both IPC and CCF. Section V illustrates the approach through the step-by-step analysis of the example PMS. Section VI presents an alternative approach, and a proof of the equivalence of the two approaches. In the last section, we present our conclusions, as well as directions for future work.

II. PROBLEM STATEMENT

This paper considers the problem of evaluating the reliability of fault-tolerant PMS subject to both IPC and CCF. To help make tangible the type of systems for which the proposed approaches are meant, as well as the analytical challenges we address in this paper, we present an example of a PMS subject to IPC and CCF (adapted from [20]). Fig. 1 shows the high-level fault tree model of a proposed Mars orbiter mission system, which involves Launch, Cruise, Mars Orbit Insertion (MOI), Commissioning, and Orbit phases. Each mission phase is characterized by at least one major event in which the mission failure can occur. Examples of failure events for this example include:
- the launch event during the launch phase;
- the deployment of the solar arrays (SA) and high-gain antennas (HGA), and the configuration of the heaters after launch, which occur during the cruise phase;
- the propulsive capture into Mars' orbit during the MOI phase; and
- the release of an orbiting sample (OS), and the inclusion of a rendezvous and navigation (RAN) platform on the orbiter, which might induce additional failure modes during orbit [20].

Fig. 1. High-level fault tree model of the Mars orbiter mission system; the triangles are transfer gates to the fault tree for Subsystem F.
Fig. 2. Fault tree model of Subsystem F.

As shown in Fig. 2, Subsystem F of Fig. 1 is decomposed into the Telecom, Power, Propulsion, Command and Data handling System (CDS), Attitude Control System (ACS), and Thermal subsystems. Subsystem F has the same configuration for each mission phase. As described in [20], these subsystems are subject to CCF from two CC: a micrometeoroid attack that can result in the failure of the entire system, and a solar flare which fails the subsystem's electronics, most notably the CDS, in all pre-MOI phases.

The subsystems can also be subject to IPC. Consider the fault tree model of the CDS system in Fig. 3; it includes a hot standby system that consists of two s-identical subsystems: Side-A and Side-B. For example, Side-A is the primary subsystem, and Side-B is the standby subsystem that is automatically switched on upon failure of the primary subsystem Side-A. Under ideal circumstances, the CDS system functions correctly as long as one of the two subsystems, and the other components indicated in Fig. 3, operate correctly. However, in reality, the failure of the primary subsystem Side-A must be detected, and appropriately handled, before the standby subsystem Side-B can be used. In other words, an uncovered fault of any component in subsystem Side-A may lead to the failure of the CDS system, and thus the failure of the entire mission, despite the presence of adequate redundancies (i.e., subsystem Side-B).

Fig. 3. Fault tree model of CDS.

As described in the Introduction, the reliability analysis of PMS subject to IPC and CCF, like the Mars orbiter system, requires the consideration of dynamics and dependencies across different phases, including the multiple failure modes introduced by IPC, as well as the multiple dependent component failures caused by CCF. All of these pose a big challenge to the existing analysis methods. In this paper, we propose two equivalent separable approaches to the reliability analysis of PMS with both IPC and CCF as one way to meet the above challenge in an efficient, elegant manner. PMS without IPC and/or CCF will appear as special cases of our methods. The assumptions and inputs for the problem are listed in the following subsections.

A. General Assumptions on PMS

1) Component failures are s-independent within each phase. Dependencies arise among different phases, and among failure modes.
2) For at least one component, an uncovered fault causes the overall system failure, even in the presence of fault-tolerant mechanisms.
3) The conditional fault occurrence probability for each component for each phase is given either as a fixed probability for a specified mission time, or in terms of a lifetime distribution.
4) Phase durations are deterministic.
5) The system fails to achieve the mission if it fails during any one phase. That is, phase-OR requirements are assumed for this paper. Thus, the reliability of a PMS is the probability that the mission successfully achieves its objectives in all phases.
6) The system is not maintained during the mission; once a component transfers from the operation mode to a failure mode (covered or uncovered), it will remain in that failure mode for the rest of the mission time.
7) The system is coherent, which means that each component contributes to the system state, and the system state worsens (at least does not improve) with an increasing number of component failures [30].
8) The IPC behavior is described using Dugan et al.'s imperfect coverage model (IPCM) [10]. Within the context of the quantitative reliability analysis, it is required only to refer to the three exit probabilities, also known as coverage factors: r (transient restoration), c (permanent coverage), and s (single-point failure), where r + c + s = 1. Each of them is a conditional probability, conditioned on a fault occurring to a component, as defined in Fig. 4.

Fig. 4. General structure of the IPCM for a component.

B. CCF Model for PMS

1) A PMS can be subject to CCF due to different elementary CC occurring within a phase or in different phases. In general, the elementary CC existing in a PMS are listed phase by phase, with a given number of elementary CC involved in each phase; the total number of CC to which the PMS is subject is the sum of these per-phase counts.
2) CC are external to the system. Note that the failure of a component within the system may cause simultaneous failure of multiple other components within the system. Such CCF due to internal CC can be dealt with by the functional dependency gate in the dynamic fault trees approach [10].
3) Different CC, whether from the same phase or from different phases, can be mutually exclusive, s-independent, or s-dependent.
4) A component may be affected by multiple CC; that is, one single component can belong to more than one common-cause group (CCG). All components that fail due to the same elementary common cause constitute a common-cause group.

C. Problem Inputs

The following lists all the required input parameters for solving the problem.
1) Mission time.
2) Number of phases.
3) Duration of each phase.
4) Failure criteria for each phase, described using the fault tree model.
5) Failure parameters for each component in each phase, conditioned on the success of that component in the previous phase (refer to assumption 3 in Section II-A).
6) Fault coverage factors (r, c, s) for each component in each phase.
7) Statistical relationships between elementary CC: s-independent, s-dependent, or mutually exclusive.
8) Occurrence probabilities of elementary CC, or conditional occurrence probabilities of CC in case of two CC being s-dependent.

We recognize that analytical results are strongly influenced by the input parameters; therefore, realistic estimates of them are crucial. Fault injection (see, e.g., [31], [32]) is a commonly used technique for estimating values of component failure parameters and fault coverage factors. The occurrence probabilities of CC, and their statistical relationships, may be obtained from available data sources [24], [25]; for example, research into estimating parametric values using failure event data was presented in [26], [27]. In this paper, we consider the above listed parameters as given inputs of the problem. Follow-up research will include investigation of the sensitivity of results to changes in the input parameter values.
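To make the role of the coverage factors concrete, the following minimal Python sketch computes the three-mode state probabilities of a single component over one phase under the IPCM as described in assumption 8: operational, failed covered, and failed uncovered. The fault probability q and all numeric values are illustrative assumptions, not the paper's inputs.

import math

def phase_state_probs(q, r, c, s):
    """Three-mode state probabilities for one component over one phase.

    q       : probability that a fault occurs during the phase
    r, c, s : IPCM exit probabilities (transient restoration, permanent
              coverage, single-point failure), with r + c + s = 1.
    Returns (p_operational, p_covered, p_uncovered).
    """
    assert abs(r + c + s - 1.0) < 1e-9
    p_covered = q * c                 # fault occurred and was covered
    p_uncovered = q * s               # fault occurred and propagated uncovered
    p_operational = 1.0 - q + q * r   # no fault, or fault transparently restored
    return p_operational, p_covered, p_uncovered

# Example: exponential lifetime with rate lam over a phase of duration t
lam, t = 1e-4, 100.0
q = 1.0 - math.exp(-lam * t)
print(phase_state_probs(q, r=0.0, c=0.95, s=0.05))

The three probabilities sum to one, which is what allows the analysis to treat operation, covered failure, and uncovered failure as a complete set of component states.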
III. AN ILLUSTRATIVE EXAMPLE

To illustrate the applications and advantages of the proposed approaches, and to perform a comparative study with and without IPC/CCF, we consider an example PMS adapted from [1]. As shown in the system fault tree model (Fig. 5), the system consists of four types of components that are used in different system configurations over three consecutive phases:
1) two components needed for all phases; one of them must be functional during all three phases;
2) one component needed only for Phase 1 and Phase 2; it must be functional during these two phases;
3) two components that work during Phase 1 and Phase 3; both of them must be functional during Phase 1, and at least one of them must be functional during Phase 3;
4) three components that work during Phase 2 and Phase 3; all of them must be functional during Phase 2, and at least two of them must be functional during Phase 3.

Fig. 5. System configuration for the example PMS in the fault tree model.
Table I. Input parameters for the example PMS (failure parameters, coverage factors, and phase durations, as in [1]).

Note that some components may not be explicitly used in some phase, although they can still fail and contribute to the system failure. For example, a component may suffer an uncovered failure in Phase 3, and thus contribute to the system failure, even though its covered failure in Phase 3 does not affect the entire system operation. The failure parameter of each component for the duration of a phase is given as a fixed probability; or in terms of an exponential distribution with a constant failure rate; or in terms of a Weibull distribution defined with a scale parameter and a shape parameter. The coverage factors of each component are given as fixed probabilities. For the comparative study with and without the consideration of IPC/CCF, we use the same values for the input parameters (including failure parameters, coverage factors, and phase durations) as in [1] (also see Table I).

To demonstrate the effects of CCF on the reliability analysis of this example PMS, we propose that the system is subject to CCF from three CC: hurricanes during Phase 1, lightning strikes during Phase 2, and floods during Phase 3. A hurricane of sufficient intensity in Phase 1 would cause two particular components to fail (in terms of the state indicator variables of those components in Phase 1); serious lightning strikes in Phase 2 would cause three particular components to fail; and serious flooding in Phase 3 would cause two particular components to fail. We use the following probabilities in our example (they could be extracted from available weather information; see Section II-C): the probability of a hurricane occurring in Phase 1; the probability of a lightning strike occurring in Phase 2; and, because floods often occur in conjunction with hurricanes, the s-dependency between these two CC is defined by a set of conditional probabilities conditioned on the state of hurricanes in Phase 1 (occurred or not occurred), namely the probability that floods occur in Phase 3 given that hurricanes occurred in Phase 1, and the corresponding probability given that they did not. This hypothetical scenario about CCF in the example PMS is general, breaking the assumptions or restrictions usually found in the literature, which include each component belonging to at most one CCG, CC affecting s-identical components only, and all CC occurring s-independently.

IV. ANALYSIS OF PMS WITH IPC AND CCF

To analyze the reliability of a PMS with both IPC and CCF, we propose to decompose the original problem into a number of reduced PMS reliability problems that are freed from the concern about CCF, and then to separate the IPC from the combinatorics of the solution to each reduced problem based on the simple, efficient algorithm in [1], [22]. After this two-step separation process, the set of reduced reliability problems has to consider neither IPC nor CCF. Thus, existing PMS reliability analysis approaches that ignore both IPC and CCF, such as the PMS-BDD method [1], [2], can be directly applied to solve the reduced problems. Finally, the results of all the reduced problems are integrated to obtain the entire phased-mission reliability considering both IPC and CCF. In the following subsections, we present the details of these four steps: separating CCF, separating IPC, solving reduced problems using the PMS-BDD, and aggregating for final results. We summarize the whole approach at the end of this section.

A. Step 1: Separating CCF

Our methodology for incorporating CCF into the phased-mission analysis is to decompose a reliability problem with CCF into a number of reduced reliability problems based on the Total Probability Theorem [33]. The set of reduced problems need not consider CCF because the effects of CCF have been factored out. Finally, the results of all the reduced reliability problems are aggregated to obtain the reliability measure considering CCF. Specifically, according to the general CCF model described in Section II-B, there exist in total M elementary CC in a PMS. The CC partition the event space into 2^M disjoint subsets, each called a common-cause event (CCE). We build a space called the "CCE space" over this set of collectively exhaustive and mutually exclusive events that can occur in the PMS; because the CCE are disjoint and collectively exhaustive, their occurrence probabilities sum to one. Based on the CCE space and the Total Probability Theorem, the unreliability of a PMS with IPC and CCF can be evaluated as

U_PMS = sum over k = 1, ..., 2^M of Pr(CCE_k) * U_k,   (1)

where U_k is the conditional probability that the PMS fails given the occurrence of CCE_k. As we will show through the analysis of an example in Section V, Pr(CCE_k) can be obtained from the relationship between the elementary CC and the occurrence probabilities of the CC, which are given as input parameters. The evaluation of U_k is actually a reduced reliability problem in which the set of components affected by CCE_k does not appear. Specifically, in the system fault tree model, each basic event (denoting the failure of a component) that belongs to that set is replaced by the constant logic value '1' (True). After the replacement, a Boolean reduction can be applied to the PMS fault tree to generate a simpler fault tree in which none of the affected components appear. Most importantly, the evaluation of the reduced fault tree can proceed without further consideration of CCF; thereby, the overall solution complexity is reduced. In the following subsection, we present an efficient approach for further separating the effects of IPC from the combinatorics of the solution to each reduced problem U_k in (1).
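The following sketch illustrates Step 1 on the example of Section III: it enumerates the 2^3 = 8 CCE for the hurricane, lightning, and flood common causes, computes each Pr(CCE_k) from the marginal and conditional occurrence probabilities, and aggregates per (1). The numeric probabilities are hypothetical placeholders (the paper's actual table values were lost in extraction), and conditional_unreliability stands in for the solution of each reduced problem.

from itertools import product

# Hypothetical occurrence probabilities (placeholders, not the paper's values):
# hurricane in Phase 1, lightning in Phase 2, flood in Phase 3 conditioned
# on the hurricane state.
p_H = 0.01
p_L = 0.02
p_F_given_H = 0.30
p_F_given_noH = 0.001

def cce_probability(h, l, f):
    """Probability of one common-cause event; h, l, f are 0/1 indicators."""
    ph = p_H if h else 1 - p_H
    pl = p_L if l else 1 - p_L           # lightning is s-independent of the others
    pf_cond = p_F_given_H if h else p_F_given_noH
    pf = pf_cond if f else 1 - pf_cond   # floods are s-dependent on hurricanes
    return ph * pl * pf

def pms_unreliability(conditional_unreliability):
    """Eq. (1): sum over all 2^3 CCE of Pr(CCE_k) * Pr(PMS fails | CCE_k)."""
    total = 0.0
    for h, l, f in product((0, 1), repeat=3):
        total += cce_probability(h, l, f) * conditional_unreliability(h, l, f)
    return total

# Sanity check: the CCE space is collectively exhaustive and mutually exclusive
assert abs(sum(cce_probability(*e) for e in product((0, 1), repeat=3)) - 1.0) < 1e-12

# Placeholder conditional unreliabilities; in the paper each of these comes
# from solving the corresponding reduced PMS problem with PMS-BDD.
print(pms_unreliability(lambda h, l, f: 0.05 if (h or l or f) else 1e-3))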
B. Step 2: Separating IPC

Xing & Dugan [1] proposed a BDD-based algorithm called GPMS-CPR for incorporating IPC into the analysis of PMS without CCF. The methodology is to separate all component uncovered failures from the combinatorics of the solution, based on the simple, efficient algorithm in [1], [22]. We apply this method to the evaluation of each U_k in (1).
Consider two mutually exclusive and complete events: E_u (one or more components in the PMS fail uncovered) and its complement (no component in the PMS experiences an uncovered failure). According to the Total Probability Theorem, the unreliability of each reduced PMS can be written as

U_k = Pr(E_u) * Pr(PMS fails | E_u) + (1 - Pr(E_u)) * Pr(PMS fails | no uncovered failure).   (2)

Because an uncovered failure leads to the overall system failure, even in the presence of fault-tolerant mechanisms, the conditional probability Pr(PMS fails | E_u) in (2) simply equals 1. To calculate Pr(E_u), and thus its complement, in (2), we need to use (9)-(12) in [1] to find the occurrence probabilities of three mutually exclusive and complete events for each component in each phase: the component has not failed before the end of the phase; it has failed covered before the end of the phase; or it has failed uncovered before the end of the phase. Let C denote the set of all components in the PMS, and u_i the probability that component i fails uncovered during the mission (calculated using (9) in [1]); then

Pr(E_u) = 1 - product over i in C of (1 - u_i).   (3)

Note that the evaluation of Pr(E_u) in the different reduced problems must consider the uncovered failures of all components in the system, including those removed from a reduced problem because they are affected by the corresponding CCE. This is due to the fact that an uncovered failure leads to the overall mission failure, even in the presence of adequate redundancies. Therefore, all the Pr(E_u) are calculated by the same (3).

The term Pr(PMS fails | no uncovered failure) in (2) is the unreliability of the corresponding perfect-coverage system, ignoring both IPC and CCF. It should be evaluated given that no component experiences an uncovered failure, and that all components affected by the corresponding CCE are eliminated. Therefore, before calculating it, we need to modify the failure function of each component in each phase to a conditional failure probability, conditioned on there being no uncovered failure during the mission, using (13) in [1]. Using these modified component failure functions, we can calculate the perfect-coverage unreliability using any approach that ignores both IPC and CCF. In this research, we use an efficient approach called PMS-BDD [2] for this purpose, and we brief the basics of PMS-BDD in the next subsection.
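A minimal sketch of the Step 2 computation follows, under the assumption that the per-component mission-long uncovered-failure probabilities u_i have already been obtained via (9) in [1]; the numbers are illustrative only.

def prob_any_uncovered(u_list):
    """Eq. (3): probability that at least one component fails uncovered
    during the mission; u_list[i] is component i's mission-long
    uncovered-failure probability."""
    p_none = 1.0
    for u in u_list:
        p_none *= 1.0 - u
    return 1.0 - p_none

def unreliability_with_ipc(u_list, u_perfect_coverage):
    """Eq. (2) with Pr(fail | E_u) = 1:
    U = Pr(E_u) + (1 - Pr(E_u)) * Pr(fail | no uncovered failure)."""
    p_u = prob_any_uncovered(u_list)
    return p_u + (1.0 - p_u) * u_perfect_coverage

# Illustrative values: three components' uncovered-failure probabilities,
# and a perfect-coverage unreliability obtained from a reduced problem.
print(unreliability_with_ipc([1e-4, 5e-5, 2e-4], u_perfect_coverage=1e-3))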
C. Step 3: Solving Reduced Problems Using PMS-BDD

Reference [2] presented a five-step algorithm based on BDD for the reliability analysis of PMS without IPC and CCF. The algorithm uses phase algebra [4] to combine each single-phase BDD to obtain a final PMS BDD. The PMS unreliability can be obtained through a recursive evaluation of the final PMS BDD. For the 1-edges or 0-edges linking variables of different components, the evaluation method is the same as for an ordinary BDD; but for the 1-edges linking variables of the same component in different phases, a special treatment is needed, due to the s-dependence between the phases of that component.

Fig. 6. A PMS BDD branch used in the recursive algorithm.

The following summarizes the recursive algorithm for calculating the unreliability from the final PMS BDD. Consider a PMS BDD branch as in Fig. 6, in the if-then-else (ite) representation of [2]. The 1-edge of each node (representing a component in the PMS) is associated with the failure function of the component; the 0-edge is associated with the operation function of the component. The unreliability with respect to the sub-PMS BDD rooted at a node is evaluated recursively:
- If two linked nodes belong to the same component, the 1-edge combination must use the phase-algebra treatment of [2] to account for the s-dependence between the component's phases.
- If they belong to different components, the node's unreliability is its failure function times the unreliability of its 1-edge child, plus its operation function times the unreliability of its 0-edge child.
- The exit conditions of the recursion are the terminal nodes: the unreliability of terminal 1 is 1, and the unreliability of terminal 0 is 0.
Starting the recursion at the root node of the entire PMS BDD yields the unreliability of the entire PMS. Note that, as discussed in Section IV-B, to consider IPC, the modified failure function ((13) in [1]) must be used for computing the node failure functions.

The nature of the BDD ensures that an automatic cancellation of components from earlier phases can be done without additional operations [2]. Thus, there is a considerable reduction in computing and storage requirements over earlier approaches like the EZ approach [3], and over Markov-chain based approaches (see, e.g., [6]) for the PMS analysis. Therefore, we select this PMS-BDD algorithm for computing the perfect-coverage term in (2) in this research.
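The sketch below implements the recursive evaluation for the simple case in which consecutive BDD nodes belong to different components; the same-component case would additionally need the phase-algebra treatment of [2], which is omitted here. The node structure and failure probabilities are illustrative assumptions, not the paper's implementation.

class Node:
    """A BDD node: component, phase, 1-edge child (fail), 0-edge child (operate).
    Terminals are the integers 1 (system failed) and 0 (system operational)."""
    def __init__(self, component, phase, high, low):
        self.component, self.phase, self.high, self.low = component, phase, high, low

def unreliability(node, failure_prob):
    """Recursive PMS BDD evaluation, restricted to s-independent (different-
    component) links; failure_prob(comp, phase) should return the modified
    (conditional) failure function of Section IV-B."""
    if node == 1:
        return 1.0
    if node == 0:
        return 0.0
    f = failure_prob(node.component, node.phase)
    return (f * unreliability(node.high, failure_prob)
            + (1.0 - f) * unreliability(node.low, failure_prob))

# Tiny example: the system fails if A fails in phase 1, or A survives
# but B fails in phase 1.
bdd = Node("A", 1, 1, Node("B", 1, 1, 0))
print(unreliability(bdd, lambda comp, ph: {"A": 0.01, "B": 0.02}[comp]))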
D. Step 4: Aggregating for Final Results

After obtaining all the perfect-coverage terms in (2) using the PMS-BDD approach, we integrate each of them with the uncovered-failure probability using (2) to obtain each conditional unreliability U_k. We then aggregate those results with the CCE occurrence probabilities using (1) to obtain the final unreliability of the PMS subject to both IPC and CCF.

E. Approach Summary

The approach first separates CCF, and then IPC, as summarized in (4); Fig. 7 shows a conceptual overview of the separable approach for analyzing the unreliability of PMS with both IPC and CCF presented in the preceding subsections. As described in Sections IV-A and IV-B, the two-step separation process for considering IPC and CCF is simple, and the aggregation steps of (1) and (2) can be implemented in constant time. After the separation of IPC and CCF from the solution combinatorics, we obtain reduced reliability problems, each of which can be solved using the PMS-BDD method. It has been shown that, in most cases, the BDD-based methods require less memory, and are more efficient in reliability evaluation, than other traditional methods [1], [2], [10], [13], [23]; refer to [34] for the computational and storage complexity analysis of an efficient implementation of the BDD method. Also, given that CCF rarely occur (i.e., systems are usually subject to a very small number of elementary CC), and considering the parallel processing capability of modern computing systems, even though a number of reduced problems are involved in our approach, the overall solution complexity is still low. In addition, as we will show through the example in Section V, most PMS fault trees after the reduction (i.e., the reduced problems) are trivial to solve. Another advantage offered by our approach is that it allows reliability engineers to use their favorite software package that ignores both IPC and CCF for computing PMS reliability, and to adjust the input and output of the program to produce the PMS reliability considering both IPC and CCF. Based on this proposed approach, we present a concrete step-by-step analysis of the example PMS (Section III) in the next section.

Fig. 7. First separate CCF, and then IPC.

V. EXAMPLE ANALYSIS AND RESULTS

We present the analysis of the example PMS with both IPC and CCF (Section III) step by step in the following subsections.

A. Step 1: Separating CCF

According to the decomposition and aggregation approach to considering CCF (Section IV-A), we first build a CCE space for the example PMS. Because there are three elementary common causes (hurricanes, lightning strikes, and floods), the CCE space is composed of 2^3 = 8 CCE. Each CCE is a distinct, disjoint combination of elementary CC, as defined in the first column of Table II. As briefly defined in Section IV-A, each CCE has an associated set of components that are the only ones affected by that event; in other words, the occurrence of the CCE leads to the failure of all components in that set.
Language Testing Lesson Plan 3

Chapter Three: Some Essential Elements in Maintaining or Evaluating Test Quality (the basic elements of language testing)
What do we need to know?
Validity in general refers to the appropriateness of a given test, or any of its component parts, as a measure of what it is supposed to measure. A test is said to be valid to the extent that it measures what it is supposed to measure. Validity may be determined in a variety of ways.
Methods of estimating reliability include: 1. the test-retest method (stability); 2. the parallel-forms method (consistency in form); 3. the split-half method (internal consistency). A minimal split-half computation is sketched below.
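A minimal Python sketch of the split-half method, with the Spearman-Brown correction for full test length; the item scores are made-up dichotomous data.

import statistics

def pearson_r(x, y):
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def split_half_reliability(item_scores):
    """item_scores: one row of item scores per test taker.
    Splits items into odd/even halves, correlates the half scores, then
    applies the Spearman-Brown correction for the full-length test."""
    odd = [sum(row[0::2]) for row in item_scores]
    even = [sum(row[1::2]) for row in item_scores]
    r_half = pearson_r(odd, even)
    return 2 * r_half / (1 + r_half)

scores = [
    [1, 1, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0],
    [1, 0, 1, 1, 0, 1],
]
print(round(split_half_reliability(scores), 3))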
III. What is test validity?
Test validity, also called effectiveness, refers to whether a test measures what it is intended to measure, or the extent to which it does so (Li Xiaoju, 2019).
V. Some other elements in language testing:
- Authenticity
- Involvement (interactiveness)
- Practicality
- Washback effect
What have you got from the lesson today?
Reliability, Validity, and Statistical Analysis

Learning Resources Center, UCSF School of Nursing

Reliability, Validity, and Statistical Analysis

Instrument Reliability

Instrument reliability is the degree of consistency or dependability with which an instrument measures what it is supposed or designed to measure. It is not a property of the instrument itself, but of the instrument when administered to a particular sample under certain conditions; e.g., a reliable measurement of adults in the United States may not be reliable when applied to adults in Canada. Reliability can be assessed by:

- Stability, which refers to the extent that the same results are obtained on repeated administration of the instrument. It is determined by test-retest reliability, whereby scores obtained on repeated administrations are correlated by computing a reliability coefficient. Such a correlation coefficient (r) is an index ranging from -1 (perfect inverse or negative relationship) through 0 (no relationship) to +1 (perfect direct relationship), and summarizes the degree of relationship between the two variables. The higher the value, whether negative or positive, the stronger the relationship. A positive value indicates that, as one variable increases, so does the other; a negative value, that as one increases, the other decreases. The higher the correlation coefficient, the more stable the instrument. An instrument with r < .70 may be considered unreliable; at r = .70, 70% of the variability in obtained scores represents actual individual differences, while 30% is due to random, extraneous fluctuations. Aside from strong correlations, both positive and negative, one can also speak of moderate and weak correlations.

- Internal consistency, which refers to the degree to which the subparts of an instrument all measure the same attribute or dimension. It is determined by one of several procedures, e.g., split-half, a technique whereby the items comprising the instrument are split into two groups (e.g., odd and even) and the scores for the halves correlated. Because longer scales tend inherently to be more reliable than shorter ones, a formula is used to correct the reliability estimate calculated by the split-half method. Still, reliability estimates can vary depending upon the "split," and a preferable means of determining a reliability coefficient is coefficient alpha (Cronbach's alpha), the advantage of which is that it gives an estimate of the split-half correlations for all possible ways of halving the instrument.

- Equivalence or agreement, which refers to two or more measures of a single attribute. When different observers use the same measuring tool, an index of interrater reliability must be developed to demonstrate the strength of the correlation between the rating of one observer and the other.
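As a concrete illustration of internal consistency, the following minimal Python sketch computes coefficient alpha (Cronbach's alpha) from its standard formula; the respondent-by-item scores are made-up data.

def cronbach_alpha(item_scores):
    """item_scores: one row of item scores per respondent.
    alpha = k/(k-1) * (1 - sum(item variances) / variance(total score))."""
    k = len(item_scores[0])
    def var(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    item_vars = [var([row[i] for row in item_scores]) for i in range(k)]
    total_var = var([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

data = [
    [3, 4, 3, 5],
    [2, 2, 3, 2],
    [4, 5, 4, 4],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]
print(round(cronbach_alpha(data), 3))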
Instrument Validity

Instrument validity is the degree to which an instrument measures what it is intended to measure. It is inferred from the evidence presented, and cannot be said to be proven or established. One validates the application of an instrument, not the instrument itself. Problems of validity relate to whether one really is measuring the attribute one thinks is being measured. There are three general types of validity:

- Content validity, which refers to the sampling adequacy of the content areas being measured. It asks the question "How representative of all questions that could be asked are the questions actually being asked in the instrument?" Does the instrument, in other words, adequately represent the domain of the variables being measured?

- Criterion-related validity, which refers to the degree to which instrument scores are correlated with some external criterion. There are two types:
  - Predictive validity, which refers to the degree to which an instrument can predict some criterion observed at a future time. How adequate, in other words, will an instrument be in differentiating between the performance or behavior of individuals on some future criterion? E.g., how well do GRE scores predict future grades?
  - Concurrent validity, which refers to the degree to which an instrument can distinguish individuals who differ on some criterion measured or observed at the same time. How adequate will an instrument be in differentiating between individuals who differ in their present status on some criterion? E.g., based on current observed behaviors, who should be released from an institution?

- Construct validity, which refers to the degree to which an instrument measures the construct under investigation. It asks the question "What is the instrument actually measuring?" Construct validity can be difficult to determine, as there may be no objective criterion for abstract concepts, e.g., grief or role conflict. The more abstract the concept, the less suitable, too, it is to validation by a criterion-related approach.
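Criterion-related validity is typically reported as a correlation between instrument scores and the criterion. A minimal sketch, using hypothetical admission-test scores and later first-year grades as the future criterion (predictive validity):

from statistics import correlation  # Python 3.10+

# Hypothetical data: admission-test scores and later first-year grades
test_scores = [620, 540, 700, 480, 590, 660]
first_year_gpa = [3.4, 2.9, 3.8, 2.5, 3.1, 3.5]

# Predictive validity coefficient: how well scores predict the future criterion
print(round(correlation(test_scores, first_year_gpa), 3))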
Design Validity

The researcher can be confident in the design validity of results if there is:

- Internal validity, which is the degree to which it can be inferred that the independent variable, rather than extraneous factors, is responsible for the observed effects. Quasi-experimental designs, i.e., those that lack control and/or randomization (which, together with manipulation, characterize experimental designs), admit competing explanations for the obtained results. These threats to internal validity include history, selection, maturation, testing, and mortality, all of which are rival hypotheses that could account for the observed results.

- External validity, which is the degree to which the results of the study can be generalized to other settings or samples. The generalizability of an instrument is the degree to which the research procedures justify the inference that the findings represent something more than the specific observations upon which they are based. Threats to external validity include Hawthorne, novelty, interaction, experimenter, and measurement effects.

Statistical Analysis

Statistical analysis comprises two categories: descriptive statistics, which are used to describe the researcher's data set (observation and analysis), and inferential statistics, which permit inference from the data about a particular sample to conclusions or relationships about a larger population. Statistical inference consists of two techniques:

- Estimation of parameters (when no prediction can be made about the population mean, as with the testing of a new drug) is either by point or interval estimation. An interval estimate is referred to as a confidence interval, and is a calculated range of values, usually at the 95% or 99% level, within which a population parameter is estimated to lie.

- Statistical hypothesis testing is based on negative inference. Even though it is not possible to prove that one's hypothesis is correct, it is possible to demonstrate the probability that a null hypothesis is incorrect. The null hypothesis (H0) states that there is no relationship between the variables under study and, ideally, will be rejected. To accept as false a null hypothesis that is true, i.e., to conclude that a relationship exists when it does not, is a Type I error. To accept as true a null hypothesis that is false, i.e., to conclude that no relationship exists when it does, is a Type II error. These two types of errors are in relationship: as Type I is decreased, Type II is increased. The selection of a significance level determines the probability of a Type I error. It is the probability (p) that an observed relationship could be caused by chance. The most frequently used levels of significance are .05 and .01; p < .05 means that, out of 100 samples, a true null hypothesis would be rejected 5 times. There is, in other words, only a 5% chance that the results are due to error.

Hypothesis testing can be carried out by several different statistical methods:

- The t-test is a type of parametric statistic for testing differences between two independent group means. (Parametric statistics are characterized by the estimation of at least one parameter, the use of interval measures, and assumptions about the distributions of variables.) Whether the t-value is statistically significant is determined by a table indicating probability levels and degrees of freedom. A paired t-test is used when two groups are dependent, e.g., two measurements from the same group, as in pre- and post-testing.

- Chi-square (χ2) is a test of statistical significance used to assess whether or not a relationship exists between two nominal-level variables. It is applied to a contingency table (in which the frequency distributions of two variables have been cross-tabulated) to test the significance of the different proportions of cases that fall into the various categories. A table indicates the χ2 values for various degrees of freedom and significance levels.

- Multiple regression is an example of more complex multivariate statistics (the analysis of more than two, usually three or more, variables) that analyzes the effects of two or more independent variables (those that are manipulated and believed to cause or influence the dependent variable) on the dependent variable (the outcome variable that is predicted or hypothesized, and is the presumed effect of the independent variables). In simple regression, one variable is used to predict another, e.g., scholastic achievement from GRE scores. The higher the correlation between the two variables, the more accurate the prediction. It is because this correlation is seldom perfect (r = 1) that multiple regression is used to improve the prediction by including more than one predictor, e.g., achievement as predicted by SAT and GRE scores.
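A minimal sketch of the two significance tests described above, assuming SciPy is available; the group scores and contingency counts are made-up data.

from scipy import stats  # assuming SciPy is available

# Independent-groups t-test: do two groups differ in mean score?
group_a = [21, 25, 19, 30, 28, 24]
group_b = [18, 17, 22, 20, 19, 16]
t, p = stats.ttest_ind(group_a, group_b)
print(f"t = {t:.2f}, p = {p:.4f}")  # p < .05 -> reject the null hypothesis

# Chi-square on a 2x2 contingency table of two nominal variables
table = [[30, 10],   # e.g., exposed: outcome yes / no
         [15, 25]]   # not exposed: outcome yes / no
chi2, p, dof, expected = stats.chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.4f}")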
Standard Operating Procedures and Technical Specifications for the Pathology Laboratory

The implementation of standard operating procedures (SOP) within a pathology laboratory is imperative to ensuring the production of precise and dependable test outcomes. The initial phase involves meticulous verification that the laboratory possesses the requisite tools and equipment for conducting experiments, encompassing microscopes, centrifuges, and other specialized instruments. Furthermore, it is essential for the laboratory to uphold appropriate ventilation and maintenance standards, thereby guaranteeing a secure working environment for its staff. The establishment of standardized protocols for sample collection, storage, and processing is of paramount significance in maintaining uniformity and dependability in test results. Adherence to these protocols is critical to upholding the quality and integrity of the laboratory's operations.
Postgraduate Entrance Exam English: Genetic Testing and Its Challenges (Original Text)

Title: Genetic Testing and Its Challenges

Genetic testing, a powerful tool in the field of healthcare, has revolutionized the way diseases are diagnosed and treated. It involves the analysis of an individual's DNA to identify any changes or mutations that may be associated with genetic disorders or increased risks of developing certain diseases.

The process of genetic testing typically involves collecting a sample of blood, saliva, or tissue from the individual. The sample is then sent to a laboratory, where scientists analyze the DNA to look for specific genetic variants or mutations. The results of the test can confirm whether an individual has a genetic disorder, carrier status for a genetic condition, or an increased risk of developing certain diseases.

While genetic testing offers numerous benefits, it also comes with its fair share of challenges. One of the main concerns is the potential for false-negative or false-positive results, which can lead to unnecessary anxiety or delayed treatment. Additionally, genetic testing can sometimes reveal unexpected results, such as the presence of genetic mutations in individuals who do not exhibit any symptoms of a disease. This can lead to psychological distress and challenges in terms of disclosure and counseling.

Furthermore, genetic testing raises ethical considerations, including issues related to privacy, consent, and the potential for discrimination based on genetic information. There are also concerns about the accessibility and affordability of genetic testing, as well as the need for appropriate training and education for healthcare professionals to ensure accurate interpretation of test results and appropriate management of patients.

In conclusion, genetic testing is a valuable diagnostic tool that can provide important information about an individual's genetic health. However, it is important to be aware of the potential challenges and complexities associated with the process, and to address these issues to ensure that genetic testing is used responsibly and ethically in healthcare settings.
English Words Beginning with 'O' That Describe Accuracy and Reliability

In the realm of descriptive linguistics, words that begin with the letter 'O' and convey a sense of accuracy and reliability are not only fascinating but also quite potent in their ability to communicate precision. One such word is "objective," which denotes a lack of bias, judgment, or prejudice, leading to a fair and accurate representation of facts or situations. This term is often used in scientific discourse, where data must be presented without the influence of personal feelings or opinions.

Another term that fits this description is "omniscient," typically used to describe a narrator in literature who has a complete and accurate understanding of all characters and events within a story. This perspective allows for a reliable and comprehensive portrayal of the narrative, leaving no room for uncertainty or partiality.

"Omnipotent" is also a word that conveys a sense of ultimate power or authority, often used in theological contexts to describe a deity's ability to perform any action with perfect accuracy and reliability. In a broader sense, it can refer to any entity or person possessing complete control and influence over outcomes, ensuring that their actions are precise and dependable.

In the field of mathematics, the concept of an "ordinal" number is used to represent the position or rank of an item in a sequential order, which is a precise and reliable way to describe its placement. Ordinal numbers are essential in various aspects of daily life and scientific research, providing an accurate method to quantify and compare elements within a set.

The word "orthodox" is traditionally associated with ideas, practices, or beliefs that are established and accepted as accurate and reliable. While it often refers to conventional religious practices, it can also apply to any methodology or theory that is widely regarded as correct and trustworthy due to its long-standing validation and acceptance.

In technology and computing, "open-source" is a term used to describe software for which the original source code is made freely available and may be redistributed and modified. This approach is considered reliable because it allows for peer review and community collaboration, ensuring that the software is accurate, secure, and efficient.

"Operational" is a term that signifies something is in working order or ready for use. In the context of machinery, systems, or processes, being operational implies that they are functioning correctly and can be relied upon to perform their intended tasks accurately and effectively.

Lastly, "optimal" is a word that describes the most favorable condition or the highest level of effectiveness. When something is deemed optimal, it is considered the best possible choice or solution, given the circumstances, ensuring accuracy and reliability in achieving the desired outcome.

These words, all beginning with 'O,' serve as powerful tools in the English language to articulate concepts of accuracy and reliability. They provide a means to express certainty and dependability across various domains, from literature and religion to mathematics and technology. The precision these terms offer is invaluable in conveying clear, trustworthy information that stands up to scrutiny and fosters understanding and confidence in communication.
How to Improve Your Interpersonal Relationships (150-Word English Essay)

Three sample essays follow for reference.

Sample Essay 1: How to Improve Your Interpersonal Relationships

Hey everyone, today I want to talk about something that a lot of us struggle with: our interpersonal relationships. Whether it's with friends, family, or that special someone, having good relationships is so important for our happiness and well-being. But maintaining healthy connections can be really tough sometimes. I know I've had my fair share of ups and downs when it comes to my relationships. That's why I did some research and wanted to share some tips on how we can all work on improving our interpersonal skills.

First off, let's talk about communication. We've all heard it before: communication is key. But it's so true! A lot of relationship problems stem from poor communication or misunderstandings. The biggest piece of advice I can give is to be an active listener. That means really focusing on what the other person is saying, not just waiting for your turn to talk. Ask questions, paraphrase what they said to make sure you understand, and avoid interrupting. It also helps to use "I" statements when expressing your thoughts and feelings, rather than blaming the other person.

Another big one is empathy. Trying to see things from the other person's perspective can go a long way in strengthening your relationships. None of us are mind readers, so it's easy to make assumptions about what someone else is thinking or feeling. Instead, have an open and honest conversation to gain a better understanding of where they're coming from. A little empathy and compassion can prevent a lot of arguments and hurt feelings.

Speaking of arguments, conflict is inevitable in any relationship. What's important is how you handle it. My advice? Fight fair. No name-calling, yelling, or bringing up past issues. Stick to the topic at hand and use respectful language. It's also a good idea to take a break if things get too heated, and return to the conversation when you've both had a chance to cool down.

Outside of arguments and disagreements, there are plenty of positive things we can do to nurture our relationships as well. Quality time is huge: put down your phones, turn off the TV, and really be present with that person. Do an activity you both enjoy, or just have a nice conversation over dinner. Small gestures like giving a compliment, doing a favour, or just telling someone how much they mean to you can also go a long way.

Boundaries are important too. As close as you might be with someone, you're still individuals with your own needs, values, and personal space. Voicing your boundaries and respecting others' boundaries creates a dynamic of mutual understanding and respect.

Finally, one of the biggest pieces of advice I can give is to be yourself. We've all pretended to be someone we're not at some point to try and impress others or make them like us. But real, lasting relationships are built on authenticity and acceptance of each other's true selves.
The right people will appreciate you for who you are.

I know improving interpersonal relationships isn't easy, and it's something we all have to work on constantly. But it's so worth it to have those meaningful connections in your life. Just remember the basics: communicate openly, show empathy, fight fair, spend quality time, respect boundaries, and be your authentic self. With some awareness and effort, I really believe we can all strengthen the relationships that matter most to us. Thanks for reading, and I wish you all the best with your interpersonal skills!

Sample Essay 2: How to Improve Your Interpersonal Relationships

Interpersonal relationships are a crucial part of our lives. Whether it's with friends, family, colleagues, or romantic partners, the way we interact with others can significantly impact our overall well-being and happiness. Unfortunately, many people struggle with maintaining healthy relationships, leading to misunderstandings, conflicts, and even emotional distress. As a student, I understand the importance of having strong interpersonal skills, not only for personal growth but also for academic and professional success. In this essay, I will explore some strategies that can help improve your interpersonal relationships.

Effective Communication. Communication is the foundation of any successful relationship. Active listening is key to understanding others' perspectives and fostering a sense of mutual respect. Instead of waiting for your turn to speak, truly listen to what the other person is saying, without interrupting or formulating a response in your mind. Additionally, practice clear and concise expression of your thoughts and feelings. Avoid ambiguous language or passive-aggressive communication, as these can lead to misinterpretations and resentment.

Empathy and Emotional Intelligence. Developing empathy and emotional intelligence can significantly enhance your interpersonal relationships. Empathy involves putting yourself in someone else's shoes and trying to understand their feelings, thoughts, and motivations. Emotional intelligence refers to the ability to recognize, understand, and manage your own emotions as well as those of others. By being attuned to emotional cues, you can respond appropriately and show genuine concern for the well-being of those around you.

Conflict Resolution. Conflicts are inevitable in any relationship, but how you handle them can make or break the bond. Approach conflicts with a calm and open mind, and be willing to compromise. Avoid escalating arguments by using accusatory language or personal attacks. Instead, focus on finding a mutually satisfactory solution. If tensions run high, take a break and revisit the issue when emotions have cooled down.

Respect and Boundaries. Respect is a fundamental component of healthy relationships. Treat others with courtesy, kindness, and consideration, regardless of their background or beliefs. Respect also involves acknowledging and honoring personal boundaries. Everyone has different comfort levels and expectations when it comes to personal space, privacy, and emotional intimacy. Be mindful of these boundaries and adjust your behavior accordingly.

Appreciation and Positive Reinforcement. Expressing appreciation and providing positive reinforcement can go a long way in strengthening interpersonal relationships. Recognize and acknowledge the efforts, achievements, and positive qualities of those around you.
A simple compliment or word of encouragement can brighten someone's day and foster a sense of mutual support.

Forgiveness and Letting Go. Holding grudges and dwelling on past hurts can be detrimental to your interpersonal relationships. Learn to forgive others for their mistakes or shortcomings, and let go of resentment. This doesn't mean excusing unacceptable behavior, but rather choosing to move forward in a constructive manner. Forgiveness can be a powerful healing force, allowing you to repair damaged relationships and cultivate a more positive outlook.

Building Trust and Consistency. Trust is the foundation of any strong relationship. Be reliable, honest, and consistent in your words and actions. Follow through on your commitments, and strive to create an environment where others feel safe to be vulnerable and open. Consistency in your behavior and emotional availability can foster a sense of security and dependability in your interpersonal connections.

Personal Growth and Self-Awareness. Improving your interpersonal relationships also requires personal growth and self-awareness. Reflect on your own communication styles, emotional patterns, and behavioral tendencies. Identify areas where you can improve, and actively work on developing the necessary skills. Seek feedback from trusted friends or mentors, and be open to constructive criticism.

Building and maintaining strong interpersonal relationships is an ongoing process that requires effort, patience, and commitment. By implementing the strategies outlined above, you can create more fulfilling and rewarding connections with the people in your life. Remember, healthy relationships are not just beneficial for personal happiness, but can also contribute to academic and professional success, as well as overall life satisfaction.

Sample Essay 3: How to Improve Your Interpersonal Relationships

As a student, building strong interpersonal relationships is crucial for academic success, personal growth, and overall well-being. Whether it's with classmates, professors, or peers outside the classroom, the ability to effectively communicate and connect with others can greatly enhance your college experience and prepare you for future endeavors. In this essay, I'll share some valuable insights and strategies that have helped me improve my interpersonal relationships.

Effective Communication: The Foundation of Healthy Relationships. One of the most important aspects of interpersonal relationships is effective communication. It's not just about what you say, but also how you say it and how you listen to others. Active listening is a skill that involves giving your full attention to the speaker, avoiding distractions, and demonstrating that you understand their perspective through verbal and non-verbal cues. When communicating, strive for clarity and honesty. Express your thoughts and feelings in a respectful manner, without being confrontational or defensive. Seek to understand the other person's point of view, even if you disagree. Remember, communication is a two-way street, and both parties should feel heard and valued.

Emotional Intelligence: Understanding Yourself and Others. Emotional intelligence plays a significant role in building and maintaining healthy interpersonal relationships. It involves being aware of your own emotions, recognizing the emotions of others, and managing those emotions effectively.
When you have a better understanding of yourself and how your actions and words impact others, you're better equipped to navigate social situations with empathy and compassion.Practicing self-awareness can help you identify your strengths, weaknesses, and triggers, allowing you to respond more thoughtfully in challenging situations. Additionally, being attuned to the emotional cues of others can help you respond appropriately and build deeper connections.Conflict Resolution: Turning Challenges into OpportunitiesConflicts are inevitable in any relationship, but how you handle them can make or break the bond you share with others. Approach conflicts with an open mind and a willingness to compromise. Instead of attacking or blaming, focus on finding a mutually acceptable solution that addresses the concerns of all parties involved.When conflicts arise, practice active listening, seek to understand the other person's perspective, and express your own concerns clearly and respectfully. Be willing to admit whenyou're wrong, and be open to constructive feedback. Remember, resolving conflicts in a healthy manner can strengthen relationships and promote personal growth.Networking and Building ConnectionsIn today's interconnected world, building a strong network of relationships is essential for both personal and professional success. Attend social events, join clubs or organizations that align with your interests, and make an effort to engage with new people. Don't be afraid to strike up conversations and introduce yourself to others.When networking, be genuine and authentic. Show an interest in the other person's life, aspirations, and experiences. Offer support and encouragement, and be willing to share your own knowledge and insights. Building a diverse network of connections can open doors to new opportunities, broaden your perspectives, and provide invaluable support and guidance.Cultivating Empathy and CompassionEmpathy and compassion are powerful forces that can strengthen interpersonal relationships. Empathy involves putting yourself in someone else's shoes and trying to understand their thoughts, feelings, and experiences. Compassion is the desire toalleviate the suffering of others and extend kindness and support.Practising empathy and compassion can help you build stronger connections with those around you. When others feel understood and supported, they are more likely to reciprocate those feelings, creating a positive cycle of mutual understanding and care.ConclusionImproving your interpersonal relationships is an ongoing journey that requires self-awareness, effective communication skills, emotional intelligence, conflict resolution abilities, and a willingness to build connections and cultivate empathy and compassion. By implementing these strategies, you can create more meaningful and fulfilling relationships that enrich your college experience and prepare you for future success in both your personal and professional life.Remember, building strong interpersonal relationships takes time, effort, and a genuine commitment to understanding and supporting others. Embrace the challenges and opportunities that come with interpersonal interactions, and never stop learning and growing. With patience, practice, and an open heart,you can develop the skills necessary to thrive in any social or professional setting.。
Accurate Dependability Analysis of CAN-based Networked Systems

J. Pérez1, M. Sonza Reorda2, M. Violante2
1 Universidad de la República, Facultad de Ingeniería, Instituto de Ingeniería Eléctrica, Montevideo, Uruguay
2 Politecnico di Torino, Dipartimento di Automatica e Informatica, Torino, Italy

Abstract*
Computer-based systems where several nodes exchange information via suitable network interconnections are today exploited in many safety-critical applications, like those belonging to the automotive field. Accurate dependability analysis of this kind of system is thus a major concern for designers. In this paper we present an environment we developed to assess the effects of faults in CAN-based networks. We developed an IP core implementing the CAN protocol controller, and we exploited it to set up a network composed of several nodes. Thanks to the approach we adopted, we were able to assess via simulation-based fault injection the effects of faults both in the bus used to carry information and inside each CAN controller. In this paper we report a detailed description of the environment we set up, and we present some preliminary results we gathered to assess the soundness of the proposed approach.

* This work has been partially supported by the European Commission through the Alfa II-0086-FI TOSCA Project. Contact author: Massimo Violante, Politecnico di Torino, Dip. Automatica e Informatica, C.so Duca degli Abruzzi 24, 10129 Torino, Italy. E-mail: massimo.violante@polito.it

1. Introduction
In recent years there has been a rapid increase in the use of computer-based systems in safety-critical applications such as railway traffic control, aircraft flight control, and telecommunications, where it is quite common to find several computing nodes connected by a proper network sub-system.

This trend has led to concerns regarding the validation of the fault-tolerance properties of these systems and the evaluation of their reliability. The continuous increase in the integration level of electronic systems is indeed making it more difficult than ever to guarantee an acceptable degree of reliability, due to the occurrence of soft errors that can dramatically affect the behavior of systems. As an example, the decrease in the magnitude of the electric charge used to carry and store information is seriously raising the probability that highly energized particles hitting a circuit may induce transient errors (often modeled as Single Event Upsets) [1].

To face the above issue, mechanisms are required to increase the robustness of electronic devices and systems with respect to possible errors occurring during their normal operation, and for these reasons on-line testing is now becoming a major area of research. No matter the level these mechanisms work at (hardware, system software, or application software), there is a need for techniques and methods to debug and verify their correct design and implementation.

Fault injection [2] is commonly adopted for this purpose, and several techniques have been proposed and experimented with in practice. They can be grouped into simulation-based techniques (e.g., [3][4]), software-implemented techniques (e.g., [5][6][7]), and hardware-based techniques (e.g., [8][9]).

As pointed out in [2], physical fault injection (hardware- and software-implemented fault injection approaches) is better suited when a prototype of the system is already available, or when the system itself is too large to be modeled and simulated at an acceptable cost.
Conversely, simulation-based fault injection is very effective in allowing early analysis of designed systems, since it can be exploited when a prototype is not yet available. Its main disadvantage with respect to physical fault injection is the high CPU time required to simulate the model of the system (provided that such a model is available).

As the complexity of the system under analysis grows, it is thus important to identify the most suitable approach for fault injection, so as to match two conflicting requirements. On the one hand, designers should adopt system models that are as close as possible to the system's physical implementation, in order to precisely reflect the effects of real faults. On the other hand, designers should minimize the time needed to perform injection experiments, so that they can analyze sets of faults that are large enough to provide statistically meaningful information.

In this paper we address the aforementioned problems by devising a fault injection environment to study the effects of soft errors on CAN networks. In particular, given a system where several nodes exchange information via a CAN network, we are interested in studying the effects of soft errors inside the controller implementing the CAN protocol (i.e., in the memory elements that each CAN controller embeds), as well as in the network itself.

Due to the nature of the fault locations we are interested in, and in particular due to the need for injecting faults inside the memory elements each CAN controller embeds, we developed a synthesizable IP core (modeled at the Register Transfer level) implementing the features of a CAN protocol controller; we then modeled the analyzed network by means of the developed IP core. Finally, we exploited simulation-based fault injection to inoculate faults in the selected locations.

The IP core we developed is fully compliant with the CAN protocol specifications; it thus allows obtaining an accurate model of the physical implementation of the system. As a result, we were able to gather meaningful information about the effects of soft errors in the real system.

To assess the soundness of the environment, we gathered some preliminary results by considering a simple network composed of four nodes, one of them acting as a master that periodically polls the remaining three nodes. When a commercial VHDL simulator is exploited, the simulation of the whole network (modeled at the RT level) takes about 0.3 seconds for each transmitted data frame (each composed of 51 bits).

This preliminary result suggests that the environment is suitable for debugging purposes, and that it can be exploited even for validation purposes when a limited number of packets is sent over the network.

The remainder of the paper is organized as follows. Section 2 gives an overview of the CAN protocol, Section 3 describes the network model we developed, and Section 4 discusses the techniques we adopted to support fault injection. Section 5 reports the experimental results we gathered, and Section 6 draws some conclusions.

2. Background
The Controller Area Network (CAN) protocol was initially developed by Bosch in the mid-eighties and later published as the ISO 11898 standard [14][15]. It has been used in the automotive industry for on-board networks since 1992.

CAN is a multiple-access contention protocol with non-destructive collision detection.
The specification defines two possible values for the bus state: recessive and dominant. If more than one node is driving the bus at the same time, the dominant value prevails over the recessive value. Usually the recessive value is associated with the logical value "1" and the dominant value with a logical "0", so the bus can be seen as a wired AND.

Data are exchanged in frames (Data and Remote Frames) containing an ID and 0 to 8 bytes of data. The addressing of the nodes is not specified by the protocol: the ID field of each frame identifies the message, and it is defined by the application generating the transmitted information.

Two formats are available for Data and Remote Frames: the standard format (11-bit ID) and the extended format (11-bit basic ID plus 18-bit extended ID).

If a collision takes place between two nodes that started the transmission of a frame at the same time, a priority arbitration based on the value of the arbitration field1 is used to resolve the contention. At the first difference detected in the arbitration field, the node driving a recessive value must retire and become a receiver.

The error detection mechanisms provided by CAN include bit errors (each node monitors the bus while driving it, and expects to read back the same value it sent), stuff errors, CRC errors, form errors (detected when the fields of a frame do not comply with the standard), and acknowledgement errors (the transmitter expects a dominant value driven by the receivers at the end of each frame).

In order to assure data consistency across all the nodes, any node detecting an error drives the bus to a dominant value for six bit periods, thus forcing a stuff error on all the other nodes. In this way the transmission is not completed, and the transmitter must resend the frame.

Suitable fault confinement mechanisms are implemented in order to avoid completely blocking the bus in the presence of permanent faults. Each node maintains two counters totalizing the number of recent transmission and reception errors. The normal condition of each node is the error-active status. If an error counter exceeds a given threshold, the node moves to the error-passive status, in which it cannot drive the bus during an error frame. If even more errors are detected, the node enters the bus-off status, and it returns to error-active only after having observed the bus idle for a long period of time.

1 The arbitration field is formed by the ID, the IDE flag specifying the ID format, and the RTR flag specifying the kind of frame.

3. Network model
For the purpose of performing fault injection experiments in both the internal memory elements of a CAN protocol controller and in the network where it is exploited, a model of the CAN protocol controller is needed. In particular, access to the source code of the model is required to effectively support fault injection. As the model is also intended for use in emulation-based fault injection experiments, it should be fully synthesizable, too.

At the beginning of the project no open-source IP core was available, so a VHDL core was developed from scratch (an open-source Verilog IP core was later released, with a VHDL version announced [16]). For this purpose, some results from a previous student project [17] tutored by one of the authors were used.

The first version of the core has been completed and validated (it accounts for about 2,400 lines of synthesizable VHDL code), and it has been used for the first fault injection experiments.
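To make the wired-AND abstraction of Section 2 concrete, the following minimal VHDL sketch models a bus shared by N nodes. It captures the same idea as the can_bus block described below, but the entity name, ports, and generic are our illustrative choices rather than the actual code of the core:

    library ieee;
    use ieee.std_logic_1164.all;

    -- Illustrative wired-AND bus model: the bus value is the AND of all
    -- node outputs, so a single dominant ('0') driver prevails over any
    -- number of recessive ('1') drivers.
    entity wired_and_bus is
      generic (N : positive := 4);  -- number of nodes (illustrative generic)
      port (tx_data : in  std_logic_vector(N-1 downto 0);  -- one tx line per node
            rx_data : out std_logic);                      -- bus value seen by all nodes
    end entity wired_and_bus;

    architecture behavior of wired_and_bus is
    begin
      process (tx_data)
        variable v : std_logic;
      begin
        v := '1';                       -- recessive by default
        for i in tx_data'range loop
          v := v and tx_data(i);        -- any dominant '0' wins
        end loop;
        rx_data <= v;
      end process;
    end architecture behavior;

With such a model, a single node driving the dominant value forces the whole bus to '0', which is exactly the behavior that arbitration and error signalling rely on.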
This first version still lacks the fault confinement mechanism, meaning that it always behaves as an error-active node.

The main block of the core is can_interface, which implements one network node. It is provided with an interface to the CAN bus (via the tx_data and rx_data lines) and to the application using the core (via the rx_msg, tx_msg, and handshake signals).

An additional synthesizable block, can_bus, was developed to model the behavior of the CAN bus; it is essentially an AND gate.

A top-level model of a generic network composed of N controllers communicating via one CAN bus was finally coded, where N is a parameter freely customizable by the model's final user.

The obtained network can be either simulated at the RT level or emulated via FPGA.

4. Supporting fault injection
The fault tolerance community is increasingly concerned by the occurrence of soft errors resulting from the perturbation of storage cells caused by ionization, known as Single Event Upsets (SEUs). SEUs are random events, and they may thus occur at unpredictable times; for example, they may corrupt the content of a register during any clock cycle.

As far as SEUs in CAN controllers are concerned, we adopted the fault model called upset or transient bit-flip, which results in the modification of the content of a storage cell during any clock cycle. Possible fault locations are thus memory elements, e.g., flip-flops, inside each CAN controller in the considered network.

As far as the bus connecting the CAN controllers is concerned, we adopted two different fault models: alongside SEUs, we considered the permanent stuck-at fault model. The former is used to mimic the effects of transient failures, such as those induced by highly energized particles or electromagnetic interference on the bus; the latter is exploited to account for permanent damage.

In order to support injection experiments, we enriched the model with features that simplify the injection of the aforementioned fault models. For this purpose, we equipped each CAN controller with a saboteur module [8] that makes it possible to complement the bus value, or to stick it at a constant value, at any clock cycle (a sketch is given at the end of this section). Moreover, the memory elements in the CAN controller module were equipped with the architecture described in [18], which allows injecting bit-flips in all the internal memory elements of a device with low intrusiveness and high accuracy.

The obtained fault-injection-enriched model can then be used to assess the effects of faults either via simulation or via FPGA-based emulation. No matter which approach is exploited, the effects of faults are classified according to the following categories:
• No effect: all the data frames in the given workload were transmitted and correctly received.
• Incorrect answer: at least one of the nodes in the network received corrupted data, and the protocol did not detect the presence of the fault.
• Performance degradation: after some error frames and retransmissions due to the error detection and correction mechanisms the protocol provides, the system recovered its normal behavior, and the data frames in the given workload were transmitted and correctly received; however, all the operations were completed in a time greater than the expected one.
• Network degradation: due to the effects of faults, the network was partitioned into two sets, one properly working (i.e., transmitting and receiving as expected) and one not working; CAN controllers belonging to the latter set did not produce any network activity.
• Timeout: the operations in the workload were not completed within a given time limit.
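As an illustration of the saboteur idea mentioned above, the following VHDL sketch shows a module that could be placed between the bus and a node's rx_data input. The entity name, port names, and control encoding are our assumptions for illustration; they do not reproduce the actual interface used in the environment:

    library ieee;
    use ieee.std_logic_1164.all;

    -- Illustrative saboteur, inserted between the bus and a node's rx input.
    -- inject = '0' -> transparent (fault-free behavior)
    -- inject = '1' -> apply the fault selected by mode:
    --   mode = "00" -> bit-flip (complement the bus value)
    --   mode = "01" -> permanent stuck-at dominant ('0')
    --   mode = "10" -> permanent stuck-at recessive ('1')
    entity saboteur is
      port (bus_in  : in  std_logic;
            inject  : in  std_logic;
            mode    : in  std_logic_vector(1 downto 0);
            bus_out : out std_logic);
    end entity saboteur;

    architecture behavior of saboteur is
    begin
      bus_out <= bus_in     when inject = '0' else
                 not bus_in when mode = "00"  else
                 '0'        when mode = "01"  else
                 '1';
    end architecture behavior;

A wrapper of similar spirit around each flip-flop, along the lines of [18], supports the bit-flip injection in the controller's internal memory elements.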
5. Experimental results
In order to assess the soundness of the proposed environment, we performed some experiments on a simple network configuration encompassing four nodes. One of the nodes behaves as a master and all the others as slaves. Each slave implements a sequence generator (e.g., a counter) that is monitored by the master.

After reset, the master (whose behavior is described in Figure 1) initiates a polling loop over the slaves (whose behavior is described in Figure 2), asking each for the next value in its sequence and waiting for the answer. Upon reception of a data frame from any slave, the master checks the correctness of the answer. In case a value out of the expected sequence is received, the fault effect is classified as incorrect answer.

    master() {
        while (1) {
            foreach slave {
                send request(slave ID)
                wait data frame from(slave ID)
                compare with counter(slave ID)
                increment counter(slave ID)
            }
        }
    }

Figure 1: Behavior of the master node

    slave() {
        count = 1
        while (1) {
            wait request from(master ID)
            new data frame = count
            send data frame(master ID)
            count++
        }
    }

Figure 2: Behavior of the slave node

In our experiments, a workload consisting of the transmission of 12 data frames, each composed of 51 bits, was considered. We simulated the whole network by means of the Modelsim VHDL simulator on a Sun Enterprise 250 machine running at 400 MHz and equipped with 2 GB of RAM.

As far as the CPU time for the execution of the workload is concerned, we measured a simulation time of 4 seconds, which corresponds to an average of about 0.3 seconds of CPU time per data frame. Please note that the simulations we ran were cycle-accurate: they thus allowed observing all the details of each CAN protocol controller during workload execution.

During our experiments we considered two possible scenarios: a debug one, presented in sub-section 5.1, and a validation one, reported in sub-section 5.2.

5.1. Debug scenario
In this scenario the user is still developing the networked system, and needs to quickly evaluate the effects of a limited number of faults, corresponding to the design corner cases. From the experience gained in developing our IP core, we observed that in this case a very simple workload, composed of one packet only, is typically exploited. As a result, the performance we obtained is compatible with the simulation of some thousands of faults per hour (a sketch of how one such experiment can be scheduled is given at the end of this sub-section).

As far as fault effects are concerned, we observed that:
• When permanent faults were injected in the CAN bus, we observed several timeout conditions provoked by the bit-error detection/correction mechanisms the protocol embeds: the sender node identified a mismatch between the value read from the bus and the transmitted one, and thus continuously retried to resend the same data frame. Another condition we observed when injecting permanent faults in the rx_data line of any node was network degradation: the affected node did not produce any network activity, and thus acted as if disconnected from the bus.
• When the CAN bus was affected by transient faults, we observed either no effect, when the fault modified meaningless bits in the data frame, or performance degradation, when the fault was detected and a retransmission performed.
• As far as SEUs in the memory elements embedded in the CAN protocol controller are concerned, for the sake of this paper we injected faults in the counter registers that control the length of each frame field, both for transmission and for reception purposes. The results we gathered showed that these faults are classified either as no effect or as performance degradation.
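As a usage sketch for this debug scenario, a stand-alone testbench along the following lines could schedule a transient fault of controlled duration through the saboteur sketched in Section 4. The clock period, the injection instant, and all names are illustrative assumptions, not the scripts actually used in the environment:

    library ieee;
    use ieee.std_logic_1164.all;

    -- Illustrative testbench: the bus value is complemented for one bit
    -- interval (10 clock cycles, per the timing parameters of Section 5.2)
    -- at an arbitrarily chosen injection instant.
    entity saboteur_tb is
    end entity saboteur_tb;

    architecture sim of saboteur_tb is
      constant CLK_PERIOD : time := 100 ns;   -- assumed clock period
      signal clk     : std_logic := '0';
      signal bus_in  : std_logic := '1';      -- idle bus is recessive
      signal inject  : std_logic := '0';
      signal mode    : std_logic_vector(1 downto 0) := "00";  -- bit-flip mode
      signal bus_out : std_logic;
    begin
      clk <= not clk after CLK_PERIOD / 2;

      dut : entity work.saboteur
        port map (bus_in => bus_in, inject => inject,
                  mode => mode, bus_out => bus_out);

      stimulus : process
      begin
        wait for 500 * CLK_PERIOD;   -- reach the chosen injection instant
        inject <= '1';               -- transient fault active...
        wait for 10 * CLK_PERIOD;    -- ...for one bit interval
        inject <= '0';               -- fault-free for the rest of the run
        wait;
      end process stimulus;
    end architecture sim;

In the actual environment the saboteur sits inside the network model between the bus and each node's rx input, and the injection instant, target node, and fault type are varied from one experiment to the next.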
5.2. Validation scenario
In this scenario the network is ready, and the designer needs to validate the application that uses it. A workload encompassing the transmission of several packets may be required, while the fault list is composed of several thousands of faults. In this case, in order to gather statistical evidence of the application's robustness, faults are normally selected at random. For the considered application, given a workload composed of 12 packets (i.e., about 4 seconds of CPU time per injected fault, or 3,600 / 4 = 900 runs per hour), we estimated a simulation performance of about 900 faults per hour.

As far as fault effects are concerned, we performed the injection of 600 SEUs inside the counters embedded in the CAN controllers. For each fault we randomly selected the time, the network node, and the register bit where the SEU was injected. The figures we measured are reported in Table 1.

    No effect                  241
    Incorrect answer             4
    Performance degradation    351
    Network degradation          0
    Timeout                      4

Table 1: Injection of SEUs in the CAN controllers' counters

We found some faults classified as Timeout and Incorrect answer. After analyzing these faults in detail, we found that all of them correspond to situations where a frame is not consistently accepted or rejected by all the nodes. For example, for one of these faults the EndOfFrame field of a transmitting node was lengthened by the fault, and thus that node rejected the frame because of an error detected at the end of the longer EndOfFrame field, when all the receivers had already accepted the frame.

To illustrate the possibility our environment offers of emulating different types of service interruption, we performed a further set of experiments during which we perturbed the CAN bus with SEUs characterized by a duration of several clock cycles. The figures we measured are reported in Table 2, where the fault effect classification is given for SEU durations ranging from 1 to 20 clock cycles.

    Fault duration [clock cycles]    1    5   10   20
    No effect                       97   60   17   15
    Incorrect answer                 0    0    0    0
    Performance degradation          3   40   82   84
    Network degradation              0    0    0    0
    Timeout                          0    0    1    1

Table 2: Injection of SEUs of different durations in the CAN bus

The observed variation of fault effects with fault duration is due to the peculiarity of the selected fault location: the CAN bus is sampled at every clock cycle, but its value is used only once during each bit interval (10 clock cycles with the CAN bus timing parameters used). As a result, any fault that lasts 10 clock cycles or more is likely to affect the data transmission for at least one bit, while faults lasting less than 10 clock cycles have a lower probability of producing any visible effect on the system.
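The once-per-bit-interval sampling behind this explanation can be sketched as follows. This is our illustrative reconstruction, not the core's actual receive logic, and the sample-point position is an assumed value:

    library ieee;
    use ieee.std_logic_1164.all;

    -- Illustrative bit-timing sketch: the bus is observed at every clock
    -- cycle, but its value is committed only once per bit interval, at an
    -- assumed sample point. A disturbance on rx_data is therefore visible
    -- only if it covers that single cycle.
    entity bit_sampler is
      generic (BIT_LEN      : positive := 10;  -- clock cycles per bit (Section 5.2)
               SAMPLE_POINT : natural  := 6);  -- sample-point position (assumed)
      port (clk      : in  std_logic;
            rx_data  : in  std_logic;          -- bus value, may change at any cycle
            rx_bit   : out std_logic;          -- value actually used by the protocol
            rx_valid : out std_logic);         -- one-cycle pulse per sampled bit
    end entity bit_sampler;

    architecture behavior of bit_sampler is
      signal cnt : natural range 0 to BIT_LEN-1 := 0;
    begin
      process (clk)
      begin
        if rising_edge(clk) then
          if cnt = BIT_LEN-1 then cnt <= 0; else cnt <= cnt + 1; end if;
          if cnt = SAMPLE_POINT then
            rx_bit   <= rx_data;   -- the only cycle where the bus value matters
            rx_valid <= '1';
          else
            rx_valid <= '0';
          end if;
        end if;
      end process;
    end architecture behavior;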
6. Conclusions
In this paper we reported a detailed description of an environment we set up for assessing the effects of faults in a networked system based on the CAN protocol.

Our environment is based on a synthesizable IP core implementing the CAN protocol, which is used to lay out a generic network comprising N nodes, where N is a parameter fully customizable by the model's end user. The network model was enriched with fault-injection-oriented features allowing the evaluation of the effects of permanent and transient faults in both the CAN bus and the memory elements each CAN controller embeds.

The cycle-accurate simulation experiments we performed allowed us to experimentally verify that our environment can be exploited for both debug and validation purposes on a simple network composed of four nodes.

References
[1] M. Nicolaidis, “Time Redundancy Based Soft-Error Tolerance to Rescue Nanometer Technologies”, IEEE 17th VLSI Test Symposium, April 1999, pp. 86-94
[2] R. K. Iyer, D. Tang, “Experimental Analysis of Computer System Dependability”, Chapter 5 of Fault-Tolerant Computer System Design, D. K. Pradhan (ed.), Prentice Hall, 1996
[3] E. Jenn, J. Arlat, M. Rimen, J. Ohlsson, J. Karlsson, “Fault Injection into VHDL Models: the MEFISTO Tool”, Proc. FTCS-24, 1994, pp. 66-75
[4] T. A. DeLong, B. W. Johnson, J. A. Profeta III, “A Fault Injection Technique for VHDL Behavioral-Level Models”, IEEE Design & Test of Computers, Winter 1996, pp. 24-33
[5] G. A. Kanawati, N. A. Kanawati, J. A. Abraham, “FERRARI: A Flexible Software-Based Fault and Error Injection System”, IEEE Transactions on Computers, Vol. 44, No. 2, February 1995, pp. 248-260
[6] J. Carreira, H. Madeira, J. Silva, “Xception: Software Fault Injection and Monitoring in Processor Functional Units”, DCCA-5, Conference on Dependable Computing for Critical Applications, Urbana-Champaign, USA, September 1995, pp. 135-149
[7] A. Benso, P. Prinetto, M. Rebaudengo, M. Sonza Reorda, “EXFI: a Low-Cost Fault Injection System for Embedded Microprocessor-based Boards”, ACM Transactions on Design Automation of Electronic Systems, Vol. 3, No. 4, October 1998, pp. 626-634
[8] J. Arlat, M. Aguera, L. Amat, Y. Crouzet, J.-C. Fabre, J.-C. Laprie, E. Martins, D. Powell, “Fault Injection for Dependability Validation: A Methodology and Some Applications”, IEEE Transactions on Software Engineering, Vol. 16, No. 2, February 1990
[9] J. Karlsson, P. Liden, P. Dahlgren, R. Johansson, U. Gunneflo, “Using Heavy-Ion Radiation to Validate Fault-Handling Mechanisms”, IEEE Micro, Vol. 14, No. 1, 1994, pp. 8-32
[10] D. Gil, R. Martinez, J. V. Busquets, J. C. Baraza, P. J. Gil, “Fault Injection into VHDL Models: Experimental Validation of a Fault Tolerant Microcomputer System”, Dependable Computing EDCC-3, September 1999, pp. 191-208
[11] J. Boué, P. Pétillon, Y. Crouzet, “MEFISTO-L: A VHDL-Based Fault Injection Tool for the Experimental Assessment of Fault Tolerance”, Proc. FTCS'98, 1998
[12] A. Benso, M. Rebaudengo, L. Impagliazzo, P. Marmo, “Fault-List Collapsing for Fault Injection Experiments”, RAMS'98: Annual Reliability and Maintainability Symposium, January 1998, pp. 383-388
[13] K. H. Huang, J. A. Abraham, “Algorithm-Based Fault Tolerance for Matrix Operations”, IEEE Transactions on Computers, Vol. 33, June 1984, pp. 518-528
[14] Bosch’s Controller Area Network Homepage
[15] CAN in Automation (CiA), CAN protocol introduction, http://www.can-cia.de/can/protocol/
[16] Opencores CAN protocol controller project, /projects/can/
[17] J. Juárez, P. Bustamante, “Interfase Bus Can”, Curso Diseño Lógico 2, final project, .uy/ense/asign/dlp/proyectos/2002/can/
[18] P. Civera, L. Macchiarulo, M. Rebaudengo, M. Sonza Reorda, M. Violante, “An FPGA-based Approach for Speeding-up Fault Injection Campaigns on Safety-critical Circuits”, Journal of Electronic Testing: Theory and Applications, Vol. 18, No. 3, June 2002, pp. 261-271