

Biology foreign-language literature (in English)

58%
FIG. 3
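SK-BR-3 cells are HER2-overexpressing breast cancer cells in which STAT3 can be activated by specific cytokines. HER2 is an important prognostic factor in breast cancer. HER2-positive (overexpressed or amplified) breast cancers show distinctive clinical features and biological behavior, and their treatment differs substantially from that of other breast cancer types.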
36%
FIG. 4 LIF stimulation
FIG. 1
15.5%
FIG. 2
GRN knockdown reduced the mRNA expression of these genes, similar to the effects of STAT3 knockdown
58%
Chromatin immunoprecipitation assay (ChIP)
70%
Suggesting that some but not all phenotypes associated with GRN knockdown can be rescued by constitutively active STAT3.
FIG. 6
42.9%
13.5%
These findings indicate that in primary breast cancers, GRN expression specifically correlates with enhanced STAT3 transcriptional activity in the presence of tyrosine-phosphorylated STAT3.
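• The Pearson correlation coefficient, also called the Pearson product-moment correlation coefficient, is a linear correlation coefficient: a statistic that reflects the degree of linear correlation between two variables. It is denoted r, with $r = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}}$, where n is the sample size, $x_i$ and $y_i$ are the observed values of the two variables, and $\bar{x}$ and $\bar{y}$ are their means. r describes the strength of the linear relationship between the two variables: the larger the absolute value of r, the stronger the correlation.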
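The definition of r above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `pearson_r` and the sample values are assumptions made for the example, not part of the slides.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: sum of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: square root of the product of the summed squared deviations.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when one sample is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: a perfectly linear increasing relationship.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
```

A value of r near +1 or -1 indicates a strong linear relationship, and a value near 0 indicates a weak one.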

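Principles of TadGAN anomaly detection

Introduction: Anomaly detection is an important step in data analysis. It helps us find outliers, that is, data points that differ markedly from the rest of the data.

Outliers may be caused by measurement error, data-entry mistakes, system faults, or genuinely anomalous events.

This article introduces the principles and applications of the TadGAN anomaly detection method.

Main text:
1. Principles of the TadGAN anomaly detection method
1.1 Data generation model
TadGAN uses a generative adversarial network (GAN) as the basis for anomaly detection.

A GAN is a model composed of a generator and a discriminator.

The generator produces synthetic data that resembles the real data, while the discriminator judges whether its input is real or synthetic.

The generator and the discriminator compete with each other as in a game, until the generator can produce synthetic data that is very close to the real data.

1.2 Anomaly detection
TadGAN's anomaly detection is built on the GAN generator.

By learning the distribution of the real data, the generator can produce synthetic data that resembles it.

A data point that differs substantially from the generated synthetic data can be regarded as an outlier.

TadGAN compares the synthetic data produced by the generator with the real data and measures how anomalous a point is by computing the difference between them.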
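The comparison step described above can be sketched as follows. This is a simplified illustration, not TadGAN's actual scoring code: the generator output is replaced by a fixed list, and the threshold rule (median plus k times the median absolute deviation) and the names `anomaly_scores` and `flag_anomalies` are assumptions made for the example.

```python
import statistics


def anomaly_scores(real, generated):
    """Point-wise absolute difference between real and generated values."""
    if len(real) != len(generated):
        raise ValueError("sequences must have the same length")
    return [abs(r - g) for r, g in zip(real, generated)]


def flag_anomalies(scores, k=3.0):
    """Indices whose score exceeds median + k * MAD (an assumed rule)."""
    med = statistics.median(scores)
    mad = statistics.median([abs(s - med) for s in scores])
    threshold = med + k * mad
    return [i for i, s in enumerate(scores) if s > threshold]


real = [1.0, 1.1, 0.9, 1.0, 5.0, 1.05, 0.95]
generated = [1.0] * len(real)  # stand-in for what a trained generator would emit
scores = anomaly_scores(real, generated)
print(flag_anomalies(scores))  # index 4 (the value 5.0) stands out
```

Any difference measure could take the place of the absolute difference here; the point is only that a large gap between real and generated data yields a high anomaly score.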

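1.3 Training process
TadGAN training has two stages: pre-training and adversarial training.

In the pre-training stage, the generator is trained to learn the distribution of the real data, and the discriminator is trained to distinguish real data from synthetic data.

In the adversarial training stage, the generator and the discriminator compete: the generator tries to produce more realistic synthetic data, while the discriminator tries to tell real data from synthetic data more accurately.

2. Applications of the TadGAN anomaly detection method
2.1 Finance
In finance, outliers may indicate fraud or abnormal transactions.

TadGAN anomaly detection can help financial institutions uncover potential fraud or abnormal transactions and so protect customers' funds.

2.2 Industry
In industry, outliers may indicate equipment failure or production anomalies.

TadGAN anomaly detection can help industrial firms monitor equipment status and detect faults or abnormal conditions in time, improving production efficiency and product quality.

2.3 Network security
In network security, outliers may indicate network attacks or abnormal behavior.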

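Application of the XGBoost model to early prediction of critical-illness outcomes

Sun Fangyuan (孙方园), Huang Mingyu* (黄明宇)
The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, Zhejiang 325015
About the authors: Sun Fangyuan's main research interest is medical informatics. Corresponding author: Huang Mingyu, assistant engineer, whose main research interest is medical informatics.
Science & Technology Vision (科技视界), Medical Health section

0 Introduction
Sepsis is a systemic inflammatory syndrome triggered when bacteria or other pathogens invade the body.

Patients usually present with fever or hypothermia; in severe cases, shock and failure of vital organs can occur.

In the intensive care unit, sepsis remains an acute, critical condition with high short-term mortality (15% to 30%) [1-3].

Early prediction is crucial for sepsis intervention [4].

Clinically, physicians usually rely on factors such as the Sequential Organ Failure Assessment (SOFA) score and the Glasgow Coma Scale to predict adverse outcomes early.

Critical-illness scores are not specific to any single disease, and the scoring itself is influenced by the physician's subjective judgment, so these scores do not discriminate in-hospital death well [5].

Since the pandemic, intensive care units abroad have been running at full capacity because of open policies, and the practical need for predictive modeling on critical-care data has drawn attention [6-9].

Among the available methods, XGBoost [10] outperforms other machine learning algorithms on database-based sepsis classification tasks [11].

1 Materials and methods
1.1 Data sources and ethics
The MIMIC database [12-13] is a public critical-care database managed under MIT (Massachusetts Institute of Technology).

This experiment uses version 4, which consists of more than 450,000 clinical records of patients admitted to Beth Israel Deaconess Medical Center from 2008 to 2019. The records include demographic information, medical history, diagnoses, vital signs, biochemical indicators, and examination and treatment orders.

All users who comply with its usage rules (https://) are legitimate users.

The experiment additionally collected a dataset from the First Affiliated Hospital of Wenzhou Medical University for cross-validation.

These data cover sepsis patients admitted to the hospital's ICU between 2015 and 2022.

The dataset was filed with the hospital's ethics committee and was de-identified during extraction.

1.2 Data extraction criteria
This experiment uses the Sepsis-3.0 diagnostic criteria [14-16] to extract case data. It excludes pregnant patients, minors (age < 18 years), very long stays (ICU stay over 100 d), and very short stays (ICU stay under 1 d). For patients admitted to the ICU more than once, only the data from the first admission are used.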
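The exclusion rules above can be sketched as a simple filter. This is an illustration under stated assumptions: the record layout (`IcuStay` and its field names) is hypothetical and does not reflect the MIMIC schema, the Sepsis-3.0 case definition itself is not implemented, and the order of operations (first pick each patient's first ICU stay, then apply the exclusions) is a guess, since the text does not specify it.

```python
from dataclasses import dataclass


@dataclass
class IcuStay:
    patient_id: str
    admit_order: int   # 1 = the patient's first ICU admission
    age_years: float
    pregnant: bool
    icu_days: float


def select_cohort(stays):
    """Keep each patient's first ICU stay, then apply the exclusion rules."""
    first_stays = {}
    for stay in stays:
        best = first_stays.get(stay.patient_id)
        if best is None or stay.admit_order < best.admit_order:
            first_stays[stay.patient_id] = stay
    kept = []
    for stay in first_stays.values():
        if stay.pregnant or stay.age_years < 18:
            continue  # pregnancy and minors are excluded
        if stay.icu_days > 100 or stay.icu_days < 1:
            continue  # very long and very short ICU stays are excluded
        kept.append(stay)
    return kept


# Hypothetical records, for illustration only.
stays = [
    IcuStay("a", 1, 45, False, 3.0),
    IcuStay("a", 2, 45, False, 5.0),    # second admission: ignored
    IcuStay("b", 1, 16, False, 2.0),    # minor
    IcuStay("c", 1, 60, False, 0.5),    # stay under 1 d
    IcuStay("d", 1, 30, True, 4.0),     # pregnant
    IcuStay("e", 1, 70, False, 120.0),  # stay over 100 d
    IcuStay("f", 1, 52, False, 10.0),
]
print([s.patient_id for s in select_cohort(stays)])
```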

Cell Reproduction (English) courseware

1. What does “diploid” mean?
2. We have __ total chromosomes.
3. In the term 2n, what does “n” stand for in us? In a gypsy moth?
4. Why does mitosis occur? Major functions?
5. In what cells (general term) does mitosis
• Groups of single-chromatid chromosomes reach poles of cell
• Nuclear envelope begins to reform
• 2 new daughter cells formed
• Cytokinesis begins with appearance of cell plate
It would be a genetic mess!
Instead, gametes are haploid (n).
Egg and sperm both have exactly half the number of chromosomes of somatic cells
At fertilization, n + n = 2n; 23 + 23 = 46!
THE STEPS OF MITOSIS
• Interphase
(actually, this is not part of mitosis itself)
• Prophase • Metaphase • Anaphase • Telophase
Interphase
Onion root tip

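Supervised learning: stochastic gradient descent (SGD) and batch gradient descent (BGD)

Linear regression
First, we need to understand what regression is.

The goal of regression is to predict a numeric target value from several known values.

Assume that the features and the result are linearly related, that is, they satisfy a formula h(x) whose argument is the known data x and whose value h(x) is the target to be predicted.

This formula is called the regression equation, and the process of obtaining it is called regression.

Suppose a house's floor area and number of bedrooms are the independent variables x, with x1 the floor area and x2 the number of bedrooms. The sale price is the dependent variable y, which we represent with h(x).

Assume that floor area and number of bedrooms are linearly related to the sale price.

They satisfy the formula $h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2$. The θ in this formula are parameters, also called weights, and can be understood as the degree to which x1 and x2 influence h(x).

A slight rewrite gives $h(x) = \sum_{i=0}^{n} \theta_i x_i = \theta^T x$ (with $x_0 = 1$), where θ and x can both be seen as vectors and n is the number of features.

If we use this formula to predict h(x), the x in it is known (the feature values in the sample), but the value of θ is not. Once we have solved for θ, we can use the formula to make predictions.

Least mean squares (LMS)
Before introducing LMS, let us look at the concept of a loss function.

Our task is to use the training set to choose the best θ, so that h(x) is as close as possible to the true values on the training set.

We define a function to describe the gap between h(x) and the true values. This function is called the loss function: $J(\theta) = \frac{1}{2}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$, where m is the number of training examples. This is the well-known least-squares loss function. It is related to the method of least squares, which is not covered here.

We want to choose the best θ, so that h(x) is as close as possible to the true values.

The problem thus becomes finding the θ that minimizes the loss function J(θ).

(There are many other types of loss function.) How do we solve this transformed problem? This brings in two further concepts: LMS and gradient descent.

LMS is the theoretical basis for obtaining the regression function h(x): it finds the best parameters by minimizing the mean squared error.

Gradient descent
We want the value of θ that minimizes J(θ). The rough idea of gradient descent is this: first give θ an arbitrary initial value, then change θ so that J(θ) becomes smaller, and keep repeating this step until J(θ) is approximately at its minimum.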
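The two variants named in the title differ only in how much data each update uses. The sketch below assumes the squared-error loss and linear h(x) described above. Batch gradient descent averages the gradient over the whole training set before each update, while stochastic gradient descent updates θ after every single example. The learning rate, epoch count, and toy data are arbitrary choices for illustration, and the batch gradient is divided by the number of examples, which only rescales the learning rate.

```python
import random


def predict(theta, x):
    """Linear hypothesis h(x) = theta^T x (x[0] is the constant 1)."""
    return sum(t * xi for t, xi in zip(theta, x))


def batch_gd(xs, ys, lr=0.05, epochs=2000):
    """Batch gradient descent: one update per pass over all examples."""
    theta = [0.0] * len(xs[0])
    for _ in range(epochs):
        grad = [0.0] * len(theta)
        for x, y in zip(xs, ys):
            err = predict(theta, x) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj
        theta = [t - lr * g / len(xs) for t, g in zip(theta, grad)]
    return theta


def stochastic_gd(xs, ys, lr=0.05, epochs=2000, seed=0):
    """Stochastic gradient descent: one update per training example."""
    rng = random.Random(seed)
    theta = [0.0] * len(xs[0])
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            err = predict(theta, xs[i]) - ys[i]
            theta = [t - lr * err * xj for t, xj in zip(theta, xs[i])]
    return theta


# Toy data generated from y = 1 + 2 * x1, with x0 = 1 as the intercept feature.
xs = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
ys = [1.0, 3.0, 5.0, 7.0]
print(batch_gd(xs, ys))
print(stochastic_gd(xs, ys))
```

On this noise-free data both variants approach θ = (1, 2). On large datasets the stochastic version makes progress after each example instead of after a full pass, at the cost of noisier updates.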

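The logging_steps parameter in PyTorch training

Summary: 1. Overview 2. Where logging_steps comes from 3. What logging_steps does 4. How to use logging_steps 5. Example 6. Conclusion

Main text:
1. Overview
PyTorch is a widely used deep learning framework that provides many convenient features, such as automatic differentiation, optimizers, and data loading and preprocessing.

During training it is useful to record information such as the loss and the accuracy.

This is very helpful for analyzing model performance and debugging problems.

A setting named logging_steps is commonly used to control how often this information is recorded.

2. Where logging_steps comes from
PyTorch's core API does not define a logging_steps parameter or function. The name is best known as an integer argument of TrainingArguments in the Hugging Face Transformers library, where it sets the number of update steps between two log outputs. In a hand-written PyTorch training loop, logging_steps is simply an integer variable that you define yourself.

The integer gives the number of steps between two log outputs.

3. What logging_steps does
The main purpose of logging_steps is to control how often logs are emitted.

Once logging_steps is set, a log line is emitted whenever the step count is a multiple of logging_steps.

This lets us follow the model's performance during training and use the logged values to adjust hyperparameters or resolve potential problems.

4. How to use logging_steps
In a plain PyTorch training loop, define logging_steps as an integer, count the steps inside the loop, and print (or otherwise record) the metrics whenever the step count is divisible by logging_steps.

Here is a simple example:

```python
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
from torch.utils.data import DataLoader


# Define a simple model
class SimpleModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 16, 5)
        self.conv2 = nn.Conv2d(16, 32, 5)
        self.fc1 = nn.Linear(32 * 4 * 4, 256)
        self.fc2 = nn.Linear(256, 10)

    def forward(self, x):
        x = nn.functional.relu(self.conv1(x))  # 28x28 -> 24x24
        x = nn.functional.max_pool2d(x, 2)     # 24x24 -> 12x12
        x = nn.functional.relu(self.conv2(x))  # 12x12 -> 8x8
        x = nn.functional.max_pool2d(x, 2)     # 8x8 -> 4x4, matching fc1
        x = torch.flatten(x, 1)
        x = nn.functional.relu(self.fc1(x))
        x = self.fc2(x)
        return nn.functional.log_softmax(x, dim=1)


# Create the dataset and data loader (ToTensor turns PIL images into tensors)
data = torchvision.datasets.MNIST(
    root="./data", train=True, download=True,
    transform=torchvision.transforms.ToTensor(),
)
train_loader = DataLoader(data, batch_size=64, shuffle=True)

# Log once every logging_steps batches (a plain integer, not a PyTorch API)
logging_steps = 10

# Train the model
model = SimpleModel()
optimizer = optim.Adam(model.parameters(), lr=0.001)
num_epochs = 10
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        # Forward pass
        outputs = model(images)
        loss = nn.functional.nll_loss(outputs, labels)
        # Backward pass and optimization
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        # Emit a log line every logging_steps batches
        if (i + 1) % logging_steps == 0:
            print("Epoch [{}/{}], Step [{}/{}], Loss: {:.4f}".format(
                epoch + 1, num_epochs, i + 1, len(train_loader), loss.item()))
```

5. Example
In the example above, we first import the required libraries and define a simple model.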

Volocity Tracking Tutorial manual

Volocity Tutorial: Tracking
This tutorial will demonstrate how to perform tracking using Volocity®.

Data
Live Cell Tracking

Workflow
Tracking objects may be appropriate if you are interested in characterizing the movement of objects (i.e. their speed, direction), or monitoring properties of objects as they move over time. In Volocity, tracking is a two stage process:
1) The identification of the objects.
2) The analysis of the positions of those objects and the building of tracks.

Finding objects
Click once on the data in the library list and click on the Measurements tab to display the Measurement View. The image is shown in the mode that best shows the objects to be measured, and below it is an area where all measurements made will be displayed. At the top left of the screen is an area where a measurement protocol will be built, to find objects of interest in the dataset and track them, using the list of tasks in the area below.

TUTORIAL NOTE
It is important that the Measurement Protocol identifies objects as accurately as possible in each timepoint as this will be essential for the tracking algorithm.

The objects within this dataset do not exhibit the same intensity values throughout the time-course; therefore thresholding on the same intensity values in each timepoint is unlikely to be successful. The task “Find Objects Using SD Intensity” selects intensities within a specified number of standard deviations from the mean. Select this task and drag and drop it into the measurement protocol.

Where intensities are found within range a colored overlay is applied. Groups of selected intensities form objects. View the image, with object overlays, in different ways by changing the mode of view in the top left.

In this example, some of the objects formed are too large; they are actually two or more objects currently identified as one. Use the task “Separate Touching Objects” to improve this situation.

The Object size guide, shown in the “Separate Touching Objects” task box, can be set to the approximate size of the smallest object that can be created by the separating step. In this example the Object size guide is 0 µm³; the best separation is achieved by not restricting the size of objects that will be made.

Tracking
Once objects have been identified they can be connected together by tracks. Drag and drop a “Track Objects” task to the measurement protocol.

The tracking algorithm uses the centroid measurement for each previously identified object to determine whether there is any movement of objects over time. Tracks are generated by connecting the centroids so as to trace the path of a moving object. The track objects task will always place itself at the bottom of the list of tasks in the protocol since objects must be found before they can be tracked.

To see the results of the “Track Objects” task all timepoints must be measured. Select Measure all timepoints from the Measurements menu.

To alter how the tracks are measured, click on the cog icon on the “Track Objects” task to access the secondary dialog for this task. For example, it may be necessary to set a maximum distance between objects, in this example 5 µm.

Every object measured is given a unique object ID. Objects that have been tracked, and are therefore determined to be the same object in different locations, are given the same Track ID; however their object ID remains the same.

In the drop-down menu at the top of the measurements table, filter by tracks to just see measurements made on tracks. This contains summary information such as track length, the average velocity for the duration of the track, the trajectory and the meandering index.

Filter by objects and sort by Track ID to see individual properties of the object and how they change over time.

Now that the objects have been measured and tracked, we should confirm these tracking results by examining individual tracks. Select a row (shift-click to select multiple rows) in the table, representing a track, to show the individual overlay of that track on the image.

Showing object feedback for the current timepoint only can assist in understanding what is being shown. To adjust the feedback that is displayed on the image, select Feedback Options... from the Measurements menu.

Use the time navigation controls to compare the feedback with the underlying image data.

The most likely problem with tracks is caused by setting the wrong maximum distance between objects in the secondary dialog of the “Track Objects” task (as discussed previously). If tracks are incorrect because they switch to different objects part way through the time series, the maximum distance assigned is too great. If tracks are incorrect because they do not follow an object far enough in the time course, the maximum distance may be too small. Adjust as appropriate.

Analysis of tracking data
The tracking process generates information about object movement and object properties over time. Tracks record information about the behavior of the object for the duration of the time that it was followed. Individual properties for each object, some of which are added by the tracking step, are also recorded.

To easily extract what is of particular interest from this wealth of information, you may store all the measurements, in table format, as a separate Measurement Item within the library, and then perform further analysis. Select Make Measurement Item… from the Measurements menu, remembering to select Measure All Timepoints when prompted.

Raw tables, analysis tables and charts, created within the resulting Measurement Item, can all be viewed in Volocity or exported as text or image files.

Volocity is a registered trademark of PerkinElmer, Inc. All other trademarks are the property of their respective owners.
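The idea of connecting centroids across timepoints, subject to a maximum distance, can be illustrated with a small sketch. This is not Volocity's tracking algorithm, whose details the tutorial does not give: it is a greedy nearest-centroid linker written for illustration, and the function names, the 2D coordinates, and the definition of the meandering index as net displacement divided by path length are all assumptions.

```python
import math


def link_tracks(frames, max_distance):
    """Greedy nearest-centroid linking across timepoints.

    frames: list of timepoints, each a list of (x, y) centroids.
    Returns a list of tracks, each a list of (timepoint, centroid) pairs.
    """
    tracks = []
    open_tracks = []  # tracks that were extended at the previous timepoint
    for t, centroids in enumerate(frames):
        unused = set(range(len(centroids)))
        still_open = []
        for track in open_tracks:
            last = track[-1][1]
            best, best_d = None, None
            for j in sorted(unused):
                d = math.dist(last, centroids[j])
                if d <= max_distance and (best is None or d < best_d):
                    best, best_d = j, d
            if best is not None:
                unused.remove(best)
                track.append((t, centroids[best]))
                still_open.append(track)
        # Centroids that no existing track claimed start new tracks.
        for j in sorted(unused):
            track = [(t, centroids[j])]
            tracks.append(track)
            still_open.append(track)
        open_tracks = still_open
    return tracks


def track_length(track):
    """Total path length along the track."""
    return sum(math.dist(a[1], b[1]) for a, b in zip(track, track[1:]))


def meandering_index(track):
    """Net displacement divided by path length (assumed definition)."""
    length = track_length(track)
    if length == 0:
        return 0.0
    return math.dist(track[0][1], track[-1][1]) / length


# Hypothetical centroids for three timepoints, in µm.
frames = [[(0, 0), (10, 10)], [(1, 0), (10, 11)], [(2, 0), (30, 30)]]
for track in link_tracks(frames, max_distance=5):
    print(track, track_length(track))
```

With a maximum distance of 5, the object that jumps to (30, 30) is not joined to an existing track. Raising the limit far enough would link it, which is the track-switching failure the tutorial warns about.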

Design of the Conveyor Belt Control System for a Production Line Based on PLC: graduation project (thesis)

Abstract
Conveyor belt systems are now widely used in every area of industry. They are simple in structure, run smoothly and reliably, consume little energy, cause little environmental pollution, lend themselves to centralized control and automation, are easy to manage and maintain, and can transport material continuously under continuous loading.

Conveyor belts can be controlled in many ways, for example by a single-chip microcontroller, a PLC, or a computer. In the past, contactor-relay control systems were used throughout. Such systems have complex wiring and poor immunity to interference, are prone to faults caused by poor contacts, and are hard to extend. Because PLCs are highly reliable and full-featured, they are increasingly favored by enterprises, and traditional contactor-relay control systems are gradually being replaced by PLCs.

Based on coursework and the literature, this design adopts centralized PLC control. It uses the PLC's internal memory to carry out logic operations, sequential control, timing, counting, and arithmetic, and uses digital and analog inputs and outputs to complete the control process, thereby achieving intelligent control of the conveyor belt. A PLC can also combine many functions of a computer with a relay control system, which ties the PLC more closely to configuration control software and improves its analog control, position control, and remote communication capabilities.

This design therefore uses a PLC to control the entire operation of the conveyor belt, using the PLC's simple, visual programs to achieve automatic control. Using a PLC makes the system's circuitry simple and clear and makes later operation and maintenance of the belt conveyor very convenient.

Keywords: PLC; conveyor belt; centralized control

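Lean production: Line Balance Models (Chinese-English version)

Line Balance Model
Revised 1-12-02

Learning objectives (What’s in It for Me?)
• Able to design and implement a balanced process line
• Understand the issues in a typical process environment and how to impact those issues
• How to design and implement a process supported by the line balance model so as to ensure the optimal allocation of people, space, fixed assets, and materials
• Know how to maximize productivity

Lean 6: process improvement flow
Define, Measure, Analyze, Improve, Control

Define
• Select the topic
• List the customers
• List the key requirements from the voice of the customer
• Set the project focus and key metrics
• Complete the PDF

Measure
• Draw the business process map
• Draw the value stream map
• Draw up the data collection plan
• Measurement system analysis
• Collect data
• Process capability analysis

Analyze
• Propose key factors
• Distinguish key factors
• Verify key factors
• Evaluate the effect of each key factor on the result
• Quantify the opportunity
• Rank root causes
• Find the root causes behind the key factors

Improve
• Confirm key factors
• Uncover potential solutions
• Select a solution
• Optimize the solution
• Implement the solution

Control
• Process change and control
• Draw up the control plan
• Calculate the final financial process metrics
• Hand the project over to its future managers
• Identify opportunities to transfer the project

Tools
• Project numbering tool, project definition form, net present value (NPV) analysis, internal rate of return (IRR) analysis, discounted cash flow analysis (cash flows valued at present value), PIP management process, RACI, Quad chart
• Process Mapping, Value Analysis, Brainstorming, Multi-Voting Techniques, Pareto Charts, C&E/Fishbone Diagrams, FMEA, Check Sheets, Run Charts, Control Charts, Gage R&R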

PMP (Project Management Professional) English Test Questions, Answers, and Explanations

1. What is the main purpose of the Project Management Plan (PMP) document?
A. To serve as a legal document for the project
B. To provide a comprehensive guide for executing, monitoring, and closing the project
C. To document all project risks
D. To list all project stakeholders
Answer: B. The Project Management Plan is a key document that integrates all the subsidiary management plans and other information necessary for effective project management. It serves as a guide for coordinating all project activities.

2. Which of the following is NOT a process group in the PMBOK (Project Management Body of Knowledge)?
A. Initiating
B. Planning
C. Executing
D. Delegating
Answer: D. The PMBOK Guide outlines five process groups: Initiating, Planning, Executing, Monitoring and Controlling, and Closing. Delegating is not a process group but rather an action that may occur within the Executing process group.

3. What is the primary role of the project manager in the Executing process group?
A. To define new project policies
B. To implement the project management plan to ensure the project deliverables are produced
C. To perform quality control on all deliverables
D. To close the project
Answer: B. During the Executing process group, the project manager's primary role is to work with the team to implement the project management plan, ensuring that the project deliverables are produced according to the plan.

4. In the context of project management, what does the acronym "WBS" stand for?
A. Work Breakdown Structure
B. Workload Budget System
C. Workload Breakdown Schedule
D. Workflow Breakdown System
Answer: A. WBS stands for Work Breakdown Structure, which is a hierarchical decomposition of the work to be executed by the project team to deliver the project's products, services, and results.

5. What is the purpose of the Change Control Board (CCB)?
A. To approve or reject all changes to the project scope
B. To manage the project budget
C. To recruit project team members
D. To develop the project schedule
Answer: A. The Change Control Board is responsible for reviewing, evaluating, approving, or rejecting changes to the project scope, ensuring that all changes are documented and managed properly.

6. Which of the following is a true statement regarding the project management process?
A. The process of project management is sequential and linear
B. The process of project management is iterative and overlapping
C. The process of project management is completed in one phase
D. The process of project management involves only the project manager
Answer: B. The process of project management is iterative and overlapping, with many processes occurring in parallel and requiring ongoing management and control throughout the project's life cycle.

7. What is the main difference between a project and a program?
A. A project is a temporary endeavor, while a program is permanent
B. A project has a definite beginning and end, while a program does not
C. A project is always smaller in scope than a program
D. A project is managed by a team, while a program is managed by an individual
Answer: B. A project is a temporary endeavor with a definitive beginning and end, while a program is an organized group of related projects managed together to achieve strategic objectives.

8. What is the purpose of the Risk Management Plan in project management?
A. To identify and prioritize risks that may impact the project
B. To document all known risks and their mitigation strategies
C. To predict the exact outcome of the project
D. To provide a legal document for the project stakeholders
Answer: B. The Risk Management Plan documents how risk management activities will be structured and performed. It includes how risks will be identified, analyzed, and prioritized, as well as how mitigation strategies will be implemented.

9. In project management, what does the term "Critical Path" refer to?
A. The sequence of activities that affects the project's duration
B. The path with the highest financial cost
C. The path with the most complex tasks
D. The path that leads to the project's success
Answer: A. The Critical Path is the sequence of activities that determines the duration of the project. It is the longest path through the project's activity network and identifies the activities that have the least amount of scheduling flexibility.

10. Which of the following is NOT an output of the Close Project or Phase process?
A. Project documents updates
B. Contract closure
C. Final product, service, or result transition
D. Lessons Learned register
Answer: D. The Lessons Learned register is an output of the Closing process group, but it is not specifically an output of the Close Project or Phase process. The Close Project or Phase process does include project document updates, contract closure, and the transition of the final product, service, or result.

In conclusion, understanding


3-Lecture Minicourse on Statistics of Survival Data
Eric Slud

I. (11/6) Death Hazards & Competing Risks
Concepts: (i) Statistical Estimation as mathematical problem, (ii) Identifiability, parametric vs. nonparametric.
II. (11/13) Population Cohorts & Martingales
Concepts: (iii) Counting process models, (iv) “Innovations” and Statistics.
III. (11/20) Models and Likelihoods with ∞-Dimensional Parameters
Concepts: (v) Nuisance parameters, (vi) Asymptotic Relative Efficiency.

Lecture Slides (incl. annotated references) at: /∼evs/SurvSlid3.pdf

Sources for Current Lecture:
• R. Miller (1983), “What price Kaplan-Meier?”, Biometrics: param. vs. nonparam. efficiency
• Slud & Kong (1997), Biometrika: treatment effectiveness testing using ‘adaptively’ fitted misspecified Cox models
• Slud & Korn (1997), Biometrika: testing in 2-grp case w/ ∞-dim nuisance parameters in setting with ‘post-randomization’ variables
• Slud & Vonta (2002), Consistency of the NPML estimator in the right-censored transformation model. Preprint, available from web-page.

Parametric vs. Nonparametric Trade-off
Return to survival-data setting of the first lecture to focus the question of how much ‘efficiency’ is lost by nonparametric statistical estimation of survival probability.

Data: $T_i = \min(X_i, C_i)$, $\Delta_i = I[X_i \le C_i]$, $Z_i$, $1 \le i \le n$
(event time, death-indicator, treatment-grp indicator)

First Objective: estimation of $P(X_1 > t)$ including 95% Confidence Interval, under assumption either of indep. $X_i$, $C_i$ or a more detailed parametric model. Compare estimates based on popular parametric model
• Exponential, which says $f_X(x) = \lambda e^{-\lambda x}$, $x > 0$, or more general
• Weibull, saying $f_X(x) = \lambda \gamma x^{\gamma - 1} e^{-\lambda x^{\gamma}}$, $x > 0$
vs. Kaplan-Meier estimate (no other assumptions).

Methodology: statistical theory provides asymptotic prob. dist’n for estimator and 95% confidence interval in each setting, which can be compared through $\sigma$:
$$\sqrt{n}\,\bigl(\tilde{p} - P(X_1 > t)\bigr) \xrightarrow{D} N(0, \sigma^2), \qquad P(X_1 > t) \in \left(\tilde{p} - 1.96\,\frac{\sigma}{\sqrt{n}},\ \tilde{p} + 1.96\,\frac{\sigma}{\sqrt{n}}\right)$$

Censoring dist.’n unknown (‘nuisance parameter’) but not depending on $\vartheta$ so ignored in likelihood. Function to maximize in $\vartheta = (\lambda, \gamma)$ becomes … (indexed over $i = 1, \dots, n$)

Nonparametric Kaplan-Meier Th’y
Recall that $S_X(t) = \exp\bigl(-\int_0^t h_X(x)\,dx\bigr)$. Define the cumulative hazard function
$$H_X(t) = \int_0^t h_X(x)\,dx = -\ln(S_X(t))$$
KM estimator of $S_X(t)$ from survival data is equiv. to
$$\hat{H}_X(t) = \int_0^t \frac{dN(x)}{Y(x)}, \quad \text{where} \quad N(t) = \sum_{i=1}^{n} \Delta_i\, I[T_i \le t], \qquad Y(t) = \sum_{i=1}^{n} I[T_i \ge t]$$
Recall from last time: can view survival on $(t, t + \delta)$ for each surviving individual as an indep. coin-toss: failure occurs with prob. $\approx \delta \cdot h_X(t)$ each, so overall prob. of an observed failure is $\delta \cdot Y(t)\, h_X(t)$. Hence
$$N(t) - \int_0^t Y(x)\, h_X(x)\,dx$$
is a martingale, as is
$$\sqrt{n}\,\bigl(\hat{H}_X(t) - H_X(t)\bigr) = \sqrt{n} \int_0^t \frac{dN(x) - h_X(x)\,Y(x)\,dx}{Y(x)}$$
From this, can prove asympt. normality and find variance formula, leading to
$$\sqrt{n}\,\bigl(\hat{S}_{KM}(t) - S_X(t)\bigr) \xrightarrow{D} N\!\left(0,\ S_X^2(t) \int_0^t \frac{f_X(x)\,dx}{S_X^2(x)\, S_C(x)}\right)$$

Efficiency Comparison
Since Conf. Int. widths are proportional to $\sigma_{est}/\sqrt{n}$, equal widths can be achieved if another estimator with avar $\sigma_{alt}^2$ is applied on sample of size $n_{alt}$, where
$$\frac{n_{alt}}{n} = \sigma_{alt}^2 / \sigma_{est}^2 = \mathrm{ARE}(\hat{\vartheta}_{alt}, \hat{\vartheta}_{est})$$
Miller’s (1983) ARE comparisons (Weibull vs KM), in case of Expon censoring, are:

ARE’s of KM versus parametric MLE of survival prob. at 3 quantiles 0.5, 0.25, 0.1 for Weibull $S_X$ and Exponential $S_C$. Shape parameter = $\gamma$.

γ   Known   Cens. %   Med.   Upper Quartile   Upper Decile
1   Y       50        .64    .51              .21
1   Y       25        .56    .64              .46
1   Y       0         .48    .64              .59
2   Y       50        .57    .58              .40
2   Y       25        .52    .63              .52
2   Y       0         .48    .64              .59
2   N       50        .60    .62              .56
2   N       25        .63    .63              .62
2   N       0         .66    .64              .65

Multiplicative Intensity Model
Cox (1972), Aalen (1978) introduced the class of models
$$E\bigl(N(t + dt) - N(t) \mid Z, (Y(s), V(s) : s < t)\bigr) = Y(t-)\, e^{\beta Z + \gamma V(t-)}\, \lambda(t)\,dt$$
Idea: parameters $(\beta, \gamma)$ to be fitted describe effect on prognosis of individual subjects, while the (infinite-dimensional) nuisance hazard function $\lambda(t)$ describes the general background population. Exponent usable as prognostic index.

Research Topics Related to Today’s Lecture:
• Theoretical: large-sample theory of efficient estimators for semiparametric models like these with ∞-dim nuisance parameters. Efron (1977), Johansen (1983) and others proved efficiency of Cox’s (1972) estimator $\hat{\beta}$ based on maximizing logLik with $\Lambda(t) = \int_0^t \lambda(s)\,ds$ replaced by $\hat{\Lambda}(t) = $ …
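The counting-process quantities N(t) and Y(t) from the Kaplan-Meier slide, and the no-censoring entries of Miller's table, can be checked numerically. The sketch below is an illustration under stated assumptions rather than anything taken from the slides: the function names and the five-subject dataset are hypothetical, and the ARE function covers only the exponential model ($\gamma = 1$) with no censoring ($S_C \equiv 1$). In that case the KM variance formula reduces to $S(1 - S)$, and the delta method gives the exponential MLE of $S = e^{-\lambda t}$ the asymptotic variance $(S \ln S)^2$.

```python
import math


def km_and_nelson_aalen(times, deltas):
    """Return [(t, H_hat(t), S_hat(t))] at each distinct observed death time.

    times:  T_i = min(X_i, C_i)
    deltas: Delta_i = 1 if the death was observed (X_i <= C_i), else 0
    """
    death_times = sorted({t for t, d in zip(times, deltas) if d == 1})
    cum_hazard, survival, out = 0.0, 1.0, []
    for t in death_times:
        at_risk = sum(1 for u in times if u >= t)  # Y(t)
        deaths = sum(1 for u, d in zip(times, deltas) if u == t and d == 1)  # dN(t)
        cum_hazard += deaths / at_risk       # increment of the integral of dN / Y
        survival *= 1.0 - deaths / at_risk   # Kaplan-Meier product-limit step
        out.append((t, cum_hazard, survival))
    return out


def are_km_vs_exponential_mle(s):
    """ARE of KM vs. the exponential MLE of S(t) = s, assuming no censoring."""
    return (s * math.log(s)) ** 2 / (s * (1.0 - s))


# Hypothetical data: five subjects, two of them censored.
for row in km_and_nelson_aalen([2, 3, 3, 5, 8], [1, 1, 0, 1, 0]):
    print(row)

# Median, upper quartile, upper decile of the survival distribution.
print([round(are_km_vs_exponential_mle(s), 2) for s in (0.5, 0.25, 0.1)])
```

The last line reproduces the table row for $\gamma = 1$ with 0% censoring (.48, .64, .59). The censored rows depend on $S_C$ through the integral in the variance formula and are not covered by this sketch.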