A Survey of Intrusion Detection Systems


Intrusion Detection Systems (II)

CS 6262 Spring 02 - Lecture #6 (Thursday, 1/24/2002)
STAT/USTAT
State transition analysis: a rule-based intrusion detection approach
CMIP, SNMP, TMN
Intrusion Detection System (IDS)
You have learnt from this class. ☺
Issues in IDS & NMS
Issues in the Current IDS
Anomaly Detection
Model Construction, Feature Selection
Some Details
Anomaly detection for Unix processes
"Short sequences" of system calls as normal profile (Forrest et al., UNM)
…, open, read, mmap, mmap, open, getrlimit, mmap, close, …
Sliding window of length k:
open, read, mmap, mmap
read, mmap, mmap, open
mmap, mmap, open, getrlimit
mmap, open, getrlimit, mmap
…
normal
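The sliding-window step on this slide can be sketched as follows. This is a minimal illustration, not the UNM implementation; the function name `sliding_windows` is hypothetical, and k = 4 is inferred from the four-call windows shown above.

```python
def sliding_windows(trace, k):
    """Return every length-k window of consecutive system calls in trace."""
    return [tuple(trace[i:i + k]) for i in range(len(trace) - k + 1)]


# The system call trace shown on the slide (without the leading/trailing "…").
trace = ["open", "read", "mmap", "mmap", "open", "getrlimit", "mmap", "close"]

# k = 4 is an assumption taken from the window size shown on the slide.
windows = sliding_windows(trace, 4)
for window in windows:
    print(",".join(window))
```

The set of windows collected from normal runs would then serve as the "normal" profile.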
A Sense of Self - Immunology Approach
Prof. Forrest at University of New Mexico

Network Intrusion Data Detection Based on Sparse Vector Distance

Authors: 杨浩; 章玲玲; 熊焕东; 谢昕

Abstract: Traditional network intrusion detection is slow, has poor real-time performance, and suffers from a high false alarm rate. To address this, a network intrusion data detection method based on sparse vector distance is proposed. The method first performs a preliminary analysis of the collected network sample data and applies the K-means algorithm to quantize the sample packets, yielding the position distribution set of the data stream. Sparse coding from compressed sensing is then used to obtain a sparse representation of the data. Next, random projection produces binary hash codes of the data set; the distance between these codes approximates the distance between the sparse vectors, and it is compared with a preset threshold to decide whether the data is intrusion data. Using these sparse vector distances, intrusive network data can be detected quickly and accurately. Experimental results show that, compared with traditional detection algorithms, the proposed algorithm is fast, has good real-time performance and a low false alarm rate, and greatly improves the performance of the intrusion detection system.

Journal: 《科学技术与工程》
Year (volume), issue: 2017 (017) 027
Pages: 5 (P88-92)
Keywords: compressed sensing; locality-sensitive hashing; intrusion detection; binary hash coding; network security
Affiliations: 国网江西省电力科学研究院, Nanchang 330096 (杨浩, 章玲玲); 华东交通大学信息工程学院, Nanchang 330013 (熊焕东, 谢昕)
Language of main text: Chinese
CLC number: TP393.08

Today, as computer technology continues to develop and spread, network intrusions appear one after another. Internet attack methods have become more diverse, more intelligent, and more complex, intrusion detection has become much harder, and network security is receiving more and more attention.
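The hashing step described in the abstract (random projection to binary codes, then a distance compared with a threshold) can be sketched roughly as below. This is an illustration under assumptions, not the paper's algorithm: it uses sign random projection with Gaussian weights, compares a sample against a single reference vector of normal traffic, and all names (`make_projections`, `binary_hash`, `hamming`, `is_intrusion`) and parameter values are hypothetical. The K-means quantization and sparse coding stages are omitted.

```python
import random


def make_projections(dim, bits, seed=0):
    """Draw `bits` random Gaussian projection vectors of length `dim`."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]


def binary_hash(vector, projections):
    """One bit per projection: 1 if the dot product is non-negative, else 0."""
    return [
        1 if sum(p * x for p, x in zip(row, vector)) >= 0 else 0
        for row in projections
    ]


def hamming(code_a, code_b):
    """Number of positions where two binary codes differ."""
    return sum(a != b for a, b in zip(code_a, code_b))


def is_intrusion(sample, reference, projections, threshold):
    """Flag the sample if its code is farther than `threshold` from the reference."""
    distance = hamming(binary_hash(sample, projections),
                       binary_hash(reference, projections))
    return distance > threshold


# Hypothetical sparse vectors: most entries are zero.
projections = make_projections(dim=8, bits=32)
normal = [0.0, 1.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.5]
opposite = [-x for x in normal]
```

Vectors that point in similar directions share most hash bits, so the Hamming distance between codes is a cheap stand-in for the distance between the sparse vectors; the threshold value would have to be tuned on real data.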

A Survey of Host Based Intrusion Detection Systems

– Anomaly also captures misuse:
• There is no intrusion; however, due to bad programming or administration, the process behaves differently than normal (i.e., a bug in the code).
• Intrusions are also anomalies.
[Figure: A Visual Description of False Positives and Completeness. Labels: Normal Behavior, Model, False Positives, False Negatives.]
N-Gram
• Pioneering work in the field.
• Forrest et al., A Sense of Self for Unix Processes, 1996.
• Tries to define normal behavior for a process by using sequences of system calls.
• As the name of their paper implies, they show that fixed-length short sequences of system calls distinguish applications from one another.
• For every application a model is constructed, and at runtime the process is monitored for compliance with the model.
• Definition: The list of system calls issued by a program for the duration of its execution is called a system call trace.
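The model-then-monitor idea in the bullets above can be sketched as follows. This is a simplified illustration, not the procedure from the Forrest et al. paper: the profile is just the set of n-grams seen in normal traces, the anomaly signal is the fraction of unseen n-grams, and the names `build_profile` and `mismatch_rate` and the example traces are hypothetical.

```python
def ngrams(trace, n):
    """All length-n sequences of consecutive system calls in a trace."""
    return [tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)]


def build_profile(normal_traces, n):
    """Model of normal behavior: every n-gram seen in the normal traces."""
    profile = set()
    for trace in normal_traces:
        profile.update(ngrams(trace, n))
    return profile


def mismatch_rate(trace, profile, n):
    """Fraction of the trace's n-grams that do not appear in the profile."""
    grams = ngrams(trace, n)
    if not grams:
        return 0.0
    return sum(1 for g in grams if g not in profile) / len(grams)


# Hypothetical training traces for one application.
normal_traces = [
    ["open", "read", "mmap", "close"],
    ["open", "read", "read", "mmap", "close"],
]
profile = build_profile(normal_traces, 3)

ok_rate = mismatch_rate(["open", "read", "mmap", "close"], profile, 3)
bad_rate = mismatch_rate(["open", "execve", "socket", "close"], profile, 3)
print(ok_rate, bad_rate)
```

A monitor would raise an alert when the mismatch rate of a running process exceeds some chosen threshold.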

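A Survey of Machine-Learning-Based IoT Intrusion Detection Systems

IoT devices are becoming increasingly intelligent and are widely used in many fields, such as the home, education, entertainment, energy distribution, finance, healthcare, smart cities, tourism, and transportation, simplifying people's daily lives and the way they work.

However, both industry and academia are moving with the tide of commercialization while paying little attention to the security of IoT devices. This may endanger IoT users and, in more serious cases, even unbalance the ecosystem.

For example, a manufacturing employee plugging a virus-infected USB drive into a machine, a hospital MRI machine damaged by malware, or a hacker directing an infusion pump to inject a lethal dose of medication would all have serious consequences.

According to [1], by 2020 the cost of cybercrime damage will reach 6 billion US dollars per year, and 50 billion IoT devices will need protection.

An attack on the IoT [2] affects not only the IoT itself but also the complete ecosystem, including networks, applications, social platforms, and servers. That is, in an IoT system, compromising a single component or communication channel may paralyze part or all of the network.

Therefore, while attending to the convenience the IoT brings, its vulnerability [3] must also be considered.

Traditional security solutions already cover servers, networks, and cloud storage, and most of these solutions can be deployed in IoT systems.

Among them, cryptography [4], the foundation of information security, manages the keys of nodes in the network through interaction between a key center and the sink nodes of sensor networks or other sensing networks; common methods for protecting data security include homomorphic encryption and ciphertext retrieval; other security techniques such as authentication and access

A Survey of Machine-Learning-Based IoT Intrusion Detection Systems
王振东, 张林, 李大海
江西理工大学, Ganzhou, Jiangxi 341000

Abstract: The wide application of IoT technology brings convenience but also causes many security problems. A complete and stable system is urgently needed to ensure IoT security so that IoT objects can communicate securely and effectively, and intrusion detection systems have become a key technology for protecting the IoT.

With the continuous development of machine learning and deep learning, researchers have designed a large number of effective intrusion detection systems; this paper surveys that research.

It compares current IoT security with traditional system security; classifies intrusion detection systems in detail by detection technique, data source, architecture, and mode of operation; describes current machine-learning-based IoT intrusion detection systems, starting from their data sets; and discusses future directions for IoT security.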

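Intrusion Detection and Defense Strategies in Network Security

With the rapid development of the Internet, network security has drawn increasing attention.

Against this backdrop, intrusion detection and defense strategies have become key parts of network security.

This article focuses on intrusion detection and defense strategies, with the aim of giving readers a detailed understanding of the topic.

I. Intrusion Detection

Intrusion detection means discovering and responding to potential intrusion activity in a timely manner by analyzing network traffic, system behavior, user behavior, and similar data.

It falls into two main types: signature-based intrusion detection systems and anomaly-based intrusion detection systems.

1. Signature-based intrusion detection systems

A signature-based intrusion detection system inspects network traffic and system behavior by watching for the signatures of known malicious behavior.

This approach relies on predefined rules and a signature database; when behavior matching these rules and signatures appears in the system, an alert is raised.

Its limitation, however, is that it cannot detect unknown intrusions, and the rule base must be continually maintained and updated.

2. Anomaly-based intrusion detection systems

An anomaly-based intrusion detection system learns and analyzes patterns of normal system behavior and raises an alert when it finds behavior that differs markedly from these patterns.

This approach is effective at detecting new types of intrusion, but it is also prone to false alarms.

Choosing and setting thresholds sensibly is therefore the key to using this approach.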
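The role of the threshold in anomaly-based detection can be illustrated with a small sketch. It assumes a single numeric feature (a made-up "logins per hour" series) and a z-score test, which is only one of many possible models; the names `fit_baseline` and `is_anomalous` and the default threshold of 3.0 are illustrative choices, not something the article specifies.

```python
from statistics import mean, stdev


def fit_baseline(normal_values):
    """Learn a baseline (mean and sample standard deviation) from normal data."""
    return mean(normal_values), stdev(normal_values)


def is_anomalous(value, baseline, threshold=3.0):
    """Flag a value whose z-score against the baseline exceeds the threshold."""
    mu, sigma = baseline
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical observations of normal behavior.
normal_logins_per_hour = [10, 12, 11, 9, 10, 13, 11, 10]
baseline = fit_baseline(normal_logins_per_hour)

print(is_anomalous(12, baseline))                 # ordinary value, default threshold
print(is_anomalous(40, baseline))                 # far outside the baseline
print(is_anomalous(12, baseline, threshold=0.5))  # same ordinary value, strict threshold
```

Lowering the threshold catches more deviations but also flags ordinary values, which is the false-alarm trade-off described above.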

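II. Defense Strategies

Besides intrusion detection, effective defense strategies are also an important means of protecting network security.

Several common defense strategies follow.

1. Firewalls

A firewall is a network security device that protects an internal network from external attack by filtering malicious packets out of network traffic.

A firewall can be configured with rules that restrict access for specific IP addresses, ports, or applications, effectively controlling the flow of malicious traffic.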
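Rule-based filtering by IP address and port, as described above, can be sketched like this. It is a toy first-match rule evaluator, not how any particular firewall product works; the rule table, the `decide` function, and the default-deny policy are assumptions, and the addresses come from the reserved documentation ranges.

```python
import ipaddress

# Each rule: (action, source network, destination port or None for any port).
# Rules are checked in order and the first match wins (an assumed policy).
RULES = [
    ("deny", ipaddress.ip_network("203.0.113.0/24"), None),
    ("allow", ipaddress.ip_network("0.0.0.0/0"), 443),
]


def decide(src_ip, dst_port, rules=RULES, default="deny"):
    """Return the action of the first rule matching the packet, else the default."""
    addr = ipaddress.ip_address(src_ip)
    for action, network, port in rules:
        if addr in network and (port is None or port == dst_port):
            return action
    return default


print(decide("203.0.113.7", 443))   # blocked source network
print(decide("198.51.100.5", 443))  # permitted port
print(decide("198.51.100.5", 23))   # no rule matches, default applies
```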

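2. Encrypted communication

Encrypted communication protects sensitive data by means of encryption.

It converts data into a form that unauthorized parties cannot read; only those holding the correct key can decrypt it.

This strategy protects data both in transit and at rest, preventing leakage and theft.

3. Strong passwords and authentication

Using strong passwords and authentication policies increases network security.

A strong password should contain uppercase letters, lowercase letters, digits, and special characters, and passwords should be changed regularly.

Authentication can use two-factor or multi-factor authentication to ensure that only authorized users can access network systems.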
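The password composition rule stated in item 3 can be checked with a few lines of code. This is a minimal sketch: the function name `is_strong_password` is hypothetical, "special characters" is taken to mean ASCII punctuation, and the minimum length of 12 is an added assumption that the article does not mention.

```python
import string


def is_strong_password(password, min_length=12):
    """Check for uppercase, lowercase, digit, and special characters.

    min_length is an assumption added for illustration; the text above only
    lists the four character classes.
    """
    return (
        len(password) >= min_length
        and any(c.isupper() for c in password)
        and any(c.islower() for c in password)
        and any(c.isdigit() for c in password)
        and any(c in string.punctuation for c in password)
    )


print(is_strong_password("Tr1cky-Passw0rd!"))
print(is_strong_password("alllowercase123"))
```

A check like this covers composition only; regular rotation and multi-factor authentication are separate controls.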

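Network Security References

Network security is a very important field in today's society. The following are several references on the subject:

1. Meyerovich, L. A., & Livshits, B. (2010). Cuts: Static enforcement of security policies for web applications. ACM SIGPLAN Notices, 45(6), 23-32. This paper introduces Cuts, a static analysis tool that can enforce security policies in web applications and thereby improve network security.

2. McFadden, T., & McGrath, M. (2013). Cybersecurity: Public sector threats and responses. Cambridge Journal of Regions, Economy and Society, 6(1), 33-48. This study examines the network security threats facing the public sector and the responses to them, including policy making and security training.

3. Liu, C. N., & Chiang, C. C. (2016). A survey of recent advances in intrusion detection systems. Computers & Electrical Engineering, 54, 266-282. This article surveys recent advances in intrusion detection systems, including the application of techniques based on network traffic, machine learning, and data mining.

4. Kirda, E., Kruegel, C., & Vigna, G. (2019). An introduction to malware analysis. IEEE Security & Privacy, 17(5), 32-37. This article introduces the basic concepts and methods of malware analysis, with particular attention to static and dynamic analysis techniques.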

A Survey of Distributed Intrusion Detection Approaches

arXiv:cs/0501001v1 [cs.CR] 1 Jan 2005

Michael Treaster
National Center for Supercomputing Applications (NCSA)
University of Illinois
Email: treaster@

Abstract

Distributed intrusion detection systems detect attacks on computer systems by analyzing data aggregated from distributed sources. The distributed nature of the data sources allows patterns in the data to be seen that might not be detectable if each of the sources were examined individually. This paper describes the various approaches that have been developed to share and analyze data in such systems, and discusses some issues that must be addressed before fully decentralized distributed intrusion detection systems can be made viable.

Introduction

Intrusion detection systems (IDS) have existed since the 1980s, ever since the rise of the Internet made it possible to attack computer systems from a remote terminal. Although the first such systems operated independently on each machine on which they were installed, eventually the idea was proposed of aggregating IDS data from multiple machines in order to look for patterns across a network. This can improve the system's ability to detect attacks that might otherwise be undetectable because no single host has enough evidence to draw any conclusions.

In general, distributed intrusion detection systems leverage some kind of single-node IDS software to monitor security events and collect data. Therefore, research typically focuses more on the sharing, aggregation, and processing of this data from a variety of nodes than on the exact nature of the monitoring itself. Existing approaches can be categorized along a variety of axes; here we examine data sharing, the nature of the data analysis, and security and trust features.

Data Sharing

In a distributed IDS, each agent shares its data with other agents in the system. However, a wide variety of sharing schemes have been developed. These schemes can be viewed as a continuum, with centralized data reporting on one side and completely decentralized sharing on the other.

The most extreme centralization is represented by systems in which a commercial vendor collects security information from a wide variety of customers, each running the vendor's agent software [4, 5]. The vendor typically has multiple machines handling the data collection and analysis load that this widespread deployment incurs. When the vendor detects a possible Internet-scale attack, customers receive alerts and advice from the professional security experts who manage the system. This approach has two primary shortcomings. First, the central management and processing of data represents a single point of failure or vulnerability. Second, it results in a scalability bottleneck, and, due to the volume of incoming data, these systems often respond slowly to new threats.

The most common distributed IDS approach is one in which all agents report data to a central server controlled at a domain or enterprise level [6, 8, 10, 12, 13]. This is fundamentally the same as the previous centralized approach, but on a different scale, and it possesses most of the advantages and disadvantages of these larger-scale systems. These systems are usually oriented towards enterprise security and are generally unsuitable for use among independent peers on the Internet due to the central control.

To address the scalability problem of a centralized system, many techniques use a hierarchical structure [7, 9, 15]. Data is passed up a hierarchy tree and is processed at each level to search for intrusions and to reduce the amount of information that must be passed higher up the tree.
This helps address scalability and allows a system to be deployed across large enterprise-scale networks, but it limits the kinds of intrusions that can be detected at the highest levels. This also helps address the single-point-of-failure problem, since if a higher node in the hierarchy fails, the lower tiers can typically continue to function, albeit with reduced detection capabilities.

Between the hierarchical approach and the fully distributed approach lie projects such as [11], which uses a hybrid hierarchical-distributed approach. Each agent publishes "interests" to the network, which are distributed through a hierarchical structure. Agents share data with other nodes that are interested, and all analysis occurs locally at the agent level.

Instances of completely distributed solutions are much rarer and much less well developed. Gossiping, multicast, or subscription-based data sharing techniques have been proposed [14], but none of these has yet been implemented in a distributed IDS. Other systems [17] ignore the topic entirely or pass it off to the underlying peer-to-peer substrate. Although these examples are still under development, they represent solutions that can be deployed on the Internet at large, independent of any central authority.

Nature of Data Analysis

Although distributed IDSs are usually independent of the techniques used to detect individual security events, the ways in which these security events are used can vary greatly. Since most systems work in heterogeneous environments, and since the security relationship between, say, a port scan and a buffer overflow attack may not be obvious, how does a system turn event detection into a response?

Expert systems are a common approach [8, 13], relying on rule sets to process and respond to events. These rules can attempt to define security policies, normal behavior, and/or anomalous behavior, and alerts or actions are generated based on how events match against the rules. [8] attempts to map actions back to a particular human user, such that events can be correlated with the intentions of an individual.

Many systems [2, 9] use a threshold scheme. Each security event increases a global alert level. The amount of the increase can be based on any number of factors, such as the particular event that was observed and its relation to other events in time or space. When the alert level exceeds a certain threshold, generic increased security measures are deployed, or an administrator is alerted. Long periods of time without security events can cause the alert level to decrease.
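The threshold scheme described above can be sketched in a few lines. This is an illustrative toy rather than the mechanism of any cited system: the class name `AlertLevel`, the per-event weights, the linear decay per quiet time step, and the threshold value are all assumptions.

```python
class AlertLevel:
    """Global alert level that rises with events and decays during quiet periods."""

    def __init__(self, threshold, decay_per_tick=1.0):
        self.level = 0.0
        self.threshold = threshold
        self.decay_per_tick = decay_per_tick

    def report(self, weight):
        """Record one security event; return True if the threshold is now exceeded."""
        self.level += weight
        return self.level > self.threshold

    def tick(self):
        """Call once per time step without events; the level decays toward zero."""
        self.level = max(0.0, self.level - self.decay_per_tick)


# Hypothetical weights: more serious events raise the level more.
WEIGHTS = {"failed_login": 1.0, "port_scan": 2.0, "buffer_overflow": 5.0}

monitor = AlertLevel(threshold=6.0)
for event in ["port_scan", "failed_login", "buffer_overflow"]:
    if monitor.report(WEIGHTS[event]):
        print("alert after", event, "level", monitor.level)
```

In a real deployment the weight could also depend on how an event relates to others in time or space, as the text notes; the fixed table here is the simplest possible choice.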
Augmented goal trees can be used to model intrusion possibilities [12]. As more states of the goal tree are fulfilled (based on data from the distributed agents), the system is able to anticipate and counter future stages of the intrusion. An alternative graph-based approach, in which connections between machines are logged and assembled into a graph of network activity, has also been studied [9]. These graphs are then analyzed by an expert system to detect possible intrusions.

Security and Trust

Security and trust are crucial aspects of any distributed system. In most proposed distributed IDSs, however, these issues are given a much lower priority than other design considerations. They address the possibility of a rogue agent or a denial of service attack on the system only in passing. In all cases, a complete solution for trust and security is not provided, but sometimes a concrete solution to a limited aspect of the problem is presented.

One issue is that of message authentication, which allows agents to ensure that messages come from whom they claim to come from. Several systems [14, 17] use signed messages, relying on a central certificate authority to generate the credentials. This authority does not necessarily participate in the rest of the distributed IDS. Although this approach validates the source of a message and ensures that the contents have not been tampered with, it cannot protect against a legitimate agent sending malicious data.

Smaller-scale, centrally controlled systems such as [6] can rely upon a login mechanism, such as Kerberos. Agents only acknowledge logged-in systems, providing a measure of trust in the validated agents. This solution is only appropriate for systems with a central login authority, however. Like the signed message approach, it is unable to protect against a legitimate agent sending malicious data.

The issue of trust can be left to individual agents in the system [15]. Each agent decides whether or not to trust higher-level agents ("monitors") in the system hierarchy. The agent then subscribes to exchange information with those monitors it chooses to trust. By aggregating and forwarding the data they receive from the lower-level agents, the monitors are able to distribute data throughout the network. It is not clear how a monitor protects against subscription by rogue agents which then feed it misinformation.

Denial of service attacks on agents can be detected using heartbeat signals [2]. Each agent periodically sends a message to inform the rest of the system that it is functioning properly. If other agents do not receive the heartbeat message on schedule, a denial of service attack is suspected and treated as another security event on the network.

Beyond these initial approaches to security and trust, there has been little work in this area with regard to distributed intrusion detection systems, especially in systems with a centralized control component. Most distributed IDS approaches ignore this topic entirely, but some list it among future work. Several projects [14, 17] suggest the possibility of using a "web of trust" among peers, but this approach has not yet been explored.

Future Directions

We believe tolerance of misinformation is a key area in which to focus, due to the lack of attention that it has been given in previous work. In existing systems, a rogue agent might easily corrupt the network by spreading incorrect data. Systems must protect themselves against this type of attack. Centrally managed systems can rely on having complete control over every agent in the network to protect themselves. However, agents in a centrally managed system might be subverted, and fully decentralized systems cannot rely on this at all.

One approach that has been suggested is to build a web of trust between agents in the network. As an agent reports information that is verified by others, its reputation increases and it is trusted more in the future. However, the system must protect against an agent adopting malicious behavior after building up a high level of trust. This approach is closely related to several trust-oriented research endeavors [1, 3, 16]. However, the details of such a protocol have not yet been carefully specified.

References

[1] A. Abdul-Rahman and S. Hailes. A distributed trust model. In Proceedings of the 1997 Workshop on New Security Paradigms, pages 48–60. ACM Press, 1997.
[2] J. Barrus and N. C. Rowe. A distributed autonomous-agent network-intrusion detection and response system. In Command and Control Research and Technology Symposium, pages 577–586, Monterey, CA, June 1998.
[3] R. Chen and W. Yeager. Poblano: A distributed trust model for peer-to-peer networks. Technical report, Sun Microsystems, Santa Clara, CA, February 2003.
[4] Deepsight website. /.
[5] Dshield website. /.
[6] D. F. et al. A framework for cooperative intrusion detection. In 21st National Information Systems Security Conference, pages 361–373, October 1998.
[7] J. S. B. et al. An architecture for intrusion detection using autonomous agents. In ACSAC, pages 13–24, 1998.
[8] S. R. S. et al. DIDS (distributed intrusion detection system) - motivation, architecture, and an early prototype. In Proceedings of the 14th National Computer Security Conference, pages 167–176, Washington, DC, 1991.
[9] S. S.-C. et al. GrIDS – A graph-based intrusion detection system for large networks. In Proceedings of the 19th National Information Systems Security Conference, 1996.
[10] V. C. et al. A distributed intrusion detection prototype using security agents. In 11th Workshop of the HPOVUA, June 2004.
[11] R. Gopalakrishna and E. H. Spafford. A framework for distributed intrusion detection using interest driven cooperating agents, 2001.
[12] M.-Y. Huang, R. J. Jasper, and T. M. Wicks. A large scale distributed intrusion detection framework based on attack strategy analysis. Computer Networks (Amsterdam, Netherlands: 1999), 31(23–24):2465–2475, 1999.
[13] K. A. Jackson, D. H. DuBois, and C. A. Stallings. An expert system application for network intrusion detection. In 14th National Computer Security Conference, Washington, DC, October 1991.
[14] R. Janakiraman, M. Waldvogel, and Q. Zhang. Indra: A peer-to-peer approach to network intrusion detection and prevention. In Proceedings of IEEE WETICE 2003, Linz, Austria, June 2003.
[15] P. A. Porras and P. G. Neumann. EMERALD: Event monitoring enabling responses to anomalous live disturbances. In Proc. 20th NIST-NCSC National Information Systems Security Conference, pages 353–365, 1997.
[16] B. Sniffen. Trust economies in the free haven project, 2000.
[17] V. Vlachos, S. Androutsellis-Theotokis, and D. Spinellis. Security applications of peer-to-peer works, 45(2):195–205, 2004.

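An Overview of Intrusion Detection

I. What Is Intrusion Detection

1. The concept of intrusion detection

A well-known saying in the security field is: "Prevention is ideal, but detection is a must."

An intrusion is any set of actions that attempts to compromise the integrity, confidentiality, or availability of a resource.

As long as an internal network is allowed to connect to the Internet, the danger of intrusion by attackers exists.

New vulnerabilities are discovered every week, while there are few ways to protect a network against attackers.

Intrusion detection is needed to identify illegitimate users who use a computer system without authorization, as well as users who have access rights but abuse their privileges.

Intrusion detection is the discovery of intrusive behavior; it is a technique that attempts to detect intrusions by observing behavior, security logs, or audit data.

There are three main ways an intruder gets into a system:

Physical intrusion: the intruder gains physical access to a machine in order to do damage, for example slipping into the machine room unnoticed and trying to break into the operating system, or using pliers and a screwdriver to remove the hard disk and install it in another machine for study.

System intrusion: the intruder does damage while holding a low-privilege account on the system.

Typically, if the system has not been patched in time, a user with low privileges may exploit system vulnerabilities to obtain higher administrative privileges.

Remote intrusion: the intruder penetrates a system through the network.

In this case the intruder usually has no special privileges; they must find targets through techniques such as vulnerability scanning or port scanning, and then use related techniques to do damage.

Intrusion detection covers: attempted break-ins, successful break-ins, masquerading as other users, violations of security policy, leakage by legitimate users, monopolization of resources, and malicious use.

The combination of software and hardware that performs intrusion detection is an intrusion detection system (IDS).

It collects and analyzes information from key points in a computer network or computer system to determine whether there are behaviors that violate security policy or signs of attack, and it responds to them.

Some responses are automatic; they include notifying the network security administrator (via console or e-mail), terminating the intruding process, shutting down the system, disconnecting from the Internet, disabling the user, or executing a prepared command.

Intrusion detection is one of the core technologies of dynamic security.

Traditional operating system hardening and firewall isolation are static security defenses that lack an active response to the rapidly changing attack methods in networked environments.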


A Survey of Intrusion Detection Systems

DOUGLAS J. BROWN, BILL SUCKOW, and TIANQIU WANG
Department of Computer Science, University of California, San Diego
San Diego, CA 92093, USA

1 Introduction

There should be no question that one of the most pervasive technology trends in modern computing is an increasing reliance on network connectivity and inter-host communication. Along with the tremendous opportunities for information and resource sharing that this entails comes a heightened need for information security, as computing resources are both more vulnerable and more heavily depended upon than before.

One subset of information security research that has been the subject of much attention in recent years is that of intrusion detection systems. The National Institute of Standards and Technology classifies intrusion detection as "the process of monitoring the events occurring in a computer system or network and analyzing them for signs of intrusions, defined as attempts to compromise the confidentiality, integrity, availability, or to bypass the security mechanisms of a computer or network." [1] This definition captures the essence of intrusion detection but fails to address the methods by which Intrusion Detection Systems (IDS's) automate this process. The concepts of false positive and false negative are essential to this classification process. False positives are those sequences of innocuous events that an IDS erroneously classifies as intrusive, while false negatives are intrusion attempts that an IDS fails to report: the reduction of both false positives and false negatives is a critical objective in intrusion detection.

Modern IDS's are extremely diverse in the techniques they employ to gather and analyze data. Most rely, however, on a common architecture for their structure: a detection module gathers data that may contain evidence of intrusions, an analysis engine processes this data to identify intrusive activity, and a response component reports intrusions. As the response mechanisms tend to be dictated by site-specific policy rather than science, this paper will not discuss this feature of IDS's in much detail. Section 2 discusses the primary sources for intrusion detection while Section 3 examines current approaches to the problem of intrusion data analysis. Section 4 describes a number of issues that represent the current state of research in intrusion detection; Section 5 is devoted to addressing the state of the art in evaluating these systems. Opportunities for future research are presented in Section 6 and concluding remarks are located in Section 7.

2 Information Sources

Virtually all modern intrusion detection systems monitor either host computers or network links to capture intrusion-relevant data. Each of these data sources offers a unique set of opportunities as well as challenges for an IDS.

2.1 Host Intrusion Detection

Host intrusion detection refers to the class of intrusion detection systems that reside on and monitor an individual host machine. There are a number of system characteristics that a host intrusion detection system (HIDS) can make use of in collecting data, including:

File System - changes to a host's file system can be indicative of the activities that are conducted on that host. In particular, changes to sensitive or seldom-modified portions of the file system and irregular patterns of file system access can provide clues in discovering attacks.

Network Events - an IDS can intercept all network communications after they have been processed by the network stack and before they are passed on to user-level processes. This approach has the advantage of viewing the data exactly as it will be seen by the end process, but it is important to note that it is useless in detecting attacks that are launched by a user with terminal access or attacks on the network stack itself.

System Calls - with some modification of the host's kernel, an IDS can be positioned in such a way as to observe all of the system calls that are made. This can provide the IDS with very rich data indicating the behavior of a program.

A critical decision in any HIDS is therefore choosing the appropriate system characteristics to monitor. This decision involves a number of tradeoffs, including the content of the data that is monitored, the volume of data that is captured, and the extent to which the IDS may modify the operating system of the host machine.

2.2 Network Intrusion Detection

A network intrusion detection system (NIDS) monitors the packets that traverse a given network link. Such a system operates by placing the network interface into promiscuous mode, affording it the advantage of being able to monitor an entire network while not divulging its existence to potential attackers. Because the packets that a NIDS is monitoring are not actually addressed to the host the NIDS resides on, the system is also impervious to an entire class of attacks, such as the "ping-of-death" attack, that can disable a host without ever triggering a HIDS. A NIDS is obviously of little value in detecting attacks that are launched on a host through an interface other than the network.

Network data has a variety of characteristics that are available for a NIDS to monitor: most operate by examining the IP and transport layer headers of individual packets, the content of these packets, or some combination thereof. Regardless of which characteristics a system chooses to monitor, however, the positioning of a NIDS fundamentally presents a number of challenges to its correct operation.

On a heterogeneous network, a NIDS generally does not possess intimate knowledge of all of the hosts on the network and is incapable of determining how a host may interpret packets with ambiguous characteristics. Without explicit knowledge of a host system's protocol implementation, a NIDS cannot determine how a sequence of packets will affect that host if different implementations interpret the same sequence of packets in different ways. [4] A savvy attacker can exploit this property by sending packets that are designed to confuse a NIDS. Such attacks are referred to as insertion and evasion attacks: an insertion attack adds information to a packet stream that a NIDS will see and the target host will ignore, while an evasion attack forges data in such a way that a NIDS cannot completely analyze a packet stream.

Protocol ambiguities can also present a problem to a NIDS in the form of crud. Crud appears in a network stream from a variety of sources, including erroneous network implementations, faulty network links, and network pathologies that have no connection to intrusion attempts. [10] If a NIDS performs insufficient analysis on a stream containing crud, it can generate false positives by incorrectly identifying this crud as being intrusive.

While a NIDS is therefore in a very convenient position, with complete access to all packets traversing a network link, its perspicacity is challenged by ambiguities in network data and its limited view of host system implementations and network topology.
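The ambiguity that insertion and evasion attacks exploit can be illustrated with a toy reassembler. This is a deliberately simplified sketch, not real TCP or IP reassembly: segments are (offset, data) pairs, and the two policies ("first" keeps the earliest byte for an offset, "last" lets later data overwrite it) stand in for two hypothetical protocol implementations. The function name `reassemble` and the sample payload are invented for the example.

```python
def reassemble(segments, policy):
    """Rebuild a byte stream from (offset, data) segments.

    policy="first" keeps the earliest byte received for each offset;
    policy="last" lets later segments overwrite earlier ones.
    Offsets are assumed to cover the stream without gaps.
    """
    stream = {}
    for offset, data in segments:
        for i, byte in enumerate(data):
            pos = offset + i
            if policy == "last" or pos not in stream:
                stream[pos] = byte
    return bytes(stream[pos] for pos in sorted(stream))


# The second segment overlaps one byte of the first.
segments = [(0, b"ATTXCK"), (3, b"A")]

nids_view = reassemble(segments, "first")  # what a first-wins monitor reconstructs
host_view = reassemble(segments, "last")   # what a last-wins host receives
print(nids_view, host_view)
```

If the monitor and the host resolve the overlap differently, a signature that matches the host's view never appears in the monitor's view, which is the kind of mismatch the section describes.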
3 Analysis Techniques

Once intrusion detection data has been gleaned, an IDS uses its analysis engine to process this data in order to identify intrusions. Modern systems primarily employ two techniques to perform this analysis: misuse detection and anomaly detection.

3.1 Misuse Detection

The essence of misuse detection centers around using an expert system to identify intrusions based on a predetermined knowledge base. As a result, misuse systems are capable of attaining high levels of accuracy in identifying even very subtle intrusions that are represented in their expert knowledge base; similarly, if this expert knowledge base is crafted carefully, misuse systems produce a minimal number of false positives. [1]

A less fortunate ramification of this architecture results from the fact that a misuse detection system is incapable of detecting intrusions that are not represented in its knowledge base. Subtle variations of known attacks may also evade analysis if a misuse system is not properly constructed. Therefore, the efficacy of the system relies heavily on the thorough and correct construction of this knowledge base, a task that traditionally requires human domain experts.

3.2 Anomaly Detection

Anomaly detection is concerned with identifying events that appear to be anomalous with respect to normal system behavior. A wide variety of techniques including statistical modeling, neural networks, and hidden Markov models have been explored as different ways to approach the anomaly detection problem. Each of these anomaly-based approaches fundamentally relies upon the same principles: anomalous activity is indicative of an attempted attack and the correct set of characteristics can sufficiently differentiate anomalies from normal system usage. Developing an anomaly detection system therefore involves first establishing a baseline model that represents normal system behavior and against which anomalous events can be distinguished.
The system then analyzes an event by consid-ering it within this model and classifying it as anomalous or normal based on whether it falls within a certain threshold of the range of normal behavior.The most appealing feature of anomaly de-tection systems is their ability to identify new and previously unseen attacks.Because the pro-cess of establishing a baseline model of normal behavior is usually automated,anomaly systems also do not require expert knowledge of computer attacks.This approach is not without its handi-caps,however,as anomaly detection may fail to detect even attacks that are very well-known and understood if these attacks do not diﬀer signif-icantly from what the system establishes to be normal behavior.Anomaly based systems are also prone to higher numbers of false positives,as all anomalous events are assumed to be intrusive although in reality a variety of other factors can produce behavior that appears anomalous(e.g., implementation errors).14Intrusion Detection Issues Individual systems take diﬀering approaches to the problem of intrusion detection.There exist, however,a number of common issues that plague the range of detection strategies.This section examines a number of these issues and some of the ways in which researchers have attempted to ameliorate them.4.1Deriving an Expert Rule Set As previously mentioned,one drawback of misuse detection systems is their reliance on an expert rule set that traditionally must be constructed by a human domain expert.This rule set is therefore expensive to produce and susceptible to human error.Snort,a real-time NIDS,addresses this co-nundrum by minimizing the eﬀort required to de-velop new attack rules.In Snort,each attack rule is a single line of text that speciﬁes exactly which characteristics of a packet are to be examined and what values these characteristics must equal in order to trigger the rule.11This approach to the problem is helpful,but is limited in its ability to detect variations of 
codiﬁed attacks and does not resolve the issue of requiring a human expert to devise a knowledge base for the IDS.A technique that allows for the automated construction of attack rules would therefore be tremendously valuable.The architecture for an artiﬁcial immune system(ARTIS)that Hofmeyr and Forrest have proposed takes a bold step in this direction.Based on a model of the human immune system,ARTIS is concerned with devel-oping a set of lymphocytes(analogous to attack rules)that can detect pathogens(intrusive ac-tivity)in a system.In an attempt to closely model the way this is performed in the human immune system,ARTIS continuously generates lymphocytes with random characteristic values and deploys these to detectors that are replicated throughout the system.Over time,lymphocytes randomly die oﬀand are replaced;when a par-ticular lymphocyte correctly binds to a particu-lar event sequence in the system,it is reinforced and its probability of being replaced is reduced.5 In this fashion,those lymphocytes that are most eﬀective in detecting misuse become more preva-lent in the system.This design is advantageous in that it elimi-nates much of the dependency on expert human knowledge in developing a misuse detection sys-tem.In reality,however,ARTIS does require hu-man intervention in distinguishing the lympho-cytes that bind correctly from those that do not in order to reduce the number of false positives that the system produces.Because lymphocytes are constructed at random,there is also a high probability that well-known attacks may not be detected at all if the characteristics needed to identify these attacks are never generated.AR-TIS achieves one of the beneﬁts of anomaly sys-tems in the ability to detect new attacks with-out relying on a human-encoded expert knowl-edge base.It does this,however,at the expense of some of the advantage of traditional misuse systems,namely it is less eﬀective at detecting known attacks and is more likely to produce false 
positives.

Another approach to the problem of deriving expert system rule bases uses anomaly detection to identify new attacks which could then be codified into rules that could be used in a misuse system. This differs from ARTIS primarily in the sense that new attack rules are determined based on perceived anomalies rather than from randomly generated data. One of the critical challenges of such a system would be the proper encoding of an attack into an appropriate attack rule with the proper characteristics. To date, there has been little exploration of this problem space.

4.2 Detecting Attack Variations

Detecting subtle variations of known attacks presents a sizeable challenge to many systems that rely upon misuse detection. Because of the way intrusion signatures must be codified into an expert knowledge base, it can be very difficult for misuse systems to identify attacks that may originate from more than one source, vary in the means by which they are conducted, or are protracted over long periods of time.

State transition analysis (STAT) is one technique that addresses this issue. In such a system, attacks are represented by a state transition diagram: the start state represents a pristine system, intermediate states represent changes to the system that occur during an attack, and the final state represents a system compromise. For all of the codified attacks that exist in an IDS’s rule base, the system retains internal data indicating which states of the attack have been reached. Because this data is global, it is independent of who causes a state change, the way in which a new state is reached, and the time scale on which state transitions occur. Through this means, countless variations of a single attack can still be detected because the system monitors system state changes that are symptomatic of an intrusion attempt rather than monitoring the actions that cause those state changes [6].

This advantage is not gained, however, without considerable expense. Because the occurrence of
state transitions in an attack may not be a simple linear progression, the system must maintain internal data for every intermediate state of an attack that has ever been reached. This can lead to an explosion in the amount of data that the system must maintain as the number of attacks and attack states monitored becomes large, a property that an attacker could exploit. Codifying attacks as state transition diagrams also complicates the process of developing the system’s expert knowledge base: if a critical attack state is omitted from the diagram, false positives result when state progressions resembling an attack occur; if a non-critical state is inserted into an attack’s diagram, false negatives occur when variations of the attack transpire that do not instantiate this non-critical state.

A similar approach to the problem of detecting attack variations makes use of Colored Petri Nets to monitor codified event sequences. Colored Petri Nets allow for attacks to be encoded as graphs in such a way that the progression of attack steps is more flexible than the rigid order imposed by state transition analysis. Guards are used to define the situations in which attack signatures are matched rather than maintaining data indicating all current and previous states [7]. This helps to achieve many of the benefits of using the STAT approach without making the system vulnerable to the state-consuming attacks.

Unfortunately, Colored Petri Nets do not present a viable option for practical intrusion detection because the process of state unification and matching in this model is prohibitively compute-expensive. The complexity of this matching is exponential in the number of states represented by the system, while partial order matching requires super-exponential time [7]. Although an IDS based on Colored Petri Nets would not be vulnerable to a state-consuming attack, it would be very vulnerable to a CPU-consuming attack. The efficacy of such a system would therefore be rapidly eroded in a production
environment in which an attacker could render the IDS useless by overwhelming the analysis engine.

4.3 Training Behavioral Models

Anomaly systems universally suffer from the problem of how to correctly construct a baseline model of behavior that is sufficient for complete and correct operation of the system. Any successful means of training the system must expose the system to the full range of normal behavior in order to minimize false positives as well as avoid exposing the system to properties of anomalous activity that may desensitize the IDS to attacks.

Depending on the design of the anomaly system, systems can be trained with just normal data or with two sets of data that are correctly identified as normal and intrusive. In presenting intrusive data that is going to be used to train the system, it is important that this data represent a range of anomalies so that the trained system is not incapable of identifying certain classes of attack.

Ghosh et al. have experimented with using randomly generated events to represent anomalous behavior in order to train the neural networks that provide the analysis for their IDS. In empirical tests, those networks that were trained with randomly generated anomaly data consistently outperformed those that did not receive this training by reducing the number of false negatives in the system (none of the networks produced any false positives) [3]. While these results are very promising, they suggest that a likely reason that the neural networks trained with random data performed well is that the normal data set with which they were trained defined a narrow range of behavior that very closely resembled the normal data that the system was tested against. Were this not the case, the system could reasonably have been expected to generate at least a small number of false positives as some of the randomly generated events trained as anomalies would fall into the range of normal activity. It is further unclear how well a system trained over
random anomalous data would perform in correctly identifying actual attacks, as the attack data against which this system was tested also consisted of randomly generated events.

A more ideal solution to the training problem would be one that allows for an anomaly system to be correctly trained over noisy data, i.e., data that contains an assortment of both normal and anomalous behavior. This would allow for the system to be effectively trained in a production environment without relying on hypothetical data sets representing normal and anomalous behavior.

Eleazar Eskin has developed a process that uses learned probability distributions to train an anomaly system over noisy data. This technique uses machine learning to create a probability distribution of the training data and then applies a statistical test to identify anomalies [2]. Interestingly, Eskin’s technique requires no domain-specific knowledge. It does, however, operate on three assumptions about the training data: normal data can be effectively modeled using a probability distribution; anomalous events differ significantly enough from normal events that they can be identified; and the number of anomalous events is small compared to the number of normal events. Furthermore, before the system is trained, one must define a value λ indicating the percentage of the training data that is expected to be anomalous. Because this data is noisy and not artificially constructed, choosing the value of λ to best ensure correct operation of the system is very difficult. This model is additionally limited by its assumption that normal data is distributed normally across noisy data: in actuality, it is likely that intrusion attempts are temporally clustered.

Regardless of the means by which one is trained, there is also the issue of evolving normal behavior with which anomaly systems must contend. One option is that a system be retrained periodically with new training data that represents current normal behavior. This, however, increases the
dependency of the system on training and underscores the inherent difficulties in developing sufficient data or an appropriate technique for this task.

Adaptive anomaly systems have been suggested as a solution that would allow for systems to evolve their normal behavior models gradually as normal behavior changes. Lane and Brodley’s machine learning system is an example of a system that makes use of this technique. This system maintains a finite-sized dictionary of normal event sequences and uses a least-recently-used (LRU) policy to replace seldom-occurring sequences with new ones that are determined to be normal [8]. It is important to note that systems that use adaptive training techniques face the problem of preventing an attacker from gradually training the system over time to accept a range of anomalous behavior as normal. Resolving this difficulty remains an open challenge.

4.4 Attack Against the IDS

While the purpose of an intrusion detection system is to detect attacks against a host or set of hosts, an ironic consequence of its existence is that the IDS itself may draw attack from an attacker seeking to disable the IDS. It is critical that the design of a system be performed within the framework that the IDS itself be resistant to and tolerant of attack attempts designed to obstruct its ability to correctly detect intrusions.

One class of such attacks, referred to as “crash attacks,” attempts to disable an IDS by causing it to fault or to run out of some critical resource. Assuming that it is infeasible to totally prevent these attacks, the goal of an IDS in the face of such an attack is therefore to minimize the extent to which the attacker is successful in disabling the IDS.

Bro, a real-time system for detecting network intrusions, provides two mechanisms for maximizing operation in the face of crash attacks.
First, Bro maintains a “watchdog” timer that expires on a configurable interval and checks to see if the system is still analyzing the same event that it was when the previous timer expired. If this is the case, the system assumes that it is in a processing jam and terminates the monitor process so that the system can continue with the next event. Second, Bro is launched from a script that can recognize if the system ever crashes, in which case it launches tcpdump in order to gather the data that Bro would be gathering had it not crashed. This data can then be analyzed at a later time or by another system [10].

While these facilities are a certain improvement over a system that has no means of providing failure recovery, they are certainly not without weakness. Provided that an attacker has a means of engaging an IDS in a processing jam, it would be a relatively easy matter to simply inject a series of such events into the data stream in order to keep the system occupied while an attack on a monitored host is launched. Although the watchdog timer ensures that Bro never becomes permanently disabled by a single series of events, it cannot prevent a determined attacker from creating a quantum of time during which an attack can proceed undetected. Bro’s second provision suffers similarly: although the script does ensure that any intrusion that follows Bro’s crash will be recorded, it provides no means of analyzing this record. Additionally, because tcpdump is launched immediately after the IDS crashes, it cannot determine the sequence of events that caused the crash to occur.

Another class of attacks seeks simply to inject a large quantity of spurious data into a monitor’s event stream in order to distract an IDS while an attack on a monitored host takes place. Such attacks can be particularly lethal when launched against a NIDS that is already under the onus of monitoring and performing analysis on data for a large number of hosts. In the face of attacks like these, Vern Paxson
suggests that such systems perform “triage” against incoming flows: if the system detects that it is nearing exhaustion, it can shed load by discarding state for monitored flows that do not appear to be making progress. This suggestion operates under the assumption that an attacker is less likely to have complicity from hosts on both sides of the monitor, therefore making it difficult for the attacker to fake a large number of active connections [4].

The idea of triage is a precarious one. On the one hand, the load that is shed during triage can allow for an IDS to operate continuously, helping to maximize the coverage of the system and prevent an attacker from denying service to the system by overwhelming it with illegitimate data. On the other hand, when a system enters a mode of triage, it is in essence performing denial of service on itself. The efficacy of a triage mechanism therefore hinges on the system’s ability to properly determine which data it can safely ignore and which data it cannot: if the system were able to make this distinction perfectly, however, there would never be any need to examine the irrelevant data. As with the survival facilities that Bro employs, the adoption of triage involves a trade-off between correctness and performance, but is certainly preferable to the absence of such mechanisms which may sacrifice both.

5 Results

This paper has presented a veritable cornucopia of intrusion detection systems and discussed the relative merits of each, but has not addressed the issue of quantifiable results. The matter of performance metrics is another extremely challenging issue in intrusion detection and one that does not lend itself well to simple empirical evaluation. In evaluating intrusion detection systems, the three most important qualities that need to be measured are completeness, correctness, and performance [4]. The current state of the art in intrusion detection restricts measurement of new systems to tests over incomplete data sets and micro-benchmarks that can
test a narrowly defined component of the system.

Presently, a number of anomaly-based systems are tested over contrived data sets in order to determine how well the system classifies anomalies. This evaluation is limited by the quality of the data set that the system is measured against: constructing data sets that are both realistic and comprehensive is an extremely hard and open problem. Examples of micro-benchmarks include stress tests to expose the maximum rate of events that an IDS can withstand before it begins to experience exhaustion or running the system in a production environment to determine the speed with which it can be expected to perform under typical load. Tests such as these can give a good indication as to the computational boundaries of an IDS but are very limited in the degree to which they can quantify completeness and correctness.

A number of ideas for the establishment of security metrics have been proposed. For instance, “pretty good assurance” seeks to provide a process by which claims about the security properties of systems can be clearly stated and accompanied by evidence that substantiates these claims. As formal proof of correctness in the intrusion detection domain is exceptionally challenging and expensive, pretty good assurance presents a way in which systems can be measured that allows fuzzy decisions, trade-offs, and priorities as long as these properties are accompanied by appropriate assurance arguments [12].

Bennet Yee has suggested another metric in which the strength of a system is measured by the work factor required for an attacker to penetrate the system’s defenses. Such a measure must take into consideration the amount of work required to discover a vulnerability, engineer a means to exploit this weakness, and execute an attack on the system [13]. Although such a measurement inherently involves a good deal of approximation and guesswork, the concept of work factor yields great promise in providing an acceptable benchmark against which
intrusion detection systems could be compared.

6 Future Research

The study of intrusion detection systems is quite young relative to many other areas of systems research and it stands to reason that this topic offers a number of opportunities for future exploration. In discussing some of the problematic issues that confront IDS designers, this paper has touched upon a number of open questions related to intrusion detection including:

• Can anomaly detection systems be used to generate attack rules for misuse detection systems?
• In what ways can variations of known attacks be detected by a misuse system without exposing the IDS to resource-consuming attacks?
• Can an anomaly system that adaptively modifies its model of normal behavior over time be protected from being trained by attackers to accept intrusions as normal behavior?
• Is it possible for triage mechanisms to provide an IDS with the ability to shed load without diminishing its efficacy or its coverage?
• How can the completeness, correctness, and performance of intrusion detection systems be measured in order to facilitate relative comparison and absolute evaluation of these systems?

In addition to these issues, there are a number of unresolved issues regarding the scope of analysis that an IDS performs and the interoperability of intrusion detection systems. Most intrusion detection efforts today focus on providing analysis for a relatively localized target: either a single host or a collection of hosts joined by a network. A system that operates with a more global scope may be capable of detecting distributed attacks or those that affect an entire enclave. Development of such a system would be a valuable contribution to the study of intrusion detection.

There have recently been a number of efforts including the Common Intrusion Detection Framework (CIDF) and the IETF standardization effort motivated towards providing interoperability among intrusion detection systems. Such frameworks can provide a means by which differing analysis and
data collection techniques can be aggregated within a single system, improving both the coverage and redundancy of the system. An increasing number of intrusion detection systems such as EMERALD are beginning to make use of this idea, although it will likely be some time before a standard framework finds its way into widespread use [9]. More research towards synthesizing the commonalities of intrusion detection systems and the most efficient format for inter-IDS communication is still needed.

7 Conclusions

Since the study of intrusion detection began to gain momentum in the security community roughly ten years ago, a number of diverse ideas have emerged for confronting this problem. Intrusion detection systems vary in the sources they use to obtain data and in the specific techniques they employ to analyze this data. Most systems today classify data either by misuse detection or