De novo transcriptome of the desert beetle Microdera punctipennis

De novo transcriptome of the desert beetle Microdera punctipennis (Coleoptera: Tenebrionidae) using illumina RNA-seq technology

Xueying Lu · Jieqiong Li · Jianhuan Yang · Xiaoning Liu · Ji Ma

Received: 12 September 2013 / Accepted: 15 July 2014 / Published online: 21 August 2014
© The Author(s) 2014. This article is published with open access.

Abstract

Insects in Tenebrionidae have unique stress adaptations that allow them to survive temperature extremes. We report here a gene expression profiling of Microdera punctipennis, a beetle in desert region, to gain a global view of its environmental adaptations. A total of 48,158,004 reads were obtained by transcriptome sequencing, and the de novo assembly yielded 56,348 unigenes with an average length of 666 bp. Based on similarity searches with a cut-off E-value of 10⁻⁵ against two protein sequence databases, 41,109 of the unigenes (about 72.96%) were matched to known proteins. An in-depth analysis of the data revealed a large number of genes were associated with environmental stress, including genes that encode heat shock proteins, antifreeze proteins, and enzymes such as chitinase, trehalose, and trehalose-6-phosphate synthase. This study generated a substantial number of M. punctipennis transcript sequences that can be used to discover novel genes associated with stress adaptation. These sequences are a valuable resource for future studies of the desert beetle and other insects in Tenebrionidae. Transcriptome analysis based on Illumina paired-end sequencing is a powerful approach for gene discovery and molecular marker development for non-model species.

Keywords: Microdera punctipennis · Transcriptome · Illumina sequencing · Heat shock protein · Antifreeze protein

Introduction

Deserts are among the most hostile habitats on earth. In summer, it is extremely hot during daytime, while in the depth of the winter's night, it is surprisingly cold, as well as extremely dry in some seasons. Under these extreme conditions, small arthropods and particularly Tenebrionidae beetles
are conspicuous components of the fauna. To achieve this impressive resistance to extreme stress, these small animals possess several behavioral, morphological and physiological adaptations [1–3], such as burying themselves deeply in the substrate to avoid high temperatures and extreme dryness during the day [1, 4], and taking up fog-water as a water source [5–8]. Most desert tenebrionids adopt seasonal behavioral changes to avoid hostile conditions [9, 10]. The subelytral cavity, an airtight space formed by the fusion of the elytra [11, 12], is found especially in desert Tenebrionidae and helps to lower cuticular water permeability in desert beetles [13, 14].

X. Lu · J. Li · J. Yang · X. Liu · J. Ma (corresponding author)
Xinjiang Key Laboratory of Biological Resources and Genetic Engineering, College of Life Science and Technology, Xinjiang University, 14 Shengli Road, Urumqi 830046, China
e-mail: majibrge@

X. Lu
Key Laboratory of Chemistry of Plant Resources in Arid Regions, Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China

J. Yang
Department of Pediatrics, Texas Children's Center, Dan L. Duncan Cancer Center, Baylor College of Medicine, Houston, TX, USA

Mol Biol Rep (2014) 41:7293–7303
DOI 10.1007/s11033-014-3615-6

Desert insects have the capacity of making significant and rapid adjustments to even slight changes in environmental temperature in their physiological state, characterized by cellular desiccation, build-up of metabolic wastes and depressed metabolic activity [15]. However, even in the frozen state some complex physiological processes continue, including cryoprotectant synthesis [16, 17] and diapause development [18]. Understanding of the roles of various proteins in insects has advanced substantially in the past 20 years. The development of powerful molecular tools and the increasing ease of their application have facilitated the identification and structural characterization of novel proteins, and progress is being made on determining their function in promoting winter survival
in insects. Heat shock proteins (HSPs), also known as stress proteins, play a critical role in protecting organisms from injury due to high or low temperature [19], anoxia, desiccation [20] and a range of chemical stresses [21]. In addition, it is well known that antifreeze proteins (AFPs) play important roles in protecting poikilothermic organisms from freezing by promoting supercooling and inhibiting ice formation [22]. Moreover, AFP genes have been found to be expressed in summer beetles in desert regions [23, 24] and to be induced by high temperature [25]. These results suggest that AFPs may play a role in the adaptation of desert insects to their environment.

One of these, Microdera punctipennis (Coleoptera: Tenebrionidae), is an endemic beetle in the Gurbantunggut Desert in Xinjiang [26], in the northwest of China. It is flightless and night-active, and its behavioral and morphological characteristics for desert living have been identified carefully [10]. The day-night and seasonal temperatures vary greatly in this region. This extreme variation in temperature might suggest that M. punctipennis has evolved a range of physiological and molecular adaptations for survival. Adults of M. punctipennis have supercooling points below -19.6 °C, and their capacity for supercooling has been shown to increase considerably as the water content of their body fluid decreases, but the underlying molecular basis remains unknown [23]. The study of desert beetles is important because it illustrates many of the solutions evolved by arthropods to the problems engendered, in an extreme form, by life in all terrestrial environments.

RNA-Seq is a recently developed large-scale genome-wide approach that has been applied successfully to gene discovery and expression profiling, and to the study of functional, comparative and evolutionary genomics in non-model organisms for which little previous genomic information existed. RNA-Seq has the advantages of being cost effective, highly sensitive, and accurate, with a large dynamic range [27]. In the past few
years, this technology has been used to investigate molecular mechanisms in insect species such as Micrarchus nov. sp. 2, Tomicus yunnanensis, and Cryptolaemus montrouzieri [28–30]. Here, we describe the use of Illumina/Solexa paired-end technology for de novo transcriptome analysis of M. punctipennis. We obtained transcriptome sequences and discovered most of the known HSP and AFP genes, as well as the genes involved in the pathways for trehalose and chitin biosynthesis. Here, for the first time, we report the genomic profile information of the arid beetle M. punctipennis. This study also provides an insight into the molecular pathways involved in stress adaptation in this species.

Experimental procedures

Insects

M. punctipennis beetles were collected from the southern edge of the Gurbantunggut Desert (N44°24′, E087°51′, 444 m), Xinjiang, China. The M. punctipennis adults were reared at 25 °C in the laboratory. Then, the samples were frozen in liquid nitrogen and stored at -80 °C until further use.

cDNA library generation and Illumina sequencing

Total RNA was extracted from three adult beetles using TRIzol Reagent (Sangon Biotech, China) according to the manufacturer's instructions. The extracted RNA was assessed for quality and quantified using an Agilent 2100 Bioanalyzer (Agilent Technologies, Mississauga, Canada), giving an RNA integrity number (RIN) of 8; the RIN is an algorithmically assigned integrity value for RNA measurements. For transcriptome analysis, the cDNA library was prepared using the TruSeq Sample Preparation Kit (Illumina, San Diego, CA, USA) following the manufacturer's recommendations. Briefly, mRNA was purified from 2 μg of total RNA using oligo(dT) magnetic beads. Divalent cations were used to fragment the purified mRNA into small pieces at 94 °C for 5 min; priming bias was thereby avoided when synthesizing the cDNA. The cleaved RNA fragments were used for double-stranded cDNA synthesis using a SuperScript Double-Stranded cDNA Synthesis kit (Invitrogen, Camarillo, CA, USA) with random hexamer
(N6) primers (Illumina). The synthesized cDNA was subjected to end repair and A-tailing before ligation of the adaptors. The end products were purified using a 2% TAE-agarose gel (Certified Low Range Ultra Agarose, Bio-Rad) and enriched by PCR to create the final cDNA library with sequences of approximately 300 bp. After detection using an Agilent 2100 Bioanalyzer, the cDNA library clusters were generated on a cBot instrument (Illumina, San Diego, CA, USA) and then sequenced in paired-end mode by Sangon Biotech (Shanghai) Co., Ltd., China using an Illumina HiSeq™ 2000 (Illumina, San Diego, CA, USA) according to the manufacturer's instructions.

Sequence statistic and de novo assembly

Prior to assembly, the raw reads were cleaned by removing adapter sequences through the standard Illumina pipeline including the CASAVA program (http://support.illumina.com/sequencing/sequencing_software/casava.ilmn). Low-quality reads (those with a quality value less than 20), reads containing N (N represents ambiguous bases in reads), and reads shorter than 35 bp were filtered out by a sliding window approach with a window size of 5 bp [31]. De novo assembly of the valid reads was performed using the November 2011 version of the Trinity program (http://tri /), which was designed specifically for transcriptome assembly from RNA-Seq data [32]. Briefly, Trinity combines reads of a certain length of overlap to form longer fragments and then processes them for sequence clusters with the sequence clustering software TGICL. The resultant sequences were defined as unigenes.

Bioinformatic analysis

The assembled unigenes were searched against the NCBI nr sequence database (ftp://), the Swiss-Prot database (/docs/swiss-prot_guideline.
html), Kyoto Encyclopedia of Genes and Genomes (KEGG, http://www.genome.jp/kegg/), Clusters of Orthologous Groups (COG) and eukaryotic orthologous groups (KOG) (ftp:///pub/COG/COG) with the BLASTX algorithm (accessed in Sept 2012). The E-value cut-off was set at 10⁻⁵. Genes were tentatively identified based on the best hits against known sequences. Blast2GO [33] was used to predict the functions of the sequences, to assign gene ontology (GO) terms (/), and to predict the metabolic pathways in COG and KEGG databases. Amino acid sequences were deduced by using ORF Finder (/gorf/gorf.html) and GENSCAN (/GENSCAN.html). The putative protein sequences were used for alignment by the ClustalX (v1.83) program [34]. The MEGA 5.0 software [35] was used to construct the consensus phylogenetic tree by using the neighbor-joining method based on the Poisson correction model. Bootstrap analysis of 1,000 replication trees was performed to evaluate the branch strength of each tree.

Results

Illumina high-throughput sequencing and de novo assembly

A total of 48,158,004 raw reads were obtained by HiSeq™ 2000 (Illumina) paired-end sequencing (Table 1). After a stringent filtering process, 39,654,340 valid reads of average length 95 bp were obtained. We used the Trinity software to perform a paired end-joining de novo assembly of the valid reads. After assembly, 56,348 unigenes with an average length of 666 bp and an N50 of 1,603 bp were obtained. Of the 56,348 unigenes, 11,568 unigenes (20.52%) were >1,000 bp long and 2,014 (3.57%) were >3,000 bp long.

Annotation and function assignment

To identify putative functions, the 56,348 unigenes were firstly aligned by BLASTX (E-value ≤ 10⁻⁵) to several protein databases: NCBI nr, UniProtKB/Swiss-Prot, UniProtKB/TrEMBL, CDD and Pfam. A total of 41,109 (72.96%) unigenes had at least one hit to one of the databases (Table 2) and quite a large proportion (about 30%) apparently has no significant match to any of the sequences in these databases, indicating that they may contain novel sequences and, perhaps, a high number of Coleoptera or
species-specific transcripts or transcript parts (e.g. orphan UTRs). This might be expected, because there is very little sequence information from species closely related to M. punctipennis in these databases. The species distribution of the best match for each sequence showed that 64.56% of the M. punctipennis sequences matched sequences from the coleopteran species Tribolium castaneum (Fig. 1), while a very low proportion (<1%) matched other insects; for example, only 0.27% (94 unigenes) matched Drosophila melanogaster (not shown separately in the figure). This indicates that M. punctipennis is evolutionarily close to T. castaneum.

Table 1 Summary statistics of the sequence assembly generated from M. punctipennis

                                                      Number
Number of raw reads                                   48,158,004
Number of valid reads (average length)                39,654,340 (95 bp)
Total unigenes (average length, N50)                  56,348 (666 bp, 1,603 bp)
Number of unigenes ≥1,000 bp, ≥3,000 bp, ≥5,000 bp    11,568; 2,014; 287
Length range                                          89 bp–10,230 bp

Table 2 Summary statistics of functional annotation for M. punctipennis unigenes in public protein databases

Protein database    Number of unigene hits    Percentage
NR                  35,034                    62.17
SWISS-PROT          25,343                    44.98
TREMBL              34,214                    60.72
CDD                 22,696                    40.28
PFAM                19,603                    34.79
Total               41,109                    72.96

Pathway annotation was carried out based on the GO, COG/KOG (KOG is the eukaryotic version of COG), and KEGG databases.

Assignment of GO terms

GO (/) is an international classification system for standardized gene functions, offering a controlled vocabulary and a strictly defined conceptualization for comprehensively describing the properties of genes and their products within any organism [36]. The three main, independent GO categories are biological processes, molecular functions, and cellular components. A total of 8,477 different GO terms were assigned to 27,823 predicted unigene-encoded peptides that previously had matches with known proteins in the UniProtKB database. The terms were from the three main GO categories and covered
52 sub-categories (functional groups) (Fig. 2). Within the biological process group, the majority of unigenes were related to metabolic process (18,678; 21.16%) and cellular process (18,195; 21.05%); within cellular component, the largest proportion were assigned to cells (19,422; 29.49%), cell part (19,422; 29.42%), and organelles (10,828; 16.44%); and within molecular function the majority were assigned as binding (20,084; 40.12%) and catalytic activity (17,818; 35.59%) including hydrolases, kinases, and transferases, allowing for the identification of genes that may be involved in secondary metabolite synthesis pathways.

COG/KOG classification

COG (/COG/) compares the protein sequences that are encoded in complete genomes and represents them in major phylogenetic lineages [37]. The COG construction protocol included an automatic procedure for detecting candidate sets of orthologs, manual splitting of multidomain proteins into the component domains, and subsequent manual curation and annotation [38]. Furthermore, it has been extended to complex, multicellular eukaryotes by constructing clusters of probable orthologs [39]. Altogether, 8,980 unigenes were clustered into 25 functional categories (Fig. 3a). Among them, the "general function prediction only" cluster was the largest (16.82%), followed by "function unknown" (11.82%). The other larger categories were: posttranslational modification, protein turnover, chaperones (7.01%); replication, recombination and repair (6.39%); amino acid transport and metabolism (6.10%); inorganic ion transport and metabolism (5.76%); and cell cycle control, cell division, chromosome partitioning (4.91%). An additional 649 unigenes (4.02%) belonged to the "carbohydrate transport and metabolism" group, among which 17 unigenes were annotated as chitinase. The COG classifications shed some light on specific responses and functions of genes that may be involved in regulating various molecular processes in M. punctipennis. The KOG classifications corresponded to 25 of the functional categories
already observed in the COG analysis (Fig. 3b).

Assignment of KEGG pathways

To identify the biological pathways that are active in M. punctipennis, we mapped the 56,348 annotated sequences to the reference canonical pathways in KEGG. A total of 9,986 unigenes were assigned to 283 known metabolic or signaling KEGG pathways. The top 10 KEGG pathways were spliceosome (290 unigenes), purine metabolism (269), protein processing in endoplasmic reticulum (261), Huntington's disease (239), lysosome (227), RNA transport (225), ubiquitin mediated proteolysis (221), pathways in cancer (218), endocytosis (208), and focal adhesion (204). These annotations will provide a valuable resource for investigating specific processes, functions and pathways in M. punctipennis.

Fig. 1 Species distribution of the BLAST hit for each unigene. Note that nearly 64.56% of top hits are to the beetle T. castaneum, whose complete genome has been sequenced. We used the first hit of each sequence for analysis

Several of the KEGG metabolite pathways were implicated in enhancing stress defense through their generation of specific metabolites. Among the 9,986 unigenes, 1,689 were mapped to 35 pathways that are related to metabolism (Fig. 4). For example, the "purine metabolism" (ID: ko00230) and "amino sugar and nucleotide sugar metabolism" (ID: ko00520) pathways were the largest groups, containing a total of 425 unigenes among them. A further 142 and 113 unigenes were assigned to the "glycerophospholipid metabolism" (ID: ko00564) and "aminoacyl-tRNA biosynthesis" (ID: ko00970) pathways, respectively.

Fig. 2 Pie charts showing gene ontology (GO) classification (level 2). GO analysis of Mp sequences corresponding to 27,823 unigenes, as predicted for their involvement in biological processes (a), cellular component (b) and molecular function (c), is shown

Putative environment stress-related unigenes

Heat shock proteins

A total of 72 HSP-related unigenes were identified in the M. punctipennis transcriptome and 31 of them were longer than 500 bp (Table 3). The
majority of the HSP-related unigenes were predicted to encode the HSP70 type. The other HSP types among the HSP-related unigenes were HSP1, HSP9, HSP20.6, HSP90, HSP60, sHSP21, and HSP cognate 1. These results should be validated by gene cloning based on the fragments obtained here. The annotation results for seven of the unigene sequences, Comp9719_c0_seq1, Comp9464_c0_seq1, Comp7346_c0_seq1, Comp9464_c0_seq1, Comp7346_c0_seq1, Comp113296_c0_seq1 (105 bp), and Comp9719_c0_seq6 (355 bp), are consistent with the previously cloned M. punctipennis sequences in the GenBank database. The annotation results for the Comp9719_c0_seq3, Comp9719_c0_seq4, Comp9719_c0_seq5, and Comp64045_c0_seq1 (124 bp) sequences are consistent with the previously cloned sequences of Anatolica polita borealis.

Antifreeze proteins

Previous studies have shown that insect AFPs play important roles in cold tolerance, and there are numerous reports that AFPs are specifically induced in insects exposed to low temperatures, where they have been shown to improve freezing tolerance [40]. The M. punctipennis antifreeze protein (MpAFP) is Cys-, Thr-, and Ser-rich, and ExPASy prediction software indicates that its secondary structure is composed of tandem 12-residue repeats (TCTxSxxCxxAx) with extensive disulfide bonds [41,42]. Three unigenes in our assembly were identified as putatively encoding MpAFP; two of them (Comp9408_c0_seq1 and Comp9408_c0_seq2) have complete ORFs. Alignment of the predicted proteins deduced from the two potentially complete unigenes showed that their identity was 78.19% (Fig. 5), consistent with the strong conservation within the AFP family. The relationships among the AFP sequences of M. punctipennis showed that Comp9408_c0_seq1 was close to MpAFPS52, MpAFPS77 and AFP1 (Fig. 6). This result could provide the basis for further studies on the function of these genes.

Other candidates

In addition to the unigenes that have been analysed in detail above, other M. punctipennis unigenes with high
sequence similarity to important genes related to stress metabolism and targets were identified. In particular, a number of unigenes were annotated as enzymes related to metabolic resistance to heat or cold, such as trehalase, trehalose-6-phosphate synthase, chitinase, and cathepsin (Table 4). Although most of these unigenes are not full-length sequences, they are nevertheless useful candidates for further characterisation by RACE to retrieve the full-length cDNAs. The abundance of these transcripts supports the quality of our sequencing data. This information will provide new leads for functional studies of the genes that play potential roles in beetle resistance to environmental stress.

Fig. 3 Histogram presenting clusters of orthologous groups (COG/KOG) classification. a Of 56,348 unigenes, 8,980 sequences were assigned to 25 COG classifications. b Of 56,348 unigenes, 18,014 sequences were assigned to 25 KOG classifications

Discussion

Reads generation and de novo sequence assembly

The de novo assembly of short reads without a reference genome remains a challenge in spite of the development of many bioinformatics tools for data assembly and analysis [43–46]. Here, we obtained more than 4.8 billion raw reads and assembled them de novo using the Trinity software. We obtained 41,109 unigenes that matched one or more of the searched databases. The unannotated unigenes may represent novel genes whose function has not yet been identified. Specifically, the unigenes had 35,034 (62.17%) hits to the nr database, which was higher than the hits to any of the other databases (Table 2). Most of the top nr matches (first hit) were to sequences from the red flour beetle (T. castaneum), probably because (1) M. punctipennis is evolutionarily close to T. castaneum, and (2) this is the only beetle with a completely sequenced genome [47]. We mapped 15.94% of the M. punctipennis unigenes to the COG database, 31.97% to the KOG database, 17.72% to KEGG pathways and 49.38% to GO terms, and found that 326 unigenes were
homologous to known stress resistance genes. Many other genes and pathways related to stress adaptation were identified but need to be analyzed further.

Heat shock protein genes

HSPs are expressed in most organisms in response to a wide range of stressful environmental conditions and are generally viewed as a protective cellular mechanism [48]. The HSP70 family includes the strictly stress-inducible HSP70 and the constitutive HSC70 (heat shock cognate proteins), the glucose-regulated protein Grp78 (BiP) [49], and the mitochondrial form mitHSP70 (grp75) [50]. In a previous study, we isolated the full-length cDNA sequence of an Hsp70 gene from M. punctipennis (Mphsp70) using the RACE-PCR technique. Real-time quantitative PCR showed that the mRNA levels of Mphsp70 at 37 °C and 42 °C were 21.6- and 389.3-fold, respectively, that of the control at 25 °C, and that the mRNA levels decreased with prolonged exposure to the high temperatures [51]. In the present transcriptome we obtained a considerable number of inducible HSP genes (72 in total), and we speculate that these genes may help M. punctipennis adapt to the extreme desert environment. In addition, two unigenes (Comp7218_c0_seq1 and Comp7218_c0_seq2) were found to be similar to the sequence of HSC70 cDNA. Since HSC70 is an important part of the protein folding machinery in a cell [52,53], the expression of HSC70 in M. punctipennis may help protect its tissues from stress.

Fig. 4 Unigenes from M. punctipennis related to metabolic pathways

Antifreeze protein genes

AFPs were characterized initially in marine fishes [54,55], where they protect their hosts from freezing by binding to seed ice crystals and preventing their growth [56]. AFPs lower the freezing point of a solution containing ice below the melting point of the ice. AFPs function in both freeze-resistant and freeze-avoiding insects; thus, AFPs may help insects survive the most inhospitable environments. In previous studies, four isoforms of AFPs from M. punctipennis have been isolated and identified [25,41,42].
Two of the cDNAs (Mpafps77 and Mpafps52) were from beetles that were collected in summer. The deduced amino acid sequences of the MpAFPs expressed in summer are one 12-residue repeat shorter and have significantly different C-terminal sequences compared with the MpAFPs expressed in winter [25]. Dozens of AFP isoforms have been identified in Choristoneura fumiferana [57], Tenebrio molitor [58] and Dendroides canadensis [24]. The functions of these AFP isoforms may differ. Six isoforms of cfAFP from C. fumiferana showed development-specific expression patterns [59].

Table 3 Putatively identified HSP genes (>500 bp) in M. punctipennis

Gene ID            Gene name  Length (bp)  First hit     E-value    Blast annotation [organism]
Comp9597_c0_seq1   HSP70      2,391        ADB44081      6.00E-22   heat shock protein 70 [Mantichorula semenowi]
Comp9719_c0_seq1   HSP70      1,046        AEB52075      1.00E-161  heat shock protein 70 [Microdera punctipennis]
Comp9719_c0_seq2   HSP70      982          AEB52075      6.00E-162  heat shock protein 70 [Microdera punctipennis]
Comp9872_c0_seq1   HSP70      2,149        XP_973521     2.00E-47   PREDICTED: similar to heat shock protein 70 B2 [Tribolium castaneum]
Comp1983_c0_seq1   HSP70      2,054        XP_002780413  0          heat shock protein 70, putative [Perkinsus marinus ATCC 50983]
Comp9719_c0_seq3   HSP70      864          ABQ39970      7.00E-120  heat shock protein 70 [Anatolica polita borealis]
Comp4058_c0_seq1   HSP70      818          NP_001164098  2.00E-157  heat shock protein TC005094 [Tribolium castaneum]
Comp2209_c0_seq1   HSP70      765          NP_001164098  1.00E-138  heat shock protein TC005094 [Tribolium castaneum]
Comp9464_c0_seq1   HSP70      509          AEB52075      3.00E-92   heat shock protein 70 [Microdera punctipennis]
Comp7346_c0_seq1   HSP70      505          AEB52075      1.00E-82   heat shock protein 70 [Microdera punctipennis]
Comp18449_c0_seq1  HSP70      2,054        XP_628228     0          heat shock protein, Hsp70 [Cryptosporidium parvum Iowa II]
Comp9719_c0_seq4   HSP70      687          ABQ39970      2.00E-112  heat shock protein 70 [Anatolica polita borealis]
Comp9719_c0_seq5   dnaK/70    534          ABQ39970      2.00E-77   heat shock protein 70 [Anatolica polita borealis]
Comp7218_c0_seq1   dnaK/70    1,816        XP_968075     4.00E-23   PREDICTED: similar to Heat shock protein cognate 1 CG8937-PA [Tribolium castaneum]
Comp7218_c0_seq2   dnaK/70    1,763        XP_968075     0          PREDICTED: similar to Heat shock protein cognate 1 CG8937-PA [Tribolium castaneum]
Comp7893_c0_seq1   dnaK/70    1,137        3LDL          7.00E-95   A Chain A, Crystal Structure Of Human Grp78
Comp69543_c0_seq1  dnaK/70    596          BAF49512      2.00E-74   heat shock protein 9 [Branchiostoma belcheri]
Comp4031_c0_seq1   dnaJ/70    1,043        XP_001388328  8.00E-32   heat shock protein [Cryptosporidium parvum Iowa II]
Comp6249_c0_seq1   CRY a B    905          XP_966780     3.00E-77   PREDICTED: similar to small heat shock protein 21 isoform 1 [Tribolium castaneum]
Comp10639_c0_seq1  CRY a B    897          XP_966780     1.00E-72   PREDICTED: similar to small heat shock protein 21 isoform 1 [Tribolium castaneum]
Comp10639_c0_seq2  CRY a B    667          XP_966780     5.00E-74   PREDICTED: similar to small heat shock protein 21 isoform 1 [Tribolium castaneum]
Comp5547_c0_seq1   CRY a B    682          XP_968760     6.00E-91   PREDICTED: similar to heat shock protein 1 [Tribolium castaneum]
Comp6543_c0_seq1   CRY a B    1,091        XP_973442     6.00E-75   PREDICTED: similar to small heat shock protein 21 [Tribolium castaneum]
Comp1975_c0_seq1   HSP20.6    913          XP_973685     4.00E-112  PREDICTED: similar to heat shock protein 20.6 [Tribolium castaneum]
Comp3391_c0_seq1   TST        719          XP_966808     3.00E-44   PREDICTED: similar to heat shock protein 67B2 [Tribolium castaneum]
Comp9978_c0_seq1   TST        670          XP_966808     5.00E-29   PREDICTED: similar to heat shock protein 67B2 [Tribolium castaneum]
Comp9978_c0_seq2   TST        653          XP_966808     6.00E-29   PREDICTED: similar to heat shock protein 67B2 [Tribolium castaneum]
Comp3391_c0_seq2   TST        586          XP_966808     7.00E-45   PREDICTED: similar to heat shock protein 67B2 [Tribolium castaneum]
Comp13432_c0_seq1  HSP90A     2,296        AAC47173      0          heat shock protein 90 [Eimeria bovis]
Comp9568_c0_seq1   HSPD1      2,263        XP_971630     2.00E-102  PREDICTED: similar to 60 kDa heat shock protein [Tribolium castaneum]
Comp11141_c0_seq3  HSP75      2,297        XP_001654758  0          heat shock protein [Aedes aegypti]

Similar to C. fumiferana and T. molitor AFPs, the MpAFPs apparently consist of many isoforms with conserved residues [60], which may play important roles in maintaining the integrity of the structure and function of the
AFPs. In the present study, three unigene sequences that potentially encode AFPs were identified; their sequences were conserved when aligned with those previously cloned (Comp9408_c0_seq1 and Comp9408_c0_seq2 vs. MpAFP, MpAFP1, MpAFPS52 and MpAFPS77). These sequences were obtained under different conditions, such as room temperature and cold treatment [25,41], which suggests that different MpAFPs may have additional functions that are triggered by environmental signals.

Metabolism related unigenes

According to the GO classification, the largest share of unigenes in the biological process category was related to metabolism. We analyzed 270 unigenes, which belong to 13 different groups and are related to the metabolism of M. punctipennis (Table 4). These genes were grouped into the following functions: transmembrane transporter activity (GO:0022857), catalytic activity (GO:0003824), polysaccharide catabolic process (GO:0000272), cysteine-type endopeptidase activity (GO:0004197), etc. Thirteen unigenes were annotated as trehalase (α-glucoside-1-glucohydrolase, EC 3.2.1.28), an enzyme that hydrolyzes trehalose to yield two glucose molecules. Trehalose is the major hemolymph sugar in most insects and acts as an indispensable substrate for energy production and macromolecular biosynthesis [61]. It is predominantly synthesized in the fat body and released into the hemolymph [62]. Trehalase plays a pivotal role in various physiological processes in insects, including flight metabolism [63], chitin synthesis during molting [64], and cold tolerance [65]. All these functions are achieved through the hydrolysis of trehalose (α-D-

Fig. 5 Alignment of the antifreeze protein sequences of M. punctipennis. Identical residues are shaded black; conserved substitutions are shaded grey. A dash (-) indicates an insertion or deletion. The antifreeze protein names and GenBank IDs of M. punctipennis: MpAFPS52 (ADJ93820.1), AFPS77 (ADJ93819.1), MpAFP1 (AAW67980.1), MpAFP (AAW67979.1)

Fig. 6 The homology relationships of M. punctipennis antifreeze proteins. The tree was generated using the neighbor-joining method provided by the software MEGA5 with Poisson correction for multiple amino acid substitutions, and a bootstrap test was performed with 1,000 replicates. The antifreeze protein names and GenBank IDs: MpAFPS52 (ADJ93820.1), AFPS77 (ADJ93819.1), MpAFP1 (AAW67980.1), MpAFP (AAW67979.1)

Table 4 Putative genes of interest related to stress resistance in M. punctipennis

Gene name                                  Number of unigenes with a hit in the nr database
Trehalase                                  13
TRET1/facilitated trehalose transporter    20
Trehalose-6-phosphate synthase             5
Glycogen                                   23
Chitinase                                  65
Cathepsin                                  57
Citrate synthase                           2
ATP synthase                               45
Aquaporin                                  10
Nucleoside diphosphate kinase              6
Cyclophilin                                8
Glutathione S transferase                  11
Superoxide dismutase                       5
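
The annotation rates quoted in the Discussion (62.17% nr, 15.94% COG, 31.97% KOG, 17.72% KEGG, 49.38% GO) can be re-derived from the unigene counts given in the text and in the captions of Figs. 2 and 3. The following Python sketch is only an illustrative check of that arithmetic, not part of the original analysis pipeline; the names TOTAL_UNIGENES, ANNOTATED and coverage_percent are hypothetical and introduced here for the example.

```python
# Illustrative check of the reported annotation rates.
# Counts are taken from the text and figure captions of this paper;
# the variable and function names are hypothetical.

TOTAL_UNIGENES = 56_348  # total unigenes in the assembly

# Number of unigenes with at least one hit per database
ANNOTATED = {
    "nr": 35_034,
    "COG": 8_980,
    "KOG": 18_014,
    "KEGG": 9_986,
    "GO": 27_823,
}


def coverage_percent(hit_count: int, total: int = TOTAL_UNIGENES) -> float:
    """Return the share of unigenes with a hit, as a percentage rounded to 2 decimals."""
    return round(100.0 * hit_count / total, 2)


coverage = {db: coverage_percent(n) for db, n in ANNOTATED.items()}

for db, pct in coverage.items():
    print(f"{db}: {pct:.2f}% of {TOTAL_UNIGENES} unigenes")
```

Under these counts the recomputed values agree with the percentages stated in the text, which also confirms that all five rates share the same denominator of 56,348 unigenes.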