Prediction, Annotation and Analysis of Human Promoters

Michael Q. Zhang

Cold Spring Harbor Laboratory

1 Bungtown Road, Cold Spring Harbor, NY 11724

INTRODUCTION

Since the celebrated discovery of the Watson-Crick double-helix structure of DNA, it has taken 50 years for the human genome to be sequenced. It may very well take another 50 years for the functional information to be fully decoded. Until recently, genome research has mainly focused on coding regions, where the immediate questions are “where are the protein-coding regions?” and “what are the functions of the gene products?”. Increasingly, the field is advancing towards non-coding regions, where the central questions become “where are the regulatory regions?” and “how do they control gene expression?”. In 1961, Jacob and Monod published “On the regulation of gene activity” at the 26th Cold Spring Harbor Symposium on Quantitative Biology, in which some of the fundamental concepts of gene regulation were first elegantly formulated. Regulatory regions are most fundamental because all gene structures are defined by and recognized through the cis-elements in such regions; furthermore, what a gene does in vivo is intimately related to when, where and how much it is expressed. A phenotype, upon which selection acts, is the integrated result of gene function and regulation. It has been argued that animal diversity is mainly due to the evolutionary expansion in regulatory complexity (Levine & Tjian 2003). Because most regulation occurs at the transcriptional level and the initiation of transcription is largely determined by the promoter located at the beginning of each gene, identification of promoters and the cis-regulatory elements within them has become the prerequisite for understanding gene regulation. For a few model organisms with compact genomes (such as phage, bacteria and yeast), many of the gene regulatory pathways or networks have been worked out. But for mammalian systems, such as human, systematic identification of regulatory regions and gene networks has turned out to be extremely difficult, largely due to the size and complexity of the genomes (and, as a result, the diversity of cell/tissue types and the complexity of developmental stages).

Here I will outline our approaches to this problem. As genome research is data and technology driven, many approaches in the field can soon become obsolete once new or more data or technologies become available. I will try to state generic ideas and methodologies that may be evolving with or refined by new data or technologies. I will also try to point out open problems and to suggest new experiments to attack them.

In silico prediction of mammalian promoters

Transcription of a eukaryotic protein-coding gene is preceded by multiple events; these include decondensation of the locus, nucleosome remodeling, histone modifications, binding of transcriptional activators (or derepressors) and coactivators to enhancers and promoters, and recruitment of the basal transcription machinery to form the preinitiation complex (PIC) at the core promoter. A core promoter is defined approximately as the DNA region (-40, +40) with respect to the transcriptional start site (TSS). It may contain the TFIIB recognition element (BRE) and the TATA-box at the 5’-end, the initiator (Inr) around the TSS and the downstream promoter element (DPE) at the 3’-end (see, e.g., Smale and Kadonaga 2003). Although distal enhancers/silencers in a mammalian genome can be 10 ~ 100 kb away from the target gene, most of the cis-regulatory elements are contained in a proximal promoter region of 0.5 ~ 2 kb in size. The density profile of putative matches to known transcription factor binding sites (TFBSs) was originally used to develop the first computational promoter prediction program, PromoterScan (Prestridge 1995; see Fickett and Hatzigeorgiou 1997 for a survey and evaluation of earlier promoter prediction programs). Later, discriminative oligonucleotide-based algorithms, such as PromoterInspector (Scherf et al. 2000), showed much improved performance.

We hypothesized that molecular pattern recognition may be achieved by different molecular machinery with different resolutions at different scales (Zhang 1998a). By analogy, if one tries to locate a landmark on earth from an airplane, one could use a coarse-grained tool to locate a regional landscape before zooming in with a finer mapping tool. Ideally, a coarse-grained promoter finder should be able to detect a chromatin and/or epigenetic landscape at the proximal promoter level (resolution < 2 kb). The problem could be easier if one had 3D structural images (and this could happen within the next 10 years). With only the primary DNA sequence, one has to use large-scale statistical features at those length scales. Fortunately, for human (or vertebrate) genomes, CpG islands can provide one such discriminative feature for at least 50% of genes (Antequera and Bird 1993)! The human genome contains ~50,000 CpG islands (~30,000 after repeat masking), and the majority of these are near promoters. Computationally, a CpG island is defined (Gardiner-Garden and Frommer 1987) as a DNA region > 200 bp that has > 50% GC-content and a ratio of observed to expected CpG > 0.6. Using these criteria, one would find ~345,000 CpG islands in the human genome. By detecting promoter-associated CpG islands, we have developed an algorithm (called CpG_Promoter) for coarse-grained promoter mapping (Ioshikhes and Zhang 2000). Promoter-associated CpG islands tend to be larger (0.5 ~ 2 kb) and to have higher GC-content and CpG ratio; other CpG islands are mostly associated with Alu repeats. Takai and Jones (2002) proposed a new definition: size > 500 bp with > 55% GC-content and > 0.65 CpG ratio. Using these new criteria, one would find ~37,000 CpG islands. Later, other CpG island-based promoter prediction algorithms, such as CpG+ (Hannenhalli and Levy 2001) and CpGProD (Ponger and Mouchiroud 2002), have also become available. We would like to see more large-scale experimental data, such as chromosome banding, methylation patterns, histone modification profiles, DNase hypersensitive sites, ChIP profiles and genome-wide transcription reporter constructs. Integrating these data will allow better promoter landscape mapping algorithms to be developed.
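The two CpG island definitions quoted above can be written down as a simple test on a candidate region. The following is only a minimal sketch, assuming the whole input string is the candidate region (no sliding window or island merging) and that the expected CpG count is (#C x #G)/length; the function and constant names are illustrative and are not taken from CpG_Promoter or any other published program.

```python
def cpg_stats(seq):
    """Return (length, GC fraction, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")
    gc_fraction = (c + g) / n if n else 0.0
    # Assumption: expected CpG count under independence is (#C * #G) / length.
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return n, gc_fraction, obs_exp


# Thresholds as quoted in the text: (min length in bp, min GC fraction, min obs/exp CpG).
GARDINER_GARDEN_FROMMER = (200, 0.50, 0.60)
TAKAI_JONES = (500, 0.55, 0.65)


def is_cpg_island(seq, criteria=GARDINER_GARDEN_FROMMER):
    """True if the whole region exceeds all three thresholds of the chosen definition."""
    min_len, min_gc, min_ratio = criteria
    n, gc, ratio = cpg_stats(seq)
    return n > min_len and gc > min_gc and ratio > min_ratio
```

Under this sketch a 300 bp GC-rich region can pass the first definition and fail the second purely on length, which illustrates why the stricter thresholds reduce the genome-wide count so sharply.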

For finer promoter mapping, aiming at predicting the TSS with resolution < 100 bp, we developed an algorithm called CorePromoter (Zhang 1998b), based on quadratic discriminant analysis (QDA) using position-dependent oligonucleotide features (the positions are designed to capture the known core-promoter elements). By combining a coarse-grained and a fine-grained prediction tool, I demonstrated how the TSS could be precisely located for the APP gene (a 300 kb gene on chromosome 21) encoding the amyloid precursor protein (Zhang 2000). Instead of oligomers, Eponine uses a set of weight matrices in a hybrid machine-learning approach (Down and Hubbard 2002) to identify TSSs. Dragon Promoter Finder (DPF) is an artificial neural network (ANN)-based algorithm that uses multiple sensors (promoters, exons and introns) to predict TSSs (Bajic et al. 2002).

Because gene structures are often correlated (i.e., neighboring introns or exons can help predict promoters, as demonstrated by DPF above; see the review by Zhang 2002a), we have developed FirstEF, which integrates promoter, 5’UTR and first-intron information to predict human first exons and promoters simultaneously (Davuluri et al. 2001). Since there is increasing evidence that transcription and splicing are coupled, we expect that the promoter may influence first donor site selection. Recently, DPF output and CpG islands were integrated into a larger ANN program called Dragon Gene Start Finder (DGSF), which achieved comparable promoter prediction performance in a test using chromosomes 4, 21 and 22 (Bajic and Seah 2003). Although modern gene prediction programs, such as Genscan (Burge and Karlin 1997), try to predict first coding exons, FirstEF is the only program that is capable of predicting non-coding (untranslated) first exons. The most important open problems in promoter prediction are (1) how to improve accuracy in predicting promoters (or first exons) that are not CpG island associated; and (2) how to predict alternative promoters and multiple TSSs (a single promoter can regulate multiple start sites, especially when the multiple start site downstream element (MED-1) is present; Ince and Scotto 1990).

Automatic construction of CSHL Mammalian Promoter Database reference system

As microarray expression data become prevalent, biologists often need to extract various sets of promoter sequences from clustered genes (Zhang 1999a). Originally, we developed PEG (Promoter Extraction from GenBank), which takes a set of accession numbers or ESTs as input, to facilitate the extraction of large sets of promoters (Zhang and Zhang 2001). When the nearly finished genome became available in April 2003 (Human build 33), we developed an automated annotation pipeline (an expert system) called FexAnnotator (First exon annotator, Davuluri et al. 2003), which can reduce false positives and false negatives in the FirstEF predictions by using existing knowledge in the public sequence database annotations (mRNA/EST and ENSEMBL genes). In this first-pass annotation, we have ~53,000 first exons (including ~8,000 alternative first exons, annotated only for the Refseq genes). An accuracy check shows that among ~10,000 experimentally verified first exons (such as those in EPD and in DBTSS), ~80% were found within 500 bp of our pipeline predictions. Another check, using known TFBSs in TRANSFAC, shows that the density of these TFBSs is indeed concentrated in the vicinity of annotated core promoters.
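The accuracy check described above (the fraction of verified first exons lying within 500 bp of a prediction) amounts to a nearest-neighbor lookup. The sketch below is an illustration only: it assumes all positions are integer coordinates on a single chromosome and strand, and the function name is hypothetical rather than part of FexAnnotator.

```python
from bisect import bisect_left


def fraction_within(verified, predicted, window=500):
    """Fraction of verified positions with at least one prediction within `window` bp."""
    preds = sorted(predicted)
    if not verified or not preds:
        return 0.0
    hits = 0
    for pos in verified:
        i = bisect_left(preds, pos)
        # Only the neighbors on either side of the insertion point can be nearest.
        nearest = min(abs(pos - preds[j]) for j in (i - 1, i) if 0 <= j < len(preds))
        if nearest <= window:
            hits += 1
    return hits / len(verified)
```

For a genome-wide check one would apply this per chromosome and strand and pool the counts.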

For genome-scale regulation studies, it is essential to build a high-quality promoter database that allows easy and flexible data query or retrieval as well as on-the-fly analysis. Our Saccharomyces cerevisiae Promoter Database (SCPD, Zhu and Zhang 1999) has proved to be very useful for the yeast community. In order to better annotate human promoters, we are currently building the CSHL Mammalian Promoter Database, which initially includes the Homo sapiens Promoter Database (HsPD, based on Human build 33, April 2003), the Mus musculus Promoter Database (MmPD, based on the Mouse release of Feb. 2003) and the Rattus norvegicus Promoter Database (RnPD, based on the Rat release of Jan. 2003). A new pipeline has been developed (Z.Y. Xuan et al. unpubl.); in addition to ENSEMBL (Hubbard et al. 2002), it also makes use of results from GenomeScan (Yeh et al. 2001), Fgenesh+ (Solovyev 2002) and TwinScan (Korf et al. 2001) in order to annotate promoters for potential novel genes (for further experimental validation). The new pipeline, taking advantage of cross-species comparisons, can automatically annotate multiple genomes in parallel on a Linux cluster (Figure 1) and has been used to create the initial reference system for the CSHL Mammalian Promoter Database (Z.Y. Xuan, F. Zhao, et al. unpubl.). In this database, orthologous promoters will be linked so that a user can input a list of UniGene IDs or accession numbers (from clustered microarray data, say), specify the range of the promoter region, extract orthologous promoter sequences and do motif finding on-the-fly; or select a gene of interest, do orthologous promoter alignment on-the-fly and look for conserved motifs (Figure 2). Maintaining computability in addition to manual browsability will serve both computational and experimental biologists well.

Functional curation of cell cycle transcription factors and their target genes

A promoter reference system created by an automatic pipeline can ensure completeness; it is consistent with most of the known information and also has reasonable accuracy. To be useful, it must also contain rich functional information (TFs, TFBSs, TSSs, CpG islands) and links to other related databases and literature references. Therefore, we are adding TRED (Transcription Regulatory Element Database, F. Zhao et al. unpubl.) on top of HsPD/MmPD/RnPD (Figure 1), which allows semi-automated or even hand-curated information to be entered. The three most important issues that every useful database must face are (1) assigning quality values to the raw records; (2) ensuring accuracy and usefulness; and (3) open data dissemination. For (1), we have assigned different quality values to promoters and TFBSs according to how they were derived. For (3), we are discussing with NCBI (D. Lipman, pers. comm.) and EBI (E. Birney, pers. comm.) ways to incorporate our results into public databases. The most difficult and time-consuming task is (2), which involves hand-curation and outreach to expert transcription labs. We are initially focusing on cell cycle and cancer-related TFs, including their target genes, and will give authorship to transcription labs that contribute data or expertise. Currently, out of 60,519 promoters (40,658 genes) in the human part of TRED, only 2,003 promoters (1,853 genes) are in the best quality class (known and curated). The other classes are: known but not curated, predicted based on Refseq, predicted based on other mRNAs, predicted based on other ESTs, and purely predicted. As an example, for human E2F targets, TRED contains 233 promoters (182 genes) in the best quality class.

High throughput experimental validations

All computational predictions must be subjected to experimental verification, and both positive and negative results are crucial feedback for further database and algorithm improvement. The lack of high-throughput experimental validation has become the bottleneck in this feedback loop. As cDNA libraries approach saturation, novel gene finding has gradually shifted its paradigm from EST sequencing to computational prediction plus experimental validation (Das et al. 2001, Guigo et al. 2003). To validate first exons and TSSs, obtaining 5’-complete cDNAs is essential (Suzuki et al. 2000, Davuluri et al. 2000). Recently, the 5’-end quality of Refseq and MGC clones has been assessed by randomly sampling clones and assaying the transcriptional activity of their upstream sequences with reporter constructs (Trinklein et al. 2003).

In collaboration with the McCombie and Hannon labs at CSHL on developing high-throughput experimental genome annotation technologies, we have performed systematic 5’-RACE-PCR validation of 300 first exon predictions in 15 mouse tissue libraries (Balija et al. 2003). We selected the predicted exons in five categories according to their supporting evidence: (a) EPD (this serves as the positive control), (b) Refseq, (c) more than two ESTs, (d) only one EST, (e) pure prediction. The success rates are 12/13, 17/27, 18/23, 28/169 and 16/68, respectively. Thus ~25% of the purely predicted novel genes are likely to be real.
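As a quick arithmetic check on the reported counts, the per-category success rates can be tabulated as below. The numbers are exactly those quoted in the text; the category labels and variable names are only illustrative.

```python
# (validated, tested) per evidence category, as reported in the text.
results = {
    "EPD (positive control)": (12, 13),
    "Refseq": (17, 27),
    "more than two ESTs": (18, 23),
    "only one EST": (28, 169),
    "pure prediction": (16, 68),
}

# The five categories should account for all 300 tested predictions.
total_tested = sum(tested for _, tested in results.values())

rates = {name: ok / tested for name, (ok, tested) in results.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.1%}")
```

The pure-prediction category gives 16/68, or about 24%, which is the basis of the ~25% figure.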

Working with the Wang lab at the University of Chicago on developing GLGI (Generation of Longer 3’ cDNA from SAGE Tag for Gene Identification; see Chen et al. 2003)-based genome annotation technology, we also obtained 57 positives from a test of 104 first exon predictions in human tissues, and 15 full-length cDNAs were sequenced from 47 novel exon/SAGE-tag clones (S.M. Wang, pers. comm.).

To test promoter activities, we have collaborated with the Stubbs lab at LLNL in annotating predicted genes/promoters in an 800 kb region (containing 48 genes) of human chromosome 19q13 using a luciferase reporter system in addition to RT-PCR. Out of 38 tested predictions, 26 tested positive (L. Stubbs, pers. comm.; see Figure 3).

These experiments have demonstrated the validity of the large-scale computational prediction plus experimental verification approach for accurate genome annotation. It is also alarming that many previous false positives can turn into true positives after more tissues are tested or more sensitive experimental techniques are used (Kapranov et al. 2003). The new challenge for computational biologists is how to recover false negatives, while for experimental biologists it is how to prove a false positive!

Computational challenges in identification of cis-regulatory elements and transcriptional networks

Although most TFBSs are in the promoter region, many may be in the first intron (which can also be located by FirstEF prediction) and some may be in the 3’-flanking region (which can be located by EST/poly(A) mapping). There are also many distal enhancers/silencers/boundary elements that are far away from their target genes; these are the most difficult to find, and even when they are found, linking them to the correct target genes is no easy task. We are focusing on the proximal promoter region for cis-regulatory element discovery; many of the methods may also be applied to other regulatory regions once they are approximately localized (for example, by comparative genomic analysis, DNase hypersensitivity mapping or enhancer trapping technologies).

A. Computation-then-validation paradigm

Traditionally, identification of a cis-regulatory element is very laborious: collect known binding sites, build a consensus or weight matrix and search for new loci. One cannot discover novel sites that differ from the known ones in this way. To study human cell cycle regulation, we have developed E2F SiteScan, based on a genetic algorithm trained on known sites in TRANSFAC, and scanned ~5,000 promoters in the public database to identify more than 300 E2F targets, many of which were also validated by the ChIP-PCR method (Kel et al. 2001). Since the E2F motif was built mainly from known cell cycle genes, it may be biased, as E2F also plays important roles in other biological pathways (such as apoptosis and DNA repair). By analyzing promoters from the top ChIP-PCR candidates, we were able to identify novel E2F targets that do not have the conventional binding motif (Weinmann et al. 2001). But the scope of PCR-based approaches is still very limited. As large-scale genome-wide data and technologies become available, one is now able to study TFBSs in the whole genome together with their transcriptional readouts. Computational approaches are expected to become more indispensable and to play more important roles in the future (Zhang et al. 2002, Zhang 2002b).

B. Large scale gene expression analysis

DNA microarray gene expression profiling has become a widely used method for studying gene regulation. It provides a direct readout of cellular transcriptional programs. Interpretation of gene expression patterns in terms of cis-elements and trans-factors, or conversely reconstruction of regulatory circuits from transcriptional responses, is the main challenge in the 21st century (Zhang 1999a, Banerjee and Zhang 2002). Using cluster analysis followed by motif searching in promoters of co-regulated genes, we were quite successful in identifying cis-elements involved in yeast cell cycle regulation (Spellman et al. 1998, Zhang 1999b). By incorporating functional information, such as MIPS (Zhu and Zhang 2000) or GO (Chen et al. 2003), one can further select gene clusters that are not only co-expressed but also share a significant number of genes involved in similar functional pathways or structural complexes.

Human cis-element detection is much more difficult due to a much smaller signal-to-noise ratio (the promoter region is much larger and more uncertain, motifs are more degenerate, there are many repeats, etc.). The most commonly used motif finders, such as Consensus (Hertz et al. 1990), MEME (Bailey and Elkan 1994) and the Gibbs sampler (Lawrence et al. 1993, Neuwald et al. 1995), assume a specific background model (e.g., a Markov model of order k). In order to increase specificity, we have developed a novel motif-finding software package called BEAST (Binding Element AnalySis Tools, Hata and Zhang 2003) that allows arbitrary background sequences to be used as the control set. The algorithm is based on an exhaustive word-counting strategy (allowing gaps and reverse complements; overlapping words are treated similarly to van Helden et al. 2000). For each motif, the Fisher exact test (or the chi-square test with Yates’s correction) is used to evaluate a p-value (with multiplicity correction) for the significance of the motif’s association with the target (promoter) sequences against the background control. BEAST has been applied to microarray expression data from transcription factor knockout experiments (Chen et al. 2003), using the up-regulated promoters, the down-regulated promoters or their combination as the target and the unchanged promoters as the control. Combined with GO annotation (Ashburner et al. 2000), the results agree well with the corresponding ChIP-chip analysis (data not shown).
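The word-counting test against a control set can be illustrated with a small sketch. This is not the BEAST implementation: it assumes ungapped words, counts each sequence at most once per word (on either strand), uses a one-sided Fisher exact test computed from the hypergeometric tail, and applies a Bonferroni factor as the multiplicity correction. All function names are hypothetical.

```python
from math import comb

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(word):
    """Reverse complement of an uppercase DNA word."""
    return word.translate(_COMPLEMENT)[::-1]


def count_hits(seqs, word):
    """Number of sequences containing the word on either strand."""
    rc = revcomp(word)
    return sum(1 for s in seqs if word in s or rc in s)


def fisher_one_sided(a, b, c, d):
    """P(X >= a) for the 2x2 table [[a, b], [c, d]] under the hypergeometric null.

    Rows are target/control sequences; columns are word present/absent.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1))
    return tail / comb(n, col1)


def word_enrichment(targets, controls, word, n_tests=1):
    """Bonferroni-corrected p-value for enrichment of `word` in targets vs controls."""
    a = count_hits(targets, word)
    c = count_hits(controls, word)
    p = fisher_one_sided(a, len(targets) - a, c, len(controls) - c)
    return min(1.0, p * n_tests)
```

When all words of length k are scanned exhaustively, `n_tests` would be on the order of 4**k (fewer if reverse-complement pairs are merged), which is why only strongly enriched words survive the correction.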

BEAST was tested on detecting liver-specific promoter elements, using a set of 35 proximal promoters of known liver-specific genes as the targets and the pool of 1,800 EPD promoters as the control. The HNF-1 motif YAMT..TTRA (p = 6.1 × 10^-12) was clearly identified on top of other putative motifs (Hata and Zhang 2003). The new challenge is to apply BEAST systematically to mammalian tissue expression data, using tissue-specific gene promoters as the targets and the pool as the control, to discover tissue-specific promoter elements. Future adaptation of BEAST to weight matrices should further improve its sensitivity for degenerate motifs or motif combinations.

C. Large scale chromatin localization analysis

Unlike the indirect co-regulation strategy above, the ChIP-chip assay allows detection of TF binding targets across the whole genome by cross-linking protein to chromatin DNA in vivo. The first two human ChIP-chip experiments used a CpG island DNA chip (Weinmann et al. 2002) or a Refseq gene promoter chip (Ren et al. 2002) to map E2F4 target genes.

In collaboration with the Ren lab, we have used the ChIP-chip assay to discover a global transcriptional regulatory role for c-myc in Burkitt’s lymphoma cells (Li et al. 2003). We find that c-myc, together with its heterodimeric partner Max, occupies more than 15% of the gene promoters tested and that the two colocalize with TFIID in these cells, indicating a general role for over-expressed c-myc in global gene regulation in some cancer cells. One surprise from the promoter analysis is that many of the targets do not have the conventional E-box; instead, using BEAST, we find a novel motif, CGGAAG, which is the most significant cis-element shared by a large number of c-myc/Max binding target promoters (Hata et al. unpubl.). Furthermore, most of these elements are located near the TSS (within 100 bp) and their positions are conserved among human, mouse and rat (data not shown). We are currently seeking an experimental test of its functional relevance.

Recently, two other motif detection algorithms suitable for ChIP-chip and expression data analysis have appeared. One is a word-based linear regression algorithm called REDUCE (Bussemaker et al. 2001) and the other is a hybrid (word enumeration and weight matrix) greedy search algorithm called MDscan (Liu et al. 2002). Compared with these, BEAST conveniently provides motif p-values and is more discriminative against a given background control set.

D. Comparative genomic analysis

Increasingly, comparative genomics has become a very powerful method for detecting functional elements in non-coding regions. We began with a comparative DNA sequence analysis of mouse and human protocadherin gene clusters in collaboration with experimentalists. The genomic organization of the human protocadherin α, β and γ gene clusters (designated Pcdhα, Pcdhβ and Pcdhγ) is remarkably similar to that of immunoglobulin and T-cell receptor genes. The extracellular and transmembrane domains of each protocadherin protein are encoded by an unusually large “variable” region exon, while the intracellular domains are encoded by three small “constant” region exons located downstream from a tandem array of variable region exons. By comparing human draft and mouse BAC sequences, we were able to identify an alternative CpG island-associated promoter in front of each variable exon in the α and γ gene clusters as well as a highly conserved cis-regulatory element within the promoter (Wu et al. 2001). Later, it was confirmed that these cis-elements are functionally important (Wang et al. 2002) and that alternative promoter choice determines first intron splice site selection (Tasic et al. 2002).

To build our comparative genomics infrastructure, we carried out a whole-genome comparison between human and both (Celera and public) versions of the mouse assembly and published our CSEdb (Conserved Sequence Element database) (Xuan et al. 2002). CSEs cover ~3% of the human genome. One third of these CSEs are related to known genes and some are related to other functional elements (such as RNA genes, antisense genes, etc.), but more than half are still functionally unknown. Unknown CSEs provide excellent candidates for discovering novel genes or cis-regulatory regions. CSEs also allow us to arrive at another independent estimate of the number of human genes (~40,000).

Although comparative genomics has proved promising for discovering cis-regulatory regions (Pennacchio and Rubin 2001), different promoters evolve at different rates, so multiple species will be needed to narrow conserved regions down to short TFBSs. Initial success in yeast (Kellis et al. 2003, Cliften et al. 2003) may not translate directly to human; novel integrated approaches will be required to tease out functional cis-elements even if the number of mammalian genomes were doubled.

E. Integration, combinatorial analysis and network reconstruction

Genomic data are noisy; the best weapon for combating noise is signal correlation analysis. Combinatorial interaction among TFs introduces correlation among their binding sites. Recently, new motif-finding algorithms, such as CO-Bind (GuhaThakurta and Stormo 2001), have been designed specifically for detecting correlated motifs. Integration of evolutionary conservation with word-pair analysis can yield a better regression to expression data (Chiang et al. 2003).

Integration of ChIP-chip and expression data at the single-motif level has recently been attempted (Conlon et al. 2003). We have developed two methods for studying cooperativity by integrating ChIP-chip data and microarray expression data. For a given pair of TFs, A and B, the first method compares the expression patterns of the targets of both TFs to those of the targets of A or B alone. If the former are more coherent (correlated), it is more likely that the two TFs interact in the transcriptional regulation of their common targets (Banerjee and Zhang 2003). The second method further integrates promoter sequence analysis in order not only to infer the interacting TFs, but also to assign their corresponding binding sites by iteratively and exhaustively searching for significant TF combinations and motif combinations up to the triplet level (Kato et al. 2003). After analyzing ChIP-chip data for over a hundred TFs (Lee et al. 2002), we were able to reconstruct the yeast cell cycle transcriptional regulation network, with the following results: (1) it extends the previous chain of single regulators to an expanded chain of regulatory modules; (2) modules at adjacent phases often share a common component that can bridge the continuity of the cycle; (3) there are modules at specific checkpoints (branchpoints) that allow cells to enter or exit the cycle according to external signals (Figure 4). Experimental verification is necessary to confirm any network predictions (Segal et al. 2003).
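The idea behind the first method (comparing the coherence of common targets with that of targets bound by only one factor) can be sketched as follows. This is an illustration under stated assumptions, not the published procedure: coherence is taken here to be the mean pairwise Pearson correlation of expression profiles, no significance test is included, and all names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy) if sxx and syy else 0.0


def coherence(genes, expr):
    """Mean pairwise correlation among the expression profiles of `genes`."""
    pairs = list(combinations(sorted(genes), 2))
    if not pairs:
        return 0.0
    return sum(pearson(expr[g], expr[h]) for g, h in pairs) / len(pairs)


def cooperativity_signal(targets_a, targets_b, expr):
    """Coherence of targets bound by both TFs, by A only, and by B only."""
    both = targets_a & targets_b
    only_a = targets_a - targets_b
    only_b = targets_b - targets_a
    return coherence(both, expr), coherence(only_a, expr), coherence(only_b, expr)
```

Here `targets_a` and `targets_b` would be the sets of genes bound by each factor in the ChIP-chip data, and `expr` a mapping from gene name to its expression profile; a clearly higher first value than the other two is the pattern suggestive of cooperation.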

We are waiting for experimentalists to generate good-quality ChIP-chip and expression data from the same sample preparations for mammalian systems, as well as to sequence multiple vertebrate genomes. Mammals alone are not enough for cis-element studies of human; one needs distant organisms (such as chicken, for phylogenetic footprinting) as well as close ones (such as chimpanzee, for phylogenetic shadowing).

CONCLUSIONS

It is clear now that having a “periodic table” of genes is not enough; we also need a network diagram telling us how the genes are connected, and for this we are going to need another “periodic table” of gene regulatory elements. The combination of computational and functional genomics will help us fill in these tables quickly. Infrastructure such as promoter databases and cis-element/trans-factor databases is urgently needed. New technologies that can provide different genome-wide views of the regulatory networks and new algorithms that integrate various large-scale data will be the keys to attacking human gene regulation problems (Banerjee and Zhang 2002). Conservation is important for revealing function; non-conservation can be even more important for understanding evolution (Wray et al. 2003). The recent discovery of a promoter that acquired p53 responsiveness during primate evolution through microsatellite expansion of weak binding sites (Contente et al. 2003) is an amazing testimony, and for this, one would have to look beyond just rodents.

Acknowledgements

I would like to thank all (including previous) members of the Zhang lab and my collaborators for contributing most of the data and the figures, many before publication. The Zhang lab is supported by grants (HG01696, GM60513, CA81152, CA88351) from NIH.

References

Ashburner M., Ball C.A., Blake J.A., Botstein D., Butler H., Cherry J.M., Davis A.P., Dolinski K., Dwight S.S., Eppig J.T., Harris M.A., Hill D.P., Issel-Tarver L., Kasarskis A., Lewis S., Matese J.C., Richardson J.E., Ringwald M., Rubin G.M., and Sherlock G. 2000. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet. 25:25.

Antequera F. and Bird A. 1993. Number of CpG islands and genes in human and mouse. Proc. Natl. Acad. Sci. USA 90:11995.

Bajic V.B., Seah S.H., Chong A., Zhang G., Koh J.L.Y., and Brusic V. 2002. Dragon Promoter Finder: recognition of vertebrate RNA polymerase II promoter. Bioinformatics 18:198.

Bajic V.B. and Seah S.H. 2003. Dragon Gene Start Finder identifies approximate locations of the 5’ ends of genes. Nucleic Acids Res. 31:3560.

Balija V., Nascimento L., Dike S., Zutavern T., Oui J., Palmer L., Hannon G., Xuan Z.Y., Zhang M.Q., and McCombie W.R. 2003. The mammalian gene set: systematic examination of gene predictions in mouse genome. Submitted.

Banerjee N. and Zhang M.Q. 2002. Functional genomics as applied to mapping transcription regulatory networks. Current Opinion in Microbiology 5:313.

Banerjee N. and Zhang M.Q. 2003. Identifying cooperativity among transcription factors controlling yeast cell cycle. Submitted.

Bailey T.L. and Elkan C.P. 1994. Fitting a mixture model by expectation maximization to discover motifs in biopolymers. Intell. Sys. Mol. Biol. 2:28.

Burge C. and Karlin S. 1997. Prediction of complete gene structures in human genomic DNA. J. Mol. Biol. 268:78.

Bussemaker H.J., Li H., and Siggia E.D. 2001. Regulatory element detection using correlation with expression. Nat. Genet. 27:167.

Chen G.X., Hata N., and Zhang M.Q. 2003. Transcription factor binding element detection using functional clustering of mutant expression data. Submitted.

Chen J.J., Lee S., Zhou G., Rowley J.D. and Wang S.M. 2003. Generation of longer cDNA fragments from SAGE tags for gene identification. Methods Mol. Biol. 221:207.

Chiang D.Y., Moses A.M., Kellis M., Lander E.S., and Eisen M. 2003. Phylogenetically and spatially conserved word pairs associated with gene-expression changes in yeasts. Genome Biol. 4:R43.

Cliften P., Sudarsanam P., Desikan A., Fulton L., Fulton B., Majors J., Waterson R., Cohen B.A. and Johnston M. Finding functional features in Saccharomyces genomes by phylogenetic footprinting. Science301:71.

Conlon E.M., Liu X.S. Lieb J.D. and Liu J.S. 2003. Integrating regulatory motif discovery and genome-wide expression analysis. Proc Natl Acad Sci U S A. 100:3339.

Contente A. Zischler H., Einspanier A., and Dobbelstein M. 2003. A promoter that acquired p53 responsiveness during primate evolution. Cancer Res. 63:1756.

Das M., Burge C.B., Park E., Colinas J., and Pelletier J. 2001. Assessment of the total number of human transcription units. Genomics 77:71.

Davuluri R.V., Suzuki Y., Sugano S., and Zhang M.Q. 2000. CART classification of 5’UTR sequences. Genome Res. 10:1807.

Davuluri R.V., Grosse I., and Zhang M.Q. 2001. Computational identification of promoters and first exons in the human genome. Nat. Genet. 29:412.

Davuluri R.V., Grosse I., and Zhang M.Q. 2003. Annotation of promoters and first exons in the human genome. Submitted.
