MORAL -- A Vision-based Object Recogntion System for Autonomous Mobile Systems

合集下载

基于改进YOLOX的钢材表面缺陷检测研究

现代电子技术Modern Electronics Technique2024年5月1日第47卷第9期May 2024Vol. 47 No. 90 引言钢材作为一种重要的工业产品正随着经济的发展扩张规模、提升产量，尽管目前生产制造水平有了巨大的进步，但在钢材的生产和加工过程中，很容易受到各种不良因素的影响，从而使钢材表面产生多种类型的缺陷[1]。

比较常见的缺陷有划痕、孔洞、裂纹，这些不同类型的缺陷会使得钢材的耐久性、使用强度急剧下降，缺陷会随着时间影响正常使用，甚至会造成不可预料的后果，所以，迅速而精确地识别钢材表面的缺陷变得至关重要。

在AI 技术日益成熟的今天，计算机视觉逐步取代了传统的检测手段。

计算机视觉可以较好地解决传统检测方法漏检率高、成本高等缺点，它在图像分类、人脸识别和目标检测等领域得到了广泛应用[2]。

近年来，在金属表面缺陷检测领域，文献[3]提出了一种基于自适应空间特征融合结构与EIOU 损失函数的改进YOLOv4算法，提高了检测精度。

文献[4]在YOLOv4的基础上设计了一个并行的双通道注意力模块，提出了YOLO ⁃DCSAM 算法对铝带缺陷进行检测。

文献[5]基于YOLOX 模型，结合新型特征提取网络ECMNet 与数据增基于改进YOLOX 的钢材表面缺陷检测研究刘毅，蒋三新(上海电力大学电子与信息工程学院，上海 201306)摘要：针对目前单阶段目标检测网络YOLOX 的特征提取能力不足、特征融合不充分以及钢材表面缺陷检测精度不高等问题，提出一种改进YOLOX 的钢材表面缺陷检测算法。

首先，在Backbone 部分引入改进的SE 注意力机制，增添一条最大池化层分支，进行权重融合，强化重要的特征通道；其次，在Neck 部分引入ASFF 模块，充分利用不同尺度的特征，更好地进行特征融合；最后，针对数据集所呈现的特点，将IOU 损失函数替换为EIOU 损失函数，改善模型定位不准确的问题，提高缺陷检测精度。

Model-based recognition of 3d objects from single images

Abstract
Keywords:
object recognition, model based visio
Almost all the work on invariants so far has been concerned with transformations between spaces of equal dimensionality, e.g. 40, 41, 25]. In the single-view case, invariants were found for the projection of a planar shape onto the image, although the planar shape was embedded in 3D. For real 3D objects, most of the work has involved multiple views with known correspondence, which amounts to a 3D to 3D projection. Yet, humans have little problem recognizing a 3D object from a single 2D image. This recognition ability cannot be based on pure geometry, since it has been shown (e.g. 7]) that there are no geometric invariants of a projection from 3D to 2D. Thus, when we only have 2D geometric information, we need to use some modeling assumptions to recover the 3D shape. There are several possibilities for a modelling assumption. The simplest one is having a library of speci c 3D models. In theory there could be many models that project into the same image, so an object cannot be identi ed uniquely. In practice, however, in most cases there is only one or very few models in the database that would project to the same image, so it is possible to recognize them. Another possibility is to have more generic assumptions, rather than speci c models. One such assumption can be that the visible object is symmetric in 3D. More general assuptions for curved objects were studied in 45]. In this paper we deal mainly with the two assumptions mentioned above, namely speci c models and symmetry. To a lesser extent we use other assumptions, such as that a vanishing point in a 2D image indicates parallel lines in 3D. The outline of our recognition method is as follows. We de ne a 3D invariant space, namely a space with three invariant coordinates I1; I2; I3. Given a 3D model, we can extract a set of such invariant triplets from it, so it can be represented as a set of points in the invariant space. Given an image of the model, the depth information is lost so the invariant point cannot be recovered. However, we show that we can draw a set of \invariant light rays" in 3D, each ray passing through a 3D invariant model point (Fig. 1). When enough rays intersect the model points in 3D, we can safely assume that the model is indeed the one visible in the image. We do not need to search for correspondences. We can also see that the rays converge at a point in the invariant space that represents the location of the camera center with respect to the model. Thus it is easy to nd the pose of the model. Given this, we can project the original (non-invariant) model onto the image. That makes it possible to perform a more exact match between the model projection and the given image of the object. In summary, the invariant modeling assumption and the object descriptors make it possible to perform recognition regardless of viewpoint and with no need for a search for correspondence. The use of modeling for shape recovery from single images is of course not new. However, most of the earlier work was not concerned with viewpoint invariance. Some recent research does use invariance in modeling. However, most of it uses very speci c modeling assumptions that cannot be applied to general shapes. A major example is the assumption that the objects are composed of \general cylinders" 4, 46] of various forms. The invariance and generality of this assumption are limited. A subset of this is the 1

2017年6月英语六级阅读真题及答案第1套选词填空

2017年6月英语六级阅读真题及答案第1套选词填空After becoming president of Purdue University in2013, Mitch Daniels asked the faculty to prove that their students have actually achieved one of higher education’s most important goals: critical thinking skills. Two years before, a nationwide study of college graduates had shown that more than a third had made no 26 gains in such mental abilities during their school years. Mr. Daniels needed to__27__ the high cost of attending Purdue to its students and their families. After all, the percentage of Americans who say a college degree is "very important" has fallen 28 in the last 5-6 years.Purdue now has a pilot test to assess students' critical thinking skills. Yet like many collegeteachers around the U.S., the faculty remain __29__ that their work as educators can be measured by "learning 30 _ " such as a graduate's ability to investigate and reason. However, the professors need not worry so much. The results of a recent experiment showed that professors can use __31__ metrics to measure how well students do in three key areas: critical thinking, written communication, and quantitative literacy.Despite the success of the experiment, the actual results are worrisome, and mostly __32__ earlier studies. The organizers of the experiment concluded that far fewer students were achieving high levels on critical thinking than they were doing for written communication or quantitative literacy. And that conclusion is based only on students nearing graduation.American universities, despite their global 33__ for excellence in teaching, have only begun to demonstrate what they can produce in real-world learning. Knowledge-based degrees are still important, but employers are still important, but employers are __34__ advanced thinking skills from college graduates. If the intellectual worth of a college degree can be __35__ measured, more people will seek higher education—and come out better thinkers.A. accuratelyB. confirmC. demandingD. doubtfulE. drasticallyF. justifyG. monopolizedH. outcomeI. predominanceJ. presumingK. reputationL. significantM. signifyN. simultaneouslyO. standardized答案：(26)L. significant(27)F. justify(28)E. drastically(29)D. doubtful(30)H. outcome(31)O. standardized(32)B. confirm(33)K. reputation(34)C. demanding(35)A. accurately2017年6月英语六级阅读真题及答案第1套仔细阅读2篇Open data sharers are still in the minority in many fields. Although many researchers broadlyagree that public access to raw data would accelerate science, most are reluctant to post the results of their own labors online.Some communities have agreed to share online—geneticists, for example, post DNA sequencesat the GenBank repository (库) , and astronomers are accustomed to accessing images of galaxies and stars from, say, the Sloan Digital Sky Survey, a telescope that has observed some500 million objects—but these remain the excepti on, not the rule. Historically, scientists have objected to sharing for many reasons: it is a lot of work; until recently, good databases did not exist; grant funders were not pushing for sharing; it has been difficult to agree on standardsfor formatting data; and there is no agreed way to assign credit for data.But the barriers are disappearing, in part because journals and funding agencies worldwide areencouraging scientists to make their data public. Last year, the Royal Society in London said inits report that scientists need to "shift away from a research culture where data is viewed as aprivate preserve". Funding agencies note that data paid for with public money should be publicinformation, and the scientific community is recognizing that data can now be shared digitallyin ways that were not possible before. To match the growing demand, services are springing upto make it easier to publish research products online and enable other researchers to discover and cite them.Although calls to share data often concentrate on the moral advantages of sharing, the practice is not purely altruistic (利他的). Researchers who share get plenty of personal benefits, including more connections with colleagues, improved visibility and increased citatio ns. The most successful sharers—those whose data are downloaded and cited the most often---get noticed, and their work gets used. For example, one of the most popular data sets onmultidisciplinary repository Dryad is about wood density around the world; it has beendownloaded 5,700 times. Co-author Amy Zanne thinks that users probably range from climate-change researchers wanting to estimate how muc h carbon is stored in biomass, to foresters looking for information on different grades of timber. "I'd much prefer to have my data used by the maximum number of people to ask their own questions," she says. "It's important to allow readers and reviewers to see exactly how you arrive at your results. Publishing data and code allows your science to be reproducible."Even people whose data are less popular can benefit. By making the effort to organize andlabel files so others can understand them, scientists become more organized and better disciplined themselves, thus avoiding confusion later on.46. What do many researchers generally accept?A. It is imperative to protect scientists' patents.B. Repositories are essential to scientific research.C. Open data sharing is most important to medical science.D. Open data sharing is conducive to scientific advancement.47. What is the attitude of most researchers towards making their own data public?A. Opposed.B. Ambiguous.C. Liberal.D. Neutral.48. According to the passage, what might hinder open data sharing?A. The fear of massive copying.B. The lack of a research culture.C. The belief that research data is private intellectual property.D. The concern that certain agencies may make a profit out of it.49. What helps lift some of the barriers to open data sharing?A. The ever-growing demand for big data.B. The advancement of digital technology.C. The changing attitude of journals and funders.D. The trend of social and economic development.50. Dryad serves as an example to show how open data sharing ________.A. is becoming increasingly popularB. benefits sharers and users alikeC. makes researchers successfulD. saves both money and laborPassage TwoQuestions 51 to 55 are based on the following passage.Macy's reported its sales plunged 5.2% in November and December at stores open more than a year, a disappointing holiday season performance that capped a difficult year for a department store chain facing wide-ranging challenges. Its flagship stores in major U.S. cities depend heavily on international tourist spending, which shrank at many retailers due to a strong dollar. Meanwhile, Macy's has simply struggled to lure consumers who are more interested in spending on travel or dining out than on new clothes or accessories.The company blamed much of the poor performance in November and December on unseasonably warm weather. "About 80% of our company's year-over-year declines in comparable sales can be attributed to shortfalls (短缺) in cold-weather goods," said chief executive Teny Lundgren in a press release. This prompted the company to cut its forecasts for the full fourth quarter.However, it's clear that Macy's believes its troubles run deeper than a temporary aberration (偏离) off the thermometer. The retail giant said the poor financial performance this year has pushed it to begin implementing $400 million in cost-cutting measures. The company pledged to cut 600 back-office positions, though some 150 workers in those roles would be reassigned to other jobs. It also plans to offer "voluntary separation" packages to 165 senior executives. It will slash staffing at its fleet of 770 stores, a move affecting some 3,000 employees.The retailer also announced the locations of 36 stores it will close in early 2016. The company had previously announced the planned closures, but had not said which locations would be affected. None of the chain's stores in the Washington metropolitan area are to be closed.Macy's has been moving aggressively to try to remake itself for a new era of shopping. It has plans to open more locations of Macy's Backstage, a newly-developed off-price concept which might help it better compete with ambitious T. J. Maxx. It's also pushing ahead in 2016 with an expansion of Bluemercury, the beauty chain it bought last year. At a time when young beauty shoppers are often turning to Sephora or Ulta instead of department store beauty counters, Macy's hopes Bluemercury will help strengthen its position in the category.One relative bright spot for Macy's during the holiday season was the online channel, where it rang up "double-digit" increases in sales and a 25% increase in the number of orders it filled. That relative strength would be consistent with what was seen in the wilder retail industry during the early part of the holiday season. While Thanksgiving, Black Friday and Cyber Monday all saw record spending online, in-store sales plunged over the holiday weekend.51. What does the author say about the shrinking spending of international tourists in the U.S.?A. It is attributable to the rising value of the U.S. dollar.B. It is a direct result of the global economic recession.C. It reflects a shift of their interest in consumer goods.D. It poses a potential threat to the retail business in the U.S.52. What does Macy's believe about its problems?A. They can be solved with better management.B. They cannot be attributed to weather only.C. They are not as serious in its online stores.D. They call for increased investments.53. In order to cut costs, Macy's decided to ________.A. cut the salary of senior executivesB. relocate some of its chain storesC. adjust its promotion strategiesD. reduce the size of its staff54. Why does Macy's plan to expand Bluemercury in 2016?A. To experiment on its new business concept.B. To focus more on beauty products than clothing.C. To promote sales of its products by lowering prices.D. To be more competitive in sales of beauty products.55. What can we learn about Macy's during the holiday season?A. Sales dropped sharply in its physical stores.B. Its retail sales exceeded those of T. J. Maxx.C. It helped Bluemercury establish its position worldwide.D. It filled its stores with abundant supply of merchandise.Passage one46.D47.A48.C49.C50.BPassage two51.A52.B53.D54.D55.A2017年6月英语六级阅读真题及答案第2套选词填空Half of your brain stays alert and prepared for danger when you sleep in a new place, a study has revealed. This phenomenon is often __26__ to as the "first-night-effect". Researchers from Brown University found that a network in the left hemisphere of the brain "remained more active" than the network in the right side of the brain. Playing sounds into the right ears (stimulating the left hemisphere) of __27__ was more likely to wake them up than if the noises were played into their left ear.It was __28__ observed that the left side of the brain was more active during deep sleep. When the researchers repeated the laboratory experiment on the second and third nights they found the left hemisphere could not be stimulated in the same way during deep sleep. The researchers explained that the study demonstrated when we are in a __29__ environment the brain partly remains alert so that humans can defend themselves against any __30__ danger.The researchers believe this is the first time that the "first-night-effect" of different brain states has been __31__ in humans. It isn't, however, the first time it has ever been seen. Some animal __32__ also display this phenomenon. For example, dolphins, as well as other __33__ animals, shut down one hemisphere of the brain when they go to sleep. A previous study noted that dolphins always __34__ control their breathing. Without keeping the brain active while sleeping, they would probably drown. But, as the human study suggest, another reason for dolphins keeping their eyes open during sleep is that they can look out for __35__ while asleep. It also keeps their physiological processes working.A.ClassifiedB. consciouslyC. dramaticallyD. exoticE. identifiedF. inherentG. marineH. novelI. potential J. predators K. referred L. species M. specifically N. varieties O. volunteers答案(26)K. referred(27)O. volunteers(28)M. specifically(29)H. novel(30)I. potential(31)E. identified(32)L. species(33)G. marine(34)B. consciously(35)J. predators2017年6月英语六级阅读真题及答案第2套仔细阅读2篇Passage One Questions 46 to 50 are based on the following passage.We live today indebted to McCardell, Cashin, Hawes, Wilkins, and Maxwell, and other women who liberated American fashion from the confines of Parisian design. Independence came in tying, wrapping, storing, harmonizing, and rationalizing that wardrobe. These designers established the modem dress code, letting playsuits and other active wear outfits suffice for casual clothing, allowing pants to enter the wardrobe, and prizing rationalism and versatility in dress, in contradiction to dressing for an occasion or allotment of the day. Fashion in America was logical and answerable to the will of the women who wore it. Implicitly or explicitly, American fashion addressed a democracy, whereas traditional Paris-based fashion was prescriptive and imposed on women, willing or not.In an earlier time, American fashion had also followed the dictates of Paris, or even copied and pirated specific French designs. Designer sportswear was not modeled on that of Europe, as "modem art" would later be; it was genuinely invented and developed in America. Its designers were not high-end with supplementary lines. The design objective and the business commitment were to sportswear, and the distinctive traits were problem-solving ingenuity and realistic lifestyle applications. Ease of care was most important: summer dresses and outfits, in particular, were chiefly cotton, readily capable of being washed and pressed at home. Closings were simple, practical, and accessible, as the modem woman depended on no personal maid to dress her. American designers prized resourcefulness and the freedom of women who wore the clothing.Many have argued that the women designers of this time were able to project their own clothing values into a new style. Of course, much of this argument in the 1930s-40s was advanced because there was little or no experience in justifying apparel (服装) on the basis of utility. If Pariswas cast aside, the tradition of beauty was also to some degree slighted. Designer sportswear would have to be verified by a standard other than that of pure beauty; the emulation of a designer's life in designer sportswear was a crude version of this relationship. The consumer was ultimately to be mentioned as well, especially by the likes of Dorothy Shaver, who could point to the sales figures at Lord & Taylor.Could utility alone justify the new ideas of the American designers? Fashion is often regarded as a pursuit of beauty, and some cherished fashion's trivial relationship to the fine arts. What the designers of the American sportswear proved was that fashion is a genuine design art, answering to the demanding needs of service. Of course these practical, insightful designers have determined the course of late twentieth-century fashion. They were the pioneers of gender equity, in their useful, adaptable clothing, which was both made for the masses and capable of self-expression.46. What contribution did the women designers make to American fashion?A. They made some improvements on the traditional Parisian design.B. They formulated a dress code with distinctive American features.C. They came up with a brand new set of design procedures.D. They made originality a top priority in their fashion design.47. What do we learn about American designer sportswear?A. It imitated the European model.B. It laid emphasis on women's beauty.C. It represented genuine American art.D. It was a completely new invention.48. What characterized American designer sportswear?A. Pursuit of beauty.B. Decorative closings.C. Ease of care.D. Fabric quality.49. What occurred in the design of women's apparel in America during the 1930s-40s?A. A shift of emphasis from beauty to utility.B. The emulation of traditional Parisian design.C. A search for balance between tradition and novelty.D. The involvement of more women in fashion design.50. What do we learn about designers of American sportswear?A. They catered to the taste of the younger generation.B. They radically changed people's concept of beauty.C. They advocated equity between men and women.D. They became rivals of their Parisian counterparts.Passage Two Questions 51 to 55 are based on the following passage.Massive rubbish dumps and sprawling landfills constitute one of the more uncomfortable impacts that humans have on wildlife. They have led some birds to give up on migration. Instead offlying thousands of miles in search of food, they make the waste sites their winter feeding grounds.Researchers in Germany used miniature GPS tags to track the migrations of 70 white storks(鹳) from different sites across Europe and Asi a during the first five months of their lives. While many birds travelled along well-known routes to warmer climates, others stopped short and spent the winter on landfills, feeding on food waste, and the multitudes of insects that thrive on the dumps.In the short-term, the birds seem to benefit from overwintering (过冬) on rubbish dumps. Andrea Flack of the Max Planck Institute found that birds following traditional migration routes were more likely to die than German storks that flew only as far as northern Morocco, and spent the winter there on rubbish dumps. "For the birds it's a very convenient way to get food. There are huge clusters of organic waste they can feed on," said Flack. The meals are not particularly appetising, or even safe. Much of the waste is discarded rotten meat, mixed inwith other human debris such as plastic bags and old toys."It's very risky. The birds can easily eat pieces of plastic or rubber bands and they can die," said Flack."And we don't know about the long-term consequences. They might eat something toxic and damage their health. We cannot estimate that yet."The scientists tracked white storks from different colonies in Europe and Africa. The Russian, Greek and Polish storks flew as far as South Africa, while those from Spain, Tunisia and Germany flew only as far as the Sahel.Landfill sites on the Iberian peninsula have long attracted local white storks, but all of theSpanish birds tagged in the study flew across the Sahara desert to the western Sahel. Writing inthe journal, the scientists describe how the storks from Germany were clearly affected by thepresence of waste sites, with four out of six birds that survived for at least five months over wintering on rubbish dumps in northern Morocco, instead of migrating to the Sahel.Flack said it was too early to know whether the benefits of plentiful food outweighed the risksof feeding on landfills. But that's not the only uncertainty. Migrating birds affect eco systems both at home and at their winter destinations, and disrupting the traditional routes could haveunexpected side effects. White storks feed on locusts (蝗虫) and other insects that can become pests if their numbers get out of hand. "They provide a useful service," said Flack.51. What is the impact of rubbish dumps on wildlife?A. They have forced white storks to search for safer winter shelters.B. They have seriously polluted the places where birds spend winter.C. They have accelerated the reproduction of some harmful insects.D. They have changed the previous migration habits of certain birds.52. What do we learn about birds following the traditional migration routes?A. They can multiply at an accelerating rate.B. They can better pull through the winter.C. They help humans kill harmful insects.D. They are more likely to be at risk of dying.53. What does Andrea Flack say about the birds overwintering on rubbish dumps?A. They may end up staying there permanently.B. They may eat something harmful.C. They may evolve new feeding habits.D. They may have trouble getting adequate food.54. What can be inferred about the Spanish birds tagged in the study?A. They gradually lose the habit of migrating in winter.B. They prefer rubbish dumps far away to those at home.C. They are not attracted to the rubbish dumps on their migration routes.D. They join the storks from Germany on rubbish dumps in Morocco.55. What is scientists' other concern about white storks feeding on landfills?A. The potential harm to the ecosystem.B. The genetic change in the stork species.C. The spread of epidemics to their homeland.D. The damaging effect on bio-diversity.Passage one46.B47.D48.C49.A50.CPassage two51.D52.D53.B54.C55.A2017年6月英语六级阅读真题及答案第3套选词填空Let's all stop judging people who talk to themselves. New research says that those who can't seem to keep their inner monologues (独白) in are actually more likely to stay on task, remain __26__ better and show improved perception capabilities. Not bad, really, for some extra muttering.According to a series of experiments published in theQuarterly Journal of Experimental Psychology by professors Gary Lupyan and Daniel Swignley, the act of using verbal clues to __27__ mental pictures helps people function quicker.In one experiment, they showed pictures of various objects to twenty __28__ and asked themto find just one of those, a banana. Half were __29__ to repeat out loud what they were lookingfor and the other half kept their lips __30__. Those who talked to themselves found the banana slightly faster than those who didn't, the researchers say. In other experiments, Lupyan andSwignley found that __31__ the name of a common product when on the hunt for it helpedquicken someone's pace, but talking about uncommon items showed no advantage and slowed you down.Common research has long held that talking themselves through a task helps children learn, although doing so when you've __32__ matured is not a great sign of __33__. The two professors hope to refute that idea, __34__ that just as when kids walk themselves through a process, adults can benefit from using language not just to communicate, but also to help"augment thinking".Of course, you are still encouraged to keep the talking at library tones and, whatever you do, keep the information you share simple, like a g rocery list. At any __35__, there's still such a thing as too much information.A. apparentlyB. arroganceC. brillianceD. claimingE. dedicatedF. focusedG. incurH. instructedI. obscurelyJ. sealedK. spectatorsL. triggerM. utteringN. volumeO. volunteers(26)F. focused(27)L. trigger(28)O. volunteers(29)H. instructed(30)J. sealed(31)M. uttering(32)A. apparently(33)C. brilliance(34)D. claiming(35)N. volume2017年6月英语六级阅读真题及答案第3套仔细阅读2篇Passage One Questions 46 to 50 are based on the following passage.Tennessee's technical and community colleges will not outsource (外包) management of their facilities to a private company, a decision one leader said was bolstered by an analysis of spending at each campus.In an email sent Monday to college presidents in the Tennessee Board of Regents system, outgoing Chancellor John Morgan said an internal analysis showed that each campus' spending on facilities management fell well below the industry standards identified by the state. Morgan said those findings—which included data from the system's 13 community colleges, 27 technical colleges and six universities—were part of the decision not to move forward with Governor Bill Haslam's proposal to privatize management of state buildings in an effort to save money."While these numbers are still being validated by the state, we feel any adjustments they might suggest will be immaterial," Morgan wrote to the presidents. "System institutions are operating very efficiently based on this analysis, raising the question of the value of pursuing a broad scale outsourcing initiative."Worker's advocates have criticized Haslam's plan, saying it would mean some campus workers would lose their jobs or benefits. Haslam has said colleges would be free to opt in or out of the out souring plan, which has not been finalized.Morgan notified the Haslam administration of his decision to opt out in a letter sent last week. That letter, which includes several concerns Morgan has with the plan, was originally obtained by The Commercial Appeal in Memphis.In an email statement from the state's Office of Customer Focused Government, which is examining the possibility of outsourcing, spokeswoman Michelle R. Martin said officials were still working to analyze the data from the Board of Regents. Data on management expenses at the college system and in other state departments will be part of a "business justification" the state will use as officials deliberate the specifics of an outsourcing plan."The state's facilities management project team is still in the process of developing its business justification and expects to have that completed and available to the public at the end of February," Martin said. "At this time there is nothing to take action on since the analysis has yet to be completed."Morgan's comments on outsourcing mark the second time this month that he has come out against one of Haslam's plans for higher education in Tennessee. Morgan said last week that he would retire at the end of January because of the governor's proposal to split off six universities of the Board of Regents system and create separate governing boards for each of them. In his resignation letter, Morgan called the reorganization "unworkable".46. What do we learn about the decision of technical and community colleges in Tennessee?A. It is backed by a campus spending analysis.B. It has been flatly rejected by the governor.C. It has neglected their faculty's demands.D. It will improve their financial situation.47. What does the campus spending analysis reveal?A. Private companies play a big role in campus management.B. Facilities management by colleges is more cost-effective.C. Facilities management has greatly improved in recent years.D. Colleges exercise foil control over their own financial affairs.48. Workers' supporters argue that Bill Haslam's proposal would _________.A. deprive colleges of the right to manage their facilitiesB. make workers less motivated in performing dutiesC. render a number of campus workers joblessD. lead to the privatization of campus facilities49. What do we learn from the state spokeswoman's response to John Morgan's decision?A. The outsourcing plan is not yet finalized.B. The outsourcing plan will be implemented.C. The state officials are confident about the outsourcing plan.D. The college spending analysis justifies the outsourcing plan.50. Why did John Morgan decide to resign?A. He had lost confidence in the Tennessee state government.B. He disagreed with the governor on higher education policies.C. He thought the state's outsourcing proposal was simply unworkable.D. He opposed the governor's plan to reconstruct the college board system.Passage Two Questions 51 to 55 are based on the following passage.Beginning in the late sixteenth century, it became fashionable for young aristocrats to visit Paris, Venice, Florence, and above all, Rome, as the culmination (终极) of their classical education. Thus was born the idea of the Grand Tour, a practice which introduced Englishmen, Germans, Scandinavians, and also Americans to the art and culture of France and Italy for the next 300 years. Travel was arduous and costly throughout the period, possible only for a privileged class—the same that produced gentlemen scientists, authors, antique experts, and patrons of the arts.The Grand Tourist was typically a young man with a thorough grounding in Greek and Latin literature as well as some leisure time, some means, and some interest in art. The German traveler Johann Winckelmann pioneered the field of art history with his comprehensive study of Greek and Roman sculpture; he was portrayed by his friend Anton Raphael Mengs at the beginning of his long residence in Rome. Most Grand Tourists, however, stayed for briefer periods and set out with less scholarly intentions, accompanied by a teacher or guardian, and expected to return home with souvenirs of their travels as well as an understanding of art and architecture formed by exposure to great masterpieces.London was a frequent starting point for Grand Tourists, and Paris a compulsory destination; many traveled to the Netherlands, some to Switzerland and Germany, and a very few adventurers to Spain, Greece, or Turkey. The essential place to visit, however, was Italy. The British traveler Charles Thompson spoke for many Grand Tourists when in 1744 he described himself as "being impatiently desirous of viewing a country so famous in history, a country which once gave laws to the world, and which is at present the greatest school of music and painting, contains the noblest productions of sculpture and architecture, and is filled with cabinets of rarities, and collections of all kinds of historical relics". Within Italy, the great focus was Rome, whose ancient ruins and more recent achievements were shown to every Grand Tourist. Panini's Ancient Rome and Modem Rome represent the sights most prized, including celebrated Greco-Roman statues and views of famous ruins, fountains, and churches. Since there were few museums anywhere in Europe before the close of the eighteenth century, Grand Tourists often saw paintings and sculptures by gaining admission to private collections, and many were eager to acquire examples of Greco-Roman and Italian art for their own collections. In England, where architecture was increasingly seen as an aristocratic pursuit, noblemen often applied what they learned from the villas of Palladio in the Veneto and the evocative (唤起回忆的) ruins of Rome to their own country houses and gardens.51. What is said about the Grand Tour?A. It was fashionable among young people of the time.B. It was unaffordable for ordinary people.C. It produced some famous European artists.D. It made a compulsory part of college education.52. What did Grand Tourists have in common?。

针对军事集群目标的YOLOv3改进算法研究

收稿日期：2020-04-27修回日期：2020-06-06基金项目：国防基础科研基金资助项目（JCKY2017208B018）作者简介：张奔（1992-），男，山西兴县人，硕士。

研究方向：专业系统工程。

*摘要：YOLOv3目标检测模型对于巡飞弹作战中的军事集群目标存在可能漏检紧邻目标的问题。

改进算法以YOLOv3为基础，对其候选框选择算法采用的非极大值抑制（NMS ）引入惩罚函数，实现soft-NMS ，从而减少紧邻目标识别边框被误删的概率。

同时针对军事目标数据稀缺的情况，对数据的预处理采用k-fold 交叉验证策略，抑制过拟合现象，充分训练模型。

实验结果表明，改进算法后对集群目标的检测效果要好于原YOLOv3，其准确率提高了3.14%，召回率提高了17.58%，符合巡飞弹作战中对目标检测精度指标的要求。

关键词：巡飞弹，军事集群目标，YOLOv3，soft-NMS ，数据预处理中图分类号：TJ013；E91文献标识码：ADOI ：10.3969/j.issn.1002-0640.2021.05.015引用格式：张奔，徐锋，李晓婷，等.针对军事集群目标的YOLOv3改进算法研究［J ］.火力与指挥控制，2021，46（5）：81-85.针对军事集群目标的YOLOv3改进算法研究*张奔，徐锋，李晓婷，赵彦东（北方自动控制技术研究所，太原030006）Research on the Improved YOLOv3Algorithmfor Military Cluster TargetsZHANG Ben ，XU Feng ，LI Xiao-ting ，ZHAO Yan-dong （North Automatic Control Technology Institute,Taiyuan 030006,China ）Abstract ：The YOLOv3target detection model may miss detecting the adjacent targets for militarycluster targets in loitering munitions patrol operations.Based on YOLOv3,a penalty function is introduced into the non-maximum suppression （NMS ）of the candidate frame selection algorithm to realice soft-nms,so as to reduce the probability that the adjacent target recognition frame is deleted by mistakes.At the same time ，in view of the scarcity of military target data ，k-fold cross-validation strategy is adopted for data preprocessing to suppress the over-fitting phenomenon and to fully train the model.Experimental results show that the recognition effect with the improved algorithm is better than that of the original YOLOv3.Its precision and recall are increased by 3.14%and 17.58%respectively,which is in line with the requirements of target detection precision index in loitering munition operations.Key words ：loitering munitions ，military cluster target ，YOLOv3，soft-NMS ，data preprocessing Citation format ：ZHANG B ，XU F ，LI X T ，et al.Research on the improved YOLOv3algorithm for military cluster targets ［J ］.Fire Control &Command Control ，2021，46（5）：81-85.0引言巡飞弹是一种能够执行智能组网、编队飞行、侦察及打击任务的无人作战飞行器［1］。

MODEL BASED FUSION FOR MULTISENSOR TARGET RECOGNITION

1 Introduction
In this work, we present a new methodology for multisensor feature-based recognition called Automatic Multisensor Feature-based Recognition System (AMFRS). The goal of this system is to increase accuracy of the commonly used wavelet-based recognition techniques by incorporating symbolic knowledge (symbolic features) about the domain and by utilizing model-theory based fusion for multisensor feature selection. By de nition, fusion is a process of combining (fusing) information from di erent sensors when there is no physical (fusing) law indicating the correct way to combine this information. A fusion problem can be de ned in terms of nding such a fusing law. Current solutions to the fusion problem consist of numerous clever, problem-speci c algorithms 4, 7, 9]. It is not clear what unifying theory exists behind all the fusion algorithms which would guide algorithm development. The lack of an unifying theory results in a

深度学习在焊缝缺陷检测的应用研究综述

基金项目：国家自然科学基金项目（编号：61705045）；广州市科技计划现代产业技术专题项目（编号：201802010021）；佛山广工大研究院创新创业人才团队计划项目收稿日期：2020－09－04深度学习在焊缝缺陷检测的应用研究综述*王靖然1，王桂棠1，2，杨波3，王志刚3，符秦沈1，杨圳1（1.广东工业大学机电工程学院，广州510006；2.佛山沧科智能科技有限公司，广东佛山528311；3.广州特种承压设备检测研究院，广州510663）摘要：焊缝缺陷的检测在石油化工等领域是极其关键的环节，焊接质量的好坏直接影响到结构的使用性能。

对于X 射线焊缝图像评定，目前采用的人工评片受到多种主观因素的影响，导致漏检或错检情况相对较高。

近年来，随着工业智能检测技术的发展，深度学习在图像特征学习中的独特优势使其在缺陷自动检测中具备重要的实用价值。

综述了以神经网络技术为代表的深度学习模型在焊缝缺陷检测方面的研究进展，详细分析了基于卷积神经网络和Faster R-CNN 网络的工业设备焊缝缺陷自动检测的理论模型及其优缺点，并对焊缝缺陷自动检测技术的发展进行了展望。

关键词：焊缝缺陷；卷积神经网络；Faster R-CNN 网络；自动检测中图分类号：TG409文献标志码：A文章编号：1009－9492(2021)03－0065－04开放科学（资源服务）标识码（OSID ）：Summary of Research on Application of Deep Learning in Weld Defect DetectionWang Jingran 1，Wang Guitang 1，2，Yang Bo 3，Wang Zhigang 3，Fu Qinshen 1，Yang Zhen 1（1.School of Electromechanical Engineering,Guangdong University of Technology,Guangzhou 510006,China;2.Foshan Cangke Intelligent Technology Co.,Ltd.,Foshan,Guangdong 528311,China;3.Guangzhou Special Pressure Equipment Inspection and Research Institute,Guangzhou 510663,China ）Abstract:The detection of weld defects is an extremely critical link in the petrochemical industry and other fields.The quality of the weld directly affects theperformance of the structure.For the evaluation of X-ray weld image,the currently used manual evaluation is affected by a variety of subjective factors,resulting in a relatively high rate of missed or wrong inspections.In recent years,with the development of industrial intelligent detection technology,the unique advantages of deep learning in image feature learning make it have important practical value in automatic defect detection.The research progress of deep learning modelrepresented by neural network technology in weld defect detection was summarized,and the theoretical models and their advantages and disadvantages of weld defect automatic detection based on convolutional neural network and Faster R-CNN network were analyzed in detail,and the development of automatic detection in weld defects was prospected.Key words:weld defect;convolutional neural network;Faster R-CNN;automatic detection第50卷第03期Vol.50No.03机电工程技术MECHANICAL &ELECTRICAL ENGINEERING TECHNOLOGYDOI:10.3969/j.issn.1009-9492.2021.03.012王靖然，王桂棠，杨波，等.深度学习在焊缝缺陷检测的应用研究综述［J ］.机电工程技术，2021，50（03）：65-68.0引言随着我国工业化程度的不断提高，焊接技术已广泛应用到承压容器、冶金工业、石油化工等各个领域，工业设备焊接质量的好坏直接影响焊接结构的使用性能和寿命。

单兵智能头盔在消防救援中的探索与应用

第43卷第12期包装工程2022年6月 PACKAGING ENGINEERING1收稿日期：2022–01–12基金项目：国家社科艺术基金（20BG103）作者简介：彭坚（1986—），男，博士，助理教授，主要研究方向为单兵作战、装备复杂系统可诊断性设计与评价等。

通信作者：赵丹华（1982—），女，教授，博士生导师，主要研究方向为设计研究的范式建构、交通工具设计。

彭坚，王雪鹏，赵丹华，李博雅（湖南大学设计艺术学院，长沙 410082）摘要：目的在消防装备于高层建筑、地下建筑、大型商业综合体等复杂救援环境中无法满足信息感知和状态感知需求的背景下，探索和分析智能头盔当前在应急救援、军队作战、采矿安全等领域的应用研究现状，以期解决当前消防救援活动中环境视野差、协同效率低、救援人员健康状态无法保证等问题，提高消防救援装备的功能性和保障性，避免救援活动中可能发生的安全事故。

方法人工智能、多设备协作、多模态感知等概念是智慧消防新的发展结合点，以消防救援装备的智能化、集成化趋势为基础，提出了利用增强现实技术满足视觉增强、信息协同、物体识别等信息感知需求，利用脑电监测技术满足人员健康状态监测、疲劳预警等状态感知需求。

结论在城市快速发展的背景下，单兵智能头盔在消防救援领域的应用具有可靠性高、功能扩展性强、任务辅助效率高等优势，但同时面临功耗大、重量大、设备交互研究不足等挑战；适用于消防救援场景的单兵智能头盔有动态舒适性多目标优化、多通道类人感知和意图协同交互、系统故障诊断及容错控制等方面的设计研究趋势。

关键词：消防救援；智能装备；单兵头盔；增强现实；脑电信号监测中图分类号：TP29；TB472 文献标识码：A 文章编号：1001-3563(2022)12-0001-14 DOI ：10.19554/ki.1001-3563.2022.12.001Exploration and Application of Individual Soldier Intelligent Helmet in Fire RescuePENG Jian , WANG Xue-peng , ZHAO Dan-hua , LI Bo-ya(School of Design, Hunan University, Changsha 410082, China)ABSTRACT: In the context of firefighting equipment in high-rise buildings, underground buildings, large commercial complexes and other complex rescue environments cannot meet the needs of information perception and state perception, explore and analyze the current application and research status of intelligent helmets in emergency rescue, military opera-tions, mining safety and other fields, with a view to solving the problems of poor environmental vision, low efficiency of coordination, inability to guarantee the health status of rescuers, etc. in the current fire rescue activities to improve the functionality and supportability of fire rescue equipment and avoid possible safety accidents in rescue activities. Artificial intelligence, multi-device collaboration, multi-modal perception and other concepts as the new development combination point of intelligent firefighting, based on the trend of intelligence and integration of fire rescue equipment, put forward the use of augmented reality technology to meet the visual enhancement, information collaboration, object recognition and other information perception needs, and the use of EEG monitoring technology to meet the personnel health state moni-toring, fatigue warning and other state perception needs. In the context of rapid urban development, the application of in-dividual soldier intelligent helmets in the field of fire rescue has the advantages of high reliability, great functional scal-ability, and high efficiency of task assistance, but at the same time with challenges such as high power consumption, high weight, and insufficient research on device interaction; individual soldier intelligent helmets suitable for fire rescue sce-narios have design research trends in dynamic comfort multi-objective optimization, multi-channel human-like perception and intent cooperative interaction, system fault diagnosis and fault-tolerant control.KEY WORDS: fire rescue; intelligent equipment; individual soldier helmet; augmented reality; EEG signal monitoring. All Rights Reserved.2 包装工程 2022年6月随着工业化、城市化的发展，新材料、新技术的应用，消防环境越来越复杂，消防队员出警的频率呈上升趋势，同时消防救援人员所面对的任务和场景难度不断增大，对消防队员的生命安全构成了极大的威胁[1]。

P.Sayd, “Monocular vision based slam for mobile robots

Monocular Vision Based SLAM for Mobile RobotsE.Mouragnon,M.Lhuillier,M.Dhome,F.Dekeyser,P.SaydLASMEA UMR6602,Universit´e Blaise Pascal/CNRS,63177Aubi`e re Cedex,France Image and embedded computer lab.,CEA/LIST/DTSI/SARC,91191Gif s/Yvette Cedex,FranceAbstractThis paper describes a new vision based method for the Simultaneous Localization and Mapping of mobile robots. The only data used is a video input from a moving cali-brated monocular camera.From the detection and match-ing of interest points in images at video rate,robust esti-mates of the camera poses are computed in real-time and a 3D map of the environment is reconstructed.The computed 3D structure is constantly reﬁned thanks to the introduction of a fast and local bundle adjustment method that makes this approach particularly accurate and reliable.Actually,this method can be seen as a new visual tool that may be used in conjunction with usual systems(GPS,inertia sensors,etc) in SLAM applications.1.IntroductionSimultaneous Localization and Mapping(SLAM)is an essential capability for mobile robots exploring unknown environments,using very different sensors or sources of information(Figure1).Recently,many works were car-ried out on SLAM,and this paper focus on a vision based method that only uses data from a moving monocular cam-era.The robust and automatic estimate of the movement of a perspective camera(calibrated or not)and observed points,from a sequence of images have been largely studied [13,17,8,1,4,12,9,15].In Vision community,the prob-lem is called SFM,for Structure From Motion and is the subject of many works.Initially,interest points are detected and matched between successive frames.Then,robust esti-mates of relative movement are made with random samples, and a model of the environment is reconstructed in three di-mensions.One can note two main types of approaches for SFM al-gorithms.First,there are off-line methods[13,17,4,12, 9,15]carrying out a bundle adjustment optimization of the global geometry.Bundle adjustment[19]is a process which adjusts iteratively the pose of cameras as well as points po-sition in order to obtain the minimal reprojection error(due to the difference between points detected in the images and the reprojections of3D points through the cameras).Most articles refer to Levenberg-Marquardt(LM)to solve the non linear criterion involved in bundle adjustment,a method which combines the Gauss-Newton algorithm and the de-scent of gradient.In that case,a very accurate model is generated but it is very expensive in terms of computing time because of the resolution of linear systems(whose size is proportional to the number of estimated parameters)and can not be implemented in a real time application.On the other hand,there are methods without global optimization. They are really fast but their accuracy is questionable since errors accumulate in time.Among those works,Nist´e r[11] presents a method called“visual odometry”.This method estimates the movement of a stereo head or a simple camera in real time from the only visual data:the aim is to guide robots.Davison[2]proposes a real time camera pose cal-culation but he assumes that number of landmarks is small (under about100landmarks).This approach best suits to indoor environments and is not appropriate for long dis-placements because of algorithmic complexity and growing uncertainty.In this paper,we propose a complete method from the ac-quisition of images with the camera,to an estimate of the current position and uncertainty,and a3D map of the en-vironment.The method takes beneﬁt from bundle adjust-ment methods accuracy against Kalmanﬁlters[2,16],and from speed of incremental methods[11,20,18].This has been possible with the introduction of a fast and local bun-dle adjustment process which is carried out each time a new camera is added to the system.The paper is organized as follows.First,we explain in details our complete method to compute camera motion and3D structure from a video ﬂow.We explain our incremental method with local bundle adjustment:we propose to only optimize the end of the3D structure with a set of parameters restricted to the last cam-eras and3D points observed by these cameras.In a second part,we present experiments and results on real data,and we compare to GPS ground truth.Figure1.Monocular Vision among other sensors.a)Anexample of image and points tracks.b)Top view of the realtime localization.We can see3D reconstructed points andthe ellipsoid of conﬁdence for the current camera pose.2.Description of the incremental algorithmLet us consider a video sequence acquired with a cam-era settled on a vehicle moving in an unknown environment. The goal of this work is toﬁnd the position and the orien-tation in a global reference frame of the camera at several times as well as the3D position of a set of points(viewed along the scene).We use a monocular camera whose intrin-sic parameters(including radial distortion)are known and assumed to be unchanged throughout the sequence.The algorithm begins with determining aﬁrst triplet of im-ages that will be used to set up the global frame and the system geometry.After that,a robust pose calculation is carried out for each frame of the videoﬂow using features detection and matching.Some of the frames are selected and become key-frames that are used for3D points trian-gulation.The system operates in an incremental way,and when a new key-frame and3D points are added,we proceed to a local bundle adjustment.The output is the current po-sition of the camera and its uncertainty and theﬁnal result (see Figure3)is a complete trajectory and the3D coordi-nates of points seen in images.Interest points detection and matching The whole method is based on the detection and matching of features points(Figure1a.).In each frame,Harris corners[7]are detected and matched with points detected in last key frame by computing a Zero Normalized Cross Correlation score in a region of interest.The pairs with the high-scores are se-lected to provide a list of corresponding point pairs between the two images.The step“detection and matching”of the method has been efﬁciently implemented using SIMD ex-tensions of modern processors.Real-time robust pose estimation The sequence initial-ization and global coordinate system have been set up usingthe5-points algorithm[10]and a RANSAC[5]approach ona sub-sample of three frames(among other possibilities).Now,let us suppose that pose of cameras to corresponding to selected key-frames to have previ-ously been calculated in the reference reconstruction frame.We have also found a set of points whose projections are in the corresponding images.The goal is to calculate camerapose corresponding to the last acquired frame.For that,we match(last acquired frame)and(last selected key frame)to determine a set of points whose projectionson the cameras are known and whose3D coordinates have been computed before.From3D pointsreconstructed from and,we use Grunert’s poseestimation algorithm as described in[6]to compute the lo-cation of camera.A RANSAC process gives an initial estimate of camera pose which is then reﬁned using a fast LM optimization stage with only6parameters(3for optical center position and3for orientation).At this stage,the co-variance matrix of camera pose is calculated and we are able to draw an ellipsoid of conﬁdence at90%(see Figure 1b.).If is the covariance matrix of camera pose,the ellipsoide of conﬁdence is given bysince obeys a distribution with3dof. Key frames selection and3D points reconstruction The motion between two frames must be sufﬁciently large to ac-curately compute the3D positions of matched points.So, not all the frames of the input are taken into account for the 3D reconstruction,but only a sub-sample of the video.We select frames relatively far from each other but that have enough common points.For each frame,the normal way is to compute the corresponding localization using the last two key frames.We set up a criterion that indicates if a new frame must be added as a key frame or not.First,if the number of matched points with the last key frame is not sufﬁcient(typically inferior to aﬁxed level,in experiments),we have to introduce a new key-frame.Wehave also to take a new key frame if the the uncertainty of the calculated position is too high(for example,superior to the mean inter-distance between two consecutive key posi-tions).Obviously,it is not the frame for which criterion is refused that becomes a key frame but the one which imme-diately precedes.After that,new points(ie.those which are only observed in,and)are reconstructed using a standard triangulation method.Local bundle adjustment When the last key frame isselected and added to the system,a stage of optimization is carried out.It is a bundle adjustment or Levenberg-Marquardt minimization of the cost functionwhere and are respectively the cameras parameters(extrinsic parameters)and3D points chosen for this stage .The idea is to reduce the number of calculated param-eters in optimizing only the extrinsic parameters of thelast cameras and taking account of the2D reprojections in the(with)last frames(see Figure2).Thus,and contains all the3D points pro-jected on cameras.Cost function is the sum of points reprojection errors in the last frames to:where is the square of Euclidean distance between,estimated projection of point through the camera and the measured corresponding ob-servation.is the projection matrix of camera composed of extrinsic parameters and known intrinsic parameters.Thus,(number of optimized cameras at each stage) and(number of images taken into account in the repro-jection function)are the2main parameters involved in the optimization process.We must have toﬁx the re-construction frame and the scale factor at the sequence end, and we have found that or and are sufﬁcient values inpractice.Figure2.Local bundle adjustment when camera isadded.Only surrounded points and cameras are optimized.Nevertheless,we take account of3D points reprojections inthe last images.3.Experiments on real dataWe applied our incremental localization and mapping al-gorithm to a semi-urban scene and to a down-town scene. Here,the goal is to evaluate the robustness to perturbations in a complex environment,and the accuracy compared to ground truth provided by a Real Time Kinematics Differen-tial GPS(whose precision is about one inch in the horizontal plane).Hardware settings In our experiments,the camera was settled on an experimental vehicle and Image size is.We used a standard Linux PC(Pentium4at 2.8GHz,1Go of RAM memory at800MHZ)for the recon-struction process.Semi urban scene:comparison with GPS ground truth Speed and trajectory length:atVideo:long at()Reconstruction:key positions and more than3D points.This sequence is particularly interesting because of images contents(people walking in front of the camera,sunshine, etc...)that does not favor the reconstruction process.More-over,the environment is more appropriate to a GPS local-ization because the satellites in the sky are not occulted by high buildings.It is also interesting because of the trajec-tory:a turn on the right,two turns on the left and a straight line.Time measured includes feature detection(Har-ris points per frame),matching,and pose calculation for all frames.For key frames,treatment time is longer because of points3D reconstruction and local bundle adjustment.We can note that speed results are very interesting with an aver-age of for normal frames and for key frames (let us notice that time between two frames is at ).Results are reported in table1.Frames Max Time Mean Time TotalNon-key frames0.140.0930.69Key frames0.430.2826.29 putation times in.The calculated trajectory obtained with our algorithm has been compared to data given by a GPS sensor.For the comparison,we applied a rigid transformation(rotation, translation and scale factor)to the trajectory as described in[3]toﬁt with GPS reference data.Figure4shows tra-jectory registration with GPS reference.As GPS positions are given in a metric frame we can compare camera lo-cations and measure3D positioning error in meters.The maximum measured error is with a3D mean error of and a2D mean error of less thanin the horizontal plane.Very long urban sceneSpeed and trajectory length:atVideo:long at()Reconstruction:key positions and3D points.In Figure5,we can see some frames from the video,a clas-sical map of the down-town,and the3D map resulting fromFigure 3.2frames from real data experiments and atop view of the reconstructed scene and trajectory (#4.000points and 94keypositions).Figure 4.Registration with GPS reference,top:in hori-zontal plane,bottom:on altitude axis.Continuous line rep-resents GPS trajectory and points represent estimated keypositions.Coordinates are expressed in.our 3D reconstruction and localization method.The video has been acquired in real urban conditions and the trajectoryis nearly a loop;it is not a complete loop because of tech-nical reasons.One can visually ensure that reconstruction is not much deformed and drift is very low compared to the covered distance.That shows that our algorithm,very ap-propriate to long scene reconstruction in term of computing time is also quite precise and robust.The estimated mean 3D position error compared to results obtained with a clas-sical global bundle adjustment method is less than.Figure 5.The long urban sequence a)left:some framesfrom the sequence,b)right:a map of the city with the trajectory in blue,the reconstruction result (trajectory and 18.4033D points).Indoor sequences Here,we present 3D reconstructionand trajectory results obtained for 2indoor sequences.The ﬁrst one has been acquired with a camera settled on a trav-eling tripod.The trajectory is a complete loop and the start point is the same as the ﬁnal point.For the second sequence,the camera was freely handled.4.ConclusionWe presented a fast and accurate method to estimate a vehicle motion using a calibrated monocular camera.TheFigure6.The2indoor reconstructions.a)top:someframes from the camera,b)middle:the complete loop,c)thefreely handled camera sequence.The line represents thetrajectory and points are3D reconstructed points.method also gives a3D reconstruction of the environment, and the model is built with3D points reconstructed from in-terest points extracted in images and matched at video rate. Results are very encouraging in term of accuracy.We plan to apply the method to an“automatic convoy of vehicles”. Theﬁrst vehicle is guided manually and makes a3D map of the scene.It sends data to others vehicles that are able to navigate autonomously,and take the same path.More generally,the method can be adapted to many applications in robotics and many otherﬁelds where a3D localization is needed.References[1]“Boujou,”2d3Ltd,,2000.[2] A.J.Davison.Real-Time Simultaneous Localization andMapping with a Single Camera.ICCV’03.[3]O.D.Faugeras and M.Hebert.The representation,recogni-tion,and locating of3-D objects.IJRR,5(3):27-52,1986. [4]O.D.Faugeras and Q.T.Luong.The Geometry of Multiple Im-ages,The MIT Press,2001.[5]M.Fischler and R.Bolles.Random Sample Consensus:aParadigm for Model Fitting with Application to Image Anal-ysis ans Automated Cartography.GIP,24(6):381-395,1981 [6]R.M.Haralick,C.N.Lee,K.Ottenberg and M.Nolle.Reviewand analysis of solutions of the three point perspective pose estimation problem.IJCV,13(3):331-356,1994.[7] C.Harris,M.Stephens.A Combined Corner and Edge Detec-tor.Alvey Vision Conference,pp.147-151,1988.[8]R.Hartley and A.Zisserman.Multiple View Geometry inComputer Vision,Cambridge University Press,2000.[9]M.Lhuillier and Long Quan.A Quasi-Dense Approachto Surface Reconstruction from Uncalibrated Images.IEEE TPAMI,27(3):418-433,2005.[10] D.Nister.An efﬁcient solution to theﬁve-point relative poseproblem.CVPR’03.[11] D.Nister,O.Naroditsky and J.Bergen.Visual Odometry.CVPR’04.[12] D.Nister.Automatic Dense Reconstruction from Uncali-brated Video Sequences,PhD Thesis,Ericsson and University of Stockholms,2001.[13]M.Pollefeys,R.Koch and L.Van Gool.Self-Calibration andMetric Reconstruction in spite of Varying and Unknown In-ternal Camera Parameters.ICCV’98.[14]W.H.Press,S.A.Teukolsky,W.T.Vetterling,B.P.Flannery.”Numerical Recipes in C:The Art of Scientiﬁc Computing”, Cambridge University Press,1992.[15] E.Royer,M.Lhuillier,M.Dhome and T.Chateau.Localiza-tion in urban environments:monocular vision compared to a differential GPS sensor.CVPR’05.[16]S.Se,D.Lowe and J.Little.Mobile Robot Localization andMapping with Uncertainty using Scale-Invariant Visual Land-marks.IJRR,volume21,no.8,pp.735-758,2002.[17]H.Shum,Q.KE,and Z.Zhang.Efﬁcient bundle adjustmentwith virtual key frames:A hierarchical approach to multi-frame structure from motion.CVPR’99.[18] D.Steedly and I.A.Essa.Propagation of Innovative Infor-mation in Non-Linear Least-Squares Structure from Motion.ICCV’01.[19] B.Triggs,P.F.McLauchlan,R.I.Hartley and A.W.Fitzibbon.Bundle adjustment-A modern synthesis.LNCS volume1883 ,pp.298-375,Springer Verlag,2000.[20]Z.Zhang and Y.Shan.Incremental Motion Estimationthrough Modiﬁed Bundle Adjustment.ICIP’03.。

计算机视觉基本原理英文文献

计算机视觉基本原理英文文献Basic Principles of Computer VisionComputer vision is an interdisciplinary field of study that aims to enable computers to interpret and understand visual information from the world around us. This field has made tremendous advancements in recent years and has found applications in many industries, including healthcare, autonomous vehicles, and security systems. In this article, we will explore the basic principles of computer vision.1. Image AcquisitionThe first step in computer vision is to acquire an image, either through a camera or another imaging device. The image can be in various forms, including a two-dimensional image, a video sequence, or even a three-dimensional point cloud.2. Image PreprocessingOnce we have acquired the image, we need to preprocess it to make it suitable for analysis. This process involves removing any unwanted noise, enhancing contrast, and adjusting the brightness and color levels.3. Feature ExtractionThe next step in computer vision is to extract features from the preprocessed image. These features are specificcharacteristics of the image that can be used to identify or classify objects. Feature extraction can be done using various techniques such as edge detection, corner detection, and blob detection.4. Object DetectionAfter extracting features, we can use them to detect objects in the image. Object detection involves defining regions of interest and then using machine learning algorithms to detect objects within those regions. This process can be further optimized using techniques such as sliding windows andregion-based convolutional neural networks.5. Object RecognitionOnce we have detected objects in the image, we can use machine learning algorithms to recognize and classify them. Object recognition involves training a model on a set of labeled data to recognize objects based on their features.6. Image SegmentationImage segmentation is the process of dividing an image into multiple segments or regions based on their characteristics. This process is essential for applications such as object tracking and image understanding.7. Feature MatchingFeature matching is the process of finding correspondingfeatures between different images. This technique is used in applications such as image stitching and object tracking.In conclusion, computer vision is a vast field that involves various techniques and algorithms. The basic principles discussed in this article provide a foundation for understanding how computers interpret and understand visual information. As technology continues to advance, we can expect computer vision to play an increasingly vital role in many industries.。

基于多模态特征融合的井下人员不安全行为识别

基于多模态特征融合的井下人员不安全行为识别王宇1，于春华2，陈晓青1，宋家威1（1. 辽宁科技大学矿业工程学院，辽宁鞍山　114051；2. 凌钢股份北票保国铁矿有限公司，辽宁朝阳　122102）摘要：采用人工智能技术对井下人员的行为进行实时识别，对保证矿井安全生产具有重要意义。

针对基于RGB 模态的行为识别方法易受视频图像背景噪声影响、基于骨骼模态的行为识别方法缺乏人与物体的外观特征信息的问题，将2种方法进行融合，提出了一种基于多模态特征融合的井下人员不安全行为识别方法。

通过SlowOnly 网络对RGB 模态特征进行提取；使用YOLOX 与Lite−HRNet 网络获取骨骼模态数据，采用PoseC3D 网络对骨骼模态特征进行提取；对RGB 模态特征与骨骼模态特征进行早期融合与晚期融合，最后得到井下人员不安全行为识别结果。

在X−Sub 标准下的NTU60 RGB+D 公开数据集上的实验结果表明：在基于单一骨骼模态的行为识别模型中，PoseC3D 拥有比GCN （图卷积网络）类方法更高的识别准确率，达到93.1%；基于多模态特征融合的行为识别模型对比基于单一骨骼模态的识别模型拥有更高的识别准确率，达到95.4%。

在自制井下不安全行为数据集上的实验结果表明：基于多模态特征融合的行为识别模型在井下复杂环境下识别准确率仍最高，达到93.3%，对相似不安全行为与多人不安全行为均能准确识别。

关键词：智能矿山；行为识别；目标检测；姿态估计；多模态特征融合；RGB 模态；骨骼模态；YOLOX 中图分类号：TD67 文献标志码：ARecognition of unsafe behaviors of underground personnel based on multi modal feature fusionWANG Yu 1, YU Chunhua 2, CHEN Xiaoqing 1, SONG Jiawei 1(1. School of Mining Engineering, University of Science and Technology Liaoning, Anshan 114051, China ；2. Lingang Group Beipiao Baoguo Iron Mining Co., Ltd., Chaoyang 122102, China)Abstract : The use of artificial intelligence technology for real-time recognition of underground personnel's behavior is of great significance for ensuring safe production in mines. The RGB modal based behavior recognition methods is susceptible to video image background noise. The bone modal based behavior recognition methods lacks visual feature information of humans and objects. In order to solve the above problems, a multi modal feature fusion based underground personnel unsafe behavior recognition method is proposed by combining the two methods. The SlowOnly network is used to extract RGB modal features. The YOLOX and Lite HRNet networks are used to obtain bone modal data. The PoseC3D network is used to extract bone modal features. The early and late fusion of RGB modal features and bone modal features are performed. The recognition results for unsafe behavior of underground personnel are finally obtained. The experimental results on the NTU60 RGB+D public dataset under the X-Sub standard show the following points. In the behavior recognition model based on a single bone modal, PoseC3D has a higher recognition accuracy than GCN (graph convolutional network)methods, reaching 93.1%. The behavior recognition model based on multimodal feature fusion has a higher收稿日期：2023-07-16；修回日期：2023-10-27；责任编辑：胡娴。

1、下载文档前请自行甄别文档内容的完整性，平台不提供额外的编辑、内容补充、找答案等附加服务。
2、"仅部分预览"的文档,不可在线预览部分如存在完整性等问题,可反馈申请退款(可完整预览的文档不适用该条件!)。
3、如文档侵犯您的权益，请联系客服反馈,我们会尽快为您处理(人工客服工作时间：9:00-18:30)。

email: flanser,zierl,munkelt,radigg@informatik.tu-muenchen.de
Abstract. One of the fundamental requirements for an autonomous mo-
bile system (AMS ) is the ability to navigate within an a priori known environment and to recognize task-speci c objects, i.e., to identify these objects and to compute their 3D pose relative to the AMS. For the accomplishment of these tasks the AMS has to survey its environment by using appropriate sensors. This contribution presents the vision-based 3D object recognition system MORAL1 , which performs a model-based interpretation of single video images of a CCD camera. Using appropriate parameters, the system can be adapted dynamically to di erent tasks. The communication with the AMS is realized transparently using remote procedure calls. As a whole this architecture enables a high level of exibility with regard to the used hardware (computer, camera) as well as to the objects to be recognized.
I M
?
This work was supported by Deutsche Forschungsgemeinschaft within the Sonderforschungsbereich 331, \Informationsverarbeitung in autonomen, mobilen Handhabungssystemen", project L9. 1 Munich Object Recognition And Localization
CCD camera
user interface (dyn. configuration) RPC
M
AMS task control RPC
feature detection
O R A L
RPC odometry localization recognition
ion
model prediction
2.1 Calibration
In order to obtain the 3D object pose from the grabbed video image, the internal camera parameters (mapping the 3D world into pixels) as well as the external camera parameters (pose of the CCD camera relative to the manipulator or the vehicle) have to be determined with su cient accuracy.
Internal Camera Parameters. The proposed approach uses the model of a pinhole camera with radial distortions to map 3D point in the scene into 2D pixels of the video image LT88]. It includes the internal parameters as well as the external parameters R, a matrix describing the orientation, and T , a vector describing the position of the camera in the world. In the rst stage of the calibration process the internal camera parameters are computed by simultanously evaluating images showing a 2D calibration table with N circular marks Pi taken from K di erent viewpoints. This multiview calibration LZ95] minimizes the distances between the projected 3D midpoints of the marks and the corresponding 2D points in the video images. The 3D pose R; T of the camera is estimated during the minimization process. Thus, only the model of the calibration table itself has to be known a priori. Hand-Eye Calibration. Once the internal camera parameters have been determined, the 3D pose of the camera relative to the tool center point is estimated in the second stage of the calibration process (hand-eye calibration). In the case of a camera mounted on the manipulator of a mobile robot the 3D pose of the camera (R; T ) is the composition of the pose of the robot (RV ; TV ), the relative pose of the manipulator (RM ; TM ), and the relative pose of the camera (RC ; TC ), see Fig. 2 (a). The unknown pose (RC ; TC ) is determined by perk forming controlled movements (Rk ; TM ) of the manipulator similar to Wan92], M for details see LZ95]. Since the used 2D calibration table is mounted on the mobile robot itself, the manipulator can move to the di erent viewpoints for the multiview calibration automatically. Thus, the calibration can be accomplished in only a few minutes.
ok ok ok ok . . .
= = = =
moral_init(&ID) moral_load_param(ID,Configuration) moral_load_param(ID,CameraParameter) moral_load_object(ID,Object)
RPC generalized environmental model
MORAL { A Vision-based Object Recognition System for Autonomous Mobile Systems ?
Stefan Lanser, Christoph Zierl, Olaf Munkelt, Bernd Radig
Technische Universitat Munchen Forschungsgruppe Bildverstehen (FG BV), Informatik IX Orleansstr. 34, 81667 Munchen, Germany
2 System Architecture
The presented object recognition system (see Fig. 1 (a)) is implemented as a RPC server (remote procedure call ), which is called by any client process, especially by the AMS task control system. The standardized RPCs allow a hardware independent use of the MORAL system. Therefore MORAL can easily be applied on di erent platforms. By using the same mechanism, MORAL communicates (optionally) with other components, e.g., the generalized environmental model. With the help of speci c RPCs MORAL can be dynamically con gured, and can thus be exibly adapted to modi ed tasks. The internal structure of MORAL essentially consists of ve modules, which are implemented in ANSI-C and C++, respectively.