Automatic image annotation

Output of DenseCap "dense captioning" software, analysing a photograph of a man riding an elephant

Automatic image annotation (also known as automatic image tagging or linguistic indexing) is the process by which a computer system automatically assigns metadata, in the form of captions or keywords, to a digital image. This application of computer vision techniques is used in image retrieval systems to organize and locate images of interest in a database.

This task can be regarded as a type of multi-class image classification with a very large number of classes, as large as the vocabulary size. Typically, machine learning techniques use image analysis, in the form of extracted feature vectors, together with the training annotation words to automatically apply annotations to new images. The first methods learned the correlations between image features and training annotations. Later techniques used machine translation to translate between the textual vocabulary and the 'visual vocabulary', that is, clustered image regions known as blobs. Subsequent work has included classification approaches, relevance models, and other methods.
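As a rough illustration of learning from feature vectors and training annotation words, the following minimal sketch transfers tags from the nearest annotated training images to a new image. It is not any specific published method: the three-dimensional feature vectors, the `TRAINING` data, and the `annotate` function with its `k` and `n_tags` parameters are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def annotate(query, training, k=2, n_tags=3):
    """Suggest tags for `query` by voting over its k nearest training images.

    `training` is a list of (feature_vector, tags) pairs. Both the data
    layout and the voting rule are assumptions made for this sketch.
    """
    # Pick the k training images whose features are closest to the query.
    neighbours = sorted(training, key=lambda item: euclidean(query, item[0]))[:k]
    # Each neighbour votes once for every annotation word it carries.
    votes = Counter(tag for _, tags in neighbours for tag in tags)
    # Rank by vote count (descending), breaking ties alphabetically.
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ranked[:n_tags]]


# Toy training set: made-up 3-dimensional "colour" features with keywords.
TRAINING = [
    ((0.9, 0.1, 0.1), ["sunset", "sky"]),
    ((0.8, 0.2, 0.1), ["sunset", "sea"]),
    ((0.1, 0.8, 0.2), ["forest", "tree"]),
    ((0.2, 0.9, 0.1), ["forest", "grass"]),
    ((0.1, 0.2, 0.9), ["sea", "sky"]),
]

if __name__ == "__main__":
    print(annotate((0.85, 0.15, 0.1), TRAINING))
```

Real systems use far richer features and much larger vocabularies; the sketch only shows how annotation words attached to visually similar training images can be propagated to an unlabelled image.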

The advantage of automatic image annotation over content-based image retrieval (CBIR) is that queries can be specified more naturally by the user.[1] CBIR generally (at present) requires users to search by image concepts such as color and texture, or by supplying example images as queries. Certain features of an example image may override the concept that the user is really focusing on. Traditional methods of image retrieval, such as those used by libraries, have relied on manually annotated images; manual annotation is expensive and time-consuming, especially given the large and constantly growing image databases in existence.

References

  1. "Archived copy" (PDF). i.yz.yamagata-u.ac.jp. Archived from the original (PDF) on 8 August 2014. Retrieved 13 January 2022.

Further reading

  • Word co-occurrence model
Y Mori; H Takahashi & R Oka (1999). "Image-to-word transformation based on dividing and vector quantizing images with words". Proceedings of the International Workshop on Multimedia Intelligent Storage and Retrieval Management. CiteSeerX 10.1.1.31.1704.
  • Annotation as machine translation
P Duygulu; K Barnard; N de Freitas & D Forsyth (2002). "Object recognition as machine translation: Learning a lexicon for a fixed image vocabulary". Proceedings of the European Conference on Computer Vision. pp. 97–112. Archived from the original on 2005-03-05.
  • Statistical models
J Li & J Z Wang (2006). "Real-time Computerized Annotation of Pictures". Proc. ACM Multimedia. pp. 911–920.
J Z Wang & J Li (2002). "Learning-Based Linguistic Indexing of Pictures with 2-D MHMMs". Proc. ACM Multimedia. pp. 436–445.
  • Automatic linguistic indexing of pictures
J Li & J Z Wang (2008). "Real-time Computerized Annotation of Pictures". IEEE Transactions on Pattern Analysis and Machine Intelligence.
J Li & J Z Wang (2003). "Automatic Linguistic Indexing of Pictures by a Statistical Modeling Approach". IEEE Transactions on Pattern Analysis and Machine Intelligence. pp. 1075–1088.
  • Hierarchical Aspect Cluster Model
K Barnard; D A Forsyth (2001). "Learning the Semantics of Words and Pictures". Proceedings of International Conference on Computer Vision. pp. 408–415. Archived from the original on 2007-09-28.
  • Latent Dirichlet Allocation model
D Blei; A Ng & M Jordan (2003). "Latent Dirichlet allocation" (PDF). Journal of Machine Learning Research. pp. 3:993–1022. Archived from the original (PDF) on March 16, 2005.
G Carneiro; A B Chan; P Moreno & N Vasconcelos (2006). "Supervised Learning of Semantic Classes for Image Annotation and Retrieval" (PDF). IEEE Transactions on Pattern Analysis and Machine Intelligence. pp. 394–410.
  • Texture similarity
R W Picard & T P Minka (1995). "Vision Texture for Annotation". Multimedia Systems.
  • Support Vector Machines
C Cusano; G Ciocca & R Schettini (2004). Santini, Simone & Schettini, Raimondo (eds.). "Image Annotation Using SVM". Internet Imaging V. 5304: 330–338. Bibcode:2003SPIE.5304..330C. doi:10.1117/12.526746. S2CID 16246057.
  • Ensemble of Decision Trees and Random Subwindows
R Maree; P Geurts; J Piater & L Wehenkel (2005). "Random Subwindows for Robust Image Classification". Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition. pp. 1:34–30.
  • Maximum Entropy
J Jeon; R Manmatha (2004). "Using Maximum Entropy for Automatic Image Annotation" (PDF). Int'l Conf on Image and Video Retrieval (CIVR 2004). pp. 24–32.
  • Relevance models
J Jeon; V Lavrenko & R Manmatha (2003). "Automatic image annotation and retrieval using cross-media relevance models" (PDF). Proceedings of the ACM SIGIR Conference on Research and Development in Information Retrieval. pp. 119–126.
  • Relevance models using continuous probability density functions
V Lavrenko; R Manmatha & J Jeon (2003). "A model for learning the semantics of pictures" (PDF). Proceedings of the 16th Conference on Advances in Neural Information Processing Systems NIPS.
  • Coherent Language Model
R Jin; J Y Chai; L Si (2004). "Effective Automatic Image Annotation via A Coherent Language Model and Active Learning" (PDF). Proceedings of MM'04.
  • Inference networks
D Metzler & R Manmatha (2004). "An inference network approach to image retrieval" (PDF). Proceedings of the International Conference on Image and Video Retrieval. pp. 42–50.
  • Multiple Bernoulli distribution
S Feng; R Manmatha & V Lavrenko (2004). "Multiple Bernoulli relevance models for image and video annotation" (PDF). IEEE Conference on Computer Vision and Pattern Recognition. pp. 1002–1009.
  • Multiple design alternatives
J Y Pan; H-J Yang; P Duygulu; C Faloutsos (2004). "Automatic Image Captioning" (PDF). Proceedings of the 2004 IEEE International Conference on Multimedia and Expo (ICME'04). Archived from the original (PDF) on 2004-12-09.
  • Image captioning
Quan Hoang Lam; Quang Duy Le; Kiet Van Nguyen; Ngan Luu-Thuy Nguyen (2020). "UIT-ViIC: A Dataset for the First Evaluation on Vietnamese Image Captioning". Proceedings of the 2020 International Conference on Computational Collective Intelligence (ICCCI 2020). arXiv:2002.00175. doi:10.1007/978-3-030-63007-2_57.
  • Natural scene annotation
J Fan; Y Gao; H Luo; G Xu (2004). "Automatic Image Annotation by Using Concept-Sensitive Salient Objects for Image Content Representation". Proceedings of the 27th annual international conference on Research and development in information retrieval. pp. 361–368.
  • Relevant low-level global filters
A Oliva & A Torralba (2001). "Modeling the shape of the scene: a holistic representation of the spatial envelope" (PDF). International Journal of Computer Vision. pp. 42:145–175.
  • Global image features and nonparametric density estimation
A Yavlinsky, E Schofield & S Rüger (2005). "Automated Image Annotation Using Global Features and Robust Nonparametric Density Estimation" (PDF). Int'l Conf on Image and Video Retrieval (CIVR, Singapore, Jul 2005). Archived from the original (PDF) on 2005-12-20.
  • Video semantics
N Vasconcelos & A Lippman (2001). "Statistical Models of Video Structure for Content Analysis and Characterization" (PDF). IEEE Transactions on Image Processing. pp. 1–17.
Ilaria Bartolini; Marco Patella & Corrado Romani (2010). "Shiatsu: Semantic-based Hierarchical Automatic Tagging of Videos by Segmentation Using Cuts". 3rd ACM International Multimedia Workshop on Automated Information Extraction in Media Production (AIEMPro10).
  • Image Annotation Refinement
Yohan Jin; Latifur Khan; Lei Wang & Mamoun Awad (2005). "Image annotations by combining multiple evidence & wordNet". 13th Annual ACM International Conference on Multimedia (MM 05). pp. 706–715.
Changhu Wang; Feng Jing; Lei Zhang & Hong-Jiang Zhang (2006). "Image annotation refinement using random walk with restarts". 14th Annual ACM International Conference on Multimedia (MM 06).
Changhu Wang; Feng Jing; Lei Zhang & Hong-Jiang Zhang (2007). "Content-based image annotation refinement". IEEE Conference on Computer Vision and Pattern Recognition (CVPR 07). doi:10.1109/CVPR.2007.383221.
Ilaria Bartolini & Paolo Ciaccia (2007). "Imagination: Exploiting Link Analysis for Accurate Image Annotation". Springer Adaptive Multimedia Retrieval. doi:10.1007/978-3-540-79860-6_3.
Ilaria Bartolini & Paolo Ciaccia (2010). "Multi-dimensional Keyword-based Image Annotation and Search". 2nd ACM International Workshop on Keyword Search on Structured Data (KEYS 2010).
  • Automatic Image Annotation by Ensemble of Visual Descriptors
Emre Akbas & Fatos Y. Vural (2007). "Automatic Image Annotation by Ensemble of Visual Descriptors". IEEE Conf. on Computer Vision and Pattern Recognition (CVPR) 2007, Workshop on Semantic Learning Applications in Multimedia. doi:10.1109/CVPR.2007.383484. hdl:11511/16027.
  • A New Baseline for Image Annotation
Ameesh Makadia and Vladimir Pavlovic and Sanjiv Kumar (2008). "A New Baseline for Image Annotation" (PDF). European Conference on Computer Vision (ECCV).
  • Simultaneous Image Classification and Annotation
Chong Wang and David Blei and Li Fei-Fei (2009). "Simultaneous Image Classification and Annotation" (PDF). Conf. on Computer Vision and Pattern Recognition (CVPR).
  • TagProp: Discriminative Metric Learning in Nearest Neighbor Models for Image Auto-Annotation
Matthieu Guillaumin and Thomas Mensink and Jakob Verbeek and Cordelia Schmid (2009). "TagProp: Discriminative Metric Learning in Nearest Neighbor Models for Image Auto-Annotation" (PDF). Intl. Conf. on Computer Vision (ICCV).
  • Image Annotation Using Metric Learning in Semantic Neighbourhoods
Yashaswi Verma & C. V. Jawahar (2012). "Image Annotation Using Metric Learning in Semantic Neighbourhoods" (PDF). European Conference on Computer Vision (ECCV). Archived from the original (PDF) on 2013-05-14. Retrieved 2014-02-26.
  • Automatic Image Annotation Using Deep Learning Representations
Venkatesh N. Murthy & Subhransu Maji and R. Manmatha (2015). "Automatic Image Annotation Using Deep Learning Representations" (PDF). International Conference on Multimedia Retrieval (ICMR).
  • Holistic Image Annotation using Salient Regions and Background Image Information
Sarin, Supheakmungkol; Fahrmair, Michael; Wagner, Matthias & Kameyama, Wataru (2012). Leveraging Features from Background and Salient Regions for Automatic Image Annotation. Journal of Information Processing. Vol. 20. pp. 250–266.
  • Medical Image Annotation using bayesian networks and active learning
N. B. Marvasti & E. Yörük and B. Acar (2018). "Computer-Aided Medical Image Annotation: Preliminary Results With Liver Lesions in CT". IEEE Journal of Biomedical and Health Informatics.
