
dc.contributor.author: Huo, L
dc.contributor.author: Bai, L
dc.contributor.author: Zhou, Shang-Ming
dc.date.accessioned: 2021-11-05T11:50:04Z
dc.date.available: 2021-11-05T11:50:04Z
dc.date.issued: 2022-08
dc.identifier.issn: 2168-2267
dc.identifier.issn: 2168-2275
dc.identifier.uri: http://hdl.handle.net/10026.1/18229
dc.description.abstract:

Automatically generating an accurate and meaningful description of an image is very challenging. However, recent schemes that generate an image caption by maximizing the likelihood of target sentences lack the capacity to recognize human-object interactions (HOIs) and the semantic relationships between HOIs and scenes, which are essential parts of an image caption. This article proposes a novel two-phase framework that addresses these challenges: 1) a hybrid deep-learning phase and 2) an image description generation phase. In the hybrid deep-learning phase, a novel factored three-way interaction machine is proposed to learn the relational features of human-object pairs hierarchically; in this way, the image recognition problem is transformed into a latent structured labeling task. In the image description generation phase, a lexicalized probabilistic context-free tree-growing scheme is integrated with a description generator, transforming the description generation task into a syntactic-tree generation process. Comparing extensively with state-of-the-art image captioning methods on benchmark datasets, we demonstrate that the proposed framework outperforms existing methods in several ways, such as significantly improving the prediction of HOIs and of relationships between HOIs and scenes (RHIS), and producing image captions that are semantically and structurally more coherent.
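The "factored three-way interaction" described in the abstract can be pictured as a rank-k tensor factorization that scores (human, object, scene) triples. The sketch below is a minimal illustration of that idea, not the paper's actual model; the embedding matrices, dimensions, and scoring function are all assumptions made for the example.

```python
import numpy as np

# Illustrative factored three-way interaction (assumed form, not the paper's
# implementation): each human, object, and scene gets a k-dimensional latent
# factor, and a triple is scored by the factor-wise product summed over k.
rng = np.random.default_rng(0)
n_humans, n_objects, n_scenes, k = 5, 7, 3, 4

H = rng.normal(size=(n_humans, k))   # human latent factors
O = rng.normal(size=(n_objects, k))  # object latent factors
S = rng.normal(size=(n_scenes, k))   # scene latent factors

def interaction_score(h, o, s):
    """Three-way factored interaction: sum_f H[h,f] * O[o,f] * S[s,f]."""
    return float(np.sum(H[h] * O[o] * S[s]))

# Score every (object, scene) pair for one human in a single einsum,
# then pick the highest-scoring HOI-in-scene candidate.
scores = np.einsum('f,of,sf->os', H[0], O, S)
best_o, best_s = np.unravel_index(np.argmax(scores), scores.shape)
```

Framing recognition this way turns HOI prediction into selecting the best latent labeling over structured triples, which matches the abstract's description of a latent structured labeling task.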

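The second phase's "lexicalized probabilistic context-free tree-growing scheme" can be sketched as top-down expansion of a syntactic tree by sampling weighted grammar rules until only words remain. The toy grammar and weights below are stand-ins for illustration only; the paper learns its grammar from data.

```python
import random

# Toy lexicalized-PCFG-style caption generation (assumed grammar, for
# illustration): expand nonterminals top-down by sampling weighted rules.
GRAMMAR = {
    "S":  [(["NP", "VP"], 1.0)],
    "NP": [(["a man"], 0.5), (["a dog"], 0.5)],
    "VP": [(["V", "NP"], 1.0)],
    "V":  [(["rides"], 0.6), (["chases"], 0.4)],
}

def grow(symbol, rng):
    """Recursively grow a syntactic tree; terminals are emitted as words."""
    if symbol not in GRAMMAR:          # terminal (lexical) symbol
        return symbol
    rules, weights = zip(*GRAMMAR[symbol])
    rule = rng.choices(rules, weights=weights)[0]
    return " ".join(grow(s, rng) for s in rule)

caption = grow("S", random.Random(42))
```

The key point the example illustrates is that caption generation becomes a syntactic-tree generation process: every produced caption is grammatical by construction, because it is the yield of a tree licensed by the grammar.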
dc.format.extent: 7441-7452
dc.format.medium: Print-Electronic
dc.language: eng
dc.language.iso: eng
dc.publisher: Institute of Electrical and Electronics Engineers (IEEE)
dc.subject: Task analysis
dc.subject: Context modeling
dc.subject: Visualization
dc.subject: Solid modeling
dc.subject: Image recognition
dc.subject: Hybrid power systems
dc.subject: Generators
dc.subject: Human-object interaction (HOI)
dc.subject: hybrid deep learning
dc.subject: image captioning
dc.subject: image context
dc.subject: natural language processing
dc.title: Automatically Generating Natural Language Descriptions of Images by a Deep Hierarchical Framework
dc.type: journal-article
dc.type: Journal Article
plymouth.author-url: https://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=PARTNER_APP&SrcAuth=LinksAMR&KeyUT=WOS:000733254800001&DestLinkType=FullRecord&DestApp=ALL_WOS&UsrCustomerID=11bb513d99f797142bcfeffcc58ea008
plymouth.issue: 8
plymouth.volume: 52
plymouth.publication-status: Published
plymouth.journal: IEEE Transactions on Cybernetics
dc.identifier.doi: 10.1109/tcyb.2020.3041595
plymouth.organisational-group: /Plymouth
plymouth.organisational-group: /Plymouth/Faculty of Health
plymouth.organisational-group: /Plymouth/Faculty of Health/School of Nursing and Midwifery
plymouth.organisational-group: /Plymouth/REF 2021 Researchers by UoA
plymouth.organisational-group: /Plymouth/REF 2021 Researchers by UoA/UoA03 Allied Health Professions, Dentistry, Nursing and Pharmacy
plymouth.organisational-group: /Plymouth/Users by role
plymouth.organisational-group: /Plymouth/Users by role/Academics
dc.publisher.place: United States
dc.identifier.eissn: 2168-2275
dc.rights.embargoperiod: Not known
rioxxterms.versionofrecord: 10.1109/tcyb.2020.3041595
rioxxterms.licenseref.uri: http://www.rioxx.net/licenses/all-rights-reserved
rioxxterms.type: Journal Article/Review




All items in PEARL are protected by copyright law.