Listing 1 - 10 of 73
It is our great pleasure to welcome you to WCRML'19, held at the ACM International Conference on Multimedia Retrieval. Learning from data in different modalities has been a popular research topic in both academia and industry. Utilizing information across modalities is particularly important in scenarios where multimodal data is easy to obtain but only a few of the modalities are richly annotated. In contrast to multimodal learning, which mainly focuses on fusing information from multiple modalities, cross-modal learning involves semantic interactions between different modalities and has been shown to enable interesting applications such as image captioning and visually indicated sounds. This workshop is of interest to audiences from both academia and industry. For academic attendees, the targeted audience includes researchers who develop models and systems that interact with heterogeneous data and who would like to see how their models and systems can be applied in industry scenarios. For students who will be applying for jobs after graduation, this workshop shows research challenges that are relevant to industry. In recent years, an increasing number of industry researchers have also been joining top artificial intelligence and machine learning conferences. This workshop creates opportunities for them to exchange experiences from different business areas and learn from the latest advances in academia.
It is our great pleasure to welcome you to the First ACM International Workshop on Search as Learning with Multimedia Information (SALMM 2019). It is the first workshop that addresses multimedia aspects in the context of learning-oriented search on the Web. As such, SALMM offers a unique forum for interdisciplinary research and discussion and aims to bridge the gap between the multimedia, psychology, and information retrieval communities on this topic. The call for papers attracted submissions from Asia and Europe. All submissions were reviewed by three members of the program committee with respect to scientific quality and suitability to the workshop's topic. Submissions were accepted if all three reviewers agreed to accept, or if their average score was above zero (scores ranged from strong accept (+3) to strong reject (-3)). Consequently, two of the five submissions were accepted for presentation during the workshop (acceptance rate 40%).
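The acceptance rule described above (accept if all three reviewers vote accept, or if the average score is positive) can be sketched as a small function; the function and variable names here are illustrative, not taken from the workshop's actual review tooling:

```python
def is_accepted(scores):
    """Decide acceptance from three reviewer scores.

    Scores range from strong reject (-3) to strong accept (+3).
    A submission is accepted if all three reviewers vote accept
    (positive score), or if the average score is above zero.
    """
    assert len(scores) == 3, "each submission received three reviews"
    all_accept = all(s > 0 for s in scores)
    avg_positive = sum(scores) / len(scores) > 0
    return all_accept or avg_positive
```

For example, scores of (+2, -1, +1) average to a positive value and would be accepted, while (-3, +1, -1) would be rejected.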
Vision-and-Language (VL) is a popular research area that sits at the nexus of Computer Vision and Natural Language Processing (NLP). This monograph surveys vision-language pre-training (VLP) methods for multimodal intelligence that have been developed in the last few years.
It is our great pleasure to welcome you to the 1st Workshop on Multimodal Product Identification in Livestreaming and WAB Challenge (WAB 2021), held in conjunction with ACM Multimedia 2021. The WAB Challenge and its associated workshop allow researchers to present their progress and to exchange and co-develop novel ideas that may shape the future of multimodal retrieval. Based on the proposed multimodal product retrieval dataset "Watch and Buy" (WAB), we launched the WAB Challenge, which encourages participants to develop approaches for fully automatic detection and multimodal identification of products presented in real-world e-commerce livestreaming. The challenge attracted registrations from 587 teams across Asia, Europe, and North America. After the preliminaries, the top 20 teams advanced to the semi-finals, and after the semi-finals, the top 10 teams advanced to the finals. The final leaderboard was then determined based on end-to-end reproduced results and each team's technical solution report. The best model achieved an average F1 score of 0.69, which is 0.22 higher than the baseline model, and all of the top three teams scored above 0.6. For the call-for-papers track, the program committee accepted 5 papers: 2 long papers and 3 short papers.
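The leaderboard above is ranked by an averaged F1 score. The exact WAB evaluation protocol is not spelled out here, so the following is only a generic sketch of how an F1 score averaged over per-item precision/recall pairs might be computed; all names are illustrative assumptions:

```python
def f1_score(precision, recall):
    # Harmonic mean of precision and recall; defined as 0 when both are 0.
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def averaged_f1(per_item_pr):
    # Average the per-item F1 scores into a single leaderboard number.
    return sum(f1_score(p, r) for p, r in per_item_pr) / len(per_item_pr)
```

For instance, two items with (precision, recall) of (1.0, 1.0) and (0.5, 0.5) have per-item F1 scores of 1.0 and 0.5, giving an averaged F1 of 0.75.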
The Handbook of Multimodal-Multisensor Interfaces provides the first authoritative resource on what has become the dominant paradigm for new computer interfaces: user input involving new media (speech, multi-touch, hand and body gestures, facial expressions, writing) embedded in multimodal-multisensor interfaces. This three-volume handbook is written by international experts and pioneers in the field. It provides a textbook, reference, and technology roadmap for professionals working in this and related areas. This third volume focuses on state-of-the-art multimodal language and dialogue processing, including the semantic integration of modalities. The development of increasingly expressive embodied agents and robots has become an active test-bed for coordinating multimodal dialogue input and output, including the processing of language and nonverbal communication. In addition, major application areas for commercializing multimodal-multisensor systems are featured, including automotive, robotics, manufacturing, machine translation, banking, communications, and others.
Multimodal user interfaces (Computer systems) --- Data mining --- Multimedia systems