UPPA        

SVG Semantic Graph

(click here to download the prototype)


Khouloud Salameh   Farah El Akoum   Joe Tekli
CS&E Dept.
American University of Ras Al Khaimah (AURAK)
10021, Ras Al Khaimah, UAE
  SOE, ECE Dept.
Lebanese American University
36 Byblos, Lebanon
  SOE, ECE Dept.
Lebanese American University
36 Byblos, Lebanon
khouloud.salameh@aurak.ac.ae   farah.elakoum@lau.edu   joe.tekli@lau.edu.lb

    I. Introduction

    Because the meaning of an image is rarely evident from traditional keyword or content-based descriptions, the general goal of this study is to convert, with as little human intervention as possible, a stream of web vector graphics into a searchable knowledge graph structure that encodes semantically relevant image contents. To do so, we introduce an original framework titled SSG, which automatically converts a stream of SVG images and objects into a semantic graph. We introduce an incremental clustering approach that semantically annotates SVG images and their constituent objects in a fast and efficient manner, using an aggregation of shape, area, color, and location similarity measures. We then produce an RDF graph representation of the input image and integrate it into a reference knowledge graph, incrementally extending its semantic expressiveness to improve future annotation tasks. This achieves semantization of vector image contents with minimal human effort and training data, while complying with native Web standards (i.e., SVG and RDF) to preserve transparency in representing and searching images using Semantic Web stack technologies. Our solution is of linear complexity in the number of images and clusters used. We have conducted a large battery of experiments to test and evaluate our approach. We have created a labelled SVG dataset consisting of 22,553 objects from 750 images based on panoramic dental x-ray images. To our knowledge, it is the first significant dataset of labelled SVG objects and images, and we make it available online as a benchmark for future research in this area. Results underline our approach's effectiveness and its applicability in a practical application domain.

    II. System Architecture

    An overview of our SSG image semantization framework is depicted in Figure 1. It consists of four main modules: i) SVG-to-KG conversion, ii) SVG similarity computation, iii) SVG image and object clustering, and iv) SVG annotation and KG integration. An input SVG image is first processed for feature extraction: its constituent objects and their properties are identified and converted into a KG representation in the form of RDF subject-predicate-object triples. Object features are then processed for similarity computation using dedicated shape, area, location, and color similarity measures. Next, the image is run through an incremental clustering process, which groups it with the most similar cluster representative based on their aggregate object similarities. The image is then labelled according to its associated cluster and integrated into the reference KG accordingly. The reference KG's semantic expressiveness continuously increases as new images are labelled by the system. The user can fine-tune each module by selecting the most representative features, adjusting the similarity measures' weights, choosing the cluster seed representatives, and validating the system-generated annotations before they are integrated into the reference KG.


    Fig. 1. Simplified activity diagram describing our SSG framework
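
    To make the first module (SVG-to-KG conversion) more concrete, the following is a minimal sketch of how an SVG image's constituent objects and their properties might be turned into subject-predicate-object triples. It is an illustration under stated assumptions, not the SSG implementation: the `http://example.org/ssg#` namespace, the predicate names (`hasObject`, `shape`), the set of shape tags considered, and the function name `svg_to_triples` are all hypothetical, and triples are kept as plain Python tuples rather than being loaded into an RDF store.

```python
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"
# Hypothetical namespace for illustration only; not the vocabulary used by SSG.
EX = "http://example.org/ssg#"

# Assumed set of SVG elements treated as "objects" in this sketch.
SHAPE_TAGS = {"rect", "circle", "ellipse", "line", "polygon", "polyline", "path"}


def svg_to_triples(svg_text, image_id):
    """Return a list of (subject, predicate, object) tuples for one SVG image."""
    root = ET.fromstring(svg_text)
    image = EX + image_id
    triples = [(image, EX + "type", EX + "Image")]
    index = 0
    for elem in root.iter():
        tag = elem.tag.replace(SVG_NS, "")  # strip the XML namespace prefix
        if tag not in SHAPE_TAGS:
            continue
        index += 1
        obj = f"{image}/object{index}"
        # Link the image to its constituent object, then describe the object.
        triples.append((image, EX + "hasObject", obj))
        triples.append((obj, EX + "shape", tag))
        for name, value in sorted(elem.attrib.items()):
            triples.append((obj, EX + name, value))
    return triples


sample = """<svg xmlns="http://www.w3.org/2000/svg">
  <rect x="10" y="20" width="30" height="40" fill="#ff0000"/>
  <circle cx="50" cy="50" r="5" fill="#00ff00"/>
</svg>"""

triples = svg_to_triples(sample, "img1")
for subject, predicate, obj in triples:
    print(subject, predicate, obj)
```

    In a full pipeline, tuples of this kind would be serialized as RDF and merged into the reference KG; that step is omitted here.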

    Because every resulting semantic image representation is appended to the KG, the latter becomes increasingly rich in semantic annotations. In other words, the KG's semantic descriptive power increases incrementally as the system is used.
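
    The similarity computation and incremental clustering steps can be sketched as follows. This is a simplified illustration, assuming hypothetical feature encodings (area as a number, color as an RGB triple, location as normalized coordinates), hypothetical weights and threshold, and a basic best-match strategy for comparing images; the measures actually used by SSG are not specified here. Each incoming image is compared once against every cluster representative, which keeps the cost linear in the number of images and clusters.

```python
import math
from dataclasses import dataclass, field


@dataclass
class SvgObject:
    shape: str        # e.g. "path", "rect"
    area: float       # object area in arbitrary units
    color: tuple      # (r, g, b), each in 0..255
    location: tuple   # (x, y), normalized to [0, 1]


# Hypothetical weights (sum to 1); in SSG the user can tune such weights.
WEIGHTS = {"shape": 0.4, "area": 0.2, "color": 0.2, "location": 0.2}


def object_similarity(a, b, weights=WEIGHTS):
    """Weighted aggregate of shape, area, color, and location similarity."""
    shape = 1.0 if a.shape == b.shape else 0.0
    largest = max(a.area, b.area)
    area = min(a.area, b.area) / largest if largest > 0 else 1.0
    color = 1.0 - math.dist(a.color, b.color) / (255 * math.sqrt(3))
    location = 1.0 - math.dist(a.location, b.location) / math.sqrt(2)
    return (weights["shape"] * shape + weights["area"] * area
            + weights["color"] * color + weights["location"] * location)


def image_similarity(objs_a, objs_b):
    """Average, over objects of A, of their best match among objects of B."""
    if not objs_a or not objs_b:
        return 0.0
    best = [max(object_similarity(a, b) for b in objs_b) for a in objs_a]
    return sum(best) / len(best)


@dataclass
class Cluster:
    label: str
    representative: list          # objects of the seed image
    members: list = field(default_factory=list)


def assign(image_id, objects, clusters, threshold=0.8):
    """Attach the image to its most similar cluster, or open a new one."""
    best, best_sim = None, 0.0
    for cluster in clusters:
        sim = image_similarity(objects, cluster.representative)
        if sim > best_sim:
            best, best_sim = cluster, sim
    if best is None or best_sim < threshold:
        # No cluster is similar enough: the image seeds a new, unlabelled cluster.
        best = Cluster(f"unlabelled-{len(clusters) + 1}", objects)
        clusters.append(best)
    best.members.append(image_id)
    return best.label


# Hypothetical seed cluster and two incoming images.
seed = [SvgObject("path", 120.0, (255, 255, 255), (0.5, 0.5))]
clusters = [Cluster("molar", seed)]
label_a = assign("img1", [SvgObject("path", 110.0, (250, 250, 250), (0.52, 0.5))], clusters)
label_b = assign("img2", [SvgObject("rect", 10.0, (0, 0, 0), (0.0, 0.0))], clusters)
print(label_a, label_b)
```

    Here the first image inherits the label of the seed cluster, while the second is too dissimilar and opens a new cluster that a user could later label and validate.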

    Fig. 2. Snapshot of the SSG prototype


    GitHub link to prototype and empirical dataset.