OA Version: Review of Natural Language Processing in Pharmacology
Our previously published paper "Review of Natural Language Processing in Pharmacology" is now available as open access (OA), following the end of its 12-month embargo. You can now access the paper for free, directly on the Pharmacological Reviews journal website.
You can access it as HTML at this URL, or directly as PDF at this URL.
New Preprint: RDFGraphGen: A Synthetic RDF Graph Generator based on SHACL Constraints
In the past year or so, our research team designed, developed and published RDFGraphGen, a general-purpose, domain-independent generator of synthetic RDF knowledge graphs, based on SHACL constraints. Today, we published a preprint detailing its design and implementation: "RDFGraphGen: A Synthetic RDF Graph Generator based on SHACL Constraints".
So, how does RDFGraphGen work, and why was it needed?
The Shapes Constraint Language (SHACL) is a W3C standard which specifies ways to validate data in RDF graphs, by defining constraining shapes. Although the main purpose of SHACL is the validation of existing RDF data, we envisioned and implemented a reverse role for it, in order to address the lack of available RDF datasets in many RDF-based application development processes: we use SHACL shape definitions as a starting point to generate synthetic data for an RDF graph. The generation process involves extracting the constraints from the SHACL shapes, converting them into rules, and then generating artificial data for a predefined number of RDF entities, based on these rules. RDFGraphGen can generate small, medium or large RDF knowledge graphs intended for benchmarking, testing, quality control, training and similar purposes, for applications in the RDF, Linked Data and Semantic Web domains.
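To illustrate the kind of input RDFGraphGen works with, here is a minimal SHACL shape (in Turtle, with an illustrative vocabulary and illustrative cardinalities); the generator reads constraints such as sh:datatype, sh:minCount and sh:maxCount and turns them into rules for producing conforming entities:

```turtle
@prefix sh:     <http://www.w3.org/ns/shacl#> .
@prefix schema: <http://schema.org/> .
@prefix xsd:    <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:     <http://example.org/shapes#> .

# Every generated schema:Person entity gets exactly one name (a string)
# and at most one birth date, as dictated by the cardinality constraints.
ex:PersonShape
    a sh:NodeShape ;
    sh:targetClass schema:Person ;
    sh:property [
        sh:path schema:name ;
        sh:datatype xsd:string ;
        sh:minCount 1 ;
        sh:maxCount 1 ;
    ] ;
    sh:property [
        sh:path schema:birthDate ;
        sh:datatype xsd:date ;
        sh:maxCount 1 ;
    ] .
```

From a shape like this, the generator can produce any requested number of synthetic schema:Person entities, with property values conforming to the declared datatypes and cardinalities.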
RDFGraphGen is open-source and is available as a ready-to-use Python package.
Preprint: https://arxiv.org/abs/2407.17941
Authors: Marija Vecovska and Milos Jovanovik
RDFGraphGen on GitHub: https://github.com/mveco/RDFGraphGen
RDFGraphGen on PyPi: https://pypi.org/project/rdf-graph-gen/
New Paper: Sentiment Analysis in Finance: From Transformers Back to eXplainable Lexicons (XLex)
I'm happy to share that our paper "Sentiment Analysis in Finance: From Transformers Back to eXplainable Lexicons (XLex)" has just been published in the IEEE Access journal.
It showcases the excellent work done by Maryan Rizinski (Boston University, USA) and Hristijan Peshov (Faculty of Computer Science and Engineering - Skopje), under the guidance of Prof. Dimitar Trajanov, with the help of the rest of our team: Prof. Kostadin Mishev and myself (Prof. Milos Jovanovik).
In this paper, we present XLex, a novel methodology that leverages NLP transformer models and SHAP explainability to automatically enhance the vocabulary coverage of the Loughran-McDonald (LM) lexicon in sentiment analysis scenarios for financial applications. Our results demonstrate that standard domain-specific lexicons, such as the LM lexicon, can be expanded with new words in an explainable way, without the laborious annotation effort of human experts, a process that is both expensive and time-consuming. We conducted 22 separate experiments, and in all of them the proposed XLex methodology outperforms LM. Additionally, using the generated (XLex) or combined (XLex+LM) lexicons leads to significant improvements in sentiment analysis results, compared to using the manually annotated lexicon alone.
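As a rough illustration of the core idea (and not the paper's exact pipeline), the sketch below uses SHAP to score how strongly individual words push a transformer sentiment classifier towards its positive output; words with consistently strong attributions become candidates for new lexicon entries. The model name and the threshold are illustrative assumptions.

```python
# A conceptual sketch, not the XLex pipeline itself: use SHAP word
# attributions from a transformer sentiment classifier to propose
# candidate lexicon entries.
import shap
from transformers import pipeline

classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",  # illustrative model
    top_k=None,  # return scores for all labels, as SHAP expects
)
explainer = shap.Explainer(classifier)

sentences = [
    "The company reported record profits and strong growth.",
    "The firm faces litigation and mounting losses.",
]
shap_values = explainer(sentences)

# Collect per-token attributions towards the "POSITIVE" output; a real
# pipeline would aggregate these over a large corpus and filter by a
# significance threshold before adding words to the lexicon.
for sv in shap_values:
    for token, score in zip(sv.data, sv.values[:, 1]):  # column 1 = POSITIVE
        if abs(score) > 0.1:  # illustrative threshold
            print(f"{token.strip():<12} {score:+.3f}")
```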
Overall, the proposed XLex methodology holds great promise in advancing the field of sentiment analysis, particularly in applications where interpretability is of utmost importance.
The paper is published as #OpenAccess and is freely available at the link below.
Paper: https://ieeexplore.ieee.org/document/10380556
New Paper: Knowledge Graph Based Recommender for Automatic Playlist Continuation
Our latest research has just been published in the MDPI Information Journal.
The paper, titled "Knowledge Graph Based Recommender for Automatic Playlist Continuation", explores and demonstrates the potential of using Knowledge Graphs in the domain of music recommendation and playlist continuation. More specifically, by integrating representational learning with Graph Neural Networks (GNNs) and fusing multiple data streams, our approach effectively models user behavior, leading to accurate and personalized music recommendations.
✍️ Authors: Aleksandar Ivanovski, Milos Jovanovik, Riste Stojanov and Dimitar Trajanov.
📄 Paper (HTML): https://www.mdpi.com/2078-2489/14/9/510
📄 Paper (PDF): https://www.mdpi.com/2078-2489/14/9/510/pdf
New Paper: Review of Natural Language Processing in Pharmacology
I'm happy to share that after an extensive review, our paper "Review of Natural Language Processing in Pharmacology" has been published in the Pharmacological Reviews journal.
The paper is a result of the fruitful collaboration between our research team at the Faculty of Computer Science and Engineering - Skopje (Ss. Cyril and Methodius University in Skopje) and the Faculty of Computer and Information Science at the University of Ljubljana, led by Prof. Dimitar Trajanov and Prof. Marko Robnik-Sikonja.
The main objective of this paper is to survey the recent use of NLP in the field of pharmacology, in order to provide a comprehensive overview of the current state of the area after the rapid developments of the last few years. We believe the resulting survey will be useful to practitioners and interested observers in the domain.
The team: Dimitar Trajanov, Vangel Trajkovski, Makedonka Dimitrieva, Jovana Dobreva, Milos Jovanovik, Matej Klemen, Aleš Žagar, Marko Robnik-Sikonja.
The paper is available on the Pharmacological Reviews website: https://pharmrev.aspetjournals.org/content/75/4/714
New Paper: Learning Robust Food Ontology Alignment
Our paper from the IEEE Conference on Big Data 2022 (Osaka, Japan) has now been published online. "Learning Robust Food Ontology Alignment" is a work by our team at the Faculty of Computer Science and Engineering, Ss. Cyril and Methodius University in Skopje, Macedonia.
In the paper, we show how ontology alignment can be performed using neural networks. Each semantic resource is represented as a combination of graph-based representations (#RDF2vec) and text representations (#BERT), in order to capture its semantic and structural features.
With this, we get a methodology for ontology alignment that is both robust and ontology-agnostic. It can be applied to any ontology, regardless of the domain.
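As a conceptual sketch of this representation (and not the paper's actual model), each concept can be embedded by concatenating a graph embedding of its URI with a text embedding of its label, with candidate alignments scored by cosine similarity; the embedding functions below are random stand-ins for real RDF2vec and BERT encoders:

```python
# A conceptual sketch, not the paper's model: combine graph and text
# embeddings, and score candidate concept alignments by cosine similarity.
import numpy as np

rng = np.random.default_rng(42)

def embed_graph(concept_uri: str) -> np.ndarray:
    return rng.standard_normal(128)  # random stand-in for an RDF2vec lookup

def embed_text(label: str) -> np.ndarray:
    return rng.standard_normal(768)  # random stand-in for a BERT encoder

def combined_embedding(concept_uri: str, label: str) -> np.ndarray:
    # Concatenation captures both structural (graph) and semantic (text) features.
    return np.concatenate([embed_graph(concept_uri), embed_text(label)])

def alignment_score(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

source = combined_embedding("http://example.org/onto1#Apple", "apple")
target = combined_embedding("http://example.org/onto2#AppleFruit", "apple (fruit)")
print(f"alignment score: {alignment_score(source, target):+.3f}")
```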
Congrats to the team: Viktorija Mijalcheva, Ana Davcheva, Sasho Gramatikov, Milos Jovanovik, Dimitar Trajanov and Riste Stojanov.
Paper: https://ieeexplore.ieee.org/document/10020417
New Paper: PharmKE: Knowledge Extraction Platform for Pharmaceutical Texts Using Transfer Learning
The new year is off to a great start! Our paper "PharmKE: Knowledge Extraction Platform for Pharmaceutical Texts Using Transfer Learning" has just been published in the MDPI Computers journal. It highlights the work by our team at the Faculty of Computer Science and Engineering - Skopje in the field of transfer learning in pharmacology.
More specifically, in it we introduce PharmKE, a text analysis platform tailored to the pharmaceutical industry that uses deep learning at several stages to perform an in-depth semantic analysis of relevant publications. With the platform, pharmaceutical domain specialists can easily identify and visualize the knowledge extracted from the input texts.
Shout out to our team: Nasi Jofche, Kostadin Mishev, Riste Stojanov, Milos Jovanovik, Eftim Zdravevski and Dimitar Trajanov.
Paper: https://www.mdpi.com/2073-431X/12/1/17
New Paper: Consolidating Drug Data on a Global Scale Using Linked Data
The main contribution from my PhD Thesis has just been published in the Journal of Biomedical Semantics: "Consolidating Drug Data on a Global Scale Using Linked Data".
Results: We developed a methodology and a set of tools which support the process of generating Linked Data in the drug domain. Using them, we generated the LinkedDrugs dataset by seamlessly transforming, consolidating and publishing high-quality, 5-star Linked Drug Data from twenty-three countries, containing over 248,000 drug products, over 99,000,000 RDF triples and over 278,000 links to generic drugs from the LOD Cloud. Using the linked nature of the dataset, we demonstrate its ability to support advanced usage scenarios in the drug domain.
Conclusions: The process of generating the LinkedDrugs dataset demonstrates the applicability of the methodological guidelines and the supporting tools in transforming drug product data from various, independent and distributed sources, into a comprehensive Linked Drug Data dataset. The presented user-centric and analytical usage scenarios over the dataset show the advantages of having a de-siloed, consolidated and comprehensive dataspace of drug data available via the existing infrastructure of the Web.
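To give a flavor of such usage scenarios, here is a hypothetical SPARQL query over the dataset, retrieving drug products together with the generic drugs they link to in the LOD Cloud; the property names assume a schema.org-style model and are illustrative, not verified against the dataset:

```sparql
# Hypothetical query: drug products and their links to generic drugs.
# Property names are illustrative assumptions, not verified against the dataset.
PREFIX schema: <http://schema.org/>
PREFIX owl:    <http://www.w3.org/2002/07/owl#>

SELECT ?product ?name ?genericDrug
WHERE {
  ?product a schema:Drug ;
           schema:name ?name ;
           owl:sameAs ?genericDrug .
}
LIMIT 10
```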
Paper: https://jbiomedsem.biomedcentral.com/articles/10.1186/s13326-016-0111-z
PhD Thesis: Linked Data Application Development Methodology
I'm happy to share that I've successfully defended my doctorate (PhD) at the Faculty of Computer Science and Engineering, Ss. Cyril and Methodius University in Skopje.
With this, I'm now a Doctor of Technical Sciences, specialty: Computer Science and Engineering.
You can access and download the full thesis text in Macedonian, or the shortened executive summary in English, at the links below.
PhD Thesis (ENG): https://repository.ukim.mk/handle/20.500.12188/253
PhD Thesis (MKD): https://repository.ukim.mk/handle/20.500.12188/17581
New Ontology Tools
This morning I came across a Google+ post from Pierre-Yves Vandenbussche, one of the people behind the Linked Open Vocabularies (LOV) project. In it, Pierre-Yves details the newly added ontology tools on the LOV vocabulary pages: WebVOWL, Oops (Ontology Pitfall Scanner) and Parrot.
WebVOWL
WebVOWL is an online version of the VOWL ontology visualizer. You can add the URI of the ontology you want visualized as a query parameter, in the following way:
http://vowl.visualdataweb.org/webvowl/#iri=http://purl.org/net/po#
Oops (Ontology Pitfall Scanner)
As the name suggests, Oops is a service which checks an ontology for pitfalls, ranging from minor suggestions (e.g. missing comments) to more serious design issues. The URI of the ontology you want tested can be added as a query string parameter as well:
http://oops.linkeddata.es/response.jsp?uri=http://purl.org/net/po#
Parrot
Parrot is an older, but very useful tool for generating documentation / a technical report for an ontology. You can use it in various ways, one of which is passing your ontology as a query string parameter, just like with the other tools:
http://ontorule-project.eu/parrot/parrot?&documentUri=http://purl.org/net/po#
If you have a suggestion for other new and interesting ontology tools, ping me on Twitter!
The EU Open Data Hub
A couple of weeks ago, the European Commission launched the EU Open Data Hub, which is currently in early beta. It contains 5815 datasets, most of which come from Eurostat, the statistical office of the European Union.
The EU Open Data Hub provides a SPARQL endpoint for access to linked data. It is a standard Virtuoso SPARQL endpoint, which means it can be used both from a browser and from code, by providing the SPARQL query as a query string parameter in the request. An example query, tweeted by Kingsley Uyi Idehen, shows how you can get information about partner institutions in EU FP7 Programme projects, grouped by country. The result in HTML can be seen here. Notice that the URL contains both the SPARQL query and the answer format.
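For illustration, here is a minimal sketch of querying such a Virtuoso endpoint from code, with the SPARQL query and the desired result format passed as query string parameters; the endpoint URL below is a placeholder, so use the one published on the portal:

```python
# A minimal sketch of querying a SPARQL endpoint over HTTP, following the
# standard SPARQL protocol. The endpoint URL is a placeholder.
import requests

ENDPOINT = "https://example.org/sparql"  # placeholder; use the portal's endpoint

query = """
SELECT DISTINCT ?g
WHERE { GRAPH ?g { ?s ?p ?o } }
LIMIT 10
"""

response = requests.get(
    ENDPOINT,
    params={"query": query, "format": "application/sparql-results+json"},
    timeout=30,
)
response.raise_for_status()

# Print the named graphs found in the store.
for binding in response.json()["results"]["bindings"]:
    print(binding["g"]["value"])
```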
The portal also provides a place for publishing apps built on top of its open data. At the moment there are only two apps, which is expected, but their number should rise quickly once the portal is out of beta. After this initial public beta launch, the European Commission is seeking feedback from the open data community. If you have any suggestions or comments, head over to their website.
An interesting video by David Siegel, the author of the book "Pull: the Power of the Semantic Web to Transform Your Business", about how information has evolved since the beginning of man.
The description of the video, originally posted on Vimeo, says: "David Siegel, author of 4 books about the web, presents his new 8-minute video on the history and future of information. Using a fast pace and fun graphics, Siegel introduces us to the next wave of innovation with two key concepts he claims will affect $10 trillion of commerce worldwide: Pull and the Semantic Web."
New LOD Cloud Diagram
A new version of the Linking Open Data (LOD) cloud diagram has been published. The Linking Open Data community project aims to extend the Web with a data commons, by publishing various open datasets in RDF format on the Web and by setting RDF links between data items from different data sources, i.e. interconnecting semantic data from different datasets via dereferenceable URIs. The LOD cloud diagram shows datasets that have been published in Linked Data format by contributors to the project, as well as by other individuals and organisations. The latest version of the diagram shows 203 datasets, comprising over 25 billion RDF statements interconnected by around 395 million RDF links. The datasets hold semantic data about media, geographical information, publications, government information, life sciences, etc.
Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
For more info on the LOD cloud diagram, click here. For more info on the Linked Data initiative, click here.
An illustrated description of what the Semantic Web is all about. Source: Focus.
The year Open Data went Worldwide
After Sir Tim Berners-Lee called upon everyone to 'put their data on the Web' in his previous TED appearance, he's back on TED to show us what has been done with the (semantic) linked data on the Web so far.
In this TED talk, we can see some interesting applications built upon the linked data provided by the UK and US governments, which help users access various information, ranging from the way their money is spent to safe bike routes through the city they live in. There are also other interesting community-driven projects, such as the OpenStreetMap project, which seems to have helped rescue teams during the post-earthquake rescue operations in Haiti.
This is a truly motivational video, both for publishers and for consumers of semantic / linked data. Both data.gov.uk and data.gov contain lots of data, and it now seems to be up to developers to find innovative and useful ways to exploit it.
Web 3.0: What can we expect from the WWW in the following years
The problem the World Wide Web currently faces is the lack of mechanisms for easy and efficient consumption of relevant news, information and data from the huge amount of content present on the Web. The problem worsens daily as this content grows. The way Google provides us with relevant search results cannot keep pace with the growing number of pages and the amount of information on the Internet.
According to Tim Berners-Lee, the creator of the WWW, "when users search the Web, they are practically praying that they'll get the required information." 
Web 3.0, a 14-minute documentary by Kate Ray, an NYU student, offers a good and interesting analysis of the problems the Internet faces today, and of the vision of the Semantic Web as a mechanism for overcoming these constraints and shifting the WWW from a web of documents (pages) to a web of data.
The documentary features a number of researchers, innovators and company representatives, who share their ideas about the direction in which the Web should move in the future. Among them are Tim Berners-Lee, W3C Director and the father of the Web; David Karger, professor at MIT and author of the semantic projects Haystack and SIMILE; and Alon Halevy, a researcher at Google.
Given that our professional, private and social lives are inevitably and deeply connected to the Internet and to the information that we "feed" upon through the Web, the future of the WWW should be our common concern and interest. If you want to know what we can expect in the next few years, take a look at Kate Ray's documentary.
What is the Semantic Web?
The current Web is a vast network of documents. These documents, and thus the data they contain, are interconnected by mechanical means, using hyperlinks to reference each other. What has been quite popular in web research over the past nine years is the idea of the World Wide Web Consortium (W3C) and its large number of research and industrial partners: extending the principles of the current Web from a Web of Documents to a Web of Data. This endeavor is called the Semantic Web, and it aims to connect the data found on the Web according to its meaning, i.e. its semantics. The idea first appeared in the article "The Semantic Web", published in May 2001 in Scientific American, by Tim Berners-Lee, James Hendler and Ora Lassila [1]. According to them, the Semantic Web is a Web in which data has meaning, creating an environment in which software agents that crawl the Web will be able to perform sophisticated tasks assigned by users [1].
The Semantic Web is not a new, separate web, but an expansion of the existing Web, in which data will have well-defined meaning, enabling computers and people to collaborate more easily. The first steps towards the realization of this idea have already been taken. In the near future, with the constant development of Semantic Web technologies, our computers (and machines in general) will be able to process and understand the data they merely display today. This will produce new functionalities for the end users [1].
The Semantic Web mainly concentrates on two things. The first is creating common formats for the integration and combination of data drawn from various sources, an approach different from that of the existing Web, which concentrates on the exchange of documents (HTML pages, for example). The second is creating a language which corresponds to the relations between data objects in the real world. That would allow a computer (or a machine, in general) to start working with data from one particular collection, and then continue working with data from other, huge collections which are not physically connected to the starting set, but refer to the same thing the user is interested in. These sets of data are located somewhere on the Web, i.e. on the Semantic Web [1]. The user doesn't need to explicitly know which dataset he/she is working with. Thus, the Semantic Web gives users the sense that all the datasets found on the Web are available as one large collection of data [2][3].
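A small, illustrative RDF example (in Turtle, with made-up URIs) shows the mechanism that makes this possible: statements about the same real-world entity, published in two independent datasets, connected with a link that an agent can follow:

```turtle
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .

# Data about a person, published in one dataset, linked to the
# description of the same person in a completely separate dataset.
# An agent that starts here can follow the owl:sameAs link and keep
# working, as if both datasets were one large collection of data.
<http://example.org/people/alice>
    a foaf:Person ;
    foaf:name "Alice" ;
    owl:sameAs <http://other-dataset.example.org/staff/a-smith> .
```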
The main idea is for the Semantic Web to be created progressively, with incremental changes, by collecting machine-readable descriptions of data alongside the documents that already exist on the Web. As we know, the Web is a huge set of "static" web pages that are linked together via hyperlinks. Currently, the Web is in the process of evolving towards the Semantic Web, as illustrated by the example shown in the picture. There are many different approaches to adding semantic descriptions to existing Web resources. On the left-hand side of the picture is a graph depicting the current Web, composed of mechanically linked resources. The resources are interlinked and form a network. For a software agent crawling the Web, there is no visible difference between these resources, or between the links that connect them. To give meaning to the resources and the relationships between them, we need new standards and new languages.
These new standards, languages and technologies are all part of the Semantic Web. They enable us to give meaning to these resources, meaning which a machine can also understand. If we enable machines to understand the data they work with, we produce new functionalities for the end users. We can write programs which buy movie tickets for us, based on our schedules and our preferences for certain movie genres, movie theaters, etc. We can also buy and sell stock, or arrange travel and vacation plans, by giving a software agent our preferences and allowing it to crawl the Semantic Web, find the right data, purchase the tickets and update our calendar.
The Semantic Web endeavor is vast, so people sometimes get the impression that it may be too ambitious. By now, the foundations of the Semantic Web have already been laid. The standardization of XML, RDF, RDFS, OWL and SPARQL, and the efforts to standardize RIF, the language for expressing rules, encourage the researchers and implementors of Semantic Web technologies to focus on the upper layers of the architecture. Today, more and more companies and governments incorporate the recommendations and standards of the W3C [4]. Thus, the ideas and technologies of the Semantic Web are alive and are being used not only in science, but also for commercial purposes [5][6].
In order to achieve the vision of the Semantic Web, the current research focus is on the upper layers of the architecture. Once they are standardized, what remains is for the attention of researchers and their industry partners to turn to the untouched areas - the proof and trust layers - which will play a key role in the widespread acceptance of the Semantic Web by the users of the existing Web.
References:
[1] Berners-Lee, T., J. Hendler, O. Lassila, “The Semantic Web”, Scientific American, May 2001.
[2] W3C Semantic Web Activity, http://www.w3.org/2001/sw/
[3] Interview with Tim Berners-Lee, Business Week, April 2007.
[4] Janev, V., "Primena semantičkog jezika SPARQL za pretraživanje u okviru Semantičkog Web-a" [Application of the Semantic Language SPARQL for Search Within the Semantic Web], Institut Mihajlo Pupin, Belgrade.
[5] Cardoso, J., "The Semantic Web Vision: Where are We?", IEEE Intelligent Systems, September/October 2007, pp. 22-26.
[6] Nikov, A., Jovanovik, M., Stojanov, R., Petkovski, M., Trajanov, D., “Use of Semantic Web Technologies for Meeting Management Software Development”, ICT Innovations 2009, September 2009.