Requirements for BioPAX URIs
Choice of Base URIs
Post from Andrew Gibson:
I am currently looking at integrating and/or linking my data source to Reactome. As a user of Semantic Web standards, I find it very progressive of Reactome to provide a BioPAX representation of reactions and pathways. However, for integration purposes, I have noticed a few problems with the underlying representation in the downloaded '.owl' files of Reactome content.
First, I notice that all of the RDF documents downloaded from your website are given the same base URI <http://www.reactome.org/biopax>. This causes some problems for integration in certain environments. In my opinion, a base URI derived from an identifier for the pathway or reaction would be more appropriate for a BioPAX document.
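To make the base URI point concrete, here is a minimal sketch of how document-local identifiers resolve against a base URI. It uses only Python's standard library. The local identifier `#Pathway1` and the per-document bases ending in `human-example` and `rat-example` are made up for illustration; they are not taken from the Reactome files.

```python
from urllib.parse import urljoin

SHARED_BASE = "http://www.reactome.org/biopax"  # base URI quoted in the post
LOCAL_ID = "#Pathway1"  # hypothetical local identifier, for illustration only


def resolve(base: str, local_id: str) -> str:
    """Resolve a document-local fragment identifier against a base URI."""
    return urljoin(base, local_id)


# Both documents share one base, so the same local ID gives the same URI.
human_uri = resolve(SHARED_BASE, LOCAL_ID)
rat_uri = resolve(SHARED_BASE, LOCAL_ID)

# Hypothetical per-document bases derived from made-up document identifiers.
human_uri_distinct = resolve(SHARED_BASE + "/human-example", LOCAL_ID)
rat_uri_distinct = resolve(SHARED_BASE + "/rat-example", LOCAL_ID)

print(human_uri == rat_uri)                    # same absolute URI
print(human_uri_distinct == rat_uri_distinct)  # different absolute URIs
```

With a shared base, any local identifier that happens to be reused across two documents becomes one and the same resource once both are loaded together.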
Second, I have noticed that when I download and integrate two or more 'homologous' pathways from different species, the URIs for certain entities are identical in both documents. The corresponding nodes are therefore joined, leading to ambiguity and errors in the resulting RDF graph.
For example, take the pathway Beta Oxidation of Pristanoyl-CoA for Human and Rat. The URI for the Pathway in both of these documents is the same:
Merging the documents joins them at this node, and it is then no longer possible to tell which document the data and metadata originated from. For example, after joining, this node will have 'organism' relationships to both human and rat, which is inconsistent with the BioPAX Level 3 model, because organism is a functional property.
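The effect of such a merge can be sketched with plain Python sets standing in for RDF graphs. This is an illustration under stated assumptions, not the Reactome data: the pathway URI, the `organism` property label and the organism names are placeholders, and a real check would use an RDF library and the actual BioPAX property URI.

```python
from collections import defaultdict

# Hypothetical URI and labels; the real Reactome identifiers are not shown here.
PATHWAY = "http://www.reactome.org/biopax#Pathway1"
ORGANISM = "organism"  # treated as functional: at most one value per subject

# Each document is modelled as a set of (subject, predicate, object) triples.
human_doc = {(PATHWAY, ORGANISM, "Homo sapiens")}
rat_doc = {(PATHWAY, ORGANISM, "Rattus norvegicus")}


def functional_violations(triples, prop):
    """Return subjects that have more than one value for a functional property."""
    values = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == prop:
            values[subject].add(obj)
    return {s: sorted(v) for s, v in values.items() if len(v) > 1}


# An RDF merge is a set union: identical URIs collapse into a single node.
merged = human_doc | rat_doc
violations = functional_violations(merged, ORGANISM)
print(violations)
```

Each document is consistent on its own; the violation only appears after the union, and the merged set carries no record of which source contributed which triple.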
Andrea: As one needs to identify a pathway for a specific organism, the URI for the pathway should be specific to both the organism and the pathway. There may also be another, more general URI that identifies the pathway independently of the organism. The latter may not fit in BioPAX at this point.
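One possible shape for such a scheme is sketched below. Everything in it is an assumption made for illustration: the `http://example.org/pathway/` namespace, the path layout, and the identifiers `beta-oxidation-example`, `human` and `rat` are hypothetical and are not proposed by BioPAX or Reactome.

```python
from typing import Optional
from urllib.parse import quote

BASE = "http://example.org/pathway/"  # hypothetical namespace


def pathway_uri(pathway_id: str, organism: Optional[str] = None) -> str:
    """Build a general pathway URI, or an organism-specific one if given."""
    general = BASE + quote(pathway_id, safe="")
    if organism is None:
        return general
    # The organism-specific URI extends the general one by a path segment.
    return general + "/" + quote(organism, safe="")


general_uri = pathway_uri("beta-oxidation-example")
human_uri = pathway_uri("beta-oxidation-example", "human")
rat_uri = pathway_uri("beta-oxidation-example", "rat")

print(general_uri)
print(human_uri)
print(rat_uri)
```

Under this layout the human and rat pathways never collide on merge, while the shared prefix still lets a consumer recover the organism-independent URI.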