BioPAXValidator



Overview

The BioPAX Validator helps improve BioPAX data quality. It implements and checks many syntax and semantic rules and BioPAX best practices, most of which cannot be formally defined using either OWL constraints or a rule definition language. The Pathway Commons team started the project in November 2008; see the initial requirements and design. The latest stable (official) version is available online at http://www.biopax.org/validator.

BioPAX Rules

Availability

Software that might be useful for or inspire biological pathway data validation:

Contribute!

There are several levels of contribution, depending on how familiar you are with the project and how much time and resources you can commit. Note that all the categories and tasks listed below are very important; those that may seem boring or for beginners only are, in fact, currently of the highest priority. So we encourage you to participate actively!

Basic

  • either try it online or download the latest version and report problems (using the issue tracker: http://sourceforge.net/p/biopax/_list/tickets)
  • grab the sources from the CVS and write more BioPAX rule tests that generate example OWL files (a very important and excellent exercise!) in the org.biopax.validator.Level3RulesUnitTest or TestContextTest class
  • feel free to ask questions at biopax-discuss@googlegroups.com or http://groups.google.com/group/biopax-discuss

Intermediate

(you may submit patches, or become a BioPAX developer with check-in rights)

  • implement a rule labeled with "*" at the BioPAXRules page
  • help debug/test/tune
  • make the web service accept/send zipped BioPAX data
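
As a starting point for the zipped-data task, here is a minimal sketch of compressing and restoring BioPAX RDF/XML text using only the Java standard library. It assumes that "zipped" means gzip; the class BioPaxGzip and its methods are hypothetical and are not part of the validator code base, and wiring them into the actual web service is left open.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper (an assumption for illustration, not validator API).
final class BioPaxGzip {

    private BioPaxGzip() {}

    /** Compresses BioPAX RDF/XML text (encoded as UTF-8) into gzip bytes. */
    static byte[] compress(String owl) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the gzip stream writes the trailer, so close before reading the buffer.
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(owl.getBytes(StandardCharsets.UTF_8));
        }
        return buffer.toByteArray();
    }

    /** Restores the original text from gzip bytes. */
    static String decompress(byte[] gzipped) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(gzipped))) {
            byte[] chunk = new byte[8192];
            int read;
            while ((read = gzip.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Detects gzip input by its two-byte magic number (0x1f, 0x8b). */
    static boolean isGzipped(byte[] data) {
        return data.length >= 2
                && (data[0] & 0xff) == 0x1f
                && (data[1] & 0xff) == 0x8b;
    }
}
```

A service could call isGzipped on the uploaded bytes to decide whether to decompress before validation, which would keep plain (uncompressed) uploads working unchanged.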

Advanced

  • improve auto-fix and normalization
  • make it asynchronous, multi-threaded (why?..)
  • get rid of AspectJ LTW (load-time weaving)? [LTW is no longer required when the validator is used as a Java library, but without it the validator ignores all issues that occur in the BioPAX reader while converting from RDF]
  • integration with visualization and modelling tools
  • add advanced rules that may require external tools (e.g., check the organism, check that the sequence matches the xref, etc.)


Milestones

Milestone | Date | Status | Notes
biopax-validator v1.0 alpha | 20-Apr-09 | done |
biopax-validator beta | 07-May-09 | done |
Trying at IOB | May-09 | | Toronto-Bangalore
XML configuration for the DB synonyms in Xrefs; limit DB usage (allow/deny) | May-09 | done |
New features: FIXIT behavior, errors threshold, and warnings | Jun-09 | done |
Web service and Web site (draft) | Jun-09 | done |
biopax-validator v1.0 beta 2 | Aug-09 | done |
Re-design: simpleIO instead of JenaIO; errors only via AOP; simpler interfaces; a basis for the future multi-thread validation | Jul-Aug | done |
Implementing rules | Aug-Oct-09 | |
Present the validator at the BioPAX meeting | Nov-09 | done | NYC USA
Assemble or convert BioPAX models online | | | Collaborate with ChiBE
Begin v2.0 development | Dec-09 | done | biopax hg (Mercurial) repository; fix/normalization behavior, maven2
Basic CV rule change: do not use the Generic Schema Validator module, nor Ontology Manager; use OLS directly | Dec-09 | done |
BioPAX Validator 1.0 Release | 31-Dec-09 | done | v1.0.6 is the last 1.x one
BioPAX Validator 2.0 | 2010 | done | normalization, error threshold, new web look, etc. http://www.biopax.org/validator
BioPAX Validator 2.0.0 Release | by April-23 (HARMONY) | | improved error messages, normalization and auto-fix, ontology manager configuration, docs/examples
BioPAX Validator 2.1.0 Release | by end of May 2012 (HARMONY meeting) | | add categories, review rules and messages, improve normalizer, upgrade ontology manager, paxtools, and other dependencies, etc.
BioPAX Validator 3.0.0 Release | January 2013 | done | "notstrict" validation profile added; improved auto-fix and normalization; improved console app; improved web style, etc.
BioPAX Validator 3.0.1 Release | April 2013 | TODO | fix bugs; improve validation of multiple large files (from a directory)
BioPAX Validator 3.0.2 Release | Summer 2013 | done |
BioPAX Validator 3.0.3 Release | December 2013 | done |
BioPAX Validator 4 | 2014 | TODO | JSON support and more web services; optionally, use a (Pathway Commons) data warehouse and other external resources to, e.g., suggest/fix CV terms, infer some BioPAX properties, check using isoform sequences and identifier versions, etc.