The DTD of Core SBML

Joachim Niehren, Inria Center of the University of Lille, BioComputing Team, France

Last update on 21 Feb 2024 @Copyright

Version and Sources

Version 1.21 from 21 Feb 2024.

The documentation csbml-1.21 of the schema of Core SBML shown here is created from the same source as the DTD csbml-1.21.dtd of Core SBML itself.

Core SBML Networks

A Core SBML network represents a reaction network with a set of algebraic and differential equations possibly with delays.

Introduction

Core SBML has an XML syntax, so it is composed of XML elements. Any Core SBML network contains a set of variable and reaction elements. Species are a special type of variable. Each reaction has a linear combination of reactants and a linear combination of products. Reactants and products must always be variables of type species. A reaction also has a set of modifiers, which are variables of any type.

Each reaction has a kinetic expression, which is an arithmetic expression that is built from the variables and the usual arithmetic functions. Furthermore, kinetic expressions may use macros for subexpressions that are defined by expression elements. They may also contain applications of arithmetic functions that are defined by the network's function elements.

Any Core SBML network can be drawn as a bipartite graph. The variable elements yield the variable nodes of the graph, while the reaction elements yield its reaction nodes. Variable nodes are drawn as circles and reaction nodes as boxes, following the usual conventions of Petri nets. The reactants, products, and modifiers define the edges of the network's graph. The edges go from the nodes of reactants and modifiers to the nodes of reactions, and from the nodes of reactions to the nodes of products.

Any Core SBML network contains a set of event elements that describe possible changes of the values of variables controlled by differential equations. Finally, it may also contain comment and edgecluster elements that serve to represent the edges of the network's graph in a clustered manner.

Network Elements

In summary, any network element has the following content:


	<!ELEMENT network (( variable |
               reaction | expression | function | event |
               comment | edgecluster )*) >

The attributes of a network are not required, since they are mostly of administrative or graphical character and unrelated to its semantics.


	<!ATTLIST network
               id      CDATA        #IMPLIED
	       name    CDATA      #IMPLIED
	       biomodels CDATA  #IMPLIED
	       version CDATA      #IMPLIED
	       csbml-version CDATA #IMPLIED 
	       kind    CDATA      #IMPLIED
	       scale   CDATA      #IMPLIED>

Any network may have an identifier, a kind, and a scale for its graph.

@id: the identifier is relevant when referring to the network's entities within LaTeX

@scale: the scale is a positive real number that scales the x-axis and y-axis simultaneously

@kind: an identifier that serves debugging purposes
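
As a rough illustration of the declarations above, a network element with a few of its optional attributes might look as follows. All attribute values are made up for this sketch, and filling @csbml-version with 1.21 is an assumption about how that attribute is meant to be used.

```xml
<!-- Hypothetical example: every attribute is optional (#IMPLIED) -->
<network id="demo" name="Demo network" csbml-version="1.21" scale="1.5">
  <!-- variable, reaction, expression, function, event, comment,
       and edgecluster children may follow here, in any order -->
</network>
```

Since the content model is a starred choice, an empty network element is also valid with respect to the DTD.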

Variable Types

Each variable must have an attribute @type which may have one of four possible values:

species,
algebraic,
differential, or
control

Only variables of type species may be used as reactants or products of reactions, while variables of any type may be used in the kinetic expressions or as modifiers of a reaction. Furthermore, variables of type control do not lead to any variable node in the graph, in contrast to the three other types of variables.

Semantically, each variable defines a signal or equivalently a trajectory, i.e., a real-valued function from the positive reals. The type of a variable declares how its signal is defined by the network.

The signal of a species variable is specified by the reactions of the network: the reactions define a system of differential equations, with one equation for the derivative of each species variable. The signals of the species variables are a solution of this system.
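
The system of differential equations induced by the reactions can be sketched in the way that is usual for reaction networks. The symbols are notation introduced here for illustration only and are not part of Core SBML: $x_s$ is the signal of species $s$, $v_r$ is the value of the kinetic expression of reaction $r$, and $p_{r,s}$ and $q_{r,s}$ are the stoichiometric coefficients of $s$ as a product and as a reactant of $r$. The sketch assumes the standard interpretation and ignores compartments and events.

```latex
\frac{d x_s}{dt}(t) \;=\; \sum_{r} \bigl(p_{r,s} - q_{r,s}\bigr)\, v_r(t)
\qquad \text{for each species } s
```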

For any algebraic variable, the signal is defined by an algebraic equation that is explicitly given by the network. The signal of a differential variable is defined by a differential equation that is explicitly given by the network. A control variable is constant, except that its value may be changed by events. The values of species and differential variables, although governed by differential equations, can also be updated by events, so events may update all non-algebraic variables.

Identified Objects

All variable, reaction, and expression elements are objects with unique identifiers and may have pretty names.


	  <!ENTITY % object "
                 id               CDATA     #REQUIRED
	         latex-look   CDATA    #IMPLIED
	         comment     CDATA    #IMPLIED
          ">

@id: the unique identifier of the object. The symbols "-" and "+" are the only non-alphanumeric symbols allowed in identifiers.

@latex-look: the pretty name of the object. It may contain arbitrary symbols. In many tools, these pretty names are interpreted by LaTeX, so they should be valid LaTeX sequences.

@comment: a LaTeX comment

Graphical Entities

The variable elements of Core SBML networks are identified objects that contain information about where to place the corresponding graph nodes.

Graph Nodes

The graph of a reaction network has nodes for variables of any type other than control, for reactions, and for edge clusters.


	  <!ENTITY % node " %object;
	         x                CDATA     #IMPLIED
	         y                CDATA     #IMPLIED
	         aux               CDATA    #IMPLIED 
	         epsilon          CDATA    #IMPLIED 
          ">

Graph nodes may be given x-y coordinates for their layout in the graph.

@x: position coordinate at x-axis

@y: position coordinate at y-axis

Constraints

the value of @x must be castable to xs:float
the value of @y must be castable to xs:float

The nodes of variables are called variable nodes. Recall that all variables except those of type control are mapped to variable nodes, including the variables of types algebraic and differential.


	  <!ENTITY % variable-node " %node;
	         initial              CDATA      #IMPLIED
	         initial-expression   CDATA      #IMPLIED
	         essential         CDATA         #IMPLIED
	  ">

Note that either @initial or @initial-expression must be present. The otherwise optional attributes of variable nodes have the following meaning:

@initial: a real number for the initial value of the variable.

@initial-expression: the identifier of an initial expression that specifies the initial value.

@essential: if present, the circle around the species' node will be drawn in red. It indicates that the presence of the species is essential for a network to work properly.

Copies of variable-nodes

Each variable node may have a set of copies, which are drawn as nodes with dashed circles. All nodes for the same variable are linked by a splitpoint, which is also drawn as a node (but does not need any identifier). Logically, it does not matter whether a reaction node is linked to a variable node or to one of its copies.


	  <!ENTITY % copies "((copy+,splitpoint,comment?)?,comment*)">
	  <!ELEMENT copy        (#PCDATA) >
	  <!ATTLIST copy        %node;   >
	  <!ELEMENT splitpoint  (#PCDATA) >
	  <!ATTLIST splitpoint
	         x           CDATA     #IMPLIED
                 y           CDATA     #IMPLIED
                 comment    CDATA     #IMPLIED >

So copies can be used in any position where species nodes can be used.
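
To make the content model of copies concrete, here is a minimal sketch of a species variable with two copies and a splitpoint. The identifiers (ATP, ATP-1, ATP-2) and all coordinates are hypothetical values chosen for illustration.

```xml
<!-- Hypothetical species with two graphical copies -->
<variable id="ATP" type="species" x="1.0" y="1.0" initial="5.0">
  <!-- one or more copies, each a node with its own identifier -->
  <copy id="ATP-1" x="4.0" y="1.0"/>
  <copy id="ATP-2" x="4.0" y="3.0"/>
  <!-- exactly one splitpoint follows the copies; it has no identifier -->
  <splitpoint x="2.5" y="2.0"/>
</variable>
```

Note that the splitpoint is required as soon as at least one copy is present, and that it must come after all copy elements.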

Arithmetic and Boolean Expressions

We will use arithmetic expressions as the kinetic expressions of reactions. They may contain conditionals for defining piecewise functions. The conditionals contain Boolean expressions as conditions. The same Boolean expressions will serve as conditions of events.

Arithmetic Expressions

Real numbers or signals are referred to by the following elements: constant elements, references to variables (var), references to expression macros (expr), and references to a reaction's speed (speed), that is, to its kinetic-expression.


	<!ENTITY % number-reference "constant | var | expr | speed">

	<!ELEMENT constant  (#PCDATA)>
	<!ATTLIST constant 
              id	   CDATA     #IMPLIED 
	      value     CDATA     #IMPLIED >

	<!ELEMENT speed  (#PCDATA)>
	<!ATTLIST speed
	      reaction    CDATA     #REQUIRED >

	<!ELEMENT expr  (#PCDATA) >
	<!ATTLIST expr
	      id          CDATA    #REQUIRED  >

	<!ELEMENT var  (#PCDATA)>
	<!ATTLIST var
	      type         CDATA     #IMPLIED
              id	      CDATA     #IMPLIED >

An arithmetic expression may define a real number, but in most cases it defines a signal, i.e., a real-valued function. The atomic expressions are the references above. Composite expressions are applications of built-in arithmetic functions such as times, power, etc. The built-in functions are named as usual (as standardized by MathML). However, compared to MathML, they are applied with a simplified syntax that does not use apply elements. Apply elements are reserved for the application of user-defined functions.


	<!ENTITY % expression "
               %number-reference; |
               plus|minus|times|divide|
	       power|log|
	       sin|cos|tan|cot|
	       sinh|cosh|tanh|coth|
	       ceiling|floor|
	       delay|time|
	       ifthenelse|
  	       apply             ">

	<!ELEMENT times  (%expression;)*>
	<!ELEMENT plus  (%expression;)*>
	<!ELEMENT divide  ((%expression;),(%expression;))>
	<!ELEMENT minus ((%expression;),(%expression;)?)>

	<!ELEMENT power ((%expression;),(%expression;))>
	<!ELEMENT log ((%expression;),(%expression;))>

	<!ELEMENT sin (%expression;)>
	<!ELEMENT cos (%expression;)>
	<!ELEMENT tan (%expression;)>
	<!ELEMENT cot (%expression;)>

	<!ELEMENT sinh (%expression;)>
	<!ELEMENT cosh (%expression;)>
	<!ELEMENT tanh (%expression;)>
	<!ELEMENT coth (%expression;)>
	
	<!ELEMENT ceiling (%expression;)>
	<!ELEMENT floor (%expression;)>
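
As an illustration of the simplified syntax, the expression Vmax * S / (Km + S) could be written as in the following sketch. The names Vmax, Km, and S are hypothetical: it is assumed here that Vmax and Km are expression macros defined elsewhere in the network and that S is a variable.

```xml
<!-- Hypothetical expression: Vmax * S / (Km + S) -->
<divide>
  <!-- numerator: times takes any number of arguments -->
  <times>
    <expr id="Vmax"/>
    <var id="S"/>
  </times>
  <!-- denominator: plus takes any number of arguments -->
  <plus>
    <expr id="Km"/>
    <var id="S"/>
  </plus>
</divide>
```

In contrast to MathML, the operator is the element itself, so no apply wrapper is needed around the arguments.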

Besides the usual operators from MathML, there are the following more specific operators:

time for the identity function
delay for delays in differential equations
ifthenelse for conditionals
apply for applying a function defined in the network itself (rather than one built into MathML).


	<!ELEMENT time  (#PCDATA)>
	<!ELEMENT delay ((%expression;),(%expression;))>
	<!ELEMENT apply  ((%expression;)*)>
	<!ATTLIST apply
	       fun             CDATA     #REQUIRED
	       latex-look  CDATA     #IMPLIED >

Arithmetic expressions include ifthenelse conditionals, which make it possible to define piecewise functions. They contain Boolean expressions as conditions.

Boolean Expressions

Boolean expressions can be used to compare real numbers for equality or ordering. Furthermore, Boolean expressions are closed under the boolean operators.


	<!ENTITY % boolexpression "eq | lt  |  leq | and | or | not |  true "> 
	
	<!ELEMENT eq  ((%expression;),(%expression;))>
	<!ELEMENT lt   ((%expression;),(%expression;))>
	<!ELEMENT leq ((%expression;),(%expression;))>
	<!ELEMENT and ((%boolexpression;)*)>
	<!ELEMENT or ((%boolexpression;)*)>
	<!ELEMENT not (%boolexpression;)>
	<!ELEMENT true (#PCDATA) >

	<!ELEMENT ifthenelse ((%boolexpression;),(%expression;),(%expression;))>
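
A piecewise function can be sketched with ifthenelse as follows. The example is hypothetical: it assumes a variable named X, assumes that a constant carries its number in @value (the DTD would also allow text content), and assumes that the first expression is the value when the condition holds and the second the value otherwise.

```xml
<!-- Hypothetical piecewise function: 0 if X < 1, and X - 1 otherwise -->
<ifthenelse>
  <!-- condition: a Boolean expression -->
  <lt>
    <var id="X"/>
    <constant value="1"/>
  </lt>
  <!-- assumed "then" branch -->
  <constant value="0"/>
  <!-- assumed "else" branch -->
  <minus>
    <var id="X"/>
    <constant value="1"/>
  </minus>
</ifthenelse>
```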

Top-Level Network Elements

A Core SBML network may contain the following child elements at top level.

Variables

Variable elements may contain a kinetic-expression and a set of copies.


	<!ELEMENT variable (kinetic-expression?, %copies;)   >
	<!ATTLIST variable %variable-node;
	       type         CDATA #IMPLIED
	       compartment  CDATA #IMPLIED
	       concentration CDATA #IMPLIED>

A kinetic expression must be present if and only if @type="differential".

@type: the type of a variable must be either of species, algebraic, differential or control.

@compartment: variables of type species may have a compartment.

@concentration: the concentration of a variable of type species with a compartment is given by its concentration variable.
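
The following fragment sketches three variable declarations as they might appear among the children of a network. All identifiers and values are hypothetical. Reading the kinetic-expression of a differential variable as the right-hand side of its differential equation is an assumption of this sketch, as is carrying the number of a constant in @value.

```xml
<!-- Hypothetical species variable, placed in the graph -->
<variable id="A" type="species" initial="10.0" x="0.0" y="0.0"/>

<!-- Hypothetical differential variable with derivative 1;
     a kinetic-expression is present because the type is differential -->
<variable id="T" type="differential" initial="0.0">
  <kinetic-expression>
    <constant value="1"/>
  </kinetic-expression>
</variable>

<!-- Hypothetical control variable: constant unless changed by events -->
<variable id="dose" type="control" initial="0.0"/>
```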

Expression Macros

An expression element defines a macro for some arithmetic expression:


	<!ELEMENT expression (%expression;)>
	<!ATTLIST expression %node; >
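
For illustration, two expression macros might be declared as in the sketch below, where the second macro refers to the first through an expr element. The identifiers k1, rate, and A and the number 0.5 are hypothetical, and it is assumed that a constant carries its number in @value.

```xml
<!-- Hypothetical macro for a parameter value -->
<expression id="k1">
  <constant value="0.5"/>
</expression>

<!-- Hypothetical macro for k1 * A, reusing the macro above -->
<expression id="rate">
  <times>
    <expr id="k1"/>
    <var id="A"/>
  </times>
</expression>
```

An expression element has exactly one child, so a composite macro needs a single operator element at its root.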

Function Definitions

Function definitions have a sequence of lambda bound variables for the function's formal arguments.


	<!ELEMENT function (lambda)>
	<!ATTLIST function
            id	      CDATA     #REQUIRED
            latex-look  CDATA     #IMPLIED >
	
	<!ELEMENT lambda ((bvar | %expression;)*)>
	
	<!ELEMENT bvar  (#PCDATA)>
	<!ATTLIST bvar 
               id	      CDATA     #IMPLIED >
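
A function definition and one application of it could look roughly as follows. This is a sketch under several assumptions that the DTD alone does not settle: that bound variables are referenced in the body by var elements with the same identifier, that the body follows the bvar elements, and that @fun of an apply element holds the identifier of the function. The names saturation, x, k, S, and Km are hypothetical.

```xml
<!-- Hypothetical function: saturation(x, k) = x / (x + k) -->
<function id="saturation">
  <lambda>
    <bvar id="x"/>
    <bvar id="k"/>
    <divide>
      <var id="x"/>
      <plus>
        <var id="x"/>
        <var id="k"/>
      </plus>
    </divide>
  </lambda>
</function>

<!-- Hypothetical application: saturation(S, Km) -->
<apply fun="saturation">
  <var id="S"/>
  <expr id="Km"/>
</apply>
```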

Reactions

The kinetic law of a reaction is given by an arithmetic expression.


	<!ELEMENT kinetic-expression (%expression;)>
	<!ATTLIST kinetic-expression 
	     angle      CDATA     #IMPLIED >

A reaction may have four kinds of modifiers, which are represented by the following elements:


	<!ENTITY % modifier " modifier|inhibitor|activator|accelerator ">

modifier: a general modifier of the reaction

accelerator: an accelerator speeds up the reaction

activator: an activator is an accelerator necessary to apply the reaction

inhibitor: an inhibitor slows down the reaction

A reaction has a kinetic expression and a set of complements, which are products, reactants, or modifiers.


	<!ENTITY % complement "%modifier; | reactant | product  ">

The content of a reaction element has the following form:


	<!ELEMENT reaction  ((comment)*,(((kinetic-expression))?,(expression|%complement;)*))>
	<!ATTLIST reaction %node;
	    type         CDATA     #IMPLIED
	    candidate    CDATA    #IMPLIED  >

The expression elements in a reaction are macros that are local to the reaction's definition. They can be used to represent local parameters of SBML reactions.

A reaction is drawn as a boxed node. The fill-color of the box indicates whether the reaction is a candidate for knockout.

candidate: the presence of this attribute means that the reaction is a candidate for knockout.
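
Putting the pieces together, a reaction that turns A into B with activator E might be sketched as follows. The identifiers, coordinates, and the rate law k * A * E are hypothetical. Since only the presence of @candidate matters according to the description above, the value given to it here is an arbitrary choice.

```xml
<!-- Hypothetical reaction A -> B, activated by E -->
<reaction id="r1" x="2.0" y="1.0" candidate="true">
  <!-- the kinetic expression comes first -->
  <kinetic-expression>
    <times>
      <expr id="k"/>
      <var id="A"/>
      <var id="E"/>
    </times>
  </kinetic-expression>
  <!-- local macro, playing the role of a local parameter -->
  <expression id="k">
    <constant value="0.1"/>
  </expression>
  <!-- complements may appear in any order after the kinetic expression -->
  <reactant spec="A"/>
  <product spec="B"/>
  <activator spec="E"/>
</reaction>
```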

Events

An event has a trigger condition and a set of updates:


	<!ELEMENT event (condition, (update)*)>
	<!ATTLIST event 
            id	      CDATA     #IMPLIED  >

A trigger condition is a Boolean expression, typically a conjunction of equations and inequalities:


	<!ELEMENT condition (%boolexpression;)>
	<!ATTLIST condition >

The values of variables of type species, differential, and control can be updated, that is, variables of all types except algebraic.


	<!ELEMENT update (var, (%expression;))>
	<!ATTLIST update >
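
As a sketch, an event with a conjunctive trigger condition and a single update could be written as below. The identifiers e1, A, and dose and all numbers are hypothetical, and it is assumed that a constant carries its number in @value. The intended reading is: once 10 <= time and A < 1 hold, the control variable dose is set to 5.

```xml
<!-- Hypothetical event: if 10 <= time and A < 1, set dose to 5 -->
<event id="e1">
  <condition>
    <and>
      <leq>
        <constant value="10"/>
        <time/>
      </leq>
      <lt>
        <var id="A"/>
        <constant value="1"/>
      </lt>
    </and>
  </condition>
  <!-- an update pairs the target variable with its new value -->
  <update>
    <var id="dose"/>
    <constant value="5"/>
  </update>
</event>
```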

Edge Clusters

Edge clusters introduce extra nodes that make it possible to cluster edges between variables and reactions for a nicer graph presentation.


	<!ELEMENT edgecluster (source*)>

edgecluster: a node that clusters edges from several sources.

source: a source of an edge cluster is a species, a copy of a species, or another edge cluster that has an outgoing edge pointing to it.


	<!ATTLIST edgecluster %node;
	    type       CDATA     #REQUIRED>

The type of an edge cluster is the type of the edges that point to it. Only edges of the same type can be clustered.

@type: this value is one of the 6 types of reaction complements (see the entity %complement; above)

Reaction Complements

We next specify the contents of reaction complements which are reactants, products and modifiers.

References to Species

Each reaction complement gives a reference to some species variable. The source of an edgecluster must also contain a reference to a species variable.


	<!ENTITY % species-reference "	  
	       spec        CDATA     #IMPLIED
	       copy        CDATA     #IMPLIED
	       edgecluster CDATA     #IMPLIED" >

The values of these attributes give references to one or many species:

@spec: identifier of a species

@copy: identifier of a copy of that species

@edgecluster: reference to one or many species

Constraints:

Either @spec or @edgecluster must be present.
The attribute @copy is optional in the case where @spec is present.
In that case, if @copy is given, there must be some species $species with $species/@id=@spec and $species/copy/@id=@copy.


	<!ELEMENT source (#PCDATA) >
	<!ATTLIST source %species-reference; >
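
The three kinds of species references can be sketched together as follows. The identifiers c1, r2, A, B, B-1, and C are hypothetical. The sketch assumes that the species B declares a copy with identifier B-1, and that @edgecluster holds the identifier of an edgecluster element.

```xml
<!-- Hypothetical edge cluster that bundles two reactant edges -->
<edgecluster id="c1" type="reactant" x="3.0" y="2.0">
  <!-- reference to a species -->
  <source spec="A"/>
  <!-- reference to a copy of a species: @spec and @copy together -->
  <source spec="B" copy="B-1"/>
</edgecluster>

<!-- Hypothetical reaction whose reactants arrive through the cluster -->
<reaction id="r2">
  <reactant edgecluster="c1"/>
  <product spec="C"/>
</reaction>
```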

Reactants and Products

The two main reaction complements are reactant and product elements.


	<!ENTITY % reaction-complement "%species-reference;
              stoichiometry CDATA   #IMPLIED" >
	
	<!ELEMENT reactant      (#PCDATA) >
	<!ATTLIST reactant       %reaction-complement; >
	<!ELEMENT product       (#PCDATA) >
	<!ATTLIST product       %reaction-complement; >
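
As an illustration of the stoichiometry attribute, the reaction 2 A + B -> 3 C might be written as in the sketch below. The identifiers are hypothetical, and every stoichiometry is given explicitly because this document does not state which value is assumed when the attribute is omitted.

```xml
<!-- Hypothetical reaction 2 A + B -> 3 C -->
<reaction id="r3">
  <reactant spec="A" stoichiometry="2"/>
  <reactant spec="B" stoichiometry="1"/>
  <product spec="C" stoichiometry="3"/>
</reaction>
```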

Modifiers

Reactions may have four kinds of modifiers: (generic) modifiers, inhibitors, activators, and accelerators. None of these intervenes in the formal semantics, but they are all relevant for the graphical representation of a reaction network.


	<!ELEMENT modifier   (#PCDATA) >
	<!ATTLIST modifier    %species-reference; >
	<!ELEMENT inhibitor   (#PCDATA) >
	<!ATTLIST inhibitor    %species-reference; >
	<!ELEMENT activator   (#PCDATA) >
	<!ATTLIST activator    %species-reference; >
	<!ELEMENT accelerator   (#PCDATA) >
	<!ATTLIST accelerator %species-reference; >

Comments

Comments are given in LaTeX format.


	<!ELEMENT comment      (#PCDATA)>

	<!ATTLIST comment 
            latex	      CDATA     #IMPLIED 
            experiment   CDATA     #IMPLIED
  	    prediction     CDATA     #IMPLIED >