This document clarifies the main aspects behind the development of the Cyni Toolbox App. It describes all elements of the Cyni API in detail so that future app developers can better understand the requirements and the possibilities of the framework that this new app provides.
1. Cyni Components
Cyni is composed of two main components: Algorithms and Metrics. They are completely independent of each other and are managed independently. An algorithm can do its job without using a metric, and a metric can be used without an algorithm. However, if an algorithm wants to use a metric, it must pass data to the metric in a form the metric understands so that the metric can calculate the result. To meet this need, a third component called CyniTable has been created. A CyniTable is an intermediate table with less functionality than a Cytoscape table but with other advantages, such as fast access to the data. An algorithm that wants to use a metric has to ask the selected metric to provide the CyniTable to be used with that metric. The next subsections describe these three components in more detail.
Initially, Cyni was conceived only as a tool for inferring networks. Its development has shown, however, that network inference involves many other techniques that are not directly related to inference and that can also be useful on their own. What all these techniques have in common, inference techniques and others alike, is that they implement an algorithm that manipulates data.
Therefore, the first component of Cyni is the Algorithm, which takes Cytoscape data as input and performs some tasks to produce an output such as a network or a table. An algorithm may require other input parameters, and it is up to the algorithm developer to request these parameters through Cytoscape Tunables. How to define a new Algorithm is explained in more detail in the Cyni Algorithm Definition section.
Another component of high importance for network inference, and for any technique that compares data, is the Metric or Measurement. There are many kinds of metrics that compare data and return a value reflecting the level of similarity of that data. The Metrics component makes all these measurements available so that users do not need to develop measurements that other users have already implemented. The implementation of any new metric has to follow some requirements to make it compatible with the existing ones. The Cyni Metric Definition section explains how new metrics have to be developed.
1.3. Cyni Table
After biological data is loaded into Cytoscape, it is stored in tables called CyTables. Cyni Algorithms need this data to produce their results, but accessing it through CyTables can be very time consuming if it is done very often. Besides, in many cases not all the data in a CyTable is required, so using CyTables directly is not the most efficient way to support the work of Cyni Algorithms.
CyniTable provides a simple structure to work with this data. It contains only the required data along with several functions to access this data quickly. A CyniTable consists of an internal table and a set of methods to access the different elements of this data, such as its values according to the desired type, its column names, or whether a row has missing values. To build a CyniTable, the following inputs are needed: a CyTable, the names of the columns to be loaded into the CyniTable, and some other configuration parameters. Once a CyniTable is constructed, it becomes a completely separate element, and any modification of its data does not modify the CyTable it came from.
The goal of CyniTable is to facilitate the work of any component that needs to work with that data, but also to provide a structure so that Metrics and Algorithms can work together.
CyniTable is always open to improvements in the functionality it provides, but its current methods will be maintained to keep backward compatibility with previous versions.
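The idea of a detached, fast-access copy can be illustrated with a minimal sketch. It is not the real CyniTable: the class name TableSnapshot, its methods, and the use of a plain map in place of a CyTable are all assumptions made only for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a CyniTable-like snapshot; NOT the real Cyni class.
class TableSnapshot {
    private final List<String> columnNames;
    private final Double[][] values; // values[row][col]; null marks a missing value

    // Copies only the requested columns out of the source table.
    // The source is a plain map (column name -> column values), an assumption
    // standing in for a Cytoscape CyTable.
    TableSnapshot(Map<String, List<Double>> source, List<String> wantedColumns) {
        this.columnNames = Collections.unmodifiableList(new ArrayList<>(wantedColumns));
        int rows = wantedColumns.isEmpty() ? 0 : source.get(wantedColumns.get(0)).size();
        this.values = new Double[rows][wantedColumns.size()];
        for (int c = 0; c < wantedColumns.size(); c++) {
            List<Double> column = source.get(wantedColumns.get(c));
            for (int r = 0; r < rows; r++) {
                values[r][c] = column.get(r);
            }
        }
    }

    int rowCount() { return values.length; }

    List<String> columnNames() { return columnNames; }

    // Callers should check rowHasMissingValue first: a missing cell is null.
    double doubleValue(int row, int col) { return values[row][col]; }

    // Changes only the snapshot, never the source table.
    void setValue(int row, int col, double v) { values[row][col] = v; }

    boolean rowHasMissingValue(int row) {
        for (Double v : values[row]) {
            if (v == null) return true;
        }
        return false;
    }
}
```

Because the constructor copies the selected columns into an array, later reads are plain array lookups, and writes to the snapshot leave the source untouched.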
2. Cyni Elements Classification
The two main components in Cyni (Algorithms and Metrics) need to classify their elements so that users can access them according to their needs. As explained in the previous section, these two components differ in functionality and specifications, which makes it necessary to use a different classification strategy for the elements of each component. The next subsections explain the strategy used for each component.
2.1. Algorithm Categories
Cyni Algorithms are divided into three categories according to the functionality that they offer. These categories cannot be extended by the user. The set could be extended if another functionality is proposed in the future, but this option is reserved to Cyni developers. Each algorithm has a variable that stores its category, and an algorithm has exactly one category. Therefore, an algorithm can only be classified in one of the proposed categories. The three categories available are:
- INDUCTION: All algorithms in this category implement an algorithm that intends to infer a network, so the output should be a new network if everything works well.
- IMPUTATION: Algorithms in this category are focused on estimating data when there are missing values in the input data. Their output should modify the input table with the new calculated values.
- DISCRETIZATION: In this category, the algorithm's objective is to discretize continuous values into nominal values. These continuous values are usually contained in a table column so the output of these algorithms should produce another column with the new nominal values.
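A fixed, single-valued classification of this kind maps naturally onto a Java enum. The sketch below is only an illustration: the names AlgorithmCategory and AlgorithmInfo are hypothetical and do not come from the Cyni API.

```java
// Hypothetical names mirroring the three categories listed above.
enum AlgorithmCategory { INDUCTION, IMPUTATION, DISCRETIZATION }

// Hypothetical descriptor: an algorithm stores exactly one category.
class AlgorithmInfo {
    private final String name;
    private final AlgorithmCategory category; // final: cannot change after creation

    AlgorithmInfo(String name, AlgorithmCategory category) {
        this.name = name;
        this.category = category;
    }

    String getName() { return name; }

    AlgorithmCategory getCategory() { return category; }

    // Summarizes the kind of output each category is expected to produce.
    static String expectedOutput(AlgorithmCategory category) {
        switch (category) {
            case INDUCTION:
                return "new network";
            case IMPUTATION:
                return "input table with missing values filled in";
            case DISCRETIZATION:
                return "new column of nominal values";
            default:
                throw new IllegalArgumentException("Unknown category: " + category);
        }
    }
}
```

Since the category is an enum constant held in a final field, an algorithm cannot belong to two categories or to a category that Cyni does not define.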
2.2. Metric Tags
The second classification found in Cyni is more flexible and extensible than the previous one. The variety of available metrics is so large that it is very difficult to define a fixed classification that would include all possibilities. Therefore, Cyni Metric Tags provide an extensible way to classify existing and newly developed metrics, and a metric is not restricted to the default tags when defining its properties.
In this classification, each tag is represented by a string. The advantage of using strings to define tags is that users developing new metrics will also be able to add their new tags. The way to define a new metric along with its tags is explained in Cyni Metric Definition section.
Another difference from the classification of Algorithms is that a metric is not restricted to a single classifier: a list of tags can be used to classify a metric.
Cyni already offers some Metric tags to classify new metrics, and they are also used to classify the Metrics already included in the Cyni Toolbox. The list of tags available by default so far is:
- INPUT_NUMBERS: Metrics that work with numbers
- INPUT_STRINGS: Metrics that work with strings
- LOCAL_METRIC_SCORE: Metrics that have in common that the final value can be decomposed as the sum or product of the score of each individual node.
- INFORMATION_THEORY: Metrics implementing a measure related to information theory.
- CONTINUOUS_VALUES: Metric works with continuous values.
- DISCRETE_VALUES: Metric works with discrete values.
- BAYESIAN_METRIC: Metric to be used with Bayesian algorithms.
- DIRECTIONAL_METRIC: Metric that produces different values depending on the order, so metric(X,Y) might be different from metric(Y,X).
- CORRELATION_METRIC: Metrics based on searching for any statistical relationship between two random variables or two sets of data.
- K2_METRIC: Metrics with this tag will be available through the K2 Bayesian Inference algorithm.
- HILL_CLIMBING_METRIC: Metrics with this tag will be available through the Hill Climbing Bayesian Inference algorithm.
- LOW_METRIC: Metric whose output is more significant the lower it is. By default, a metric's output values are more significant when they are high; metrics carrying this tag reverse that convention.
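As a rough sketch of how string tags can drive behavior, the hypothetical class below (TaggedMetric is not a Cyni class) stores a set of tags, accepts a user-defined tag next to the default ones, and uses LOW_METRIC to decide which of two scores is more significant. The tag MY_LAB_TAG in the usage note is invented for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical holder for a metric name and its tags; NOT the real Cyni API.
class TaggedMetric {
    final String name;
    final Set<String> tags;

    // Tags are plain strings, so default and user-defined tags can be mixed.
    TaggedMetric(String name, String... tags) {
        this.name = name;
        this.tags = new HashSet<>(Arrays.asList(tags));
    }

    boolean hasTag(String tag) { return tags.contains(tag); }

    // Returns true when score a is more significant than score b.
    // High values win by default; LOW_METRIC reverses the comparison.
    boolean isMoreSignificant(double a, double b) {
        return hasTag("LOW_METRIC") ? a < b : a > b;
    }
}
```

For instance, new TaggedMetric("distance", "INPUT_NUMBERS", "LOW_METRIC", "MY_LAB_TAG") would treat 0.1 as more significant than 0.9, while a metric without LOW_METRIC would rank them the other way round.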
3. Cyni Model Design
The Cyni development model is based on the principles of the Cytoscape 3 model. The prefixes Abstract and Cy are also used and follow the same concepts. The goal of the Cyni API is to provide a simple and complete programming interface that lets app developers focus on the development of their algorithms. The Cyni API will remain unchanged unless a functionality issue appears. The next subsections define the main elements of the Cyni API.
3.1. AbstractCyni Classes
AbstractCyni classes provide an abstract implementation of an interface or implementations of other useful methods. Any app developer who wants to contribute new algorithms or metrics to Cyni has to extend these classes and implement the corresponding interfaces.
The objective of these classes is to give users a predefined structure that only needs to be completed with the requirements of their own algorithms. The list of AbstractCyni classes and their main functionality is:
AbstractCyniAlgorithm: It implements the CyCyniAlgorithm interface, which means that any external component will have to go through this class to use the algorithm. It also stores the main configuration parameters such as category and name.
AbstractCyniTask: It contains the implementation of the algorithm and produces the final result.
AbstractCyniMetrics: It contains the implementation of the metric and the list of its types (tags).
3.2. CyCyni Interfaces
Cyni interfaces follow the same prefix terminology as Cytoscape 3: classes with the prefix Cy correspond to interface definitions. The methods listed in the Cyni interfaces define how newly proposed Cyni elements interact with any other Cyni or Cytoscape element. Therefore, each of these methods has to be implemented when a new element is developed for one of the two Cyni components (Algorithms and Metrics). The methods to implement differ according to the type of the new element, because there is one interface for each main component in Cyni. The names of these interfaces are:
The Cyni Context class provides several methods that Cyni uses to correctly display the user interface needed to set input parameters. These methods may be overridden to provide different functionality, but a Cyni Context class always needs to implement them. This class also provides utility methods to get information about the chosen table data, such as the names of its columns.
3.4. Cyni Utils Classes
The Cyni API contains some Utils classes in which several methods are grouped according to their functionality. Any developer can get an instance of these classes and use any of their methods. So far, there are the following two Utils classes.
CyniNetworkUtils: The goal of methods found in this class is to help with anything related to the generation and the display of a network as well as the generation of the tables related to that network.
CyniBayesianUtils: This class provides methods that intend to help with the implementation of algorithms to infer Bayesian Networks.
The CyniCategories class defines the available categories for Cyni Algorithms. All these categories are defined using a Java enum type, so they cannot be modified dynamically.
CyniMetricTags is a Java enum class because its purpose is to define the default tags provided by Cyni, which are not modifiable by users. This enum could be updated in future API versions to make available to all users the tags that other metric developers are already proposing in their own apps.
3.7. Cyni Events
Cyni also provides its own events to notify external components that something has happened in Cyni. Currently, these events report that a new algorithm has been added to the Cyni Toolbox or that an algorithm has been deleted from the list of Cyni algorithms.
4. Cyni Managers
As explained in previous sections, Cyni stores and manages elements for the two main components (Algorithms and Metrics). Any external component should be able to add new elements to these two groups. At the same time, other internal or external components might want access to one or several elements stored in Cyni.
The Cyni framework provides two managers that fulfill these expectations. The two types of components have independent managers because their internal elements are classified differently. Therefore, even though both managers provide functions to retrieve one or several stored elements, requests to each manager must pass a different type of information to get the correct elements. The interfaces used to communicate with each type of manager are explained below.
4.1. Cyni Algorithm Manager
An algorithm is a component that is classified into one of three fixed categories. Therefore, when a new algorithm is loaded into the manager, the category of the algorithm is verified and the algorithm is stored in one of the three possible groups. If the category of the algorithm is not one of these three categories, the manager does not store the algorithm. How to load a new algorithm into Cyni is explained more extensively in the Cyni Algorithm Definition section.
CyCyniAlgorithmManager provides an interface that allows external components to retrieve algorithms stored in the algorithm manager. This interface has several methods to request one or several algorithms from the manager. The name of the algorithm and its category are the information used to request the desired algorithms. The methods available in this interface are:
GetCyniAlgorithm: It provides a specific algorithm given its name and category.
GetAllCyniAlgorithms: It provides all stored algorithms that belong to a specific category.
GetAllCyniAlgorithmNames: It provides the names of all stored algorithms that belong to a specific category.
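The lookup behavior described above can be sketched with one map per category. This is a simplified illustration under assumed names (ManagedCategory, SimpleAlgorithm, AlgorithmRegistry); it is not the CyCyniAlgorithmManager implementation, and its method names only echo the ones listed above.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical category enum and algorithm descriptor for this sketch only.
enum ManagedCategory { INDUCTION, IMPUTATION, DISCRETIZATION }

class SimpleAlgorithm {
    final String name;
    final ManagedCategory category;

    SimpleAlgorithm(String name, ManagedCategory category) {
        this.name = name;
        this.category = category;
    }
}

// Hypothetical registry: one group of algorithms per fixed category.
class AlgorithmRegistry {
    private final Map<ManagedCategory, Map<String, SimpleAlgorithm>> byCategory =
            new EnumMap<>(ManagedCategory.class);

    AlgorithmRegistry() {
        for (ManagedCategory c : ManagedCategory.values()) {
            byCategory.put(c, new LinkedHashMap<>());
        }
    }

    // Stores the algorithm in its category group; rejects it when no valid
    // category is given.
    boolean addAlgorithm(SimpleAlgorithm algorithm) {
        if (algorithm == null || algorithm.category == null) return false;
        byCategory.get(algorithm.category).put(algorithm.name, algorithm);
        return true;
    }

    // Returns null when no algorithm with that name exists in the category.
    SimpleAlgorithm getAlgorithm(String name, ManagedCategory category) {
        return byCategory.get(category).get(name);
    }

    List<SimpleAlgorithm> getAllAlgorithms(ManagedCategory category) {
        return new ArrayList<>(byCategory.get(category).values());
    }

    List<String> getAllAlgorithmNames(ManagedCategory category) {
        return new ArrayList<>(byCategory.get(category).keySet());
    }
}
```

Keying the outer map by category means a request always needs both pieces of information, the name and the category, which matches the way algorithms are requested from the manager.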
4.2. Cyni Metrics Manager
Metrics, unlike algorithms, can be classified under several types (tags) at the same time. When a new metric is loaded into the manager, it is not stored in a specific group; instead, a single general list of metrics is maintained. However, each metric contains a list of tags that define it. This list of tags is the information that allows the metric manager to differentiate one metric from another. In this case, the tags to be used are not restricted, and as explained in later sections, any user can propose their own tags.
CyCyniMetricsManager provides an interface that allows external components to retrieve metrics stored in the metric manager. This interface has several methods to request one or several metrics from the manager. The information used to request metrics is the name of the metric and the list of tags that the requested metric should at least support. The methods available in this interface are:
GetCyniMetric: It provides a specific metric given its name.
GetAllCyniMetrics: It provides the list of all metrics stored in the manager.
GetAllCyniMetricsWithTags: It provides the metrics that have at least all the tags in the list provided by the requester.
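The "at least these tags" rule amounts to a subset test over a single flat list. The following sketch uses hypothetical classes (MetricEntry, MetricRegistry) rather than the real CyCyniMetricsManager, and is only meant to show that matching logic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical metric descriptor: a name plus any number of string tags.
class MetricEntry {
    final String name;
    final Set<String> tags;

    MetricEntry(String name, String... tags) {
        this.name = name;
        this.tags = new HashSet<>(Arrays.asList(tags));
    }
}

// Hypothetical registry keeping one general list of metrics, with no groups.
class MetricRegistry {
    private final List<MetricEntry> metrics = new ArrayList<>();

    void addMetric(MetricEntry metric) { metrics.add(metric); }

    // Returns the first metric with the given name, or null if none matches.
    MetricEntry getMetric(String name) {
        for (MetricEntry m : metrics) {
            if (m.name.equals(name)) return m;
        }
        return null;
    }

    List<MetricEntry> getAllMetrics() { return new ArrayList<>(metrics); }

    // A metric matches when its tags include every requested tag;
    // it may carry additional tags as well.
    List<MetricEntry> getAllMetricsWithTags(List<String> requiredTags) {
        List<MetricEntry> result = new ArrayList<>();
        for (MetricEntry m : metrics) {
            if (m.tags.containsAll(requiredTags)) result.add(m);
        }
        return result;
    }
}
```

An empty list of required tags matches every metric in this sketch, which makes it equivalent to asking for all stored metrics.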
This interface also provides two other methods that allow users to get or set a default metric. Thereby, users do not need to specify any extra information to get a metric. These two methods are:
5. Cyni Algorithm Definition
To define a new Cyni Algorithm, the three abstract classes related to Cyni Algorithms need to be extended. Each of the resulting classes performs a specific function.
A class extending AbstractCyniAlgorithm needs to provide the name and the category of the new algorithm, along with an implementation of the two methods that generate instances of the other two extended classes. The chosen category must be one of the categories available in the CyniCategories class.
CyniAlgorithmContext, in turn, is a class that is extended with the information about any input parameter that the algorithm needs. These parameters can be defined along with their Tunables, so that Cytoscape correctly manages their display when a dialog is created for users to modify them. Alternatively, the app developer has to provide a Java Swing panel along with the parameter definitions, so that users can set and modify the input parameters.
Finally, the class extending AbstractCyniTask contains the actual implementation of the algorithm and uses the input parameters set through the CyniAlgorithmContext to run it. The relationship between these three classes is represented in the flow graph of the figure shown below.
As the figure shows, AbstractCyniAlgorithm is the class that controls the other two. The main reason is that it implements the CyCyniAlgorithm interface, so any external element that wants to use a Cyni Algorithm needs to use the functions implemented by the class extending this abstract class. Classes extending AbstractCyniAlgorithm do not perform the final task themselves; they generate the instances of the other two classes, which manage the input parameters and implement the real algorithm.
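The division of work between the three classes can be illustrated with a self-contained sketch. All types below (SketchAlgorithm, ThresholdContext, ThresholdTask, ThresholdAlgorithm) are hypothetical stand-ins; the real abstract classes have their own method names and signatures, and real parameters would typically be exposed as Cytoscape Tunables rather than plain fields.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for the controlling class: it holds the name and the
// category, and creates the other two objects.
abstract class SketchAlgorithm<C> {
    private final String name;
    private final String category;

    SketchAlgorithm(String name, String category) {
        this.name = name;
        this.category = category;
    }

    String getName() { return name; }

    String getCategory() { return category; }

    // The two factory methods a concrete algorithm has to implement.
    abstract C createContext();

    abstract Supplier<String> createTask(C context);
}

// Hypothetical context: holds the input parameters of the algorithm.
class ThresholdContext {
    double threshold = 0.5; // default value, editable by the user before the run
}

// Hypothetical task: the actual work, driven by the context's parameters.
class ThresholdTask implements Supplier<String> {
    private final ThresholdContext context;

    ThresholdTask(ThresholdContext context) { this.context = context; }

    @Override
    public String get() {
        return "network built with threshold " + context.threshold;
    }
}

class ThresholdAlgorithm extends SketchAlgorithm<ThresholdContext> {
    ThresholdAlgorithm() { super("threshold-demo", "INDUCTION"); }

    @Override
    ThresholdContext createContext() { return new ThresholdContext(); }

    @Override
    Supplier<String> createTask(ThresholdContext context) {
        return new ThresholdTask(context);
    }
}
```

A caller would ask the algorithm for a context, let the user adjust its fields, and then pass that context back to obtain the task to run; the algorithm class itself never performs the computation.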
6. Cyni Metric Definition
To define a new Cyni Metric, the AbstractCyniMetrics class needs to be extended. The new class provides two important elements. The first is a list of tags that define the metric. That list can contain the default metric tags provided by the CyniMetricTags class or any other string that describes the new metric. These tags are strings that define the new metric and are also used to group it with other metrics that share the same properties (strings). The second element is the implementation of the metric itself, which has to be developed in the method getMetric. The input parameters of this method are a CyniTable and the indexes of the rows to compare. Since there are many types of metrics, a metric may need to compare one row against several rows, so the second index passed as an input parameter is not a single index but a list of indexes.
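A minimal sketch of such a metric follows, under clearly stated assumptions: a plain double[][] array stands in for the CyniTable, the class EuclideanSketchMetric is hypothetical, the exact signature of the real getMetric is not reproduced, and the choice of the mean Euclidean distance (with the LOW_METRIC tag) is an example rather than a metric taken from Cyni.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical metric; the double[][] parameter stands in for a CyniTable.
class EuclideanSketchMetric {
    // Tags chosen for this example; a distance is most significant when low.
    final List<String> tags =
            Arrays.asList("INPUT_NUMBERS", "CONTINUOUS_VALUES", "LOW_METRIC");

    // Compares the row at 'index' against every row in 'otherIndexes' and
    // returns the mean Euclidean distance to those rows.
    double getMetric(double[][] table, int index, List<Integer> otherIndexes) {
        if (otherIndexes.isEmpty()) {
            throw new IllegalArgumentException("At least one row to compare is required");
        }
        double total = 0.0;
        for (int other : otherIndexes) {
            double squaredSum = 0.0;
            for (int c = 0; c < table[index].length; c++) {
                double diff = table[index][c] - table[other][c];
                squaredSum += diff * diff;
            }
            total += Math.sqrt(squaredSum);
        }
        return total / otherIndexes.size();
    }
}
```

Passing a list with a single index gives the usual pairwise comparison, while a longer list lets the same method compare one row against several rows.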