Are we far from correctly inferring gene interaction networks with LASSO?

Detecting the interactions among genetic components such as genes, SNPs, proteins and metabolites can potentially unravel the mechanisms behind complex traits and common genetic disorders. Several methods have been considered for the analysis of different types of genetic data, regression being one of the most widely adopted. A common data type is the gene expression profile, from which gene regulatory networks have been inferred with different approaches. In this work we review nine penalised regression methods applied to microarray data to infer the topology of the network of interactions. We evaluate each method with respect to the complexity of biological data. We analyse the limitations of each one in order to suggest a number of precautions that should be taken to make their predictions more significant and reliable.

Read the full paper free of charge.

Do It Yourself Principal Component Analysis

One nice aspect of Principal Component Analysis is that it is relatively easy to implement.
The main idea is:

  1. Project some data onto the direction of maximal variation
  2. Select a number of principal components which is definitely smaller than the original number of features
  3. Classify the projected data (or reconstruct the original data from there, in a form similar to compression)

All of this is usually implemented by someone else in R, Python, C, Java, etc. and “regular” people just call their favorite function to do the dirty job. In R this is done by prcomp and by functions in several other packages.
In this post I am showing how you can do it in 3 lines of code or, for the brave ones, by hand.
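
To give a flavour of the "by hand" route, here is a minimal sketch of the three steps listed above. It is written in plain Python rather than R, uses no libraries, and is not the code from the full post. The function names (center, covariance, first_component, project, reconstruct) and the toy data are made up for illustration, and power iteration is just one simple way to find the direction of maximal variation. It recovers only the first principal component.

```python
import math

def center(data):
    # Subtract the mean of each feature (column) from every row.
    n, d = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in data]
    return centered, means

def covariance(centered):
    # Sample covariance matrix (divides by n - 1) of already centered data.
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def first_component(cov, iterations=200):
    # Power iteration: repeatedly multiply a unit vector by the covariance
    # matrix and renormalise. Assumption: the starting vector is not
    # orthogonal to the leading eigenvector, otherwise this will not converge
    # to it.
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # zero variance along v: nothing more to do
        v = [x / norm for x in w]
    return v

def project(centered, direction):
    # Step 1 and 2: one score per observation along the chosen direction.
    return [sum(a * b for a, b in zip(row, direction)) for row in centered]

def reconstruct(scores, direction, means):
    # Step 3 (compression flavour): map the scores back to feature space.
    return [[m + s * u for m, u in zip(means, direction)] for s in scores]

# Hypothetical toy data: four points lying exactly on the line y = 2x.
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
centered, means = center(data)
direction = first_component(covariance(centered))
scores = project(centered, direction)
approx = reconstruct(scores, direction, means)
print(direction)
print(approx)
```

Because the toy points lie on a single line, the direction found is proportional to (1, 2) and one component is enough to reconstruct the data. On real data the reconstruction from fewer components is only approximate.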


The philosophy behind Statistics

Sometimes I stop doing math and start doing philosophy. Some other times it’s really hard to distinguish between the two.
The fact is that today I am reflecting on two fundamental concepts that gave rise to a very heated debate in the world of statistics. Those of you who are into stats probably already know what I am talking about: the dilemma between frequentists and Bayesians.
Since many researchers don’t say hi to each other, just like those who support rival football teams or political parties, I think that it would be nice to write about my position on the matter. If that matters at all.


How to export all R packages from home to work (or anywhere else)

When I change computer, department or workstation, I like to take my arsenal with me, which usually consists of my skills. Some of the working tools that I need on a regular basis are installed in the form of R packages.
Here is a painless way to export the packages currently installed on one machine and import them on another, quickly and easily.
