Tag Archives: big data
Big Data. Everybody knows what it is. Except that nobody actually does.
Big Data is a big thing indeed. The way companies and research institutes approach data analysis, compared with the trends of a few years ago, is changing the entire scene of industry, marketing and clinical … Continue reading
Posted in General
Tagged analytics, big data, business, companies, healthcare, statistics
2 Comments
Discovering Main Genetic Interactions with LABNet LAsso-Based Network Inference
Genome-wide association studies can potentially unravel the mechanisms behind complex traits and common genetic diseases. Despite the valuable results produced thus far, many questions remain unanswered.
Posted in General
Tagged big data, genetics, graph, lasso, mathematics, networks, statistics
Leave a comment
Increase production in a cooperative research environment: installing and setup of Git server
Disclaimer: In this post you will not find any math, any formula, or any deep thought about science. This is a how-to post that nonetheless helps us (and other scientists) do science. When it is time to cooperate with colleagues, coding … Continue reading
Posted in General
Tagged big data, collaboration, computer, concurrent, git, setup, ubuntu, versioning
Leave a comment
Why is Principal Component Analysis a bust for genetics
I am not a fan of Principal Component Analysis. At least not for big data analysis. I still find it awkward to apply such concepts to, e.g., genetic data. As a brief explanation, for given variables, a principal component is … Continue reading
Posted in Statistics and R
Tagged big data, genetics, PCA, Principal Component Analysis, R, statistics
2 Comments
Bonferroni is too stringent, fine. Go FDR!
Multiple testing in statistics is a fairly mature topic to write about in 2014. Yet when multiple testing is applied to the analysis of big data, it may still lead to inconsistencies and issues that are usually not expected from … Continue reading
Posted in General
Tagged big data, bonferroni, false discovery rate, family wise error rate, multiple testing
3 Comments