Piggy's Blog

Ramblings from the tech to human world and back


Protecting privacy and confidentiality in data and communications

One can never talk enough about the security of communication and privacy, especially when political instability is driving leaders towards decisions that will affect people on a global scale. Those who expect to read about data science and machine learning will have to wait for the next post: the content of this one is equally important, if not more so. Motivation: As a matter of fact, people are cultivating very bad habits in the way they communicate. Many are putting...
[Read More]

Ahem detector with deep learning

Do you know why you can’t hear the ugly ahem sounds on the podcast Data Science at Home? Because we remove them. Actually, not us: a neural network does. Let me introduce the ahem detector, a deep convolutional neural network trained on transformed audio signals to recognize “ahem” sounds. The network has been trained to detect such sounds in the episodes of Data Science at Home, the podcast about data science at worldofpiggy.com/podcast. Project description: Slides and technical...
[Read More]
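
As a rough illustration of what "transformed audio signals" in the ahem detector excerpt above could mean, here is a minimal sketch that turns a raw waveform into a magnitude spectrogram, a 2-D array that a convolutional network can treat like an image. The excerpt does not say which transform the project uses, so the short-time Fourier transform, the Hann window, the frame size and the hop length are all assumptions, and the function names are hypothetical.

```python
import cmath
import math


def hann(n):
    """Hann window of length n (assumed choice of window)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def magnitude_spectrum(frame):
    """Magnitudes of the non-negative frequency bins of a naive DFT."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectrogram(signal, frame_size=64, hop=32):
    """Slice the signal into overlapping windowed frames, one spectrum per frame."""
    window = hann(frame_size)
    rows = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        frame = [s * w for s, w in zip(signal[start:start + frame_size], window)]
        rows.append(magnitude_spectrum(frame))
    return rows


# Synthetic one-second test tone at 128 Hz (stands in for real podcast audio).
sample_rate = 1024
tone = [math.sin(2 * math.pi * 128 * i / sample_rate) for i in range(sample_rate)]
spec = spectrogram(tone)

# Each row is one time frame, each column one frequency bin.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
print(len(spec), len(spec[0]), peak_bin)
```

A real pipeline would use an FFT library rather than this naive DFT; the point is only the shape of the data (time frames by frequency bins) that a detector of this kind could be trained on.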

Word Embedding explained in one slide

Word embeddings are one of the most powerful concepts of deep learning applied to Natural Language Processing. Every word of a dictionary (the set of words recognized for the specific task) is transformed into a numeric vector with a fixed number of dimensions. Everything else (classification, semantic analysis, etc.) is then done on those vectors. Here is a slide that explains this with a bit of algebra and some user-friendly text. Feel free to download and...
[Read More]
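
The word-to-vector mapping described in the word embedding excerpt above can be sketched in a few lines. This is only an illustration under stated assumptions: the toy sentences, the 4-dimensional vectors and the random initialisation are made up, and the helper names are hypothetical. In a real model the vectors are learned from data rather than drawn at random.

```python
import math
import random


def build_vocabulary(sentences):
    """Map each distinct word to an integer index (the 'dictionary')."""
    vocab = {}
    for sentence in sentences:
        for word in sentence.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab


def init_embeddings(vocab_size, dim, seed=0):
    """One vector of `dim` numbers per word; random here, learned in practice."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(vocab_size)]


def embed(word, vocab, embeddings):
    """Look up the numeric vector that represents a word."""
    return embeddings[vocab[word]]


def cosine_similarity(u, v):
    """Compare two word vectors; downstream tasks work on vectors like these."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy corpus (assumption, for illustration only).
sentences = ["the cat sat on the mat", "the dog sat on the rug"]
vocab = build_vocabulary(sentences)
embeddings = init_embeddings(len(vocab), dim=4)

vector = embed("cat", vocab, embeddings)
print(len(vocab), len(vector))
print(cosine_similarity(vector, embed("dog", vocab, embeddings)))
```

Once every word is a vector, classification or semantic comparison reduces to arithmetic on those vectors, which is the step the excerpt refers to.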

What I discovered by analyzing my favourite Twitter accounts

As some might already know, I am a heavy Twitter user. There you can find many of my thoughts, findings and ideas, or at least the ones I share with my followers. I also get inspired by the many researchers who, like me, use Twitter on a daily basis. As a data scientist I constantly analyze my own data, from my Fitbit to my Garmin, emails, documents and of course my Twitter timeline. Which is exactly what I am writing about in...
[Read More]

Way to git

If you buy software, just skip this post. If you make software and need to maintain it, here is a very brief overview of what you might be using (or should be using) to maintain your code properly, track and fix bugs, and work with production and testing environments without making a mess. In one word: Git. In this PDF I summarize what you would like to do and what really happens behind the scenes. Enjoy!
[Read More]

A not so short introduction to deep learning NLP

I collected some thoughts and findings about the magic of deep learning applied to text analytics, with a main focus on generative models and word embeddings. I also played with some code and paved the way for my next toy, which I will present soon ;) Feel free to share! Enjoy the presentation and get in touch if needed. Deeplearning NLP from Francesco Gadaleta. Check the slides: Deeplearning nlp
[Read More]