Guide To Statistics

Please find below some of the most popular articles I've written on Guide To Statistics.

Table of Contents

A brief introduction to Weka

Advertisers used by banned sellers in Flippa auctions

Book Review: R Cookbook

Detecting Pitches in music with R

Finding the beat in R

Implementing k-means in Scala



A brief introduction to Weka

Weka is a GPL-licensed data mining tool written in Java and developed at the University of Waikato. It includes an extensive set of pre-implemented machine learning algorithms, including well-known classification and clustering algorithms. If you’ve ever been curious how Bayes’ theorem works, this is a great tool for getting up and running. Weka uses a [...] Read More...

Advertisers used by banned sellers in Flippa auctions

In a previous post, I listed the top Flippa advertisers, gathered with the Node.js web scraper. Which advertisers are mentioned most often in auctions by banned sellers? As you can see, there is a big drop in the “unknown” category, and a big increase in banned accounts associated with Infolinks and CJ. After visual inspection, [...] Read More...

Book Review: R Cookbook

The R Cookbook is written by Paul Teetor, a developer with degrees in statistics and computer science, specializing in finance. The programming language R is a specialized language designed for deep statistical research, although it has some support for other mathematical fields, such as matrix algebra and signal processing. True to the O’Reilly cookbook format, [...] Read More...

Detecting Pitches in music with R

In a previous post, I described a method to detect a chord using a Fourier transform in Java/Scala. I’ve re-implemented the same in R, detailed below. This will generate an audio file containing the C major chord:

    library(sound)
    # Note frequencies in Hz (C4, with E3 and G3 below it)
    c<-261.63
    e<-164.81
    g<-196
    len<-1   # duration in seconds
    cData<-Sine(c,len)
    eData<-Sine(e,len)
    gData<-Sine(g,len)
    # Sum the three tones and scale the result before writing it out
    audio<-normalize(cData+eData+gData)
    saveSample(audio, "out\\ceg.wav", overwrite=TRUE)

And a series of helper functions:

    magnitude<-function(x)

[...] Read More...
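The excerpt stops before the detection step, so the following is only a rough sketch of the general idea, written in Python with the standard library rather than in R: build the same three-tone signal, take a naive discrete Fourier transform, and report the strongest spectral peaks. The function names, the 1000 Hz sample rate and the peak-picking rule are all assumptions made for illustration and are not taken from the article.

```python
import cmath
import math

def synthesize(freqs, rate, seconds):
    """Sum of unit-amplitude sine waves sampled at `rate` Hz (hypothetical helper)."""
    n = int(rate * seconds)
    return [sum(math.sin(2 * math.pi * f * t / rate) for f in freqs)
            for t in range(n)]

def dft_magnitudes(samples, max_bin):
    """Magnitude of DFT bins 0..max_bin-1, computed naively (slow but dependency-free)."""
    n = len(samples)
    mags = []
    for k in range(max_bin):
        step = -2j * math.pi * k / n
        total = sum(x * cmath.exp(step * t) for t, x in enumerate(samples))
        mags.append(abs(total))
    return mags

def strongest_peaks(mags, count):
    """Indices of the `count` largest local maxima, returned in ascending order."""
    peaks = [k for k in range(1, len(mags) - 1)
             if mags[k] > mags[k - 1] and mags[k] >= mags[k + 1]]
    peaks.sort(key=lambda k: mags[k], reverse=True)
    return sorted(peaks[:count])

if __name__ == "__main__":
    rate = 1000  # assumed sample rate; high enough for tones below 500 Hz
    tone = synthesize([261.63, 164.81, 196.0], rate=rate, seconds=1.0)
    mags = dft_magnitudes(tone, max_bin=400)
    # With a one-second signal, bin k corresponds to k Hz.
    print(strongest_peaks(mags, count=3))
```

Because the signal lasts exactly one second, each bin is 1 Hz wide, so the detected peaks are the note frequencies rounded to the nearest hertz. A real implementation would use an FFT routine instead of this quadratic-time loop.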

Finding the beat in R

In a previous article, I described a method for detecting chords in an audio file (also available for Scala). Continuing on this theme, the following will find the onset of a drumbeat in a file, using R. I’m using a single drumstick click, which you can hear on freesound.org. This method detects sudden volume increases- [...] Read More...
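As a hedged illustration of detecting a sudden volume increase, the sketch below (Python, standard library only, not the article's R code) measures the RMS level of consecutive windows and reports the first window that is much louder than the one before it. The window size, the ratio threshold, the noise floor and the synthetic click are all assumptions chosen for the example.

```python
import math

def rms_per_window(samples, window):
    """Root-mean-square level of each consecutive, non-overlapping window."""
    levels = []
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        levels.append(math.sqrt(sum(x * x for x in chunk) / window))
    return levels

def find_onset(samples, rate, window=50, ratio=3.0, floor=0.05):
    """Time in seconds of the first window that jumps in level, or None.

    A window counts as an onset when its RMS exceeds `floor` and is more
    than `ratio` times the RMS of the previous window (assumed thresholds).
    """
    levels = rms_per_window(samples, window)
    for i in range(1, len(levels)):
        if levels[i] > floor and levels[i] > ratio * levels[i - 1]:
            return i * window / rate
    return None

def synthetic_click(rate=1000, onset=0.5, seconds=1.0):
    """Quiet hum followed by a decaying burst, standing in for a drumstick click."""
    start = int(onset * rate)
    samples = []
    for t in range(int(seconds * rate)):
        if t < start:
            samples.append(0.01 * math.sin(2 * math.pi * 100 * t / rate))
        else:
            decay = math.exp(-(t - start) / 20)
            samples.append(decay * math.sin(2 * math.pi * 200 * (t - start) / rate))
    return samples

if __name__ == "__main__":
    print(find_onset(synthetic_click(), rate=1000))
```

The time resolution of this approach is limited to the window length, so a smaller window locates the onset more precisely at the cost of noisier level estimates.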

Implementing k-means in Scala

To generate sample data, I selected two points, (10, 20) and (25, 5), then generated a list of normally distributed points around those two; the exact points used are in the code below. The code implements Lloyd’s algorithm, a simple iterative clustering method: 1. Assume a certain number of [...] Read More...
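Since the excerpt cuts off before the steps are listed, here is a minimal sketch of the textbook form of Lloyd's algorithm, written in Python with the standard library rather than Scala. It is not the article's code: the helper names, the standard deviation of 2, the seed and the choice of initial centroids are assumptions, and the generated points are not the exact points the article used.

```python
import math
import random

def make_points(centers, per_cluster, sigma, seed):
    """Normally distributed points around each center (assumed sigma and seed)."""
    rng = random.Random(seed)
    return [(rng.gauss(cx, sigma), rng.gauss(cy, sigma))
            for cx, cy in centers for _ in range(per_cluster)]

def nearest(point, centroids):
    """Index of the centroid closest to `point` by Euclidean distance."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))

def mean(points):
    """Coordinate-wise mean of a non-empty list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def lloyd(points, centroids, max_iter=100):
    """Alternate assignment and centroid-update steps until nothing changes."""
    centroids = list(centroids)
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        updated = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return centroids

if __name__ == "__main__":
    points = make_points([(10, 20), (25, 5)], per_cluster=50, sigma=2.0, seed=42)
    # Assumed initialization: one point from each end of the generated list.
    print(lloyd(points, [points[0], points[-1]]))
```

Seeding the centroids with the first and last generated points guarantees one starting centroid per cluster here. With a random initialization, Lloyd's algorithm can converge to a poorer local optimum, which is why it is usually restarted several times.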