Information Retrieval and other interesting topics

Linear Regression In PHP (part 2)

In: classification, statistics

10 Oct 2011

In the last post we had a simple stepping algorithm, and a gradient descent implementation, for fitting a line to a set of points with one variable and one 'outcome'. As I mentioned, though, it's fairly straightforward to extend that to multiple variables, and even to curves rather than just straight lines.

Linear Regression In PHP

In: classification, statistics

10 Oct 2011

I've had a couple of emails recently about the excellent Stanford Machine Learning and AI online classes, so I thought I'd put up the odd post or two on some of the techniques they cover, and what they might look like in PHP.

Bayesian Opinion Mining

In: classification, probability

01 Jan 2010

The web is a great place for people to express their opinions, on just about any subject. Even the professionally opinionated, like movie reviewers, have blogs where the public can comment and respond with what they think, and there are a number of sites that deal in nothing more than this. The ability to automatically extract people's opinions from all this raw text can be a very powerful one, and it's a well-studied area, no doubt because of the commercial possibilities.

Support Vector Machines In PHP

In: classification, svm, vector space

12 Dec 2009

When it comes to classification, and machine learning in general, at the head of the pack there's often a Support Vector Machine based method. In this post we'll look at what SVMs do and how they work, and as usual there's some example code. However, even a simple PHP-only SVM implementation is a little bit long, so this time the complete source is available separately in a zip file.

Text Classification (And Twitter)

In: classification

09 Sep 2009

Classification techniques are used for spam filters, author identification, intrusion detection and a host of other applications. They can be used to help organise data into a structure, or to add tags to allow users to find documents. While the latest classification algorithms are at the cutting edge of machine learning, there are still thousands of systems using simpler algorithms to great effect.