Richard Townsend edited this page Jul 26, 2014 · 3 revisions

GoLearn's model for machine learning problems will be familiar if you've used SciPy, WEKA or R. Data is represented as a flat table, analogous to a spreadsheet, and used for training and prediction. The structure which implements this table is called Instances.

Sorting Instances

Sort implements an in-place radix sort. It accepts a sort direction (Ascending or Descending) and a slice of integer Attribute positions.

Code excerpt: sorting instances

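// Assumed arguments: the path to iris_headers.csv, and true because the file has a header row.
inst, err := base.ParseCSVToInstances("iris_headers.csv", true)
if err != nil {
	panic(err)
}
// Sort on all four Attribute positions, in descending order.
attrs := []int{3, 2, 1, 0}
inst.Sort(base.Descending, attrs)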

Because this radix sort isn't stable (rows that compare equal on the chosen Attributes are not guaranteed to keep their original relative order), sort by all of the available Attributes to get a consistent result.
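To see why listing every Attribute matters, here is a minimal stand-alone sketch. It does not use GoLearn: sortRows, direction and the sample rows are hypothetical names invented for illustration, it uses the standard library's comparison sort rather than a radix sort, and it assumes that earlier entries in attrs take priority over later ones.

```go
package main

import (
	"fmt"
	"sort"
)

// direction is a hypothetical stand-in for the Ascending/Descending choice.
type direction int

const (
	ascending direction = iota
	descending
)

// sortRows orders rows in place, comparing the columns listed in attrs
// from first to last; later columns only break ties left by earlier ones.
func sortRows(rows [][]float64, dir direction, attrs []int) {
	sort.Slice(rows, func(i, j int) bool {
		for _, a := range attrs {
			if rows[i][a] == rows[j][a] {
				continue // tie on this column, try the next one
			}
			if dir == ascending {
				return rows[i][a] < rows[j][a]
			}
			return rows[i][a] > rows[j][a]
		}
		return false // equal on every listed column
	})
}

func main() {
	rows := [][]float64{
		{5.1, 3.5, 0},
		{4.9, 3.0, 1},
		{5.1, 3.0, 2},
	}
	// Sorting on column 0 alone would leave the two 5.1 rows in an
	// unspecified order; listing every column removes the ambiguity.
	sortRows(rows, descending, []int{0, 1, 2})
	fmt.Println(rows)
}
```

When every column takes part in the comparison, two rows can only tie if they are identical, so the result no longer depends on whether the underlying sort is stable.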

Sample application

examples/instances/instances.go is a sample which reads iris_headers.csv. The ParseCSVToInstances function reads the CSV file into a new Instances structure and creates appropriately named and typed Attributes. This sample also demonstrates how to construct Instances manually.

Attributes

Instances currently stores everything as 64-bit (8 byte) floating point values. Attributes determine how this value is interpreted:

  • CategoricalAttributes represent discrete strings which can only take a fixed number of values.
  • FloatAttributes report the underlying value without modification.
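The idea that one stored 64-bit value can be read in two ways can be sketched as follows. This is not GoLearn's code: the attribute interface, floatAttr, categoricalAttr and render are hypothetical names, and the sketch assumes a categorical value is stored as an index into a list of labels.

```go
package main

import (
	"fmt"
	"strconv"
)

// attribute is a hypothetical interface: it turns a stored float64
// into the text a reader would see.
type attribute interface {
	render(v float64) string
}

// floatAttr reports the stored value without modification,
// formatted to a fixed number of decimal places.
type floatAttr struct{ precision int }

func (f floatAttr) render(v float64) string {
	return strconv.FormatFloat(v, 'f', f.precision, 64)
}

// categoricalAttr treats the stored value as an index into a fixed
// list of labels (an assumption made for this sketch).
type categoricalAttr struct{ values []string }

func (c categoricalAttr) render(v float64) string {
	i := int(v)
	if i < 0 || i >= len(c.values) {
		return "?" // no label registered for this value
	}
	return c.values[i]
}

func main() {
	var width attribute = floatAttr{precision: 2}
	var species attribute = categoricalAttr{values: []string{"Iris-setosa", "Iris-versicolor"}}

	stored := 1.0 // the same 8-byte value, interpreted two ways
	fmt.Println(width.render(stored), species.render(stored))
}
```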

Both FloatAttributes and CategoricalAttributes have a number of functions to control presentation, covered in more detail within the linked documentation.

Manipulation of CategoricalAttributes

You can add values to a CategoricalAttribute using Instances.SetAttrStr with the appropriate column index. It's important to use this method after adding the CategoricalAttribute, otherwise printing the Instances may panic. You can also call GetSysValFromString on the Attribute itself, which appends the value if it is not already recognised.
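The "append if unrecognised" behaviour can be pictured with a small stand-alone sketch. The categorical type and its sysValFromString method are hypothetical stand-ins written for illustration, not GoLearn's implementation, and they assume each string is coded by its position in the list of known values.

```go
package main

import "fmt"

// categorical is a hypothetical stand-in for a CategoricalAttribute.
type categorical struct{ values []string }

// sysValFromString returns the numeric code for s, appending s to the
// list of known values first if it has not been seen before.
func (c *categorical) sysValFromString(s string) float64 {
	for i, v := range c.values {
		if v == s {
			return float64(i)
		}
	}
	c.values = append(c.values, s)
	return float64(len(c.values) - 1)
}

func main() {
	var species categorical
	a := species.sysValFromString("Iris-setosa")    // new value, appended
	b := species.sysValFromString("Iris-virginica") // new value, appended
	c := species.sysValFromString("Iris-setosa")    // already known, reused
	fmt.Println(a, b, c, len(species.values))
}
```

Under this scheme a repeated lookup reuses the existing code, so the list only grows when a genuinely new string appears.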

Support status

Operating Systems: Mac OS X 10.8, Ubuntu 14.04
Go version: 1.2
GoLearn version: 0.1
Support status: Current
Next revision: On version upgrade