-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Instances
Golearn's model for machine learning problems will be familiar if you've used SciPy, WEKA or R. Data is represented as a flat table, analogous to a spreadsheet, and used for training and prediction. The structure which implements this table is called Instances
.
Sort
implements an in-place radix sort. It accepts a sort direction (Ascending or Descending) and a slice of integer Attribute positions.
Code excerpt: sorting instances
inst, _ := base.ParseCSVToInstances
attrs := make([]int, 4)
attrs[0] = 3
attrs[1] = 2
attrs[2] = 1
attrs[3] = 0
inst.Sort(Descending, attrs)
Because radix sort isn't stable (maintaining the original order of sorted elements is not guaranteed), sort by all of the Attributes available to get a consistent result.
examples/instances/instances.go
is a sample which reads iris_headers.csv
. The ParseCSVToInstances
reads the CSV file into a new Instances
structure and creates appropriately named and typed attributes. This sample also demonstrates manually constructing Instances
.
Instances
currently stores everything as 64-bit (8 byte) floating point values. Attributes
determine how this value is interpreted:
-
CategoricalAttributes
represent discrete strings which can only take a fixed number of values. -
FloatAttributes
report the underlying value without modification.
Both FloatAttributes
and CategoricalAttributes
have a number of functions to control presentation, covered in more detail within the linked documentation.
You can add values to a CategoricalAttribute using Instances.SetAttrStr
with the appropriate column index. It's important to use this method after adding the CategoricalAttribute, otherwise printing the Instances may panic. You can also call GetSysValFromString
on the Attribute itself, which appends a value if unrecognised.
Operating Systems | Mac OS X 10.8 Ubuntu 14.04 |
Go version | 1.2 |
GoLearn version | 0.1 |
Support status | Current |
Next revision | On version upgrade |