GopherCon 2019 - Machine Learning & AI with Go Workshop

Date: 24 July 2019
Tags: conference, golang, gophercon2019, notes
Workshop repo: https://github.com/dwhitena/gc-ml

These are some notes from my experience at GopherCon 2019. I don’t expect them to be laid out in any particularly useful way; I am mostly taking them so I can remember the bits I found most useful.

Presenter: Daniel Whitenack

Introduction to ML/AI

Benefits of Go for ML/AI

Type safety
Performance pretty good
Easy concurrency

Uses of AI in the World

Classification: input of images/text, output label or bounding boxes etc
Control systems (self-driving): input of images, output of control deltas
Translation: input of text, output of more text

All basically just input -> ML model -> output (data transformation)

Input data == features
Output data == labels, responses
ML model is just a function

ML Models

Definitions – equations, expressions, conditions (if an image is mostly color C, then a cat)
Parameters – weights, biases (the color C)
Hyperparameters – parameters that we choose but don’t subject to training (kind of a part of model selection)
Building ML/AI is basically trial and error: try parameter values and keep the ones that work best
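To keep the three terms straight, here is a minimal sketch of my own (not from the workshop repo). The names `LinearModel`, `Hyperparams`, and `Predict` are made up for illustration.

```go
package main

import "fmt"

// LinearModel is a hypothetical model. Its *definition* is the expression
// y = W*x + B; its *parameters* are the values W and B.
type LinearModel struct {
	W float64 // parameter: weight, set by training
	B float64 // parameter: bias, set by training
}

// Hyperparams are values we choose ourselves; training does not adjust them.
type Hyperparams struct {
	LearningRate float64
	Epochs       int
}

// Predict is the "model is just a function" idea: features in, response out.
func (m LinearModel) Predict(x float64) float64 {
	return m.W*x + m.B
}

func main() {
	m := LinearModel{W: 2, B: 1}
	hp := Hyperparams{LearningRate: 0.01, Epochs: 100}
	fmt.Println("prediction for x=3:", m.Predict(3))
	fmt.Println("hyperparameters:", hp.LearningRate, hp.Epochs)
}
```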

Two Major pieces to ML/AI

(at least supervised ML/AI)

Inference / prediction – using the model
Training – generating the model

Training

Known Results (labels/responses for set inputs)
Automated trial and error to find the “best” parameter value(s)
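The crudest form of "automated trial and error" is to try a fixed list of candidate values against the known results and keep the one with the lowest error. This is my own toy sketch, assuming a one-parameter model y = w*x and made-up data; real training methods search far more cleverly.

```go
package main

import (
	"fmt"
	"math"
)

// meanAbsError scores a candidate weight w for the model y = w*x
// against known results (xs -> ys).
func meanAbsError(w float64, xs, ys []float64) float64 {
	var sum float64
	for i := range xs {
		sum += math.Abs(w*xs[i] - ys[i])
	}
	return sum / float64(len(xs))
}

// bestWeight tries every candidate and keeps the one with the lowest error.
func bestWeight(candidates, xs, ys []float64) (best, bestErr float64) {
	bestErr = math.Inf(1)
	for _, w := range candidates {
		if e := meanAbsError(w, xs, ys); e < bestErr {
			best, bestErr = w, e
		}
	}
	return best, bestErr
}

func main() {
	xs := []float64{1, 2, 3}
	ys := []float64{2, 4, 6} // known results, generated from y = 2x
	w, e := bestWeight([]float64{0.5, 1, 1.5, 2, 2.5}, xs, ys)
	fmt.Println("best w:", w, "error:", e)
}
```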

Model Selection

How do we pick which definition is the best? Trial and error / domain knowledge

Machine Learning vs. AI

Not a great answer to this – not well differentiated

Common Blockers

Getting the data
- Need annotations / known outputs for the training and evaluation data
Overfitting
- Only works well on the data it knows about, not novel data
- Set aside a validation set
  - Can still overfit to the validation set, but randomly re-selecting the validation set on each run might help?
  - Can have a really separate holdout set that is not used in training too
- Can always push training error down by increasing model complexity, but that tends to overfit
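A sketch of carving out validation and holdout sets, written by me rather than taken from the workshop. The function name `splitIndices` and the 20%/10% fractions are assumptions; changing the seed gives a different random validation set each run.

```go
package main

import (
	"fmt"
	"math/rand"
)

// splitIndices shuffles the row indices 0..n-1 using the given seed and cuts
// them into training, validation, and holdout sets. The holdout rows are
// never touched during training or tuning.
func splitIndices(n int, valFrac, holdoutFrac float64, seed int64) (train, val, holdout []int) {
	idx := rand.New(rand.NewSource(seed)).Perm(n)
	nVal := int(float64(n) * valFrac)
	nHold := int(float64(n) * holdoutFrac)
	val = idx[:nVal]
	holdout = idx[nVal : nVal+nHold]
	train = idx[nVal+nHold:]
	return train, val, holdout
}

func main() {
	train, val, holdout := splitIndices(100, 0.2, 0.1, 42)
	fmt.Println("train:", len(train), "validation:", len(val), "holdout:", len(holdout))
}
```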

Kinds of ML/AI Problems

Object recognition / Classification
Prediction (customers etc -> sales)
Forecasting (last month sales -> this month sales)
Recommendation (Netflix problem)
Clustering (group users by “similarity”)

Model Artifacts

After training, we save a model artifact file
- some include both definition and parameters, others just parameters
- various formats
- newer format emerging: ONNX
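To make "just parameters" concrete, here is a sketch of a parameters-only artifact serialized as JSON. The `Artifact` struct and its field names are a format I invented for illustration; it is not ONNX or any format used in the workshop.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Artifact is a hypothetical parameters-only artifact. It stores trained
// values but not the model definition, so whoever loads it must already
// know the parameters belong to y = w*x + b.
type Artifact struct {
	Model   string    `json:"model"`
	Weights []float64 `json:"weights"`
	Bias    float64   `json:"bias"`
}

// save serializes the artifact; writing the bytes to a file is left out.
func save(a Artifact) ([]byte, error) {
	return json.Marshal(a)
}

// load restores an artifact from its serialized form.
func load(data []byte) (Artifact, error) {
	var a Artifact
	err := json.Unmarshal(data, &a)
	return a, err
}

func main() {
	data, err := save(Artifact{Model: "linear", Weights: []float64{0.5}, Bias: 1.25})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(data))

	a, err := load(data)
	if err != nil {
		panic(err)
	}
	fmt.Println(a.Model, a.Weights, a.Bias)
}
```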

Linear and Logistic Regression

Linear Regression

y = w * x + b (w == weights, b == biases)
- example: number of users (x) -> actual sales (y)
pick initial values somehow
- maybe random, maybe pick 2 points and draw that line
loss function
- determines how good a line is
- example: absolute vertical distance
data normalization
- squish values to always be 0-1 (or some other known range)
profiling data / looking for intuition
- when there are many candidate x's and you want to pick a single one, try graphing all the pairs of variables (gc-ml/linear_regression/example1)
before doing learning, reminder to pick out test data (gc-ml/linear_regression/example2)
- might want to try and ensure that test data is representative and expand as necessary
Stochastic Gradient Descent training method (gc-ml/linear_regression/example3, gc-ml/linear_regression/example4, gc-ml/linear_regression/example5 (adds multi-linear regression))
- epochs: number of training iterations (# of times through the training data)
- gradient: more or less the derivative of the loss; move parameters in the direction of less error
  - derivatives of the loss with respect to each parameter, scaled by the learning rate
- learning rate: hyperparameter that keeps each step from jumping too far
Evaluate data (gc-ml/linear_regression/example6)
- test data set evaluated by RMSE (root mean squared error) when the loss function is MSE (mean squared error)
  - gets back into the units of the prediction
- multi-linear regression does not always do better than plain linear regression, but it might
- might want to un-normalize errors to better understand error numbers
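The steps above (normalize, train with SGD, evaluate with RMSE) can be sketched in one small program. This is my own reconstruction, not the code from the gc-ml examples: the function names, the made-up data, the zero starting values, and the learning rate are all assumptions.

```go
package main

import (
	"fmt"
	"math"
)

// normalize rescales values into the range [0, 1] (min-max normalization).
func normalize(vals []float64) []float64 {
	lo, hi := vals[0], vals[0]
	for _, v := range vals {
		lo = math.Min(lo, v)
		hi = math.Max(hi, v)
	}
	out := make([]float64, len(vals))
	if hi == lo {
		return out // constant column: leave everything at 0
	}
	for i, v := range vals {
		out[i] = (v - lo) / (hi - lo)
	}
	return out
}

// train fits y = w*x + b with stochastic gradient descent on squared error,
// updating the parameters after every single example. It starts from
// w = b = 0 rather than random values to keep the sketch deterministic.
func train(xs, ys []float64, learningRate float64, epochs int) (w, b float64) {
	for e := 0; e < epochs; e++ {
		for i := range xs {
			diff := (w*xs[i] + b) - ys[i]
			// d(diff^2)/dw = 2*diff*x and d(diff^2)/db = 2*diff.
			w -= learningRate * 2 * diff * xs[i]
			b -= learningRate * 2 * diff
		}
	}
	return w, b
}

// rmse is the root mean squared error, in the same units as y.
func rmse(xs, ys []float64, w, b float64) float64 {
	var sum float64
	for i := range xs {
		d := (w*xs[i] + b) - ys[i]
		sum += d * d
	}
	return math.Sqrt(sum / float64(len(xs)))
}

func main() {
	// Made-up data: after normalizing x, the points lie on y = 3x + 2.
	xs := normalize([]float64{10, 20, 30, 40, 50})
	ys := []float64{2, 2.75, 3.5, 4.25, 5}
	w, b := train(xs, ys, 0.1, 2000)
	fmt.Printf("w=%.3f b=%.3f rmse=%.4f\n", w, b, rmse(xs, ys, w, b))
}
```

For brevity this evaluates on the training data; a real run would compute RMSE on the test split that was set aside before training.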

Logistic Regression

Pretty similar to linear (gc-ml/logistic_regression)
Often used for classification where we need a step-function-like thing
Logistic function: 1 / (1 + e^-(wx+b)) = 1 / (1 + e^-b * e^-wx)
Inflection at x = -b/w (where wx + b = 0 and the output is 0.5)
Data cleaning is often necessary in real world (gc-ml/logistic_regression/example2)
Intuition generation again (gc-ml/logistic_regression/example3)
Don’t forget to create test/training splits (gc-ml/logistic_regression/example4)
Training (gc-ml/logistic_regression/example5)
Validation (gc-ml/logistic_regression/example6)
- Accuracy – how many things did I get right?
- Alternatives: precision, recall, sensitivity, AUC, false pos/neg, etc.
goml package to do a lot of this for you (gc-ml/logistic_regression/example7)
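A small sketch of the logistic function and a few of the validation metrics, written by me and not using goml. The parameters w = 4 and b = -2 and the data are made up so that the midpoint lands at x = -b/w = 0.5.

```go
package main

import (
	"fmt"
	"math"
)

// logistic returns 1 / (1 + e^-(w*x+b)), a smooth step from 0 to 1.
func logistic(w, b, x float64) float64 {
	return 1 / (1 + math.Exp(-(w*x + b)))
}

// metrics compares predicted and actual binary labels.
// Precision and recall are reported as 0 when their denominator is 0.
func metrics(pred, actual []bool) (accuracy, precision, recall float64) {
	var tp, tn, fp, fn float64
	for i := range pred {
		switch {
		case pred[i] && actual[i]:
			tp++
		case !pred[i] && !actual[i]:
			tn++
		case pred[i] && !actual[i]:
			fp++
		default:
			fn++
		}
	}
	accuracy = (tp + tn) / (tp + tn + fp + fn)
	if tp+fp > 0 {
		precision = tp / (tp + fp)
	}
	if tp+fn > 0 {
		recall = tp / (tp + fn)
	}
	return accuracy, precision, recall
}

func main() {
	w, b := 4.0, -2.0 // illustrative, untrained parameters
	xs := []float64{0.1, 0.4, 0.6, 0.9}
	actual := []bool{false, true, true, true}

	// Classify as positive when the logistic output reaches 0.5.
	pred := make([]bool, len(xs))
	for i, x := range xs {
		pred[i] = logistic(w, b, x) >= 0.5
	}

	acc, prec, rec := metrics(pred, actual)
	fmt.Println("midpoint output:", logistic(w, b, -b/w))
	fmt.Printf("accuracy=%.2f precision=%.2f recall=%.2f\n", acc, prec, rec)
}
```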

Neural Networks and Deep Learning

Gorgonia: a Go library in the same spirit as TensorFlow / Theano

Neural Networks

Semi-black-box neurons acting as mini-models
“With enough parameters we can model just about any relationship”
- Pile up logistic regressions (and other such things) to give enough freedom for more things
Terminology
- Input layer
- Hidden layers
- Output layer
- Feed forward – generate predictions
- Backpropogation – calculate error, then adjust parameters
Architecture choice is usually finding one that someone found has worked well
Iris flower classification example (gc-ml/neural_networks/example{1,2})
- uses “one-hot” encoding of correct species
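A sketch of one-hot encoding and the feed-forward step for a tiny 4-3-3 network, in plain Go rather than Gorgonia. This is my own illustration, not the workshop code: the weights are made up and untrained, and backpropagation is left out entirely.

```go
package main

import (
	"fmt"
	"math"
)

// oneHot encodes class i out of n classes, e.g. class 1 of 3 -> [0 1 0].
func oneHot(i, n int) []float64 {
	v := make([]float64, n)
	v[i] = 1
	return v
}

func sigmoid(z float64) float64 { return 1 / (1 + math.Exp(-z)) }

// layer computes one fully connected layer: each neuron j is a small
// logistic regression over the inputs, using weights[j] and biases[j].
func layer(in []float64, weights [][]float64, biases []float64) []float64 {
	out := make([]float64, len(weights))
	for j, ws := range weights {
		z := biases[j]
		for k, w := range ws {
			z += w * in[k]
		}
		out[j] = sigmoid(z)
	}
	return out
}

// argmax returns the index of the largest value (the predicted class).
func argmax(v []float64) int {
	best := 0
	for i := range v {
		if v[i] > v[best] {
			best = i
		}
	}
	return best
}

func main() {
	// Made-up, untrained weights: 4 inputs -> 3 hidden -> 3 outputs.
	hiddenW := [][]float64{{0.1, 0.2, -0.1, 0.3}, {-0.2, 0.1, 0.4, 0.1}, {0.3, -0.3, 0.2, 0.2}}
	hiddenB := []float64{0, 0, 0}
	outW := [][]float64{{0.5, -0.4, 0.1}, {-0.3, 0.6, 0.2}, {0.2, 0.1, -0.5}}
	outB := []float64{0, 0, 0}

	features := []float64{5.1, 3.5, 1.4, 0.2} // four flower measurements
	out := layer(layer(features, hiddenW, hiddenB), outW, outB)
	fmt.Println("outputs:", out)
	fmt.Println("predicted class:", argmax(out), "one-hot target for class 0:", oneHot(0, 3))
}
```

Training would compare the outputs with the one-hot target and adjust the weights; with untrained weights the predicted class is meaningless.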

Deep Learning

As used here: pre-trained models that we might tweak or just use to solve problems
TensorFlow model trained in Python, used from Go (gc-ml/deep_learning/example1) for object identification
- can be pretty verbose
Using GoCV / OpenCV to interface w/ a TensorFlow model (gc-ml/deep_learning/example2)
Using MachineBox to do classification via a REST service

ML Pipelines with Pachyderm

Pachyderm seems to make ML pipeline work pretty darn efficient and painless, but that is definitely just first impression