In love with CMake

I am absolutely in love with CMake. For years I have been looking for a way to automate the cross-platform build process. There are many tools made for this, and I think CMake is one of the best of the lot. Here are a few capabilities of CMake that I find extremely useful.

Features of CMake

  1. can handle in-place and out-of-place builds
  2. ability to build a directory tree outside the source tree
  3. ability to generate a cache to be used with a graphical editor
  4. can locate executables, files and libraries
  5. able to accommodate a project that has multiple toolkits, or libraries that each have multiple directories
  6. can work with projects that require executables to be created before generating code to be compiled
  7. open-source
  8. can generate makefiles and project files for many platforms and IDEs
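
As a rough illustration of the out-of-place build in item 1, the sketch below assembles the two commands one could run to configure and build a project in a directory separate from the source tree. It only builds the argument lists and does not execute anything. The helper name out_of_source_commands and the directory names are hypothetical, and the -S and -B options assume CMake 3.13 or newer.

```python
from pathlib import Path


def out_of_source_commands(source_dir, build_dir, generator=None):
    """Return the configure and build commands for an out-of-place CMake build.

    Hypothetical helper for illustration; nothing is executed here.
    """
    source = Path(source_dir)
    build = Path(build_dir)
    # Configure step: -S names the source tree, -B the separate build tree.
    configure = ["cmake", "-S", str(source), "-B", str(build)]
    if generator is not None:
        # -G selects the generator, e.g. "Unix Makefiles" or "Ninja".
        configure += ["-G", generator]
    # Build step: let CMake invoke whichever native build tool was generated.
    build_cmd = ["cmake", "--build", str(build)]
    return configure, build_cmd


if __name__ == "__main__":
    # Assumed directory names, chosen only for this example.
    for cmd in out_of_source_commands("myproject", "myproject-build", "Unix Makefiles"):
        print(" ".join(cmd))
```

Keeping the build tree outside the source tree means the whole build can be discarded by deleting one directory, and several build trees (for example one per compiler) can share the same sources.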

From a scientific computing perspective, the following features are excellent:

  1. support for MPI, OpenMP, CUDA and OpenCL
  2. support for multiple compilers, e.g., GNU, Intel, clang
  3. able to work with Python, NumPy, SciPy, etc.
  4. able to work with LaTeX
  5. support for Doxygen

In this post I collect my thoughts and ideas about the Linux workshop I gave at UCL. It was aimed at new PhD students and postdocs who had just joined the UCL Astrophysics Group. After a few explanatory slides, the students mostly worked through examples. I also gave them a cheat sheet of Linux commands.

Slides and exercises

The slides can be found on the UCL Astrophysics GitHub. They are written in LaTeX with beamer. Here is the source.

Most of the students had some basic understanding of Linux/Unix. The cheat sheet was meant to give them a jump start. The exercises ranged from beginner to intermediate level. The solutions to the exercises are here.

I also gave a brief overview of the HPC facilities in the department and mentioned some best practices. Next year I plan to do a separate session on HPC.

The best way to learn GPU-based parallel programming is to actually code examples. Here I collect GPU-based examples.

Books I found useful

  1. Gaster, Howes and Kaeli, 2012, Heterogeneous Computing with OpenCL
  2. Munshi, Gaster, Mattson, Fung and Ginsburg, 2011, OpenCL Programming Guide
  3. Sanders and Kandrot, 2010, CUDA by Example: An Introduction to General-Purpose GPU Programming
  4. Kirk and Hwu, 2012, Programming Massively Parallel Processors: A Hands-On Approach

Examples

I started adding the examples to a repository on GitHub. For each example I have clearly indicated where the original source came from and what license it is distributed under.

Sun T-1000 server

Recently I got hold of two Sun T-1000 machines off eBay! It was my first “contact” with a server. I mean, the only way to connect to this server is through a serial terminal. Fascinating! The specs of this machine are really impressive. It has an 8-core Sun UltraSPARC T1 processor. The cores are integer units that share a single floating-point unit, so the floating-point performance will not scale with the number of cores as one might expect. Installing Debian on it is a very challenging task, and it took a lot of work (basically a full weekend) to get it installed. Here are the steps I took in this marathon process.

Overview

The server looks like this:

T1000

I found some documents about the machine.

  1. Datasheet
  2. Server Overview
  3. Service Manual
  4. Server System Admin Guide

Firing up

I connected the machine to my router through the Network Management Port (NMP). I could see that it had been assigned an IP address, so I could ssh into the machine’s firmware. The default admin username is admin and the password is “password”, I think. At least it worked for me. When I type help at the sc prompt, I get the following result.

sc> help
Available commands
------------------
Power and Reset control commands:
  powercycle [-y] [-f]
  poweroff [-y] [-f]
  poweron [-c] [FRU]
  reset [-y] [-c]
Console commands:
  break [-y] [-c]
  console [-f]
  consolehistory [-b lines|-e lines|-v] [-g lines] [boot|run]
Boot control commands:
  bootmode [normal|reset_nvram|bootscript="string"]
  setkeyswitch [-y] <normal|stby|diag|locked>
  showkeyswitch
Locator LED commands:
  setlocator [on|off]
  showlocator
Status and Fault commands:
  clearasrdb
  clearfault <UUID>
  disablecomponent [asr-key]
  enablecomponent [asr-key]
  removefru [-y] <FRU>
  setfru -c [data]
  showcomponent [asr-key]
  showenvironment
  showfaults [-v]
  showfru [-g lines] [-s|-d] [FRU]
  showlogs [-b lines|-e lines|-v] [-g lines] [-p logtype[r|p]]
  shownetwork [-v]
  showplatform [-v]
ALOM Configuration commands:
  setdate <[mmdd]HHMM | mmddHHMM[cc]yy][.SS]>
  setsc [param] [value]
  setupsc
  showdate
  showhost [version]
  showsc [-v] [param]
ALOM Administrative commands:
  flashupdate <-s IPaddr -f pathname> [-v]
  help [command]
  logout
  password
  resetsc [-y]
  restartssh [-y |-n]
  setdefaults [-y] [-a]
  ssh-keygen [-t rsa|dsa] [-r] [-l]
  showusers [-g lines]
  useradd <username>
  userdel [-y] <username>
  userpassword <username>
  userperm <username> [c][u][a][r]
  usershow [username]

In order to boot I typed

sc> poweron -c

It gave me the following warning message, and I typed y to continue.

Warning: User < > currently has write permission to this console and forcibly removing them will terminate any current write actions and all work will be lost. Would you like to continue? [y/n]y

Enter #. to return to ALOM.
Done
0:0>Test Memory....Done
0:0>Test Slave Threads Basic....Done
0:0>Extended CPU Tests....Done
0:0>Scrub Memory....Done
0:0>Functional CPU Tests....Done
0:0>Extended Memory Tests....Done
0:0>IO-Bridge Tests....Done
0:0>INFO:
0:0> POST Passed all devices.
0:0>POST: Return to VBSC.
0:0>Master set ACK for vbsc runpost command and spin...

SC Alert: Host system has shut down.
JAN 19 19:16:15 ERROR: Available system memory is less than physically installed memory
JAN 19 19:16:15 ERROR: Using unsupported memory configuration
JAN 19 19:16:15 ERROR: System DRAM  Available: 004096 MB  Physical: 008192 MB

SC Alert: Host System has Reset
/

Sun Fire(TM) T1000, No Keyboard
Copyright 2006 Sun Microsystems, Inc.  All rights reserved.
OpenBoot 4.23.4, 4088 MB memory available, Serial #71818864.
Ethernet address 0:14:4f:47:de:70, Host ID: 8447de70.



ERROR: The following devices are disabled:
    MB/CMP0/CH0/R1/D1


{0} ok

At the ok prompt I typed help

{0} ok help
Enter 'help command-name' or 'help category-name' for more help
(Use ONLY the first word of a category description)
Examples:  help select   -or-   help line
    Main categories are:
Breakpoints (debugging)
Repeated loops
Defining new commands
Numeric output
Radix (number base conversions)
Arithmetic
Memory access
Line editor
System and boot configuration parameters
Select I/O devices
eject devices
Power on reset
Diag (diagnostic routines)
Resume execution
File download and boot
nvramrc (making new commands permanent)

The printenv command lists all the environment variables. The one we need in order to set up the netboot is network-boot-arguments.

{0} ok setenv network-boot-arguments host-ip=192.168.0.20,subnet-mask=255.255.255.0,file=tftp://192.168.0.11/debiansparcboot.img

Here host-ip is the IP address of the Sun machine itself (the one being booted), and the address in the tftp URL is that of the machine hosting the TFTP server. It is followed by the name of the boot image. More documentation can be found at
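
The value passed to setenv is a comma-separated list of key=value pairs, which is easy to mistype on a serial console. As a minimal sketch, the hypothetical helper below assembles that string from its parts and checks the addresses first. The function name network_boot_arguments is an assumption made for this example, and it covers only the three keys used in this post.

```python
import ipaddress


def network_boot_arguments(host_ip, subnet_mask, tftp_server, image):
    """Assemble the value for OpenBoot's network-boot-arguments variable.

    Hypothetical helper; only the three keys used in this post are covered.
    """
    # Validate the addresses early so that typos are caught before the
    # string is pasted into the serial console. Each call raises a
    # ValueError subclass on malformed input.
    ipaddress.IPv4Address(host_ip)
    ipaddress.IPv4Address(tftp_server)
    ipaddress.IPv4Network(f"0.0.0.0/{subnet_mask}")  # rejects an invalid mask
    fields = [
        f"host-ip={host_ip}",
        f"subnet-mask={subnet_mask}",
        f"file=tftp://{tftp_server}/{image}",
    ]
    return ",".join(fields)


if __name__ == "__main__":
    # Same values as in the setenv command above.
    value = network_boot_arguments(
        "192.168.0.20", "255.255.255.0", "192.168.0.11", "debiansparcboot.img"
    )
    print("setenv network-boot-arguments " + value)
```

The printed line can then be copied to the ok prompt as it is.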

Kaggle DecMeg Competition

I participated in the Kaggle DecMeg competition and it was a great experience. I came 114th out of 273 :) I approached it from an image processing point of view rather than a machine learning one. Comparing the winning codes shows that the deep-learning approach wins by a huge margin. My question is whether we really learn anything from applying a machine learning method. I understand that the machine learns the characteristics of the data very well. However, does that mean that we gain a good understanding of the underlying physical phenomenon?

Here is a pictorial representation of the competition.

DecMeg

Comparing my code to the winners', I find that my way of thinking differs from how machine learning people approach the problem. It was a great machine learning experience! My code is on GitHub now.