Introduction
To work in LOCEN the student/collaborator needs to learn basic programming skills. This web-page indicates the fundamental concepts of programming that the student/collaborator should learn independently of the programming language, research area, and research problem selected, to work with LOCEN.
The specific ways to implement the concepts presented here are explained in various web pages; a very good one is this:
http://www.tutorialspoint.com/tutorialslibrary.htm
The purpose of this page is twofold:
- Since the features of each specific programming language are many, here we indicate the fundamental elements on which the student/collaborator should focus his/her learning time.
- In presenting these features, we provide some core information on programming that is often not reported in online tutorials but that is nevertheless very important for learning a specific language faster and with a better understanding. The concepts indicated here are the core programming concepts one needs to create neural-network models. They are also the basic programming skills in general: once you have acquired them you can say you "have learned to program" (at a basic level, but with the capacity to easily switch from one language to another, since this is largely a matter of learning the keywords of the new language).
Have fun!
Get and learn to use an IDE (Integrated Development Environment) for the programming language you choose
A computer program needs to be translated into binary instructions that the computer processor can perform before a computer can execute it. For many languages (e.g., C and C++) this translation is done, before execution, by another computer program called a "compiler": the compiler takes your code, written in a certain programming language, and returns code, called "executable code" or "binary code", that the computer is able to execute. These are called "compiled languages". Other languages, called "interpreted languages" (e.g., Matlab and R), work in a different way: the program code is taken line by line by the interpreter/IDE program, translated into executable code, and executed. This distinction is not immediately visible to you but it has some implications: e.g., compiled languages are usually faster during execution, while interpreted languages usually have better "debuggers" (see below) and are more user-friendly and high-level.
An IDE (Integrated Development Environment) is yet another program that allows you to perform these important operations (and many other ones) in an integrated fashion:
- Write your program code in an assisted way (highlighting of the program keywords, smart completion of variables, integrated help, etc.)
- Compilation/translation and execution of the program
- Debugging through the debugger, which lets you inspect the contents of the data structures of the program (see below)
- See the text/graphical output of the program
- Manage the files that form your program, create (often) "projects" to host them
- Help on the language
An IDE is thus a very useful tool to support programming, and we suggest using one that is good for your language. Spend some time learning how to use the basic functionalities of your IDE (sometimes there are tutorials inside the IDE itself).
In LOCEN we use these IDEs for the different programming languages (all of them are free/open source, except when stated otherwise):
- C, C++: no IDE, we program directly from Linux using a text editor
- Python: Anaconda/Spyder
- Java: TODO
- Matlab: Matlab program (all incorporated; this is a commercial program costing money)
- Octave (free clone of Matlab): Octave program (all incorporated)
- R: R-Studio
A fundamental challenge posed by programming
One subtle thing that a new programming learner must understand is that computers are damn rigid machines. In normal life we are used to interacting with people and animals, which are very flexible. Computers and computer programs are the opposite: to make a program work, it is fundamental that your code does not contain any mistake, not even (literally) a "wrong tiny comma". As a consequence, the sooner you learn what follows, the better:
- Computers are super-rigid, so your program has to be 100% correct: there is zero possibility that the computer understands "what you intended" if you specify it in a wrong way in your program!
- Any tiny element of your program counts and has to be correct
- Learn to write programs by knowing exactly what each single instruction of your programs does. In other words, never program "by trial and error", writing code and keeping it just because "it works": you must understand damn well everything your code does.
- Learn to be super-clean and super-tidy in your code:
- If you make it a habit to write clean code, in the long run you will save tons of time
- On the other hand, if you write messy code you will waste tons of time in the long run and you will never become able to write programs above a certain level of complexity (repeated as important ;)
An important tip for succeeding in writing good code is to follow this philosophy:
- Understand very well the key primitive building blocks (of the programming language you selected) that you need to use to build your programs.
- See how to write your programs using those building blocks.
Debugging and debugger
A fundamental element of learning to program is learning to find errors in programs. Indeed, you will realise that (at least at the beginning) most of the time you spend programming will actually be spent finding bugs! "Bugs" means errors; "debugging" means searching for and fixing errors. A good part of being a good programmer is being a good debugger. These are important tips for learning to debug:
- Programs are rigid: for each line of code you write, before moving forward you have to think of a double-check to make sure that it does what you have in mind. Indeed, at least at the beginning of learning, each line has a high probability of containing a mistake. For example, check the existence and nature of the variable you created, do the computations of your operations by hand, devise a specific test of your function, etc.
- If you like, you can write long pieces of code; but then comment out the whole piece of code, uncomment one line, test/check that line to make sure it is correct, uncomment the following line, check it, etc.
- The best strategy to debug is "divide et impera" (Latin for "divide and conquer"). Even if you use maximum care, making sure that your code is correct line by line, there will nevertheless be mistakes in your code. You will realise this when you run your program and see that it does not work as expected. To find the bug, use this golden strategy. Check the first half of the program, and then the second half; once you identify which of the two parts contains the bug, divide it in two and check each part, and so on, iterating this process. In general, you do not need to "cut in two" but just to "repeatedly divide the program into big blocks" (based on how the program is structured): the key trick is that this iterated division rapidly brings you to the single line of code that is wrong (and in far fewer steps than you imagine: each halving discards half of the remaining code, so about 10 halvings are enough to isolate one line among 1000). Remember: an important element to make the divide-et-impera strategy work is to make sound and strong tests, rather than superficial ones.
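The divide-et-impera strategy can be sketched in Python as follows. This is only an illustration under stated assumptions: the "program" is a hypothetical chain of 8 steps that should each add 1 to a number, one of them is deliberately wrong, and the intermediate values you would expect have been computed by hand. The names (AddOne, BuggyAddOne, RunFirstStep, FindFirstBadStep) are invented for this example.

```python
def AddOne(siValu):
    """A correct step: add 1 to the value."""
    return siValu + 1


def BuggyAddOne(siValu):
    """A wrong step (subtracts instead of adding): the bug we want to find."""
    return siValu - 1


def RunFirstStep(vStep, siInpu, siNumbStep):
    """Run only the first siNumbStep steps and return the intermediate value."""
    siValu = siInpu
    for Step in vStep[:siNumbStep]:
        siValu = Step(siValu)
    return siValu


def FindFirstBadStep(vStep, siInpu, viExpe):
    """Return the index of the first wrong step by repeated halving.

    viExpe[k] is the hand-computed value expected after k steps.
    Assumption: results are right before the bug and wrong after it.
    """
    siLow, siHigh = 0, len(vStep)   # right after siLow steps, wrong after siHigh steps
    while siHigh - siLow > 1:
        siMidd = (siLow + siHigh) // 2
        if RunFirstStep(vStep, siInpu, siMidd) == viExpe[siMidd]:
            siLow = siMidd          # first half is fine: the bug is in the second half
        else:
            siHigh = siMidd         # the bug is already present in the first half
    return siHigh - 1


# The hypothetical program: 8 steps, the one with index 5 is wrong
vStep = [AddOne] * 5 + [BuggyAddOne] + [AddOne] * 2
viExpe = list(range(len(vStep) + 1))    # expected values 0, 1, 2, ..., 8 (input is 0)
siBadStep = FindFirstBadStep(vStep, 0, viExpe)
print("First wrong step:", siBadStep)
```

With 8 steps, the loop needs only 3 checks to isolate the wrong step, which is the reason the strategy scales so well to long programs.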
An extremely useful tool to debug a program is the "debugger". Most good IDEs have this tool incorporated. Once the program is written, the IDE generally allows you to run it in one of two modes: "release mode" and "debug mode". Broadly speaking, release mode is the standard way of running finished programs, while debug mode is used during program development. Running in debug mode allows you to perform these simple operations (and others) that are fundamental to understand what the program is doing and whether it is correct:
- Put (and remove) a "breakpoint" on one or more code lines (usually by clicking or double-clicking to their left in the IDE code-editor window)
- Run the program in debug mode until the first, or next, breakpoint: the program is executed until that point
- See the nature, type, and content of the data structures of the program
- Continue the execution of the program line by line, while observing how the contents of the data structures change: these contents change while the program is executed, so at a certain moment of the debugging you can see the content of the data structures just before the execution of the line where the breakpoint/execution step is. This is crucial to understand whether the program is doing what you expect it to do.
Good names to use for the data structures and other elements
Follow these rules to establish the names of the data structures you use in your program. These rules will: (a) help you write tidier code; (b) help you be fully aware of the data structures and types you use; (c) help other people (and also yourself, after some time) understand your code more quickly. These rules go beyond common practice but we have found them to be extremely useful, so we present them in some detail.
Here are the rules:
- The names of the data structures that you use should look something like these: siCounInpuUnit, vfInpuUnit, mfConnWeig, m3fHistConnWeig, TEMP_PARA. Let's see why.
- Start the name of each data structure with two (or more, when needed) small letters. The first letter indicates the data structure you use (e.g., "s" for scalar; "v" for vector). The second letter indicates the type of the data structure you use (e.g., "i" for integer number, "f" for float number). Note that sometimes it is good to use letters for categories that are more detailed than those used by the programming language itself. For example, in C and C++ you use data structures called "arrays" for both vectors and matrices with two or more dimensions: in this case, it is useful to use v, m, m3, etc., to distinguish the different number of dimensions; in Matlab, even a scalar is a matrix, but in this case it is good to use "s" to indicate that a matrix is actually a simple scalar variable. Similarly for types: for example, Python does not require you to declare whether a variable contains an integer or a floating-point number, but it is good to indicate them separately, with "i" and "f" respectively, as this is very useful. In general, for programming neural networks it is good that you use at least these distinctions:
- Data structures: s = scalar; v = vector; m = matrix 2D; m3 = matrix 3D (and m4, m5, etc.) ... but create other ones also depending on the language, e.g. in R use: fa = factor; df = data frame.
- Types: i = integer number; f = floating point number; c = complex number; fi = file; ch = character; st = string of letters.
- Use 4 letters for each word of the data structure name (or fewer, if the word is shorter than 4 letters, e.g. "Bin"); for example, if you create a variable used as a counter in a for-loop computing the input of a neural network you can use "siCounInpuUnit"; for the activations of the input units of the network you might use "vfInpuUnit"; for the connection weights, "mfConnWeig".
- Use a capital letter for the beginning of each word and small letters for its other 3 letters, e.g. the words "connection weights" become "ConnWeig". The initial letters indicating data structure and type stay small: this also respects the programming habit of using small letters for the user's data structures with changing content.
- Use capital letters, separating words with _, for constants (parameters), i.e. for data structures whose content does not change during the program execution, e.g.: NUMB_INPU_UNIT, NUMB_EXAM, NUMB_SIMU_STEP, DELT_T. Note: good code should not contain any literal number, e.g. 10 or 3.14 (or other types of literal content): all numbers (parameters) should be defined at the beginning of the program as constants.
In addition to this:
- Start your functions with a capital letter, e.g. "NetwFunc(...)", "NetwLear(...)"
- Start classes with "C" and objects with "O", e.g.: "CNeurNetwBackProp(...)", "ONeurNetwBackProp()"
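As an illustration, here is a minimal Python sketch that applies these naming rules to a tiny, hypothetical network computation (a weighted sum of the inputs for each output unit). The numbers and the constant NUMB_OUTP_UNIT are invented for the example; only the naming scheme comes from the rules above.

```python
# Constants (parameters): capital letters, words separated by "_"
NUMB_INPU_UNIT = 3
NUMB_OUTP_UNIT = 2

# vf = vector of floats: activations of the input units
vfInpuUnit = [0.5, 1.0, -1.0]

# mf = 2D matrix of floats (here a list of rows): one row of weights per output unit
mfConnWeig = [[0.1, 0.2, 0.3],
              [0.0, -0.5, 0.5]]


def NetwFunc(vfInpu, mfWeig):
    """Return, for each output unit, the weighted sum of the inputs."""
    vfOutp = []
    for siCounOutpUnit in range(NUMB_OUTP_UNIT):       # si = scalar integer (counter)
        sfSum = 0.0                                    # sf = scalar float
        for siCounInpuUnit in range(NUMB_INPU_UNIT):
            sfSum += mfWeig[siCounOutpUnit][siCounInpuUnit] * vfInpu[siCounInpuUnit]
        vfOutp.append(sfSum)
    return vfOutp


vfOutpUnit = NetwFunc(vfInpuUnit, mfConnWeig)
print(vfOutpUnit)
```

Reading only the names, you can already tell that vfOutpUnit is a vector of floats, that siCounInpuUnit is an integer counter, and that NUMB_INPU_UNIT never changes.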
What is a computer program
A key thing to understand is that a computer program is a list of instructions (written in a "programming language" such as C++, Python, Matlab, or R, and forming a piece of "code") that a computer can execute and that, when executed, produce these kinds of effects:
- Exchange input/output information with the external environment (e.g., get an input from the keyboard, or draw a graph on the screen)
- Change the content of the program data-structures corresponding to memory slots in the computer.
That's all! This might sound simple but it is very profound, as it implies that with a restricted set of elements a program can do whatever a computer can do (a bit like with a few kinds of LEGO blocks you can build any construction).
Let's further clarify this. A program is a list of instructions using these key ingredients:
- Variables hosting the data: these are labelled portions of the memory of the computer where data are stored. Data can be, for example, integer numbers, real numbers, characters (e.g., 'a', 'b', 'c'), and strings (e.g., 'abcd'). These data are stored in the computer memory as binary codes.
- Data structures: programming languages often allow the creation of sets of variables called "data structures", for example arrays (an ordered set of variables addressable with an index and fixed in number, e.g. 10), or lists (an ordered set of variables addressable by passing from one variable to the next, or with an index, and with a variable length). Here we use the term "data structure" also to refer to variables, the simplest possible data structures.
- The operators: these are the processes that change the contents of the data structures, for example "copy", "sum", "multiplication", "sort", "repeat the following operations", "compare two variables", etc.
- Loops, conditionals, comparisons, logical operations: four fundamental things programs can do are these:
- Loop, i.e. execute a certain list of instructions several times
- Check an "if" condition to decide what to do
- Compare the content of two variables (e.g., tell if it is the same, or one is bigger than the other)
- Perform logical operations (e.g., compute the truth value of statements using the NOT, AND, OR logical operators)
- Functions: a function is analogous to a mathematical function, i.e. it is a chunk of program code with given input variables/data structures and given output variables/data structures. For example, a function might be something like "Sum(x, y)": a piece of code that, when executed, takes x and y (two numbers) and returns their mathematical sum (i.e., x+y). The utility of functions is the possibility of chunking a piece of code (even a very complex one), giving it a name, and invoking its whole execution within the program through that name. The function is a key concept of programming and can be thought of as a machine that is fed by the program with some input (e.g., two numbers), processes it (e.g., computes the sum), and then returns the outcome of the process to the program (e.g., the number resulting from the sum operation). Some important elements of functions to understand:
- Output values: how they are produced by the function
- Input parameters: how they are taken by the function:
- Parameters default values
- Override of default values by position
- Override of default values by type
- Override of default values by keyword
- In "object-oriented programming" (e.g., C++, Python), one can use classes and objects: a class is a "template" (a model) to create objects of a certain type; an object is a chunk of a program formed by data structures and functions that can be invoked to operate on such data structures. The utility of object-oriented programming is to create chunks of variables/data structures/functions that can be treated like a black box having (a) internal information and (b) ways to process such information (through the object functions). An object can be thought of as a machine that: (a) can be fed by the program with various inputs; (b) stores information internally; (c) allows the program to request from it certain processing of its internal information or of new inputs (this is done by invoking the functions of the object).
- Interfaces: these are the means through which the program gets input information from the external environment (e.g., a keyboard, a mouse, a slider, or a file), or gives an output to it (e.g., printing text or drawing a graph on the screen, producing a sound, saving data into a file, etc.).
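To make the idea of a function as a "machine" with inputs, outputs, and default parameter values concrete, here is a minimal Python sketch based on the "Sum(x, y)" example above. The default value 10.0 and the variable names are assumptions made only for illustration. Overriding by type (as with overloaded functions in C++) is not shown because Python does not select functions by parameter type in this way.

```python
def Sum(sfX, sfY=10.0):
    """Take two numbers and return their sum; sfY has a default value."""
    return sfX + sfY


sfResuDefa = Sum(1.0)             # sfY keeps its default value
sfResuPosi = Sum(1.0, 2.0)        # default overridden by position
sfResuKeyw = Sum(1.0, sfY=5.0)    # default overridden by keyword

print(sfResuDefa, sfResuPosi, sfResuKeyw)
```

The program "feeds" the function with inputs and gets back the returned value, which it can store in a variable and use later.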
How to best organise your program
When you implement your program encoding neural networks, structure it well using these sections:
- (When required by language) inclusion of libraries
- Definition of constants (parameters; in advanced programs, these are loaded from an external file at the beginning of the program).
- Definition of the data structures
- Core nested loops of the simulation.
- Loops in programs encompassing agent-world interactions (embodied models tested in robots/simulated animals that interact with a real/simulated environment):
- Loop of different experiment repetitions (simulation runs with different random seeds: needed if you want to repeat the experiments more times)
- Loop of different trials
- Loop of the steps of the trials
- Loops in programs encompassing non-embodied neural networks:
- Loop for the different experiment repetitions
- Loop for the training epochs
- Loop for the training examples
- Inside the previous loops (plan well where to distribute the code inside the different levels of the nested loops), insert the core code on:
- The functioning of the system
- The learning of the system
- Collection of data for monitoring the system: (a) on-line, i.e. during the program execution; (b) off-line, i.e. at its end (or, based on data saved into files, after its execution).
- Final statistical analysis
- Final graphical output
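The structure above can be sketched in Python for the non-embodied case. This is a toy skeleton under stated assumptions, not a real model: the "network" is a single connection weight trained with a simple delta rule on a hypothetical task (output = 2 × input), and the final "graphical output" is replaced by a printed line. All constants and names are invented for the example.

```python
# --- Inclusion of libraries ---
import random

# --- Definition of constants (parameters) ---
NUMB_REPE = 3                       # experiment repetitions (different random seeds)
NUMB_EPOC = 100                     # training epochs
LEAR_RATE = 0.1                     # learning rate
EXAM_INPU = (0.0, 0.5, 1.0)         # training examples: inputs
EXAM_OUTP = (0.0, 1.0, 2.0)         # training examples: desired outputs
NUMB_EXAM = len(EXAM_INPU)

# --- Definition of the data structures ---
vfFinaWeig = []                     # weight at the end of each repetition

# --- Core nested loops of the simulation ---
for siCounRepe in range(NUMB_REPE):
    random.seed(siCounRepe)                     # a different seed for each repetition
    sfWeig = random.uniform(-1.0, 1.0)          # random initial connection weight
    for siCounEpoc in range(NUMB_EPOC):
        for siCounExam in range(NUMB_EXAM):
            # Functioning of the system
            sfOutp = sfWeig * EXAM_INPU[siCounExam]
            # Learning of the system (delta rule)
            sfErro = EXAM_OUTP[siCounExam] - sfOutp
            sfWeig += LEAR_RATE * sfErro * EXAM_INPU[siCounExam]
    # Collection of data for monitoring the system
    vfFinaWeig.append(sfWeig)

# --- Final statistical analysis ---
sfMeanFinaWeig = sum(vfFinaWeig) / NUMB_REPE

# --- Final output (text here; a graph in a real program) ---
print("Mean final weight over repetitions:", sfMeanFinaWeig)
```

Note how every number that controls sizes and durations lives in the constants section, so changing the experiment only requires editing the first lines.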
Good programming habits:
- A good program should not have numbers (parameters) inside: all numbers should be defined in constants at the beginning.
- A good program should be "wholly parameterised", i.e. the size of all data structures (e.g., the number of input units and output units of a neural network) and the duration of time intervals (e.g., the duration of training of a neural network or an embodied agent) should be defined with a constant located at the beginning of the program with all other constants, and thus easily changed.
- The control of the code flow of loops should be as much as possible located at their beginning or end (e.g., do not use instructions, such as "break", to exit a loop in the middle of its execution).
- Write commented titles to the different sections and sub-sections of your program
Key concepts of programming
As a consequence of what was said in the previous sections, "learning to program" basically means learning how the elements above can be expressed in the particular programming language one is using (e.g., C++, Python, Matlab, R). Note that each of these elements can differ, or have different names, in the different languages. For example, a mathematical matrix is stored in data structures called "array", "list", or "matrix" in C++, Python, and Matlab respectively. Let's now give a list of the specific concepts to learn in the various languages. The purpose is that the student/collaborator who is learning to program in a given language studies (e.g., on material taken from the internet) how to implement all the following elements in that language (e.g., in C++ or Python).
Data types and data structures
- Variable: a portion of the memory of the computer where the program stores an item of information. A variable is formed by 3 key elements:
- Variable name: the name through which the program refers to the variable (i.e., the portion of memory), for example "sVariName".
- Variable content: the content of the variable, e.g. a number such as 127.44, or a letter such as 'a'.
- Variable address: this is the "address number" with which the computer refers to the particular portion of memory allocated to the variable when the variable is created by the program; the variable address, which allows special low-level operations, is accessible only in some languages (e.g., in C and C++ there exist special variables, called "pointers", whose content is the address number of other "normal" variables).
- Data structures: these are variables and sets of variables (such as "arrays" in C++, "lists" in Python, "variables" in both, etc.), i.e. portions of the memory of the computer that the program uses to store data (e.g. the numbers [1, 2, 3] or the string 'abc'). The fundamental data structures we need here are those that store mathematical vectors (i.e., an ordered sequence of numbers) and matrices (a set of numbers ordered in 2 dimensions): these are very important data structures needed to perform some linear algebra operations that greatly simplify the code implementing neural networks. Given their importance, below we list them more explicitly, as you should learn how to create them in the language you learn.
- Types: the variables, and the elements of the data structures, created by the program have a certain "nature" which requires the computer to store them differently in the memory slots corresponding to them. For example, the computer stores integer numbers, real numbers, and alphanumeric characters with different binary encodings, so when you create a variable to store one of them this needs to be specified in the program (e.g., in C++ you create a variable to store an integer number with this instruction: "int sVariName;").
- Casting: an important operation done by programs, which can generate subtle bugs if not done properly, is casting. Casting means that within the program a certain content of one data structure/type is passed into a different data structure/type. For example, in C++ you might have the content of a variable of floating-point type (say: sfVari = 412.423215) and pass it into a variable of integer type (siVari). Sometimes programming languages allow you to do the casting implicitly, e.g. with a simple operation such as this: siVari = sfVari. Even in these cases, languages usually allow you to do the casting with explicit functions, e.g.: siVari = int(sfVari). When possible, we recommend using these explicit functions to increase readability and avoid subtle mistakes: e.g., a typical error is failing to realise that with a float-to-integer casting the fractional part of the number is discarded, e.g. from 412.423215 to 412. At other times, the use of the casting functions is compulsory and, if it is omitted, the compiler/IDE signals an error or warning.
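The casting pitfall described above can be checked directly in Python, where casting is always done with explicit functions such as int(), float(), and str(). This is a small sketch; the variable names beyond sfVari and siVari are invented for the example.

```python
sfVari = 412.423215

# Explicit float-to-integer casting: the fractional part is discarded
siVari = int(sfVari)

# Casting is NOT rounding: compare the two on a value with a large fractional part
siTrun = int(412.6)        # truncation toward zero
siRoun = round(412.6)      # rounding to the nearest integer
siNega = int(-412.6)       # truncation toward zero also for negative numbers

# Casting between numbers and strings
stVari = str(siVari)       # integer to string
sfFromStri = float("3.5")  # string to floating-point number

print(siVari, siTrun, siRoun, siNega, stVari, sfFromStri)
```

If you need rounding rather than truncation, say so explicitly in the code (here with round()) instead of relying on the cast.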
Basic operations on data structures
- Creation of variables/data-structures
- Assignment of content to a variable/data structure (e.g., "put" a number into a variable, or a sequence of 10 numbers into an array in C++).
- Reading of the content of the elements of data structures (e.g., knowing the content of the third element of a C++ array).
- Putting particular values into data-structures, in particular:
- Zeros: e.g., 0, 0, 0, 0, ... into an array in Matlab
- Ones: e.g., 1, 1, 1, 1, ... into an array in Matlab
- Equally spaced numbers: e.g., 0.1, 0.2, 0.3, ..., 1.0
- Random numbers: in particular, random numbers drawn from a uniform distribution ranging in [0, 1], or a Gaussian distribution
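In Python these basic operations are commonly done with the NumPy library. The following is a minimal sketch, assuming NumPy is installed; the sizes, the seed, and the variable names are arbitrary choices made for the example. Note that NumPy's uniform generator draws from the half-open interval [0, 1).

```python
import numpy as np

NUMB_ELEM = 5
RAND_SEED = 0                                   # arbitrary seed, for reproducibility

# Creation of data structures filled with particular values
vfZero = np.zeros(NUMB_ELEM)                    # 0, 0, 0, 0, 0
vfOnes = np.ones(NUMB_ELEM)                     # 1, 1, 1, 1, 1
vfEqui = np.linspace(0.1, 1.0, 10)              # 0.1, 0.2, ..., 1.0
mfZero = np.zeros((2, 3))                       # 2D matrix with 2 rows and 3 columns

# Random numbers: uniform in [0, 1) and Gaussian (mean 0, standard deviation 1)
ORandGene = np.random.default_rng(RAND_SEED)
vfUnif = ORandGene.uniform(0.0, 1.0, NUMB_ELEM)
vfGaus = ORandGene.normal(0.0, 1.0, NUMB_ELEM)

# Assignment of content to an element, and reading of that content
vfZero[2] = 7.5                                 # third element (indexes start from 0)
sfThirElem = vfZero[2]

print(vfEqui)
print(sfThirElem)
```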
Operators
- Basic mathematical operators: +, -, /, *, power, logarithm, exponential, sine, cosine, tangent
- Note that languages often have more complex operators that use the syntax of a function, e.g. "sum(vector)"
Programming controls
- Loops:
- loop lasting a precise number of cycles (e.g., the "for" in C++ or Python)
- loop whose number of cycles depends on a condition (e.g., "while" in C++ or Python)
- Conditionals (e.g., "if", "else if" in C++)
- Logical operators: e.g., NOT, OR, AND
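Here is how these programming controls look in Python, as a short sketch with invented constants and variable names. In Python "else if" is written "elif", and the logical operators are written "not", "or", "and".

```python
NUMB_CYCL = 5
THRE_VALU = 1.0

# "for": loop lasting a precise number of cycles
siSum = 0
for siCoun in range(NUMB_CYCL):        # siCoun takes the values 0, 1, 2, 3, 4
    siSum += siCoun

# "while": loop whose number of cycles depends on a condition
sfValu = 10.0
siNumbHalv = 0
while sfValu > THRE_VALU:
    sfValu = sfValu / 2.0
    siNumbHalv += 1

# Conditional using comparisons and logical operators
if siSum > 5 and not (sfValu > THRE_VALU):
    stMess = "both conditions hold"
elif siSum > 5 or sfValu > THRE_VALU:
    stMess = "only one condition holds"
else:
    stMess = "no condition holds"

print(siSum, siNumbHalv, sfValu, stMess)
```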
Functions
- Function creation
- Function use
- External variables, local variables
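The difference between external and local variables can be seen in this small Python sketch. The names (GAIN_VALU, ScalValu, sfLoca) are invented for the example, and the prefix "sb" for a scalar boolean is an assumed extension of the naming rules above.

```python
GAIN_VALU = 2.0                        # external (global) constant, visible inside functions


def ScalValu(sfInpu):
    """Multiply the input by the external constant GAIN_VALU."""
    sfLoca = sfInpu * GAIN_VALU        # local variable: exists only while the function runs
    return sfLoca


sfOutp = ScalValu(3.0)

# The local variable is not visible outside the function
sbLocaVisi = "sfLoca" in globals()

print(sfOutp, sbLocaVisi)
```

A function can read external variables, but what it computes internally reaches the rest of the program only through its returned value.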
Classes/objects
- Class: private and public data structures, functions of classes
- Constructor: a special function of classes
- Destructor: a special function of classes (only in some languages, e.g. C++)
- Variables of the class: private/public
- Initialisation of the variables of the class
- Creation of an object based on a class
- Use of the functions of an object
- Operations on the variables of an object
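A minimal Python sketch of these elements follows. The class CNeurUnit is a hypothetical example (a unit that accumulates the inputs it receives), not part of any library. Note that Python does not enforce private variables: a leading underscore in the name is only a convention signalling "do not touch from outside".

```python
class CNeurUnit:
    """Template for a unit that accumulates its inputs."""

    def __init__(self, sfInitActi=0.0):
        """Constructor: initialise the variables of the class."""
        self.sfActi = sfInitActi       # public variable
        self._siNumbUpda = 0           # "private" by convention (leading underscore)

    def Upda(self, sfInpu):
        """Add the input to the activation and return the new activation."""
        self.sfActi += sfInpu
        self._siNumbUpda += 1
        return self.sfActi

    def NumbUpda(self):
        """Return how many times the unit has been updated."""
        return self._siNumbUpda


# Creation of an object based on the class
ONeurUnit = CNeurUnit(sfInitActi=1.0)

# Use of the functions of the object
ONeurUnit.Upda(0.5)
ONeurUnit.Upda(0.25)

# Operations on the variables of the object
sfActi = ONeurUnit.sfActi
siNumbUpda = ONeurUnit.NumbUpda()

print(sfActi, siNumbUpda)
```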
Interfaces
- Print text output to the "console" (a simple window on the screen opened by the program when it runs)
- Files:
- creation and naming of files
- saving ("download") of data from the program data structures into files
- loading ("upload") of data from files into the program data structures
- Graphics (note that some languages, e.g. Matlab, have graphic elements incorporated in the language, some others, e.g. C++ and Python, need additional libraries such as Qt):
- basic plots: histograms, time series graph, scatter plot, etc.
- management of the different elements of the graphs, e.g. labels, title, type/colour of lines, etc.
- plot of multiple data series in the same graph
- Presentation of multiple plots in the same screen window
- Input from the keyboard
- (Optional) Creation and use, in the program, of basic "widgets" for the on-line control of simulations: window, button, field, slider, etc...
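The console and file interfaces can be sketched in Python with the standard library only. The file name "data.txt" and the data are invented for the example, and a temporary directory is used so that no file is left behind; graphics, keyboard input, and widgets are not shown.

```python
import os
import tempfile

vfData = [0.1, 0.2, 0.3]

with tempfile.TemporaryDirectory() as stDireName:
    # Creation and naming of the file
    stFileName = os.path.join(stDireName, "data.txt")

    # Saving data from the program data structures into the file (one number per line)
    with open(stFileName, "w") as fiOutp:
        for sfValu in vfData:
            fiOutp.write(f"{sfValu}\n")

    # Loading data from the file into the program data structures
    vfLoad = []
    with open(stFileName, "r") as fiInpu:
        for stLine in fiInpu:
            vfLoad.append(float(stLine))

# Print text output to the console
print("Loaded data:", vfLoad)
```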
Key operators to implement the fundamental linear algebra operations
Some linear-algebra data structures and operators are fundamental to implement (and understand!) neural networks, so you had better learn what they are and what they do. Note that some languages, such as Matlab, have these data structures and operators as part of the language; others, such as C++ and Python, need additional special libraries to implement them (e.g., Armadillo in C++; NumPy in Python). Here we list the fundamental data structures and linear-algebra operators you need to learn:
- Data structure for the creation of a vector (all vectors should be column vectors, whenever possible, to avoid confusion)
- Data structure for the creation of a 2D matrix
- Initialisation of vectors and matrices with zeros, ones, equidistant numbers, random numbers (uniform, Gaussian)
- Dot (or "inner") product between two vectors
- Vector-vector and matrix-matrix element-by-element (or "entrywise") math operations +,-,*,/,power,exp,log, etc. (e.g., [1, 1] + [1, 2] = [2, 3]) and logical operations
- Matrix-vector multiplication
- Matrix-matrix multiplication
- Identity matrix
- Matrix transpose
- Matrix inverse (for this, it is enough to find the function that implements it in the chosen language, not to understand the details of how the computation is performed)
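As an illustration, the operations listed above can be written in Python with NumPy as follows. This is a sketch with small invented vectors and matrices, assuming NumPy is installed; vectors are stored as column vectors (2 rows, 1 column), as suggested above.

```python
import numpy as np

# Column vectors (2 rows, 1 column) and a 2D matrix
vfA = np.array([[1.0], [1.0]])
vfB = np.array([[1.0], [2.0]])
mfW = np.array([[2.0, 0.0],
                [1.0, 1.0]])

# Element-by-element operations
vfSum = vfA + vfB                  # [1, 1] + [1, 2] = [2, 3]
vfProd = vfA * vfB                 # [1, 1] * [1, 2] = [1, 2]

# Dot (inner) product between two column vectors
sfDot = (vfA.T @ vfB).item()       # 1*1 + 1*2

# Matrix-vector and matrix-matrix multiplication
vfOutp = mfW @ vfB
mfProd = mfW @ mfW

# Identity matrix, transpose, inverse
mfIden = np.eye(2)
mfTran = mfW.T
mfInve = np.linalg.inv(mfW)

print(vfOutp)
print(mfInve)
```

A useful check when learning these operators is that a matrix multiplied by its inverse gives the identity matrix, i.e. mfW @ mfInve should equal mfIden up to rounding errors.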