Video: Fall 2017: About Programming for CIS 121 at Portland Community College
>> Good evening or good morning or good day, whatever. Today, I want to talk a little bit about programming in a very general nature. We're going to talk about what is a programming– what is programming about, what languages do programmers use, how do programmers approach their work in general. And, you know, I hope you get a little bit of feel for what it's like to be a programmer maybe. I enjoy programming. I've done many, many– I've programmed for many, many years.
And I rather like it. It's something I don't want to do all the time, but I like to do it quite a bit. When you're really deep in your code, oh, it's stressful, it's high tension at times, and it's solitary. And I'm not a solitary person. But when you complete a project, you get a real high. It's really great. So programming is something that I've always liked to combine with other activities: do some programming for a while, and then go back and do some systems administration or project management, and then do some more programming. And maybe some teaching, I don't know. Anyway, mix it all together. So let's talk a little bit about programming. I'm not going to show any code, anything of that type. It's going to be very general in nature. It's my take on the chapters in your book for CIS 121, the book "Computer Concepts". Well, in this edition, it's Chapters 5 and 6.
And my take is quite a little bit different because my experience is different than the authors. You know, everybody approaches things in a different manner. I approach things in a bit different manner. That's nothing to be said wrong about, what's in the book or about the author. And it's nothing to be said about– wrong about me, we just come at things– you know, sometimes people come at things from a different background because they have a different background. I'm more of a mathematician. I've worked in scientific computing as opposed to business computing, so I come at things from that approach. OK. Let's go down here and here's an outline of the talk. First thing we're going to talk about is what is programming, and what is a programming language. Programming languages are– Well, let's take a look at what we have in Wikipedia maybe. Wikipedia says that a programming language is a form of language that specifies a set of instructions that can be used to produce various kinds of output, whatever that means.
Anyway, programming languages are different things to different people, it turns out. Not everybody's definition of a programming language is the same. My definition is rather specific, and I think you'll find a lot of people do use my definition. I think nowadays this is actually called a Von Neumann programming language, after John Von Neumann, the famous mathematician who lived in the first half of the 20th century. His office was just down the hall from Albert Einstein's, and he died at about the same time: Einstein in 1955, Von Neumann in 1957. Only, he was much younger than Albert Einstein at the time. He was only in his 50s. Anyway, John Von Neumann was one of the finest mathematicians of the 20th century. And among other things, he defined what many of us consider to be a programming language. It requires that you have variables.
You have to have X's and Y's or something like that. There has to be some variable that you can set the value of, and the value can change. These are basically memory locations in your computer. The language has to have statements of some sort so you can write it. But it also has to have control statements. In other words, there has to be something like an if statement so that you can branch from one direction to another. If X is less than 10, do the following, else do something completely different. That's what I mean by a branching statement. You have to be able to branch and make different decisions. You also should be able to make a loop. So, you know, while X is less than 10, let X equal X plus 1. It goes down to the bottom of the loop, and then it goes back up to the top of the loop. So the first time through, if X is 1, well, X gets set to 2, the next time through, X gets set to 3, then it gets set to 4.
At some point, if there's an if statement that says, you know, if X is no longer less than 10, jump way the heck and gone, you end up jumping way the heck and gone. OK. According to Wikipedia, it has to have assignment statements. Well, we just used an assignment statement, let X equal X plus 1. And you have to have expressions. Well, I think we've already been using a lot of expressions, in terms of arithmetic expressions. My concept of the perfect programming language is one that looks like algebra. Now, business people won't agree with that. They don't think the perfect language looks like algebra. My son is a musician, he does a lot of computer music, and he really isn't into seeing a program as algebraic. But he does do a lot of algebraic stuff because that's the way programming languages tend to work.
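As a rough illustration of the ingredients just listed (a variable, an assignment, an arithmetic expression, a loop, and a branch), here is a minimal Python sketch. The names `x`, `history`, and `message` are made up for this example and are not from the lecture.

```python
# Variables, assignment, a loop, and a branch: the ingredients named above.
x = 1                      # a variable: a named value that can change
history = []               # records each value x takes (illustrative only)

while x < 10:              # loop: go back to the top while the condition holds
    x = x + 1              # assignment using an arithmetic expression
    history.append(x)

if x < 10:                 # branch: take one path or the other
    message = "still counting"
else:
    message = "done counting"

print(history)
print(message)
```

The loop stops as soon as `x` is no longer less than 10, which is the "jump out" moment described above.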
You know, COBOL, which is a business language does do a lot of arithmetic type things. Most– You know, OK. So, with that in mind– Well, I don't know. That– You know, just the fact that you have to have variables means some of the languages we have discussed in our class and we will discuss are not what I would call a programming language. Maybe somebody else would, but I wouldn't because my definition is more like a Von Neumann language. And for a Von Neumann language, really, you have to have random access memory. And we want a language that takes advantage of the random access memory, a language where you can jump places and you can do loops. Now think about doing that with hardware, with chains and gears, and it's really tough. I guess we see a jump every now and then if you tear a watch, a pocket watch apart, or– well, they don't make pocket watches anymore, do they? A clock, a mechanical clock, if you tear that apart and you look at the gears and stuff, sometimes there's a gear that gets to a certain point and then it drops.
So you can kind of get jumps and decisions by mechanical things, but they're really hard, almost impossible. And you've got to engineer the program; you can't keep changing the program. You've got to engineer it into the hardware. It's kind of a mess. But, you know, I was surprised when I read this. Actually, I had read it before, but I was still surprised because I hadn't read it in a long time: there were some brothers in Baghdad in the 9th century who came up with a mechanical flute that sort of did the right thing. Then there are the Jacquard looms, I think it's Jacquard, which France is famous for. I believe he invented those in Lyon. I couldn't find that anyplace.
But from when I was in Lyon, I think they were invented there, in southeastern France, where they had a huge silk industry at one point. Well, they still have a little silk industry. And player pianos. I mean, you have little mechanical music boxes, but they don't really have jumps and stuff. It's pretty limited. And of course, there was the work in the mid-1800s by Charles Babbage, who designed something of a mechanical computer. The programs for it were written by a colleague of his, a woman by the name of Ada Lovelace. She's probably the world's first programmer. And so there's a programming language called Ada, named after her. It was developed for the United States Department of Defense, I think around 1980. I can't look it up right now, doesn't matter. So, back here to our outline: what languages do programmers use? Well, I don't know.
There are a lot of languages. There are so many computer languages; everybody has made their own computer language. Absolutely everybody. It was a fad for a while. You know, you couldn't be a computer science professor if you hadn't made your own language, I think. So let's look at a list of a few languages. This is on a website, computerhope.com. Let's hope it is OK. And it just gives a list of a few languages. Now, this is a very short list, it's only maybe a hundred languages here, I don't know. And in my mind, things like SQL, HTML, and XML are not what I would call computer languages because they don't have variables. You can't do arithmetic in them.
They're not what I call a programming language. Cascading Style Sheets is not a programming language. They're more like– Well, they're markup languages or– well, here it says, languages marked with an asterisk above are not programming languages. Now, that's a matter of opinion. It depends on how you define programming languages. This person agrees with me, so I used him in my video. But I have found other sources that don't agree with me. In any case– Yeah, in any case, the point is the languages that I am talking about, the so-called Von Neumann languages or whatnot are an important category of languages because we can do things with those we can't do with markup languages or SQL or other languages. Not to say those are unimportant languages because believe me they're not. They're just not what we're talking about today. Here's another list of programming languages.
This comes from Wikipedia and it's a list of programming languages. And it goes on and on and on. I'm just showing you the languages that begin with an S here. SNOBOL, S– Small Basic, Smalltalk. Smalltalk, a well-known one. S-Language, S-PLUS, S, S2. R. Well, R would be under the R's. R, R++, I haven't heard that one. Ratfor, Ruby, Rlab. OK. Lots of them. Millions of them, millions of them. Billions– Well, maybe not billions but, you know, lots of languages. Wait, GRASS. I use GRASS, I would not call GRASS a programming language. G-code. GRASS is not a programming language. GRASS is a geographic information system, an open source geographic information system. I maintained a couple of the– Oh, I've done a lot of maintenance on GRASS. I've used GRASS a lot. I– Well, I would not call it a programming language. That's just not right.
Yes, you can do programming in GRASS, but I wouldn't call it programming language. Under the F, there's Fortran. Well, anyway, lots of languages. Here's another list of languages. This is languages by type. And they've got, you know, if you like well, whatever, curly-bracket languages are, you can go to curly-bracket languages. And it will give you a list of languages that like to use curly-brackets for various things. If you like compiled languages, you go there and it gives you a list of so-called compiled languages. We'll talk about that a little more a little bit later because I don't know if I'd call Python a compiled language. It depends on what you mean by compiled. OK. Enough said. Let's go back to our outline. How do programmers approach their work? We'll get to that later. Language performance.
OK. When we talk about languages, you know, at the basic level, computers don't understand languages. They understand 1's and 0's. And all the commands that the CPU executes at the basic level are 1's and 0's. There are things like, you know, 100-whatever that says load register whatever from memory address whatever. Add register 4 to register 5. And they're just strings of numbers. Many of them are instructions that do have meaning. Some of them are actually numeric values to be used in the program, whatever. That is the way the first computers had to be programmed: just typing in these strings of numbers that had meaning to nobody except the women doing the programming. And yes, the first programmers, by and large, were women. They were female mathematicians who couldn't get jobs doing a lot of things. A lot of them, in World War II, didn't have a lot of choice in jobs.
They ended up getting jobs calculating by hand, as all things were done then. Trajectories and stuff for artillery. And then when some men came up with computers, or the first inklings of computers, they basically went to these women and asked them to do the programming. So it was largely women that were the first programmers. And to this day, there are many fine women programmers. I married one. Let's see. OK. So the first languages were, you know, these machine languages where you typed in the numbers. That was awful. So somebody said, well, how about if we use the name load, L-O-A-D, for loading into the register? And we can write a little program that does a translation, that will substitute the number that we can't remember and are always looking up in the book.
So that sort of very, very simple, almost one-for-one translation between machine code and something that people could kind of make sense of, that was called assembly language. And every CPU will have an assembly language. Those are very, very basic steps, but at least they're mnemonic and they make some sense to people. Well, only to very arcane people, because they really do have to do with the hardware. So they are kind of arcane-type things. But still, it's a world above machine language. However, since it's a one-for-one translation to machine language, both machine language and assembly, which then gets translated to machine language, run very, very fast in general. It's incredibly fast, very efficient code, but very time-consuming to write.
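To make the "substitute a name for the number" idea concrete, here is a toy assembler sketched in Python. The opcode numbers and the three-instruction set are invented for illustration; real CPUs each define their own.

```python
# Hypothetical opcode table for an imaginary CPU; real numbers differ per chip.
OPCODES = {"LOAD": 1, "ADD": 2, "STORE": 3}

def assemble(source):
    """Translate each 'MNEMONIC operand operand' line into a list of numbers."""
    machine_code = []
    for line in source.strip().splitlines():
        mnemonic, *operands = line.split()
        # One-for-one substitution: the name becomes its number.
        machine_code.append([OPCODES[mnemonic]] + [int(op) for op in operands])
    return machine_code

program = """
LOAD 4 100
LOAD 5 101
ADD 4 5
STORE 4 102
"""
print(assemble(program))
```

Each line of text becomes exactly one numeric instruction, which is why assembly is only a thin layer over machine language.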
The next step was fully compiled languages. These are languages where you can write them and they look like beautiful algebra or something. But, you know, Fortran looks like algebra and maybe with a little bit of trigonometry and a little bit of calculus thrown in. Not much calculus. You got to do– You got to write the code to do that. But it's a beautiful language. You know, I'm a mathematician, I can say it's a beautiful language. If you ever talk to a computer scientist and they all tell you it's a dead language, it's an ugly language, don't listen to them, listen to me. No. But there is a difference of opinion there. And Fortran is not a dead language. It's used in high-end supercomputing. It's not a general purpose language like it once was. But it does have a niche where it's used a great deal.
Anyway, C is another language very much like that. And what is done is you've got a code that looks like algebra or looks like, you know, if you take COBOL, COBOL looks like business type stuff. It reads kind of business like. And then there's a translator called a compiler that translates that actually into machine language. It's actually a couple of steps. There's also something called a linker– But basically, it takes this higher level language and translates it into machine language. The resulting code tends– It tends to run very fast. Fortran or C is much, much faster to write than assembly language, much easier to write. It's really nice. And the resulting code runs very fast. So to this day, these languages tend to be used for a lot of applications. If you write a device driver, the things that, you know, like– oh, like the software that can read from a hard drive and put that stuff into the operating system. That will be written in C or something like that.
If you write an atmospheric or ocean modeling program, say you're trying to study the upwelling of ocean water off the coast of Peru, you may end up with a set of partial differential equations called the Navier-Stokes equations that will describe this upwelling area. It's pretty much unsolvable, because the partial differential equations are not linear and they're, you know, they're bad. But you can simplify that by reducing the number of layers of water you have, or by assuming some terms are zero if the ocean is calm. There are ways of simplifying things. And under those circumstances, you can probably do some sort of numerical solution to that. It will be very intensive. It takes many calculations. The code runs very slowly. You may use a Cray supercomputer to do this sort of thing. And you'll probably write the code in... well, when I wrote the code to do that, I wrote it in Fortran. And I think they would to this day, although we do have some really amazing tools in other languages today that we did not have then. That was 20 or 30 years ago.
Well, that was a long time ago. That was in the 1980s. OK. That's even longer than 20 years. Anyway, code in those languages runs fast. It takes a fair bit of time to develop the code; it's not trivial. Then there's another set of languages like Perl, Ruby, Python, PHP, ELisp, most Lisps, that are... Oh, I'm sorry. I got ahead of myself here. Let's go back to this stage. This is more historic, kind of. The other way of doing things is, let's forget all this compilation stuff. We don't need to worry about any of that. Let's just do the code line by line by line. We will write a program that steps through, reads a line of code, and does whatever it says. Reads another line of code, does whatever it says. If the code goes into a loop, it goes back up to the top of the loop, reads that line again, and does whatever it says. And so it is not translating things into machine language; it is executing things as it goes without any fancy translation.
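The read-a-line, do-what-it-says cycle can be sketched in a few lines of Python. The four-instruction toy language below (`set`, `add`, `print`, `jump_if_less`) is invented for this example and does not correspond to any real language.

```python
def run(lines):
    """Execute a toy language one line at a time, with no translation step."""
    variables = {}
    output = []
    pc = 0                              # index of the line being executed
    while pc < len(lines):
        parts = lines[pc].split()       # the line is re-read and re-parsed every time
        op = parts[0]
        if op == "set":                 # set NAME VALUE
            variables[parts[1]] = int(parts[2])
        elif op == "add":               # add NAME AMOUNT
            variables[parts[1]] += int(parts[2])
        elif op == "print":             # print NAME
            output.append(variables[parts[1]])
        elif op == "jump_if_less":      # jump_if_less NAME LIMIT LINE_INDEX
            if variables[parts[1]] < int(parts[2]):
                pc = int(parts[3])      # go back up to an earlier line
                continue
        else:
            raise ValueError(f"unknown instruction: {lines[pc]}")
        pc += 1
    return output

program = [
    "set x 1",
    "add x 1",
    "print x",
    "jump_if_less x 4 1",
]
print(run(program))
```

Note that every pass through the loop splits and inspects the same text again. That repeated work is one reason this style of execution tends to be slow.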
Those are called interpreters. And we have a lot of interpreted languages, many of them. Unix uses the Bourne shell language, or the C shell, or other shells. But most of the shell languages are fully interpreted to this day. They are very slow. Fast enough for what they do, but you wouldn't want to solve the world in them because they're too slow for that. For writing quick and dirty little shell scripts, they're great. The BAT files in MS-DOS or in Windows are interpreted languages. I threw in an old language that I had a lot of fun with way back in the 1970s called SNOBOL. I guess it's still around a little bit today, but mostly it's just a quirky language. It was a lot of fun. It was the slowest running language I had ever seen. It was slow. But anyway, it was fun. Well, because those languages were slow, as scripting languages developed and got more complex, we got Lisp... Well, Lisp was a very early language.
We still use it today. With PHP, Python, Perl, and Ruby, because interpreted languages are really slow, they did several things to speed up these languages. One of them was they would go through all the code once and assign numbers to the keywords. So if you had an if statement, they'd say, well, that's instruction 5, so that they don't have to go through each time and figure out whether capital IF is if, or lowercase if is if, or capital I lowercase f. So by what I'd call tokenizing everything in the program, they could speed it up. The other thing that they've done in practice is they have found certain things that people want to do all the time. And they put those into libraries, which sometimes they write in C or some other language.
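The tokenizing step described above might look roughly like this in Python. The keyword numbers are hypothetical (the 5 for `if` simply echoes the example in the talk), and a real byte compiler does considerably more than this.

```python
# Hypothetical keyword numbering; a real byte compiler has its own scheme.
KEYWORD_CODES = {"if": 5, "else": 6, "while": 7}

def tokenize(source):
    """Replace each keyword with its number once, before the program runs."""
    tokens = []
    for word in source.split():
        code = KEYWORD_CODES.get(word.lower())   # IF, If, and if all map to 5
        tokens.append(code if code is not None else word)
    return tokens

print(tokenize("IF x < 10"))
print(tokenize("If x < 10"))
```

Once the spelling question has been settled up front, the part of the system that runs the program only has to compare small numbers.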
So by extracting the proper things, which has taken them many years because they had to learn what things people use the most, and really optimizing those, often in another language, they can speed up the whole process. A byte-compiled language might be 10 times as fast as the interpreted language. And the fully compiled language will be 10 times faster yet. But it makes languages like Perl, Ruby, Python, PHP, Lisp, and ELisp languages you can really use to solve big, huge, major problems. They're good languages for a lot of things. Not for everything, but for many, many things. And they are really growing. You know, there was a day and age when I wouldn't say a person was a programmer if they were using an interpreted-type language, but with the byte-compiled languages and stuff, that's not the case at all.
It hasn't been the case for the last 20 years. These are real languages that you can do really good, big projects in. They're quite good. OK. So that tells us about languages. And if you go to this guy up here, the list of programming languages by type, you can find programming languages that are compiled, programming languages that are interpreted, programming languages that are meant for use in parallel computers, whatever. It's a cool list. OK. Going back here, we're going to talk about writing code. Oh, well, one thing. Now, I sometimes think about this. There is this question. We talk about compiled languages. Fortran and C are compiled languages. Python is an interpreted language, a byte-compiled language.
I guess does that really– is that really true? Well, it is. That's the way we talk about them. But I could actually write an interpreter that would interpret a Fortran program. I wouldn't have to compile it. And now I don't know about, you know, there are languages I wouldn't– I can't imagine writing a compiler for. But many byte compiled languages, we probably could, if we wanted to actually write compilers for, but we don't. So we do talk about compiled languages and interpreted languages, but I'm not quite sure that that is really– I mean it's– I'm saying it's more tradition than it is real hardcore this is the way it is. You know, someday somebody may write a compiler for Python and they will have to put that in the list of compiled languages. In the case of, say, Fortran or C, there are– we– interpreters for Fortran and C, we don't call them interpreters, we call them debuggers. Many of the– Well, many. Every interactive debugger I know of actually interprets the program that they're debugging and in order for you to find the bugs. And then after you get the bugs, you'll compile the code.
But the truth is, they are interpreting the program. That's why they run so slow. I have often worked on very compute-intensive programs, when I was doing mathematical modeling like oceanographic modeling. I would often find that the debugging mode would run so slow, because it was interpreted, that I couldn't afford to run it. It would have taken forever and ever and ever and ever. So I'd have to debug my program without using interactive debuggers, which, you know, is a lot more work than using interactive debuggers. But there are methods and it can be done, and I have a lot of experience doing it. OK, let's talk about writing code. There's a couple of ways to write code for programs. One, the new way, the in way, is called an IDE, an integrated development environment.
Now, maybe it's not new. I think Microsoft has wanted us to use this for a long time. But an integrated development environment combines a text editor and a compiler or an interpreter and a debugger, and it's all one big screen. And you type in your program in this one big screen. And it may give you aids, so that if you misspell if in your if statement, it will prompt you and say you've misspelled if, or maybe it's a longer word than that. Or if you make a mistake typing in the name of your variable, it will say, don't you really want this one? Or maybe if you start using a new variable, it will automatically put in the declaration statement, if you're working in a language that requires declarations. So the integrated development environments give you lots and lots of aids and supposedly make things much easier. PyCharm, which we are using, is an integrated development environment for Python.
I find that a little frustrating. That means if I use Python, I can use PyCharm. But if I am programming in Perl tomorrow, I've got to use a different environment, with a different editor. I don't care for that. But there are development environments like Eclipse. Eclipse is an open source development environment that supports quite a few different languages. I don't use Eclipse, so I can't tell you much about it. But it supports a lot of different languages. I'm not big on the integrated development environments. They may take over the world. I don't know. I've worked in some that really are bad. I've worked in one that was put out by ESRI, the Environmental Systems Research Institute. Fortunately, they killed the environment. It was so bad.
But it gave you the editor, and it gave you some sort of an object-oriented language called Avenue. It gave you everything you supposedly needed. But the editor was so bad, I mean it was so awful compared with something like Emacs, that I always edited the program in Emacs and then imported the program into the integrated environment to run it. So I don't know. Some people love these environments. I've got good friends that use these all the time and work in them. Microsoft has got a number of them. You know, I think Microsoft Visual Studio is basically an integrated development environment. It's very, very popular. So, you know, there really are pluses to these. But there are also negatives.
And, you know, you take your choice. Now, the traditional development environment I'm going to describe is a Unix environment. Windows has its own; everybody has their more traditional environments. The Unix/Linux one is probably the most popular. You use your text editor of choice. If you like JOE, use JOE. If you like Kate, use Kate. If you like Notepad, use Notepad. If you like Emacs or vi, use those. Most programmers in the traditional Unix environment will choose either vi or Emacs. But it doesn't matter. Use the compiler or interpreter of your choice. Sometimes, for a language like C or Fortran, there might be 14 different compilers available. And sometimes one compiler is better for your application than another compiler. One might make code run faster than another compiler. Or one will make parallel instructions easier to use.
If you're doing a massive program, you may need to run it on a parallel cluster of computers. So you may want to use a different compiler than the standard one. And it's nice to have a choice. And then there will be a set of tools like m4, make, and autoconf that help you manage your code. Make is kind of a cool tool. It's a little arcane, but it's a cool tool. If you're working with a huge program with a thousand files, and you change a file, which means a hundred other files need to be rebuilt, make will rebuild those for you without touching the other 900-some files that don't have to have changes made. So make is what's called a dependency language. It helps you keep track of what changes need to be made and what don't. And you can speed up your development by only changing things that have to be changed. Autoconf helps you if you're working on multiple versions, multiple machines, multiple operating systems. If you want one program that runs on Solaris and Linux and something else, autoconf can really help you out. Then there are things like CVS or, I don't know, RCS, there's a lot of these, Subversion. These are called version control systems.
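Going back to make for a moment, the dependency idea can be sketched in Python. This is only an illustration of the bookkeeping: the file names and the `DEPENDS_ON` table are made up, and real make decides what is stale by comparing file timestamps and then runs the build commands from a Makefile.

```python
# Hypothetical project: each target lists the files it is built from.
DEPENDS_ON = {
    "app": ["main.o", "math.o"],
    "main.o": ["main.c", "math.h"],
    "math.o": ["math.c", "math.h"],
    "docs.pdf": ["docs.txt"],
}

def needs_rebuild(changed_file):
    """Return every target that depends, directly or indirectly, on changed_file."""
    stale = set()
    frontier = [changed_file]
    while frontier:
        current = frontier.pop()
        for target, sources in DEPENDS_ON.items():
            if current in sources and target not in stale:
                stale.add(target)          # this target must be rebuilt
                frontier.append(target)    # and so must anything built from it
    return sorted(stale)

print(needs_rebuild("math.c"))
print(needs_rebuild("math.h"))
```

Changing `math.c` leaves `main.o` and `docs.pdf` alone, which is the time saving the talk describes: only the things that have to change get rebuilt.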
If you get into a big coding thing where you've got lots of things going on, you have to start keeping track of the versions you have, because you may have 14 versions of your program. As you make changes to your program, you make a change. You'll call that version 1.11. You make another change, that's version 1.13. You make another one, it's 1.15. Oh my, I made a mistake in version 1.15. I better throw out that version. We're going to drop back to version 1.13 or something. And the version control systems will help you keep track of all that versioning. It can get really complicated, especially if you've got a whole set of programmers working on one program. In which case, you may even want a check-in/check-out system. Most of these version control systems have the ability to manage check-ins and check-outs of source code so that only one person can modify a module at one time.
You can't have two different people trying to make changes to the same thing, or else you've got to merge your changes. Well, there are systems for doing that too. GitHub is an online system that helps with version control and check-out, check-in, things of that type. It's very popular with developers. OK, likewise in the traditional environment, you do your debugging however you want to do it. In the Unix environment, there is GDB. There's a number of good debuggers out there that work with certain languages. GDB is for C and Fortran, and C++, I think, but not Python. Anyway, there are debuggers around. There are bug tracking systems where people can report the bugs, and you can then assign the bugs to a programmer, and he's got to fix the bug.
And then he basically tells Bugzilla, or whatever system you're using, that the bug has been repaired, and things of that type. There are package managers, which have to do with how the software gets installed on particular operating systems. You know, when you get something down from play.google.com, it comes as a package, and there is an Android way of doing this package and an Android way of doing the install. Likewise, if you use Ubuntu, there is an Ubuntu way, well, it's the Debian way, of getting this thing down. It will probably be a .deb file, and then it gets spread across the system. All the files go into the right locations. All the dependencies are taken care of. If you use Fedora, there's a different system, the Red Hat Package Manager.
And then there's documentation systems. Part of the traditional development environment is documentation systems. In the Unix environment, the man page is still really God, it's really important. But the GNU people, the Free Software Foundation, have tried to push people to info files. And there's all these people that do online documentation using HTML. But again, I don't consider those a part of the traditional system. My answer is, use the man page. OK. So we have done development environments. Now, the thing the book talks about is structured programming. Structured programming is really a very, very... Well, try to program without structure. In about a year's time, you'll find out why we need structured programming. Some languages force structure on people. With other languages, you have to force the structure onto the language. I rather like languages that encourage a little bit of structure but don't force it on you. My own opinion is that to be able to write good programs in a language, the language has to be full enough that you can also write bad software. And, you know, in the early days of programming, well, maybe today, I don't know, people will write a program, and they'll be down here in the program and find some code way up there, and they'll say, oh, I wouldn't have to rewrite that code if I just used that up there.
So I'm putting an if statement here that, if it finds some condition, you know, goes to that code up there. And then up there, I have a go to so that I can get back. And before long, what you have is a plate of spaghetti. You cannot follow your way through the code. It is so hard to maintain. I speak as a voice of experience. I started programming, you know, way back in the '70s. And I didn't know much about what I was doing. Maybe nobody knew much about what they were doing. It was a bit hard. And, you know, I was writing very complex mathematical software. And oh man, during the first year, it was tough. Some of my code was pretty bad, and I was having trouble reading the code that I wrote. It's not an unusual problem. And I was working in Fortran at that time, a language that you can write very bad code in. Fortran has improved over the years, but it's still a language you can write very, very convoluted bad code in very easily. It's a beautiful language, but, you know, that doesn't mean you can't do bad things if you want to. And in my first year of programming, I ran into a little book that nobody had ever heard of, it was brand new, called "The Elements of Programming Style" by Brian Kernighan and P. J. Plauger.
And– Oh, that talks about integrated development environments. There. And you can find that book as a PDF online. This is the best book I've ever read on structured programming. And of course, these people are very famous people. Brian Kernighan, at the time he was writing this book, was working with Dennis Ritchie on developing a language called A or B or something, which later became C, that they used to write an operating system which they were calling Programmer's Workbench, which was later renamed UNIX. And the second author has also become very famous for his work in the same era. So the book really talks about programming style in Fortran, but it doesn't matter what language you're using. It just talks about good programming style. It's a short book, maybe 60, 70 pages. It's a great book.
And it's cheaper than your textbook since there's a PDF online of this [laughs]. It's a better book too. It's a good book for what it does. It simply talks about programming style, nothing else. I might mention this is the same Brian Kernighan who wrote "The C Programming Language" with Dennis Ritchie. "The C Programming Language" just by chance is the bible of C programming. It was written many years ago by the two guys that wrote C. And it's a good book. Any C or C++ programmer should have a copy of this book. And guess what, there seems to be an online PDF of that too. So I must say I paid good money for my copy but– Yeah. But it was worth every penny of it. OK. One of the things in writing structured code is– I told you about the spaghetti where you got this little bit of code up, up, up, up here that you want to reuse when you're down here someplace. I'm getting lost here with– OK. And you want to reuse this code that's way up here.
Nine times in 10, what that means is that code doesn't belong in your program. That code should be in a module or a function that gets called from your program. So what you really should be doing is, don't reuse that code in place. Don't make a copy of that code, but pull that code out of your program and put it into a subroutine or a function or, depending on your language, whatever they call a module. Put it into a separate function and use it there. Because I bet that's what the code is: it should be called every time you need it. And that will protect you from a lot of spaghetti code. Modularity is very important in writing programs. Big programs have millions of lines of code. The Linux kernel has, you know, a million, two million lines of code. Linus will tell you that he's– I think he's written 10% of those but he probably knows the kernel better than anybody. And there's a lot in the kernel that he says he doesn't know.
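The advice about pulling reused code out into its own function can be sketched in a few lines. This is only an illustration, not code from the talk or the textbook: it uses Python rather than Fortran or C, and the names `mean` and `report` and the sample numbers are made up for the example.

```python
# Hypothetical example: the function names and the numbers are invented
# purely to illustrate "put the reused code in a function".

def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


def report(temps_morning, temps_evening):
    # Both places that need an average call the same function,
    # instead of copying the averaging code or jumping back to it.
    return {
        "morning": mean(temps_morning),
        "evening": mean(temps_evening),
    }


summary = report([10.0, 12.0, 14.0], [8.0, 10.0])
print(summary)
```

The point of the sketch is that the averaging logic lives in exactly one place, so a fix or improvement there reaches every caller without anyone having to trace jumps through the main program.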
It involves a lot of different programmers. And the way you isolate their work and keep them from stepping on one another's feet is by making different modules and working in modules. By and large, I don't– there's exceptions. Don't, you know, don't be dogmatic about it, but a program should never be more than one or two pages long. Now, you know, I have written programs that are probably two pages long, but they are straight down code, extremely simple straight down code. I would do that in that situation. But generally speaking, a program shouldn't be more than one or two pages. And some people will tell you even shorter than that. Mine is one or two pages because you need at least a page of comments, so. And that means you need a lot of modules. And the way most of us do modules– I should be consistent in what I call these guys. It's all the same: modules, functions, methods, subroutines. Fortran people, you'd call them subroutines usually. C people call them functions. But in most languages, you could put a main program followed by a module followed by a module followed by a module all in one file and they work fine.
I can name a couple of languages where they don't work fine, but most languages work fine. However, I found over the years, in most languages, it's best to put every module into its own file. A separate file. One module per file. One file per module. A one to one relationship. Then, you know, a program like GRASS, I did a lot of maintenance on GRASS at one time. GRASS has hundreds of modules, literally hundreds. And they are, you know– GRASS is written in C. Well, 90% of GRASS is written in C. I think there's a little Fortran in it. Actually, sometimes, you can mix languages in a program. It's not always the case that a program is written in one language. There are many programs that are written in Fortran and C. There are some programs that are written in C with a little bit of Python thrown in.
Be careful when you mix languages. Don't do it willy-nilly, but there are times that's a good thing to do. And a large program usually is stored in a tree structure. And a tree structure might look like, if I can find a tree here. I thought I had one over here. OK. Here's an example of what– this is just– I made a directory called GRASS. Now, there's no code here. I'm just using this as an example. But as an example of what you might have for a tree structure is you might have something of this type. And in a Windows environment, you'd have kind of the same type of thing. Only your names wouldn't– might be different because Windows likes longer names than Unix. Unix people use like little short cryptic names. So like, you know, we use– where are you is spelled pdf in Linux or I'm sorry pwd. You know, because I don't know.
Actually it stands for present working directory, but there's a lot of cryptic things. Anyway, I might make– let's clear the screen. I might make subdirectories under, you know, I've got GRASS as my major, major thing. Or maybe I would even call it not GRASS but GRASS 3.1 or GRASS 7.8 or whatever revision it is. Then under that, I would put in a directory called source, or– most of us Unix people would spell it s-r-c because we don't know how to spell and we haven't vowels. So– And we put all of our source files under that. And if we go down into that directory, we have no source files there. But if this was actually filled, there'd be hundreds of files down here, all of which would end in, say, a .c, or a .f77 for Fortran, or something like that. And there'd be hundreds of files and they'd be the source code that you would use to compile. Those would be the files that you'd edit.
Now, of course, if you're working with a group of people, that's your own copy. And then you might have to check out a file from GitHub or whatnot, you know, with write rights, in order for you to modify it, but worry about that later. Anyway, that's where all your source files would go. And then you might have an area over here called include. And that's little snippets of code that get included into the source files when they're needed. C, C++, sometimes Fortran. But C or C++ really likes to use these include files where things get slammed into the code just prior to compilation.
Next one is– Oh, next one might be your bin directory. This would be where all the binaries are kept after they are compiled. They might be kept over there, and then the total final program as well, one that says GRASS. But you might also have the modules over there, just the compiled modules before they all get linked together. Yeah, I know, it's a lot. If you haven't done this, it is a complex process. But just hearing this will give you kind of a clue of what's going on. And if you ever get into it, there's a lot to learn. The other way you might keep your binaries is a guy like this called binaries. And let's go down to binaries. This gets even more complex. Instead of keeping your binaries in one bin directory, sometimes you may be developing code for several platforms.
I've developed code that is supposed to run on seven or eight different platforms at a time. It's one set of source code but it's got to run on every version of Unix that a certain client owns. And the client in this case owned like about eight different versions of Unix. And then to make things worse, they threw Windows at me. And Windows is so different. That was not fun. Windows, that's not fun. I don't think GRASS runs on Windows to this day, so. Maybe, I don't know because I don't care. Anyway, you'd keep your binaries for each– you know, it's one set of source code that's kept under src, but the binaries would then be kept in different areas depending on whether it's the AIX architecture or the Android architecture or the Linux architecture. Your documentation could be kept under the doc directory or the man directory.
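The tree layout described above can be sketched as a small script that builds an empty skeleton, much like the empty example directory shown in the talk. This is a rough sketch under assumptions: the top-level name `grass`, the helper `make_tree`, and the platform names under `binaries` are placeholders chosen for illustration, not the layout of the real GRASS source tree.

```python
import tempfile
from pathlib import Path

# Hypothetical layout: directory names follow the talk, and the platform
# names under "binaries" are placeholders.
LAYOUT = [
    "src",             # source files, the ones you edit
    "include",         # snippets pulled into the source before compiling
    "bin",             # compiled binaries, single-platform case
    "binaries/aix",    # per-platform binaries, multi-platform case
    "binaries/linux",
    "doc",             # documentation
    "man",             # man pages
]


def make_tree(root, layout=LAYOUT):
    """Create the empty directory skeleton under root.

    Returns the sorted list of directories created, relative to root.
    """
    root = Path(root)
    for rel in layout:
        (root / rel).mkdir(parents=True, exist_ok=True)
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_dir()
    )


# Build the skeleton in a throwaway directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    created = make_tree(Path(tmp) / "grass")
    for name in created:
        print(name)
```

One set of sources sits under `src`, while each platform gets its own subdirectory under `binaries`, which is the arrangement the talk describes for code that has to run on several systems.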
A man– Unix uses a man page system and man stands for manual. And they have their own man page system that comes with its own markup language called troff or groff, and it works a lot like HTML. It's not HTML in any sense. OK. Well, that gives you an idea of just how a large, large program might be stored. And we're talking large programs here that have, you know, hundreds of modules. Well, actually, 30 to 40 modules is enough. Each module in its own file might need some sort of version control system, or check-in, check-out system, like GitHub. OK. The last thing we want to talk about, and I'm probably way overtime here. But the last thing I want to talk about is software design. There's a lot of different theories on software design. And there's a lot of different methodologies on how you design software.
Everybody has their own ideas on how you design software. And it's a whole that is played. Let's go over here to software design. This website has a list of various design methods used. Introduction. Well, introduction isn't a method. The Agile Software Design Methodology. The Crystal Methods Methodology. The Dynamic Systems Development Model Methodology. The Extreme Programming Methodology. The Feature Driven Methodology. The– Well, yeah, Lean Development Methodology. Mean– No, Lean Development, not Mean. Extreme Programming is Mean Development. The– oh– There's a prototyping method someplace. I don't see it but, you know, that's probably called Rapid– Oh that's Rapid Development. Rapid Development depends a lot on prototyping, I think. The Systems Development Life Cycle Methodology.
Oh, yeah, I've done– I learned that. The Waterfall or Traditional Methodology. There's a lot of these, I– and– Actually, I got a side story here. Extreme programming was developed by a fellow by the name of Ward Cunningham, who I may have mentioned in a previous video because he also developed Wiki, the whole concept of Wiki. And Ward developed extreme programming and I can remember having a beer with Ward after the end of the day once. And yeah, and he said– well, he had second thoughts about his methodology that he thought it was a great methodology if you had to get something done. It would– If you really had a deadline and you just had to get it done, but that it tended to be painful to use and wore programmers out. And hoped nobody would use it on a regular basis, that they'd only use it for emergencies or for– at least that was my understanding of what he was saying.
Anyway, neither here nor there. There's a lot of these methodologies, and then there's people who develop software using what I would call– Well, and many of these methodologies, I don't know about them. I mean I've used a lot of them, kind of, sort of. Sometimes, the methodology has to do with selling your work. They're good to present to the people funding you. They do– They can help you, but they can also kind of lead you astray; you have to use them very carefully. There's a danger of them just becoming bureaucratic gobbledygook. Bureaucratic gobbledygook. OK, enough said. There's also people who just develop code, not using any methodology particularly. I would say they use the un-methodology method. I'm not saying they don't design their code, because they really do. And I'm using the word un-methodology because open source people like to use, you know, instead of copyright, we talk about copyleft.
Instead of conferences, we talk about un-conferences. So this is an un-methodology. But it isn't, like, going into things with your eyes, you know, full speed ahead with your eyes closed. You really are using a lot of– You're designing your software. If you don't design software, you don't get good software. It's got to be designed. But you have to have a methodology that fits the problem. And sometimes, the methods, it will change depending on the problem. The best method will change depending on the problem. Don't get glued to waterfall methodology or agile methodology. Use what fits the situation. One of the things I like to do when I'm developing software and it depends on the type of software, but I like to get trained by the people doing the task today, the people who will be using my software and the people who are doing the task today. And I actually like– you know, if people let me do this, I like to do their work. I mean, like move into their office, and if they're doing some process by entering things on paper and then taking it, the paper over and typing it into a big mainframe someplace, I want to– I want to spend some time doing that.
I don't want to just replace that whole thing until I feel like I understand their problem. It makes them accept me better. It makes– So I actually understand what they're doing and I understand their objections if they have objections, or I understand what they want better. You know, it is important to do research, to talk to experts– and some of those experts are the people doing the work, not the managers managing the people doing the work, because they've probably never done the work. Talk to the people and work with the people who actually do the work. And sometimes people don't want you to do that. They don't like to pay you the salary. You want to be paid for doing the work that they hired somebody for $10 an hour to do. Well– That's– It needs to be done. OK. The next thing you do usually is you beg, borrow, and steal resources. And steal, well, it may come to that. A lot of– You know, in open source development, or a lot of development, you don't have many resources.
People aren't giving you tons of money upfront saying here's a million dollars, go write a little bit of code. Instead, you know, you're stealing a little bit of time; basically your salary is covered and you're trying to use 20% of your day to do this side project that will actually work out. And you're using computers from the backroom. If they've got some 386's around from the 1980s, you'll say they'll run Linux, we can do it. One reason for using open source software is it doesn't cost money. It's good stuff. It doesn't cost money. You can install it at will. Steal resources. Well, you know, you scrounge, you know, do whatever you have to. I can remember, you know, working in somebody's office because I needed some resources and they told me well, we'll let you use our office after hours. So I was, you know, I was the graveyard shift because I needed the computers that ran the software they had. So– And I didn't have it.
I didn't have the computers. So, you know, I'd sneak down into their office at night– I mean they gave me the keys, but I'd sneak down there at night, use their stuff off hours. That's the way it's done. And then the next step is get out some sort of a product. Get a prototype out. Get it out as soon as you can. It may be in the wrong language. You may be piecing little bits together here and there. But get something out that sort of kind of works. And don't tell people it works, tell them I think this might kind of sort of work because you don't want to overpromise. At that point, what you do is a little bit of documentation. Do documentation if you need to. If you don't, don't. But do what you need to and get people to start using it.
That builds up a little bit of community. And that's probably about the time you're ready to start giving presentations at conferences, user groups, whatever. You know, talk about your product. You're out there, you're trying to recruit. You're trying to recruit people to use it. More importantly, people to help develop it, people to pay you to develop it more. So you're back to begging, borrowing, and stealing resources. And hopefully, hopefully, you have proven enough of the prototype that somebody will actually give you some money at that point, which is nice. And you're recruiting developers, users, documentation people, everybody you need. And don't limit the search to your own company. If you're developing open source software, don't ever let your own company get too much of a tie to it. You want to get out where, you know, one company is never more than 20% of the user base or whatnot.
You really want to get it out where it– it won't be seen as, oh, that's IBM's product or that's John Auto Company's product, you know? You want it to be seen as having a broad enough user and– well, developer base you don't always get, but user base, that it's seen as being more general, then you can get resources from other groups. You will get– might get programmers to help you, volunteer programmers, maybe you got to get some money for them, documentation people, and just users to help pound the thing and give you ideas on how it should develop and guidance. From there, you– at some point, you go back and you fill in your design documentation. Now, hopefully, you already have something for algorithms that work, I hope.
And– Or at least they work except for the big warning thing, this step won't work. And you refine your prototype and you start it all over again. And that's the way you, you know– yeah, that's the way a lot of open source stuff gets developed. And you go over and over and over. And you sort of do it all at once. I guess agile development kind of does it all at once. But– And you do it for as long as possible. One of the problems with, like, the waterfall approach to software methodology is they seem to think you develop a product and it's over. But a good product is never over because you go right into the maintenance phase and then you go into the next version. I told you that Fortran is still a viable language. It's still around. Version 1 came out in, what, 1954, 1955, something like that. And so we're talking– Well, you know, 60, 70 years later, it's still being developed. Maybe not as rapidly, maybe not as fast but, you know, it's still being developed.
It now has object-oriented stuff. It has parallel processing stuff. It has pointers. It has stuff that they never dreamed of in the 1950s. And, you know, successful software goes on and on and on. And so, you know, it's not necessarily over. OK. The only thing that is necessarily over is my talk which finally is over. I hope this has made some sense. We'll see. OK, thank you. Bye-bye.