Thursday, August 23, 2007

OOPs?

I entered the National Computer Science School Python programming competition run by the School of Information Technologies at the University of Sydney. It's been great for my learning. Some of the challenges have been hard for me, e.g. simulating a spreadsheet and making a MUD game (although there was a fair bit of guidance for the latter).

But one thing I noticed was that there was no requirement for object-oriented programming.

I do have a beginner's book on Python programming (Python Programming for the Absolute Beginner by Michael Dawson) which does teach OOPs. Don't be put off by the dumbed-down title. It has very clear explanations, which in my experience is unusual for programming books.

Also, I've become aware of the origins of OOPs through reading Alan Kay's Early History of Smalltalk and Dan Ingalls' Design Principles Behind Smalltalk, and this knowledge makes a difference (without going into the details at this stage).

So, I've become aware of two things.

At best, Education is stuck at the level of procedural or structured programming. Quite often, for most students, it is just applications, with no programming at all.

OOPs is important (vital) for programming more complex systems but is harder and therefore makes limited inroads into formal Education.

Despite finding it hard to get my head around OOPs myself I don't really want to believe the second statement. I'm hoping that etoys / squeak (visual programming) might provide the answer of making OOPs more accessible. I'm not sure.

Reading Mark Guzdial's blog (he has published books on both Smalltalk and Python) makes me think that part of OOPs (creating classes) might be too difficult.

This:
"Objects, for example, are harder for people to understand than procedural programs. Distributing responsibility and process across multiple objects increases cognitive overhead. That's empirically demonstrated. While it may be provably better (e.g., improvements in cohesion, coupling, and encapsulation), it demands more from its practitioners -- and in so doing, makes it harder to take on that role. As more complex ideas flow into the task of programming (structured programming, strong type systems, iterators, abstract classes, interfaces, and so on), the cognitive demands increase."
- Plea to Language Designers: Bring Back GoTo!
And this:
Two of the graduate students working with me, Brian Dorn and Allison Tew, did a fascinating study over the winter break and submitted a paper on it (still awaiting word on acceptance). They reviewed all the programs (scripts) that graphics designers had written for Adobe Photoshop and then had shared with other graphics designers/programmers at a specific website. These were written in Adobe's form of JavaScript. Brian knew (from an earlier study reported in his ACM ICER 2006 paper) that these designers/programmers had virtually no formal computer science background. The questions that Brian and Allison were asking included "Without a curriculum directing coverage of everything, what programming constructs do these end-user programmers use -- either because it's so useful, or because it's easy enough to understand without a course, or both?" In a real sense, this is studying "natural" programming -- what programmers "in the wild," who program because it's useful to them, do without the influence of a teacher.

As one might imagine, every program used variables and assignments, and virtually every program used conditionals and relational operators. But then there are some subtle and fascinating differences. Over 60% of the projects they found used FOR loops, but only 37.5% used WHILE loops. Are FOR loops nearly twice as useful or twice as easy to understand as WHILE loops? They also found that the use of TRY-CATCH for dealing with exceptions appeared in over 60% of the projects they reviewed. Perhaps exceptions are much more useful or much easier to understand than WHILE loops, though virtually everyone teaches a WHILE loop but not everyone teaches exceptions. Programmer-created functions appear in over 70% of the programs they reviewed. Programmer-created objects appear in less than 20% of the programs. Perhaps "objects-first" isn't as natural as we might think.
- Studying Programming in the Wild
And these extracts from a discussion he was having on another list:
- "Object-use" means instantiating objects, applying methods to objects, writing new methods.
- "Class-create" means creating new classes and solving problems by writing different methods in different classes (distributing responsibility).
- "Early" means in the first week of class, or in the students' first programming assignment

Using the term "impossible" seems to be setting up a strawman. Nothing in education is "impossible." How about if we use a new term: "works," where works means "students can be successful in completing tasks using what they perceive as a reasonable amount of effort." ...

Reconsidering the phrase "objects-early is impossible," let's first consider "object-use early doesn't work." I'd say that the evidence AGAINST that is pretty strong. Not only do we have evidence of broad student success in tools like Alice (which doesn't have classes and methods in the same sense as Java or Smalltalk, so it tends to be "object-use" rather than "class-create") but even in Logo -- whose turtle is clearly a "first object." Roy Pea and Midian Kurland found that few students that they studied really learned Logo well, but there's a good bit of evidence from studies like Sharon Carver's that students could learn Logo -- that Logo "works."

Now let's consider "class-create early doesn't work." The way I read the research, there's a lot of support for that statement: Creating classes and writing distributed methods is *hard*. Consider some of the research evidence:
- Anne Fleury's work showed that students far prefer in-line code, even going so far as to prefer constants to named values. The Psychology literature makes that obvious: having to look elsewhere to find the value for something increases cognitive load.
- T.R.G. Green's work showed that increasing cognitive load (by making people look elsewhere for code and values) makes it harder to program, e.g., people read more slowly and make more errors.
- John Carroll's and Mary Beth Rosson's work at IBM in the 1980's on Smalltalk found that programmers had a hard time understanding the distinction between classes and instances.
- Even Adele Goldberg's technical reports from Xerox PARC in the 1970's showed that most of their students weren't able to complete program modification tasks in the given time. It took too much time to find where a particular feature was buried in a particular class. It didn't "work."

Overall, it's hard to test this hypothesis convincingly. It's hard to control for all factors and get students to program the same things with and without making classes. But there is a macro-level way of testing this evidence. For the schools whom I've talked to, the failure rate in introductory classes jumped when they went to class-create early, either in C++ or Java. That evidence suggests that it's true that class-create early doesn't work....

For myself, I'm going to stick with an approach of object-use early and class-create later (week 10 or later). Here are my reasons:

- PROGRAMMING IS HARD. Even procedural programming. I see students struggling with where to put the RETURN even in week five. In our last midterm, I was dismayed with how many students were still struggling with how to manipulate two different indices when working with two different arrays (sounds). I'm happy with where our students get in 15 weeks.

- OBJECT ORIENTED PROGRAMMING IS HARDER. I have seen no evidence that class-create programming is EASIER than procedural programming when dealing with introductory-level concepts and CS1 level of programming. I've seen lots of evidence (referenced in my last message) that it's harder. This doesn't have to do with the teaching method -- this is entirely a matter of cognitive load. The raw task of O-O programming requires a greater cognitive load than procedural programming.

- I CAN'T AFFORD TO MAKE IT HARDER. With high failure rates, students' perception of CS as being too hard, and declining enrollments, I can't afford to include class-create early. The costs aren't worth the benefits of learning object-oriented programming in the first course. I believe we need more of a gradual slope in our intro course, not such a steep curve that looks to students like a wall.
- NSF CPATH, Jobs and Objects
Mark Guzdial makes a lot of sense. Reality check. However, I'm still very interested in pushing ahead and learning OOPs myself as well as further exploring the potential of etoys in that context.
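
To make Guzdial's distinction between "object-use" and "class-create" concrete for myself, here is a minimal Python sketch. It is my own illustration, not code from Guzdial or from the competition. The Room and Player classes are hypothetical names, loosely inspired by the MUD challenge.

```python
from collections import Counter

# --- Object-use: instantiate an existing class and call its methods ---
words = "the cat sat on the mat".split()
counts = Counter(words)                          # create an object
top_word, top_count = counts.most_common(1)[0]   # apply a method to it

# --- Class-create: design new classes and spread the work across them ---
class Room:
    """Hypothetical MUD room: it only knows its name and its exits."""

    def __init__(self, name):
        self.name = name
        self.exits = {}          # maps a direction to another Room

    def connect(self, direction, other):
        self.exits[direction] = other


class Player:
    """Hypothetical MUD player: it only knows which room it is in."""

    def __init__(self, room):
        self.room = room

    def move(self, direction):
        # The player has to ask the room for its exits: responsibility
        # is distributed, so the reader must look in two classes.
        target = self.room.exits.get(direction)
        if target is None:
            return False
        self.room = target
        return True


hall = Room("hall")
kitchen = Room("kitchen")
hall.connect("north", kitchen)
player = Player(hall)
moved = player.move("north")
```

The first half can be read top to bottom in one place. In the second half, understanding what move does means reading both Player and Room, which I take to be the "looking elsewhere" cost that the research Guzdial cites is about.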

3 comments:

Anonymous said...

If classes are an obstacle to grasping OOP, using prototypes seems to be the way to go (cf. Wikipedia on prototype-based programming). Is this what Guzdial is on about when he talks about deferring the object concept?

Anonymous said...

I grew up learning procedural programming as a fun hobby (BASIC on a TRS-80). The last few years I've been trying to wrap my head around OOP (in Python mostly). I've heard that learning OOP may actually be easier if one has never studied programming before versus having an ingrained procedural programming background...as if procedural "dogma" taints cognition and is hard to break away from. I don't know if that is true...it may just be that I'm too daft to grasp the subtlety of OOP. :) Regardless, I've never taken a programming class and am all self-taught. That's probably the real reason why I'm a weak programmer. But, good enough as a teacher to get beginners' feet wet. I just keep trying to learn more and more every year.

To Charles: Thanks for that wikipedia link!

Bill Kerr said...

Yes, Charles' link to prototype-based programming is very good. It seems to be something like cloning and modifying existing objects rather than designing new classes from scratch.
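
Here is a rough sketch of how I understand the cloning idea, written in Python. Python is class-based, so this only imitates what prototype languages such as Self do. The Morph class, its clone method and the slot names are all made up for illustration and are not the Etoys or Squeak API.

```python
import copy


class Morph:
    """Hypothetical generic object. It is the only class in this sketch
    and is never subclassed; new kinds of things are made by cloning."""

    def __init__(self, **slots):
        # Store whatever named values the caller supplies.
        self.__dict__.update(slots)

    def clone(self, **overrides):
        # Copy the existing object, then change only what differs.
        twin = copy.deepcopy(self)
        twin.__dict__.update(overrides)
        return twin


# Start from one working object ...
ball = Morph(name="ball", x=0, y=0, colour="red", size=1)

# ... and get a new one by cloning and modifying, with no new class.
big_ball = ball.clone(name="big ball", size=3)
big_ball.x = 5   # changing the clone leaves the original alone
```

The clone keeps the colour it copied from the original, and moving the clone does not move the original.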

Etoys has morphs and cloning, so I've been doing it already without knowing the name of it! - and I plan to look at morphic more (which was incorporated into etoys / squeak from Self by John Maloney)

Some morphic tutorials

A paper about morphic in the squeak UI by John Maloney

I think the object model in GameMaker is similar too, but I'm not sure. In GameMaker you can clone instances and use inheritance and overriding, but you can't design your own classes.

So I guess both etoys and GameMaker have adapted themselves in their evolution along the lines being suggested by Mark Guzdial - object use but not class create.