Friday, January 02, 2009

the drama and humour of numbers

The Story of 1 (60 minutes)

I just saw this excellent TV show about the history of numbers (ABC review) and, for joy, it's available on the internet too :-)

Some Australian aboriginal tribes did not have a number system, just one and many. Arithmetic evolved in cities, whose greater complexity required calculations. The first writing was with numbers.

3000 BC: The Egyptians conceived of 1 million. They also invented the cubit, a unit of measurement required for the buildings they constructed.

Pythagoras invented odd and even numbers, things such as magic triangles (1, 2, 3, 4) and explored the relationship between music and the size of containers (the music of the spheres). But his dogmatic idealism about number led to tragedy. One of his disciples discovered irrational numbers and was drowned.

The Romans murdered Archimedes and then imposed their crummy numerals onto the world. They were so useless for doing calculations that the abacus was used instead.

Our decimal system, and most notably the number zero, wasn't thought of until 500 AD, by someone in India. From there it was passed on to the Arabic Muslim world. Then the decimal system was brought to Europe by Fibonacci.

There ensued a struggle between Roman numerals and the decimal system which lasted for hundreds of years. Eventually the decimal system won out because of the need for capitalism to calculate compound interest accurately.

Finally, Leibniz invented the binary system, but we had to wait another 200 years for the computer.

This video is very enlightening and funny, being narrated by Terry Jones of Monty Python fame. The simulated battles between our modern sprightly numbers and clunky Roman numerals are fabulous.


Mark Miller said...

Thanks for this, Bill! I've watched a bit of it and I'm already enjoying it immensely.

It looks like Terry Jones has done many educational series. The one I watched most recently was called "Medieval Lives". Very interesting. He corrected some misconceptions I had about how Medieval Europeans thought and lived. For example, it's a myth that they thought the Earth was flat. They always knew it was round.

I've watched a couple of his other shows. I remember he did one on Roman gladiators, and one on ancient inventions that we still use today. I think he's doing a great service with what he's been doing since his Monty Python days. He tries hard to make learning interesting.

Peter William Lount said...

Bill that is a very funny show! Very educational.

I found this interesting article on Roman Numerals that's very illuminating. I've learned more about Roman Numerals between the Story of 1 and this article than my brain really needs to know but it's fun.

Roman Numerals and Arithmetic.

It's also interesting to me that "zero" aka "nothing" plays a role in Smalltalk (and other languages) in the form of the concept of "nil". We have objects (one) and we have "nil", the undefined object. Nil is such an important concept in the programming that I do. The question is why? One reason is that it allows for the case of nothing.

The lack of consistency in using nil and object together is also the source of a great many bugs. For example, one large insurance company had a great many bugs because they used a pattern of two methods to access state: the first checks that we have something, "hasSpecialDate", while the second accesses the "specialDate" object. Code was supposed to check that the date existed before sending the method that accesses it. The problem is that this wasn't always done, leading to many cases where the "hasX" method wasn't implemented or simply wasn't sent. Thus "specialDate" would often generate an exception and, ahem, crash the application with a walkback error, as a "nil" object returned from "specialDate" doesn't understand date math methods. "specialDate" returning a nil object wasn't anticipated since you're "supposed" to call "hasSpecialDate" first. Oops.
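The failure mode described above can be sketched in Python, with None standing in for Smalltalk's nil. This is only an illustrative analogue: the Policy class and the has_special_date, special_date, renewal_date and try_renewal names are hypothetical, not taken from the insurance system.

```python
from datetime import date, timedelta
from typing import Optional


class Policy:
    """Hypothetical record that follows the two-method pattern."""

    def __init__(self, special_date: Optional[date] = None):
        self._special_date = special_date

    def has_special_date(self) -> bool:
        # Existence test that callers are "supposed" to send first.
        return self._special_date is not None

    def special_date(self):
        # Accessor that quietly hands back None when nothing is stored.
        return self._special_date


def renewal_date(policy: Policy) -> date:
    # Bug: the caller forgot the has_special_date() guard.
    return policy.special_date() + timedelta(days=30)


def try_renewal(policy: Policy) -> str:
    try:
        return renewal_date(policy).isoformat()
    except TypeError as error:
        # None does not support date arithmetic, much as nil does not
        # understand date math messages in Smalltalk.
        return f"crash: {type(error).__name__}"
```

Nothing in the accessor's shape reminds the caller that the guard exists, so the mistake only shows up at run time on the records that happen to have no date.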

An alternative is to have the "specialDate" method return two types rather than just one: a date object OR a nil object. This way you defensively know to check for nil and not send date math methods arbitrarily to the returned object.
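As a rough Python analogue of this "date OR nil" accessor (None plays the role of nil; the Policy class and renewal_date function are hypothetical names used only for illustration):

```python
from datetime import date, timedelta
from typing import Optional


class Policy:
    """Hypothetical record whose accessor admits it may have no date."""

    def __init__(self, special_date: Optional[date] = None):
        self._special_date = special_date

    def special_date(self) -> Optional[date]:
        # The signature itself says: a date OR nothing.
        return self._special_date


def renewal_date(policy: Policy) -> Optional[date]:
    special = policy.special_date()
    if special is None:
        # Nothing stored, so there is nothing to do date math on.
        return None
    return special + timedelta(days=30)
```

There is one accessor instead of two, and the nil case is handled at the point where the value is used.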

There are other alternatives of course, such as ensuring that a date is always returned; however, that may not make sense in many cases in practice. The experience at the insurance company bears this out.

They liked their first "hasX" test method since it could be used in combined "and:" statements to eliminate the ifNil:ifNotNil: constructions. This tended to shrink method sizes but really just added additional complexity with an explosion of methods that really didn't solve the problems of having "nil" dates.

This also highlights issues with "type checking" in type constrictive languages which enforce single types on variables and parameters. In these type constrained languages it makes sense to have "hasX" methods to test for the existence of the data element object and then use the "accessX" method to get it when it exists and avoid getting it when it doesn't exist. In fact this pattern is required in type constrictive languages.

Unfortunately the two-method pattern, an existence test plus an accessor to use when the object exists, doesn't work too well in Smalltalk, a type freedom language where objects carry their own type/class information with them rather than the variable/parameter carrying that information in its definition.

One of the reasons that I insisted on adding "ifNil:ifNotNil:" to Smalltalk on a very large bank project in the early 1990's was to support the "nil+object" dual type logic without having to use "self specialDate == nil ifTrue: [...] ifFalse: [...]" type of construction.

A compelling advantage of Smalltalk is that it allows for the literate use of extensions to the language. Blocks make this possible in many cases. "ifNil:" and its variants take blocks for the cases, nil or an object, or as a well known bard said, "to be or not to be", er in the case of Smalltalk, "nothing or something".

Adding control structures to languages that can be debugged and tested with unit tests and validated as needed provides a secure foundation to build upon. I'm glad that control structures such as "ifNil:ifNotNil:" have caught on, as it's way easier to comprehend "self specialDate ifNotNil: [...] ifNil: [...]" than some convoluted "logic statement" that uses ifTrue: or tries to avoid using if's by the use of even more cryptic "and:" style logic.
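For readers who don't have a Smalltalk image handy, here is a loose Python approximation of the idea, with callables standing in for blocks and None for nil. The function if_not_nil_if_nil and the describe helper are hypothetical names invented for this sketch; they are not part of any library.

```python
from datetime import date
from typing import Callable, Optional, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def if_not_nil_if_nil(
    value: Optional[T],
    not_nil: Callable[[T], R],
    nil: Callable[[], R],
) -> R:
    # Existence test, not a truth test: 0 and "" are still "something".
    if value is None:
        return nil()
    return not_nil(value)


def describe(expiry: Optional[date]) -> str:
    return if_not_nil_if_nil(
        expiry,
        lambda d: f"expires {d.isoformat()}",
        lambda: "no expiry date",
    )
```

Both outcomes sit side by side at the call site, which is the readability argument being made for "ifNotNil: [...] ifNil: [...]".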

As the "Story of 1" talks about, 0 and 1, zero and one, are quite the pair. Well this pair has entered a new era in Smalltalk, as "nil and object".

Peter William Lount said...

Oh, the people who want to use "hasSpecialDate" and "specialDate" are in many cases attempting to get rid of "zero" (nil) by ignoring it. It's sorta like the Catholic Church fighting the adoption of Zero.

The fact is that in data processing and in worlds of objects (like insurance applications or the real world) nothing exerts a powerful influence in the lives of things. It's best to embrace it rather than ignore it with "simplistic solutions".

Mark Miller said...


Did you just contradict yourself?

It sounds like you and the institutions you worked for were taking the approach of code acting on data. Might I suggest an approach that allows objects to emulate whatever they want, and control how they react to messages? For example, if you have an object that would like a null string if there's a null date, give it an object that when expiryDate is sent, and the date is null, returns the null string (''). This is a simplistic example. I'm sure there are advanced methods that allow specialized functionality to be isolated (probably in derived classes), but allows one object to be returned, and it can transform its behavior depending on who references it (there is a way to detect who the sender of a message is). This is the kind of thing that Alan Kay intended with OOP.

Peter William Lount said...

Mark: Did you just contradict yourself?

Nope. If you think so please point it out.

Mark: It sounds like you and the institutions you worked for were taking the approach of code acting on data.

That is typical of very large Smalltalk applications that interact with RDBMS systems. These systems are all over the place. By the way, I didn't create any of them, so I disclaim all responsibility for their systems. I was there to assist in improving their systems. Again that's typical of situations that consultants face.

Mark: Might I suggest an approach that allows objects to emulate whatever they want, and control how they react to messages? For example, if you have an object that would like a null string if there's a null date, give it an object that when expiryDate is sent, and the date is null, returns the null string (''). This is a simplistic example. I'm sure there are advanced methods that allow specialized functionality to be isolated (probably in derived classes), but allows one object to be returned, and it can transform its behavior depending on who references it (there is a way to detect who the sender of a message is). This is the kind of thing that Alan Kay intended with OOP.

Well, having objects "emulate whatever they want" tends to spread methods ALL over the place on classes that don't need them and shouldn't have them. Having date and string methods on the undefined object, nil, makes little sense. In fact it tends to lead to bugs as evidenced by the tragic mess of JavaScript where errors are simply ignored and the slightly less mess of Objective-C++ where messages sent to nil are, ahem, simply ignored.

So sure you can do that... but is it really design or a hack?

It can be design in some cases. I created a "SortCriteria" and "SortCriteriaColumn" set of objects that pretends it's a "sort block" by implementing "value:value:" and a couple of other methods on SortCriteria. It then becomes pluggable any place you'd put a sort block. It is useful when you have to sort by more than one "column". For example, sort by "last name" then by "first name" then by "city" then by ... . Also the direction of sorting for each column can be controlled independently. Once written and debugged and a test case created this is very powerful. It's also open source and you can get it here.
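The general shape of such an object can be sketched in Python, where a callable comparator plays the part of a two-argument sort block. This is not the open source code mentioned above, only a guess at the idea: the class names mirror the description, but SortColumn, its key and descending parameters, and the sample data are all assumptions made for illustration.

```python
from functools import cmp_to_key


class SortColumn:
    """One sort key with its own direction (hypothetical sketch)."""

    def __init__(self, key, descending=False):
        self.key = key
        self.descending = descending

    def compare(self, a, b):
        left, right = self.key(a), self.key(b)
        result = (left > right) - (left < right)  # -1, 0 or 1
        return -result if self.descending else result


class SortCriteria:
    """Pretends to be a comparator by being callable with two arguments."""

    def __init__(self, *columns):
        self.columns = columns

    def __call__(self, a, b):
        # The first column that tells the two items apart decides the order.
        for column in self.columns:
            result = column.compare(a, b)
            if result != 0:
                return result
        return 0


people = [
    {"last": "Smith", "first": "Zoe", "city": "Perth"},
    {"last": "Jones", "first": "Terry", "city": "Cardiff"},
    {"last": "Smith", "first": "Adam", "city": "Sydney"},
]

by_name = SortCriteria(
    SortColumn(lambda p: p["last"]),
    SortColumn(lambda p: p["first"], descending=True),
)
ordered = sorted(people, key=cmp_to_key(by_name))
```

Because the criteria object answers the same protocol as a plain comparator, it can be plugged in anywhere one is expected.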

Another solution is to have "date wrapper" objects that hold either "nil" or a "date" and you put the methods on there, however, you still have to deal with the facts of life in these situations which are that you may or may not have a date (or some other object)!

Typically people want to simplify so much that they ignore the facts of life that there might not be an object they are interested in, a date in the above examples. So even the wrapper solution isn't fully helpful. It does however give you a place to hang methods that don't belong on date or nil (or some other object). So in that sense it's better.

A point I'm making is that the use of "if..." (ifTrue:, ifFalse:, ifNil:, ifNotNil:, ...) needn't be avoided in some frantic attempt of bias against "if...". The result of such attempts at simplifying is to make the code more complex and more prone to errors. This is especially the case when it's the binary state of an object existing or not existing (nil).

So while the wrapper object makes sense in some situations the two state situation comes up all the time all over the place and wrappers don't make sense. It's also not what people have used (others, personally I like the wrapper notion when it's appropriate as the Sort Critter above shows).

Yes, Alan Kay did intend that polymorphism be used and it does make sense to use it much of the time. It's a question of design. Adding tons of methods to "nil" (the undefined object) doesn't make much sense however. Adding "ifNil:ifNotNil:" (and its variants) to Object (or ProtoObject) and the UndefinedObject (nil) did make a lot of sense since it's a test for existence of an object. This is different than the test for truth of some expression that "ifTrue:ifFalse:" performs. Existence and Truth are two different abstractions.

It is notable that "ifNil:ifNotNil:" and variants came into existence in Smalltalk after Smalltalk-80 and VisualWorks 2.5. There was resistance by former members of Alan's group to the adoption of "ifNil:ifNotNil:" for the reason that it added four methods to Object and the UndefinedObject - they didn't want any methods added to base classes. Fortunately "ifNil:ifNotNil:" has spread wide in the Smalltalk universe now as its use is validated as useful time and time again.

Using "hasX" or "doesDateExist" truth test methods is fine if you don't forget to use them, but forgetting is unfortunately what actually happens in practice. When you see systems with thousands of "doesXExist" or "hasX" methods used inconsistently your brain begins to be boggled, and your mind really gets fried when those that wrote it don't see the issues with it.

The issue is that many variables use "nil" as a way of marking that the object doesn't exist. Sure in some other variables "nil" isn't permitted and simply indicates a programming error but any time that you use "nil" to indicate that an object isn't present you can't ignore the facts of your meta data which now says that you either have an object or nil.

I've fixed many bugs written by others that are of this nature in live production systems. Far too many. Code quality is important in that it saves large corporations lots of real and actual money.

Another issue is that the meta model can change or be uncertain during the design, development and maintenance life cycle of an application. For example, there might always be an "expiry date" while the application is in the design and development phases, but then during maintenance it's discovered that the "expiry date" on a certain object needs to be optional. Some projects code that as a date far in the future, such as 1999-12-31, oh, oops, ah, how about 9999-12-31 instead? But it's not the same thing, since that says the expiry date is far in the future as opposed to there not being one at all: two different interpretations. Of course it's up to your project how you interpret YOUR data; however, there are ways of doing it and then there are ways of doing it.
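The difference between the two interpretations can be made concrete with a small Python sketch, again using None for nil. FAR_FUTURE, is_expired and has_expiry are hypothetical names chosen for this example.

```python
from datetime import date
from typing import Optional

FAR_FUTURE = date(9999, 12, 31)  # hypothetical "never expires" sentinel


def is_expired(expiry: Optional[date], today: date) -> bool:
    # No expiry date at all means the thing can never expire.
    if expiry is None:
        return False
    return expiry < today


def has_expiry(expiry: Optional[date]) -> bool:
    # Only a genuinely absent date answers False here.
    return expiry is not None
```

Both encodings agree that nothing has expired yet, but the sentinel still claims that an expiry date exists, so any report or rule that asks whether one exists gets a different answer.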

Nothing (nil) isn't going away just because someone "wishes" it to with simplistic solutions. While we benefit from the concept of nothing (and zero) we also pay for it with a little bit more complexity at times.

Mark Miller said...


Ah, I see what you were saying now. You weren't contradicting yourself. I wasn't trying to put you on the spot. Somehow your second comment struck me as whimsical, because it appeared at first to contradict your first comment. I can't remember how, because you've clarified it in my mind now. :)

I agree that ifNil:ifNotNil: is simpler to deal with than hasX. I was suggesting a virtualization option because I imagine it got very repetitive to use this construct with every nullable field.

I used to deal with this a lot in database applications in C, C++, C#. The problem that bugged me was that anytime a database schema change was made, such as changing a field from non-nullable to nullable, I had to find all the places in the code that referenced that field and make sure it still worked. Sometimes I'm sure it just felt safer for me to test all but key fields for null or not null, because then the code scheme would be consistent regardless of if the field rule was changed or not.

I wasn't suggesting adding methods to nil. Rather I was suggesting having your record object be made up of field objects that had dual logic in them. So rather than your record being made up of elements that are "value" or "nil", they're made up of containers which contain "value" or "nil".

Say you had NullableDate with a date field, which could be a Date or nil. If you have display logic that only requires a string, for example, you could have a "display" method in NullableDate, which would return either '' for nil, or the date as a string. It could carry out operations with other NullableDate's, and reveal its contents (necessary for carrying out those operations). One I was thinking about as I wrote this was:

startDate to: expiryDate applyTo: [:date | block] ifDateNilApplyFor: timeSpan

where startDate and expiryDate are NullableDate's.

I'm hedging here a bit because I don't have access to Squeak at the moment. I know there's a way to set up a TimeSpan or something like that. Anyway, what this would do is apply the dates to whatever logic was in the block (creating Date's in one month increments, or up to the expiryDate), with the ifDateNilApplyFor: clause being there if startDate or expiryDate was nil, to provide some boundary.

This may not make sense in the context you were working in, but you could have it go the other way: if startDate contains a nil date, but you had an expiration date, it could set up the loop to start at expiryDate - timeSpan.

If both startDate and expiryDate contained nil dates, then it would definitely make sense to throw an exception.
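Since the idea above is hedged, here is an equally hedged Python rendering of it, with None for nil. Everything in it is an assumption made for illustration: the NullableDate class, the to_apply method standing in for to:applyTo:ifDateNilApplyFor:, and the step parameter, which uses a fixed timedelta rather than the one month increments mentioned above.

```python
from datetime import date, timedelta
from typing import Callable, Optional


class NullableDate:
    """Container holding either a date or nothing (hypothetical sketch)."""

    def __init__(self, value: Optional[date] = None):
        self.value = value

    def display(self) -> str:
        # Display logic only needs a string: '' for nil, else the date.
        return "" if self.value is None else self.value.isoformat()

    def to_apply(
        self,
        end: "NullableDate",
        block: Callable[[date], None],
        if_nil_span: timedelta,
        step: timedelta = timedelta(days=1),
    ) -> None:
        start_value, end_value = self.value, end.value
        if start_value is None and end_value is None:
            raise ValueError("both start and end dates are nil")
        if end_value is None:
            # No end date: run forward from the start for the given span.
            end_value = start_value + if_nil_span
        if start_value is None:
            # No start date: run backward from the end for the given span.
            start_value = end_value - if_nil_span
        current = start_value
        while current <= end_value:
            block(current)
            current += step
```

The nil handling lives in one place, inside the container, and callers that can live with a fallback span never write a nil test themselves.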

The idea here is to isolate the dual nature of dates in the context you're dealing with. I know it's a design preference, but if I had the scenario you were working with I'd rather try to do this than sprinkle ifNil:ifNotNil: all over the app to do the same thing. I would add methods as needed to my container objects, so they wouldn't pollute nil or the other standard classes.