The amortized cost of vector insert is 3

C++ requires many algorithms to be implemented in amortized constant time. I was curious about what this meant for vector insertion. To get started I first wrote an article about amortized time in general. Now I’d like to apply that to appending vector items.

23.3.6.1-1 A vector is a sequence container that supports random access iterators. In addition, it supports (amortized) constant time insert and erase operations at the end;

In this analysis I’m looking at end insertion, which is done with the C++ vector::push_back function.

Cost of copying

Complexity analysis is usually based on counting how many times a dominant operation occurs. Often we count the number of comparisons, such as with trees, but that doesn’t make sense for a vector insert.

What should we count? Maybe we should include memory allocation, or is that done in constant time? Do we need to consider a specific cost model, such as a “random-access machine”? The C++ standard only gives us this description:

23.2.1-1 All of the complexity requirements in this Clause are stated solely in terms of the number of operations on the contained objects. [ Example: the copy constructor of type vector<vector<int>> has linear complexity, even though the complexity of copying each contained vector<int> is itself linear. — end example ]

Let’s focus on the cost of copying data; that should be okay by this definition. For common use, it’s also the most interesting cost, since it is usually the highest.

Given a type T we define the cost of an operation as the number of times an object of type T is copied. Each time an item is copied, it costs 1.

What is a vector?

It’s helpful to know what a vector actually is. It has a current number of items, called its size, and a capacity to store more. The requirements also state:

The elements of a vector are stored contiguously, meaning that if v is a vector … then it obeys the identity &v[n] == &v[0] + n for all 0 <= n < v.size().

We can use this pseudo-structure to represent the vector:

vector {
    T* store
    int size
    int capacity
}

capacity is the total space available in store. size is the current number of items in store that are actually defined. The remaining slots are undefined memory.

Two insertion scenarios

When we insert a new item into a vector, one of two things can happen:

  1. size < capacity: The item is copied to store[size] and size is incremented. That’s one copy: a cost of 1.
  2. size == capacity: New space must be allocated first. The old items must be copied to the new store. Then the new item added. We copy once for each existing item, plus once for the new item: a cost of size + 1.
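
In pseudo-C++, a sketch of push_back over the structure above makes the two cases concrete. Here new_capacity, allocate, and release are hypothetical helpers; the growth policy hiding inside new_capacity is what the rest of this article explores.

void push_back( vector & v, T const & item ) {
    if( v.size == v.capacity ) {
        int new_cap = new_capacity( v.capacity ); // the growth policy goes here
        T * new_store = allocate( new_cap );
        for( int i = 0; i < v.size; i++ )
            new_store[i] = v.store[i]; // one copy per existing item
        release( v.store );
        v.store = new_store;
        v.capacity = new_cap;
    }
    v.store[v.size] = item; // one copy for the new item
    v.size += 1;
}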

That second cost is clearly linear, not the constant time the standard requires.

This is where the amortized cost comes in. We aren’t interested in the worst case, but the average case. Because the vector reserves capacity beyond its current size, we don’t have to resize it for every new item. The amortized cost is the average cost over many insertions.

There must be some way to make this cost constant.

How many resizes

Let’s first try the approach of increasing the capacity by k each time, some constant amount. This means that every k inserts will require a resize. Call r the resize operation, so at r_1 we have k items in the list, at r_2 we have k*2 items, at r_m we have k*m items. For completeness at r_0 we have 0 items.

Operation r_m requires that r_{m-1} has already happened. We can only insert one item at a time into the list, thus no resize operations can be skipped. The total cost of this sequence of resizes can be expressed as this recurrence relation:

\begin{array}{lcl} SumResizeCost( 0 ) & = & 0 \\ SumResizeCost( m ) & = & m k + SumResizeCost( m-1 ) \end{array}

Reducing that to a single closed form equation gives us:

SumResizeCost( m ) = k m (m + 1 ) / 2

We need a form related to n, the number of insert operations we perform in total. This is easy, since m = n / k; assume integer math to avoid needing a floor operation. In addition to the resizing cost, we need the individual insert cost, 1 for each item. So the total cost of the sequence of n insertions is:

\begin{array}{rcl} SumCost( n ) & = & SumResizeCost( n/k ) + n \\ & = & k (n / k) (n / k + 1) / 2 + n \\ & = & (n^2 + 3 k n) / (2 k) \end{array}

To get the amortized cost we divide by the total number of operations:

\begin{array}{rcl} AmortizedCost( n ) & = & SumCost( n ) / n \\ & = & (n + 3 k) / (2 k) \end{array}
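
For concreteness, with k = 10, a run of n = 1000 inserts works out to (1000 + 30)/20, about 51.5 copies per insert, and that figure keeps growing with n.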

Strip out the constants and we’re left with an amortized cost of O(n). Thus any constant k results in linear amortized time, not the desired constant time.

Geometric size

Let’s move on to another option for resizing: start with a capacity of 1 and double it on each resize. Let r_m refer to the m-th resizing of the vector. To simplify the math, we’ll use indexing starting at 0. The r_0 operation moves 1 item; this is the first resize. The r_1 operation moves 2 items; this is the second resize. The r_m operation moves 2^m items.

\begin{array}{lcl} SumResizeCost( 0 ) & = & 1 \\ SumResizeCost( m ) & = & 2^m + SumResizeCost( m-1 ) \end{array}

Basic geometric series reduce very nicely:

SumResizeCost( m ) = 2^{m+1} - 1

At this point we could swallow the cost of 1 copy and drop the - 1 part. Given the exponential term before it, the constant 1 isn’t relevant. For thoroughness, I’ll leave it in.

Relate back to n where m = log_2(n), and add in the initial copying of each item on insert:

\begin{array}{rcl} SumCost( n ) & = & SumResizeCost( log_2(n) ) + n \\     & = & 2^{log_2(n)+1} - 1 + n \\     & = & 3 n - 1 \end{array}

Divide by n, the number of inserts, to get the amortized cost:

\begin{array}{rcl} AmortizedCost( n ) & = & SumCost( n ) / n \\     & = & (3 n - 1) / n \\     & = & 3 - 1/n \end{array}

The constant 3 is the most significant term, so our amortized cost of insert is O(1). Success! That’s the amortized constant time the standard requires.
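
We can check this result empirically. The following sketch (assuming the doubling policy above, not any particular std::vector implementation) counts copies while appending. Measured just after a resize, the ratio comes out at roughly 3; measured just before the next resize, it dips toward 2, which echoes the off-by-one point below.

#include <cstdio>

int main() {
    const long long n = (1LL << 20) + 1; // stop just after a resize
    long long copies = 0, size = 0, capacity = 1;
    for( long long i = 0; i < n; i++ ) {
        if( size == capacity ) {
            copies += size;  // every existing item moves to the new store
            capacity *= 2;
        }
        copies += 1;         // the new item is copied in
        size += 1;
    }
    // prints a per-insert cost of roughly 3
    printf( "inserts=%lld copies=%lld per-insert=%.4f\n",
        size, copies, (double)copies / size );
    return 0;
}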

The intuitive solution

That 3 represents a concrete number of operations: how many times each item will be copied. It could have been 2 or 4 and still been constant time. Indeed, if we were off by 1 in our definition of m, it would have been 2 or 4, not 3.

What are those 3 copies? The first copy is clear: when you first insert an element it has to be copied into the store memory. That leaves us with two copies.

Think of the structure of store at the moment a resize happens. It starts with size == capacity: it’s full, which is why it’s being resized. After the resize it ends up at size == capacity/2, as the capacity has doubled. The second half of the store is empty.

This second half will eventually be filled, and once filled, all items have to be copied to a new store. This means that for each item inserted into the second half, 2 more copies will eventually happen: the item itself will need to be copied somewhere new, as well as one item from the first half. This relation holds regardless of how many resizes have been done. Each resize always results in one full and one empty half.

If we think in terms of the accounting method, that means insertion costs 3 credits. The first credit pays for its immediate insertion. The second credit pays for its eventual copy to a new store. The third credit pays for copying an existing item to the new store. The doubling of capacity on each resize ensures these new items always have enough credits to cover the copying cost.

Lessons learned from recording my first class

I published my first video class, How to Write a Great User Story, something all programmers should know how to do. Even though the class only went live recently, there’s already a host of things I know I can improve upon. I’m not, by any means, unhappy with the class; I think it’s a decent introduction to the topic. But, as with all new experiences, I learned a lot. Before I forget those things, I thought I’d write them down.

First I’ll look at the content, the numerous ways I could improve it, and how it contrasts with writing articles. Then comes the film production, an area still new to me. And of course, with filming comes the dreaded task of editing. I don’t know if you dread it, but I do!

I’m sharing this as a reminder that we all start somewhere. You don’t need to be great the first time. You do what you can and then improve over time. Take my (mis-)adventures as motivation to try something new.

Here’s what I learned.

Content

I’m used to writing articles, not video scripts, so I’m not confident I got the right content mix. My pieces tend to be succinct and don’t labour a point. However, during the editing of my video, I realized my information is perhaps too compact. I should have explained some concepts more, and given more examples.

My class ended up on the short side, not because it doesn’t cover a lot, but because it’s dense. I give some information, an example, and then move on. I don’t take the time to explore those concepts. I should have known to do this from mentoring and presentations. Tackling the same point from multiple angles is good. Unlike an article, which a person can read, and reread at any speed, the video just keeps playing.

I also left a lot of angles underexplored. By listening to myself, repeatedly, I could pull out many ideas about what to expand on. Suddenly I had a nervous feeling, as all I was hearing was the stuff I left out, rather than what I had in! I have to find a way to make this discovery before filming next time.

In the end, I’m satisfied I did a bite-sized introduction to user stories. It could have been more, but that doesn’t mean it doesn’t have value as it is.

Calls to action

One key thing I missed throughout the video is getting the viewers to participate. I know how important it is to engage people, and motivate them to follow along.

There is a project with the class: the viewer creates their own user portrait and story. I mention this at the start, and a few times later, but I don’t guide the user enough. I should be a lot more direct in my requests, and at every step encourage the user to write down what they’re thinking.

I’m more used to a presentation style, rather than a class. When I did Twitch streaming it was me talking about what I was doing. It wasn’t intended as direct learning, but more of a play-by-play of my activities. In a class, however, the focus should be on interactive learning. I don’t want to be like those horrible professors who monologue their whole class. Good classes include a motivational edge. They encourage the user and bring them into the experience.

Getting this right is tricky. Go overboard, and it’ll feel like a cheesy sales pitch. That said, I think I’m on the light side and could add a bit more.

Lighting

I knew the lighting was terrible when I started filming. Alas, I live in a city where I could go weeks without much sunshine. As I was taking part in a teaching challenge, to get the video out by the end of the month, I couldn’t afford to wait weeks. So I made do with the lighting equipment I had, then increased the brightness during post-production.

My videos end up having an undesired gray tone. The colours are a bit washed out. In isolation, I think the video is okay, but when compared to others there’s clearly a lot to be desired.

Part of my issue might be the webcam I’m using for recording. I don’t think it’s good with poor lighting. I recorded other videos with it before, and they came out fine — provided the sun was blanketing my room with beams of light.

Getting the lighting right is hard. Doing it on a budget is even harder. I don’t like buying equipment until I’m positive I need it. Now I know for sure. I’ll need to get some sizeable diffuse lighting sources for my next videos.

Audio

I’ve been podcasting for a while now, and have done some music recording before. But filming is different. I can’t have a microphone positioned in front of my face.

I initially tried using my podcast mic as an overhead mic. The distance from my face required a high gain, which then picked up too much noise, and not enough of my voice. There are mics designed to work this way, but they weren’t within my budget. The mic would also work if I had an absolutely silent studio, but as an urban dweller, that isn’t going to happen.

So I went with a lavalier mic (the Comica CVM-V02O, for reference). This is a cardioid microphone that clips onto my shirt. It connects to a pre-amp that I already had. I didn’t try to hide the mic, but it blended with my shirt.

It worked well for the standing parts of my videos, so I assumed all was well, and forgot to check it while sitting. There it picked up a lousy echo from the corner of my room. I cleaned that up a bit in post-production, but it’s definitely not ideal. I’m unsure why I didn’t use my podcast setup for that part.

Getting rid of echoes is a challenge, but it must be done. I have some more work to do before my next video.

Scripting

While rewatching the videos, I noticed my eyes dart about as though I’m watching an insect fly around my room. Perhaps it isn’t that bad, and I’m just focused on it now. It happens as I try to remember what to say and glance towards the paper it’s written on. It doesn’t matter that the paper is right beside the camera; it’s still enough of an offset to see my eyes move.

In any case, it is distracting, and it shouldn’t be done. There are a few options:

  • Don’t use a script: This is what I’m used to from podcasting and live streaming. It creates a natural, sincere feeling but has the problem of, well, going off script and not covering what I want.
  • Memorize: I actually did this. I took the effort to read some paragraphs, memorize the key bits, and then somehow proceeded to look at the damn script while recording anyway!
  • Teleprompter: This places the text directly in front of the camera and is standard in professional video production. It was a bit outside my initial budget. It’s great for looking at the camera, but it forces you to stay more on script, and thus potentially sound less natural.

I don’t fully script my material. I have only bullet points that I refer to. For me, this tends to work, as I can speak from notes, but sometimes I miss something important. For an introduction, I think it makes sense to either memorize or use a teleprompter. For the bulk of the course, bullet points make sense — for me, at least.

Trying to stay on script made my delivery feel more forced than it should have. I’m uncertain what trade-off to make here. Do I go for the natural conversational flow and hope I cover everything, or do I sound a bit stiff and ensure I cover my key points?

Editing

Editing video is a lot of work. I think the tools I chose ended up making it more work than it should have been, or at least more annoying than it should have been.

I’m working primarily on Linux, so I used tools on that platform. I’ve been doing bits of video editing for a while and have yet to find a good solution there. I ended up using Shotcut, but it has some significant deficiencies that prevent me from recommending it. Moving forward I’ll likely look towards the standard pro tools, like the Adobe suite.

I created my overlays in Inkscape. I like the program for vector illustration. I didn’t want to get too fancy with animations on my first go, so I limited it to static images coming and going from the screen. Of course, to do fancy effects, I’d need a better tool anyway.

I’ll have to bug one of my friends for animation help. They also recently finished their first class on animation in After Effects.

Off to a good start

Overall I’m happy with the class. I have to start somewhere, and I think I’ve created something of value. This shorter class gave me an opportunity to test out the process. I’ve learned a lot of lessons that’ll help me in the future.

I’m awaiting student feedback to see the other ways in which I can improve the content. This aspect is perhaps the most significant area where viewers can help me. I can’t judge from my own viewpoint whether it’s helpful. I have some areas for improvement, but I’ve most certainly missed something.

Film production and editing, on the other hand, is an area where I see the potential improvements on my own. It’s also my weak technical area — even if I notice the problem, I’m not clear on how to fix it. This requires more practice and a bit more equipment.

Let me know what you think.

The Internet Lottery™ for content creators

Your cursor hovers over the post button. This will be the one. You researched it, you checked it, you got the graphics, and you have the audience. Today you’ll have that successful post: thousands of views and followers. Finally, you’ll be on your way to internet fame. You click the button, and it goes live. Within minutes you get two downvotes, and your precious content is relegated to internet obscurity.

It’s a reality that faces content creators every day. They didn’t just publish content; they bought a lottery ticket. And they need to win to get an audience. Usually, they’ll lose, or get the pacifying minimal prize. The success of content depends on many factors, but a significant aspect is luck.

Call it timing

A lack of views, likes, and reposts is an unfortunate story that plays out every day, thousands of times, for many individuals. At times it feels like content, laboured over for hours, or even days or weeks, isn’t given a fair chance. The whims of the internet don’t even taste it as it’s swallowed whole.

There’s a good chance the author did nothing wrong. Dumb luck plays a significant role in what becomes popular online. Those first few viewers have enormous sway over who sees what. It may depend on the network, but generally getting early downvotes is the kiss of death. For sites without downvotes, missing out on those first few likes is about the same.

The volume of content posted amplifies the problem. Social sites get thousands of posts per day. This gives an article only a slight chance of breathing before it’s relegated to page two, or worse. Many people won’t even bother scrolling through new content lists, waiting for popularity to push it high enough in their own feed. Any piece of work could have a significant audience, but if it fails to get the initial bump, it’s out of luck.

I’ve been hit by this many times, and it’s something that’s easy to get bitter over. An initial post may go ignored, and I think the article isn’t any good. Sometime later, somebody finds it and posts it elsewhere, and then it gains a significant number of views.

Other than having a lot of eyeballs in the first place, there’s really nothing one can do to affect this luck.

Niches and global competition

Often overlooked is the value of niches. Content competes on a much larger scale than it ever did before. Online, we don’t have a lot of regional, or local niches. A post to Facebook competes with content from all over the world. Even the feeds of friends and followers are dominated by other popular things.

Local niches do exist, in the form of private boards, smaller sites, and circles of friends. These are valuable tools in getting initial feedback. They don’t hold the same allure as the vast internet though. It’s easy for creators to get caught up in the big prize instead of competing for a more reasonable one.

I’m not going to blame Facebook here, or Twitter, or Reddit. They all have different algorithms, and all suffer from the lottery effect. All content needs to hit some threshold of popularity before it gets noticed. If that sounds like a Catch-22, it’s because it is.

Because of this, gaming the system is common. If a business depends on the success of a campaign, it can’t let this luck get in the way. Vote pods or outright buying votes is a real option. It is a way to get those precious first few likes, helping the content get over the initial hump of popularity. Whether it’s ethical or not doesn’t change how it works. The more complex the scheme, the harder it is for social networks to catch.

As sites attempt to block gaming, they increase the difficulty of the game for everybody else. Creators trying to post or vote on their own content risk getting banned. Friends or followers who legitimately like the content may or may not help; many sites filter out votes from people who continually upvote the same sources. Follower support definitely helps on Twitter, but there’s only a small chance the author’s followers actually see the post.

Having thousands of followers does nothing to ensure visibility — I’ve got over 7000 on Twitter and my posts often get only a few hundred views.

Numerous software vendors love this game, offering tools to manage and optimize posts. These track timing, response rates, and take care of reposting over time. They aren’t free, and they still take time to use, creating a barrier to entry for individuals. Corporations with more resources are better equipped to promote their content. This isn’t the free internet we were hoping for.

Content and quality

But we do see original user content become successful, so surely there must be some way to make it work? Again, the lottery effect. Some people will be in the right place, at the right time, with the right content. They’ll have drawn a winning ticket.

Content is the ticket. You can’t play the Internet Lottery™ unless you buy these tickets. The more tickets one buys, the better chance they have. The more followers they have, the better the potential each ticket has. But, until they reach stardom, there is always luck involved.

Before I leave you thinking that luck is everything, don’t assume that quantity alone is enough. If the content is bad, then it won’t likely succeed. Unless it meets some minimum quality threshold, it’ll buy a junk ticket. Those precious few initial votes will never be affirmative for lousy content. And if it does get over the initial hump, the second round of viewers will look more closely and liberally downvote anything that isn’t good. Of course, there are ways to game the content itself: think of click-bait and cat pictures.

Quality content is a requirement, but not a guarantee of success. There is a lot of luck involved. It can seem unfair, but nobody knows of a solution for this yet. There is an abundance of content, and viewers have limited hours. Not everything can be seen.

Entrenchment

There are two significant dangers of the Internet Lottery™. It significantly demotivates lesser-known content creators, and it entrenches the position of corporations. Both limit the variety and originality of content that everybody else ends up seeing.

Dealing with demotivation is a significant personal issue for creators. If they wish to post content online, they’ll have to deal with disappointment, rejection, and outright hostility. Buying a winning lottery ticket doesn’t necessarily help — the increased feedback can increase stress.

Corporations are more of a problem. They’ve bought into the game big time, and love hiding their corporate status by masquerading as ordinary users. Compared to individual users, and indie creators, they’ve got enough money to buy an infinite source of tickets and optimize their strategy. They also lobby politicians, introducing laws that make it harder for original content to be seen — consider the debate around upload filters.

If you’ve been wondering why I keep adding the trademark symbol to The Internet Lottery™, it’s because of the corporations. It often feels like the system is owned and operated by somebody — and I should note that corporations like Facebook and Google are accomplices to this mess. We aren’t given the rules, but they are there. My choice of the name and trademark attribution is an indication that you are fighting a system — in addition to your own motivation. Plus, I’ll claim the name as my own now because it never hurts to buy another ticket.

Though the monopolization of content by corporations risks ruining the internet, it is one of many things that will demotivate creators. Though I don’t think we can fix the Internet Lottery™ as a whole, keeping corporations in check is crucial. As long as individual tickets still have some value, we should be okay. Finding a way to create local niches would be a way to reduce the luck factor, but it’ll never go away.

Think about this the next time you see original content you like. Send the creator a thank you. It may not seem like much, but it’s a big reward for their ticket. It helps keep them motivated and keeps them posting content to break up the corporate cruft.

Invented here syndrome

Are you afraid to write code? Does the thought linger in your brain that somewhere out there somebody has already done this? Do you find yourself trapped in an analysis cycle where nothing is getting done? Is your product mutating to accommodate third-party components? If yes, then perhaps you are suffering from invented-here syndrome.

Most of us are aware of not-invented-here syndrome, but the opposite problem is perhaps equally troublesome. We can get stuck in the mindset that there must be a product, library, or code sample, that already does what we want. We spend a lot of time testing out modules and trying to jam them into our system. At some point, we need to say, “stop!”, and write the code ourselves.

Upon rereading this article, I’m wondering how much of this problem can be attributed to imposter syndrome. I talk about this in my book, where I also look at all the skills of a good programmer. Perhaps it’s easy to look at other people’s code and assume they have it all figured out, while we ourselves don’t? This would push us away from developing our own solution.

Varying levels of quality

As a general rule, we shouldn’t write code that already exists. Often the choice is easy. We pick up a standard product, lump it in with our system, and it works as promised. These situations are delightful and do happen often enough.

Past this come the less than ideal choices. Perhaps a library covers only 95% of the requirements, the configuration is a bit more involved, or the purpose has to be twisted a little bit. It’ll work, but not without some effort, or rethinking a few requirements.

At some level, we enter the realm of crappy products. These are things advertised to do what we need but which fail utterly at the task. Or perhaps they do something in a completely different way than expected. Maybe the integration is extremely troublesome. Perhaps the product is too bug-ridden to be trusted, or the documentation so bad that proper usage is a mystery.

From what I’ve seen, the vast majority of coding products, snippets, and libraries fall into this category of crappy software. Being available in a package manager, being downloaded by thousands of people, or having a fancy web page, are no indications of a good product. It’s trivial to publish programming products. Chances are, for any obscure requirement there is already some matching product. Just because it’s there doesn’t mean it should be used.

I should point out that the vast majority of any project is composed of standard products. Consider the operating system, the compiler, the file system, the shell, and the build system. Add to this the built-in libraries like SSL, HTTP, or even the browser to render HTML. The fear that a project is not using enough standard components is generally unfounded.

It’s about the time

When we search for suitable products, it shouldn’t take long to discover when none exist. We will find modules, but either they are less than ideal or just crappy. As we add new search terms, we are either coming up empty or getting the same results. Lists of popular, seemingly suitable products, don’t include any that really fit. That’s it. The search is exhausted.

I’m not saying we should immediately jump from here to writing our own module. No, now is the point where evaluation becomes essential. Time is usually the critical factor here. How long will it take to write a custom component? How long will it take to adapt one of the existing products?

I feel that the time saved by using a less-than-suitable third-party product must be an order of magnitude greater than the cost of writing it myself before it’s worthwhile. Third-party code has a lot of open questions, regardless of how well it has been evaluated. This uncertainty must be factored into our consideration.

Getting stuck in analysis paralysis is very bad. I have no problem writing code before completing the analysis; I consider it a valid form of evaluation. Often I’m not confident about what I need until I’ve attempted an implementation. Stalling a project can be disastrous. Perhaps my quick code is enough for now, and we can defer the decision until later.

It’s about the features

It’s easy to fall into the trap of features. The full feature set of a product can seem impressive and alluring. But who cares about everything a product can do? Our project has a specific list of requirements, and those are the only requirements we should care about.

This is where a lot of popular products falter. They offer a complete package, but we aren’t looking for a complete package. They have the feature we want, but there’s no way to extract it efficiently. We need to view the individual features on their own. This is relevant to the time consideration. Clearly, product Xyz would take millions of man-hours to recreate, but perhaps module Q will only take a few days.

As a second aspect, consider how vital a requirement is to our own product. Compromising our key selling points is going to doom the project to failure. There’s no value in saving time if it doesn’t result in the intended product. We have to face reality sometimes: getting the desired feature set may involve writing a lot of code. We’re programmers though, so that shouldn’t scare us.

Warning! Being delusional here is not helpful, and can often lead to genuine not-invented-here syndrome. A feature can be realized in many different ways; do we really need it exactly as we envisioned? The time invested in writing our own version has to be related to how critical that feature really is. It’s best to involve marketing, or product management, in such decisions. It can be easy at times to lose sight of what is truly important.

Write some code

Fretting over a selection of inadequate products is not productive. While it’s entirely reasonable to avoid not-invented-here syndrome, becoming overly frightened of writing code can land a project in a production quagmire. It shouldn’t be surprising that software development involves coding.

A great mass of the libraries, modules, code, and other software products available on the net are either terrible in their own right or simply not suitable for our project. Forcing things to work together can be costlier than writing the required code on our own.

And what if coding turns out to be the wrong choice? Well, that’s part of development. It’s entirely possible that attempting our own component leads us to find the correct third-party product. Since we’re following good design practices, it won’t be a huge problem to swap parts in and out as desired.

This is an updated version of an article I originally wrote on 2015-02-25.

Essential code for lists and vectors in an interview

“Lists are perhaps the fundamental data structure in code. There’s virtually no program that doesn’t use lists. Interviewers love asking questions that involve lists, either intentionally, or just because almost everything uses a list. To help you prepare for an interview, and keeping with our theme, I’ve created a list. It describes all the operations you should know how to perform on lists. This will make you comfortable when asked a question, and save you valuable time while coding.”

This article has been moved to Interview.Codes.

Sorry, but I have no way to do an auto-redirect on WordPress.com. They also don’t allow canonical-url, so I can’t keep it here as well. 😞

Stop waving the wand of magic numbers

37. You have no idea what that number is, do you? A number without context or a label is just a random value. It doesn’t tell us anything. Imagine walking by a billboard with a picture of a person on a boat, and the text is a giant number 89. I’d be intrigued, but utterly confused. We rightfully reject meaningless numbers…

…so why then do they appear so often in source code? Any time a constant appears in an expression, like 12, we call it a magic number. Through some dark craft, they arise out of the ether and populate our code. We don’t know where they came from, what they mean, or how they got there.

Though I don’t go down to the level of this article, my book has a chapter about interviewing as an essential programmer skill. Something simple like fixing magic numbers can leave a positive impression on the interviewer.

An example of magic

When I conduct interviews, I ask the candidate to create a deck of cards. I let them simplify it by using sequentially numbered cards instead of suits and ranks. In the majority of code, I’m presented with a loop of this form:

for i in 0..52:
    cards.add( i )

The number 52 doesn’t convey any information. Only because we’re talking about a card game, and the candidate has assumed a standard poker deck, does the number 52 appear. There are of course many card games that don’t have 52 cards.

I request they deal the cards out to multiple players. I accept splitting the deck as well. Frequently, I get code like this:

for i in 0..26:
    player1.add( cards[i] )
for i in 26..52:
    player2.add( cards[i] )

It’s not immediately obvious that this is splitting the deck in half. I have to reconstruct that in my head. Making a mistake while writing is easy. What if instead, you saw this code:

for i in 0..27:
    player1.add( cards[i] )
for i in 28..51:
    player2.add( cards[i] )

Is the code accounting for some inclusive/exclusive end condition? Is there a bunch of off-by-one errors in there? What is 51? It doesn’t line up with any number I know at all, not even about cards.

Magic number

In this code:

for i in 0..52:

The 52 is known as a magic number. A future coder has no information about what it is. These types of numbers convey no information as to their purpose.

It’s easy to get rid of them. Give the numbers a name.

num_cards_in_deck = 52
for i in 0..num_cards_in_deck:

We’ve immediately improved the quality of this code. A reader knows what the number 52 means. The loop now logically does something for every card in a deck.

Giving a name to a number removes its magical quality. Here we’ve done it beside the code, but typically symbols like num_cards_in_deck end up as global constants. They are context-free facts. If multiple pieces of code need the same value, then create one constant and share it.

Another response to the dilemma is to add a comment, such as # create a standard size deck, to the loop. Just say no to this! Comments are a tool of last resort. They have no structure, can’t be enforced by the compiler, and inevitably become outdated. Using proper symbol names is superior — code trumps prose.

With our new constant, we can rewrite the two dealing loops:

for i in 0..(num_cards_in_deck/2):
    player1.add( cards[i] )
for i in (num_cards_in_deck/2)..num_cards_in_deck:
    player2.add( cards[i] )

Again, the quality of the code has improved considerably. These loops now have meaning; it’s easy to see we’re treating the deck as two halves. It’s also much less likely to have an error as we’re letting the compiler do the division. Moreover, if the number of cards changes, this loop is still correct.

I begrudgingly use this example only because it comes up in my interviews. These two loops are still bad. From a defensive programming standpoint, you shouldn’t use a constant value at all. You should use len(cards) and len(cards)/2. Many magic numbers can be removed in this fashion. Rather than tying your code to an arbitrary value, base it on some other knowledge you already have. In this case, we have a cards collection which has all the information we need.

Put the wand down

Not all constants are magic numbers though. There are some special cases, in particular, 1. Adding 1 to a number creates its natural successor. There’d be no confusion about what is happening. There are more, but without seeing the code in particular, I don’t wish to give general exceptions.

Even a number like 2 may not seem magical, but it can nonetheless be improved upon. In the example I’ve given, it’s clear that it splits something in half, but it doesn’t convey what half means. It’d be better to say num_players in that example. Many times, even if the number is obvious, it helps to give it a name. Names add clarity to the code.

As fun as magic may be, it’s time to put the wand down and stop conjuring these numbers into our code.

Addendum: A better dealing loop

The dealing loops shown above are still wrong in my opinion. Though not related to magic numbers anymore, here is code that does the dealing between players in a cleaner fashion. It avoids duplicate loops.

for i in 0..len(cards):
    player_cards[i % num_players].add( cards[i] )

Another option more closely mimics dealing and avoids ranges entirely. It also uses an OOP player that contains cards. This type of code is suitable when you need to model all the states visually, or when there is partial dealing involved. It retains intermediate states.

cur_player = 0
while len(cards) != 0:
    player[cur_player].cards.add( cards.pop() )
    cur_player = (cur_player + 1) % num_players

How does a mutex work? What does it cost?

Concurrent programming requires synchronization. We can’t have more than one thread accessing data at the same time; otherwise, we end up with a data race. The most common solution is to wrap the critical data access in a mutex. Mutexes are, of course, not free. A mutex can have a significant impact on the cost of the code we are writing. When used correctly we’ll barely notice the overhead. When misused, it can cause a program to run worse in threaded mode than it would have single threaded!

Go further with the 🎞️ video class.

🛠️ Write your own Mutex

📔 Learn about user and kernel space operations.

What is a mutex?

A mutex, in its most fundamental form, is just an integer in memory. This integer can have a few different values depending on the state of the mutex. When we speak of mutexes, we also need to speak about the locking operations. The integer in memory is not intriguing, but the operations around it are.

There are two fundamental operations which a mutex must provide:

  • lock
  • unlock

A thread wishing to use the mutex must first call lock, then eventually call unlock to release it. There can be only one lock on a mutex at any given time. The thread holding the lock is the current owner of the mutex. If another thread wishes to gain control, it must wait for the first thread to unlock it. This mutual exclusion is the primary goal of the mutex, and indeed the origin of the name.

The lock operation has several variants. In most cases, we’d like to wait until we can lock the mutex, so the most common lock operation does precisely this. Other times we may wish to wait only for a given period, and sometimes we may not want to wait at all.

In contrast, unlock is usually just one function. It makes the mutex available for another thread to lock.

Contention

Attempting to lock an already locked mutex is called contention. In a well-planned program, contention should be quite low. You should design your code so that most attempts to lock the mutex will not block.

There are two reasons why you want to avoid contention. The first is that any thread waiting on a mutex is not doing anything else — possibly resulting in unused CPU cycles. The second reason is more interesting, in particular for high-performance code. Locking a currently unlocked mutex is cheap compared to the contention case. We have to look at how the mutex works to understand why.

How does it work?

As mentioned before, the data of a mutex is an integer in memory. Its value starts at 0, meaning that it is unlocked. If you wish to lock the mutex, you check if it is zero and then assign one. The mutex is now locked, and you are the owner of it.

The trick is that the test-and-set operation has to be atomic. If two threads happen to read 0 at the same time, then both would write 1 and think they own the mutex. Without CPU support there is no way to implement a mutex in user space: this operation must be atomic with respect to the other threads. Fortunately, CPUs have a function called “compare-and-set” or “test-and-set” which does exactly this. This function takes the address of the integer and two values: a compare value and a set value. If the comparison value matches the current value of the integer, then it is replaced with the new value. In C style code this might look like this:

int compare_set( int * to_compare, int compare, int set );

int mutex_value = 0; /* 0 means unlocked */
int result = compare_set( &mutex_value, 0, 1 );
if( result == 0 ) { /* we got the lock */ }

The caller determines what happens by inspecting the return value: it is the value of *to_compare from before the swap. If this value equals the compare value, the caller knows the set was successful. If the value is different, then the call was unsuccessful. When the section of code no longer requires the lock, it can set the value back to 0. This makes up the fundamental part of our mutex.
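
With nothing more than compare_set we could already build a busy-waiting spin-lock. This is only a sketch: it burns CPU the whole time it waits, and it ignores the memory fences a real lock needs (more on fences later).

void spin_lock( int * mutex_value ) {
    /* loop until we observe 0 and manage to swap in 1 */
    while( compare_set( mutex_value, 0, 1 ) != 0 )
        ; /* busy-wait */
}

void spin_unlock( int * mutex_value ) {
    *mutex_value = 0; /* a real unlock also needs a fence here */
}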

Atomic increment/decrement functions could also be used and are the recommended way if using the Linux futex.

What about waiting?

Now comes the tricky part. Well, it’s tricky only in one sense; in another it’s simple. The test-and-set mechanism provides no support for a thread to wait on the value (aside from a CPU-intensive spin-lock, as sketched above). The CPU doesn’t fully understand high-level threads and processes, so it isn’t in a position to implement waiting. The operating system (OS) must provide the waiting functionality.

For a thread to wait correctly, it needs to go through a system call. The OS is the only thing that can synchronize the various threads and provide the waiting functionality. If we have to wait on a mutex, or release a mutex that others are waiting on, we have no choice but to call the OS. Most OSs have built-in mutex primitives. In some cases, they provide full-fledged mutexes. If a system call provides a full mutex, why would we bother with any test-and-set in userspace? The answer is that system calls have quite a bit of overhead and should be avoided when possible.

Various operating systems diverge at this point, and will likely change as time goes on. Under Linux, there is a system call futex which provides mutex like semantics. If there is no contention, the call is resolved in userspace. If there is contention, the call is delegated to the OS to handle in a safe, albeit far costlier manner. The OS handles the waiting as part of the process scheduler.

futex is quite flexible in allowing the creation of various locking mechanisms in addition to a mutex, such as a semaphore, a barrier, a read-write mutex, and event signalling.
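
To make this concrete, here is a minimal sketch of a mutex built on futex, patterned after Ulrich Drepper’s paper Futexes Are Tricky. It assumes Linux with the GCC atomic builtins, and omits all error handling; a production mutex does considerably more.

#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

typedef struct { int state; } mutex; /* 0 unlocked, 1 locked, 2 locked with waiters */

void mutex_lock( mutex * m ) {
    /* fast path: uncontended, resolved entirely in userspace */
    int c = __sync_val_compare_and_swap( &m->state, 0, 1 );
    if( c != 0 ) {
        /* slow path: mark as contended, then ask the OS to put us to sleep */
        if( c != 2 )
            c = __sync_lock_test_and_set( &m->state, 2 ); /* atomic exchange */
        while( c != 0 ) {
            syscall( SYS_futex, &m->state, FUTEX_WAIT, 2, NULL, NULL, 0 );
            c = __sync_lock_test_and_set( &m->state, 2 );
        }
    }
}

void mutex_unlock( mutex * m ) {
    if( __sync_fetch_and_sub( &m->state, 1 ) != 1 ) {
        /* there were waiters: fully release, then wake one of them */
        m->state = 0;
        syscall( SYS_futex, &m->state, FUTEX_WAKE, 1, NULL, NULL, 0 );
    }
}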

The Costs

There are a few points of interest when it comes to the cost of a mutex. The most vital point is the waiting time. Your threads should spend only a fraction of their time waiting on mutexes. If they are waiting too often, then you are losing concurrency. In a worst-case scenario many threads always trying to lock the same mutex may result in performance worse than a single thread serving all requests. This isn’t a cost of the mutex itself, but a serious concern with concurrent programming.

The overhead costs of a mutex relate to the test-and-set operation and the system call that implements a mutex. The test-and-set is likely a minuscule cost; being essential to concurrent processing, CPU vendors have a strong incentive to make it efficient. We’ve ignored another important instruction, however: the fence. This is used in all high-level mutexes and may have a higher cost than the test-and-set operation. I talk a bit more about fences in CPU Memory – Why do I need a mutex?. Most costly however is the system call. Not only do you suffer the context switch overhead, the kernel now spends some time in its scheduling code.

You may also be interested in reading CPU Memory – Why do I need a mutex?.

 

How to fail a programming interview

“I’ve conducted many interviews. There are many reasons why I reject people, but there are recurring themes. I want to give a list of the most common problems. A few of these things I once thought were obvious, while others are perhaps less so.”

This article has been moved to Interview.Codes.

Sorry, but I have no way to do an auto-redirect on WordPress.com. They also don’t allow canonical-url, so I can’t keep it here as well. 😞

Of course HTML is a programming language

Is HTML a programming language? I could express a bit of shock, even dismay, at the question, but instead, I’ll try a more polished approach. In some sense, it may be a legitimate question, in which case it deserves a fair answer. In another sense, it may be an attempt at gatekeeping. I suppose that also must be addressed.

My first concern is, what value would it be to say it isn’t a programming language? Does it bring us some clarity as to its purpose? Is there a doubt that HTML is used by programmers to create UIs? Is there doubt that HTML has source code? Do we doubt that HTML source code is subject to the same issue systems, requirements management, version control, peer review, and testing like any other source code? I’m not certain what clarity it would bring to say it isn’t a programming language.

However, before I conjecture on what else we could be asking, let’s take a more technical look.

HTML is a declarative programming language. Unlike an imperative programming language, which tells a computer how to do things, a declarative language tells the computer what the result should be. The engine that processes the code figures out a way to get the system in that state.

There are a lot of domain-specific declarative programming languages. Configuration files generally fall into this same category. The lists of options declare how you want a system configured. In DevOps, configuration is consistently referred to as code.

Occasionally we’ll see Turing completeness brought up in the discussion. Turing machines have been a wonderful tool in analyzing computation, and algorithms in general. They are used when talking about computability, whether we can find the answer to a problem. Though they can simulate a lot of languages and machines, they’re terrible models of real-world practical computing. They simulate, but cannot model, many features we’d expect — including something as basic as random-access lists.

If we’re asking what a programming language is, I think we need to stay at the practical level. Let’s leave Turing completeness to the theory. A proper discussion in that realm requires going into a lot more detail. I’m not dismissing it. I find it fascinating, but not relevant to our question.

What might be a practical definition? How about: a programming language is a set of codes that unambiguously instructs a computer as to what it should do. HTML instructs the computer how to create a visual document, or app interface, for the user. Python, Java, or C++ give sequences of commands that tell the computer how to manipulate memory, modify files, and talk on the network. Haskell or OCaml create structures of equations that calculate results.

I might need to be careful with that “unambiguous” bit, as many languages have dark corners to them: unintentional defects in the language, intentionally unspecified behaviour, and the outright nasty undefined behaviours. So, we’ll have to live with for-the-most-part unambiguous.

Different languages instruct the computer differently. We recognize they aren’t all appropriate for every situation. There is no all-domain all-purpose language anymore.

Back to the question then, what value does it bring to say HTML isn’t a programming language? I fear the answer is related to gatekeeping. Some individuals feel a need to distinguish the quality of programmers based on some artificial goal posts. It helps keep the role of programmer artificially elite — for some definition of elite.

It’s not like the label “programmer” magically makes one qualified to do all types of programming. It’s a large field. We have lots of tools and lots of approaches. Somebody who knows only HTML won’t likely be hired to program a game engine or embedded sensor. Nor would a data analyst specializing in NumPy be a good fit for a mobile interface.

But even still, who only knows HTML? Perhaps we’re talking about UI/UX designers who use it to create prototypes. These are people who will work with a programmer to help them get what they want. I don’t think they’re clamouring to be recognized as programmers themselves. I do consider them developers though; I have a chapter in my book about why I think this is important.

Otherwise, most of the web programmers I’ve met doing HTML seem also to be doing CSS and JavaScript. It’s difficult to program an interface using only HTML; it’s inevitably combined with other technology. It needn’t be, but that’s the most common situation. If somebody wishes to deny JavaScript as a programming language, well, let’s just ignore that foolishness.

In summary, HTML is a declarative programming language. It can’t do everything, but no language can. There’s no practical value in saying it isn’t a programming language. Though a programmer who knows only one language is limited in what they can do, this applies to all languages, including JavaScript, C#, Python, or whatever. We could argue that this theoretical person who knows only HTML is not a well-rounded programmer, but not that they aren’t a programmer at all. Instead of trying to exclude them, why not work with them, and encourage them to learn all the other beautiful things that can be done?

The User

Thinking that programming is about programmers is the worst mistake you can make. The single most important aspect of programming is the user. Keep this in mind through all the discussions we have, even down to the lowest technical details of an operating system. The only reason we need programming is for users.

This is an excerpt from my book What is Programming?, a complete outline to all the skills a programmer needs.

Who is the user? As they are the people we ultimately serve, we need to understand who they are, and who they aren’t. In the most abstract sense, a user is anybody using technology to accomplish a goal.

It’s the photographer touching up their photos. It’s a gamer relaxing behind a controller. It’s friends talking to each other on the phone. It’s an accountant entering numbers into a spreadsheet. It’s the thousands of people browsing social media, or the few organizing a weekend getaway. It’s a hungry man ordering a burger late Saturday night. It’s a musician putting the final touches on their new masterpiece. It’s the medical researcher trying to create a new vaccine.

The list focuses on people doing real things. I’ve intentionally omitted a lot of what I consider to be secondary users. These are people producing tools for other people to serve end users. These are the administrators and programmers who use technology and libraries. We’re users, but a special category. Our requirements need to be met, but ultimately we need to cater to the primary user.

Note that I haven’t mentioned software here. A user doesn’t care about software, hardware, or anything in between. The user has a high-level, real-world task. Many components need to work together to help them. Thinking about the hardware, software, and process separately does a disservice to the user.

Ideally, the user’s thoughts never leave their application domain. They shed no worry about the bits they are using, blending them seamlessly into their life. One approach here is to create a persona. Define a fictional person, their name, their job, where they live, and everything they do. Provide a flow for this person by trying to be part of their life. We need empathy to understand their problems. Any time we make a decision, we must consider the impact it has on those users. Frustration arises from mismatches in expectations, not necessarily defects.

Furthermore, we need to consider what abilities our user has. They aren’t as computer literate as us — again, their thoughts ideally don’t need to leave their domain. Getting into the user’s head lets us use terms, symbols, and processes they will understand. The user shouldn’t have to read this book to use the tools we make.

Way beyond knowledge, not all humans are created the same. Not just disabilities, but minor variations, like finger size, can make applications hard to use. We don’t all have the same reaction time, nor the same appreciation of, or even ability to distinguish colours. In a world where everything is connected, we have to consider all people. Leaving even a few behind is unfair.

Creating good software requires understanding our users. Who are they? What are they capable of? What do they want to do? This goes well outside the realm of programming, to all development teams. There is only one reason why we develop software, and it needs to be at the forefront of our thoughts, always.

The user.

Would you program a human?

With tales of CRISPR babies in the news, I’ve been pondering the implications of programming human beings. This may or may not have been spurred by a crime novel I recently read, which hinges on this topic. Were I offered a job to edit the genome of humans, would I take it?

I understand the question was easier in China, where the morality issue is less problematic. The overt goal doesn’t sound terrible either: guaranteeing the descendants of over a billion people will not be born with genetically inherited diseases. Who wouldn’t want that? Though, our ethics boards are not likely inclined to approve experimentation on humans — at least not until we see China racing ahead.

For the sake of the thought experiment, let’s assume the ethics boards have granted permission for this work to go forward, and I’ve been hired as a programmer. How does it start?

As a good programmer, I start with a user story, something that describes the people involved and what they’d like to accomplish with the software. When I get back from a consulting meeting in Beijing, I draw up this user story:

Xi, the wholly elected leader of a prosperous, populous nation, is facing challenging stability decisions. He has great concern for the citizens and does not sleep easy knowing they are suffering. He’s worried about people succumbing to addiction — drugs, alcohol, ~~individualism~~. He’s successfully launched monitoring programs to identify these people and alert neighbours to their plight. Now he wishes to go further. He’s looking for a genetic solution that would eradicate addiction.

Great, that sounds like a noble cause. With this insight, I can now recommend a variety of possible solutions. Naturally, I start with the latest version of the CRISPR gene editing technology.

I install the toolkit.

Immediately, I’m not impressed. The documentation is a mess. It’s outdated, and most of the API is just blank. StackOverflow is filled with questions about programs randomly crashing, and smug answers belittling the poster. I’ll have to poke around blindly looking for something that works.

You laugh, but this is the true state of affairs for the software that runs your phones, cars, medical devices, and military hardware. Do we really expect that we’d approach human programming more rigorously? We can’t stop the development of technology just because we haven’t figured out programming yet. We also can’t use the argument, “but these are people!” That holds little weight to the decisions made by giants like Facebook and Google — who essentially already control our lives through the software they write.

Alright, I have some code.

I can’t just deploy this. I need to test it. I wonder how this will even work. Do we have some emulator? Maybe, but it looks buggy. I see there’s an offer from India to outsource my testing. They’ve got a web form and ability to upload the code. I’m best off not thinking about what happens on the backend. As long as I get my results, it’s their concern, not mine.

I pay the contractors for the Indian testing. I’ve given them direct access to my issue system so they can file any defects they find.

Issue #18: Results in random clucking like a chicken
> CLOSED: not reproducible
> COMMENT: Second case of clucking
> REOPENED: Confirmed
> SEVERITY: Cosmetic, PRIORITY: Low

Given the randomness of the API, I’d expect mostly wacky results, but some might be promising.

Issue #37: Side-effect: blocks cerebral palsy progression
> ASSIGNED: George
> COMMENT: Identified the issue, working on a patch
> FIXED: Removed unintended side-effects

Perhaps such valuable side-effects don’t come up too often in programming. But all those medical researchers out there might wish to share their opinions on the amount of research that is buried, destroyed, or locked behind paywalls. If I’m on a deadline, or trying to save face, I’m likely to keep my head down and focus on getting the job at hand done.

This raises the question: what level of completeness is okay? It’s currently impossible to commit to both a deadline and a fixed feature set in software. This isn’t a lack of planning ability; it’s a fundamental uncertainty in how the profession works. There’s no reason to assume human programming will be any different.

What defects are we willing to accept when it comes to gene editing? If the program is to cure cerebral palsy, I imagine random clucking would be acceptable. But would moral judgements even allow the discussion of side-effects? There are some correlations between high IQ and other neurological disorders. Is it acceptable to edit genes that trade higher IQ for the risk of other genetic disorders?

The self-driving car discussions show we have a fundamental lack of knowledge for dealing with morality in software. The repercussions will be played out at a high political level, possibly without ever involving the programmer, despite them being essential to the answer.

Nonetheless, while the debate plays out, I continue coding. I won’t hold off on releasing the code. Until I’m explicitly told not to, I continue with my job. Plus, I feel safe: it’s not like I’ll ever be held personally responsible for the defects. Nobody is held liable for software defects now. Consider mass data breaches; how severely are those companies punished? Even in the medical world, there’s a litany of drugs with questionable side effects, yet those pharma concerns are still around.

I may be painting a bleak picture of gene editing. There are all sorts of positive uses for it, including the ability to eradicate genetically inherited diseases. But, realistically, how do we answer the questions about testing and defects? Should we argue whether this is even programming? It’d be hard not to call it that, since we legitimately have a paradigm called “generative programming” that has been around for decades.

Nothing but questions.

How about you? Would you accept a position in human programming?

We don’t need a ternary operator

A staple of compact code, the ternary operator ?: feels like it’s earned a place in any programming language. I use it often; you should use it often. However, I don’t like the operator itself. It feels complex and incomplete. Though I’ve stopped development on Leaf, I did find a better option.

Let’s take a deeper look at what this ternary operator is, then show how a language could avoid it.

What’s this ternary operator?

The name of this operator raises questions. A ternary operator, in a typical math sense, or from the view of a parser, is an operator that takes three arguments. An operator is a function that has a special syntax rather than the generic call syntax foo(a,b,c). Most common are binary operators; we see them everywhere.

//binary operators
x + y
x / y
x * y

There are also a handful of unary operators we see commonly. Unary means they take only one argument.

//unary operators
-a
~bits
!cond

There’s no shortage of unary and binary operators. But what examples of ternary operators do we have?

//ternary operators
cond ? true_value : false_value

Are you scratching your head trying to think of another one? As far as I know, there aren’t any. This is why we end up calling this the “ternary operator” instead of using a proper name, like “conditional operator”. Programming languages simply have no other ternary operators.

Chained Operators

Let’s step aside for a second. There are operator sequences and combinations that may look like ternary operators but aren’t.

For example, this Python comparison appears to involve three arguments.

if a < b < c:
    go()

That looks similar to a ternary operator. It must consider all three arguments to evaluate correctly.

However, digging deeper, this is more syntactic sugar than a true ternary operator. It’s equivalent to the following.

once_b = b
if a < once_b and once_b < c:
    go()

You may find several examples that show only a < b and b < c, but those are incorrect. The original form guarantees that b is evaluated only once. I use the once_b = b to show this guarantee; I assume a different internal feature achieves it in practice.
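
If you want to see this guarantee in action, a quick test works; the counting function b here is my own construction for illustration:

calls = 0

def b():
    global calls
    calls += 1
    return 5

#the middle value of the chain is produced exactly once
if 1 < b() < 10:
    pass

print(calls)  #prints 1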

This python construct can be extended to include even more values.

if a < b < c < d < e:
    go()

It’s a chained operator. There’s no limit to how many comparisons can be added.

The << stream operator in C++ can also be chained.

cout << "Hello " << first_name << " " << last_name << endl;

Unlike Python’s chained comparisons, this C++ sequence doesn’t require any parser support. << is a normal binary operator that happens to be chainable. Basic math has this same property, for example a + b + c + d.

Messy syntax and parsing

I said there aren’t any other ternary operators in programming languages. There’s a good reason for this.

Look at cond ? true_value : false_value. Is there anything that makes it look like three arguments are involved? What if I change the operators slightly, to unknown ones: expr ⋇ value_a ⊸ value_b. That looks like two binary operators. Even if I keep the same symbols but add nested expressions, it looks off: cond ? value_a + value_b : value_c * value_d. There’s no visual indication that ? ... : is one unified operator. This syntax limitation prevents any new ternary operators from being introduced; it’d create a backlash. ?: is already in some coders’ bad books as it’s easy to abuse.

Despite providing a valuable syntax, the ternary operator is syntactically messy. It’s just weird to have two split symbols define three arguments. A function with three arguments is not a problem, since we have a robust syntax for it: foo( a, b, c ).

Adding to the complexity is precedence. All operators require a precedence. For example, a + b * c is evaluated as a + (b*c): multiplication has higher precedence than addition, so its result is computed first.

Operators of the same precedence also have associativity. For example 5 - 2 + 3 is evaluated as (5 - 2) + 3 = 6, which is left associative. Right associativity would be evaluated as 5 - (2 + 3) = 0, which is wrong.

Right associativity is generally reserved for unary operators, which only have a right argument, and assignment operators. For example, if you’re crazy enough to write a = b += c, right associativity evaluates this as a = (b += c).

The ternary operator is right associative. That doesn’t feel right, though. How can an operator with three arguments and two symbols have associativity defined merely as left or right?

cond_a ? val_one : cond_b ? val_two : val_three

//right? associative
cond_a ? val_one : (cond_b ? val_two : val_three)

//left? associative (wrong)
(cond_a ? val_one : cond_b) ? val_two : val_three

//strict-left? associative (wacky)
(((cond_a ? val_one) : cond_b) ? val_two) : val_three

//strict-right? associative (nonsensical)
cond_a ? (val_one : (cond_b ? (val_two : val_three)))

The two strict forms, which apply associativity to each operator symbol individually, are wacky. Yet think for a moment from the point of view of the parser: it’s one of those two that makes the most sense syntactically. The first two forms require a bit of trickery to bypass regular associative parsing.

Try to imagine what happens with nested ternary operators: a ? b ? c : d : e, which parses as a ? (b ? c : d) : e. You won’t find nested ternaries often in code. They are too hard to parse mentally.

You will, however, find lots of code depending on the quasi-right associativity. That’s what allows chaining.

int a = cond_a ? val_one :
    cond_b ? val_two :
    cond_c ? val_three : val_four;
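
If the chain is hard to read, note that it computes the same selection as a plain decision ladder. In Python terms (my own sketch, not from the original code):

#example values; any booleans and values work here
cond_a, cond_b, cond_c = False, True, False
val_one, val_two, val_three, val_four = 1, 2, 3, 4

#the same selection as the chained ternary above
if cond_a:
    a = val_one
elif cond_b:
    a = val_two
elif cond_c:
    a = val_three
else:
    a = val_four

print(a)  #prints 2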

A binary solution

When I was working on Leaf, I wanted to avoid the ternary operator. I liked the feature it provided, but I wanted to find a more fundamental approach to providing it.

The solution came in the form of language optionals. Optionals were fundamental in Leaf and thus had operator support.

The first operator was a defaulting one. If the optional had a value, it’d evaluate to that value. Otherwise, it’d evaluate to the provided default.

var a : optional
print( a | 1 )  //prints 1, since `a` has no value

var b : optional = 2
print( b | 3 )  //prints 2, since that's the value in `b`

The || operator in JavaScript achieves the same effect in many cases. C# has a null-coalescing operator ?? that uses null the same way Leaf uses unset optionals.

This sounds like half of what the ternary operator does. We could look at it like this: cond ? some_val | default_val. That is, consider the entire left part, cond ? some_val, to produce an optional value. Given an optional, we already have operators to default that value when unset.

Leaf incorporated the ? operator to create an optional.

var a = true ? 5  //a is an optional set to value 5
var b = false ? 6  //b is an optional without a value 

On its own, this operator is often useful for optionals. Combined with the | default operator, it mimics the traditional ternary operator.

var a = cond ? true_value | false_value  //Leaf
int a = cond ? true_value : false_value; //C++

? and | are both binary operators. There is no need to have an actual ternary operator to produce the same conditional evaluation. We get the same feature without the syntactic messiness. It’s a lot clearer what happens, and the parser doesn’t require any trickery.
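
To make this concrete, here is roughly how the pair could be emulated in Python. The Opt class and opt_if function are names I invented for this sketch; Python offers no way to define a new ? operator, so a plain function stands in for it:

#a minimal optional wrapper; | supplies the default when unset
class Opt:
    _unset = object()

    def __init__(self, value=_unset):
        self.value = value

    def __or__(self, default):
        #note: both sides are evaluated eagerly here, unlike a
        #real conditional operator, which evaluates only one arm
        return default if self.value is Opt._unset else self.value

#stands in for Leaf's `cond ? value`
def opt_if(cond, value):
    return Opt(value) if cond else Opt()

print( Opt() | 1 )            #prints 1, no value set
print( Opt(2) | 3 )           #prints 2, the stored value
print( opt_if(True, 5) | 7 )  #prints 5
print( opt_if(False, 6) | 7 ) #prints 7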

It also improves expressiveness without introducing more symbols, as both operators are useful on their own.

Alas, I’ve not seen any plans to add this to any language. If you know of one, please leave a comment. I’d be curious to see it.

ELI5: The filter, map, zip, and reduce operations

You’ve baked a fresh batch of gingerbread cookies. A bunch of little men, snowmen, stars, and bells are cooling on some trays. You clean up the table and lay out the colours and sprinkles in anticipation.

Good, the cookies are cool enough to start decorating. The little men need clothing. The snowmen need covering with sugary snow. The stars need outlining and filling with yellow. The bells get a simple jagged line pattern.

It’s a bit messy to have the colours for all shapes open at once. It’d be better to do one form at a time. Let’s start with the snowmen. You look through the cookies, grab the first snowman, put icing and sprinkles on it, then set it to the side. Now look through the cookies and do the same for all the snowmen you find.

Once done, you can pack up the snowman decorations and get the star colours ready. You go through all the cookies again, this time decorating the stars. Then change colours and repeat for the men and bells.

In programmer terms, you’re doing two things to the cookies: filter and map. Scanning the big pile of cookies and taking only those of a given form, like snowmen, is called a “filter”. You filter out all those you don’t want, ending up with only the snowmen.

You decorate each cookie in a specific way. This is called a “map”. It doesn’t change the number of cookies. You start with a bunch of plain cookies, then end up with the same number of decorated cookies. They all get the same decoration as well. Since we have only one type at a time, like snowmen, we can keep using the same colours for the entire “map” process.

We do the “filter” and “map” once for each of the cookie cutters we used. This is four times. Once for men, snowmen, stars, and bells.

You’re left with four different sets of cookies, one of each shape. It’d be a bit boring to have them all grouped together like that in the cookie boxes. First, you grab one of each form and stick them in a pile. Then you take each pile and dump it into the box. The piles are gone now, only a box full of mixed cookies remains.

Creating the piles of four cookie forms is sometimes called a “zip” operation. It works like the zipper on your pants, as it connects the left and right side one bit at a time. In this case, we have four things we’re combining instead of two. We zip up the four piles of cookies into a bunch of piles, each with four different cookies.

Putting these all together in a box is called “reducing” the sets. We “reduce” the individual piles into a single box of cookies. An interesting thing about reducing is that it doesn’t matter what order we grab the piles. We can even group some together in bigger piles first — to save walking up and down the table too often. They all end up together in the same box.
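
For the programmers reading over your shoulder, the whole cookie day fits in a few lines of Python; the names here are mine, invented for this sketch:

from functools import reduce

cookies = ["man", "snowman", "star", "bell",
           "bell", "star", "snowman", "man"]
shapes = ["man", "snowman", "star", "bell"]

def decorate(cookie):
    return "decorated " + cookie

#filter: keep one shape at a time; map: decorate each of them
sets = [list(map(decorate, filter(lambda c: c == shape, cookies)))
        for shape in shapes]

#zip: make piles holding one cookie of each shape
piles = zip(*sets)

#reduce: dump all the piles into a single box
box = reduce(lambda acc, pile: acc + list(pile), piles, [])
print(box)  #eight decorated cookies, mixed by shape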

“filter”, “map”, “zip” and “reduce”: some funny terms for things you do every day. But have we missed something? What else can we do with the cookies?

That’s right, we need to eat them. Programmers like to call that getting the munchies.

Sadly, I must say goodbye to Leaf (my programming language)

This is hard to write, or rather, it’s a hard decision to make. After investing many years in Leaf, a programming language and a major life project, I need to take a break from it. It’s no longer fulfilling my goals, and it’s not as fun as it once was. It’s interfering with my other career priorities. Leaf has become part of my identity, and this feels like ripping out one of my organs. But it’s necessary. Let me explain why.

I’ve lost track of the specific event that led to the creation of Leaf, though I do know some of the key reasons. First and foremost, I wanted a technical challenge. Some people may find that odd, because programming is a vibrant and diverse world. But I’ve done a lot of programming. I’ve tried many projects and continually looked for new ways to do things. I’ve worked in numerous startups and had many side ventures. Finding new challenges had become quite unlikely, and the lack of novelty was dragging me down.

Even though I’d done some toy languages before, they were nothing compared to an entirely modern language. I was hoping for new obstacles, and on that front Leaf delivered. It stopped being a hobby and became my motivation.

I solved exciting problems and learned a great deal. Building the type system got me deep into thinking about type theory. It exposed me to unknown corners in the structure of languages. It’s the part I wish I could continue working on. Working through the complexities of memory management at the language level was invigorating. And of course error handling. Wow! That was quite the experience. Getting the block flow correct for LLVM is one of those critical compiler challenges.

It’s those achievements that make my decision extremely hard. How can I take this wonderful thing I’ve built and just set it on the shelf? But I was already feeling that urge over a year ago. Something was missing. I needed to do more with it.

I drove Leaf in the direction of creating real programs. Above all theoretical tasks, it should be useful. Earlier this year, I streamed the development of a snake-like game called “Wormies”. It gave meaning to the language. It gave me a way to share the work with others.

Sharing is vitally important. I’ve written several technical articles, but the audience is very limited. Let’s face it, nobody knows who creates the languages we use. Sure, a select few names pop up here and there, but for the most part, the language designers and coders are not known. It’s also upsetting to see somebody like Guido step down from the Python project. The success of a language guarantees you lose control over it.

But I had a more immediate problem: I hit a stumbling block in the game development. A missing feature, OOP interfaces, was preventing me from using the language the way I wanted. It jumped to the top of my todo list, but I’ve not been motivated to write it. There’s nothing new for me there, nothing new in the realm of compilers either. Sure, it isn’t trivial, but it is a lot of straightforward coding with no intrinsic reward.

Even with that step done, there’s an enormous amount of work on the libraries and environment, all required before anybody else could hope to use the language. Unfortunately, it’s everyday work: no challenge and no learning for me.

Maybe I should think about my career. Compiler and language developers are rare breeds, but so are the projects that need them. And would I want to work on yet another language, knowing I’ve already solved the interesting problems?

My personal goals lie more on the creative side. I want to do graphics, write music, make some games and record some videos. I had a plan where I’d create all this in Leaf. I’d build out the system enough to do these creative projects. But people that want to see the creative stuff don’t care about the language. Limiting myself to Leaf holds back these other projects. Instead of gaining from my language, I’m actually losing out.

That’s a grim conflict. I knew I could get it working, but in what timeframe? I could never afford to work full-time on it. If it were corporate-driven, with a salary attached, the situation would be different. I could still pursue my other projects on the side. But as it is, I have too many ideas competing for the little time I have.

It’s damn hard to let go. I’ve invested a lot of time into Leaf, but I begrudgingly admit those are sunk costs. I must evaluate where Leaf stands now relative to my current goals. And, quite unfortunately, pursuing Leaf at this time is not in my best interest.

Maybe I can pretend it’s a hiatus to lessen the emotional blow.