Getting De-railed

Anna's blog

How to Rotate a Matrix in Ruby

Here is a problem that I recently stumbled upon while researching something completely different (as these things usually go), and down the rabbit hole I went.

How does one rotate a matrix?

First of all, let’s define parameters of the exercise. We will take a matrix – an array of arrays – and turn it 90° clockwise.

So this is what it would look like, using Flatiron’s whiteboard table tops (we are really a paper-free school):

As you can see, numbers 1 (position[0][0] in zero-indexed terms) and 9 (position [2][2]) traded places. One thing is immediately obvious: this would only work if the matrix in question is ‘square’: that is, if there are x many rows with x many elements each.

Now that we have defined that condition, let’s start with a simple 3x3 matrix.

If the original array looks like this, Ruby-style:

my_array = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

The modified array would look like this:

rotated_array = [[7, 4, 1], [8, 5, 2], [9, 6, 3]]

But how to get there?

Ruby actually comes with default methods that allow one to manipulate arrays. Most people know pop, shift, unshift, and drop, but there are way more useful methods.

I was only marginally familiar with rotate, so here is an example:

According to the docs:

Returns a new array by rotating self so that the element at count is the first element of the new array.

When using it without an argument (or rather with the default count of 1), it works like this:

It effectively takes the first part of the array and places it at the end of the array. If we use that 5 times on the array with 5 elements, we will arrive at the original array (well, it will return what looks like the original array as the original array does not get modified).

If used with a count that is not 1, it will move the elements down by the count number of places:

(array.rotate(2) is the same as array.rotate.rotate)

Count can also be negative, in which case it will rotate the array in the opposite direction, taking items from the end of the array and placing them in the front:

Like many other methods, it can be used with a bang to modify the original array: rotate!
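
A minimal sketch of the behaviours described above, using a made-up five-element array (the variable name is only for illustration):

```ruby
letters = %w[a b c d e]

# Default count of 1: the first element moves to the end.
p letters.rotate        # ["b", "c", "d", "e", "a"]

# A count of 2 is the same as rotating twice.
p letters.rotate(2)     # ["c", "d", "e", "a", "b"]
p letters.rotate(2) == letters.rotate.rotate   # true

# A negative count takes elements from the end and puts them in front.
p letters.rotate(-1)    # ["e", "a", "b", "c", "d"]

# Rotating by the full length lands back on the starting order.
p letters.rotate(letters.length) == letters    # true

# None of the calls above touched the original; the bang version does.
letters.rotate!
p letters               # ["b", "c", "d", "e", "a"]
```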

I played around with rotate, and it became clear that I was going to need something else to complete the task:

It handily re-arranged the nested components in the array, moving each ‘inner’ array over by 1 at a time.
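
As a rough illustration of why rotate alone is not enough, here is what it does to the 3x3 example (a sketch, not a record of the original experiments):

```ruby
my_array = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# rotate treats each inner array as one element, so whole rows move.
p my_array.rotate          # [[4, 5, 6], [7, 8, 9], [1, 2, 3]]

# Rotating inside each row shifts the numbers sideways,
# but the matrix itself still has not turned.
p my_array.map(&:rotate)   # [[2, 3, 1], [5, 6, 4], [8, 9, 7]]
```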

There is also transpose. It is actually a bit trickier to comprehend:

For a nested array that contains 2 arrays of 2 elements each, it seems like all it does is switch the elements in the [1][0] and [0][1] positions with each other – in the case below, only 2 and 3 traded places. But what it actually does, as becomes more obvious with a larger array (nested with 3 arrays of 3 elements each), is make columns rows and rows columns.

Here is an example:

Unlike rotate, transpose does not take any arguments.
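
A small sketch of both cases, the 2x2 one where only two elements appear to swap and the 3x3 one where the row/column flip is easier to see:

```ruby
small = [[1, 2], [3, 4]]
p small.transpose      # [[1, 3], [2, 4]]

my_array = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
# The first column (1, 4, 7) becomes the first row, and so on.
p my_array.transpose   # [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
```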

By the way, although I won’t be using it here, one of my favorite array methods is also flatten, which is a handy way to get nesting out of arrays:

As an aside, flatten is also a handy way of converting a hash into a simple array:
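
For instance (the sample data here is invented for the sketch):

```ruby
nested = [1, [2, [3, [4]]]]

# With no argument, flatten removes every level of nesting.
p nested.flatten      # [1, 2, 3, 4]

# An optional depth limits how many levels are removed.
p nested.flatten(1)   # [1, 2, [3, [4]]]

# On a hash, flatten returns keys and values interleaved in one array.
prices = { apple: 1, pear: 2 }
p prices.flatten      # [:apple, 1, :pear, 2]
```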

But back to the task at hand:

Upon experimenting with rotate, flatten, and transpose, I think I came up with a pretty efficient technique using transpose and then reverse:

The array is first transposed and then iterated over, and each of the nested arrays inside it is reversed.
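
A minimal sketch of that technique; the method name rotate_clockwise is a hypothetical one chosen for this example:

```ruby
# Transpose turns columns into rows; reversing each row then
# completes the 90-degree clockwise turn.
def rotate_clockwise(matrix)
  matrix.transpose.map(&:reverse)
end

my_array = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
p rotate_clockwise(my_array)   # [[7, 4, 1], [8, 5, 2], [9, 6, 3]]
```

Neither transpose nor map modifies my_array, so the original matrix is still available afterwards.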

Works like a charm. But if you want to do something slightly more exciting:

Let’s look at the original array again:

my_array = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

And at the rotated_array:

rotated_array = [[7, 4, 1], [8, 5, 2], [9, 6, 3]]

For a zero-indexed language, the nested array will look like this, positions-wise:

[6,3,0]
[7,4,1]
[8,5,2]

OR:

(position of first element (0 in this case) + 2*3), (position of first element + 1*3), (position of first element + 0*3)
(position of second element + 2*3), (position of second element + 1*3), (position of second element + 0*3)
(position of third element + 2*3), (position of third element + 1*3), (position of third element + 0*3)

Or for a nested array:

[[2][0],[1][0],[0][0]],
[[2][1],[1][1],[0][1]],
[[2][2],[1][2],[0][2]]

Which is:

[[my_array.length-1][my_array.length-3],[my_array.length-2][my_array.length-3],[my_array.length-3][my_array.length-3]],
[[my_array.length-1][my_array.length-2],[my_array.length-2][my_array.length-2],[my_array.length-3][my_array.length-2]],
[[my_array.length-1][my_array.length-1],[my_array.length-2][my_array.length-1],[my_array.length-3][my_array.length-1]]
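
The position tables above boil down to one rule: the element at row r, column c of the rotated matrix comes from row (length - 1 - c), column r of the original. A sketch under that reading, with a hypothetical method name:

```ruby
# Builds the rotated matrix purely from index arithmetic,
# without transpose or reverse. Assumes a square matrix.
def rotate_by_index(matrix)
  n = matrix.length
  Array.new(n) do |row|
    Array.new(n) { |col| matrix[n - 1 - col][row] }
  end
end

my_array = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
p rotate_by_index(my_array)   # [[7, 4, 1], [8, 5, 2], [9, 6, 3]]
```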

This would be a cool structure to use if pushing things into the array (I am using absolute positions instead of array.length-related as it is easier in this example):

(as you can see I made a typo and then fixed it)

All you need to do thereafter is convert it into a nested array.

You can also do a chain of push statements:
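
Since the original screenshots are not reproduced here, the following is only a guess at what such a chain might look like, using absolute positions and then each_slice to restore the nesting:

```ruby
my_array = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# push returns the array itself, which is what makes chaining possible.
flat = []
flat.push(my_array[2][0]).push(my_array[1][0]).push(my_array[0][0])
flat.push(my_array[2][1]).push(my_array[1][1]).push(my_array[0][1])
flat.push(my_array[2][2]).push(my_array[1][2]).push(my_array[0][2])
p flat            # [7, 4, 1, 8, 5, 2, 9, 6, 3]

# each_slice(3) regroups the flat list into rows of three.
rotated_array = flat.each_slice(3).to_a
p rotated_array   # [[7, 4, 1], [8, 5, 2], [9, 6, 3]]
```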

All of these are pretty fun solutions to the problem.

RegEx Is Just Like Mandarin Chinese

When I was in primary school in Russia, we had an intro to computer science class that taught us the basics of programming to produce good post-Communist STEM-oriented comrades. It used Basic to teach the fundamentals. As I spoke no English at the time, I spent more time memorizing words like GOTO and GOSUB than actually figuring out coding logic.

English speakers have a very distinct advantage when it comes to programming in most languages: we know all the words already. One doesn’t even have to be an actual native speaker: Ruby was famously written by a Japanese programmer.

Whether a native speaker or not, it is a lot easier to memorize certain methods in Ruby (see: any?; none?; all?) when one actually understands how to use those words in a sentence.

Learning to code is comparable to learning a foreign language: a large part of it is comprehending the principles and then complementing it with memorizing quirks of grammar.

I compared SQL to German in class before. German is a very structured language where sentences have to be built and organized a very certain way, just like SQL.

Compare that to an Eastern European language, where we take major liberties with positioning words in a sentence. I used to tutor Russian as an undergrad at Yale; I remember students coming to me panicking because they could not figure out why subjects and objects were creatively arranged in a sentence and what to do about it. All I could say at the time was that they were lucky they were not learning Ukrainian, which has the same creativity, but more complex grammar.

Ruby feels like an Eastern European language to me – there are many ways to write code, but there are still very specific ‘grammar’ rules one needs to follow to make code ‘grammatically correct.’ The stakes of course are different: while one can make oneself understood and convey the message well in broken Russian, Ruby won’t run broken code at all or will run it incorrectly.

Here is the problem with RegEx: it is just like Mandarin Chinese. Judging by the number of desperate, hate-filled posts on StackOverflow, it is as hard to learn.

Mandarin Chinese, like most other dialects of Chinese, actually has no well-developed grammar. There are ways to structure sentences, but there are no 7 cases of Ukrainian or 12 tenses of English. Past tense is barely denoted, let alone formed in the myriad ways that hapless students of Germanic languages spend years memorizing.

When I first started Ruby, I routinely had to spend a fair bit of time ensuring I had an appropriate number of ‘end’ keywords: one to end an if statement, one to end a do block, one to end a method, one to end a class… That’s a lot of ‘grammar’ to learn – but at least we all know what the word ‘end’ means, so that makes it easier. With RegEx, you have to poke around various seemingly randomly assigned symbols and hope that they work. Just like with Mandarin, half the time you are hoping you are referring to the correct character that has no real linguistic content and that you are using it correctly.

For a project in class, I had to write the following expression to validate a potential Twitter handle: /^[A-Za-z0-9]{1,15}|^@[A-Za-z0-9]{1,15}/. It’s not technically that difficult, but who decided that ^ stands for start of line? Or that $, for that matter, stands for end of line? Or that ‘a?’ means zero or one of something?

There might be some internal logic to it, just like with Chinese (vestiges of old-school computer science perhaps?), but for a modern-day person without much linguistic…I mean computer science background, it is not intuitive.

On top of that, to use RegEx is like writing an application without a good test suite: you think that you have thought of everything, but there is always that one edge case that can pop up and screw it over. I felt like that when learning Mandarin: something always pops up where the word has a different meaning or is really meant to be used with a different character to really ‘work.’
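
One such edge case can be sketched with the handle pattern quoted earlier. The stricter variant below is an assumption made for illustration (it is not the class project's solution), and both constant names are invented:

```ruby
# The pattern from the class project, copied as written.
ORIGINAL_PATTERN = /^[A-Za-z0-9]{1,15}|^@[A-Za-z0-9]{1,15}/

# A stricter sketch: \A and \z anchor the whole string,
# and @? makes the leading @ optional.
STRICTER_PATTERN = /\A@?[A-Za-z0-9]{1,15}\z/

p ORIGINAL_PATTERN.match?("@anna")                  # true
p STRICTER_PATTERN.match?("@anna")                  # true

# With no end anchor, the original is satisfied by the leading "this".
p ORIGINAL_PATTERN.match?("this is not a handle")   # true
p STRICTER_PATTERN.match?("this is not a handle")   # false
```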

The good news is that both RegEx and Mandarin are possible to master. Perseverance and a healthy amount of grunt work are key. But just like Mandarin is handy for freaking out Chinese restaurant employees and nail salon workers, RegEx is handy too – in fact, it might have the greater utility of being used often for the rest of one’s Web Development career.

Why I Don't Like Semicolons

So I started learning JavaScript.

They say that once you understand basic coding principles, learning a new programming language is mostly a matter of figuring out syntax and getting used to whatever special tricks that language has to offer. It certainly seems to be the case with this Ruby to JS transition.

But I certainly miss the syntactic sugar that Ruby gives us – and what is with all the curly brackets and the semicolons?!

Consider Project Euler Problem 1 (summing up the multiples of 3 or 5 from 1 to 999).

There is just so much we can do with it in Ruby:
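
Two of the many possible Ruby spellings, as a sketch (the method names are hypothetical and the original examples may well have differed):

```ruby
# Filter a range, then add up what is left.
def sum_with_select(limit)
  (1...limit).select { |n| n % 3 == 0 || n % 5 == 0 }.sum
end

# Walk the multiples directly with step; | merges the two lists
# and drops duplicates such as 15.
def sum_with_step(limit)
  (3.step(limit - 1, 3).to_a | 5.step(limit - 1, 5).to_a).sum
end

puts sum_with_select(1000)   # 233168
puts sum_with_step(1000)     # 233168
```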

And here comes JavaScript:

But just like mastering a foreign language that allows one to express oneself in a new and expanded way (as an example, google ‘toska’ and ‘Nabokov’ – Russians take melancholy to a whole new level that does not exist in any other language), JavaScript offers something pretty cool: for (var x = 0; x < 1000; x++) – now that is one handy way to guide one’s iteration. I sort of wish Ruby had that…

But for now, I need a T-shirt that says ‘I don’t really like semicolons.’

Project Euler Problem 1, or Benchmarking Ruby Code

Everyone loves a good coding challenge. But where do you find good brainteasers outside of StackOverflow’s endless Ruby 101 questions?

Enter Project Euler. At 513 problems and counting, it is a great way to practice both your math and coding skills.

Many of the initial problems are on the easier side, and when I was assigned Problem 1, it did not take long to figure out how to approach solving it:

If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23.

Find the sum of all the multiples of 3 or 5 below 1000.

We were supposed to code both a ‘regular’ procedural and an object-oriented solution.

Here is a very quickly written procedural one in a lab context (the two methods used and their names came from rspec suite):

It’s pretty straightforward. There is a simple loop to find all the multiples of 3 or 5 and push them into an array; inject method is then used to sum up all of them (yes, one can iterate over the array instead, but inject is perfect for summing or multiplying array contents, so I was not going to bother).
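
A reconstruction of that shape, not the original lab code; the method names are placeholders, since the real ones came from the rspec suite:

```ruby
# Loop from 3 upwards and collect every multiple of 3 or 5 below the limit.
def collect_multiples(limit)
  multiples = []
  number = 3                 # 1 and 2 can never qualify
  while number < limit
    multiples << number if number % 3 == 0 || number % 5 == 0
    number += 1
  end
  multiples
end

# inject with :+ adds the collected numbers together (0 for an empty list).
def sum_multiples(limit)
  collect_multiples(limit).inject(0, :+)
end

p collect_multiples(10)     # [3, 5, 6, 9]
puts sum_multiples(10)      # 23
puts sum_multiples(1000)    # 233168
```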

It works perfectly fine – the answer is 233168, by the way.

But then I got thinking: there are just too many lines of code, and writing them is not exactly efficient. I create an empty array; I create a variable with initial value of 3 (or 1, but our common sense tells us we don’t need 1 and 2 anyway), I increment by 1; I push things into the array. It just feels like too much effort.

So for the ‘object-oriented solution’, I made things a little more elegant:

That is easier to read and it just looks a lot better, and it produces the same result. But ultimately, it is not enough for code to be handsome, it also has to be fast and efficient. It was a good chance to use Ruby’s Benchmark Module to see if one solution was preferable to another due to it being faster.

Quoting straight from the source:

The Benchmark module provides methods to measure and report the time used to execute Ruby code.

It is pretty straightforward. One has to require 'benchmark' (lowercase) and then use Benchmark.measure { whatever expression one is evaluating } – use puts to see the actual outcome.
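
In its simplest form that looks roughly like this (the measured expression is just a stand-in):

```ruby
require 'benchmark'

result = Benchmark.measure do
  (1...1000).select { |n| n % 3 == 0 || n % 5 == 0 }.inject(:+)
end

# Prints user, system and total CPU time, then elapsed real time
# in parentheses; all values are in seconds.
puts result

# The individual figures are also available as numbers.
puts result.real
```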

My initial plan was to benchmark my ‘longer’ loop solution vs. select vs. a similarly-coded reject solution, vs. several conditions ( using || for modulo or using min) and see if one was preferable to another.

So I ran benchmark on the select statement. And then I ran it again on the same statement. And again. And the processing times were different. And again – still different. Here is what it ended up looking like:

The time on the right is ‘elapsed real time’, which is basically the time it took the program to run from beginning to end.

Since the unit of time in which the output is shown is seconds, the amount of time it takes to execute is very small. In the 14 examples above, we are talking the slowest example taking just above a millisecond.

But I expected running the same line of code to take the same time each and every time. After all, there are no random number generators in that code, and Ruby should evaluate the line of code in the same order each time – the result doesn’t change, why would processing time?

What’s interesting about Benchmark is that it is affected by whatever else the CPU is doing, and your CPU will be in a different state every time you run Benchmark. Although I waited only several seconds between benchmarking attempts in the example above, and I did not start or exit any software (or Chrome tabs) in between attempts, my CPU clearly was doing different things at the time, hence the time variations.

So how is one supposed to benchmark one process against another if it is also affected by ‘outside’ factors such as whatever else your computer is doing in the background? The trick is to run each line of code many times, side by side with the others in the same benchmarking session, and then compare the total or average processing times.

There is a very handy solution suggested in the documentation, which I have used to run 5 different statements 100 times and then evaluate results:
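
The exact five statements are not reproduced here, so the sketch below uses three stand-ins; the labels, constants and lambdas are all assumptions made for illustration:

```ruby
require 'benchmark'

RUNS = 100
RANGE = (1...1000)

# Each candidate computes the same sum in a slightly different way.
CANDIDATES = {
  'select ||'  => -> { RANGE.select { |n| n % 3 == 0 || n % 5 == 0 }.inject(:+) },
  'reject &&'  => -> { RANGE.reject { |n| n % 3 != 0 && n % 5 != 0 }.inject(:+) },
  'select min' => -> { RANGE.select { |n| [n % 3, n % 5].min < 1 }.inject(:+) }
}.freeze

# bm prints one row of timings per report; 12 is the label column width.
Benchmark.bm(12) do |bm|
  CANDIDATES.each do |label, code|
    bm.report(label) { RUNS.times { code.call } }
  end
end
```

Running every candidate the same number of times in one session means background noise affects them all roughly equally, which is why the relative ordering tends to be more stable than the absolute numbers.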

Here is the outcome, run within 3 seconds of each other:

While the absolute numbers are different each time benchmarking is run, the relative numbers remain the same (I ran it a few more dozen times and charted it, but I won’t bore you with the details).

The fastest solutions were the ones using logical operators rather than applying min<1 – the latter took at least 2.5x longer than the 3 solutions that used Boolean operators – can you guess why?

However – all of the evaluated solutions took well under a second, so it is not really significant for this particular problem. But speed is a major concern for many industries (see a very interesting NYTimes article that discusses how it applies in finance (article here – paywall)).

Certain companies employ entire teams of people whose job description is to shave off milliseconds of algorithm running times. While Ruby’s Benchmarking module is not powerful enough for those commercial purposes, it is a fun way to experiment with the code one writes to see how it compares against other options.

Writing Elegant Code Part II: The Rules

I remember curiously peeking at code on GitHub before starting Flatiron. Even then, despite not understanding all too well what the code was meant to do, I could tell that some of it was easier to read than the rest. My favorite samples had methods that were short and clearly named, variables had clear names, code was organized logically and everything was properly indented.

Now that I can (mostly) write some functioning code, it is time to make sure it is easy to understand to those who may read it at a later point.

And so I went in search of re-factoring rules – and I found them.

The Rules are Sandi Metz' Rules for Developers and are discussed here.

I am paraphrasing them slightly below:

  1. Classes can be no longer than 100 lines of code ( # sloc – we are not counting blank lines) to ensure they stay within the single responsibility principle for classes.

  2. Methods can be no longer than five lines of code in Ruby proper (Rails is hard, and I am still not sure how it is meant to work). And yes, if statements with else and elsif count. Don’t do them. Break your code into reasonable methods with easy-to-follow names [use neonates as an example].

  3. Pass no more than four parameters into a method. Hash options are parameters, too.

  4. Controllers can instantiate only one object. Views can only know about one instance variable; views should only send messages to that particular object.

I will touch on all of these later, but in this post, I wanted to discuss applying the first 2 in real life (insofar as solving labs can count as ‘real life’, of course).

For a lab this week, we had to write a simple game of Rock Paper Scissors and then turn it into a simple web app using Sinatra. To create such a game is far from challenging, and my code was already reasonably succinct at the point when I got all the rspec tests to pass. (NB: that is not how I would actually write the game, but it worked in a lab context).

But the code still felt it could use some re-factoring:

Enter The Rules.

My class was much shorter than the mandated 100 lines of code – check.

Most of my methods were one-liners with 3 winning scenario helper methods designed to not clutter my won? method. So far, so good.

But what is going on in lines 33 to 41? (That method is called in app.rb to integrate into Sinatra.)

It felt too simplistic – just an if-then statement? Really? And I was just itching to improve it somehow. It certainly did not follow The Rules (the explicitly frowned-upon elsif was there!).

I was not entirely sure how to improve it other than via a nested ternary operator. While I like ternary operators as much as the next noob Ruby developer who thinks they are really cool-looking, nesting them is just the opposite of my goal of writing code that is easily understood by others.

I had a very vague notion of experimenting with something else instead: calling on all 3 methods, checking which one returns true (and in this case, one and only one will return true) and then returning interpolated name of that method with an exclamation point instead of a question mark. So if method won? is the one that returns true, then the game response would be ‘you won!’

I went to one of the instructors, Sophie, seeking her counsel on how to execute just that. After some pry-ing around, we decided that such a solution would not actually save me any lines of code, because I would have to write new methods that would take up the very lines I had saved.

But Sophie did point out something the test did not cover: the computer_play method was sampling the USER_CHOICES (previously known as VALID_MOVES) array every time, so I needed to assign it to an instance variable (and that is why pair programming with a more senior professional is so important).

Upon some consideration, I also got rid of the winning scenarios methods (3 basically identical methods? sounds like something in need of abstraction to me) by creating a VICTORY_HASH constant.

At that point, I could have gotten rid of the VALID_MOVES array, reading it as keys or values of VICTORY_HASH instead, but to keep it allows for better code comprehension, so it was spared from culling.
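
The lab code itself is not shown here, so this is only a sketch of how a VICTORY_HASH and a remembered computer move might fit together. The class name, the constructor, and the tied? and lost? methods are assumptions; only VALID_MOVES, VICTORY_HASH, user_play, computer_play and won? are names taken from the text:

```ruby
class RPSGame
  VALID_MOVES = %w[rock paper scissors].freeze

  # Each key beats its value, replacing three near-identical helper methods.
  VICTORY_HASH = {
    'rock'     => 'scissors',
    'paper'    => 'rock',
    'scissors' => 'paper'
  }.freeze

  attr_reader :user_play, :computer_play

  def initialize(user_play, computer_play = VALID_MOVES.sample)
    @user_play = user_play
    @computer_play = computer_play   # sampled once, then remembered
  end

  def won?
    VICTORY_HASH[user_play] == computer_play
  end

  def tied?
    user_play == computer_play
  end

  def lost?
    !won? && !tied?
  end
end

game = RPSGame.new('rock', 'scissors')
puts game.won?   # true
```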

To make everything orderly, I re-named ‘user_move’ as ‘user_play’, because I wanted its name to be symmetrical to the computer_play method (whose name was in turn mandated by Rspec).

The final code ended up being shorter than before by around 10 lines:

And the ‘outcome’ method? It is still there, staring at me in its simple flow control glory. Rules exist to be broken, I suppose.

Lines of Code: Is More Always Better?

A book on Ruby I have been reading brought up an interesting fact: it is common for programmers to be evaluated based on how many lines of code (a.k.a. source lines of code, SLOC) they write per unit of time.

I see how that is commonly applied in a corporate environment. Everyone needs to be evaluated on something. Traders are evaluated on their profit-and-loss statements. Teachers are evaluated on their students' standardized test performance. My cat is evaluated on his cuteness.

So programmers also should be evaluated based on something they do day in and out: writing code. And the more they write, the better their performance is. Right?

It seems to be a no-brainer: if you write a lot of code, surely you are very productive and, by extension, also a very capable programmer.

Furthermore, if you wrote 1,000 lines a week on average for the past few weeks, and your colleague wrote only 500 lines while working in the same language – you work much harder and your colleague is not pulling her weight. Right?

It’s not all as clear-cut.

There is no doubt that complex programs require a lot of lines of code. A cursory internet search suggests Windows 7 has 2,085,772 lines of code, although I am sure that can be much improved upon if re-written from scratch; but not even the most capable programmer can convert it into 2,000 lines of code.

It is obvious that a program that has several million lines of code is much more complex than one that has several thousand. But for the same complexity level, is it better to have more code?

So far, most of my own Ruby code refactoring – once I got past the ‘Hello, World’ stage – involved reducing the number of lines of code once I got the program to work (outside of maybe using modules, but even then, I probably end up with fewer lines of code on a net-net basis).

My current steep learning curve means that sometimes I see a new method and realize I should have used that on a lab from a week ago, so I go in and apply it, which usually results in fewer lines of code:

Why go and create a custom factorial method if one can just use ‘inject’?
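
A before-and-after sketch of that kind of refactoring (both method names are made up for the comparison):

```ruby
# A hand-rolled factorial, the sort of thing an early lab might contain.
def factorial_loop(n)
  result = 1
  counter = 1
  while counter <= n
    result *= counter
    counter += 1
  end
  result
end

# The same calculation with inject; the initial value 1 also covers 0! = 1.
def factorial_inject(n)
  (1..n).inject(1, :*)
end

puts factorial_loop(5)     # 120
puts factorial_inject(5)   # 120
```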

So to me, a metric of learning performance has been being able to remove code and replace it with something more efficient and elegant that does not take away from the program’s performance.

I stumbled upon a relevant quote by E.W. Dijkstra when researching the topic; it was featured in his 1988 paper ‘On the Cruelty of Really Teaching Computing Science’:

“… [there] is only a small step to measuring ‘programmer productivity’ in terms of ‘number of lines of code produced per month’. This is a very costly measuring unit because it encourages the writing of insipid code […]

My point today is that, if we wish to count lines of code, we should not regard them as ‘lines produced’ but as ‘lines spent’: the current conventional wisdom is so foolish as to book that count on the wrong side of the ledger.”

I will leave you with another quote by Bill Gates:

Measuring programming progress by lines of code is like measuring aircraft building progress by weight.

Don’t waste your code – write it succinctly.

Elegance of Code Part I

When one first starts coding, the main concern is writing something that works and produces the desired result.

You get irb to output “Hello World!” – excellent. Then you get a nested hash-array monster to return a very specific key hidden 7 levels deep – great. Then you write your CLI interface that runs in your command line and shows scraped data from your fellow students' bio – good job. And you even use cowsay to format interactions with the user and format command line output using escape sequences, so that user interface looks less boring. And when you hit rspec, most tests pass.

And then the next task comes – how do you make your actual code look good?

“Code so beautiful that tears are shed.”

Why’s Poignant Guide to Ruby

So how do you write code that indeed is so beautiful and elegant that tears will be shed by those reading your pull request?

There are multiple parts to the puzzle, but let’s start with aesthetics of formatting first.

Do you remember when Avi said that he had been looking at our code, and we all needed to abide by basic indentation and not leave not-working code bits floating around commented out?

Ruby is perfectly fine with reading your code without any indentation or any real formatting. It can easily ignore all the hashed-out random comments. But once the code we read and write gets more and more complex, we humans really benefit from proper formatting to figure out what goes where – especially when the code is revisited at a later point and no one remembers anymore what it was really supposed to do.

To use proper formatting is also kind to anyone who might work on the code after you, so that they won’t have to spend hours doing forensic coding and figuring out what end statement closes what block.

And it is simply an elegant thing to do. And it’s Step 1 to Writing Elegant Code.

You know how when you went to college you had to format your essay footnotes a very specific way using a style manual or a helpful website or two? Well, there is a Ruby style manual, too!

The Airbnb Ruby Style Guide is what I suggest as a reference for all the formatting questions that are keeping you up at night. (Airbnb’s website is built in Ruby and Ruby on Rails, by the way – see, real companies use it!)

It is a very comprehensive but not excessively lengthy GitHub repo that will answer most of the questions you have ever had about, say, indenting when and case together – turns out they are supposed to be indented at the same level! (That means if I re-visit some earlier labs, cosmetic edits will need to be made.) The best part is, anyone is welcome to contribute and submit a pull request – and maybe your take on comment indentation will be shared with the next generation of newbie developers.
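
To make the case/when point concrete, here is a small sketch with when sitting at the same depth as case (the method and its messages are invented for the example):

```ruby
def outcome_message(result)
  case result
  when :won
    'You won!'
  when :lost
    'You lost!'
  else
    'It is a tie!'
  end
end

puts outcome_message(:won)   # You won!
```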

Now, if there was only a website that did my Ruby code formatting for me…

Beautifying Non-Working Blog Posts

Avi said on Day 1 of Flatiron School: don’t fix non-functioning code. But how about editing a non-working blog post?

I had a post all written up and ready to go. It was about ways to iterate over arrays and had several A4 pages' worth of text (come to think of it, it’s a good thing that post did not happen).

I did some edits on it over the weekend, polishing my style, planning my fancy markup, and inserting clever programmer jokes (“Make me a sandwich” “No” “Sudo make me a sandwich” “Sure”).

All I had to do was set up Octopress and deploy it, which I conveniently postponed until the night before it was due. Because what could go wrong?

Many things, as it turned out.

I think I was the only person in my cohort (are there 27 of us? seems that we never did a headcount) who had issues setting up GitHub pages.

While everyone seemed to have their nice and shiny parts of cyberspace in which to share their musings with the outside world, mine never allowed me to deploy the mandatory “hello world” post.

Several hours of feverish stack overflow-ing and tortured troubleshooting later (I would rather solve Hashketball both times all over again), it still wasn’t working.

After several more hours of precious instructor time (thank you, Rose!), we all threw our hands in the air, rm -rf'ed existing directories, deleted GitHub repos, and started from scratch. And it worked, ad oculos.

We figured out what happened, but not how. It looks like GitHub Pages gave me a setup meant for companies instead of private individuals. Octopress was not enthused about me being Anna Ershova LLC and refused to produce a live page.

But we re-did it from scratch and that fixed it. And by then I had realized that Avi was right. I spent too much time on writing and editing my post, without trying to see if it would even go through. Murphy’s Law dictates that it is the very situation when it wouldn’t. So instead of using my original post, I am writing this one as a very public note to self to not get caught up in aesthetics of things if I don’t even know if they are going to work.

And those dozens of ways to iterate over arrays? Brace yourselves, I’ll come back to that.