Obrigado RubyConf Portugal

There are many days of the year when nothing memorable happens, and then there are some days which leave a lasting impact on your life. I can vouch that if you ask the people who attended this conference, at least 90% (including me) would say that the 2 days at RubyConf Portugal fall into the latter category. Whether it was the talks, the food, the vinho (wine) or the venue itself, everything was wonderful.


You know very well that a conference is going to be fun when the master of ceremonies appears as Caesar. Hail Jeremy!! You did a wonderful job of managing the energy in the auditorium, and your wonderful introduction to Caleb’s talk still lingers in memory. And the sssshhhsh sshhh too :)

First a warm hug to all of the speakers. You were awesome. And I know I cannot contain this awesomeness in a line or two but I will still try ..

Katrina Owen – Practise simple problems to get better at the craft which will then help you in the bigger problems at hand.
Alex Coles – The Rails way has not changed in the last 10 years when it comes to front-end development. Considering the growth of some of the JavaScript frameworks, he suggested splitting a web app into 2 applications, an API and a JS client framework, as the way to design a web app in this day and age.
Piotr Szotkowski – Do read about enumerable.rb and you will be a better programmer.

Carlos Souza – Some pointers to keep in mind when you are developing web APIs.
Chris Kelly – Some of the content went over my head, but I remember that a string of 23 characters is better than a string of length 24. Don’t ask me why :)
Gautam Rege – “Ruby is for developers while Go is for programmers”. I <3 Ruby and I know he loves it too :)
Erik Ober – Keeping track of object space is important, and just making those little changes in Ruby code and benchmarking them can lead to sizeable improvements in performance. Do have a look at his slides for some tips and tricks, and also if you want to see some beautiful illustrations.

That was the end of the first day. But then it was not the end. There was a traditional Portuguese performance in the lobby, and then the Ruby Karaoke session took the fun to another level.

And yes even after all the wine and the fun last night, I did make it to the next day on time and I had breakfast too :)

Steve Klabnik – He introduced us to a new language, Rust, and then gave a demo of a C extension which used a Rust service. Pretty cool, eh? I also came to know that these days he is involved with the active_model_serializers gem, which plans to tackle some of the pain points of the JBuilder experience.

PJ Hagerty – The Mozart effect works in programming too .. i.e. listening to a particular type of music does result in a cognitive boost.
Piotr Solnica – You may be a clean coder (very particular about code smells) or a cowboy coder (you sometimes cut corners just to deliver more features), and both styles are perfectly acceptable as long as you have the confidence to change something or refactor when the need arises. I sure need to work on my testing skills to increase my programming confidence.
Luca Guidi – It is important to reinvent the wheel, and the Lotus web framework is trying to do that. Do use it if possible and provide feedback. The community can only get better with such new stuff coming up.
Danish Khan – This talk did remove some of my misconceptions about sales. Awesome slides.
Caleb Thompson – He gave an awesome presentation on how the scenic gem can be used to do full-text searching across more than 1 table. Yes, Elasticsearch and other 3rd-party search tools are there, but I would love it if there was something in Rails out of the box for these advanced searches. The problem with the gem right now is that it only works with PostgreSQL.
Terence Lee – A great closing speech where he spoke about the ways in which you can contribute to Ruby, which are not limited to writing C code and sending in patches. You can open tickets for features/bugs on their ticketing website (and not on Twitter :)), and share data about application performance which can be used to improve Ruby. More or less, participate in the discussion and take Ruby to the next level (3.0).


Obrigado to the organizers, the volunteers and the sponsors for making RubyConf Portugal a big success. And a thank you to Josh Software for making this trip happen. That’s it from Braga. Hopefully I will be there again next year.


Raspar – Build an HTML parser in 5 minutes

Raspar is an HTML parsing library that parses HTML pages and converts the HTML into Ruby objects, based on a map of ‘css’ or ‘xpath’ selectors that you define. This gem can also manage parsers for multiple websites.

The sample output looks something like this

{ product: [ 
    <Raspar::Result:0x007ffc91e4d640 @attrs => { :name=>"Test1", :price=>"10"}, 
    @domain => "example.com", @name => :product> 
    # ... 
    # ... 
    ] 
}

Why Raspar?

For almost every website that we parse, we need to customise the code to parse and convert data into our defined format. While doing this, we potentially face some of the following problems in parsing HTML.

  • An HTML page may contain multiple items or a single item with the same CSS selectors.
  • We need to collect different types of data from a single page. For example, products, offers, comments etc.
  • For a single website, the HTML structure could differ from page to page. For example, on one page the product name could have the CSS selector .product-name while on another page of the same website it may have the CSS selector .pname.
  • Sometimes we want to collect particular attributes as an array. For example, in the product section, we may want the various product features that are defined in li tags as an array.
  • Some attributes are common to all pages. For example, the product comparison page has the same name and description but other attributes would differ from page to page.

Raspar helps to solve all these problems!

Usage

You can define a parser as shown below. In this example, we are parsing a currency exchange rate website and fetching the country, the currency and its code.

class CurrencyCodeParser
  include Raspar
  domain 'www.exchange-rate.com'

  collection :currency_code, 'table.currency-codes tr' do
    attr :country, 'td.country'
    attr :currency, 'td.currency'
    attr :code, 'td.code'
  end
end

We first include the Raspar module and register the domain that is going to be parsed. Then we have to plan the parsing strategy. In this page, there are currency codes for each country. So using the collection method, we can collect all the values based on the currency code. We can also define multiple collections – for example, in a page containing products and brands, we can define two collections, one for products and another for brands.

NOTE: You can set multiple css selectors too in order of priority. In the example below, if .country element is not set, then the .nation will be checked and returned.

attr :country, '.country, .nation'

collection: This takes two arguments and a block: the collection name, the HTML selector, and the block in which the attributes are defined. In the example above, ‘table.currency-codes tr’ is a selector for the rows that contain all three attributes: country, currency and code. So the parser collects all ‘table.currency-codes tr’ elements and builds a result object for each, using the selectors defined for the attributes.

attr: This takes two mandatory arguments: the name and the HTML selector. It can take an optional third argument, an options hash, that helps in formatting the value or getting a particular property of the HTML element. Potential options are :prop and :eval. If no options are given, attr returns the text value of the HTML element.

:prop: This will return the value of the mentioned property of the selected element. In the case below, we want the src property of img tag.

attr :image, '.lg_photo img', prop: 'src'

:eval: This evaluates the HTML element and processes it. This can be a Proc or a method name (i.e. a symbol). Remember, the method or Proc defined must take 2 arguments: the element’s text and the element itself. For example,

attr :address, '.address', eval: Proc.new{|text, ele| text.split(':').last}

or

attr :address, '.address', eval: :parse_address

def parse_address(text, ele)
  text.split(':').last
end
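Because an :eval helper receives the element's text as its first argument, it can be tried out on a plain string before being wired into a parser. The following is a minimal sketch, assuming the `.address` element holds text such as "Address: 12 Main Street" (an invented sample). It also adds `to_s.strip`, which is not in the original helper, because `split(':').last` keeps the space after the colon and returns nil for an empty string.

```ruby
# Variant of the helper above; the sample strings are made up for illustration.
def parse_address(text, _ele)
  # to_s guards against nil when the text is empty; strip drops stray whitespace
  text.split(':').last.to_s.strip
end

puts parse_address("Address: 12 Main Street", nil).inspect
puts parse_address("", nil).inspect
```

The element argument is unused in this helper, so passing nil is enough for a quick check.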

If we need the attribute as an array, we can simply do the following:

attr :specifications, '.specs li', as: :array

NOTE: If attr is defined outside any collection block, it is considered a common attribute and will be included in all collections!

The parsing logic

Here is an example of parsing a particular page in the domain we have specified. In the example below, Raspar will automatically load the parser depending on the domain, in our case the CurrencyCodeParser. We don’t need to specify this in our code. The advantage of this is that we can customise or add new parsers at will, as long as we specify the right domain!

require 'rest-client'
require 'raspar'

url = 'http://www.exchange-rate.com/currency-list.html'

# Use RestClient to fetch the HTML page
html = RestClient.get(url).to_str

Raspar.parse(url, html).each { |c| p c }

This will get us the following result:

{
  currency_code: [
    #<Raspar::Result:0x007ffc91e4d640
     @attrs={:country=>"USA", :currency=>"USD", :code =>"$"}>,
    #<Raspar::Result:0x007ffc91e57be0
     @attrs={:country=>"Japan", :currency=>"JPY", :code =>"¥"}>,
   ...
   ...
 ]
}

Alternate ways to create parsers

There are other ways to define a parser.

By passing a block

Here we don’t need to define a separate class; just an anonymous parser!

Raspar.add('www.exchange-rate.com') do
  collection :currency_code, 'table.currency-codes tr' do
    attr :country, 'td.country'
    attr :currency, 'td.currency'
    attr :code, 'td.code'
  end
end

By passing a hash

This can be helpful if we have a pre-defined selector map configured in the code or saved in our database, or even if we want to add a map dynamically, e.g. from a JSON file or a web service.

domain = 'http://www.leguide.com'
selector_map = {
  collections: {
    product: {
      select: '.offers_list li',
      attrs: {
        image: { select: 'img', prop: 'src'},
        price: { select: '.price .euro.gopt', eval: :parse_price}
      }
    }
  }
}

In the selector map above, we have referred to a :parse_price method. Here is how we can add it to Raspar. We can also define more data processing helpers in the ParserHelper module as shown below.

module ParserHelper
  def parse_price(val, ele)
    val.gsub(/,/, '.').to_f
  end
end

Raspar.add(domain, selector_map, ParserHelper)
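If the selector map comes from a JSON file or a web service, it has to be turned back into the symbol-keyed hash used above. Here is a minimal sketch of one way to do that with Ruby's standard JSON library. The helper name `load_selector_map` is hypothetical, and converting the "eval" string back into a symbol is an assumption about what Raspar expects rather than documented behaviour.

```ruby
require 'json'

# Hypothetical helper, not part of Raspar: builds a selector map from JSON text.
def load_selector_map(json)
  # symbolize_names turns JSON keys into symbols, matching the hash above
  map = JSON.parse(json, symbolize_names: true)

  # JSON has no symbol type, so "parse_price" arrives as a String.
  # Converting it to a Symbol is an assumption about what Raspar expects.
  map[:collections].each_value do |collection|
    collection[:attrs].each_value do |opts|
      opts[:eval] = opts[:eval].to_sym if opts[:eval].is_a?(String)
    end
  end
  map
end

# Made-up JSON mirroring the selector map from the post
json = <<~JSON
  {
    "collections": {
      "product": {
        "select": ".offers_list li",
        "attrs": {
          "image": { "select": "img", "prop": "src" },
          "price": { "select": ".price .euro.gopt", "eval": "parse_price" }
        }
      }
    }
  }
JSON

p load_selector_map(json)
```

The resulting hash could then be passed to Raspar.add in place of the hand-written selector_map.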

This gem is available on RubyGems and on GitHub: Raspar
You can check out various examples too.

Go forth and parse!

Posted in Ruby

Precision number parsing in spreadsheet using ruby.

This blog post is not about how to parse spreadsheets using Ruby. If you are looking for that, you are not gonna find it here. This blog post is about a problem I faced while parsing decimal numbers from a spreadsheet; long story short, precision-related problems. Here are the details (the longer version):

Recently I had to parse a spreadsheet. The cells in the spreadsheet could hold either strings or numbers (float or otherwise). And since this application had to do a lot of calculations, even a specification being off by a single digit could lead to a wrong set of results. So my point is that it was vitally important that I parse the data exactly as it was entered. Seems like a regular parsing situation… I had the same thought, but it turns out Excel does a lot of things in the background.

FYI: I am using the ‘roo’ gem for parsing the file, and these files are written in MS Excel.

Try doing this in irb:

$ irb

irb(main):001:0> 0.1 + 0.2
=> 0.30000000000000004
irb(main):002:0> 0.1 + 0.2 == 0.3
=> false

First of all, this is the correct behaviour. Second, the reason for this behaviour is that Ruby uses IEEE 754 doubles, and so does Excel. So if a cell holds the value .0916 and its format is not set to ‘TEXT’ but to something else (for example Number, General, etc.), then what Excel returns is .09160000000001. So even if the user has entered a value like this, it could come back with a different precision altogether.
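To make the effect concrete, here is a small sketch of two common ways to compare such values: a tolerance check on floats, and exact decimal arithmetic with Ruby's standard BigDecimal. The helper name `close_enough?` and its default tolerance are assumptions chosen for illustration, not something the roo gem provides.

```ruby
require 'bigdecimal'

# Compare two floats within a tolerance instead of with ==.
# The default tolerance of 1e-9 is an arbitrary choice for this sketch.
def close_enough?(a, b, tolerance = 1e-9)
  (a - b).abs <= tolerance
end

puts(0.1 + 0.2 == 0.3)               # false: binary doubles cannot hold 0.1 exactly
puts(close_enough?(0.1 + 0.2, 0.3))  # true

# BigDecimal values built from strings keep the decimal digits as written.
puts(BigDecimal("0.1") + BigDecimal("0.2") == BigDecimal("0.3"))  # true
```

Which of the two fits depends on the application: a tolerance is enough for comparisons, while exact decimals matter when values feed into further calculations.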

So, the question is: how did I solve this problem? Well, I did not. Not because I could not solve it, but because I was not fully satisfied with the solutions available right now, and I am still trying to figure it out. But anyway, I thought those solutions could solve somebody else’s problem. So here are some details on them:

1. You could ask all your users to set the format of the data they are entering to ‘TEXT’. Then Excel, or any other software that I am not aware of right now, will not change the values, and you would get what the user has entered as it is.

2. The roo gem, which I am using to parse the spreadsheet, has methods to find out the format of the data it is parsing. So you can just check in your code whether the format is ‘:float’, and if so round the number. Here is how you could do it:

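s = Roo::Excel.new('myspreadsheet.xls')
value = s.cell(4, 2)
# Round only float cells; 4 decimal places is an arbitrary choice for illustration
value = value.round(4) if s.celltype(4, 2) == :float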
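Since the parsed values feed into further calculations, one option is to convert rounded floats into exact decimals straight away. The sketch below assumes a hypothetical helper called `normalize_cell`, which is not part of roo, and an assumed precision of 4 decimal places. The type symbol is expected to be whatever the parser reports for the cell, for example :float or :string.

```ruby
require 'bigdecimal'

# Hypothetical helper, not part of roo. Float cells are rounded to the
# assumed precision and converted to BigDecimal; everything else is returned
# unchanged.
def normalize_cell(value, type, precision = 4)
  return value unless type == :float
  BigDecimal(value.round(precision).to_s)
end

puts normalize_cell(0.09160000000000001, :float).to_s("F")  # 0.0916
puts normalize_cell("some text", :string)                   # some text
```

The precision has to come from knowledge of the data, which is the trade-off of this approach: rounding too aggressively would change values the user really did enter.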

But again, as I said earlier, these solutions are not right or wrong; they are trade-offs that you have to weigh in terms of what’s best for your application.
If you want to follow up on the issue, here are some links to look at: roo gem github, same issue in a different gem and a related issue.

Posted in Ruby

Ruby through rails part 5: Bundler Dsl

Gautam Rege:

In this post, Sanjiv explains what happens under the hood when we do “bundle install”. While learning in detail about how gems are loaded, we learn a few useful Ruby tips and tricks and see some meta-programming at its pristine best!

Originally posted on narutosanjiv:

Note: All paths are relative to the bundler gem path. For this blog, I am currently using ruby 2.1.2 and bundler 1.6.3.

While going through the Bundler source code earlier, we saw how bundler evaluates the Gemfile and creates a function for each keyword like gem, source, etc. Now we are going to see the implementation details of a few of these functions. Here is the `gem` function.


The `gem` function accepts one mandatory parameter, the name of the gem; the remaining parameters are collected into an array (called splat parameters). These can be any of :version, :git, :github, :platforms, :source or :group. The * operator (pronounced “star,” “unarray,” or, among the…
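As a rough illustration of how splat parameters behave, here is a simplified stand-in for a Gemfile-style `gem` method. It is a sketch and not Bundler's actual code: the name `gem_sketch` and the returned hash are made up for this example.

```ruby
# Simplified stand-in for a Gemfile-style `gem` method (hypothetical).
def gem_sketch(name, *args)
  # A trailing Hash holds options such as :git or :group
  options = args.last.is_a?(Hash) ? args.pop : {}
  # Whatever is left over is treated as version requirements
  { name: name, versions: args, options: options }
end

p gem_sketch('rails', '~> 4.1', group: :development)
p gem_sketch('rake')
```

Everything after the name lands in `args`, so the same method handles a bare gem name, one or more version strings, and a trailing options hash.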


Posted in General

My experience at RubyConf Brasil, 2014

This was my first time attending RubyConf Brasil, and it was one of the best experiences I have ever had at a conference. The conference for the speakers started a day early, with a barbecue party arranged at the Codeminer HQ. It was a good idea: since a lot of speakers were not from Brasil, it made it easy to get to know a few people. Moving on to the venue for the conference: it was at the Frei Caneca Theatre, a 16-floor building which has a theatre for conferences, a shopping mall, many restaurants, corporate offices and a lot more which I could not explore. The conference was on the 5th and 6th floors. Here is a picture of the theatre.

Theater

People started coming in around 8:00 am and by 8:15 it was jam-packed. 1000 people attended the conference. It started with George Guimarães’s talk on “Dogmatism and software development”, followed by a coffee break and then Ruslan Synytsky’s talk on Jelastic’s new Ruby implementation. Both talks were interesting, and this part of the conference was single-track.

On a side note, most of the talks were in Portuguese, so there were translators available for all the English-speaking people.

After Ruslan’s talk, the conference split into 2 tracks. Next up were talks on Elasticsearch in one track and Puppet using Ruby in the other. I chose to attend the Elasticsearch talk since it is an area of interest to me. Pedro Franceschi spoke about Elasticsearch and how using it directly is a better approach than using a Ruby wrapper, because a wrapper limits the ways you can use it. Couldn’t agree more with him!!

It was time for lunch. With so many restaurants in the building, everyone had plenty of choice of what to eat. After lunch was the talk on “Quality software engineering for your data center” by Ben Langfeld. I met Ben at the barbecue and was already looking forward to his talk. He is an Englishman but speaks Portuguese just like a local. It is always very difficult to keep pace with talks that come just after lunch, but this turned out to be one of the most interesting talks at the conference. He talked about DevOps and how it is not just a middle ground between development and operations but much more. In the other track, it was “Migrating an application from MongoDB to PostgreSQL” by Marcio Trindade. I wanted to attend both talks, but I have not found a way to teleport yet :(

Next up was me. I think it went well, as many people came up to me and appreciated it. Anyway, I really enjoyed giving the talk and learned some very important lessons about giving talks. In the other track was a talk on “iOS on Rails” by Cezar Carvalho Pereira. Another talk which I wanted to attend, but definitely not at the cost of skipping my own talk :). Just before the coffee break, there was one more talk scheduled, and it was time for Vipul and Prathamesh to take the stage. They talked about “Building a ORM with AReL: Walking up the (AS)Tree”. I could not follow the whole talk as I was still making notes on how I could have improved my talk and was asking for reviews from people sitting around me. One of the most important pieces of feedback I received was that I should have spoken a little slower, as not many people understood English and not all of them had the translators. Point noted for the future..!!
In the other track it was “Convention on creating and deploying Rails applications” by Eduardo Fiorezi.

After another coffee break, it was time to choose between “SOLID principles through tests” by Sebastian Sogamoso and “My (Re)Architecture process, from Zero to Hero!” by Lucas Martins. I have been reading a lot about SOLID principles and how TDD fails to drive us towards the correct design, so I chose to attend the former. Topics like these are open to debate and everyone has their own opinion, and I love to hear discussions about them. It was also one of the most interesting talks at the conference.

The last talk of the day was a very unique one. It was titled “Leadership for the Future”, by Murilo Gun. He is a stand-up comedian and entrepreneur, and has spent 10 years in technology. It was one of the talks where I badly wished I knew Portuguese. He talked about the bets he places on the future. Although the translator was doing a great job, some jokes were just meant to be understood in Portuguese.


And it was finally the end of Day 1. Wait.. it was the end for the talks not the conference. Now it was time for happy hours at a pub very close to the conference. Made some friends here too.

 

 

Day 2 of a conference usually has lower attendance for the morning sessions, but not here. I guess it was because Koichi Sasada was going to speak about “Growing the Ruby Interpreter”. Nobody wanted to miss this one. After this it was time for a coffee break, and then another choice to make between “Integration Testing SOA Rails Apps” by Travis Douce and “Testing the security of your applications with Metasploit Framework” by Daniel Romero. This time around I did not have to make a choice, as I had an extended coffee break talking to a few people.

Then it was another of these design/TDD-related talks. It was titled “Improve your Rails application design with better TDD”, by Marko Anastasov. By now you will have guessed that I attended this one. Marko started off by showing a really complex piece of code and then showed how he used TDD and design patterns to arrive at the correct design and a maintainable codebase. In the other track, Thiago Scalone spoke about “MRuby – Minimalistic ruby on browser or mobile”.

After lunch, it was time to hear about “Tricks that Rails didn’t tell you about” by Carlos Antonio da Silva. It was another just-after-lunch talk, so you know it was really hard for me to follow all the time. But I really liked his talk and will definitely wait for the videos to be published for this one. The next talk I attended was about JavaScript, titled “You still don’t know how to write proper Javascript for Rails applications”, by Jean Carlo Emer. In the other track, Ricardo Valeriano and Arthur Zapparoli spoke about “Lotus: A true OO Ruby Web Framework”.

Next up was Matt Campbell. He is a product of a Rails bootcamp, and he talked about how he left his well-to-do finance job one day and joined a Rails bootcamp, his experiences of the bootcamp, and how the education system is screwed in the US. Personally I do not think bootcamps are a replacement for a college degree; they would be of help to someone who has a programming background and wants to learn something new. But that’s just my opinion. In the other track Hanneli Tavante spoke on “Java headache? Torquebox!”. Next up were “Rock-Solid Web APIs with Rails” by Carlos Souza and “Building your own Credit Card Company” by Eduardo Mourão. I attended the latter, but unfortunately I took the wrong translator with me, and by the time I realized it, it was pretty late. But it looked like people really enjoyed his talk.

Now we were very close to the wrap-up of the conference. Fabio Akita spoke on “The last 10 years through Rails and Ruby” and how we should all follow our passions and not worry about the consequences. One of the things I really liked was how he accepted the fact that even after giving over 100 talks, he is still nervous every time he gets onto the stage. It was a good closing keynote, and at the end all the speakers took some pictures on the stage and also took a selfie..:)

Overall it was a great conference and a great experience for me, as it was my first conference abroad, with lots of memories. And as we say it here, OBRIGADO..!!


Tussle of the State Machines

There’s no shortage of Ruby state machine libraries (aasm, state_machine, etc.). However, when we needed to implement a dynamic state machine, we didn’t find one that supported it.

The Problem

We needed a polymorphic class that could have different state machines triggered in it depending on some condition. Basically, here is what we wanted to achieve:

class Call
  include Mongoid::Document

  field :scheduled_at, type: DateTime
  field :is_existing_customer, type: Boolean
  field :note

  belongs_to :callable, polymorphic: true

  case callable_type
  when 'Car'
    state_machine :state, :initial => :fresh, namespace: 'car' do
      event :schedule do
        transition [:fresh, :schedule] => :scheduled
      end
      # ...
      # ...
    end

  when 'Personal'
    state_machine :state, :initial => :fresh, namespace: 'Personal' do
      event :schedule do
        transition :fresh => :scheduled
      end
      # ...
      # ...
    end

  when 'any other'
    # ...
  end
end

Now the problem was that AASM does not support multiple state machines in a single class. So we tried to achieve it through the state_machine gem with namespaces. However, we could not use the same state field for more than one state machine in a single class, even when they were namespaced.

The solution!

We wanted a state machine that could be easily integrated with other Ruby objects. So we decided to define a state machine as a separate class and selectively apply it to our Rails models. We were using MongoDB, so we embedded these objects.

class CarStateMachine
  include Mongoid::Document
  include AASM

  field :state
  embedded_in :call

  # no need for name space and we can use AASM directly
  state_machine :state, :initial => :fresh do
    # states: fresh, scheduled, lead, succeed
    event :schedule do
      transition [:fresh, :schedule] => :scheduled
    end
    # ...
    # ...
  end
end

class PersonalStateMachine
  include Mongoid::Document
  include AASM

  field :state
  embedded_in :call

  # states: hello, meet, bye
  state_machine :state, :initial => :hello do
    event :wow do
      transition :hello => :meet
    end
    # ...
    # ...
  end
end

Now the model can access these embedded objects through a call_state method, which returns the embedded object based on the model’s callable_type!

class Call
  include Mongoid::Document

  field :scheduled_at, type: DateTime
  field :is_existing_customer, type: Boolean
  field :note
  field :callable_type

  embeds_one :car_state_machine
  embeds_one :personal_state_machine

  # Method to access state machine
  def call_state
    case self.callable_type
    when 'Car'
      self.car_state_machine || self.build_car_state_machine
    when 'Personal'
      self.personal_state_machine || self.build_personal_state_machine
    end
  end
end

Here is a sample output of a Call model using different state machines dynamically!

call = Call.first
call.callable_type # => "Car"
call.call_state.state # => 'fresh'
call.call_state.schedule!
call.call_state.state # => 'scheduled'

#####

call = Call.last
call.callable_type # => "Personal"
call.call_state.state # => 'hello'
call.call_state.wow!
call.call_state.state # => 'meet'
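
The same idea can be tried without Mongoid or a state machine gem. The sketch below is a plain-Ruby approximation, not the code from this post: `SimpleStateMachine`, `SketchCall` and the `fire!` method are hypothetical names, and the hand-rolled transition table stands in for the gem's DSL. It only shows the core of the approach, which is one state machine object per callable type, picked lazily by `call_state`.

```ruby
# Tiny hand-rolled state machine, used instead of AASM/Mongoid so the example
# runs on plain Ruby. All class names here are hypothetical.
class SimpleStateMachine
  attr_reader :state

  # transitions has the shape { event => { from_state => to_state } }
  def initialize(initial, transitions)
    @state = initial
    @transitions = transitions
  end

  def fire!(event)
    target = @transitions.fetch(event, {})[@state]
    raise ArgumentError, "cannot #{event} from #{@state}" unless target
    @state = target
  end
end

class SketchCall
  # One factory per callable type; each builds an independent machine
  MACHINES = {
    'Car'      => -> { SimpleStateMachine.new(:fresh, schedule: { fresh: :scheduled, scheduled: :scheduled }) },
    'Personal' => -> { SimpleStateMachine.new(:hello, wow: { hello: :meet }) }
  }.freeze

  def initialize(callable_type)
    @callable_type = callable_type
  end

  # Builds the machine on first use, mirroring
  # `car_state_machine || build_car_state_machine` above
  def call_state
    @call_state ||= MACHINES.fetch(@callable_type).call
  end
end

car = SketchCall.new('Car')
car.call_state.fire!(:schedule)
puts car.call_state.state   # scheduled
```

Because each type owns its own machine object, both machines can use a field called state without clashing, which is what the namespaced single-class attempt could not do.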
Posted in Ruby