EXPLORING REDIS AND ITS DATATYPES

Content posted here with the permission of the author Bandana Pandey, who is currently employed at Josh Software. Original post available here.

Today, performance is the first thing we developers think about when building web services. One common issue is that when a web service queries a database, fetching the result can take time depending on the number of records.

Prerequisites

For this blog, I am assuming that you have knowledge about Rails and basic idea about Redis.

Getting Started

Let’s imagine we are building the back-end for an online movie app. Customers will use this app to view all the movies and their details, resulting in a huge load on the database. So what if we could reduce that load by caching the movies data? But what should we use for caching?

There comes REDIS to our rescue.

Redis

Redis is a key-value store, which we can use for CACHING to speed things up and improve our performance.

But Redis is not just a plain key-value store; it is a data structures server, meaning it is not limited to strings as values but also supports more complex data structures, such as Hashes, Lists, Sets, and Sorted Sets. For detailed information, refer this.

Strings

Strings are the most basic data type that we use for caching in Redis. They are binary safe and easy to use, so we mostly go for them.

But in our scenario the String data type was not enough, as I had to store the whole list of movies and their respective details in Redis. Strings work, but they store the whole list as a single string value. So, before sending the data, I had to parse it into JSON format so that the views could present it to the user. But if the data is huge, parsing strings to JSON (or any other required format) is time-consuming. So strings are not what we can use in our case.
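The parsing overhead is easy to see in plain Ruby. A small sketch with a hypothetical movie list (the movie data here is invented for illustration):

```ruby
require 'json'

# A hypothetical list of movies, as it might come back from the database.
movies = [
  { 'name' => 'Inception',    'duration' => 148 },
  { 'name' => 'Interstellar', 'duration' => 169 }
]

# Caching under a string key means serializing the whole list once...
cached = movies.to_json

# ...and paying the JSON.parse cost on every single read.
restored = JSON.parse(cached)

puts restored == movies  # => true, but the parse cost grows with the list
```

The round-trip is lossless, but every read repeats the full parse, which is the cost the post is describing.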

By reading this memory optimization blog and the documentation, I found that Redis supports another data type that can be helpful here: Hashes.

Hashes

Hashes are the perfect data structure to represent objects. They are maps between string fields and string values, stored in attribute: value format, just like how table data is mapped to objects using ActiveRecord in Rails. Small hashes are encoded in a very small space, so we should always try to represent our data using hashes.
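To make that mapping concrete, here is a plain-Ruby sketch (with an invented movie record) of how one object's attributes become the flat field/value pairs that a single Redis hash stores under one key:

```ruby
# An invented movie record, as ActiveRecord might hand it to us.
movie = { 'name' => 'Inception', 'duration' => '148', 'genre' => 'Sci-Fi' }

# A Redis hash stores this as flat field/value pairs under one key,
# roughly: HSET movie:1 name Inception duration 148 genre Sci-Fi
hset_args = movie.flat_map { |field, value| [field, value] }

puts hset_args.inspect
# ["name", "Inception", "duration", "148", "genre", "Sci-Fi"]
```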

Using hashes, our data parsing issue is solved: we fetch the data as-is from Redis, and no conversion of data format is involved.

Memory consumption and read/write performance can also be improved by using the optimized storage of hashes over the string data type.

Now let’s check the above theory using Benchmark in Rails. Here we are going to use redis-namespace and a Redis service, which is explained later in this section.

Setting Data in Redis:

Benchmark.bm do |x|
  #here data is in the json format

  #Setting data using hash(value will be stored as hash)
  x.report { RedisService.new(klass: Event).set_list(key: CMS_MOVIE_LIST, data: data) }

  #Setting data using string(value will be stored as string)
  x.report { RedisService.new(klass: Event).set(key: MOVIE_LIST, data: data) }
end

#user     system   total    real
#0.030000 0.010000 0.040000 ( 0.011480) #Hashes
#0.150000 0.000000 0.150000 ( 0.447619) #Strings

Fetching Data from Redis:

Benchmark.bm do |x|

  #Fetching data using hash(value will be stored as hash)
  x.report { RedisService.new(klass: Event).get_list(key: CMS_MOVIE_LIST) }

  #Fetching data using string(value will be stored as string)
  x.report { RedisService.new(klass: Event).get(key: MOVIE_LIST) }
end

#user     system   total    real
#0.010000 0.000000 0.010000 ( 0.008200) #Hashes
#0.090000 0.000000 0.090000 ( 0.032398) #Strings

This demonstrates how performance can be improved by using Hashes over Strings in Redis.

To use the same approach in our Rails application, we are going to use redis-namespace. For detailed information about this, refer Redis::Namespace.

Initializing Redis in Rails

We instruct our Rails app to use Redis as a cache store and set the Redis host in an ENV variable like this:


REDIS_HOST: 'redis://localhost:6379'
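For reference, the cache-store side of that instruction could be wired up like this. This is only a sketch: `:redis_cache_store` ships with Rails 5.2+, while earlier versions need a gem such as redis-rails to provide an equivalent store.

```ruby
# config/environments/production.rb -- point the Rails cache at Redis.
Rails.application.configure do
  config.cache_store = :redis_cache_store, { url: ENV['REDIS_HOST'] }
end
```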

Now, initialize a wrapper around Redis using redis-namespace, or create a service redis_service.rb that uses redis-namespace so we can interact with Redis:


class RedisService
  def initialize(klass:)
    redis = Redis.new(url: ENV['REDIS_HOST'], timeout: 1)
    @namespaced_redis = Redis::Namespace.new(klass, redis: redis)
  end

  def set(key:, data:, expire: nil)
    #Set the value of a key as a JSON string
    @namespaced_redis.set(key, data.to_json)

    #Expire the key (defaults to 1 week)
    @namespaced_redis.expire(key, expire || 1.week)
  end

  def set_list(key:, data:, expire: nil)
    #Set a list of data on a key, serialized with Marshal
    @namespaced_redis.set(key, Marshal.dump(data))
    @namespaced_redis.expire(key, expire) if expire
  end

  def get(key:)
    #Get the value of a key, parsed back from JSON
    data = @namespaced_redis.get(key)
    JSON.parse(data) if data
  end

  def get_list(key:)
    #Get a list of data from Redis
    data = @namespaced_redis.get(key)
    Marshal.load(data) if data
  end

  def del(key:)
    #Delete a key from Redis
    @namespaced_redis.del(key)
  end

  def keys(pattern: nil)
    @namespaced_redis.keys(pattern || '*')
  end
end

Marshal

In the above code we are using Marshal, a Ruby library that converts a collection of Ruby objects into a byte stream. It is the fastest option available in Ruby for data serialization. For detailed information refer this.
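A quick round-trip in plain Ruby shows why Marshal avoids the format-conversion step: the bytes restore directly to the original objects, with no text format in between.

```ruby
# Invented sample data, in the shape our cached payload might take.
data = { 'movies' => [{ 'name' => 'Inception', 'rating' => 8.8 }] }

# Marshal turns Ruby objects into a binary byte stream...
bytes = Marshal.dump(data)

# ...and restores them as-is, with no intermediate text format.
restored = Marshal.load(bytes)

puts restored == data  # => true
puts bytes.encoding    # => ASCII-8BIT (a raw byte stream, not text)
```

Note that Marshal output is Ruby-specific: unlike JSON, it cannot be consumed by non-Ruby clients, which is fine here since only the Rails app reads the cache.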

Now we have a generic Redis service which we can use to add, delete, and fetch data from Redis in our Rails application.

Advantages of writing this service class:

  • Code is DRY
  • All the Redis commands live in one place, and we can use them whenever and wherever we want in our Rails app.

Now, we are going to use this, to fetch movies on the basis of city.

Managing Redis Cache in Rails

The whole idea here is that when a customer wants the list of movies in a particular city, we first fetch the movies by querying the database directly. We then cache the response using the redis-namespace wrapper, so that on subsequent queries the data is fetched from Redis and not from the database, improving our application's performance.


class MoviesController < ApplicationController
  #Here we are going to use the RedisService to perform operations on redis
  def index
    #Check if the list of movies is already in redis
    movies = RedisService.new(klass: Movie).get_list(key: "movies:#{params[:city]}")

    #If there are no movies in redis
    if movies.blank?
      #Load movies from the database (load_movies returns the loaded records)
      movies = load_movies

      #Serialize the data
      movies = serialize_resource(movies, V1::MoviesSerializer)

      #Cache the serialized response in Redis, so that it can be used again
      RedisService.new(klass: Movie).set_list(key: "movies:#{params[:city]}", data: movies, expire: 1.day)
    end

    #Return the response
    mobile_success_response(data: movies)
  end
end

The above code works, but there is one loophole: if any movie is added or updated in the database, the change will not be shown to the customer while the data is served from Redis.

So how do we solve this issue?

We’ll write a callback such that whenever any movie is added or updated, we delete the keys corresponding to the movie lists. So after a movie changes, if a user requests the data, it will be fetched directly from the database and then stored back in the Redis cache; on subsequent calls it will be fetched from Redis. Below is the callback to achieve this:


class Movie < ApplicationRecord
  after_commit :update_in_redis, on: [:create, :update]
  after_commit :delete_from_redis, on: [:destroy]

  def update_in_redis
    redis = RedisService.new(klass: self.class)

    #Delete all the keys matching the movies: pattern, so the next
    #request repopulates the cache from the database
    redis.del(key: redis.keys(pattern: "movies:*"))
  end

  def delete_from_redis
    redis = RedisService.new(klass: self.class)

    #Delete the movie from redis, along with the cached lists that contained it
    redis.del(key: self.id)
    redis.del(key: redis.keys(pattern: "movies:*"))
  end
end

Hope this blog was useful. For more content like this, stay tuned 🙂


Postgres – A new NoSQL

Content posted here with the permission of the author Tejaswini Gambhire, who is currently employed at Josh Software. Original post available here.

In today’s world of database technologies, there are two major database types: SQL and NoSQL. Both do the same basic job, but in different ways. Depending upon our project's needs, we have to find the better fit. If you need to handle a large amount of data with little or even no structure, then NoSQL is the best fit; but if you need transactional support and structured data, then you should go for SQL. We will not go into the details here, but for a quick difference between the two, you can visit here.

For any project we go for either SQL or NoSQL. Our project needed ACID compliance and transactional support, so we had opted for a SQL database (Postgres). Now, keeping this in mind and looking at the title of this blog, you may wonder: why would I even want to store unstructured data in my database, and that too in Postgres? Isn’t it better to go for a NoSQL database?

Nothing complicated about it. This is the use case I had in my project:

In our project we were storing dishes. As a dish has many ingredients, we wanted to keep track of whether it contains nuts, milk, eggs, gluten, or soy sauce, in order to handle the allergies and preferences of customers. By the conventional relational approach, we would have created a separate column for each. But thinking a little bit, is that a scalable solution? Of course not! Here Postgres came into the picture with its hstore support. PostgreSQL provides a great platform by incorporating hstore, json, and jsonb, which has led us to use unstructured data in a structured database. So, we created just one column, ‘contains’, which stores this data as key-value pairs with the ingredients as keys and boolean values.

There are some other usecases also where it makes a lot of sense to incorporate JSON document into your model. For example, it’s perfect when you need to maintain data that comes from an external service in the same structure and format (as JSON) that it arrived to you. Instead of trying to normalize this data across multiple tables, you can store it as it is (and still query against it).

In this blog we will have a quick overview about the NoSQL capabilities of postgres and learn how to use hstore in detail.

HSTORE:

  • Hstore is a schemaless key-value store.
  • The best part is it’s ACID compliant.
  • It is useful for storing sparse attributes, like a product description.
  • The advantage of using hstore is that we can store very different types of records with different attributes in the same table and still query them with SQL.
  • The downside of hstore is that all values are stored as strings.
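That last point is worth seeing concretely. Here is a rough plain-Ruby sketch of hstore's literal syntax (real adapters handle escaping and NULLs; this helper is purely illustrative) showing every value coerced to a string:

```ruby
# Build an hstore-style literal from a Ruby hash -- note how the
# booleans come out as the strings "true" and "false".
def to_hstore(hash)
  hash.map { |key, value| %("#{key}"=>"#{value}") }.join(', ')
end

puts to_hstore(nuts: true, dairy: false)
# "nuts"=>"true", "dairy"=>"false"
```

On the way back out, your application has to re-interpret those strings (e.g. compare against 'yes'/'true'), which is why the queries later in this post match on string values.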

Postgres Document Store:

  • JSON is the most popular data interchange format on the web.
  • Postgres has a native JSON data type and a variety of JSON functions.
  • It is a hierarchical document model.
  • Postgres also supports the JSONB column type, the binary version of JSON.
  • JSONB is faster and more robust than JSON.
  • The key difference between JSON and JSONB is that JSON stores an exact copy of the text input, which must be reparsed again and again, while JSONB stores a binary representation that avoids reparsing the data structure.

If you want to learn how to use jsonb with ruby on rails you can visit here

So, we can say that we can use NoSQL capabilities with the same syntax, in the same ACID transactional environment, and rely on the same query planner, optimizer, and indexing technologies as conventional SQL-only queries.

Using HSTORE:

To use hstore you must enable the extension by using the command

 CREATE EXTENSION IF NOT EXISTS hstore; 

We can simply create a table as like any other table with hstore as column type

CREATE TABLE dishes(
  name TEXT,
  recipe TEXT,
  contains HSTORE
);

Insertion of the data has nothing magical to do with hstore. We can simply use the same conventional syntax

INSERT INTO dishes (name, recipe, contains) VALUES (
'Green Beans, Tomato and Potato Salad',
'Organic Potatoes red 2 cups, Hot house Tomatoes 1/2 cup, Organic Green Beans 1/2 cup, 1 tbsp Parsley, 1 tsp Lemon, 1/2 clove Garlic, 1/4 cup Extra Virgin Olive Oil, Salt & Pepper, red onion 1/4 cup, kalamata olives 2 tbsp, 1 tsp capers',
'"nuts"=>"yes", "dairy"=>"no", "gluten"=>"no", "sesame"=>"no", "egg"=>"no"');

It’s typical for every row to have the same key names, or at least some minimum number of overlapping key names, but you can, of course, use any keys and values you like. It may be the case that there are totally different keys in many of the rows.

Now, let’s see a simple query to retrieve all the dishes containing nuts

SELECT name FROM dishes WHERE contains->'nuts'='yes';

Notice several things here. First, the column name remains unquoted, just as when you retrieve the full contents of the column. Second, you put the key name after the -> arrow. Finally, the returned value will always be of type TEXT. There are numerous operators and functions provided by Postgres, which you can always look up in the official documentation.

Hstore with Rails:

In rails you can use enable_extension in your migration. Let’s see how to add the column contains to our dishes table by writing a migration

class AddContainsToDish < ActiveRecord::Migration[5.1]
  def change
    enable_extension 'hstore'
    add_column :dishes, :contains, :hstore, default: {}
  end
end

Now you have to identify this column on your model with store_accessor as below:

class Dish < ApplicationRecord
  store_accessor :contains
end

We can now store any kind of attributes in the contains column.

Dish.create(name: 'Green Beans, Tomato and Potato Salad', contains: { 'nuts' => 'yes', 'dairy' => 'no', 'gluten' => 'no', 'sesame' => 'no', 'egg' => 'no' })

Not only does hstore allow us to store arbitrary key-value pairs, it also allows us to query them quickly.

# Find all dishes that have a key 'nuts' in contains

Dish.where("contains ? :key", :key => 'nuts')

# Find all dishes having sesame

Dish.where("contains @> (:key => :value)", :key => 'sesame', :value => 'yes')

If you’re going to query this column frequently, you must add an index. There are two types you can use: GiST and GIN.

  • GIN indexes are three times faster to search, but they take more time to index. They also take more disk space. Use it when you have more than 100K unique terms.
  • GiST indexes are slower than GIN indexes, but they’re faster to update. Use it when you have up to 100K unique terms.

You can define the index on your migration file with the :using option.

class AddContainsToDish < ActiveRecord::Migration[5.1]
  def change
    enable_extension 'hstore'
    add_column :dishes, :contains, :hstore, default: {}
    add_index :dishes, :contains, using: :gin
  end
end

This is how you can use hstore. If you want to dig into the details, more information is available in the postgres hstore docs.

So, you can say that Postgres is a bridge between SQL and NoSQL. You can convert hstore to JSON as well, make a SQL table look like a JSON document and vice versa, and easily combine SQL and JSON queries in the ACID-compliant environment of Postgres. So you can start with structured data in your database and integrate unstructured data later, or start with an unstructured dataset and adjust the balance between structured and unstructured data very easily with Postgres. To know more about the NoSQL capabilities, visit the official site of EnterpriseDB.


Testing React-Redux App with Jest

Content posted here with the permission of the author, Kiran Deshmukh, who is currently employed at Josh Software. Original post available here.

We often get confused about selecting a testing framework for our application. Currently, I am working on a React-Redux based project. While selecting a testing framework, we compared some of the popular JavaScript testing frameworks and found that Jest is the best fit for testing our application.

Jest is not limited to ReactJS testing; we can test any JavaScript code with it, including asynchronous code.

In a React-Redux project, we have a single store containing the state of the application. We have actionCreators which return an action type and payload (possibly the response from an API). The reducer contains the actual logic to update the store for a particular action. Components listen to the reducer, so when the reducer's state changes, the component is re-rendered.

Here, we will discuss how Jest helped us test actionCreators, reducers, and components in our project. ActionCreators return the actions, so we are not testing actions separately.

Suppose we have a file friendListActions.js which contains string literals for the actions:

const friendListActions = {
  fetchFriendList: 'FETCH_FRIEND_LIST',
  fetchingFriendListSucceeded: 'FETCHING_FRIEND_LIST_SUCCEEDED',
  fetchingFriendListFailed: 'FETCHING_FRIEND_LIST_FAILED'

};

export default friendListActions;

Suppose we have the following file friendListReducer.js. We change the state in the reducer based on the actions.

import friendListActions from 'friendListActions.js';

//Set the initial state for this reducer.
const initialState = {
  isLoading: false,
  errorMsg: null,
  friendList: []
};

//Here is our business logic to change state in the reducer.
const friendListReducer = (state = initialState, action) => {
  switch (action.type) {
    case friendListActions.fetchFriendList:
    case friendListActions.fetchingFriendListSucceeded:
    case friendListActions.fetchingFriendListFailed:
      return { ...state, ...action.payload }
    default:
      return state;
  }
}
export default friendListReducer;

Suppose we have the following file friendListActionCreators.js file containing action creators for fetching friend list. We are handling success as well as error response while fetching the friend list.

import friendListActions from 'friendListActions.js';

//This actionCreator is to initialise the fetching of friend list
export const fetchingFriendListInitiated = () => {
  //actionCreator is returning an action object.
  return {
    type: friendListActions.fetchFriendList,
    payload: {
      isLoading: true
    }
  }
}

//This actionCreator is used when friend list is fetched successfully.
export const fetchingFriendListSucceeded = ( friendList ) => {
  //actionCreator is returning an action object.
  return {
    type: friendListActions.fetchingFriendListSucceeded,
    payload: {
      isLoading: false,
      errorMsg: null,
      friendList
    }
  }
}

//This actionCreator is used when failed to fetch friend list.
export const fetchingFriendListFailed = ( errorMsg ) => {
  //actionCreator is returning an action object.
  return {
    type: friendListActions.fetchingFriendListFailed,
    payload: {
      isLoading: false,
      errorMsg
    }
  }
}

export const fetchFriendList = () => {
  return( dispatch => {
    dispatch( fetchingFriendListInitiated() )

    //Here we are fetching the friend list for user having 23 as id.
    return fetch("http://social-media.com/23/friends")
    .then(successResponse => {
      dispatch(
        fetchingFriendListSucceeded(
          successResponse.response
        )
      )
    })
    .catch(errorResponse => {
      dispatch( fetchingFriendListFailed(errorResponse.message))
    })
  })
}

We are making a ‘fetch’ call to the respective API.

Let us observe the test cases for the reducer. We have the following file friendListReducer.test.js:

import reducer from 'friendListReducer.js';
import friendListActions from 'friendListActions.js';

const expectedInitialState = {
  isLoading: false,
  errorMsg: null,
  friendList: []
}

//'describe' is used to create 'test suite' containing multiple test cases.
describe('Friend List Reducer', () => {
  it('returns a state of reducer when succeeded to fetch the friend list', () => {
    let expectedPayload = {
      isLoading: false,
      errorMsg: null,
      friendList: [
          'John', 'Emraan', 'Sukanya'
      ]
    }

    expect(
      // "reducer" takes 2 arguments:
      // first argument: state of reducer before applying the action
      // second argument: Plain JavaScript Object containing "action" and "payload"
      reducer(expectedInitialState, {
        type: friendListActions.fetchingFriendListSucceeded,
        payload: expectedPayload
      })
    ).toEqual({ ...expectedInitialState, ...expectedPayload })
  })
})

The reducer's state should change when an action is performed. Here, we are testing whether the state changes as per the expected payload.

Here, we are using expect() and toEqual() methods provided by Jest. Also, it provides describe to create a test suite and it to create an individual test case.

Let us test which actions are performed when the friend list is fetched successfully from the API. Suppose we have the file friendListActionCreators.test.js:

import configureMockStore from 'redux-mock-store';
import thunk from 'redux-thunk';
import * as fetch from 'jest-fetch-mock';

import friendListActions from 'friendListActions.js';
import { fetchFriendList } from 'friendListActionCreators.js';

//We are creating a mock store here.
const middlewares = [thunk];
const mockStore = configureMockStore(middlewares);

describe('fetchFriendListSucceeded()', () => {

  it('returns friend list in response', () => {
    const getFriendList = [
      'John','Emraan', 'Sukanya'
    ]

    const jsonResponse = {
      "method": "getFriendList",
      "response": getFriendList
    }

    //We are mocking only one http fetch response
    fetch.mockResponseOnce(JSON.stringify(jsonResponse));

    let store = mockStore({
      friendList: {}
    })

    let expectedActions = [
      {
        type: friendListActions.fetchFriendList,
        payload: {
          isLoading: true
        }
      },
      {
        type: friendListActions.fetchingFriendListSucceeded,
        payload: {
          isLoading: false,
          errorMsg: null,
          friendList: getFriendList
        }
      }
    ]
    //We are returning 'promise' due to asynchronous actions.
    return (
      store.dispatch(fetchFriendList())
      .then(() => {
        expect(store.getActions())
          .toEqual(expectedActions);
      })
    )
  })
})

Here, while testing, we should mock the Redux store. Since we use fetch calls to the API, our store receives responses from asynchronous actions. As Redux only supports synchronous code, we require middleware so that it supports asynchronous code. While writing test cases, we should mock the middleware as well.

Jest by itself can’t mock HTTP fetch calls, the Redux store, or the middleware used in the store, so we need to add some other packages for mocking these things.

Here, we have used ‘redux-mock-store’ to create a mock store and ‘redux-thunk’ to mock the middleware. We are using the ‘jest-fetch-mock’ package to mock HTTP fetch calls, with its ‘mockResponseOnce()’ method since we want to mock only one API call.

After mocking the API call, it returns a static value. But our actual code expects a promise object from the API, so to simulate the same behaviour we return the promise object in the test case.

Snapshot testing:

Snapshot testing is useful when we don’t want UI components to change unexpectedly. Jest provides this amazing feature.

A snapshot for a component is created the first time its test case runs, so when I run snapshot test cases for the first time, they pass successfully. This shows that snapshot testing is not Test Driven Development (TDD). To make it TDD, we can use the enzyme package along with it.

Jest creates a new folder __snapshots__ under the current working folder of the test cases, and the snapshots are stored there in a human-readable format. When I run the test cases afterwards, the component is compared with its existing snapshot. If the component has been modified, the test case will fail; if the changes are desired, we can update the existing snapshot.

Let us have a List component in the file list.js:


import React, { Component } from 'react';

class List extends Component {
  render() {
    const flowers = ['Lily', 'Lotus', 'Rose'];
    return (
      <div>
        <h2>List of flowers</h2>
        <ul>
          {
            flowers.map((flower, index) =>
              <li key={ index }>{ flower }</li>
            )
          }
        </ul>
      </div>
    );
  }
}

export default List;

We will write test case for this component to create it’s snapshot:

import React from 'react';
import renderer from 'react-test-renderer';
import List from 'list.js';

it('renders list component correctly', () => {
  const tree = renderer.create(<List />).toJSON();
  expect( tree ).toMatchSnapshot();
});

 

To create a snapshot of a component, we first have to create a JSON object of that component. So we create a JavaScript object for the component and convert it to JSON. We are unable to create this JavaScript object of the component with Jest alone, so we have used ‘react-test-renderer’.

Jest provides the ‘toMatchSnapshot()’ method to create a snapshot (if it is absent) for that component. The next time I test the List component, it is compared with the existing snapshot.

If there is any change in the List component, the test case will fail. We can update the snapshot to reflect desired changes, and we should commit these snapshot files along with the other test files. You can find more information here.

If we want to add assertions in component testing, or check manipulations in the components, we can use the enzyme package along with the Jest framework. enzyme adds TDD to our UI component testing.

In short, Jest creates the component tree structure and we can traverse this tree with the help of enzyme. The enzyme package doesn’t have its own assertion library, so we use the assertion library provided by Jest.

We can test following things with the help of Jest and Enzyme:

  • We can test state changes in the components.
  • We can test conditional parameters passed in the component. E.g. Suppose the className of div tag is calculated at run-time based on the received props, we can test it.
  • We can test event handling in the component.
  • We can test component life cycle callbacks. Here we can test whether desired function is called from that life cycle hook or not.

Here are my observations about testing React – Redux application with Jest:

  1. Jest provides a very good assertion and mocking library. We can test asynchronous code with the help of it. If you are new to testing ReactJs application, Jest will be the best choice. Due to parallel testing, it is a great choice for large projects.
  2. We cannot mock http fetch calls with Jest. We can use package like ‘jest-fetch-mock’ for it.
  3. We cannot mock Redux store with Jest. We can use packages like ‘redux-mock-store‘ to create mock store and ‘redux-thunk’ to provide middleware for the store.
  4. Snapshot testing is one of the best features provided by Jest. It is useful to check whether UI is changed unexpectedly or not.
  5. Snapshot testing creates a component tree. We are unable to traverse through this component tree using Jest only. We can use enzyme package along with Jest for it.

To sum it up, I think Jest is a really good framework for testing the ReactJS part of an application. Using packages like ‘redux-mock-store’, ‘redux-thunk’, ‘jest-fetch-mock’, and ‘enzyme’ along with Jest, we can test an entire React-Redux application.


Raspberry Pi with GOBOT (Golang) Part I

This all started with developing a simple software to support the robotic project. We had initially planned with Arduino and Gobot (A golang framework for robotics). After we tested some initial sample programs we realised that we can’t run the … Continue reading


Rails views to ReactJS

Content posted here with the permission of the author, Yogesh Khater, who is currently employed at Josh Software. Original post available here.

Have you ever been introduced to a Rails application with JS/jQuery scattered around the app/assets/javascript, app/views, or even app/helpers directories? Wondering if there is any consistency in how JS snippets are added, you think it would be better to have some conventions to follow here too. While there are lots of other options available, this blog is about using react-rails to bootstrap with ReactJS and then decoupling it from Rails into a complete UI-only application.

♦ But why ?

A few reasons I could think of:

  • Reactiveness, Of course.
  • Performance is certainly better with client-side rendering.
  • View layer testing gets easier because of the component based architecture.
  • Having Rails as an API only application has its own advantages like better performance (by removal of some middlewares), well documented APIs, faster deployments etc.

♦ Inception

To start the decoupling, introduce ReactJS into the application using the react-rails gem. Before going further, the prerequisite is that you have some idea about ReactJS, Redux, and JSX.

Usage of the react-rails gem is pretty simple. To start off, you could convert one simple Rails view, or even a div from a Rails view, into a React component.

Let’s say you’ve a Rails partial which renders the navigation bar containing some links and a username; creating a NavigationBar component would look like:

$ rails g react:component NavigationBar username:string --es6

which creates the app/assets/javascripts/components directory with navigation_bar.jsx in it. After adding its rendering logic, you can replace

render partial: 'navigation_bar', locals: { username: current_user.username }

with

react_component('NavigationBar', username: current_user.username)

Using the above approach, you can progressively replace partials and views with React components.

Keeping in mind that our goal is to decouple the view layer from Rails, don’t pass too many props that are bound to Rails methods/variables. It would be difficult, or rather time consuming, to remove such references and repopulate the prop values during the separation.

♦ Architecture of components

As our final goal is to have well architected and maintainable ReactJS application, we should follow the basic conventions for it.

Assuming you would be using react-redux as a state maintainer, the directory structure that any React-Redux application follows basically has,

  • src/
    • components
    • containers
    • actions
    • actionCreators
    • reducers
    • store.js

So in this scenario also, you could add above directories along with components directory in app/assets/javascripts. You could refer to an example of such architecture.

♦ Using NPM packages over bundled gems

You might have installed some gems for packages like select2, moment.js etc. in your Gemfile. But those can’t be used after the decoupling, so a better way is to start using NPM packages instead of such gems.

To do so, you can start using https://rails-assets.org which converts the NPM packages into respective gems and then adds them to the asset pipeline.

# Gemfile

source "https://rails-assets.org" do
  gem 'rails-assets-moment'
  gem 'rails-assets-select2'
  ...
end

♦ Using rails-api

In time, you will also need to start replacing your controller actions with API-based controller actions. If your application is on Rails 5, it has built-in support for ActionController::API (a class extracted from ActionController::Base with the minimal modules required to support API actions); for applications on Rails < 5, you need to install the rails-api gem. Once ActionController::API is available, you can add an app/controllers/api/v1 directory to start writing API-based controller actions.

Note that during initialization, Rails requires all the framework middlewares by default via require 'rails/all' in config/application.rb. After the decoupling we won’t need all of them, so do remember to remove require 'rails/all' and keep only the frameworks you actually use.
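As a sketch (which railties you keep depends entirely on the frameworks your app actually uses), config/application.rb could then look like this:

```ruby
# config/application.rb
# Instead of: require 'rails/all'
require "rails"

# Keep only the frameworks you actually need (illustrative selection):
require "active_model/railtie"
require "active_record/railtie"
require "action_controller/railtie"
# require "action_view/railtie"   # drop once Rails no longer renders views
# require "action_mailer/railtie" # keep only if you send mail
```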

BTW, controllers inheriting from ActionController::API don’t process requests through the additional modules used for rendering views anyway, so you don’t have to worry much even if you decide to keep all the middlewares.
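As a sketch, a controller under the new directory could look like the following (the UsersController resource and its JSON shape are hypothetical, shown only to illustrate the structure; this runs inside a Rails app, not standalone):

```ruby
# app/controllers/api/v1/users_controller.rb
module Api
  module V1
    class UsersController < ActionController::API
      # GET /api/v1/users/:id
      def show
        user = User.find(params[:id])
        render json: { username: user.username }
      end
    end
  end
end
```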

♦ The Decoupling

Say you’ve reached a level in your application where

  • Views have only a single line, rendering a specific React component
  • No more partials
  • No more view helpers
  • No more HAML, ERB or any other Rails template code in view layouts
  • No more controller actions redirecting to or rendering Rails views.

then your application is in the right place to start the decoupling.

You can use create-react-app or any other boilerplate template to create a React application and start copying the required directories from app/assets/javascripts to it.

Some points to consider after the migration:

  • Add the dependencies you listed in the Gemfile under the https://rails-assets.org group to package.json.
  • You will need to add an authentication module to your main component, as that was previously handled by Rails.
  • Finally, add the top-level components that were rendered in Rails views inside BrowserRouter (or any other navigational component), and then remove app/views.

Your ReactJS application will be up in no time, and the best part is that there won’t be any downtime while switching from Rails views to ReactJS on your servers!

Thanks for reading, Happy decoupling !!

Posted in General | Leave a comment

ProTips PostGres 1: Looking through the Postgres Window

Content posted here with the permission of the author, who is currently employed at Josh Software. Original post available here.

Problem – 

We encountered a situation where we had to select only one record for each value of a column (let’s say user_id), ordered by a different column (let’s say price), in a certain table (let’s say purchases). Doing this the ActiveRecord way was very inefficient. The ActiveRecord way, we had to:

  1. Fetch all the user_ids from purchases table
  2. Loop through all the user_ids, for each user_id
    1. Select one record ordered by price

This approach led us to query the database once per user, plus one query for selecting all the user_ids (the N + 1 query problem).

Solution – 

We struggled to find a better solution to this problem in ActiveRecord, and PostgreSQL’s window functions came to the rescue.

To understand Postgres window functions, we must first understand the GROUP BY clause. We use the GROUP BY clause to apply aggregate functions to subsets of the rows of a table.

For example, suppose you have a table called “purchases” with the schema shown below.

  CREATE TABLE purchases (
    purchase_id serial PRIMARY KEY,
    product_id INT NOT NULL,
    price DECIMAL (11, 2),
    user_id INT NOT NULL
  );

Let’s insert data into this table.

 INSERT INTO purchases (product_id, price, user_id)
 VALUES
 (1, 10, 1),
 (2, 20, 1),
 (3, 30, 1),
 (4, 40, 2),
 (1, 10, 2),
 (5, 50, 2),
 (6, 30, 2);

To calculate the average of all the purchase prices, we will use the AVG aggregate function.

  SELECT AVG (price) FROM purchases;

This returns a single overall average of the prices across both users (ids 1 and 2).

  avg
 --------
  27.1428571428571429


Now, suppose you want to find the AVG (a Postgres aggregate function) of the prices of the items that a specific user has purchased. You would use a GROUP BY clause on user_id. The following statement returns the average for each user_id.

  SELECT user_id, AVG (price) FROM purchases GROUP BY user_id;

The above query returns the following:

 user_id          avg
 --------      --------
    1          20.0000000000000000
    2          32.5000000000000000

As you can see,

The AVG aggregate function has reduced the number of rows returned by the sql query.

The GROUP BY clause groups the rows of the table into subset of rows based on a column(s).

Similarly, a window function also operates on a subset of rows of a table but it does not reduce the number of rows. It returns the actual records of the table in the output with an extra column “avg” if you have applied AVG function to the prices.

In the query below, the AVG function works as a window function: it operates on the sets of rows specified by the OVER (PARTITION BY) clause.

For example, the query returns the product_id, user_id, price, and the per-user-group average from the “purchases” table.

  SELECT
  product_id,
  user_id,
  price,
  AVG (price) OVER ( PARTITION BY user_id )
  FROM
  purchases

The output

  product_id    user_id   price   avg
 ------------ ---------- ------- -----
      1           1       10.00   20.0000000000000000
      2           1       20.00   20.0000000000000000
      3           1       30.00   20.0000000000000000
      4           2       40.00   32.5000000000000000
      1           2       10.00   32.5000000000000000
      5           2       50.00   32.5000000000000000
      6           2       30.00   32.5000000000000000

You can also apply other operations with the PARTITION BY clause. For instance, if you want the rows within each partition processed in a particular order, you can add an ORDER BY clause. You can also use built-in window functions like row_number(), rank(), etc.

These built-in window functions assign a number to each row based on its order within the partition. The row_number() function assigns a serial number to each row returned, so if you want a limited number of records for each user_id, you can use it.

The query below will return only one record for each user_id: the record with the highest purchase price.

  SELECT * FROM
  (
    SELECT
    product_id,
    user_id,
    price,
    row_number() OVER ( PARTITION BY user_id ORDER BY price DESC )
    FROM
    purchases
  ) tmp
  WHERE tmp.row_number <= 1

In the above query, I ordered each partition by decreasing price and then limited the number of rows per partition using the row_number() function. Below is the output.

  product_id   user_id   price   row_number
------------- --------- ------- ------------
     3           1       30.00       1
     5           2       50.00       1

Postgres provides many built-in window functions. Some of them are explained below.

RANK() function

The RANK() function assigns a ranking within an ordered partition. If the values of two rows are equal, the RANK() function assigns them the same rank, and the next ranking(s) are skipped.

  SELECT
  product_id,
  user_id,
  price,
  RANK () OVER ( PARTITION BY user_id ORDER BY price )
  FROM purchases

The output (note: the tied pair for user 1 assumes an additional purchase of product 1 at 10.00, not present in the INSERT statement above, so that a tie can be shown):

 product_id    user_id    price   rank
 ------------ ---------- ------- -----
    1            1        10.00    1
    1            1        10.00    1
    2            1        20.00    3
    3            1        30.00    4
    1            2        10.00    1
    6            2        30.00    2
    4            2        40.00    3
    5            2        50.00    4


DENSE_RANK()

The DENSE_RANK() function assigns the ranking within an ordered partition, but the ranks are consecutive. In other words, the same ranks are assigned to multiple rows and no ranks are skipped.

  SELECT
  product_id,
  user_id,
  price,
  DENSE_RANK () OVER ( PARTITION BY user_id ORDER BY price )
  FROM purchases

The output (the tied pair for user 1 assumes an extra purchase of product 1 at 10.00, not present in the INSERT statement above):

  product_id   user_id    price   dense_rank
 ------------ ---------- -------  -----------
      1         1        10.00     1
      1         1        10.00     1
      2         1        20.00     2
      3         1        30.00     3
      1         2        10.00     1
      6         2        30.00     2
      4         2        40.00     3
      5         2        50.00     4
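To make the difference between the two ranking rules concrete, here is a plain-Ruby sketch applied to one partition’s prices (purely illustrative; this is not how Postgres computes them):

```ruby
# RANK: tied values share a rank and the following rank(s) are skipped.
def rank(values)
  sorted = values.sort
  sorted.map { |v| sorted.index(v) + 1 }
end

# DENSE_RANK: tied values share a rank and no ranks are skipped.
def dense_rank(values)
  sorted = values.sort
  distinct = sorted.uniq
  sorted.map { |v| distinct.index(v) + 1 }
end

prices = [10, 10, 20, 30] # the tied pair from the example output
p rank(prices)       # => [1, 1, 3, 4]
p dense_rank(prices) # => [1, 1, 2, 3]
```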

There are many other window functions; some of them are listed below.

  1. FIRST_VALUE()
  2. LAST_VALUE()
  3. NTH_VALUE()
  4. PERCENT_RANK()
  5. LAG()
  6. LEAD()
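For instance, LAG() and LEAD() let each row look at the value from the previous or next row within its ordered partition. A plain-Ruby sketch of the idea (again illustrative, not Postgres internals):

```ruby
# LAG: value from `offset` rows before the current row (nil if none).
def lag(values, offset = 1)
  values.each_index.map { |i| i >= offset ? values[i - offset] : nil }
end

# LEAD: value from `offset` rows after the current row (nil if none).
def lead(values, offset = 1)
  values.each_index.map { |i| values[i + offset] }
end

ordered_prices = [10, 20, 30]
p lag(ordered_prices)  # => [nil, 10, 20]
p lead(ordered_prices) # => [20, 30, nil]
```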


In the next blog, I will share a performance analysis of using Postgres window functions vs the ActiveRecord way for the same task.



Securing your web applications

Content posted here with the permission of the author, who is currently employed at Josh Software. Original post available here.

While developing web apps, how frequently do you think about securing it?

Just wait for a moment and think!

You do? Really? That’s Awesome!

This blog will act like a checklist for you. Go through it once, to figure out just in case you have missed anything OR maybe I might have missed something that you have taken care of. Either way, please be kind enough to drop a comment.

You don’t? No worries, let’s do it together. Let’s understand:

  • What are the various types of attacks?
  • How do they work?
  • How to secure your app from such attacks?

Alright, without any further ado, let’s get started.

Cross-Site Request Forgery (CSRF)

It’s a kind of attack that forces an end user to execute unwanted actions on a web application in which they’re currently authenticated. These attacks target state-changing requests, not theft of data.

[Image: CSRF explanation]

Consider a scenario wherein a genuine admin user logs in to your web app (http://www.genuine-domain.com). When the login form is submitted, the server validates the credentials and sends a cookie in response. For each subsequent request to your domain, the browser will automatically send this cookie along. The server will verify the existence and contents of the cookie before considering the request authenticated.

All sounds good? Yeah? Think again!

Your web app is vulnerable to CSRF attacks right now. How?

Let’s say the admin user is authorized to delete another user’s account. Also assume that the URL for that endpoint is accessible over a GET request. Now open a new tab, type the URL, and press enter. Voila! You just deleted a user.

Now think: how difficult is it for a malicious user to trick your customer into clicking such a URL? Not very difficult, right?

Just by following certain basic rules, you can protect your app from such attacks:

  1. Show a confirmation prompt before every destructive action
  2. Use the GET verb only for read-only requests. Use POST/PUT/PATCH/DELETE for write operations.

How do non-GET verbs help? Notice that only URLs exposed over the GET verb can be opened from the browser’s URL bar (and hence from links). By using non-GET verbs, you make the hacker’s job a little more difficult.

The hacker may still trick your customer into submitting a form, which would trigger a POST/PATCH/PUT/DELETE request.

We can step up our security by including a one-time disposable secure token in every form rendered by the server. If a form is submitted without a token, or with an invalid token, the server simply rejects it. Only our server and the genuine client (browser tab) know about this token. Additionally, verify the Origin and Referer headers whenever possible. Note that the Origin header may not always be present (especially in GET requests); similarly, the Referer header is not guaranteed to be present in every request.
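The token idea can be sketched in a few lines of plain Ruby (illustrative only; in Rails this is what protect_from_forgery handles for you, and the helper names here are made up):

```ruby
require 'securerandom'

TOKENS = {} # session_id => pending one-time token

# Generate a token when rendering the form and remember it server-side.
def issue_csrf_token(session_id)
  TOKENS[session_id] = SecureRandom.hex(32)
end

# On submit, the token must match, and it is consumed either way (one-time
# use). Real implementations should use a constant-time comparison.
def csrf_token_valid?(session_id, submitted)
  expected = TOKENS.delete(session_id)
  !expected.nil? && submitted == expected
end

token = issue_csrf_token("session-1")
p csrf_token_valid?("session-1", token) # => true
p csrf_token_valid?("session-1", token) # => false (already consumed)
```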

Cookie Stealing/Hijacking

The attacker can also modify the contents of the cookie, or simply copy it. To mitigate this risk, your web app needs to ensure the following while setting the cookie:

  1. Use “session” cookies – they are deleted when the browser/client is shut down.
  2. Turn on the “secure” flag on the cookie – it is then only sent over HTTPS.
  3. Turn on the “HttpOnly” flag on the cookie – it is then inaccessible to JavaScript APIs.
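Together, a response header setting such a cookie might look like this (the cookie name and value are illustrative):

```
Set-Cookie: session_id=opaque-random-value; Secure; HttpOnly
```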

Network Sniffing

An attacker can always intercept your network requests and get access to sensitive information. To secure your data during transit, we need to encrypt it. This can be done by using HTTPS certificates.

However, using HTTPS alone is not sufficient. HTTPS encrypts data using ciphers, and your encryption is only as strong as your ciphers. Older versions of the underlying SSL/TLS protocols have known vulnerabilities. Therefore, I urge you to use HTTP Strict Transport Security (HSTS).

CORS (Cross-Origin Resource Sharing)

Similar to HTML requests, we also need to protect our web app from AJAX-based security threats. AJAX calls can be triggered either from your domain or from a different domain. Luckily, most modern browsers block cross-domain AJAX calls by default. But what if a genuine client needs to share resources across domains?

Secure inter-domain AJAX communication is handled using CORS (Cross-Origin Resource Sharing). Here is an excellent blog on implementing CORS.

NOTE: CORS restrictions are only applicable on AJAX requests (i.e. not applicable on HTML requests).

Clickjacking

Clickjacking (User Interface redress attack OR UI redress attack OR UI redressing) is a malicious technique of tricking a Web user into clicking on something different from what the user perceives they are clicking on, thus potentially revealing confidential information or taking control of their computer while clicking on seemingly harmless web pages.

[Image: Clickjacking explanation]

In order to secure your web app from clickjacking attacks, the server needs to set the X-Frame-Options: DENY header, which means your web app cannot be opened inside an iframe at all.

If you want to allow rendering of your app inside an iframe from your domain only, you can set:

X-Frame-Options: SAMEORIGIN
OR
X-Frame-Options: ALLOW-FROM <whitelisted-domain>

SQL Injection

A SQL injection attack consists of insertion or “injection” of a SQL query via the input data from the client to the application. A successful SQL injection exploit can read sensitive data from the database, modify database data (Insert/Update/Delete), execute administration operations on the database (such as shutdown the DBMS), recover the content of a given file present on the DBMS file system and in some cases issue commands to the operating system.

Bind variables are the best way to prevent SQL injection. When using bind parameters, you do not write the actual values into the statement but insert placeholders instead. That way the statement does not change when executed with different values. The question mark (?) is the only placeholder character that the SQL standard defines.
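A small Ruby illustration of the difference (the query and table names are hypothetical; with a real database driver the bound value would be sent separately from the statement text):

```ruby
malicious = "x'; DROP TABLE users; --"

# String interpolation: the attacker's input becomes part of the SQL text.
interpolated = "SELECT * FROM users WHERE name = '#{malicious}'"

# Bind parameter: the statement text is fixed; the value travels separately.
parameterized = ["SELECT * FROM users WHERE name = ?", malicious]

puts interpolated        # the payload is now executable SQL
puts parameterized.first # the statement itself never changes
```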

Cross-Site Scripting (XSS)

Cross-Site Scripting, commonly known as XSS, is a vulnerability that is often found in web apps. XSS allows attackers to inject client-side scripts into public facing web pages and, in many cases, can be used by attackers to work their way past access controls.

This is done by tricking a browser into accepting data from an untrusted source, which typically happens when attackers use familiar code (such as JavaScript) whose special characters developers fail to scrub out.

[Image: XSS explanation]

To protect your web app from XSS attacks, you need to:

  1. Never trust user input, i.e. always escape/sanitize input before inserting it into:
    1. HTML element content, e.g. innerHTML, $('#some-id').html(), etc.
    2. Values of common attributes. e.g. class, data-*, etc.
    3. HTML or JS comments
    4. Style property values
    5. URL parameters
  2. Whenever possible use an Auto-Escaping Template System
  3. Avoid eval() or such dynamic evaluation methods
  4. Implement CSP

CSP makes it possible for web app developers to specify the domains that the browser should consider valid sources of executable scripts. A CSP-compatible browser will then execute only scripts loaded from files served by those whitelisted domains, ignoring all other scripts (including inline scripts and event-handling HTML attributes).

e.g. The following header instructs a CSP-compatible browser to allow scripts only from the site’s own origin. This excludes subdomains.

Content-Security-Policy: default-src 'self'
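Coming back to point 1 above, escaping in Ruby can be as simple as the standard library’s CGI.escapeHTML (Rails’ ERB templates apply equivalent escaping automatically):

```ruby
require 'cgi'

payload = "<script>alert('xss')</script>"
escaped = CGI.escapeHTML(payload)

# The markup is now inert text rather than executable script.
puts escaped # => &lt;script&gt;alert(&#39;xss&#39;)&lt;/script&gt;
```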

Even after implementing all the above-mentioned security measures, your web app is still not 100% secure. That’s when I say: Security is a myth.

Despite that, I recommend following ALL the above-mentioned precautions to protect your web app. That’s because something is better than nothing, and this is certainly more than something ;). If nothing else, it will at least discourage attackers from choosing you as a target.

Finally, a sincere appeal to all:

Implement all these security measures together. Don’t pick and choose.
