Microservices with API Gateway

Let’s imagine that you are developing a native mobile client for a shopping application. You would have to implement a product details page that displays the following information:

  • Items in the shopping cart
  • Order history
  • Customer reviews
  • Low inventory warning
  • Shipping options
  • Various recommendations (other products bought by customers who bought this product)
  • Alternative purchasing options

In a monolithic application architecture, this data would be retrieved by making a single REST call (GET api.company.com/productdetails/<productId>) to the application. A load balancer routes the request to one of N identical application instances. The application then queries various database tables and returns the response to the client.

But if you use a microservice architecture, the data to be displayed must be retrieved from multiple microservices. Here are some of the example microservices we would need:

  • Shopping Cart Service – items in the shopping cart
  • Order Service – order history
  • Catalog Service – basic product information, such as its name, image, and price
  • Review Service – customer reviews
  • Inventory Service – low inventory warning
  • Shipping Service – shipping options, deadlines, and costs drawn separately from the shipping provider’s API
  • Recommendation Service(s) – suggested items

Each of the above microservices would have a public endpoint (https://<serviceName>.api.company.name), and the client would have to make many requests to retrieve all the necessary data. If the app needs to make hundreds of requests to render a single page, it will be inefficient. Also, if the existing microservices respond with different data formats, the app has to handle that too.

For these reasons, it is wise to use an API Gateway that encapsulates the internal microservices and provides an API tailored to each client. The API Gateway is responsible for mediating requests and composing a single response.

A great example of an API Gateway is the Netflix API Gateway. The Netflix streaming service is available on hundreds of different kinds of devices including televisions, set-top boxes, smartphones, gaming systems, tablets, etc. Initially, Netflix attempted to provide a one-size-fits-all API for their streaming service. However, they discovered that it didn’t work well because of the diverse range of devices and their unique needs.
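
To make the composition idea concrete for the product details page, here is a minimal Python sketch of a gateway handler. The three fetch_* functions, their return values, and LOW_STOCK_THRESHOLD are hypothetical stand-ins for the Catalog, Review, and Inventory services; a real gateway would make network calls instead. The sketch only illustrates fanning out to several services concurrently and composing one response for the client.

```python
import asyncio

# Hypothetical stand-ins for internal microservices. A real gateway would
# call each service over the network; asyncio.sleep simulates that latency.
async def fetch_catalog(product_id):
    await asyncio.sleep(0.01)
    return {"name": "Example product", "price": 19.99}

async def fetch_reviews(product_id):
    await asyncio.sleep(0.01)
    return [{"rating": 5, "text": "Works well"}]

async def fetch_inventory(product_id):
    await asyncio.sleep(0.01)
    return {"in_stock": 3}

LOW_STOCK_THRESHOLD = 5  # assumed cutoff for the low inventory warning

async def product_details(product_id):
    """Gateway handler: query the services concurrently, return one response."""
    catalog, reviews, inventory = await asyncio.gather(
        fetch_catalog(product_id),
        fetch_reviews(product_id),
        fetch_inventory(product_id),
    )
    return {
        "productId": product_id,
        "product": catalog,
        "reviews": reviews,
        "lowInventory": inventory["in_stock"] < LOW_STOCK_THRESHOLD,
    }

if __name__ == "__main__":
    print(asyncio.run(product_details("p-42")))
```

The client makes one request and receives one document, while the gateway hides how many services were involved and what each of them returned.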

Collaboration of Devs and Ops: Hi, DevOps!!

What is DevOps?

There is no definitive answer, only lots of opinions about what is covered under DevOps and what’s not. Born of the need to improve IT service delivery agility, the DevOps movement emphasizes communication, collaboration and integration between software developers and IT operations. Rather than seeing these two groups as silos that pass things along but don’t really work together, DevOps recognizes the interdependence of software development and IT operations and helps an organization produce software and IT services more rapidly, with frequent iterations.

First, the whole point of DevOps is to change the culture of dev and ops. In reality, though, many companies want to designate someone as the DevOps engineer. Usually a more accurate title for that person would be Automation Engineer or something along those lines, but we work within the constraints we are given.

Common DevOps mistakes when scaling

When it comes to scaling, most companies are pretty decent at scaling up infrastructure and pretty awful at scaling up code (at least on the infrastructure side of things, i.e. Ops). A lot of people take a 2,000-line mentality into a 20,000-line project. Some companies have engineers who can code a large architecture and keep scalability in mind, but they get hamstrung by their ops people.

The main issue when scaling up is cultural change. When the organization is small, everything relies on one go-to guy under a lot of pressure to work quickly. Choices are made which aren’t always the best ones or the most scalable and every fix is a quick and dirty job. Naturally, documentation is an afterthought.

When the same organization grows, that person is likely to be very insecure about the choices he or she has made. Collaboration with the new people is often an issue.

Management also often makes the common mistake of pushing to retrofit automation tools before standardization and process, which creates as many problems as it solves and introduces new risks. Design your infrastructure in modules as soon as possible. Make it so each module can easily be replaced, added, or removed as needed. Also, document things and don’t feed the IT hero culture.

Other common pitfalls include:

  • Premature optimization.
  • Reinventing the wheel… in a completely non-scalable way.
  • Not using indexes/foreign keys.
  • Fashion driven development (mongo, hey!)
  • Not creating a stateless infrastructure where possible
  • Not throwing hardware at a problem first (is your database server a bit slow? have you tried just installing an SSD/upgrading the RAM?)
  • Not doing performance testing before optimizing
  • Not admitting that the schema in use sucks
  • Not admitting that it is the way the database is used that sucks, not the database itself. Yes, the problem is us, not the database.
  • Thinking that things are easy.
  • Wanting to solve problems with new technologies with fancy names.
  • Jumping into those technologies without researching them.
  • Not hiring a DevOps person, but instead getting a developer to act like a sysadmin
  • Taking prototypes, turning them into production systems, then complaining when they crash, even though those prototypes were explicitly described as things that would not scale.
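
Two of the pitfalls above, skipping indexes and optimizing without measuring first, can be illustrated with a small sketch that uses Python’s built-in sqlite3 module. The orders table, its columns, and the index name are made up for the example. The point is only that asking the database for its query plan before and after adding an index is a cheap way to check whether the database is really the problem.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return the detail text of each EXPLAIN QUERY PLAN row for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the detail text is the last column

# Hypothetical schema and data, held in memory only for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

lookup = "SELECT * FROM orders WHERE customer_id = ?"

# Inspect the plan first, then add the index and inspect it again.
plan_before = query_plan(conn, lookup, (42,))
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn, lookup, (42,))

print("before:", plan_before)
print("after: ", plan_after)
```

Without the index the plan reports a full table scan; with it, the plan names idx_orders_customer. The exact wording of the plan text varies between SQLite versions.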

Fashion-driven development really hits home. It’s understandable how it happens. I mean, who doesn’t want to be an early adopter of the latest framework, DB, etc.? But in the real world, you need an actual data-driven argument for why the new system is measurably better than the tried and true. Most engineers learn this eventually.

Exposing too much information about your environment is another problem. People start to hardcode around that information, and it makes scaling hard. For example, consider a work environment with three data centers: west, central, and east. Servers in those data centers have a number associated with their location in the hostname. I have seen far too much code wired to this information. If we added or removed a location, there would be a ton of refactoring. Staying ambiguous makes changing architecture and technology easier.
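
One way to avoid wiring code to hostnames is to keep the topology in data and have the code look it up. The Python sketch below assumes a hypothetical DATA_CENTERS mapping with invented host names; in practice it would come from a configuration file or a service registry. It is an illustration of the idea, not a prescription.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DataCenter:
    name: str
    db_host: str

# Hypothetical topology with invented host names. In practice this would be
# loaded from a config file or service registry, not written into the code.
DATA_CENTERS = {
    "west": DataCenter("west", "db.west.example.internal"),
    "central": DataCenter("central", "db.central.example.internal"),
    "east": DataCenter("east", "db.east.example.internal"),
}

def db_host_for(location, data_centers=DATA_CENTERS):
    """Look up the database host for a location without parsing hostnames."""
    try:
        return data_centers[location].db_host
    except KeyError:
        raise ValueError(f"unknown location: {location!r}") from None

# Adding a location is a data change only; db_host_for() is untouched.
with_north = {**DATA_CENTERS, "north": DataCenter("north", "db.north.example.internal")}

print(db_host_for("east"))
print(db_host_for("north", with_north))
```

Nothing in db_host_for() knows how many locations exist or how servers are named, so adding or removing a data center does not require refactoring the callers.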

Every year, DevOps people deal with at least a few clients where the founder knows that the infrastructure is buggy and insecure, but there is often a main “first hire” developer who is very reluctant to let you look under the hood, ostensibly because they are embarrassed by what’s there. Don’t judge, because those decisions often get made under extreme pressure from people who may or may not understand the technical risks and tradeoffs. Hopefully DevOps will be able to close the gap somewhat for founders so they can make better-informed decisions about this stuff and maybe even work more harmoniously with their technical people.

Apache Lucene™ 5.2.0 available

Apache Lucene is a high-performance, full-featured text search engine
library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements,
some of which are highlighted below.

* Span queries now share document conjunction/intersection code with
boolean queries, and use two-phased iterators for faster intersection by
avoiding loading positions in certain cases.
SpanQuerys allow for nested, positional restrictions when matching documents in Lucene. SpanQuerys are much like PhraseQuerys or MultiPhraseQuerys in that they all restrict term matches by position, but SpanQuerys can be much more expressive.

* Added two-phase support to SpanNotQuery, and SpanPositionCheckQuery and
its subclasses: SpanPositionRangeQuery, SpanPayloadCheckQuery,
SpanNearPayloadCheckQuery, SpanFirstQuery.
The basic SpanQuery units are the SpanTermQuery and the SpanNearQuery.

* Added a new query time join to the join module that uses global
ordinals, which is faster for subsequent joins between reopens.

* New CompositeSpatialStrategy combines speed of RPT with accuracy of SDV.
Includes optimized Intersect predicate to avoid many geometry checks. Uses
TwoPhaseIterator.

* New LimitTokenOffsetFilter that limits tokens to those before a
configured maximum start offset.

* New spatial PackedQuadPrefixTree, a generally more efficient choice
than QuadPrefixTree, especially for high precision shapes. When used, you
should typically disable RPT’s pruneLeafyBranches option.
PackedQuadPrefixTree is a subclass of QuadPrefixTree; this SpatialPrefixTree uses the compact QuadCell encoding.

* Expressions now support bindings keys that look like zero arg functions

* Added SpanWithinQuery and SpanContainingQuery that return spans inside of
/ containing other spans.

* New Spatial “Geo3d” API with partial Spatial4j integration. It is a set
of shapes implemented using 3D planar geometry for calculating spatial
relations on the surface of a sphere. Shapes include Point, BBox, Circle,
Path (buffered line string), and Polygon.

The release is available for immediate download here.