Monitoring as an integral part of your SDLC

In a modern cloud-native, microservices, container-driven, highly available (insert your favorite catchy term here) application design, monitoring is not optional. Yet much time is wasted on the technology debate (do not get me wrong, it is a good debate, one in which I have a bias: the Elastic Stack, if you care).

Instead, this blog post will focus on what I think is often overlooked: monitoring, like every other system, requires a good understanding of its users and goals. It is a system with a normal life cycle (Dev, QA, Prod), and it will improve not just your production environment but also your development (DevOps, you know!).

Monitoring Tenants

Start by understanding the ecosystem in which your monitoring platform will function. You mostly have three tenants that your monitoring system needs to ‘enable’ (assuming a large enterprise that lacks pink-unicorn DevOps talent).

Picture1

Enabling your tenants

Now that your different engineering teams are identified and their requirements are clear(er), you can start enabling those tenants. Some example tasks may be:

  • Enable the development team to emit events (logging or otherwise) that provide enough information about the SLIs (Service Level Indicators).

For example, to enable a Spring Boot/Node.js development team to emit the proper logs, you may build a few libraries and initializers that ensure the effective use of logging.

  • You may need to enable the operations team to stand up and tear down the monitoring environments at will.
  • Enable the support team to achieve their target SLOs in three main ways:
    • Automatic notifications on certain events. (proactive actions)
    • Automated response to certain events (when possible to avoid pager fatigue)
    • Dashboards to help investigate issues and decrease your MTTR (Mean Time To Resolution/Response).
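
As a rough illustration of the kind of small logging library mentioned above, here is a minimal Python sketch. The event fields (`service`, `sli`, `value`, `unit`, `ts`) and the helper names are assumptions made up for this example, not part of any specific stack; the idea is only that every team emits SLI measurements in one consistent, machine-readable shape.

```python
import functools
import json
import logging
import time

logger = logging.getLogger("sli")


def build_sli_event(service, sli, value, unit):
    """Return one SLI measurement as a JSON string (field names are illustrative)."""
    return json.dumps(
        {"service": service, "sli": sli, "value": value, "unit": unit, "ts": time.time()},
        sort_keys=True,
    )


def timed(service, sli):
    """Decorator that logs how long the wrapped call took, in milliseconds."""
    def wrap(func):
        @functools.wraps(func)
        def inner(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Logged even when the call raises, so failures are timed too.
                elapsed_ms = (time.perf_counter() - start) * 1000
                logger.info(build_sli_event(service, sli, round(elapsed_ms, 2), "ms"))
        return inner
    return wrap
```

A team would decorate a handler with something like `@timed("login", "latency")`; where the log lines end up (files, a log shipper, the Elastic Stack) is left to the logging configuration.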

Integrating monitoring into the project life cycle

In the classic story of a ‘Login’ screen, a typical flow will look like the following:

normalpath

Once we include the support engineering team and monitoring in the discussion, we start asking a question:

“What could go wrong here that I may need to alert the administrators about, or simply plot on a dashboard for problem resolution?”

The simple act of asking that question may bring new ideas to the team, and you start to plot a story like this instead:

improvedpath

By integrating monitoring, operations and support tasks into our development process, we gain the following:

  • Inclusion of exception paths that were not handled at all.
  • Further enforcement of “Design for Failure”.
  • Faster response to problems that would have gone unnoticed or would have taken longer to fix.

The previous steps only cement the status of our support efforts as an “Engineering” practice!
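
To make the ‘Login’ story a bit more concrete, here is a minimal Python sketch of an exception path that feeds monitoring. The class name, the event names and the threshold of three failures are all hypothetical, and the `events` list is only a stand-in for whatever log or event pipeline you actually use.

```python
from collections import defaultdict

FAILED_LOGIN_ALERT_THRESHOLD = 3  # hypothetical value, tune it per system


class LoginMonitor:
    """Counts consecutive failed logins per user and records alert events."""

    def __init__(self, threshold=FAILED_LOGIN_ALERT_THRESHOLD):
        self.threshold = threshold
        self.failures = defaultdict(int)
        self.events = []  # stand-in for a real log or event pipeline

    def record(self, user, success):
        """Record one login attempt and emit the matching monitoring events."""
        if success:
            self.failures[user] = 0
            self.events.append(("login_ok", user))
            return
        self.failures[user] += 1
        self.events.append(("login_failed", user))
        # Alert once, at the moment the threshold is reached, to limit pager fatigue.
        if self.failures[user] == self.threshold:
            self.events.append(("alert_repeated_login_failure", user))
```

The failed-login path, which the original story did not cover, now produces both a dashboard-friendly event and an alert the support team can act on.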

Monitoring life cycle

Monitoring solutions involve all parties: they matter to developers, testers, operations and support. Building a monitoring environment could be a great way to break the silos between those teams.

As monitoring systems are enabled in each project and custom dashboards/alerts are built for each one, it makes sense to deploy those systems in development environments. That is the perfect way to get support teams involved as early as possible.

Monitoring systems could identify bugs that would otherwise go undetected. Engage the support and QA teams in discussions about which operational situations are acceptable and which are not, and let the monitoring evolve along with your system. The learnings from one project could also flow into others as you improve your ‘base offering’ for each project over time.

Agile applies here!

Do not let the desire to engineer a great system bait you into the analysis paralysis trap. If you do not currently have a monitoring system, almost anything you do will help. Start somewhere, use the “build to change” rather than the “build to last” principle, and remember that “perfect is the enemy of good”.

Any time you have a choice between an open-source tool and a paid one, I lean towards open source, with the exception of hosted tools that may offer a fast and easy starting point.

Try to avoid tool lock-in (for example, prefer tools that monitor your logs over tools that require modifying your code).

Deploy your first system as soon as possible with minimal dashboards and features, collect feedback from your tenants, make small changes and redeploy. Sound familiar?

Monitoring the Monitors

By that, I do not mean literally monitoring the monitoring systems (and yes, that is a thing!). I mean constantly evolving your monitoring solutions. As time goes by, you will learn that some of the features your tenants asked for are not used anymore or, even worse, that an event trigger fires so often that it is ignored.

Any dashboard that is rarely used, or any event that triggers so often that it is ignored, should be put up for change or removal altogether. The experience shared by large organisations such as Google, and by some startups, tells us that the smaller and more targeted your monitoring system is, the more useful it becomes.
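
One way to run such a review could look like the Python sketch below. It assumes you can export, for a review period, how often each alert fired and how often each dashboard was viewed; the function name and both thresholds are invented for illustration.

```python
def review_candidates(alert_fire_counts, dashboard_view_counts,
                      max_fires=50, min_views=5):
    """Return (noisy_alerts, unused_dashboards) for one review period.

    Both thresholds are hypothetical; pick values that suit your own tenants.
    """
    noisy = sorted(name for name, fires in alert_fire_counts.items() if fires > max_fires)
    unused = sorted(name for name, views in dashboard_view_counts.items() if views < min_views)
    return noisy, unused


noisy, unused = review_candidates(
    {"disk_full": 3, "cpu_high": 400},   # made-up counts
    {"login": 120, "legacy_batch": 0},   # made-up counts
)
print(noisy, unused)  # ['cpu_high'] ['legacy_batch']
```

Everything on either list is a candidate for a conversation with the tenant who asked for it, not for automatic deletion.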

Monitor the things you need and nothing else

 

 

 

Bittersweet Validation

In December of 2015, I thought I had a great idea: a voice-enabled banking device that shows you how far you are from your savings goal, be it a vacation or a car purchase, or even alerts you if you are about to overdraw.

I tried to sell the idea, and even to patent it, through my previous employer’s “Innovation Department”, which gave me the standard “we are not an IT business”, and somehow that director thought IoT was health-related only! Luckily, one of the great minds there (Greg O’Toole) saw the conversation online and was super enthusiastic. He contacted me and we sprang into action: I wrote a blog post on using the Pi and a cloud backend, and he built a full moving robot with the help of his daughter! It was fun, and then life took us in different directions.

More than a year later, while I was listening to the Cloud-Cast podcast on my way to work, the topic was voice-enabled apps and, “oh no, dear lord!”, it was my idea. It seems Capital One has already started doing this. At first I had mixed emotions: I was excited by the validation, yet frustrated that I never got the glory of being there first. But as the day went on, I was just excited (wait till you see the cool stuff we are working on these days!).

As my 4-year-old, who cannot stop watching ‘Cars’, would say...

Ka-chow! 🙂

 

Beware The Software Developer Purge!

 

How many times have you heard that the world is running out of developers? That is usually followed, a few years later, by the statement that developers will become extinct!

I recall reading an article saying that Ronald Reagan’s Star Wars program had created a shortage in this new field called software development. I thought then that my Commodore 64 was not just my best teenage companion but might also be my ticket out of my small town in the south of Egypt and into the big shiny world! Only a few geeks like me really cared about computers then.

But more than 10 years later, when I arrived in Canada, Microsoft certification was all the rage (second only to day trading), and everyone I knew was in “IT”.

When the tech bubble burst in 2001, the market purged itself pretty well. Some of my motorcycle buddies who worked for Microsoft went back to being accountants, club bouncers and managers. Then the pendulum swung too far to the other side, and by 2007, if you were a software developer, you were doomed: your job was going overseas. Somewhere, somehow, someone decided that writing software was something that should be done for $30/hr, and nobody wanted to study software anymore. The dark ages of coding loomed!

Those who were talented were pushed to the side, and yet there were still a lot of “Computer Blue Collar” jobs, you know: the Unix admins, the DB admins, the operations guys. Those were the guys who still had to stay here (not overseas), and they were hired by every government, bank and enterprise in the country.

Then in 2008, in the middle of that horrific economic gloom, Mr. Steve Jobs came up with a magical glass rectangle he called an iPhone, and BOOM, software was back, big time, this time riding waves of social media, analytics, Big Data, cloud, IoT, FinTech, smart homes, self-driving cars, and on and on.

Albeit 15 years late, the “New Economy” we dreamt of in the 90s is here, and suddenly if you are not in software you are losing big time. While creating a shortage of talent, it also played quite a trick on those “Blue Collar” IT jobs of the past 10 years: nobody wants Unix admins anymore. Those who were purged by the economy in 2001 were purged once again by technology from 2015 onwards. The cycle repeats!

Coding (software engineering) is funny this way. You can argue it is a science, but it is deeply rooted in talent. So while the schools are churning out larger and larger numbers of software engineers, that does not really mean an increase in talent; it just means we are preparing the next generation of IT blue-collar workers. If history is any judge, eventually something will happen and only the talented will stay.

I feel truly privileged to be alive in this environment, where I am surrounded by talented people at work. I love being with them, and in all honesty some of them do not appreciate it, as they have not lived through the dark years that followed the tech boom and the financial collapse!

Software may be popular these days, and it may become (though it is highly unlikely) less popular in the future, but for the few who truly enjoy working with it, it will never be a comfortable blue-collar job. It will always be a passion and a joy.

Much as my Commodore 64 was!

Ahmed.

 

Beware, The one handed Architect.

“Give me a one-handed economist! All my economists say, ‘On the one hand... on the other.’”
Harry S. Truman

That was a frustrated U.S. president dealing with economists on issues like balancing the budget or running deficits, raising or lowering taxes, spending more or spending less. Each of those choices brings benefits, and each has some dangerous downsides.

The key to a successful economy is not making one choice over another but rather finding a balance. Oh wait, scratch that: it is to “make a good guess” at a balance, apply it, monitor carefully and keep rebalancing.

These days I am going through the amazing and exhilarating experience of leading a team in deploying our first Big Data solution (side note: I fully understand that the words “data lake” and “big data” are now as corny for software as “taking walks on the beach” is for dating sites!).

In this experience we are trying to strike a balance between Agile methodology and the existing (two-speed) enterprise, between data and microservices, between batches, micro-batches and data stores. Everything we do is a learning experience, and being almost-agile has meant a whole lot of learning and re-learning for everybody: developers, managers, tech leads, senior leadership and customers.

agile_waterfall_triple_constraint
Source: “Waterfall, Agile & the Triple Constraint”

As we are finally seeing success, with great results signified by smiles and “yay” moments at some times and deep sighs of frustration at others, I am sitting here reflecting a bit.

Almost every decision we made was a choice between alternatives, none of which was perfect (mostly, at least). We had to do a good deal of spiking, especially in the early stages. Even then, we sometimes took the best option and sometimes the lesser of two evils, and in at least one case that choice had to be revised down the road.

The conclusion, and the reason for writing this blog post, is this: as software systems become more and more complex, it is very dangerous to think that we have one answer for every question. We all need to stay ‘two-handed’. As architects we must:

  • Understand that there are always alternatives.

  • Apply the 80/20 rule: find a ‘good enough’ choice instead of chasing an elusive perfect choice.

  • Get something going quickly, and monitor very carefully the results on users, operations and the whole enterprise ecosystem.

  • Adjust accordingly, Adjust often.

  • Do not be afraid to change your views.

  • Trust your guts and experience, but keep listening to others.

As my volleyball coach used to say … “Stay Loose” .. eye on the ball and stay loose.

We live in some exciting times indeed.

 

 

The future of software documentation.

The topic of code documentation has been on my mind lately. As a team, we have decided to use Markdown so that we can keep both our code and our documents in the same repository, and all that good stuff.

But in my first ‘video blog’ I decided, “conveniently”, to discuss the use of video as a medium for software documentation!

They say that you are never satisfied with your first video blog, and I am no exception. But they also say that unless you make them and post them, you cannot improve.

So here it is :).

Banking Internet-Of-Things!

I have been working on this idea to have a device connected to your bank account, or to a ‘pool’ inside your bank account.

I first spent some time discovering the wonderful world of GPIO programming on the Raspberry Pi, which was surprisingly easy, as you can see in this video.


So now it is time to put some infrastructure under this baby. I have built my solution on Liberty Profile on Bluemix (a RESTful API) that connects to MongoDB hosted by MongoLab.

Pasted_Image_2016-02-02__10_14_PM

So now the backend is ready.

You can check your account with the following URL:

http://coolbank.mybluemix.net/iotServer/mongo/Pools/sub002

This will give you the sub-account balance for the id ‘sub002’.

The response will look something like this:

{
  "balance": 850,
  "_id": {"$oid": "56a514e5e4b0b62ca30bdcbd"},
  "id": "sub002",
  "label": "Cuba Vacation",
  "soft": true,
  "target": 2800
}

So that is all the device needs to know about your progress (you have saved $850 out of the $2800 target).

Now, the traditional banking APIs would be something like the ‘deposit’ action.

Note that this is a ‘POST’, not a ‘GET’.

So now that we have deposited $1000 into our account, let us see how our Cuba saving is doing. We issue the same query as earlier, and the result comes back as:

{
  "balance": 930,
  "_id": {"$oid": "56a514e5e4b0b62ca30bdcbd"},
  "id": "sub002",
  "label": "Cuba Vacation",
  "soft": true,
  "target": 2800
}

This new number (930) means that we added $80 to our Cuba pool. The secret is hidden in the ‘Account’ rules. The account contains a rule that says:

For every deposit in account act001, 8% will go into pool sub002

{
  "id": "account001",
  "type": "chq",
  "pools": [
    ….
    {
      "poolid": "sub002",
      "freecash": false,
      "percentage": 8
    },…..
  ],
  "balance": 2000,
  "total": 4475
}
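
The percentage rule can be sketched in a few lines of Python. This is not the code running on the server, only an illustration of the arithmetic; `allocate_deposit` is a made-up helper, only the one pool rule visible above is included, and how the real service updates the account’s own `balance` and `total` is not shown in this post, so the sketch leaves them alone.

```python
def allocate_deposit(account, amount):
    """Return {pool id: share of the deposit} for each percentage rule on the account."""
    return {
        rule["poolid"]: amount * rule["percentage"] / 100
        for rule in account["pools"]
        if "percentage" in rule
    }


# Documents shaped like the JSON above (only the visible pool rule is included).
account = {
    "id": "account001",
    "type": "chq",
    "pools": [{"poolid": "sub002", "freecash": False, "percentage": 8}],
    "balance": 2000,
    "total": 4475,
}
pool = {"id": "sub002", "label": "Cuba Vacation", "balance": 850, "target": 2800}

shares = allocate_deposit(account, 1000)
pool["balance"] += shares[pool["id"]]
print(pool["balance"])  # 930.0
```

A deposit of $1000 with an 8% rule moves $80 into the pool, which matches the jump from 850 to 930 in the two responses.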

So now the back end is ready for use, and all I need to do is have my Raspberry Pi light up red, yellow and green to reflect the status of my cool bank account.
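
The colour logic on the device side could be as simple as the Python sketch below. The thresholds (yellow from 50% of the target, green once the target is reached) are my own placeholders, and the GPIO part is deliberately left out: the function only decides which LED should be on.

```python
def savings_status(balance, target, yellow_from=0.5, green_from=1.0):
    """Map savings progress to an LED colour; the thresholds are placeholders."""
    if target <= 0:
        raise ValueError("target must be positive")
    progress = balance / target
    if progress >= green_from:
        return "green"
    if progress >= yellow_from:
        return "yellow"
    return "red"


print(savings_status(930, 2800))  # red
```

With the pool response shown earlier (930 saved out of 2800, about a third of the way), the device would still show red under these assumed thresholds.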

And every time you deposit money into this account, the savings will accumulate.

Pasted_Image_2016-02-02__10_33_PM

Now that the API is open on Bluemix, please do not deposit a million dollars. I have not error-proofed this. Remember, I have a busy job and a three-year-old, and time is a precious commodity here, so be nice :).

 

Designing Analytics Solutions

Analytics vs. Traditional business App.

Analytics applications have some interesting characteristics that really set them apart from our normal transactional business applications.

  • They must contain a strong visualization layer (what is the benefit of an analysis that is not communicated well to the user?).
  • They usually deal with multiple, changing data sources, so decoupling the data sources from the presentation schema is of great benefit.
  • They “usually” contain an exploratory phase, which adds an interesting twist to both traditional and modern (Agile) approaches to development.

Proposed Architecture of Open Data Solution.

In my previous posts, the data analytics was done directly on CSV data, assuming a limited number of users and very limited resources to build the analytics (volunteer/hackathon type).

But if we start to assume that we have a larger base of users, a faster performing application will be required that can scale well to serve hundreds and thousands of clients per hour, per day.

The proposed Architecture will look like this

Slide5

This Architecture satisfies the requirements of separating the schema coupled with the Client analytics interface from the Data processing in the backend.

The schema provides a very responsive Solution that can scale very well either on a cloud or on-premise hosted solution.

Adding external real-time resources

The solution could be enhanced further by adding other sources of data that do not require ‘batch’ processing, such as social network sentiment analysis or other available services (weather, financial services, etc.).

Slide6

Business use of Open Data

The final variation on the previous design is to allow the use of such a solution inside a business (typically a credit organization in a bank that is looking into small business loans).

Slide7

The solution will allow a bank employee to use Open Data to quickly analyze the potential of a new business opportunity and mix that with the bank’s own information about the client to build a holistic approach to evaluating the customer’s request.

online-privacy

Concerns ??

One thing in particular really jumps out at me from the previous example: some of the data regarding previous businesses in the city (that opened and closed) includes PII (client name and phone number).

Now, would a bank want to know if the client has been involved in previous licenses that were canceled after a certain amount of time?

The answer is yes. The bank probably should know, the client probably must disclose the fact, and the data could be retrieved from previous interactions anyway.

Nonetheless, other situations, especially ones that combine health history with insurance conditions, may be of high concern.

What is not a PII concern in a regular application could quickly turn into a PII nightmare when mixing different sources of data, especially with the prominence of social networking apps and the ability of businesses to mine such data.

 

 

 

 

Plotting Economic indicators on City Open Data

In my previous post I discussed the steps I took to deploy my IPython notebook to IBM bluemix allowing users to access the tool through a browser without the need to install anything on their computers.

The next step was to modify the solution to allow for

  • Plotting the TSX year-end closing values against any business license graph.
  • Adding a second postal code field to allow comparing the license numbers of two postal codes in one graph.
  • An enhanced ‘Save’ button that saves the graph so you can download it to your computer and use it in presentations, documents, etc.
EATING ESTABLISHMENT_M4K_1988_2015
Enhanced output comparing “Eating Establishments” licenses for two postal codes, with the TSX yearly closing plotted in black.

Simple graphs like the one above allow us to draw some quick conclusions:

  • M5V (downtown) postal code licenses are closely related to the TSX movements.
  • This correlation has weakened somewhat since the great recession of 2008 (business shock?!).
  • M4K (Danforth) postal code licenses seem to be steady in their issuance/cancellation rate, indicating that the area is resilient to economic shocks.
  • Both areas (like most other businesses) faced a high rate of cancellation in 2005 (maybe an effect of a change in regulations?!).
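
The conclusions above come from eyeballing the graphs. One way to put a number on “closely related” would be a Pearson correlation between the yearly TSX closings and the yearly license counts; the notebook does not do this, so the Python sketch below is only a suggestion, and the sample values are toy numbers, not real TSX or license data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Toy numbers only: an index that rises every year and counts that rise with it.
toy_index = [100, 110, 120, 130]
toy_licenses = [10, 12, 14, 16]
print(round(pearson(toy_index, toy_licenses), 3))  # 1.0
```

Computing the coefficient separately for the years before and after 2008 would be one way to check the “weakened since 2008” impression.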

Using the tool

estep1

  • Give it a few seconds to finish opening, then select ‘cell’ -> ‘run all’
  • step3
  • Use the tool to select
    • The Category you want to study
    • Optionally specify 3 Letter postal code for an area
    • Optionally specify a second postal code to compare
    • Select which data you want (Issued/Canceled/total).
  • You can now click the ‘save as png’ to save the data plot
  • estep2
  • Once you do that you can go back to the ‘home’ tree view
  • http://toronto-business-licenses.mybluemix.net/tree
  • You will now see the file saved as Category_Postal_startyear_endyear.png as below
  • estep3
  • You can download and save that file for your own use.
  • EATING ESTABLISHMENT_M5V_1980_2015

Summary

Adding external data such as TSX closing numbers, GDP data, interest rates and the inflation index to the existing Open Data available from the government could paint a more complete picture in our analysis, showing the effect of the overall economy on both business and social changes in the city.

 

Analytics in the Cloud !.

Deploying the solution to the public.

In the context of being a data activist providing analytics to a small community (one city or region) and not expecting high traffic on that service, it does make sense to keep the solution running in Python and try to deploy the IPython notebook server to a cloud.

In future posts I will deal with solutions for high traffic, business-type applications.

PAAS vs IAAS

I am tempted to whip up my own charts on the differences between Platform as a Service (PAAS) and Infrastructure as a Service (IAAS), but I will resist beating that dead horse and instead repost this good chart by the folks at IBM Cloud on Twitter https://twitter.com/IBMcloud.

CA_ymmtWkAAZ3iu
Cloud Deployment Models

Building on the chart above, the steps required to deploy my Python notebook into a cloud solution are summarized in the following diagram.

cloud models
Steps required to deploy Ipython Notebook into a cloud solution.

The benefits of deploying the solution into IAAS are:

  • Higher degree of control over the specific configuration of the notebook.
  • Full usage of the underlying operating system services (or maybe this is a disadvantage!).

The benefits of deploying the solution into PAAS are:

  • Faster time-to-customer.
  • Blocking users from making permanent changes to the notebook files.
  • A quick deployment model that allows pushing changes to the service often (continuous deployment).

Deploying the Solution

For the reasons outlined above, I decided to deploy my solution on IBM Bluemix ‘Platform as a Service’, and the starting point is the Python service provided by Bluemix.

To deploy my solution I pretty much followed the steps outlined by Peter Parente in his post on IBM DW:

Run IPython Notebook on IBM Bluemix

The most important tweaks I made were:

  • Changed the memory requirement in the manifest.yml to 1GB (the application consumes a good amount of memory and crashes with the default 128MB when it loads the initial CSV).
  • Allowed for automatic deployment from a Git repository so I can push my changes from multiple locations, since the ‘download’ function of Cloud Foundry was a little bit unpredictable.

The final Result

Now I can ask the business user who wants to study the City of Toronto business license issuance/cancellation data to go to the following web page:

Toronto-business-licenses.mybluemix.net/notebooks/User Ready Business License Explorer.ipynb

You will be prompted for a password

login
enter password to log into notebook

Use the password : opensecret

The Python notebook will show up. Then follow these steps:

  1. Select the first cell (red arrow) and click the Run button as shown.
  2. A button called ‘toggle code’ will show up; click it to hide the code from now on.
  3. Once the code disappears, select ‘run all’ from the menu and wait a few seconds while the data is loaded.

Voilà, that is all you need. Now you can slice and dice the city licensing data using the Python widgets provided.

step4
Analytics tool deployed in IBM bluemix

Conclusion

Through the past 4 blog posts I have examined the use of  IPython notebooks  as a tool to analyze government open data and provide answers to ad-hoc questions that could be used to empower communities and citizens and enhance collaboration between the government and the public.

Using cloud technologies like IBM bluemix the use of these analytics tools can be expanded to a larger population and used to allow the community to explore the existing data and build knowledge about the different aspects of the data available and how it may affect their communities.

 

 

City of Toronto open data and business decision making

From Ad-hoc to App.

In my two previous blog notes I discussed the three types of analytics: ad-hoc, software (app) and hybrid.

In this blog note we will do a little more analysis on the business license data from the City of Toronto, creating an interface that helps someone who is not necessarily a Python programmer analyze the city’s open data.

 

Bank-Account-Opening-or-Mortgage-Signing
A tool for analyzing data that will be used by the public should not require knowledge of Python coding.

 

Creating a user interface

Python has a few libraries that allow the creation of some interesting user interface widgets. I have chosen ipywidgets just to illustrate the ability to create a user interface that lets the user do the following with the data:

  • Select a certain category of licenses to view.
  • Select a 3-letter postal code (representing the area) or ALL for the whole city.
  • Use sliders to choose the year range to look at.
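
Behind the widgets, the three selections boil down to a filter over the license records. The Python sketch below shows that filter on plain dictionaries; it is not the notebook’s actual code, and the record keys (`category`, `postal_code`, `year`) are names I made up for the example rather than the real column names in the city’s CSV.

```python
def select_licenses(records, category, postal="ALL", years=(1980, 2015)):
    """Filter license records by category, 3-letter postal prefix and year range.

    Each record is assumed to be a dict with the hypothetical keys
    'category', 'postal_code' and 'year'. `years` is an inclusive (first, last) pair.
    """
    first_year, last_year = years
    prefix = postal.upper()
    return [
        r for r in records
        if r["category"] == category
        and (prefix == "ALL" or r["postal_code"].upper().startswith(prefix))
        and first_year <= r["year"] <= last_year
    ]
```

In a notebook, a function of this shape can be handed to the `interact` helper from ipywidgets with a dropdown for the category, a text box for the postal code and a range slider for the years, so that every change in the widgets re-runs the selection.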

So let us look at the ‘Greek restaurant’ scene in Toronto by focusing our search on the Danforth area, for example (using postal code M4K, see the map below).

 

input
The ipywidgets used to interact with the user and matplotlib diagrams
danforth
Borders of M4K postal code

Using the “interact” capabilities of the widgets, once you select the category, year range and postal code, you will immediately see the plots showing you the results:

  • Number of issued licenses per year
  • Number of canceled licenses per year
  • The total (issued minus canceled) of new businesses opened in the postal code per year.

EATING ESTABLISHMENT_M4K_1994_2014

 

Now compare that analysis to the restaurants/eating establishments at the heart of the downtown business area. For that analysis we will use the postal code M5V (see the map below).

downtown
Borders of M5V postal code

We will focus our search on the 10-year period between 2004 and 2014.

EATING ESTABLISHMENT_M5V_2004_2014
Downtown Core
EATING ESTABLISHMENT_M4K_2004_2014
The Danforth

The graphs tell a story of two areas that respond to different events in different ways. It is apparent that in 2005 there was a change in licensing requirements that increased cancellations all over the city, and the Danforth was less resilient to the licensing changes. But when the economic crisis of 2008 hit, the downtown core was severely affected while the Danforth absorbed the shock relatively well, and most of the negative impact was also shifted one year later, with a very strong rebound in 2010.

Numerical Analysis

One of the features that will be useful for such analysis is to allow the users to print the numerical data used for the graphs.

input2

             issued  canceled  total
year                         
2004             75        46     29
2005             59        72    -13
2006             99        85     14
2007             62        90    -28
2008             53       114    -61
2009             61        59      2
2010             61        51     10
2011             66        45     21
2012             54        50      4
2013             50        53     -3
2014             63        49     14
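
The ‘total’ column is simply issued minus canceled for each year. As a small check, the Python sketch below recomputes it from the issued and canceled numbers printed above (the helper name is mine, not the notebook’s):

```python
# Issued and canceled counts copied from the table above.
issued = {2004: 75, 2005: 59, 2006: 99, 2007: 62, 2008: 53, 2009: 61,
          2010: 61, 2011: 66, 2012: 54, 2013: 50, 2014: 63}
canceled = {2004: 46, 2005: 72, 2006: 85, 2007: 90, 2008: 114, 2009: 59,
            2010: 51, 2011: 45, 2012: 50, 2013: 53, 2014: 49}


def net_new_licenses(issued, canceled):
    """Return {year: issued - canceled} for every year present in both inputs."""
    return {year: issued[year] - canceled[year]
            for year in sorted(issued) if year in canceled}


totals = net_new_licenses(issued, canceled)
worst_year = min(totals, key=totals.get)
print(worst_year, totals[worst_year])  # 2008 -61
print(sum(totals.values()))            # -11
```

The worst year in this table is 2008, and over the whole 2004 to 2014 range cancellations outnumber issued licenses by 11.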

 

Using the notebook and next steps.

If you want to run some of those analyses yourself, you can download the zip file from GitHub:

https://github.com/c0dingarchit3ct/Open_TO_BusinessLicenses

Use the ‘User Ready Business License Explorer’ notebook.

You will need the following libraries installed (the easiest way is using pip):

  • pandas
  • matplotlib
  • numpy
  • ipywidgets

But what about those who want to use these data ?

The next challenge will be delivering this solution without requiring the user to install IPython, pandas, numpy, ipywidgets, etc., and that will be the topic of my next post.