
Orzota call for Partners


IoT and Big Data are becoming essential to market growth and customer success. Enter Orzota’s call for partners. There are initiatives in all major verticals, including Manufacturing, Oil & Gas, Transportation, Retail, Financial, Insurance, and Life Science/Healthcare. Additionally, there are many pieces to the puzzle, spanning data architecture, technology, and the resources needed to deliver a successful Big Data & IoT program that benefits business users.

At Orzota we sit at the intersection of IoT and Big Data. As a Silicon Valley-based company, we aim to provide solutions that transform the way companies collaborate and derive value from data, whether it comes from sensors, machines, ERPs, websites, industry boards, social media, or beyond. To do so, we bring flexible platforms for Big Data and IoT that accelerate the delivery of solutions, and we support them through our Managed Services models. We’re harnessing open stacks and cloud technologies that provide the elasticity and economics to quickly generate ROI. Lastly, we augment projects with verified Data Architects, Data Engineers, and Data Scientists.

As a partner, you’re a technology or consulting provider that serves the mid-market and is looking to augment your service portfolio while adopting the latest in open stack technologies for Big Data and IoT. Apart from an excellent solution, you can expect the support of an experienced team that worked on the technology side of this domain at Yahoo and brings project experience from companies like Netflix, Boeing, and Bank of America, to name a few.

We’re all about making it easy, so email us at partners@orzota.com.

Implementing Big Data Analysis to Your Business

The conversation has shifted away from trying to define what Big Data is, but we’re stuck in the layers. Is Big Data fun, like eating an orange full of vitamin C? Or is it more like an onion, where you cry when you try to peel it, yet it’s so sweet when you cook it properly? I lean more towards an orange, and I will tell you why.

One important area is the learning curve. We are seeing efforts take more than six months without addressing any of the use cases, let alone figuring them out. There is a lot of innovation at the moment, but with it comes a lot of new terminology that takes effort to keep up with. More important is the use case. How are we going to evaluate what we need without knowing what the use cases are? A real-time data streaming scenario is very different from one where speed is not important and we can afford to see the results the next day.

Three Steps to get started with Big Data

  1. Start by identifying your vertical, what its demands are, and who your customers are. This will guide you in choosing the Proof-of-Concept (PoC) and therefore the use case(s). As a follow-up step, look at the technological assets you will need. From there, the next step is launching the PoC quickly. For companies that consider their data sensitive, follow this rule: decide which data you can live with having in the cloud, since you’re only testing the hypothesis at this stage and the cloud is the fastest route.
  2. Seamless integration. At this stage, you need a solution that is responsive. Why? Because multiple business units will come with requirements, and you need to be able to accommodate them. I can’t stress enough the value of a managed services approach until you can support the solution internally. Why? Because the learning curve is huge and the return on investment (ROI) is far better. Many fall into the trap of thinking that because a super enterprise takes a certain approach, the same approach is feasible for them. It will only yield frustration, as it will take longer. At Orzota BigForce we’re working hard to accelerate this stage.
  3. Insights. At the end of the day, you are doing all this for the insights, and more specifically for the predictive and hidden potential. Most likely you have a ton of reporting going on. Reports are not the issue here; if you cannot get the proper reports right now, then you need some serious help. Insights can be discovered internally but also externally, so kindly remind your hardest critics that it is not about how much data you have but about how much is out there from which you can derive these crucial insights for your business. A good example to put this argument to bed is social listening, and therefore Sentiment Analysis.

In conclusion, if you think about it, there isn’t much overlap if you start from the use case and Proof-of-Concept (PoC) approach while following the above steps. Starting small will also allow you to get buy-in and then expand. Partners may seem at first to offer similar services, but from the pole position you need to get more with less. Also keep in mind that for the majority the dynamics are different: there isn’t much Data Analysis talent out there, so a managed services or hybrid approach can accelerate your environment and team. Finally, the orange correlation in 3 easy steps: pick, peel, and consume. You can enjoy it, getting all the vitamin C and its benefits.


Sentiment Analysis Free Trial

We are pleased to announce a free trial of the Orzota BigForce Social solution. This solution, built on top of the Orzota BigForce platform, provides the capability to analyze text streams from Twitter, media sites, blog sites, etc. With its search capability, the solution gives data scientists a means of exploring social media data, with a focus on sentiment analysis. The sentiment analysis free trial will let you quickly determine whether such a solution can meet your needs.

Unlike many other sentiment analysis solutions, the Orzota BigForce Social solution can be customized to meet your needs. So sign up for the trial and reach out to us to understand how we can make this work for you!


Gossip, Chatter or Twitter Marketing?

Internships don’t always involve slaving around, getting coffee, or doing lowly tasks for your boss. In fact, my experience at Orzota didn’t involve any of that. During the summer months of July & August, I helped with Orzota’s digital marketing focusing on social media, specifically Twitter marketing. This post captures a small glimpse of that journey.



It is often believed that if anything consequential is happening in business or society, it has to be trending on Twitter. My main goal was to increase Orzota’s Twitter marketing footprint during the summer, and more specifically to increase the number of followers and the engagement on Orzota’s Twitter profile. My job was to find the optimal methods to accomplish this goal with influencers that matter to Orzota.

Getting Started with Twitter

As I was not extremely familiar with Twitter and its tools before this experience, I was forced to use the good old trial-and-error method. I started out by following as many relevant people as I could find in the big data Twitter community using the appropriate hashtags. While I got a few follow-backs, it was definitely not the most efficient way to do things and I quickly became aware of that.

When I finally gained the confidence a couple of days later to begin tweeting independently, I sent out my first tweet and BAM! I immediately gained four new followers. I was genuinely astounded at how much engagement one tweet received. I made it an objective to tweet around five times a day and ended up lining several of them up so they would be scheduled for the most effective time slots.

Twitter marketing is more of an art than science. One hindrance was the fact that the Twitter caption had to sell the message in 140 characters or less. A captivating sentence followed by a link to an article or picture is the most effective tweet. All I can say is that the art of that comes with practice; from what I observed, including appealing words that may cause the reader to feel a certain emotion—whether it be shock, excitement, or worry—is essential to hook the reader.

Engaging Twitter Followers

As I went through this process daily, I made sure to record the metrics of the previous day so I could keep track of my progress. I was able to acquire quite a bit of knowledge about widespread topics such as the internet of things (IoT), cloud, data science, and obviously big data. I was gaining many followers a day through tweeting regularly. However, I quickly encountered another issue: several people would not follow back, assuming that the Orzota Twitter handle was a bot. I eventually realized that it was necessary to engage with my followers. So, as I was advised, I began replying to people, favoriting tweets, and retweeting articles I enjoyed. Although this didn’t make a significant impact, it definitely got more people to follow me. As for the types of articles people seemed to like more, I noticed that most people enjoyed articles that had an impact on their personal lives. The more they related to it, the more engagement the tweet got.

Twitter Tools

Here is a list of the top four free resources (in no particular order) I found to be particularly useful:

TweetDeck—this tool allows a clear view of all activity happening on your account so nothing gets missed; it is also great for managing multiple accounts at once

Twitter Analytics—this is a feature embedded in Twitter itself; it records your activity on Twitter while keeping track of all the exact numbers (in terms of impressions, new followers, and top tweets); permits one to see how their numbers have progressed over the months

WhoUnfollowedMe?—this catches any followers who are only temporary and decide to unfollow later on

Crowdfire—this is just a general tool for social media growth; it includes a variety of features including automating direct messaging, accounting for all new followers and unfollowers, and searching for potentially interested followers

Twitter Marketing

What exactly is Twitter marketing anyway? Besides just gaining more Twitter followers, how does it benefit the actual company, its product, or its brand? Well, for one, when people hear about a company and come across its Twitter page, they look at the frequency of the tweets and the number of followers to determine its credibility. Another benefit is that some of these followers are prospective customers. Thus, segmenting and categorizing followers and sending them targeted, personalized direct messages based on their preferences may help spark an interest in the company’s product(s).


All in all, this was an extremely meaningful experience for me as well as my employer, enabling Orzota to gain over 200 new Twitter followers since my internship began. I had fun with words, unleashing my inner creativity and hitherto unknown marketing capability. I gained new insight into how to directly market not only a product but also a company brand as a whole, figured out the best way to maneuver through Twitter, and, thanks to Orzota, became acquainted with today’s hottest technology topics.

Workforce Analytics Solution

We recently worked with a leading Hi-Tech manufacturing company to design and implement a brand new scalable and efficient workforce analytics solution targeted at the mobile workforce.

The solution is designed to raise the workers’ confidence bar and to minimize the effort required to train them. It also improved manpower utilization by optimizing inventory adjustments with higher accuracy while fulfilling orders, and it reduced the learning curve for workers, resulting in a substantial reduction in training hours.

Workforce Analytics Solution Overview

The Workforce Analytics solution was built on a Common Data Analytics Platform leveraging Hortonworks HDP 2.4 and used the following technologies: Kafka, Storm, HBase, HDFS, Hive, Knox, Ranger, Spark and Oozie.

The platform collects real time data from the application on mobile devices, stores it, and runs analytics with better performance and lower latency compared to their prior legacy system.

The HDP components at a glance:
Workforce Analytics Solution HDP Components

Workforce Analytics Architecture

The operational real-time data is collected using Kafka and ingested into HDFS and HBase in parallel using Storm (see diagram below). HBase acts as the primary data store for the analytics application. The data in HDFS is encrypted and reserved for other applications. Based on the business logic, the data stored in HBase is processed using Spark on a daily, weekly, monthly and yearly basis, and stored back into HBase as a feed for Spark Analytics (Spark SQL). Spark Analytics is used to run jobs to generate specific insights. The output from Spark Analytics is stored in Hive as a temporary table. Hive Thrift Server is used to execute queries against Hive and retrieve the results for visualization and exploration using Tableau. Custom dashboards were also built for business users to help them track higher-level metrics.

Workforce Analytics - Architecture
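
The periodic aggregation step described above can be illustrated with a small sketch. This is not the project's Spark job: it is a plain-Python stand-in for the idea of rolling the same events up at daily, weekly, monthly and yearly granularity. The event layout (timestamp, worker id, value) and the names `EVENTS`, `period_key` and `rollup` are assumptions made for illustration only.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical event shape: (ISO timestamp, worker id, numeric value).
EVENTS = [
    ("2016-07-04T09:15:00", "w1", 3),
    ("2016-07-04T17:40:00", "w1", 2),
    ("2016-07-05T08:05:00", "w2", 4),
    ("2016-08-01T10:00:00", "w1", 1),
]


def period_key(ts, granularity):
    """Map a timestamp to the bucket it falls into for one granularity."""
    if granularity == "daily":
        return ts.strftime("%Y-%m-%d")
    if granularity == "weekly":
        iso_year, iso_week, _ = ts.isocalendar()
        return f"{iso_year}-W{iso_week:02d}"
    if granularity == "monthly":
        return ts.strftime("%Y-%m")
    if granularity == "yearly":
        return ts.strftime("%Y")
    raise ValueError(f"unknown granularity: {granularity}")


def rollup(events, granularity):
    """Sum event values per (period, worker) for the given granularity."""
    totals = defaultdict(int)
    for raw_ts, worker, value in events:
        ts = datetime.fromisoformat(raw_ts)
        totals[(period_key(ts, granularity), worker)] += value
    return dict(totals)


for granularity in ("daily", "weekly", "monthly", "yearly"):
    print(granularity, rollup(EVENTS, granularity))
```

In a setup like the one described, each rollup would be written back to the data store so that the query layer reads pre-aggregated rows instead of scanning raw events.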

To address security requirements, Apache Knox and Apache Ranger were used for perimeter security and access control, respectively. Both are included as a part of HDP 2.4 and are configured in the Access Node.

Workforce Analytics Physical Architecture

The figure below shows the physical layout of the services on the various servers used. The architecture comprises Edge Nodes, Master Nodes and Slave Nodes. Each set of nodes runs a variety of services.

Workforce Analytics Physical Architecture

Issues and Solutions

While implementing this solution, we ran into a variety of issues. We outline some of them here in the hope that they may help others who are designing similar architectures with the Apache Hadoop or Hortonworks HDP ecosystem of components. Table creation, user permissions and workflows were the common focus areas.

HBase Table Creation

We ran into permission issues with HBase table creation.

Solution: In Apache Ranger, update the HBase policy to give the defined user the appropriate read, write and create permissions.
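
As a rough illustration of what such a policy contains, the sketch below builds a policy body in the JSON layout used by Ranger's public REST API (v2). It only constructs and prints the payload; nothing is sent. The service name, policy name, user and table are hypothetical, and the field names should be checked against the Ranger version in your distribution before relying on them.

```python
import json


def hbase_policy(service, policy_name, user, table):
    """Build a Ranger-style policy body granting read/write/create on one HBase table."""
    access_types = ["read", "write", "create"]
    return {
        "service": service,          # name of the HBase service defined in Ranger
        "name": policy_name,
        "isEnabled": True,
        "resources": {
            "table": {"values": [table]},
            "column-family": {"values": ["*"]},
            "column": {"values": ["*"]},
        },
        "policyItems": [
            {
                "users": [user],
                "accesses": [{"type": t, "isAllowed": True} for t in access_types],
            }
        ],
    }


# All four argument values below are placeholders, not names from the project.
payload = hbase_policy(
    "cluster_hbase", "workforce-table-access", "analytics_user", "workforce_events"
)
print(json.dumps(payload, indent=2))
```

The same permissions can be set through the Ranger admin UI, which is what the fix above describes; the payload form is mainly useful when policies need to be scripted or reviewed.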

Connection to Hive Thrift Server

Another issue we ran into involved connections to Hive Thrift Server for a particular user “ABC”.

Solution: Ensure that the below properties are added to $HADOOP_CONF/core-site.xml
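
The exact properties are not shown on this page. Purely as an assumption, connection problems of this kind are often resolved with Hadoop's proxy-user (impersonation) settings, `hadoop.proxyuser.<user>.hosts` and `hadoop.proxyuser.<user>.groups`. The sketch below generates those two entries in core-site.xml property form. Whether these were the properties used here, which user they should name, and how permissive the values should be are all assumptions; the wildcard `*` is shown only for brevity.

```python
import xml.etree.ElementTree as ET


def proxyuser_properties(proxy_user, hosts="*", groups="*"):
    """Return <property> elements for Hadoop proxy-user settings (assumed fix)."""
    settings = {
        f"hadoop.proxyuser.{proxy_user}.hosts": hosts,
        f"hadoop.proxyuser.{proxy_user}.groups": groups,
    }
    elements = []
    for name, value in settings.items():
        prop = ET.Element("property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
        elements.append(prop)
    return elements


# "ABC" is the placeholder user name from the text above.
for prop in proxyuser_properties("ABC"):
    print(ET.tostring(prop, encoding="unicode"))
```

In practice the hosts and groups values would normally be restricted to the machines and groups that actually need impersonation rights.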



Oozie Workflow Job Submission

Permission errors continued to plague the project while creating workflows in Oozie.

Solution: The required setting needs to exist in the corresponding job definition in workflow.xml, within the <shell xmlns="uri:oozie:shell-action:0.2"> element.

Oozie Workflow Job Stuck in PREP State

When we re-ran an Oozie workflow job after a period of time, it went into the PREP state and did not execute. When we tried to kill the job via the CLI, the Oozie log showed that the job was successfully killed:

USER [test] GROUP[-] TOKEN[-] APP[-] JOB[ABCDEF] ACTION[] User test killed the WF job ABCEDEF-oozie-oozi-W

However, in the Oozie UI, the job is still shown to be in PREP state.

Solution: Further research showed that the Oozie database at the backend (Derby by default) was corrupted, and was not representing the correct state of the jobs.

We decided, for longer term stability, to migrate from Derby to MySQL as the backend database for Oozie. After this migration, we did not run into this issue again.
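
For orientation, the sketch below lists the oozie-site.xml settings that typically point Oozie's JPAService at a MySQL database. It is an assumption-laden outline rather than a record of the configuration used in this project: the host, database name, credentials and the Connector/J driver class shown are placeholders, and a real migration also involves creating the database and schema, which is not covered here.

```python
def oozie_mysql_properties(host, database, username, password, port=3306):
    """Return the JPAService settings that switch Oozie's backing store to MySQL."""
    return {
        # Driver class for MySQL Connector/J 5.x; newer connectors use a different class name.
        "oozie.service.JPAService.jdbc.driver": "com.mysql.jdbc.Driver",
        "oozie.service.JPAService.jdbc.url": f"jdbc:mysql://{host}:{port}/{database}",
        "oozie.service.JPAService.jdbc.username": username,
        "oozie.service.JPAService.jdbc.password": password,
    }


# Placeholder values for illustration only.
properties = oozie_mysql_properties(
    "db.example.internal", "oozie", "oozie_user", "change-me"
)
for name, value in properties.items():
    print(f"{name} = {value}")
```

Each name/value pair corresponds to one property entry in oozie-site.xml; on an Ambari-managed HDP cluster the same values are usually entered through the Oozie configuration screens instead of by hand.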


Big data projects can grow and evolve rapidly. It’s important to realize that the solution chosen must offer the flexibility to scale up or down to meet business needs. Today, in addition to commercial platform distributions such as Hortonworks and Cloudera, higher level tools and applications simplify the process of developing big data applications. However, as seen by some of the issues we describe above, expertise in the underlying technologies is still crucial for timely completion of projects. Orzota can help. Please contact us.