What You Gain from a Well-Defined Enterprise Data Strategy

We find most businesses are eager to dig into their data. The possibility of solving persistent problems, revealing strategic insights, and reaching a better future state has mass appeal. However, when those same businesses insist on pursuing the latest and greatest technology without a game plan, we pump the brakes. Analytics without an enterprise data strategy is a lot like the Griswolds’ journey in National Lampoon’s Vacation. They ended up at their destination…but at what cost? Beyond a more streamlined process, here is what your business can expect from putting the time into defining your enterprise data strategy.

Verification of Your Current State

Data is messier than most businesses imagine. Without a robust governance strategy, data practices, storage, and quality issues often morph data beyond recognition. That’s why we recommend that any business interested in enterprise analytics first audit their current state – a key component of an effective enterprise data strategy.

Take one of our former enterprise clients as an example. Prior to our data strategy and architecture assessment, they believed there were no issues with their reporting environment. The assumption was that all of their end users could access the reporting dashboards and that there were negligible data quality issues. Our assessment uncovered a very different scenario.

Right off the bat, we found that many of their reports were still produced manually. Plenty of business users were waiting on reporting requests. There were both data quality and performance issues with the reporting environment and their transactional system. With that reality brought to the forefront, we were able to effectively address those issues and bring the client to an improved future state.

Even in less drastic instances, conducting the assessment phase of an enterprise data strategy enables a business to verify that assumptions about their data architecture and ecosystem are based in fact. In the long run, this prevents expensive mistakes and maximizes the future potential of your data.

Greater Need Fulfillment

An ad hoc approach to data restricts your return on investment. Without an underlying strategy, data storage architecture or an analytics solution is at best reactive, addressing an immediate concern without consideration for the big picture. Enterprise-wide needs go unfulfilled and the shelf life of any data solution lasts about as long as a halved avocado left out overnight.

A firm enterprise data strategy can avoid all these issues. For starters, there is an emphasis on holistic needs assessment across the enterprise. Interviews conducted with management-level stakeholders and a wide array of end users help to gain a panoramic view of the pain points data can help solve and the opportunities it can unlock. This leads to fewer organizational blind spots and a greater understanding of real-world scenarios.

How do you gain an enterprise-wide perspective? Asking the following questions in an assessment is a good start:

  •   What data is being measured? What subject areas are important to analyze?
  •   Which data sources are essential parts of your data architecture?
  •   What goals are you trying to achieve? What goals are important to your end users?
  •   Which end users are using the data? Who needs to run reports?
  •   What manual processes exist? Are there opportunities to automate?
  •   Are there data quality issues? Is data compliant with industry regulations?
  •   What are your security concerns? Are there any industry-specific compliance mandates?
  •   Which technologies are you using in-house? Which technologies do you want to use?

This is only the start of the process. Our own data strategy and architecture assessment goes in-depth with the full range of stakeholders to deliver data solutions that achieve the greatest ROI.

A Clearer Roadmap for the Future

How do your data projects move from point A to point B? The biggest advantage of a data strategy is providing your organization with a roadmap to achieve your goals. This planning phase outlines how your team will get new data architecture or analytics initiatives off the ground.

For example, one of our clients in the logistics space wanted to improve their enterprise-wide analytics. We analyzed their current data ingestion tool, SQL Server data warehouse, and separate data sources. The client knew their new solution would need to pull from a total of nine data sources ranging from relational databases on SQL Server and DB2 to API sources. However, they didn’t know how to bridge the gap between their vision and a real outcome for their business.

We conducted a gap analysis to determine what steps existed between their current and future state. We incorporated findings from our stakeholder assessments. Then, we were able to build out a roadmap for a cloud-based data warehouse that would offer reports for executives and customers alike, in addition to providing access to advanced analytics. Our roadmap provided them with timelines, technologies needed, incremental project milestones, and workforce requirements to facilitate a streamlined process.

With a similar roadmap at your disposal, you will start your organization on the right path to building out an effective data and analytics solution. 

Jason Maas


What is data build tool (dbt) and how is it different?

At 2nd Watch, we’re always keeping an eye on up-and-coming technologies. We investigate, test, and test some more to make sure we fully understand the benefits and potential drawbacks of any technology we may recommend to our clients. One unique tool we’ve recently spent quality time with is data build tool (dbt).


What is dbt?

Before loading data into a centralized data warehouse, it must be cleaned up, made consistent, and combined as necessary. In other words, data must be transformed – the “T” in ETL (extract, transform, load) and ELT. This allows an organization to develop valuable, trustworthy insights through analytics and reporting.

Dbt enables data analysts and data engineers to automate the testing and deployment of the data transformation process. This is especially useful because many companies have increasingly complex business logic behind their reporting data. The dbt tool keeps a record of all changes made to the underlying logic and makes it easy to trace data and update or fix the pipeline through version control.


Where does dbt fit in the market?

Dbt has few well-adopted direct competitors in the enterprise space, as no tool on the market offers quite the same functionality. Dbt does not extract or load data to/from a warehouse; it focuses only on transforming data after it has been ingested.

Some complementary tools are Great Expectations, Flyway, and Apache Airflow. Let’s take a closer look:

Apache Airflow

Airflow assists with ETL by creating automated processes, including pipelines and other operations commonly found in the orchestration workflow. It can integrate into a data warehouse, run commands, and operate off of a DAG similar to dbt’s; but it isn’t designed for full querying work. The dbt tool has a fleshed out front-end interface for query development and coding, whereas Airflow focuses more on the actual flow of data in its interface.

Flyway

Flyway is a version control system that tracks updates made to tables in a data warehouse. It doesn’t allow for editing, merely easing the migration process for teams with different sets of code. Flyway advances documentation in a separate environment, while dbt manages this via integrations with services like GitHub and DevOps.

Great Expectations

Great Expectations allows you to create comprehensive tests that run against your database, but it isn’t integrated with other ETL features. Unlike dbt, it doesn’t allow for any editing of the actual database.

What should you know about the dbt tool?

Dbt has a free open source version and a paid cloud version in which they manage all of the infrastructure in a SaaS offering. In 2020, they introduced the integrated development environment (IDE), which coincided with dbt pricing updates. Read more about the dbt cloud environment and dbt pricing here.

Dbt’s key functions include the following:

Testing

  • Dbt tests data quality, integration, and code performance. Data quality tests are built into the tool, and the others can be coded and run in dbt (automatically in some cases).
  • Create tests that check for missing/incomplete entries, unique constraints, and accepted values within specific columns (see the example test after this list).
  • Run scripts that execute automated tests and deploy changes once those tests pass. Notifications can be configured to be sent out if a test fails.
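
As a concrete illustration, here is a minimal sketch of a dbt “singular” test: a SQL file saved under the project’s tests/ directory that fails if the query returns any rows. The model name orders and the column order_total are hypothetical, not taken from the original post.

```sql
-- tests/assert_no_missing_order_totals.sql (hypothetical file and model names)
-- dbt treats any rows returned by this query as test failures.
select
    order_id
from {{ ref('orders') }}      -- ref() resolves to the orders model dbt builds
where order_total is null     -- flag incomplete entries
```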

Deployment

  • Dbt has a built-in package manager that allows analysts and engineers to publish both public and private repositories. These can then be referenced by other users.
  • Deploy a dbt project after merging updated code in git.
  • Updates to the server can run on a set schedule in git.

Documentation

  • Dbt automatically creates a visual representation of how data flows throughout an organization.
  • Easily create documentation through schema files.
  • Documents are automatically generated and accessible through dbt, with the ability to send files in deployment. Maps are created to show the flow of data through each table in the ETL process.

One other thing to know about dbt is that you can use Jinja, a templating language, in conjunction with SQL to establish macros and integrate functionality beyond SQL’s native capabilities. Jinja is particularly helpful when you have to repeat calculations or need to condense code. Using Jinja will enhance SQL within any dbt project, and our dbt consultants are available to help you harness Jinja’s possibilities within dbt.
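
As a rough sketch of how Jinja can condense repeated SQL logic in a dbt project, the macro and model below use hypothetical names (cents_to_dollars, stg_orders); they illustrate the pattern rather than any specific project:

```sql
-- macros/cents_to_dollars.sql: a reusable calculation defined once
{% macro cents_to_dollars(column_name) %}
    ({{ column_name }} / 100.0)
{% endmacro %}

-- models/orders.sql: the macro keeps the same math out of every model that needs it
select
    order_id,
    {{ cents_to_dollars('amount_cents') }} as amount_usd
from {{ ref('stg_orders') }}
```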

Where could dbt fit in your tech ecosystem?

As previously mentioned, dbt has a free open source version and a paid cloud version, giving your company flexibility in budget and functionality to build the right tech stack for your organization. Dbt fits nicely with an existing modern technology stack with native connections to tools such as Stitch, Fivetran, Redshift, Snowflake, BigQuery, Looker, and Mode.

With dbt, data analysts and data engineers are able to more effectively transform data in your data warehouses by easily testing and deploying changes to the transformation process, and they gain a visual representation of the dependencies at each stage of the process. Dbt allows you to see how data flows throughout your organization, potentially enhancing the results you see from other data and analytics technologies.

Are you ready to discuss implementing an enterprise data initiative? Contact one of our data consulting experts.


Accelerating Application Development with DevOps

If you moved to the cloud to take advantage of rapid infrastructure deployment and development support, you understand the power of quickly bringing applications to market. Gaining a competitive edge is all about driving customer value fast. Immersing a company in a DevOps transformation is one of the best ways to achieve speed and performance.

In this blog post, we’re building on the insights of Harish Jayakumar, Senior Manager of Application Modernization and Solutions Engineering at Google, and Joey Yore, Manager and Principal Consultant at 2nd Watch. See how the highest-performing teams in the DevOps space are achieving strong availability, agility, and profitability with application development according to four key metrics. Understand the challenges, solutions, and potential outcomes before starting your own DevOps approach to accelerating app development.

Hear Harish and Joey on the 2nd Watch Cloud Crunch podcast, 5 Strategies to Maximize Your Cloud’s Value: Strategy 2 – Accelerating Application Development with DevOps  


What is DevOps?

Beyond the fact that DevOps combines software development (Dev) and IT operations (Ops), DevOps is pretty hard to define. Harish thinks the lack of a clinical, agreed-upon definition is by design. “I think everyone is still learning how to get better at building and operating software.” With that said, he describes his definition of DevOps as, “your software delivery velocity, and the reliability of it. It’s basically a cultural and organizational moment that aims to increase software reliability and velocity.”

The most important thing to remember about a DevOps transformation and the practices and principles that make it possible is culture. At its core, DevOps is a cultural shift. Without embracing, adopting, and fostering a DevOps culture, none of the intended outcomes are possible.

Within DevOps there are five key principles to keep top of mind:

  1. Reduce organizational silos
  2. Accept failure as the norm
  3. Implement gradual changes
  4. Leverage tooling and automation
  5. Measure

Measuring DevOps: DORA and CALMS

Google acquired DevOps Research and Assessment (DORA) in 2018 and relies on the methodology developed from DORA’s annual research to measure DevOps performance. “DORA follows a very strong data-driven approach that helps teams leverage their automation process, cultural changes, and everything around it,” explains Harish. Fundamental to DORA are four key metrics that offer a valid and reliable way to measure the research and analysis of any kind of software delivery performance. These metrics gauge the success of DevOps transformations from ‘low performers’ to ‘elite performers’.

  1. Deployment frequency: How often the organization successfully releases to production
  2. Lead time for changes: The amount of time it takes a commit to get into production
  3. Change failure rate: The percentage of deployments causing a failure in production
  4. Time to restore service: How long it takes to recover from a failure in production
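
To make the four metrics above more tangible, here is a rough, hypothetical sketch of how a team might derive them from its own deployment records. The deployments table and its columns (committed_at, deployed_at, caused_incident, incident timestamps) are assumptions for illustration, written in Snowflake-style SQL rather than taken from any DORA tooling.

```sql
-- Hypothetical weekly roll-up of the four DORA metrics from a deployments log
SELECT
    DATE_TRUNC('week', deployed_at)                       AS week,
    COUNT(*)                                              AS deployment_frequency,    -- deployments per week
    AVG(DATEDIFF('hour', committed_at, deployed_at))      AS lead_time_hours,         -- commit-to-production
    AVG(CASE WHEN caused_incident THEN 1.0 ELSE 0.0 END)  AS change_failure_rate,     -- share of deploys causing failures
    AVG(DATEDIFF('minute', incident_started_at,
                 incident_resolved_at))                   AS time_to_restore_minutes  -- averaged over failed deploys only
FROM deployments
GROUP BY 1
ORDER BY 1;
```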

DORA is similar to the CALMS model which addresses the five fundamental elements of DevOps starting with where the enterprise is today and continuing throughout the transformation. CALMS also uses the four key metrics identified by DORA to evaluate DevOps performance and delivery. The acronym stands for:

Culture: Is there a collaborative and customer-centered culture across all functions?

Automation: Is automation being used to remove toil or wasted work?

Lean: Is the team agile and scrappy with a focus on continuous improvement?

Measurement: What, how, and against what benchmarks is data being measured?

Sharing: To what degree are teams teaching, sharing, and contributing to cross-team collaboration?

DevOps Goals: Elite Performance for Meaningful Business Impacts

Based on the metrics above, organizations fall into one of four levels: low, medium, high, or elite performers. The aspiration to achieve elite performance is driven by the significant business impact these teams have on their overall organization. According to Harish, and based on research by the DORA team at Google, “It’s proven that elite performers in the four key metrics are 3.56 times more likely to have a stronger availability practice. There’s a strong correlation between these elite performers and the business impact of the organization that they’re a part of.”

He goes on to say, “High performers are more agile. We’ve seen 46 times more frequent deployments from them. And it’s more reliable. They are five times more likely to exceed any profitability, market share, or productivity goals on it.” Being able to move quickly enables these organizations to deliver features faster, and thus increase their edge or advantage over competitors.

Focusing on the five key principles of DevOps is critical for going from ideation to implementation at a speed that yields results. High and elite performers are particularly agile with their use of technology. When a new technology is available, DevOps teams need to be able to test, apply, and utilize it quickly. With the right tools, teams are alerted immediately to code breaks and where that code resides. Using continuous testing, the team can patch code before it affects other systems. The results are improved code quality and accelerated, efficient recovery. You can see how each pillar of DevOps – from culture and agility to technology and measurement – feeds into the others to deliver high levels of performance, solid availability, and uninterrupted continuity.

Overcoming Common DevOps Challenges

Because culture is so central to a DevOps transformation, most challenges can be solved through cultural interventions. Like any cultural change, there must first be buy-in and adoption from the top down. Leadership plays a huge role in setting the tone for the cultural shift and continuously supporting an environment that embraces and reinforces the culture at every level. Here are some ways to influence an organization’s cultural transformation for DevOps success.

  • Build lean teams: Small teams are better enabled to deliver the speed, innovation, and agility necessary to achieve across DevOps metrics.
  • Enable and encourage transparency: Joey says, “Having those big siloed teams, where there’s a database team, the development team, the ops team – it’s really, anti-DevOps. What you want to start doing is making cross-functional teams to better aid in knocking down those silos to improve deployment metrics.”
  • Create continuous feedback loops: Among lean, transparent teams there should be a constant feedback loop of information sharing to influence smarter decision making, decrease redundancy, and build on potential business outcomes.
  • Reexamine accepted protocols: Always be questioning the organizational and structural processes, procedures, and systems that the organization grows used to. For example, how long does it take to deploy one line of change? Do you do it repeatedly? How long does it take to patch and deploy after discovering a security vulnerability? If it’s five days, why is it five days? How can you shorten that time? What technology, automation, or tooling can increase efficiency?
  • Measure, measure, measure: Utilize DORA’s research to establish elite performance benchmarks and realistic upward goals. Organizations should always be identifying barriers to achievement and continuously improving against those measurements.
  • Aim for total performance improvements: Organizations often think they need to choose between performance metrics. For example, in order to influence speed, stability may be negatively affected. Harish says, “Elite performers don’t see trade-offs,” and points to best practices like CI/CD, agile development, built-in automated testing, standardized platforms and processes, and automated environment provisioning for comprehensive DevOps wins.
  • Work small: Joey says, “In order to move faster, be more agile, and accelerate deployment, you’re naturally going to be working with smaller pieces with more automated testing. Whenever you’re making changes on these smaller pieces, you’re actually lowering your risk for anyone’s deployment to cause some sort of catastrophic failure. And if there is a failure, it’s easy to recover. Minimizing risk per change is a very important component of DevOps.”

Learn more about avoiding common DevOps issues by downloading our eBook, 7 Major Roadblocks in DevOps Adoption and How to Address Them

Ready to Start Your DevOps Transformation?

Both Harish and Joey agree that the best approach to starting your own DevOps transformation is one based on DevOps – start small. The first step is to compile a small team to work on a small project as an experiment. Not only will it help you understand the organization’s current state, but it helps minimize risk to the organization as a whole. Step two is to identify what your organization and your DevOps team are missing. Whether it’s technology and tooling or internal expertise, you need to know what you don’t know to avoid regularly running into the same issues.

Finally, you need to build those missing pieces to set the organization up for success. Utilize training and available technology to fill in the blanks, and partner with a trusted DevOps expert who can guide you toward continuous optimization.

2nd Watch provides Application Modernization and DevOps Services to customize digital transformations. Start with our free online assessment to see how your application modernization maturity compares to other enterprises. Then let 2nd Watch complete a DevOps Transformation Assessment to help develop a strategy for the application and implementation of DevOps practices. The assessment includes analysis using the CALMS model, identification of software development and level of DevOps maturity, and delivering tools and processes for developing and embracing DevOps strategies.


3 Questions to Help You Build Your Analytics Roadmap

In our experience, many analytics projects have the right intentions such as:

  • A more holistic view of the organization
  • More informed decision making
  • Better operational and financial insights

With incredible BI and analytics tools such as Looker, Power BI, and Tableau on the market, it’s tempting to start by selecting a tool believing it to be a silver bullet. While these tools are all excellent choices when it comes to visualization and analysis, the road to successful analytics starts well before tool selection.

So where do you begin? By asking and answering a variety of questions for your organization, and building a data analytics roadmap from the responses. From years of experience, we’ve seen that this process (part gap analysis, part soul-searching) is non-negotiable for any rewarding analytics project.

Building an Advanced Data Analytics Roadmap

Give the following questions careful consideration as you run your current state assessment:

How Can Analytics Support Your Business Goals?

There’s a tendency for some stakeholders not immersed in the data to see analytics as a background process disconnected from the day to day. That mindset is definitely to their disadvantage. When businesses fixate on analytical tools without a practical application, they put the cart before the horse and end up nowhere fast. Yet when analytics solutions are purposeful and align with key goals, insights appear faster and with greater results.

One of our higher education clients is a perfect example. Their goal? To determine which of their marketing tactics were successful in converting qualified prospects into enrolled students. Under the umbrella of that goal, their stakeholders would need to answer a variety of questions:

  • How long was the enrollment process?
  • How many touchpoints had enrolled students encountered during enrollment?
  • Which marketing solutions were the most cost effective at attracting students?

As we evaluated their systems, we recognized data from over 90 source systems would be essential to provide the actionable insight our client wanted. By creating a single source of truth that fed into Tableau dashboards, their marketing team was able to analyze their recruiting pipeline to determine the strategies and campaigns that worked best to draw new registrants into the student body.

This approach transcends industries. Every data analytics roadmap should reflect on and evaluate the most essential business goals. More than just choosing an urgent need or reacting to a surface level problem, this reevaluation should include serious soul-searching.

The first goals you decide to support should always be as essential to you as your own organizational DNA. When you use analytics solutions to reinforce the very foundation of your business, you’ll always get a higher level of results. With a strong use case in hand, you can turn your analytics project into a stepping stone for bigger and better things.

What Is Your Analytical Maturity?

You’re not going to scale Mt. Everest without the gear and training to handle the unforgiving high altitudes, and your organization won’t reach certain levels of analytical sophistication without hitting the right milestones first. Expecting more than you’re capable of out of an analytics project is a surefire path to self-sabotage. That’s why building a data analytics roadmap always requires an assessment of your data maturity first.

However, there isn’t a single KPI showing your analytical maturity. Rather, there’s a combination of factors such as the sophistication of your data structure, the thoroughness of your data governance, and the dedication of your people to a data-driven culture.

Here’s what your organization can achieve at different levels of data maturity:

  • Descriptive Analytics – This level of analytics tells you what’s happened in the past. Typically, organizations in this state rely on a single source system without the ability to cross-compare different sources for deeper insight. Any data quality effort is often sporadic and not aligned with the big picture.
  • Diagnostic Analytics – Organizations at this level are able to identify why things happened. At a minimum, several data sets are connected, allowing organizations to measure the correlation between different factors. Users understand some of the immediate goals of the organization and trust the quality of data enough to run them through reporting tools or dashboards.
  • Predictive Analytics – At this level, organizations can anticipate what’s going to happen. For starters, they need large amounts of data – from internal and external sources – consolidated into a data lake or data warehouse. High data governance standards are essential to establish consistency and accuracy in analytical insight. Plus, organizations need to have complex predictive models and even machine learning programs in place to make reliable forecasts.
  • Prescriptive Analytics – Organizations at the level of prescriptive analytics are able to use their data to not only anticipate market trends and changing behaviors but act in ways that maximize outcomes. From end to end, data drives decisions and actions. Moreover, organizations have several layers of analytics solutions to address a variety of different issues.

What’s important to acknowledge is that each level of analytics is a sequential progression. You cannot move up in sophistication without giving proper attention to the prerequisite data structures, data quality, and data-driven mindsets.

For example, if an auto manufacturer wants to reduce their maintenance costs by using predictive analytics, there are several steps they need to take in advance:

  • Creating a steady feed of real-time data through a full array of monitoring sensors
  • Funneling data into centralized storage systems for swift and simple analysis
  • Implementing predictive algorithms that can be taught or learn optimal maintenance plans or schedules

Then, they can start to anticipate equipment failure, forecast demand, and improve KPIs for workforce management. Yet no matter your industry, the gap analysis between the current state of your data maturity and your goals is essential to designing a roadmap that can get you to your destinations fastest.

What’s the State of Your Data?

Unfortunately for any data analytics roadmap, most organizations didn’t grow their data architecture in a methodical or intentional way. Honestly, it’s very difficult to do so. Acquisitions, departmental growth spurts, decentralized operations, and rogue implementations often result in an over-complicated web of data.

When it comes to data analysis, simple structures are always better. By mapping out the complete picture and current state of your data architecture, your organization can determine the best way to simplify and streamline your systems. This is essential for you to obtain a complete perspective from your data.

Building a single source of truth out of a messy blend of data sets was essential for one of our CPG clients to grow and lock down customers in their target markets. The modern data platform we created for their team consolidated their insight into one central structure, enabling them to track sales and marketing performance across various channels in order to help adjust their strategy and expectations. Centralized data sources offer a springboard into data science capabilities that can help them predict future sales trends and consumer behaviors – and even advise them on what to do next.

Are you building a data analytics roadmap and unsure of what your current analytics are lacking? 2nd Watch can streamline your search for the right analytics fit.


A Short Guide to Understanding Looker Pricing and Capabilities

Navigating the current BI and analytics landscape is often an overwhelming exercise. With buzzwords galore and price points all over the map, finding the right tool for your organization is a common challenge for CIOs and decision-makers. Given the pressure to become a data-driven company, the way business users analyze and interact with their data has lasting effects throughout the organization.

Looker pricing models

Looker, a recent addition to the Gartner Magic Quadrant, has a pricing model that differs from the per-user or per-server approach. Looker does not advertise their pricing model; instead, they provide a “custom-tailored” model based on a number of factors, including total users, types of users (viewer vs. editor), database connections, and scale of deployment.

Those who have been through the first enterprise BI wave (with tools such as Business Objects and Cognos) will be familiar with this approach, but others who have become accustomed to the SaaS software pricing model of “per user per month” may see an estimate higher than expected – especially when comparing to Power BI at $10/user per month. In this article, we’ll walk you through the reasons why Looker’s pricing is competitive in the market and what it offers that other tools do not.

Semantic and Governance Model

Unlike some of its competitors, Looker is not solely a reporting and dashboarding tool – it also acts as a data catalog across the enterprise. Looker requires users to think about their data and how they want their data defined across the enterprise.

Before you can start developing dashboards and visualizations, your organization must first define a semantic model (an abstraction of the database layer into business-friendly terms) using Looker’s native LookML scripting, which will then translate the business definitions into SQL. Centralizing the definitions of business metrics and models guarantees a single source of truth across departments. This will avoid a scenario where the finance department defines a metric differently than the sales or marketing teams, all while using the same underlying data. A common business model also eliminates the need for users to understand the relationships of tables and columns in the database, allowing for true self-service capabilities.

While it requires more upfront work, you will save yourself the future headaches of debating why two different reports show different values or of redefining the same business definitions in every dashboard you create.

By putting data governance front and center, your data team can make it easy for business users to create insightful dashboards in a few simple clicks.

Customization and Extensibility

At some point in the lifecycle of your analytics environment, there’s a high likelihood you will need to make some tweaks. Looker, for example, allows you to view and modify the SQL that is generated behind each visualization. While this may sound like a simple feature, a common pain point across analytics teams is trying to validate and tie out aggregations between a dashboard and the underlying database. Access to the underlying SQL not only lets analysts quickly debug a problem but also allows developers to tweak the auto-generated SQL to improve performance and deliver a better experience.

Another common complaint from users is the speed for IT to integrate data into the data warehouse. In the “old world” of Cognos and Business Objects, if your calculations were not defined in the framework model or universe, you would be unable to proceed without IT intervention. In the “new world” of Tableau, the dashboard and visualization are prioritized over the model. Looker brings the two approaches together with derived tables.

If your data warehouse doesn’t directly support a question you need to immediately answer, you can use Looker’s derived tables feature to create your own derived calculations. Derived tables allow you to create new tables that don’t already exist in your database. While it is not recommended to rely on derived tables for long-term analysis, it allows Looker users to immediately get speed-to-insight in parallel with the data development team incorporating it into the enterprise data integration plan.
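
For example, the query below is the kind of SQL an analyst might wrap in a LookML derived_table block to answer an urgent question before the warehouse team models it formally. The table and column names are hypothetical:

```sql
-- Hypothetical derived-table SQL: customer-level facts that don't yet exist
-- as a physical table in the warehouse
SELECT
    customer_id,
    MIN(order_date) AS first_order_date,
    COUNT(*)        AS lifetime_orders,
    SUM(amount_usd) AS lifetime_revenue
FROM orders
GROUP BY customer_id
```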

Collaboration

Looker takes collaboration to a new level as every analyst gets their own sandbox. While this might sound like a recipe for disaster with “too many cooks in the kitchen,” Looker’s centrally defined, version-controlled business logic lives in the software for everyone to use, ensuring consistency across departments. Dashboards can easily be shared with colleagues by simply sending a URL or exporting directly to Google Drive, Dropbox, and S3. You can also send reports as PDFs and even schedule email delivery of dashboards, visualizations, or their underlying raw data in a flat file.

Embedded Analytics

Looker enables collaboration outside of your internal team. Suppliers, partners, and customers can get value out of your data thanks to the modern approach to embedded analytics. Looker makes it easy to embed dashboards, visuals, and interactive analytics to any webpage or portal because it works with your own data warehouse. You don’t have to create a new pipeline or pay for the cost of storing duplicate data in order to take advantage of embedded analytics.

So, is Looker worth the price?

Looker puts data governance front and center, which in itself is a decision your organization needs to make (govern first vs. build first). A centralized way to govern and manage your models is something that often comes at an additional cost in other tools, increasing the total investment when looking at competitors. If data governance and a centralized source of truth are critical features of your analytics deployment, then the ability to manage this and avoid the headaches of multiple versions of the truth makes Looker worth the cost.


If you’re interested in learning more or would like to see Looker in action, 2nd Watch has a full team of data consultants with experience and certifications in a number of BI platforms as well as a thorough understanding of how these tools can fit your unique needs. Get started with our data visualization starter pack.

 


4 Key Differences between Data Lakes and Data Warehouses

Businesses today increasingly rely on data analytics to provide insights, identify opportunities, make important decisions, and innovate. Every day, a large amount of data (a.k.a Big Data) is generated from multiple internal and external sources that can and should be used by businesses to make informed decisions, understand their customers better, make predictions, and stay ahead of their competition.


For effective data-driven decisions, securely storing all this data in a central repository is essential. Two of the most popular storage repositories for big data today are data lakes and data warehouses. While both store your data, they each have different uses that are important to distinguish before choosing which of the two works best for you.

What is a Data Lake?

With large amounts of data being created by companies on a day-to-day basis, it may be difficult to determine which method will be most effective based on business needs and who will be using the data. To visualize the difference, each storage repository functions similarly to how it sounds. A data lake, for example, is a vast pool of raw, unstructured data. One piece of information in a data lake is like a small raindrop in Lake Michigan.

All the data in a data lake is loaded from source systems and none is turned away, filtered, or transformed until there is a need for it. Typically, data lakes are used by data scientists to transform data as needed. Data warehouses, on the other hand, have more organization and structure – like a physical warehouse building. These repositories house structured, filtered data that is used for a specific purpose. Still both repositories have many more layers to them than these analogies suggest.

To learn more about data lakes and their benefits, specifically with AWS Lake Formation, visit this post.

What is a Data Warehouse?

A data warehouse is the traditional, proven repository for storing data. Data warehouses use an ETL (Extract, Transform, Load) process, compared to data lakes, which use an ELT (Extract, Load, Transform) process. The data is filtered, processed, and loaded from multiple sources into the data warehouse once its use is defined. This structure, in turn, allows for its users to run queries in the SQL environment and get quick results. The users of data warehouses tend to be business professionals because once the data is fully processed, there is a highly structured and simplified data model designed for data analysis and reporting.
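
To make the ETL/ELT distinction concrete, here is a minimal, hypothetical SQL sketch (schema, table, and column names are illustrative). In an ELT approach the raw data lands first and is shaped inside the warehouse later; in a classic ETL approach the same cleansing happens in an external tool before anything is loaded.

```sql
-- ELT style: 1) land the raw data as-is ...
CREATE TABLE raw.orders_landing AS
SELECT * FROM source_system.orders;

-- ... 2) then transform inside the warehouse once the use is defined
CREATE TABLE analytics.fct_orders AS
SELECT
    order_id,
    customer_id,
    CAST(order_ts AS DATE) AS order_date,
    amount_usd
FROM raw.orders_landing
WHERE status <> 'cancelled';

-- ETL style: the cleansing and filtering above runs in an external ETL tool
-- *before* the load, so only the curated analytics.fct_orders table is stored.
```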

While the structure and organization provided by a data warehouse is appealing, one major downside you might hear about data warehouses is how time-consuming they are to change. Because the use of the data in a data warehouse is defined up front, modifying the data loading process to support new reporting needs can take developers a significant amount of time. When businesses want fast insights for decision-making, this can be a frustrating challenge if changes to the data warehouse need to be made.

In terms of cost, data warehouses tend to be more expensive than data lakes, especially if the volume of data is large, largely because structured, query-ready storage is costly to build and change. However, since a data warehouse weeds out data that falls outside the identified profile, a significant amount of space is conserved, reducing overall storage cost.

What are the Benefits of a Data Warehouse?

While data lakes come with their own benefits, data warehouses have been in use for decades longer, proving their reliability and performance over time. There are several benefits that can be derived from a strong data warehouse, including the following:

  • Saves Time: With all your data loaded and stored into one place, a lot of time is saved from manually retrieving data from multiple sources. Additionally, since the data is already transformed, business professionals can query the data themselves rather than relying on an IT person to do it for them.
  • Strong Data Quality: Since a data warehouse consists only of data that is transformed, this refined quality removes data that is duplicated or inadequately recorded.
  • Improves Business Intelligence: As data within a data warehouse is extracted from multiple sources, everyone on your team will have a holistic understanding of your data to make informed decisions.
  • Provides Historical Data: A data warehouse stores historical data that can be utilized by your team to make future predictions and thus, more informed decisions.
  • Security: Data warehouses improve security by allowing certain security characteristics to be implemented into the setup of the warehouse.

Should I use a Data Lake or a Data Warehouse?

Due to their differences, there is not an objectively better repository when it comes to data lakes and data warehouses. However, a company might prefer one based on their resources and to fulfill their specific business needs. In some cases, businesses may transition from a data lake to a data warehouse for different reasons and vice versa. For example, a company may want a data lake to temporarily store all their data while their data warehouse is being built. In another case, such as our experience with McDonald’s France, a company may want a data lake for ongoing data collection from a wide range of data sources to be used and analyzed later. The following are some of the key differences between data lakes and data warehouses that may be important in determining the best storage repository for you:

  • User type: When comparing the two, one of the biggest differences comes down to who is using the data. Data lakes are typically used by data scientists who transform the data when needed, whereas data warehouses are used by business professionals who need quick reports.
  • ELT vs ETL: Another major difference between the two is the ELT process of data lakes vs ETL process of data warehouses. Data lakes retain all data, while data warehouses create a highly structured data model that filters out the data that doesn’t match this model. Therefore, if your company wants to save all data for later use – even data that may never be used – then a data lake would be the choice for you.
  • Data type: As the more traditional repository, a data warehouse consists of data extracted from transaction systems, with quantitative metrics and the attributes that describe them. Data lakes, on the other hand, embrace non-traditional data types (e.g., web server logs, social network activity, etc.) and transform them when they are ready to be used.
  • Adaptability: While a well-constructed data warehouse is highly effective, it takes a long time to change, so a lot of time can be spent getting the desired structure. Since data lakes store raw data, the data is more accessible when needed, and a variety of schemas can be easily applied and discarded until one proves reusable.

Furthermore, different industries may lean towards one or the other based on industry needs. Here’s a quick breakdown of what a few industries are using most commonly to store their data:

  • Education: A popular choice for storing data among education institutions is the data lake. This industry benefits from the flexibility provided by data lakes, as student grades, attendance, and other data points can be stored and transformed when needed. The flexibility also allows universities to streamline billing, improve fundraising, tailor the student experience, and facilitate research.
  • Financial Services: Finance companies may tend to choose a data warehouse because they can provide quick, relevant insights for reporting. Additionally, the whole company can easily access the data warehouse due to their existing structure, rather than limiting access to data scientists.
  • Healthcare: In the healthcare industry, businesses have plenty of unstructured data including physicians notes, clinical data, client records, and any interaction a consumer has with the brand online. Due to these large amounts of unstructured data, the healthcare industry can benefit from a flexible storage option where this data can be safely stored and later transformed.

There is no black-and-white answer to which repository is better. A company may decide that using both is better, as a data lake can hold a vast amount of structured and unstructured data working alongside a well-established data warehouse for instant reporting. The accessibility of a good warehouse will be available to the business professionals of your organization, while data scientists use the data lake to provide more in-depth analyses.

Contact Us

Choosing between a data lake and data warehouse, or both, is important to how your data is stored, used, and who it is used by. If you are interested in setting up a data lake or data warehouse and want advisory on your next steps, 2nd Watch has a highly experienced and qualified team of experts to get your data to where it needs to be. Contact us to talk to one of our experts and take your next steps in your cloud journey.


How to Choose the Best Cloud Service Provider for your Application Modernization Strategy

If the global pandemic taught us anything, it’s that digital transformation is a must-have for businesses to keep up with customer demands and remain competitive. To do this, organizations are moving their workloads to and modernizing their applications for the cloud faster than ever.

In fact, according to a recent survey, 91% of respondents agree or strongly agree that application modernization plays a critical role in their organization’s adaptability to rapidly changing business conditions. But there are so many cloud service providers to choose from! How do you know which one is best for your application modernization objectives? Keep reading to find out!  

What is a Cloud Services Provider (CSP)? 

A cloud services provider is a cloud computing company that provides public clouds, managed private clouds, or on-demand cloud infrastructures, platforms, and services. Many CSPs are available worldwide, including Alibaba Cloud, Amazon Web Services (AWS), Google Cloud Platform (GCP), IBM Cloud, Oracle Cloud, and Microsoft Azure. However, three industry giants are noteworthy because of their services and global footprint: Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure. 


What is Application Modernization? 

Application modernization is the process of revamping an application to take advantage of breakthrough technical innovations to improve the overall efficiency of the application remarkably. This efficiency typically involves high availability, increased fault tolerance, high scalability, improved security, eliminating a single point of failure, disaster recovery, contemporary and simplified tools, new coding language, and reduced resource requirements, among other benefits. Many companies running legacy applications are now looking at how they can best modernize their monolith applications. 

Application Rationalization: The First Step to Modernization 

The best way to start any application modernization journey is with application rationalization. In this process, you identify company-wide business applications and strategically determine which ones you should keep, replace, retire, or consolidate. Once you identify those applications, you can list each one’s ease or difficulty level, total cost of ownership (TCO), and business value, enabling you to decide and prioritize which action to take. (Hint: Start with high value and minimal effort apps!) Doing this will also help you eliminate redundancies, lower costs, and maximize efficiency. 

The high-value apps that are difficult to move to the cloud will likely cause the most grief in your decision-making process. But, like Rome, your modernization strategy doesn’t need to be built in a day.

You can develop an approach to application modernization over time and still reduce costs and risks while moving your portfolio forward.  

When it comes to application modernization in the cloud, it is crucial to evaluate your current application stack and determine the most suitable modernization strategy for migrating to the cloud. Many on-premises applications are legacy monoliths that may benefit more from refactoring than a rehosting (“lift and shift”) approach. (Check out Rehost, Refactor, Replatform – What, When, & Why? | AppMod Essentials) 

Refactoring may require overhauling your application code, which takes some high-level effort but offers the most benefits. However, not all applications are ideal candidates for refactoring. Rearchitecting will become necessary for some obsolete applications that are not compatible with the cloud due to architectural decisions made while building the app. In this scenario, the better value proposition is rearchitecting: dividing the application into several functional components that can be individually adapted and further developed. These small, independent pieces—or “microservices”—can then be migrated to the cloud quickly and efficiently. 


Determining the Best Cloud Services Provider for Your Application Modernization 

Each application modernization journey is unique, as is the process of choosing the best cloud service provider that meets your demands. What works for one business’ application may not be the best for yours, even if they are in the same industry. And just because a competitor has chosen one CSP over another does not mean you should. 

When evaluating the CSP that is best for you, consider the following: 

  • Service Level Agreements (SLAs): Determine whether the CSP’s service level agreements suit your production workloads, whether the cloud service you need is generally available yet, and whether the provider retains satisfactory levels of support knowledge. Managing workloads in the cloud can sometimes be tedious, and a managed services department may not have the required expertise to efficiently manage and monitor the cloud environment. It is critical to your business to do your due diligence to ensure your preferred CSP can administer their managed offerings with as close to zero downtime as possible.
  • Vendor Lock-in: It is important to have alternatives to any single CSP and that you retain the flexibility to substitute for a better value proposition. 
  • Enterprise Adoption: Consider the likelihood of scalability of your use of the CSP across your organization. 
  • Economic Impact: Consider the positive business or financial impacts that result from the service usage at the individual, department, and company-wide levels. 
  • Automation and Deployment: Verify the CSP’s integration capabilities with your organization’s preferred automation tooling and availability of automated and local testing frameworks.  

CSP Application Modernization Design Considerations 

When modernizing existing applications to take the best advantage of the cloud, cloud technologies like serverless and containers are good options to consider. Serverless computing and containers are cloud-native tools that automate code deployment into isolated environments. Developers can build highly scalable applications with fewer resources within a short time. They both also reduce overhead for cloud-hosted web applications but differ in many ways. Private cloud, hybrid cloud, and multi-cloud approaches to application modernization are worth considering too. 

Serverless Computing and Containers 

Serverless computing is an execution model in which the CSP dynamically allocates the resources needed to run a piece of code and charges only for the resources used to run it. Code is typically run in stateless containers. Various events can trigger these functions: HTTP requests, monitoring alerts, database events, queuing services, file uploads, scheduled events (cron jobs), and more.

 

The cloud service provider then receives the code in a function to execute, which is why serverless computing is sometimes referred to as a Function-as-a-Service (FaaS) platform. Add that to your list of as-a-Service acronyms: IaaS, PaaS, SaaS, FaaS!   

The FaaS offerings of the three major CSPs are: 

  • AWS: AWS Lambda 
  • GCP: Google Cloud Functions 
  • Microsoft Azure: Azure Functions 

Containers provide a discrete environment set up within an operating system. They can run one or more applications, typically assigned only those resources necessary for the application to function correctly. Because containers are smaller and faster than virtual machines, they allow applications to run quickly and reliably among various computing environments. Container images become containers at runtime and include everything needed to run an application: code, runtime, system tools, system libraries, and settings. 

Private, Hybrid, and Multi-Cloud 

The public cloud is a vital part of any modernization strategy. However, some organizations may not be ready to go directly to the public cloud from the datacenter. Cloud architects should consider private, hybrid, and multi-cloud strategies in those cases. These models can help resolve any architectural, security, or latency concerns. They will also reduce the complexity associated with the policies for specific workloads based on their unique characteristics.  

Conclusion 

Migration to the cloud is ideal for investing in application modernization as it can lower your overall operational costs and increase your application’s resiliency. But not all use cases—nor cloud service providers—are the same. You need to do your homework before choosing the best-suited one for your business.  

2nd Watch offers a comprehensive consulting methodology and proven tools to accelerate your cloud-native and app modernization objectives. Our modernization process begins with a complete assessment of your existing application portfolio to identify which you should keep, replace, retire, or consolidate. We then develop and implement a modernization strategy that best meets your business needs.

From application rationalization to application modernization and beyond, 2nd Watch is your go-to trusted advisor throughout your entire modernization journey. 

Contact us to schedule a brief meeting with our specialists to discuss your current modernization objectives. 

By Alex Ifebigh, 2nd Watch Sr. Cloud Consultant 

 


A CTO’s Guide to a Modern Data Platform: What is Snowflake, How is it Different, and Where Does it Fit in Your Ecosystem?

Chances are, you’ve been here before – a groundbreaking new data and analytics technology has started making waves in the market, and you’re trying to gauge the right balance between marketing hype and reality. Snowflake promises to be a self-managing data warehouse that can get you speed-to-insight in weeks, as opposed to years. Does Snowflake live up to the hype? Do you still need to approach implementation with a well-defined strategy? The answer to both of these questions is “yes.”


What Is Snowflake and How Is It Different?

Massive Scale, Low Overhead

Snowflake is one of the few enterprise-ready cloud data warehouses that brings simplicity without sacrificing features. It automatically scales, both up and down, to get the right balance of performance vs. cost. Snowflake’s claim to fame is that it separates compute from storage. This is significant because almost every other database, Redshift included, combines the two together, meaning you must size for your largest workload and incur the cost that comes with it.

With Snowflake, you can store all your data in a single place and size your compute independently. For example, if you need near-real-time data loads for complex transformations, but have relatively few complex queries in your reporting, you can spin up a massive Snowflake warehouse for the data load and scale it back down after it’s completed – all in real time. This saves on cost without sacrificing your solution goals.
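
As a rough illustration of sizing compute independently of storage, the statements below use standard Snowflake warehouse DDL; the warehouse name load_wh and the sizes chosen are assumptions for the example, not a recommendation.

```sql
-- Size up a dedicated warehouse just before the near-real-time load...
ALTER WAREHOUSE load_wh SET WAREHOUSE_SIZE = XLARGE;

-- ...run the complex transformations, then scale back down immediately
ALTER WAREHOUSE load_wh SET WAREHOUSE_SIZE = XSMALL;

-- Stop paying entirely when the warehouse sits idle
ALTER WAREHOUSE load_wh SET AUTO_SUSPEND = 60;   -- suspend after 60 idle seconds
ALTER WAREHOUSE load_wh SET AUTO_RESUME = TRUE;  -- wake automatically on the next query
```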

Elastic Development and Testing Environments

Development and testing environments no longer require duplicate database environments. Rather than creating multiple clusters for each environment, you can spin up a test environment as you need it, point it at the Snowflake storage, and run your tests before moving the code to production. With Redshift, you’re feeling the maintenance and cost impact of three clusters all running together. With Snowflake, you stop paying as soon as your workload finishes because Snowflake charges by the second.
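
One way to get that disposable test environment, sketched below with hypothetical database and warehouse names, is Snowflake’s zero-copy cloning, which creates a writable copy of the metadata without duplicating the underlying storage:

```sql
-- Clone production metadata into a test database (no data is physically copied)
CREATE DATABASE analytics_test CLONE analytics;

-- Give the test run its own right-sized, self-suspending compute
CREATE WAREHOUSE test_wh WITH WAREHOUSE_SIZE = SMALL AUTO_SUSPEND = 60 AUTO_RESUME = TRUE;

-- ...run the release tests against analytics_test, then tear everything down
DROP WAREHOUSE test_wh;
DROP DATABASE analytics_test;
```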

With the right DevOps processes in place for CI/CD (Continuous Integration/Continuous Delivery), testing each release becomes closer to a modern application development approach than it does a traditional data warehouse. Imagine trying to do this in Redshift.

Avoiding FTP with External Data Sharing

The separated storage and compute also enables some other differentiating features, such as data sharing. If you’re working with external vendors, partners, or customers, you can share your data, even if the recipient is not a Snowflake customer. Behind the scenes, Snowflake is creating a pointer to your data (with your security requirements defined). If you commonly write scripts to share your data via FTP, you now have a more streamlined, secure, and auditable path for accessing your data outside the organization. Healthcare organizations, for example, can create a data share for their providers to access, rather than cumbersome manual processes that can lead to data security nightmares.
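
A minimal sketch of what that looks like in Snowflake, with hypothetical database, schema, table, and account names, might be:

```sql
-- Create a share and expose one curated table to it
CREATE SHARE provider_claims_share;
GRANT USAGE ON DATABASE analytics TO SHARE provider_claims_share;
GRANT USAGE ON SCHEMA analytics.provider_portal TO SHARE provider_claims_share;
GRANT SELECT ON TABLE analytics.provider_portal.claims TO SHARE provider_claims_share;

-- Add the consumer's Snowflake account (a managed "reader account" can be
-- provisioned for recipients who aren't Snowflake customers)
ALTER SHARE provider_claims_share ADD ACCOUNTS = partner_org_account;
```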

Where Snowflake Fits Into Your Ecosystem

Snowflake Is a Part of Your Data Ecosystem, but It's Not in a Silo

Always keep this at the top of your mind. A modern data platform involves not only analytics, but application integration, data science, machine learning, and many other components that will evolve with your organization. Snowflake solves the analytics side of the house, but it’s not built for the rest.

When you're considering your Snowflake deployment, be sure to draw out the other possible components, even if future tools are not yet known. Choosing which public cloud Snowflake will run on (Azure or AWS) will be the biggest decision you make. Do you see SQL Server, Azure ML, or other Azure PaaS services in the mix, or is the AWS ecosystem a better fit for your organization?

As a company, Snowflake has clearly recognized that it isn't built for every type of workload. Snowflake partnered with Databricks to allow heavy data science and other complex workloads to run against your data, and its recent partnership with Microsoft will ensure Azure services continue to expand their native Snowflake integrations – expect a barrage of new partnership announcements over the next 12 months.


If you have any questions or want to learn more about how Snowflake can fit into your organization, contact us today.


 


Rehost vs Refactor vs Replatform | AppMod Essentials

Migrating workloads or applications to the cloud can seem daunting for any organization. The cloud is synonymous with industry buzzwords such as DevOps, digital transformation, open source, and more. As of 2021, AWS alone offers over 200 products and services.

Nowadays, every other LinkedIn post seems to relate to the cloud somehow. Sound familiar? Maybe a bit intimidating? If so, you are not alone! Organizations often hope that operating in the cloud will help them become more agile, enhance business continuity, or reduce technical debt – all of which are achievable in a cloud environment with proper planning.


Benjamin Franklin once said, “By failing to prepare, you are preparing to fail.” This sentiment is true not only in life but also in technology. Any successful IT project has a strategy and tangible business outcomes. Project managers must establish these before any “actual work” begins. Without this, leadership teams may not know if the project is on task and on schedule. Technical teams may struggle to determine where to start or what to prioritize. Here we’ll explore industry-standard strategies that organizations can deploy to begin their cloud journey and help technical leaders decide which path to take. 

What is Cloud Migration? 

Cloud migration is when an organization moves its data, applications, or other IT capabilities to a cloud service provider (CSP) such as Amazon Web Services (AWS), Google Cloud Platform (GCP), or Microsoft Azure. Some organizations migrate all of their IT assets into the cloud; however, most keep some services on-premises in a hybrid environment for various reasons. A cloud migration may also span multiple CSPs or include a private cloud.

What Are the Different Strategies for Cloud Migration? 

Gartner recognizes five cloud migration strategies, nicknamed "the 5 Rs": rehost, refactor, revise (a.k.a. replatform), rebuild, and replace, each with its own benefits and drawbacks. This blog focuses on three of those five approaches – rehost, refactor, and replatform – as they play a significant role in application modernization.

What Is Rehosting in the Cloud?

Rehosting, or "lift and shift," is the process of migrating a workload to the cloud as-is, without any modifications. Rehosting usually relies on infrastructure-as-a-service (IaaS) offerings such as AWS EC2 or Azure VMs. Organizations with little cloud experience often consider this strategy because it is an easy start to their cloud journey, and cloud service providers keep introducing new services that make rehosting even easier. Because this strategy is less complex, the timeline to complete a rehost migration can be significantly shorter than for other strategies. Organizations often rehost workloads and then modernize after gaining more cloud knowledge and experience.
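To show how literal the "as-is" part is, here is a hypothetical sketch of provisioning an EC2 instance sized to match an existing on-premises VM using boto3. Every ID and size below is a placeholder, and in practice most rehost projects lean on dedicated migration tooling (such as AWS Application Migration Service) rather than hand-provisioning.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Provision a VM whose specs mirror the existing on-premises server.
response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",      # hypothetical image of the on-prem server
    InstanceType="m5.xlarge",             # sized to match the current 4 vCPU / 16 GB VM
    MinCount=1,
    MaxCount=1,
    SubnetId="subnet-0123456789abcdef0",  # hypothetical landing-zone subnet
    TagSpecifications=[{
        "ResourceType": "instance",
        "Tags": [{"Key": "Name", "Value": "legacy-erp-rehost"}],
    }],
)

print(response["Instances"][0]["InstanceId"])
```

The application itself is untouched; only its host changes, which is both the appeal and the limitation of rehosting.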

Rehosting Pros:

  • No architecture changes – Organizations can migrate workloads as-is, which benefits those with little cloud experience. 
  • Fastest migration method – Rehosting is often the quickest path to the cloud. This method is an excellent advantage for organizations that need to vacate an on-premises data center or colocation. 
  • Organizational changes are not necessary – Organizational processes and strategies to manage workloads can remain the same since architectures are not changing. Organizations will need to learn new tools for the selected cloud provider, but the day-to-day tasks will not change.  

Rehosting Cons:

  • High costs – Monthly spending will quickly add up in the cloud without modernizing applications. Organizations must budget appropriately for rehosting migrations. 
  • Lack of innovation – Rehosting does not take advantage of the variety of innovative and modern technologies available in the cloud.  
  • Does not improve the customer experience – Without change, applications cannot improve, which means customers will have a similar experience in the cloud. 

What Is Refactoring in the Cloud?

Refactoring updates and optimizes applications for the cloud. It often involves "app modernization," or updating the application's existing code to take full advantage of cloud features and flexibility. This strategy can be complex because it requires source code changes and introduces modern technologies to the organization. Those changes need to be thoroughly tested and optimized, which can lead to delays, so organizations should take small steps by refactoring one or two modules at a time to correct issues and gaps at a smaller scale. Although refactoring may be the most time-consuming strategy, it can provide the best return on investment (ROI) once complete.
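As a small, hypothetical example of refactoring a single module, consider a report job that used to write to local disk being updated to target managed object storage and read its configuration from the environment. The bucket and path names below are placeholders, not a prescription.

```python
import io
import os

import boto3


def publish_report_legacy(report_bytes: bytes) -> str:
    """Before: the report lands on the server's local disk."""
    path = "/var/reports/daily_sales.csv"
    with open(path, "wb") as f:
        f.write(report_bytes)
    return path


def publish_report_refactored(report_bytes: bytes) -> str:
    """After: the report goes to object storage, so the app can run on
    ephemeral, auto-scaled instances without losing output."""
    bucket = os.environ.get("REPORT_BUCKET", "example-reports-bucket")  # hypothetical
    key = "daily/daily_sales.csv"
    boto3.client("s3").upload_fileobj(io.BytesIO(report_bytes), bucket, key)
    return f"s3://{bucket}/{key}"
```

Multiplied across hundreds of modules, changes like this are where refactoring's effort and its payoff both come from.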

Refactoring Pros: 

  • Cost reduction – Since applications are being optimized for the cloud, refactoring can provide the highest ROI and reduce the total cost of ownership (TCO). 
  • More flexible application architectures – Refactoring allows application owners the opportunity to explore the landscape of services available in the cloud and decide which ones fit best. 
  • Increased resiliency – Technologies and concepts like auto-scaling, immutable infrastructure, and automation can increase application resiliency and reliability. Organizations should consider all of these when refactoring. 

Refactoring Cons:

  • A lot of change – Technology and cultural changes can be brutally painful. Cloud migrations often combine both, which compounds the pain. Add the complexity of refactoring, and you may have full-blown mutiny without careful planning and strong leadership. Refactoring migrations are not for the faint of heart, so tread lightly. 
  • Advanced cloud knowledge and experience are needed – Organizations lacking cloud experience may find it challenging to refactor applications by themselves. Organizations may consider using a consulting firm to address skillset gaps. 
  • Lengthy project timelines – Refactoring hundreds of applications doesn’t happen overnight. Organizations need to establish realistic timelines before starting a refactor migration. 


What Is Replatforming in the Cloud?

Replatforming is a happy medium between rehosting and refactoring. It applies a targeted set of changes so the application fits the cloud better, without the complete overhaul you would expect from refactoring. Replatforming projects often involve rearchitecting the database to a more cloud-native solution, adding scaling mechanisms, or containerizing applications.
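A common replatforming move, for example, is swapping a self-managed database server for a managed cloud database while leaving application code and queries largely untouched. The sketch below assumes a hypothetical PostgreSQL-compatible application whose connection settings are injected through environment variables.

```python
import os

import psycopg2  # assumes a PostgreSQL-compatible application


def get_connection():
    """The application code and SQL stay the same; only the connection
    configuration now points at a managed, cloud-native database endpoint."""
    return psycopg2.connect(
        host=os.environ["DB_HOST"],                   # e.g., a managed PostgreSQL endpoint
        port=int(os.environ.get("DB_PORT", "5432")),
        dbname=os.environ["DB_NAME"],
        user=os.environ["DB_USER"],
        password=os.environ["DB_PASSWORD"],
        sslmode="require",                            # managed services typically enforce TLS
    )
```

The application gains managed backups, patching, and scaling without the code rewrite a full refactor would require.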

Replatforming Pros:

  • Reduces cost – If organizations take cost-saving measures during replatforming, they will see a reduction in technical operating expenses. 
  • Acceptable compromise – Replatforming is considered a happy medium of adding features and technical capabilities without jeopardizing migration timelines. 
  • Adds cloud-native features – Replatforming can add cloud technologies like auto-scaling, managed storage services, infrastructure as code (IaC), and more. These capabilities can reduce costs and improve customer experience. 

Replatforming Cons:

  • Scope creep may occur – Organizations may struggle to draw a line in the sand when replatforming. It can be challenging to decide which cloud technologies to prioritize. 
  • Limits the amount of change that can occur – Everything cannot be accomplished at once when replatforming. Technical leaders must decide what can be done within the migration timeline, then add the remaining items to a backlog. 
  • Cloud and automation skills needed – Organizations lacking cloud experience may struggle replatforming workloads by themselves. 

Which Cloud Migration Strategy Is Best for Your Organization?

As stated above, it is essential to have clear business objectives for your organization’s cloud migration. Just as important is establishing a timeline for the migration. Both will help technical leaders and application owners decide which strategy is best. Below are some common goals organizations have for migrating to the cloud. 

Common business goals for cloud migrations:

  • Reduce technical debt 
  • Improve customers' digital experience 
  • Become more agile to respond to change faster 
  • Ensure business continuity 
  • Evacuate on-premises data centers and colocations 
  • Create a culture of automation 

Determining the best migration strategy is key to getting the most out of the cloud and meeting your business objectives. It is common for organizations to use all three of these strategies in tandem, and many work with trusted advisors like 2nd Watch to determine and implement the best approach. When planning your cloud migration strategy, consider these questions:

Cloud Migration Strategy Considerations:

  • Is there a hard date for migrating the application? 
  • How long will it take to modernize? 
  • What are the costs for “lift and shift,” refactoring, and/or replatforming? 
  • When is the application being retired? 
  • Can the operational team(s) support modern architectures? 

Conclusion 

In today’s world, the cloud is where the most innovation in technology occurs. Companies that want to be a part of modern technology advancements should seriously consider migrating to the cloud. Organizations can achieve successful cloud migrations with the right strategy, clear business goals, and proper skillsets. 

2nd Watch is an AWS Premier Partner, Google Cloud Partner, and Microsoft Gold partner, providing professional and managed cloud services to enterprises. Our subject matter experts and software-enabled services provide you with tested, proven, and trusted solutions in all aspects of cloud migration and application modernization.  

Contact us to schedule a discussion on how we can help you achieve your 2022 cloud modernization objectives. 

By Jacob Acton, 2nd Watch Cloud Consultant 


Comparing Modern Data Warehouse Options

To remain competitive, organizations are increasingly moving toward modern data warehouses, also known as cloud-based data warehouses or modern data platforms, instead of traditional on-premises systems. Modern data warehouses differ from traditional warehouses in the following ways:

    • There is no need to purchase physical hardware
    • They are less complex to set up
    • It is much easier to prototype and provide business value without having to build out the ETL processes right away
    • There is no capital expenditure and a low operational expenditure
    • It is quicker and less expensive to scale a modern data warehouse
    • Modern cloud-based data warehouse architectures can typically perform complex analytical queries much faster because of how the data is stored and their use of massively parallel processing (MPP)


Modern data warehousing is a cost-effective way for companies to take advantage of the latest technology and architectures without the upfront cost to purchase, install, and configure the required hardware, software, and infrastructure.

Comparing Modern Data Warehousing Options

    • Traditional data warehouse deployed on infrastructure as a service (IaaS): Requires the customer to install traditional data warehouse software on compute instances provided by a cloud provider (Azure, AWS, Google, etc.).
    • Platform as a service (PaaS): The cloud provider manages the hardware deployment, software installation, and software configuration. However, the customer is responsible for managing the environment, tuning queries, and optimizing the data warehouse software.
    • Software as a service (SaaS): In a true SaaS data warehouse, software and hardware upgrades, security, availability, data protection, and optimization are all handled for you. The vendor supplies all hardware and software as part of its service and manages them on your behalf.

In all of the above scenarios, the tasks of purchasing, deploying, and configuring the hardware to support the data warehouse environment fall on the cloud provider instead of the customer.

IaaS, PaaS, and SaaS – What is the Best Option for my Organization?

Infrastructure as a service (IaaS) is an instant computing infrastructure, provisioned and managed over the internet. It helps you avoid the expense and complexity of buying and managing your own physical servers and other data center infrastructure. In other words, if you’re prepared to buy the engine and build the car around it, the IaaS model may be for you.

In a platform as a service (PaaS) scenario, the cloud provider merely supplies the hardware and its traditional software via the cloud, so the solution is likely to resemble its original, on-premises architecture and functionality. Many vendors offer a "modern" data warehouse that was originally designed and deployed for on-premises environments. One such technology is Amazon Redshift. Amazon acquired rights to ParAccel, named it Redshift, and hosted it in the AWS cloud. Redshift is a highly successful modern data warehouse service, and it is easy to instantiate a Redshift cluster in AWS, but that is where the convenience ends. It still requires you to complete all of the administrative tasks.

You still have to reclaim space after rows are deleted or updated (the VACUUM process in Redshift), manage capacity planning, provision compute and storage nodes, and choose your distribution keys. Everything you had to do with ParAccel or any traditional architecture, you still have to do with Redshift.
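To make that maintenance burden concrete, here is a hedged sketch of the kind of housekeeping script a Redshift administrator routinely runs; the cluster endpoint, credentials, and table names are hypothetical. The point is that this category of script has no equivalent on a fully managed SaaS warehouse.

```python
import psycopg2  # Redshift speaks the PostgreSQL wire protocol

# Hypothetical cluster endpoint and credentials.
conn = psycopg2.connect(
    host="example-cluster.abc123.us-east-1.redshift.amazonaws.com",
    port=5439,
    dbname="analytics",
    user="admin",
    password="example-password",
)
conn.autocommit = True  # VACUUM cannot run inside a transaction block
cur = conn.cursor()

# Reclaim space left behind by deletes/updates and refresh planner statistics.
for table in ("sales.orders", "sales.order_items"):  # hypothetical tables
    cur.execute(f"VACUUM FULL {table}")
    cur.execute(f"ANALYZE {table}")

cur.close()
conn.close()
```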

Alternatively, a data warehouse built for the cloud on a true software as a service (SaaS) architecture lets the vendor provide all hardware and software as part of its service and manage both on your behalf. One such technology, which requires no infrastructure management and features separate compute, storage, and cloud services layers that can scale and change independently, is Snowflake. It differentiates itself from IaaS and PaaS cloud data warehouses because it was built from the ground up on cloud architecture.

All administrative tasks, tuning, patching, and management of the environment fall on the vendor. In contrast to the IaaS and PaaS architectures on the market today, Snowflake's multi-cluster, shared-data architecture essentially makes the administrative headache of maintaining solutions like Redshift go away.


If you depend on your data to better serve your customers, streamline your operations, and lead (or disrupt) your industry, a modern data platform built on the cloud is a must-have for your organization.

Contact us to learn what a modern data warehouse would look like for your organization.
