Consider Performance, Growth & Budget When Buying Data Analytics
DevOps.com
May 09, 2023
So, your business needs to invest in data analytics technology to improve efficiencies, competitive advantage or business outcomes. There are a ton of things to consider, but underlying each decision are three driving factors:
Performance – The need to meet service level agreements.
Growth – The need to accommodate success, as well as deal with inevitable industry changes.
Budget – The need to do both of the above while keeping costs low enough to not erode the profit gain from analytics.
These three forces underpin nearly every technology selection decision, but what frequently isn’t noticed is how much these three drivers interact.
Tags: Analytics, Cloud, DevOps
Is Cloud Repatriation a Big Lie Server Vendors Are Shilling?
Spiceworks
April 03, 2023
Cloud repatriation spotlights the need to avoid letting hype determine your analytics strategy. Paige Roberts, Open Source relations manager at Vertica, gets to the bottom of what reverse cloud migration is all about and the purpose it may serve.
Tags: Analytics, Cloud, Design Thinking
Five Reasons Why In-Database Machine Learning Makes Sense
IASA Architecture and Governance Magazine
February 12, 2023
Of the few ML projects that make it to production, most take 3 months to a year. Organizations don’t receive benefits from data science until it’s in production. Does that mean adding another technology to bloated stacks? Vertica’s presentation is about getting machine learning into production faster with something nearly every company already has: a good analytics database.
Tags: Analytics, AI, Big Data
New O’Reilly Book: Accelerate Machine Learning with a Unified Analytics Architecture
Medium
February 16, 2022
Between 40 and 60% of machine learning projects fail, most at the point in the workflow between proof of concept and production. One day, it may be as easy for an organization to put an ML model into production as it is to put a new visualization in a BI report. The right data architecture design can be the key.
Tags: AI, Analytics, Big Data
What’s the Difference Between a Data Lakehouse and a Unified Analytics Platform?
Architecture and Governance Magazine
November 12, 2021
I’ve been doing a bunch of speeches at various conferences on the merging of the data warehouse and data lake into a single unified analytics platform. I inevitably get one question, “How is this different from a lakehouse?” There are two answers, a short one that’s glib and easy, and a longer one that really dives into things. Short answer, “They’re extremely similar architectural concepts.” The rest of this article is the long answer.
Tags: Analytics, Big Data, Predictive Analytics
It’s a Trap! — Cloud Financial Incentive for Badly Optimized Analytics Software
Medium
October 15, 2021
For all the years I’ve been working with data management and analytics software, there’s always been a powerful motivation to be as efficient as possible. The smarter your software is about using available computer resources — hardware, disk, memory, CPU… — the bigger your edge over the competition. The happier your customers are, the more money your company makes. The financial incentive to be more and more performant on less and less compute has always been enough to motivate endless tweaks to eke out just a little more speed, or figure out ways to do just a little bit more with the same hardware.
This benefits the customer, who constantly gets better and better software.
Then the cloud came along, and things seemed the same, for the most part. You could no longer say “hardware” to mean the storage and compute infrastructure, but I still assumed everyone in the data management and analytics software industry was in that same race, to be more and more performant on less and less compute “infrastructure.”
Tags: Analytics, Big Data, Cloud
What Do People Mean by “Cloud-Native?”
Medium
July 28, 2021
Cloud-native is an important buzzword in the data storage and analytics space these days. The way we hear folks use it to advertise their software, it sounds like it must be something wonderful, a data analytics superhighway. But it seems like the meaning shifts depending on who is saying it. It’s a big red flag to me when a phrase means whatever people want it to mean at that moment, mainly to convince you that their software is superior to other software in some nebulous, undefined way, so you’ll buy it. The next time you hear someone using cloud-native in a sentence, consider what they might actually mean.
Tags: Analytics, Big Data, Cloud
Container Boom: Should Databases Be Containerized?
RTInsights
June 11, 2021
Several years back, the application technology industry had this concept of breaking big applications up into smaller independent components, microservices, and deploying each in its own container. The container idea has some pretty cool advantages, it turns out:
Tags: Analytics, Big Data, Cloud
Why is Cloud Repatriation Happening?
https://www.rtinsights.com/
March 16, 2021
More and more organizations that went all-in on cloud early are now finding that some analytics workloads are better on-premises and are pulling those workloads back.
Tags: Analytics, Big Data, Cloud
Natural Language Processing Augmented Analytics
https://www.vertica.com/blog/
February 03, 2021
It’s Like Your Data Saying, “Ask Me Anything”
Analytics only makes an impact when it’s put to work to do a job automatically, or more often, help people do their jobs. The more people who can use analytics, the more valuable it becomes. And nearly every role could benefit from answers their company’s data could provide. What stops analytics from becoming part of everyone’s daily routine? It isn’t a slacking data engineering team, or an imperfect data architecture, it’s the interface. If I need to know something for my job, instead of learning complex SQL queries, or interpreting a bunch of graphs, why can’t I just ask?
Tags: AI, Analytics, Big Data
Deliver Analytics Like Amazon Delivers Packages
https://www.rtinsights.com/
August 31, 2020
Instead of focusing on where the data lives, focus on making the analytics experience as smooth as possible for everyone in your organization.
Tags: Analytics, Cloud, Predictive Analytics
Evolution of the Modern Data Warehouse
Medium
July 24, 2020
There are a lot of definitions of the data warehouse. I grabbed a random definition off the web. It fits the general understanding in the data management industry of what a data warehouse is, and what it isn’t.
It’s also wrong.
“Data warehousing is a technology that aggregates structured data from one or more sources so that it can be compared and analyzed for greater business intelligence.”
If you’re looking at that definition and thinking, “That looks right to me,” then read on. Once upon a time, I probably would have agreed with this definition as well. But times have changed.
Tags: Analytics, Cloud, Predictive Analytics
Can Presto SQL on Hadoop Replace Your Data Warehouse?
http://bigdatapage.com/
July 06, 2020
Presto is the best of the SQL on Hadoop open source bunch. Why not just use it and ditch your analytical database? Uber knows why …
Tags: Analytics, Big Data, Predictive Analytics
What is the Best Hadoop Alternative
Medium
April 28, 2020
Apache Hadoop took the world by storm and looked like it was going to own the data analytics and data management industries for a while there. But now, the hype machine, and the weaknesses of Hadoop — complexity, lack of security and governance, slow performance, poor concurrency, etc. — have everyone looking for a good Hadoop alternative.
Let’s look at some of the options that are being touted for doing Hadoop data analytics, and their pros and cons as Hadoop alternatives.
Tags: Analytics, Big Data, Predictive Analytics
Accelerate Machine Learning with a Unified Analytics Architecture
O'Reilly
February 12, 2022
Unification of data warehouse and data lake architectures into something new, whether you call it a unified analytics architecture, a data lakehouse, or something else, is a trend that nearly every company seems to have been moving toward over the last five years. This new architecture, combined with in-place machine learning on whole data sets, is revolutionizing how data analysis at scale gets done. Read this book to learn how you can get machine learning models into production in minutes, not months.
Tags: AI, Analytics, Big Data
97 Things Every Data Engineer Should Know
O'Reilly
July 06, 2021
From the Preface
Data engineering as a distinct role is relatively new, but the responsibilities have existed for decades. Broadly speaking, a data engineer makes data available for use in analytics, machine learning, business intelligence, etc. The introduction of big data technologies, data science, distributed computing, and the cloud have all contributed to making the work of the data engineer more necessary, more complex, and (paradoxically) more possible. It is an impossible task to write a single book that encompasses everything that you will need to know to be effective as a data engineer, but there are still a number of core principles that will help you in your journey.
This book is a collection of advice from a wide range of individuals who have learned valuable lessons about working with data the hard way.
To save you the work of making their same mistakes, we have collected their advice to give you a set of building blocks that can be used to lay your own foundation for a successful career in data engineering. In these pages you will find career tips for working in data teams, engineering advice for how to think about your tools, and fundamental principles of distributed systems.
There are many paths into data engineering, and no two people will use the same set of tools, but we hope that you will find the inspiration that will guide you on your journey. So regardless of whether this is your first step on the road, or you have been walking it for years we wish you the best of luck in your adventures.
Tags: Analytics, Big Data, DevOps
Reduce Risk and Avoid Lock-in When Moving from On-Prem to Cloud or Hybrid
DBTA
May 10, 2023
There are a lot of advantages to moving to the cloud. But the more various organizations move to the cloud, the more we see them tripping over hidden land mines like out-of-control costs, platform lock-in that restricts future options, regulations that restrict data location, etc. How can a company shift data analysis workloads into the cloud, while minimizing their exposure to those risks?
In this session, you will learn how to:
• transition to the cloud while keeping some workloads on-prem
• move to multiple clouds without hopelessly complicating analytics
• maintain high throughput and low latency in cloud analytics
• keep platform options for the future open
Tags: AI, Analytics, Cloud
Use AIOps to Improve Uptime and Reduce Mean Time to Repair
IWCE
March 27, 2023
Hardware breaks. While a lot of industries talk about going to the cloud – aka someone else’s hardware – to solve uptime issues, this doesn’t work so well when you ARE someone’s hardware provider. Whether you’re a telecom company with towers and networks, a computer infrastructure company, an energy utility, a manufacturing company, or even a car company – anyone who provides solid metal hardware needs to make sure it keeps running.
AIOps is the smart application of advanced analytics, including machine learning, to monitor and, in many cases, fix IT-related problems before the end user is even aware of them. A good example is HPE InfoSight, which embeds analysis capability in every computer HPE sells. For HPE, it improves the productivity of support technicians by reducing support calls by as much as 80% or more, and boosts customer satisfaction by increasing uptime proportionately. A wide variety of other industries are getting similar benefits by embedding the ability to analyze IoT data.
Telecom is another industry with a constant need to monitor traffic, automatically identify overloaded areas, and reroute calls to prevent outages or dropped calls. Where should your next tower go? How can the flood of 5G data be captured, managed, and put to use, without overloading staff or technology?
In every industry, the key question is: How can I find and fix issues before customers are aware of the problem, or at least vastly reduce the mean time to repair those issues that do occur? AIOps holds the key. Harness the flood of IoT data to catch and prevent incidents before they happen, or fix them as soon as possible after they happen.
This sounds amazing as a concept, but how do companies implement this? What is involved? What gotchas hide in the details? With examples from leaders in multiple industries, this presentation will help you learn:
• What is AIOps and what can you expect from a solid implementation
• Benefits AIOps provides in various use cases: predictive maintenance, performance optimization, network bottleneck identification and remediation, MTTR reduction, etc.
• Example implementations - problems you are likely to run into, requirements to make it work, tradeoffs you need to consider, etc.
Tags: 5G, Analytics, IoT
Build analytics for performance and growth on a budget
Data Day Texas
January 28, 2023
When building or changing an enterprise analytics architecture, there are a lot of things to consider: cloud or on-prem, hybrid or multi-cloud, this cloud or that cloud, containerized or not, build tech or buy tech, use the skills in-house or train new skills, etc. While balancing those decisions, there are a lot of considerations, but the main three are performance, costs, and planning for the future, including future growth in analytics demand.
In this session, get some solid data on how to build a data analytics architecture for performance and rapid growth, without breaking the budget, by focusing on what is important and looking at some examples of architectures at companies that are tackling some of the toughest analytics use cases. Learn from others’ mistakes and successes. Learn how real companies like Index Exchange, Simpli.fi, and The Trade Desk analyze data up to petabyte ranges, track millions of real-time actions, generate tens of thousands of reports a day, keep thousands of machine learning models in production and performing, and still keep budgets under control.
Tags: Analytics, Big Data, Predictive Analytics
Get Projects into Production Faster with In-DB ML
Big Data Europe
November 24, 2022
MLOps has rocketed to prominence based on one clear problem: many machine learning projects never make it into production. According to a couple of recent surveys, between 30 and 60% of the small percentage of projects that do make it take 3 months to a year to be put to work. Since data science is a cost center for organizations until those models are deployed, shortening, organizing, and streamlining the process from ideation to production is essential.
Data science is not simply for the broadening of human knowledge; data science teams get paid to find ways to shave costs and boost revenues. That can mean preventative maintenance that keeps machines online, churn reduction, customer experience improvements, targeted marketing that earns and keeps good customers, fraud prevention or cybersecurity that keeps assets safe and prevents loss, or AIOps that optimizes IT to get maximum hardware uptime for minimum costs.
To get those benefits, do you need to add yet another piece of technology to already bloated stacks? There may be a way for organizations to get machine learning into production faster with something nearly every company already has: a good analytics database.
Learn how to:
• Enable data science teams to use their preferred tools – Python, R, Jupyter – on multi-terabyte data sets
• Provide dozens of data types and formats at high scale to data science teams, without duplicating data pipeline efforts
• Make new machine learning projects just as straightforward as enabling BI teams to create a new dashboard
• Get machine learning projects from finished model to production money-maker in minutes, not months
Tags: AI, Big Data, Predictive Analytics
Reduce Risk and Avoid Lock-in When Moving from On-Prem to Cloud or Hybrid
Big Data and AI Toronto
October 06, 2022
There are a lot of advantages to moving to the cloud. But the more various organizations move to the cloud, the more we see them tripping over hidden land mines like out-of-control costs, platform lock-in that restricts future options, regulations that restrict data location, etc. How can a company shift data analysis workloads into the cloud, while minimizing their exposure to those risks?
In this session, you will learn how to:
• transition to the cloud while keeping some workloads on-prem
• move to multiple clouds without hopelessly complicating analytics
• maintain high throughput and low latency in cloud analytics
• keep platform options for the future open
Tags: Analytics, Big Data, Cloud
Achieving Unified Analytics
DBTA Data Summit
May 17, 2022
The data warehouse has been an analytics workhorse for decades for business intelligence teams. But unprecedented volumes and new types of data, plus the need for advanced analyses, brought on the age of the data lake. Now, many companies have a data lake for data science, a data warehouse for BI, or a mishmash of both—possibly combined with a mandate to go to the cloud. Find out how technical and spiritual unification of the two camps can have a powerful impact on the effectiveness of analytics for the business overall.
Tags: AI, Analytics, Big Data
Data Con LA 2021 - In-Database Machine Learning with Jupyter
DataCon LA
September 29, 2021
Jupyter with Python code is a productive way to prepare models, but putting machine learning models into production at scale may require re-building the entire workflow. Using the same interactive tools, but letting a distributed database do the work could get ML models into production in minutes, not months.
Tags: Analytics, AI, Big Data
Making Production Data Accessible for Data Science at Scale
Big Data London
September 22, 2021
The data warehouse has been an analytics workhorse for decades for business intelligence teams. Unprecedented volumes of data, new types of data, and the need for advanced analyses like machine learning brought on the age of the data lake. Now, many companies have a data lake for data science, a data warehouse for BI, or a mishmash of both, possibly combined with a mandate to go to the cloud. The end result can be a sprawling mess, a lot of duplicated effort, a lot of missed opportunities, a lot of projects that never made it into production, and a lot of financial investment without return. Technical and spiritual unification of the two opposed camps can make a powerful impact on the effectiveness of analytics for the business overall.
- Look at successful data architectures from companies like Philips, The Trade Desk, Climate Corporation, …
- Learn to eliminate duplication of effort between data science and BI data engineering teams
- See a variety of ways companies are getting AI and ML projects into production where they have real impact, without bogging down essential BI
- Study analytics architectures that work, why and how they work, and where they’re going from here
Tags: Analytics, Big Data, IoT
Python + MPP Database = Large Scale AI/ML Projects in Production Faster
ODSC East
April 28, 2021
Getting Python data science work into large scale production at companies like Uber, Twitter or Etsy requires a whole new level of data engineering. Economies of scale, concurrency, data manipulation and performance are the bread and butter of MPP analytics databases. Learn how to take advantage of MPP scalability and performance to get your Python work into production where it can make an impact.
Tags: AI, Big Data, Predictive Analytics
Unifying Analytics - Production Analytics Architecture Evolution
Big Data Virtual Masterclass
July 22, 2020
The data warehouse has been an analytics workhorse for decades. Unprecedented volumes of data, new types of data, and the need for advanced analyses like machine learning brought on the age of the data lake. But Hadoop by itself doesn’t really live up to the hype. Now, many companies have a data lake, a data warehouse, or a mishmash of both, possibly combined with a mandate to go to the cloud. The end result can be a sprawling mess, a lot of duplicated effort, a lot of missed opportunities, a lot of projects that never made it into production, and a lot of financial investment without return.
Technical and spiritual unification of the two opposed camps can make a powerful impact on the effectiveness of analytics for the business overall.
Over time, different organizations with massive IoT workloads have found practical ways to bridge the artificial gap between these two data management strategies. Look under the hood at how companies have gotten IoT ML projects working, and how their data architectures have changed over time. Learn about new architectures that successfully supply the needs of both business analysts and data scientists. Get a peek at the future. In this area, no one likes surprises.
- Look at successful data architectures from companies like Philips, Anritsu, Uber, …
- Learn to eliminate duplication of effort between data science and BI data engineering teams
- Avoid some of the traps that have caused so many big data analytics implementations to fail
- Get AI and ML projects into production where they have real impact, without bogging down essential BI
- Study analytics architectures that work, why and how they work, and where they’re going from here
Tags: Analytics, Big Data, IoT
Unifying Analytics: Architecting Production IoT Analytics
Pulsar Summit
January 24, 2020
Analyzing Internet of Things data has broad applications in a variety of industries from smart buildings to smart farming, from network optimization for telecoms to preventative maintenance on expensive medical machines or factory robots. When you look at technology and data engineering choices, even in companies with wildly different use cases and requirements, you see something surprising: Successful production IoT architectures show a remarkable number of similarities.
Join us as we drill into the data architectures in a selection of companies like Philips, Anritsu, and Optimal+. Each company, regardless of industry or use case, has one thing in common: highly successful IoT analytics programs in large scale enterprise production deployments.
By comparing the architectures of these companies, you’ll see the commonalities, and gain a deep understanding of why certain architectural choices make sense in a variety of IoT applications.
Tags: Analytics, Big Data, IoT
Unlock the Value in Data: Rise of Hybrid Cloud, Multi-Cloud Platforms
Vertica
April 21, 2022
Being limited to analyzing data on-premises is a known problem. But analytics limited to cloud, or just a single cloud vendor, can also reduce the return on your data investment. To unlock the value of data, companies must embrace the reality of a hybrid world.
In this webcast, we’ll dive into solid Eckerson Group research on how companies across industries are getting their arms around data in multiple clouds and on-prem systems. ThinkData Works is an example of a successful technology company at the center of this important trend. We invite you to learn how ThinkData Works is helping customers pull in new sources and manage external data at scale to reduce risk, boost efficiency, and drive innovation.
Tags: Analytics, Big Data, Cloud
Cloud Without Compromises: Crucial Analytical Data Platform Requirements
Vertica
March 15, 2022
Most organizations are moving their analytical data platforms – whether based on data warehouses, data lakes, or both — into the cloud. But how do you choose the right platform to fit your organizational realities, your technology strategy and direction, and important product requirements? What are the compromises in choosing a platform that is only available as a cloud service or only available in one cloud? And what are the capabilities you should look for beyond support for business intelligence and analytics, particularly when it comes to supporting machine learning and data science?
Join Doug Henschen, VP and principal analyst at Constellation Research, and author of “What to Consider When Choosing a Cloud-Centric Analytical Data Platform,” for this informative web event on March 10 at 8 am PT/11 am ET. He’ll be joined by Paige Roberts, Open Source Relations Manager at Vertica, and by Bert Corderman, Senior Manager of Engineering at The Trade Desk.
Tags: Analytics, Big Data, Cloud
Find the Balance Between MPP Databases and Spark for Analytical Processing
Vertica
August 25, 2021
Both Apache Spark and massively parallel processing (MPP) databases are designed for the demands of analytical workloads. Each has strengths related to the full data science workflow, from consolidating data from many siloes, to deploying and managing machine learning models. Understanding the power of each technology, and the cost and performance trade-offs between them can help you optimize your analytics architecture to get the best of both. Learn when using Spark accelerates data processing, and when it spreads far beyond what you want to maintain. Learn when an MPP database can provide blazing fast analytics, and when it can fail to meet your needs. Most of all, learn how these two powerful technologies can combine to create a perfect balance of power, cost, and performance.
Tags: Analytics, Big Data, Predictive Analytics
Thought Leadership: Modernize Data Warehousing – Beyond Performance
Vertica
March 15, 2021
Configuration, management, tuning and other tasks can take away from valuable time spent on business analytics. If a platform leads to coding workarounds, non-intuitive implementations and other problems, it can make a big impact on long-term resource usage and cost. A lot of enterprise analytics platform evaluations focus on query price-performance to the exclusion of other features that can have a huge impact on business value, and can cause major headaches if you don’t take them into consideration.
In this webinar, we’ll go beyond price-performance, and focus on everything else needed to modernize your data warehouse.
Tags: Analytics, Big Data, Predictive Analytics
Natural Language Processing Augmented Analytics
Vertica
November 17, 2020
The goal of data analytics, whether business intelligence or advanced analytics like machine learning, has always been to guide organizations with solid data, rather than feelings. While every company strives to be data-driven, this requires making analytics accessible to more people. What could be more accessible than asking your data a question in your own language? Tune in to learn about natural language processing, the challenges and benefits of this exciting technology, and how it can democratize data analytics and bring business results to the next level.
Tags: AI, Analytics, Predictive Analytics