By Bob Gourley
We previously wrote about the Pentaho Big Data Blueprints series, which includes design packages of use to enterprise architects and other technologists seeking operational concepts and repeatable designs. With this post we provide more information from the blueprint on Optimizing the Data Warehouse:
Optimizing your data warehouse can reduce strain on existing systems and lower overall project cost by offloading less frequently used data, and the transformation workloads that go with it, to Hadoop, without hand coding, legacy scripts, or the limitations of traditional ETL products. Done right, this saves money and makes the overall system more functional at the same time.
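To make the offload concrete, here is a minimal sketch of the kind of hand-coded step this blueprint aims to replace: pulling infrequently queried rows out of the warehouse over JDBC and landing them in HDFS as flat files. This is not Pentaho code; the connection string, table, columns, cutoff date, and paths are all illustrative assumptions.

```java
import java.io.BufferedWriter;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/**
 * Hand-coded warehouse offload: copy rows older than a cutoff date from a
 * relational data warehouse into HDFS as delimited text. Connection details,
 * table, and column names are illustrative; the warehouse JDBC driver is
 * assumed to be on the classpath.
 */
public class ColdDataOffload {
    public static void main(String[] args) throws Exception {
        String jdbcUrl = "jdbc:postgresql://dw-host:5432/warehouse"; // assumed warehouse
        String cutoff  = "2012-01-01";                               // assumed "cold data" boundary

        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path target = new Path("/archive/sales_history/part-0000.csv"); // assumed HDFS layout

        try (Connection db = DriverManager.getConnection(jdbcUrl, "etl_user", "secret");
             PreparedStatement stmt = db.prepareStatement(
                 "SELECT order_id, customer_id, amount, order_date " +
                 "FROM sales_history WHERE order_date < ?");
             BufferedWriter out = new BufferedWriter(
                 new OutputStreamWriter(fs.create(target, true), StandardCharsets.UTF_8))) {

            stmt.setDate(1, java.sql.Date.valueOf(cutoff));
            try (ResultSet rows = stmt.executeQuery()) {
                while (rows.next()) {
                    // Write one delimited line per offloaded row.
                    out.write(rows.getLong("order_id") + "," +
                              rows.getLong("customer_id") + "," +
                              rows.getBigDecimal("amount") + "," +
                              rows.getDate("order_date"));
                    out.newLine();
                }
            }
        }
        // A real job would also verify the copy before purging rows from the warehouse.
    }
}
```

Even this simple version has to be written, tested, and maintained by hand for every table that gets offloaded, which is the cost the visual, no-coding approach is meant to remove.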
Here is more from Pentaho:
What Is It?
- Hadoop Made Simple, Accessible, and 15x Faster
- Pentaho simplifies offloading to Hadoop and speeds development and deployment by as much as 15x versus hand-coding approaches. Complete visual integration tools eliminate the need to hand code SQL or Java-based MapReduce jobs.
Save on data costs and boost analytics performance
- Intuitive, graphical big data integration with no coding required.
- Access to every data source, from operational and relational systems to NoSQL technologies.
- Support for every major Hadoop distribution with a future-proof adaptive big data layer.
- Achieve higher processing performance with Pentaho MapReduce running in the cluster.
- 100% Java, fast and efficient.
As part of the Pentaho Business Analytics Platform, it offers a quick, cost-effective way to get immediate value from data through integrated reporting, dashboards, data discovery, and predictive analytics.
How It Works
Here is an example of how this may look within an IT landscape:
- The company in this example leverages data from disparate sources, including CRM and ERP systems.
- A Hadoop cluster has been implemented to offload less frequently used data from the existing data warehouse.
- The company saves on storage costs and speeds up query performance and access to its analytic data mart.
- Staff savings and productivity: Pentaho’s Visual MapReduce GUI and big data integration mean existing data warehouse developers can move data between the data warehouse and Hadoop without coding (a sketch of the hand-coded alternative appears after this list).
- Time to value: MapReduce development time is reduced by up to 15x versus hand-coding, based on Pentaho’s comparisons.
- Faster job execution: Pentaho MapReduce runs faster in-cluster than code-generating scripting tools.
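For contrast with the Visual MapReduce approach described above, below is a minimal sketch of the hand-coded, Java-based MapReduce job a data warehouse developer would otherwise have to write and submit to the cluster. It simply filters delimited warehouse extracts already landed in HDFS down to "cold" records older than a cutoff date; the paths, field positions, and cutoff are assumptions for illustration, not details from the blueprint.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

/**
 * Hand-coded MapReduce job: keep only "cold" records (order_date before a
 * cutoff) from delimited warehouse extracts in HDFS. Paths, field positions,
 * and the cutoff are illustrative assumptions.
 */
public class ColdRecordFilterJob {

    public static class ColdRecordMapper
            extends Mapper<LongWritable, Text, Text, NullWritable> {

        private static final String CUTOFF = "2012-01-01"; // assumed cold-data boundary
        private static final int DATE_FIELD = 3;           // assumed column position

        @Override
        protected void map(LongWritable offset, Text line, Context context)
                throws IOException, InterruptedException {
            String[] fields = line.toString().split(",");
            // ISO dates compare correctly as strings, so a lexicographic check suffices.
            if (fields.length > DATE_FIELD && fields[DATE_FIELD].compareTo(CUTOFF) < 0) {
                context.write(line, NullWritable.get());
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "cold-record-filter");
        job.setJarByClass(ColdRecordFilterJob.class);
        job.setMapperClass(ColdRecordMapper.class);
        job.setNumReduceTasks(0); // map-only: filtered rows go straight to the output files
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        FileInputFormat.addInputPath(job, new Path("/staging/sales_extracts"));  // assumed input
        FileOutputFormat.setOutputPath(job, new Path("/archive/sales_history")); // assumed output
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Running as a map-only job keeps the work distributed across the cluster, but every change to the file layout or filter logic means another edit, compile, and deploy cycle, which is the sort of development overhead the 15x comparison is about.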
As an example, a leading global network storage company had the goal of scaling machine data management to enhance product performance and customer success:
- Affordably scale machine data from storage devices for customer applications
- Predict device failure
- Enhance product performance
- Easy-to-use ETL and analysis for Hadoop, HBase, and Oracle data sources
- 15x data cost improvement
- Stronger performance against customer Service Level Agreements
For more on these and other blueprints, see Pentaho’s Blueprints to Big Data Success.