Pentaho Data Catalog | Pentaho https://pentaho.com Wed, 28 May 2025 15:03:43 +0000

What Banks Need to Know About EU AI Act Compliance and Ethical AI Governance https://pentaho.com/insights/blogs/eu-ai-act-compliance-for-banks/ Tue, 15 Apr 2025 03:49:22 +0000 The EU AI Act is reshaping banking. See how Pentaho simplifies AI compliance and governance to help banks lead with trust and ethical innovation.

The post What Banks Need to Know About EU AI Act Compliance and Ethical AI Governance first appeared on Pentaho.

With the European Union (EU) now setting strong artificial intelligence (AI) standards, banks are quickly arriving at a crossroads with AI and GenAI. Their challenge is twofold: satisfying new regulatory requirements while also breaking new ground in ethical AI and data management.

The EU’s evolving AI laws, including the new AI Act, prioritize fairness, transparency, and accountability. These laws will disrupt the way AI is already implemented, requiring banks to redesign how they manage, access, and use data. Yet, as we’ve seen with other regulations, meeting these requirements can present an opportunity. As banks evolve to meet these laws, the resulting improvements can increase customer trust and position them as market leaders in regulated AI adoption.

Meeting the EU AI Act Moment

There are a few key areas where banks should invest to both adhere to the EU AI Act and reap additional benefits across other regulatory and business requirements.

Redefining Data Governance for the AI Age

Strong data governance sits at the heart of the EU’s AI legislation. Banks must ensure the data driving AI algorithms is open, auditable, and bias-free. Good data governance turns compliance from a reactive chore into a proactively managed discipline, establishing the basis for scalable, ethical AI. Banks can achieve this through technology that delivers:

Unified Data Integration: The ability to integrate disparate data sources into a centralized, governed environment ensures data consistency and eliminates silos. A comprehensive view of data is essential for regulatory compliance and effective AI development.

Complete Data Lineage and Traceability: Tracking data lineage from origin to final use creates full transparency throughout the data lifecycle. This directly addresses regulatory requirements for AI explainability and accountability.

Proactive Bias Detection: Robust data profiling and quality tools allow banks to identify and mitigate biases in training datasets, ensuring AI models are fair and non-discriminatory.
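To make the bias-detection idea concrete, here is a minimal Python sketch of one common screening check, the "four-fifths" disparate-impact rule, run over a hypothetical loan-approval dataset. The records, field names, and threshold are invented for the example and are not part of any Pentaho product; real bias audits use far richer statistical tests.

```python
from collections import defaultdict

def approval_rates(records, group_key="group", outcome_key="approved"):
    """Compute per-group approval rates from a list of dict records."""
    totals, approved = defaultdict(int), defaultdict(int)
    for r in records:
        totals[r[group_key]] += 1
        approved[r[group_key]] += int(r[outcome_key])
    return {g: approved[g] / totals[g] for g in totals}

def disparate_impact_flags(rates, threshold=0.8):
    """Flag groups whose approval rate falls below `threshold` times the
    best-performing group's rate (the common 'four-fifths' screen)."""
    best = max(rates.values())
    return {g: rate / best < threshold for g, rate in rates.items()}

# Hypothetical training records for a loan-approval model.
records = [
    {"group": "A", "approved": 1}, {"group": "A", "approved": 1},
    {"group": "A", "approved": 1}, {"group": "A", "approved": 0},
    {"group": "B", "approved": 1}, {"group": "B", "approved": 0},
    {"group": "B", "approved": 0}, {"group": "B", "approved": 0},
]
rates = approval_rates(records)        # {"A": 0.75, "B": 0.25}
flags = disparate_impact_flags(rates)  # group B is flagged for review
```

A check like this, run automatically during data profiling, surfaces skewed training data before a model ever reaches production.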

Building Ethical AI From the Ground Up

Ethical AI is becoming both a legal imperative and a business necessity. The EU’s emphasis on ethical AI requires banks to prioritize fairness, inclusivity, and transparency in their algorithms. This demands continuous monitoring, validation, and explainability, all of which can foster stronger customer relationships and differentiate banks as pioneers in responsible AI through:

Real-Time AI Model Monitoring: Integrating with machine learning platforms enables teams to monitor AI models in real-time, flagging anomalies and ensuring adherence to ethical standards.

Explainable AI (XAI): AI explainability is supported by tools that visualize decision-making pathways, enabling stakeholders and regulators to understand and trust AI outcomes.

Collaborative AI Governance: Facilitating collaboration between data scientists, compliance officers, and business leaders ensures that ethical considerations are embedded across the AI development lifecycle.
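As a toy illustration of the explainability idea above, the sketch below attributes a linear credit-scoring model's output to individual features. The model, weights, and baseline values are invented for the example; production XAI tooling typically uses richer methods such as SHAP on non-linear models.

```python
def explain_linear_decision(weights, baseline, applicant):
    """Per-feature contribution of a linear scoring model:
    contribution_i = weight_i * (x_i - baseline_i)."""
    return {f: weights[f] * (applicant[f] - baseline[f]) for f in weights}

# Hypothetical model parameters and a hypothetical applicant.
weights   = {"income": 0.5, "debt_ratio": -2.0, "years_history": 0.3}
baseline  = {"income": 50.0, "debt_ratio": 0.4, "years_history": 5.0}
applicant = {"income": 60.0, "debt_ratio": 0.6, "years_history": 10.0}

contrib = explain_linear_decision(weights, baseline, applicant)
# income contributes +5.0, debt_ratio roughly -0.4, years_history +1.5,
# giving a stakeholder a per-feature view of why the score moved.
```

Even this simple decomposition lets a compliance officer answer "which factor drove this decision?" for a given applicant.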

Streamlined Regulatory Compliance

Regulatory compliance often involves extensive reporting, auditing, and data security measures. Technology that simplifies these processes helps banks navigate the complex EU AI regulatory framework while driving down costs, boosting productivity, and empowering banks to innovate while maintaining adherence to regulations.

Automated Compliance Reporting: Customizable reporting tools generate regulatory-compliant reports quickly and accurately, reducing the burden on compliance teams.

Audit-Ready Data Workflows: A platform with built-in audit trail features documents every step of the data process, providing regulators with clear and actionable insights.

Privacy-Centric Data Management: Support for data anonymization and encryption ensures GDPR compliance and safeguards customer information.

Transparency and Accountability: The Hallmarks of Leadership

AI is transforming financial services, but customers’ confidence matters. Banks must be transparent and accountable to generate trust in AI decision-making. When banks treat transparency as a path to redefining relationships, they can transform customer interactions.

Customer-Centric Insights: Intuitive dashboards allow banks to explain AI-driven decisions to customers, enhancing trust and satisfaction.

Stakeholder Engagement: Interactive visualizations and real-time analytics enable banks to communicate compliance metrics and AI performance to regulators and stakeholders.

Collaborative Transparency: Collaborative features ensure that transparency and accountability are integral to every AI project, from design to deployment.

Leveraging Pentaho for Compliant AI

To fully adopt a strategic approach to AI compliance, banks can capitalize on Pentaho’s capabilities to:

  • Develop a Unified Governance Framework
    Use Pentaho to create a centralized data governance model, ensuring alignment with EU standards and global best practices.
  • Prioritize Data Lineage and Quality
    Leverage Pentaho’s data cataloging and profiling tools to ensure that all datasets meet compliance requirements and ethical standards.
  • Foster Collaboration Across Teams
    Involve compliance officers, data scientists, and business leaders in AI governance, using Pentaho to enable cross-functional workflows.
  • Monitor AI Continuously
    Implement Pentaho’s real-time monitoring and reporting features to proactively address compliance risks and optimize AI performance.
  • Communicate Compliance Effectively
    Use Pentaho’s visualization and reporting tools to provide stakeholders with clear and actionable insights into AI processes.

The Path Forward to Robust AI Compliance and Performance

Imagine a world where banks don’t just tackle compliance problems but use them as strategic growth engines. Pentaho’s full-spectrum data integration, governance, and analytics products empower financial institutions not only to adapt to change but to lead the way in ethical AI practice. This openness helps them not only meet today’s regulatory standards but also set the direction for responsible AI use in the future.

Pentaho is well positioned to help transform finance industry systems into intelligent and compliant AI engines, especially ahead of the new AI regulations coming from the European Union. This is a time of significant change for banks where the right combination of modern technology and enabling regulation can re-energize client trust – an approach Pentaho is looking to lead.

Ready to make compliance your competitive advantage? See how Pentaho powers ethical AI for the financial services industry.

New CFPB Data Compliance Requirements Will Test the Limits of Financial Data Management Strategies https://pentaho.com/insights/blogs/new-cfpb-data-compliance-requirements-will-test-the-limits-of-financial-data-management-strategies/ Tue, 17 Dec 2024 18:42:22 +0000

The post New CFPB Data Compliance Requirements Will Test the Limits of Financial Data Management Strategies first appeared on Pentaho.

The Consumer Financial Protection Bureau (CFPB) recently announced new rules to strengthen oversight over consumer financial information and place more limits on data brokers. The new rules — the Personal Financial Data Rights Rule (Open Banking Rule) and the Proposed Rule on Data Broker Practices — will change the face of financial data management.

Organizations across a wide spectrum of the financial industry – from credit unions to fintech companies and data brokers – now face new data access, privacy, consent, lineage, auditability, and reporting requirements. Compliance with these new CFPB requirements will be a massive operational and technical challenge for most companies.

Below is a breakdown of the unique issues that arise with the new CFPB guidelines and how impacted organizations need to rethink their data lineage, privacy controls, automation, and auditing strategies.

The Personal Financial Data Rights Rule (Open Banking) 

The Personal Financial Data Rights Rule from the CFPB seeks to enable consumers to manage, access, and share financial information with third-party providers. Financial institutions must offer data access, portability, and privacy protection, with full visibility and control over who has accessed the data and when.

Key Challenges and Strategies: Data Access and Portability

Banks and financial institutions must allow consumers to migrate their financial information to third parties. Institutions will need to demonstrate when, how, and why consumer data was shared. They must also protect consumer information and share only the data consumers have consented to.

Automated ETL (Extract, Transform and Load) can help institutions collect consumer financial information across diverse sources (CRMs, payment systems, loan management systems) and turn it into common formats for easier management and tracing. This also supports lineage, which is crucial to providing regulators with a full audit trail. Integration with Open Banking APIs, and the ability to share data directly with third parties, will be essential.
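As a rough illustration of what such a normalization step looks like, here is a minimal Python sketch (not Pentaho code; the schema, field names, and `to_common_format` helper are invented for the example) that converts a source record into a common format and attaches lineage metadata recording where it came from and when it was transformed:

```python
from datetime import datetime, timezone

def to_common_format(record, source_system):
    """Normalize a source-system record into a common schema and attach
    lineage metadata for the audit trail."""
    normalized = {
        "customer_id": str(record["id"]),
        "balance": round(float(record["balance"]), 2),
        "currency": record.get("currency", "USD"),
    }
    lineage = {
        "source_system": source_system,
        "source_fields": sorted(record.keys()),
        "transformed_at": datetime.now(timezone.utc).isoformat(),
    }
    return {"data": normalized, "lineage": lineage}

# A hypothetical row pulled from a CRM.
crm_row = {"id": 42, "balance": "1034.5", "currency": "EUR"}
out = to_common_format(crm_row, source_system="crm")
# out["data"] is now in the common schema; out["lineage"] documents
# the source system, source fields, and transformation timestamp.
```

In a real pipeline the lineage records would be written to a catalog alongside the data, so any downstream dataset can be traced back to its origin.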

Role-based access is an important control for ensuring that only authorized users and systems access defined data, and the ability to mask or encrypt PII helps anonymize consumer data when it is provided to third parties.
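One simple way to implement the PII-masking idea is to replace sensitive values with salted one-way hashes, so third parties see consistent pseudonyms rather than raw identifiers. The sketch below is illustrative only: the field names are assumptions, and a production system would manage the salt as a protected secret rather than a literal.

```python
import hashlib

PII_FIELDS = {"name", "ssn", "email"}

def mask_record(record, pii_fields=PII_FIELDS, salt="per-deployment-secret"):
    """Replace PII values with short salted hashes so shared data carries
    consistent pseudonyms instead of raw identifiers."""
    masked = {}
    for key, value in record.items():
        if key in pii_fields:
            digest = hashlib.sha256((salt + str(value)).encode()).hexdigest()
            masked[key] = digest[:12]  # short, stable pseudonym
        else:
            masked[key] = value
    return masked

row = {"name": "Ada Lovelace", "ssn": "123-45-6789", "balance": 250.0}
safe = mask_record(row)
# Non-PII fields pass through untouched; PII fields are pseudonymized
# deterministically, so joins across shared datasets still work.
```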

The New Data Broker Rules 

The CFPB’s revised data broker rules expand the scope of the Fair Credit Reporting Act (FCRA) and include credit rating agencies. Data brokers who purchase, sell, or process consumer data now have to respect consumer privacy, consent, and deletion rights.

Key Challenges and Strategies: Data Deletion Requests 

Under this new rule, brokers will need to comply with consumer data deletion requests. Data brokers must ensure consumer data is shared only with explicit consent. Regulators are now demanding an audit trail of when and with whom consumer data was shared.

Automating data deletion workflows helps organizations automatically detect and delete every reference to a consumer’s data in databases, data warehouses, and third-party data lakes. Purge-on-request workflows ensure that databases are automatically cleansed, duplicates removed, and consumer records deleted when deletion requests are received.
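A deletion sweep of this kind can be sketched in a few lines. The store names and record shapes below are hypothetical, and a real implementation would also cover backups, caches, and third-party notifications:

```python
def delete_consumer(consumer_id, stores):
    """Sweep every registered data store, remove all records referencing
    the consumer, and return a per-store deletion count for the audit log."""
    log = {}
    for name, records in stores.items():
        before = len(records)
        # Rebind in place so callers holding the same list see the purge.
        records[:] = [r for r in records if r.get("consumer_id") != consumer_id]
        log[name] = before - len(records)
    return log

# Hypothetical stores: a warehouse and a third-party data lake.
stores = {
    "warehouse": [{"consumer_id": "c1"}, {"consumer_id": "c2"}],
    "third_party_lake": [{"consumer_id": "c1"}, {"consumer_id": "c1"}],
}
log = delete_consumer("c1", stores)
# log records how many rows were purged per store, evidence that can be
# handed to regulators alongside the deletion request.
```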

Marking and categorizing consumer data and grouping it according to privacy policies and access levels enables data to be more easily managed and deleted when needed. Also, data masking blocks third-party access to PII, supporting access and anonymization requirements.

Being able to track data as it is processed across databases and APIs provides the ability to demonstrate with certainty to regulators how, where and when data was used. All of these capabilities support the regular reporting that can be submitted directly to the CFPB.

Supporting Data Privacy, Consent, and Portability

Both CFPB regulations are focused on consumer consent, privacy management, and data portability. Businesses must now allow consumers to have control over their data and know where it is being shared.

Key Challenges and Strategies: Consent Tracking 

Consumers must be able to revoke their consent to data sharing. They need access to, and the ability to export, their personal data in common formats. This means data spread across multiple silos must be synchronized with the latest consumer consent.

Visualizing consumer consent data and monitoring change requests over time will be crucial for compliance and reporting.  Organizations will need to have clean data change logs supported by data lineage metadata to provide a full audit trail.

Having data management tools that integrate with REST APIs will make it easier to export consumer data to other banks or fintech providers as needed. The ability to export data in multiple formats, such as CSV, JSON, or XML, allows integration with third-party programs. It will also be important to sync consent updates between multiple data warehouses so that consumer data is removed from the system when consent is revoked. 
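A multi-format export helper along these lines might look as follows. This is a standard-library Python sketch, and the record shape is invented for the example:

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

def export_consumer(record, fmt):
    """Serialize one consumer record as CSV, JSON, or XML."""
    if fmt == "json":
        return json.dumps(record, sort_keys=True)
    if fmt == "csv":
        buf = io.StringIO()
        writer = csv.DictWriter(buf, fieldnames=sorted(record))
        writer.writeheader()
        writer.writerow(record)
        return buf.getvalue()
    if fmt == "xml":
        root = ET.Element("consumer")
        for key in sorted(record):
            ET.SubElement(root, key).text = str(record[key])
        return ET.tostring(root, encoding="unicode")
    raise ValueError(f"unsupported format: {fmt}")

record = {"id": "c1", "name": "Ada", "consent": "granted"}
as_json = export_consumer(record, "json")
as_xml = export_consumer(record, "xml")
as_csv = export_consumer(record, "csv")
```

Keeping a single record schema behind several serializers is what lets the same portability request be answered for a bank expecting CSV, a fintech expecting JSON, or a legacy system expecting XML.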

Assuring Perpetual Compliance with CFPB Audit & Reporting Requirements

In the long term, CFPB compliance will require businesses to be consistently transparent, demonstrate compliance, and produce reports on regulators’ demand. This means organizations must adopt audit-friendly data lineage, be able to produce on-demand reports that capture a wide variety of variables, and spot errors early so they can triage mishandling, validate missing or incorrect data, and proactively address issues before auditors discover them.

Meeting The Consumer Data Privacy New World Order Head On 

The new CFPB rules on data privacy, consumer consent, and broker practices present significant hurdles for financial institutions. Compliance requires strong data governance, real-time auditability, and controlled data sharing. Pentaho’s product portfolio — Pentaho Data Integration (PDI), Pentaho Data Catalog (PDC), and Pentaho Data Quality (PDQ) — addresses these challenges through data privacy, portability, and auditability.

With Pentaho’s data integration, lineage management, and consent management functionality, financial companies can meet the CFPB’s regulations and reduce the risk of non-compliance fines. Contact our team to learn more! 

Understanding Data Lineage: Why It’s Essential for Effective Data Governance https://pentaho.com/insights/blogs/understanding-data-lineage-why-its-essential-for-effective-data-governance/ Tue, 19 Nov 2024 18:28:46 +0000 In the world of data-driven decision-making, transparency is key.

The post Understanding Data Lineage: Why It’s Essential for Effective Data Governance first appeared on Pentaho.

In the world of data-driven decision-making, transparency is key. Knowing where your data comes from, how it’s transformed, and where it ends up is crucial for organizations aiming to build trust, ensure compliance, and drive value from data. This concept is known as data lineage, and it’s a cornerstone of modern data governance strategies. 

Let’s explore what data lineage is, why it matters, and how tools like Pentaho+ make it easier for organizations to implement robust data lineage tracking across their data ecosystems. 

What is Data Lineage? 

Data lineage is the ability to trace the journey of data as it flows from its origin to its final destination, detailing every transformation, calculation, or movement along the way. It provides a visual and historical record of data, allowing stakeholders to see how data has been manipulated, merged, or split to serve different business purposes. 

In a practical sense, data lineage answers questions like: 

  • Where did this data originate? 
  • How was this data transformed or processed? 
  • What are the relationships between datasets? 

Think of data lineage as a roadmap that shows the route data has taken and the stops it made along the way. This roadmap helps organizations keep track of data’s entire lifecycle, from initial capture to its end use, which is especially valuable in regulated industries like finance, healthcare, and government. 
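Conceptually, a lineage record is just a log of steps (inputs, operation, output) that can be walked backwards. The minimal sketch below is independent of any particular product; the dataset names are invented for the example:

```python
class LineageTracker:
    """Minimal lineage log: each step records its inputs, the operation,
    and the output dataset, so any dataset's ancestry can be traced."""

    def __init__(self):
        self.steps = []

    def record(self, inputs, operation, output):
        self.steps.append({"inputs": list(inputs), "op": operation, "output": output})

    def ancestry(self, dataset):
        """Walk backwards from `dataset` to every origin it depends on."""
        parents = set()
        frontier = [dataset]
        while frontier:
            current = frontier.pop()
            for step in self.steps:
                if step["output"] == current:
                    for src in step["inputs"]:
                        if src not in parents:
                            parents.add(src)
                            frontier.append(src)
        return parents

lineage = LineageTracker()
lineage.record(["crm.accounts"], "extract", "staging.accounts")
lineage.record(["staging.accounts", "erp.orders"], "join", "mart.revenue")
origins = lineage.ancestry("mart.revenue")
# origins contains every upstream dataset mart.revenue was built from.
```

Production lineage tools capture this graph automatically from pipeline metadata rather than manual `record` calls, but the roadmap metaphor maps directly onto this structure.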

Why is Data Lineage Important? 

Data lineage provides value across several areas of data management and governance, helping organizations maintain data quality, meet regulatory requirements, and empower decision-making. 

  1. Ensures Data Quality and Trust

With a clear lineage, organizations can ensure that data is accurate and reliable. By understanding where data comes from and how it’s transformed, organizations can spot any inconsistencies or errors in real-time. This builds confidence in the data, ensuring that decisions based on it are well-informed and trustworthy. 

  2. Simplifies Compliance and Auditing

For industries under regulatory scrutiny, such as finance or healthcare, data lineage is essential for compliance. Regulations like GDPR, HIPAA, and PCI DSS require organizations to document how data is used and protected. Lineage tracking allows organizations to demonstrate compliance, providing auditors with a clear trail of data usage and handling practices. 

  3. Supports Impact Analysis and Risk Management

When organizations consider making changes to data processes or systems, data lineage helps them assess the potential impact. By knowing which reports or analyses rely on specific data sources, teams can manage risks associated with data changes, system migrations, or updates with confidence. 

  4. Enhances Data Governance

Data lineage is at the heart of data governance, providing transparency and accountability across data systems. By maintaining lineage, organizations empower data governance teams to manage policies, monitor usage, and make informed decisions about data access, retention, and security. 

How Does Data Lineage Work in Practice? 

To effectively trace data lineage, organizations need tools that can automatically map and record data flows across different systems, formats, and transformations. This can be challenging, especially in environments with multiple data sources and complex transformations. 

Automated Lineage Tracking with Pentaho+ 

Pentaho+ simplifies data lineage by providing automated lineage tracking capabilities. This allows organizations to visualize data flows, capture transformations, and document data relationships in a centralized platform. 

  • Galaxy View for Visual Lineage: Pentaho+ provides a Galaxy View feature, which visually represents data relationships, transformations, and dependencies. This visual tool makes it easy for data stewards and analysts to understand the data’s journey and quickly pinpoint any issues or compliance concerns. 
  • Out-of-the-Box Lineage for ETL and ELT Processes: Pentaho+ supports both ETL (Extract, Transform, Load) and ELT (Extract, Load, Transform) processes, enabling organizations to track lineage across complex data pipelines without manual intervention. 
  • End-to-End Lineage Across Cloud and On-Premises Systems: Pentaho+ integrates with popular cloud storage solutions and on-premises databases, ensuring that data lineage can be traced across hybrid environments, a critical feature for today’s data ecosystems. 

Real-World Example: Data Lineage in Financial Services 

Imagine a financial institution that needs to comply with PCI DSS, which requires transparency in handling cardholder data. Using Pentaho+, the organization can document and visualize data lineage across its systems, ensuring that every transformation, calculation, and report is traceable. 

With Galaxy View, the finance team can quickly see how data flows from the customer’s initial card transaction, through encryption processes, to final storage. If auditors request details on specific data handling practices, the organization can use its lineage documentation to show exactly how cardholder data is managed in compliance with PCI DSS, saving time and reducing compliance risk. 

Key Takeaways for Implementing Data Lineage 

Data lineage is more than just a data governance tool—it’s a way to build trust, ensure compliance, and empower decision-making. By implementing automated lineage tracking with a solution like Pentaho+, organizations can: 

  1. Strengthen Data Quality and Transparency: Track data origins and transformations to enhance data accuracy and trust. 
  2. Simplify Compliance: Maintain comprehensive records of data usage to support regulatory reporting and audits. 
  3. Manage Data Risk: Assess the potential impacts of changes in data systems or processes with accurate impact analysis. 

Conclusion: Data Lineage as a Foundation for Data Governance

Data lineage provides a clear path to understanding and managing data, from origin to end use. In today’s regulatory and data-driven landscape, it’s a must-have for any organization looking to maintain compliance and ensure data quality. With Pentaho’s lineage tracking tools, organizations can visualize data relationships, maintain transparency, and build a foundation for effective data governance. 

Data lineage isn’t just a best practice—it’s a competitive advantage that brings clarity, accountability, and confidence to data management. Ready to explore how Pentaho+ can support your data governance goals? Contact our team to learn more! 

Having Trouble Funding Gen AI? Increase Innovation Investment Through Smarter Data and Storage Management https://pentaho.com/insights/blogs/having-trouble-funding-gen-ai-increase-innovation-investment-through-smarter-data-and-storage-management/ Wed, 30 Oct 2024 21:32:32 +0000

The post Having Trouble Funding Gen AI? Increase Innovation Investment Through Smarter Data and Storage Management first appeared on Pentaho.

Every organization is managing through exponential information growth, much of which is driven by unstructured data.  

Since it lives in PDFs, videos, social media, and other sources, unstructured data defies the easy classification organizations are used to with traditional SQL-based sources. This makes it harder to understand and manage from a usability, governance, and security standpoint. Its expansive nature also quickly increases storage costs and adds to data sprawl challenges.

We know unstructured data has incredible untapped value and potential to enhance any number of products and services, including helping to unlock the promise of GenAI. However, lack of understanding and classification of this data increases risk, especially with data that may be sensitive or stored at odds with the retention requirements for that class of data.  

Data and IT teams are looking for ways to get a better handle on unstructured data. They are also looking to free up budget to move GenAI from POCs and pilots into production. A strong data classification strategy, combined with storage tiering and automation, can improve performance and unlock crucial infrastructure and data management savings to fuel AI and GenAI efforts. 

First, Understand All of Your Data

A well-structured data classification system helps organizations easily identify and access relevant data for any number of operational and innovative applications. This has taken on renewed importance since AI and GenAI applications rely on vast amounts of data for training and learning. 

Today, effective data classification means being able to access and understand all data, both structured data and unstructured sources including PDFs, blob files and media formats such as images, videos, audio, and more. Understanding the metadata around these sources and being able to score them on quality and reliability are vital to any customer-facing or decision-influencing GenAI or AI application.  

Data classification also plays an important role in governance and regulatory compliance. While there are already many industry-specific regulations such as HIPAA and Know Your Customer, there are also a wide range of laws already in place that relate to data handling and privacy that apply to AI. This doesn’t even include whatever new laws are coming, which are in various stages of implementation in different regions. Properly identifying sensitive information at scale gives organizations the power to apply the necessary rules and measures that reduce risk and help avoid potential fines while maintaining customer and stakeholder trust. 

Automating Storage: Right-Size Usage, Recapture Budget and Increase Bandwidth  

Once data is properly classified, you can implement tools that detect various aspects of the data lifecycle to score its value. The scoring of data’s value should be based on multiple attributes (size, usage rate, where it’s being used and for what purpose) to inform storage tiering policies that can then be automatically applied to every piece of data.  

Powered by automation and intelligence, this process creates cost savings in three ways. First, in overall storage costs: since intelligent tiering and re-tiering allocates data location based on use and value, infrequently used data can be sent to lower-cost environments. Second, with all data properly classified, it becomes much easier to quickly retrieve and re-tier data only as needed for uncommon upstream application requests or new AI/GenAI asks. Third, with classification and policies established, an organization can better manage retention policies to ensure they are correctly implemented based on regulatory and corporate guidelines.
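A value-scoring and tiering policy of this kind can be sketched as follows. The weights, cutoffs, and dataset attributes below are purely illustrative; a real policy engine would tune them against actual storage pricing and access patterns:

```python
from datetime import date

def tier_for(record, today=date(2024, 6, 1)):
    """Score a dataset on access rate, recency, and size, then map the
    score to a storage tier. Weights and cutoffs are illustrative."""
    days_idle = (today - record["last_accessed"]).days
    score = (
        2.0 * record["accesses_per_month"]   # heavily used data stays hot
        - 0.05 * days_idle                   # idle data drifts colder
        - 0.001 * record["size_gb"]          # bulk nudges data down-tier
    )
    if score > 10:
        return "hot"
    if score > 0:
        return "warm"
    return "cold"

# Hypothetical datasets: one active, one long idle.
datasets = [
    {"name": "txn_2024", "size_gb": 120, "accesses_per_month": 40,
     "last_accessed": date(2024, 5, 30)},
    {"name": "txn_2013", "size_gb": 800, "accesses_per_month": 0,
     "last_accessed": date(2023, 1, 15)},
]
tiers = {d["name"]: tier_for(d) for d in datasets}
# The active dataset lands in hot storage; the idle archive goes cold.
```

Run on a schedule, a policy like this is what lets re-tiering scale with data growth instead of relying on manual review.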

Automated storage policies also scale with data’s growth, keeping costly manual processes at bay and protecting the hard-won agility and bandwidth teams need to keep up with AI and GenAI demands.  

A Winning Combination 

Integrating data classification with automated data lifecycle policy creation and enforcement creates a strong foundation for AI and GenAI success. This combination accelerates access to trusted and governed data, enhances data quality, and frees up precious budget that can be used to bring AI and GenAI projects to life. 

Request a demo to learn more about how Pentaho’s data intelligence and integration platform can enable your data classification and storage optimization needs and help your organization get data-fit.

Protecting Vital Resources Through Data: Enhancing Arizona’s Water Management with Data Intelligence from Pentaho https://pentaho.com/insights/blogs/protecting-vital-resources-though-data-enhancing-arizonas-water-management-with-data-intelligence-from-pentaho/ Mon, 07 Oct 2024 19:50:44 +0000

The post Protecting Vital Resources Through Data: Enhancing Arizona’s Water Management with Data Intelligence from Pentaho first appeared on Pentaho.


Pentaho Data Catalog Automates Data Processes for Arizona’s Department of Water Resources to Improve Water Supply Availability and Conservation 

Water is one of society’s most vital and overlooked resources. Current and future data shows that many parts of the world – including the southwest U.S. – are at risk for severe water shortages in the not-too-distant future. According to the Nature Conservancy, experts predict a 20-25% decline in regional precipitation in the Colorado River Basin, which has experienced historic drought conditions since 2000 and where over 40 million people currently reside.

These trends directly impact Arizona, with the Colorado River being Arizona’s largest renewable water source and second only to groundwater in the overall water supply mix. These conditions are what the Arizona Department of Water Resources (DWR) faces when tasked with the vital mission of protecting, conserving and improving the state’s water supply. 

Searching for Signals in Noisy Data 

The Arizona DWR oversees the management of nearly 6 trillion gallons annually for over seven million residents. Its extensive asset portfolio generates data from over 300,000 wells and surface water applications. The lack of a centralized metadata repository and limited resources to hire additional personnel made it challenging for DWR staff to locate essential datasets. 

“We struggled to find people and affordable tools that could completely fulfill our requirements,” noted Lisa Williams, Manager of the Office of Enterprise Data Management at Arizona DWR. “We needed a business glossary, but most catalog software didn’t sell that separately, so we created our own in SQL. However, we later realized we could not implement more mature data management best practices without a data catalog tool.” 

Gaining Intelligence Through Automation 

In May 2020, the DWR turned to Pentaho Data Catalog to help streamline data management and scale its ability to leverage data across the organization.  

The DWR realized significant time to value and minimal downtime, with the entire data discovery, metadata cataloging and platform migration taking just two weeks. And by leveraging native machine learning capabilities, Pentaho Data Catalog was trained to recognize various structured data types, ensuring that sensitive information related to water rights and usage was handled securely. 

The Department can now understand, integrate and analyze its unique and critical datasets to meet the needs of Arizona’s water users, planners and decision makers.  

“One of the reasons we’re excited about Universal Data Intelligence is that you just need to type in ‘well’, and an accurate and comprehensive report of over 300 instances of that data element in our transactional databases, data warehouse, document management system and spatial data is ready for export, which is saving us so much time,” said Williams. 

Transforming Operations and Driving Collaboration 

The enhanced visibility into data provided by Pentaho Data Catalog has increased confidence in data quality with a centralized metadata repository that staff can quickly access to understand existing data, streamlining processes for licensing and applications. 

The catalog also enables consistent frameworks across the department, enabling groundwater hydrologists, water resource managers and application developers to standardize how they compile and analyze water usage. The result is a more accurate view of Arizona’s water resources that supports sustainable demand and supply planning. 

“We are now modernizing all our licensing and applications processes. Having a centralized metadata repository enables our staff and the consultants to quickly understand existing data used in online and automated processes during the project discovery process. And when data is migrated to the cloud, we’ll know the complete lineage of the data,” said Williams.  

With better data visibility through Pentaho Data Catalog, Arizona’s DWR can confidently engage with stakeholders to collaborate more effectively on addressing water resource challenges, with improved internal operations paving the way for a more resilient future in water management. As water challenges grow, so does the need for intelligent solutions, and Arizona is setting a precedent for how technology can transform resource management in an increasingly demanding environment. 

Learn more about Pentaho Data Catalog at https://pentaho.com/products/pentaho-data-catalog/ or request a demo at https://pentaho.com/request-demo/. 

The post Protecting Vital Resources Through Data: Enhancing Arizona’s Water Management with Data Intelligence from Pentaho first appeared on Pentaho.

]]>
Turning Data into Revenue: Pentaho+ Helps LightBox Differentiate and Grow by Becoming Data Fit https://pentaho.com/insights/blogs/turning-data-into-revenue-pentaho-helps-lightbox-differentiate-and-grow-by-becoming-data-fit/ Wed, 02 Oct 2024 19:05:32 +0000 https://pentaho.com/?post_type=insightsection&p=2829 Pentaho helps LightBox Bring its Data to Life While Improving its Own Customer Experience and Success  Escalating vacancies, high interest rates, sustainable building codes – these and other competing forces create a complex and stressful environment for commercial real estate firms. Many of these firms turn to LightBox and its platform to understand commercial, geographic, […]

The post Turning Data into Revenue: Pentaho+ Helps LightBox Differentiate and Grow by Becoming Data Fit first appeared on Pentaho.

]]>
Pentaho helps LightBox Bring its Data to Life While Improving its Own Customer Experience and Success 

Escalating vacancies, high interest rates, sustainable building codes – these and other competing forces create a complex and stressful environment for commercial real estate firms. Many of these firms turn to LightBox and its platform to understand commercial, geographic, spatial and environmental building data that helps them thrive in a competitive and dynamic market.  

LightBox is committed to providing comprehensive, up-to-date information to its clients, and understands how a robust data management and governance foundation is crucial to that mission. To bring even more value to its platform, LightBox needed to simplify complex data integration tasks while making it easier for business users to access information.  

The Value of a Platform Approach 

Jesse Canada led the initiative at LightBox and ran an extensive selection process to find a solution that was flexible, user-friendly and equipped with the right capabilities for LightBox’s goals. 

Unlike other solutions that added complexity, the Pentaho+ platform delivered across every aspect of LightBox’s needs without unnecessary cost or implementation overhead. 

“Pentaho has changed the way we handle data,” said Canada. “Its ease of use and the responsive support team have made it possible for us to improve our data management processes in ways we didn’t think were possible. We’ve unlocked new use cases, like vendor management and metadata management, and we’re constantly finding more.” 

Creating Impact Across Every Data Touchpoint  

The LightBox implementation of Pentaho+ has delivered several significant outcomes: 

  • Enhanced Data Governance: With the Pentaho Data Catalog and Pentaho Data Integration, LightBox established clear data standards and quality rules, ensuring data asset reliability. 
  • Operational Efficiency: Empowering business-savvy users to analyze data independently freed engineering resources to tackle more complex tasks. 
  • Customer-Centric Product Evolution: Feedback from LightBox played a pivotal role in refining Pentaho+, fostering a collaborative relationship that enhanced product development. 
  • Observability: Automation for data quality assurance and monitoring became a reality, improving oversight and management. 
  • Time to Insights: LightBox achieved rapid time to value, realizing significant return-on-investment and total-cost-of-ownership benefits within just one year. 
  • Ease of Use: The platform was quickly adopted by both business and IT users, streamlining workflows across the organization. 
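The data standards and quality rules mentioned in the governance outcome above can be illustrated with a minimal sketch of declarative rule checking. The field names and rules here are hypothetical illustrations, not LightBox’s or Pentaho’s actual configuration:

```python
# Minimal sketch of declarative data-quality rules applied to a record.
# Field names, rules, and values are hypothetical examples.

RULES = {
    "parcel_id": lambda v: isinstance(v, str) and len(v) == 10,
    "sq_footage": lambda v: isinstance(v, (int, float)) and v > 0,
    "zoning_code": lambda v: v in {"RES", "COM", "IND", "AGR"},
}

def validate(record: dict) -> list:
    """Return the names of fields that are missing or fail their rule."""
    return [field for field, rule in RULES.items()
            if field not in record or not rule(record[field])]

record = {"parcel_id": "AZ00012345", "sq_footage": -40, "zoning_code": "COM"}
print(validate(record))  # -> ['sq_footage']  (negative area fails the rule)
```

Keeping rules declarative like this lets business users add or adjust checks without touching pipeline code, which is the kind of self-service the bullet list describes.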

The deployment of the Pentaho+ platform has led to substantial operational improvements at LightBox. The solution is projected to pay for itself within eight months, with even greater cost savings and efficiency gains expected by year-end. 

With Pentaho, LightBox is both streamlining its data management operations and setting a new standard for data governance in the real estate industry. As LightBox continues to lead the charge in real estate data, their strategy serves as a strong example for organizations looking to optimize their data management practices and drive meaningful change. 

Learn more about Pentaho+ at https://pentaho.com/ or request a demo at https://pentaho.com/request-demo/. 

The post Turning Data into Revenue: Pentaho+ Helps LightBox Differentiate and Grow by Becoming Data Fit first appeared on Pentaho.

]]>
Turning Home Ownership Dreams into Reality – Enhancing Fannie Mae’s Data Access and Compliance with Pentaho https://pentaho.com/insights/blogs/turning-home-ownership-dreams-into-reality-enhancing-fannie-maes-data-access-and-compliance-with-pentaho/ Wed, 02 Oct 2024 19:03:20 +0000 https://pentaho.com/?post_type=insightsection&p=2825 Fannie Mae Leverages the Power of Pentaho Data Catalog’s Automation, Machine Learning and AI for Modern Data Management   In a post-Covid, high-interest world, realizing home ownership dreams is as challenging as it’s been in decades. As the leading provider of mortgage financing in the U.S., Fannie Mae’s role in creating home ownership opportunities is more […]

The post Turning Home Ownership Dreams into Reality – Enhancing Fannie Mae’s Data Access and Compliance with Pentaho first appeared on Pentaho.

]]>
Fannie Mae Leverages the Power of Pentaho Data Catalog’s Automation, Machine Learning and AI for Modern Data Management  

In a post-Covid, high-interest world, realizing home ownership dreams is as challenging as it’s been in decades. As the leading provider of mortgage financing in the U.S., Fannie Mae’s role in creating home ownership opportunities is more important than ever for many families.  

In 2022 alone, Fannie Mae facilitated over 2 million home purchases and refinancings, while also financing approximately 598,000 rental units.  At that time the organization recognized it needed to overcome existing data silos and enhance access to its vast array of information to better serve its aim “to expand housing opportunities for everyone in America.” 

“Our goal was to build a modern, state-of-the-art data platform for business analysts and decision-makers across the company,” said Rohny Kolli, Data Engineering Manager for Advanced Analytics Enablement at Fannie Mae. This required a new approach to managing Fannie Mae’s 15,000 datasets that generated over 10 million new files per day.  

Initially, Fannie Mae implemented a comprehensive process for managing its enterprise data lake: every one of the 15,000 datasets underwent a manual registration process. While this increased compliance and transparency, it significantly slowed access to new data, stretching wait times to weeks or even months. 

Automation Brings Order and Access 

Fannie Mae selected Pentaho Data Catalog to streamline and scale data availability while maintaining strong governance and quality. The catalog was deployed in the cloud on Amazon Web Services (AWS), enabling the processing and aggregation of tens of millions of data points into high-level datasets that can be easily consumed by business teams. 

Pentaho Data Catalog also transformed the organization’s approach to data pipelines. With native machine learning and AI automating metadata validation and tagging, datasets become available to data stewards and analysts immediately. This automation of the pre-registration process accelerates data access while ensuring compliance and high data quality. 
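The automated metadata tagging described above can be sketched with a simple pattern-matching approach: sample values from a column are tested against known patterns, and a tag is proposed when most values match. The patterns, tag names, and threshold below are assumptions for illustration, not Pentaho Data Catalog’s internals:

```python
import re

# Illustrative sketch of automated column tagging: propose a tag when most
# sampled values match a known pattern. Patterns and tags are hypothetical.
PATTERNS = {
    "ssn": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "zip_code": re.compile(r"^\d{5}(-\d{4})?$"),
}

def propose_tags(sample_values, threshold=0.8):
    """Return tags whose pattern matches at least `threshold` of the sample."""
    tags = []
    for tag, pattern in PATTERNS.items():
        hits = sum(1 for v in sample_values if pattern.match(v))
        if sample_values and hits / len(sample_values) >= threshold:
            tags.append(tag)
    return tags

print(propose_tags(["123-45-6789", "987-65-4321", "111-22-3333"]))  # -> ['ssn']
```

In a production catalog, proposals like these would go to a data steward for review rather than being applied blindly, which matches the human-in-the-loop workflow the post describes.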

Tracking Changes = Smarter Decisions 

Fannie Mae leverages process automation based on the Pentaho Data Catalog API, seamlessly connecting its wide range of business applications to the enterprise data lake for daily updates to datasets. 

Built-in metadata versioning helps Fannie Mae keep track of changes in its data sources and better understand business data context. The solution highlights changes in storage location, file size, file format and many other technical details that can help the team to tune and optimize data processing. 
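The change tracking described above can be sketched as a diff between two metadata snapshots. The snapshot fields shown are illustrative assumptions, not the catalog’s actual schema:

```python
# Sketch of metadata versioning: compare two snapshots of a dataset's
# technical metadata and report every field that changed.

def diff_metadata(old: dict, new: dict) -> dict:
    """Return {field: (old_value, new_value)} for each changed field."""
    return {k: (old.get(k), new.get(k))
            for k in old.keys() | new.keys()
            if old.get(k) != new.get(k)}

v1 = {"location": "s3://lake/raw/loans/", "size_bytes": 1_048_576, "format": "csv"}
v2 = {"location": "s3://lake/raw/loans/", "size_bytes": 2_097_152, "format": "parquet"}

print(diff_metadata(v1, v2))  # size and format changed; location did not
```

A diff like this is what lets a team spot an unexpected format change or a file landing in the wrong location before it disrupts downstream processing.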

“Pentaho Data Catalog gives us real-time insights into how our data is changing over time and helps us ensure that all our data files are stored in the right places to support smooth, standardized operations and compliance with internal guidelines,” said Kolli. “The solution can catch unresolved schema issues and produce discrepancy reports, helping our various teams ensure high data quality and compliance.” 

With Pentaho Data Catalog, Fannie Mae is now tagging its data to highlight sensitive information and classify over 400 key data elements. Context-rich insights are leading to more informed decision-making across the organization, with staff now easily searching the enterprise data lake through a user-friendly interface for a 360-degree view of business data. Enhanced data accessibility also allows data stewards, business analysts and data scientists to quickly locate the right datasets for their analyses. 

“We wanted to enable fast, data-driven decisions – which meant we had to make it easier to get the right data to the right people at the right time. With Pentaho Data Catalog, we are integrating millions of files each day into our enterprise data lake. The solution enables data profiling and tagging to gain valuable insights, identifying anomalies immediately, and supports our data governance management to facilitate compliance,” said Kolli. 

With automation through Pentaho Data Catalog, Fannie Mae can make better data-driven decisions that positively impact the housing market and the lives of millions of Americans.  

Learn more about Pentaho Data Catalog at https://pentaho.com/products/pentaho-data-catalog/ or request a demo at https://pentaho.com/request-demo/. 

The post Turning Home Ownership Dreams into Reality – Enhancing Fannie Mae’s Data Access and Compliance with Pentaho first appeared on Pentaho.

]]>