Oracle Data Lakehouse
Oracle University Podcast - A podcast by Oracle Corporation - Tuesdays
With each passing day, more and more data sources are sending greater volumes of data across the globe. For any organization, this combination of structured and unstructured data continues to be a challenge. Data lakehouses link, correlate, and analyze these varied outputs into a single manageable system. In the final episode of the season, hosts Lois Houston and Nikita Abraham, along with Greg Genovese, discuss Oracle Data Lakehouse, the premier solution for leveraging data to make better, more profitable business decisions.

Oracle MyLearn: https://mylearn.oracle.com/
Oracle University Learning Community: https://education.oracle.com/ou-community
LinkedIn: https://www.linkedin.com/showcase/oracle-university/
Twitter: https://twitter.com/Oracle_Edu

Special thanks to Arijit Ghosh, David Wright, Ranbir Singh, and the OU Studio Team for helping us create this episode.

--------------------------------------------------------

Episode Transcript:

00;00;00;00 - 00;00;39;03
Welcome to the Oracle University Podcast, the first stop on your cloud journey. During this series of informative podcasts, we’ll bring you foundational training on the most popular Oracle technologies. Let's get started! Hello and welcome to the Oracle University Podcast. I'm Nikita Abraham, Principal Technical Editor with Oracle University, and with me is Lois Houston, Director of Product Innovation and Go to Market Programs.

00;00;39;06 - 00;01;17;11
Hi there! Last week, we spoke about managing Oracle Database with REST APIs and also looked at ADB built-in tools. Today's episode is the last one of the season, and we're going to be joined by Oracle Database Specialist Greg Genovese, who will talk with us about Oracle Data Lakehouse. Hi, Greg. I've heard about data lakes and data warehouses, but what’s a lakehouse?
Traditionally, when deciding how best to increase their data productivity and liquidity, companies often find themselves having to choose between a data lake and a data warehouse, each with its own benefits and drawbacks.

00;01;17;13 - 00;01;43;20
Now, companies no longer need to make that choice. Instead, they can look to a broader strategy that offers highly accurate machine learning capabilities, the flexibility of using open-source services, and the superior data and analytics capabilities of the best-in-class Oracle Database and Data Warehouse. These capabilities are integrated with common identity, data integration, orchestration, and catalog into a unified architecture - the Oracle Lakehouse.

00;01;43;24 - 00;02;12;26
What are the benefits of Oracle Lakehouse? Oracle Lakehouse facilitates ease of data reuse and recombination, maximizing insights from your data and generating several other benefits, including pure cost savings, as well as improving the agility of your current data warehouse by easily extending it with new metrics, details, or attributes, which help you better understand your customers, your processes, or your risks, all while using your existing applications.

00;02;13;01 - 00;02;33;09
Is this only for companies that are already using Oracle Cloud Infrastructure? For companies that haven't yet adopted Oracle Cloud Infrastructure, but instead have existing data lakes on AWS or Azure, if you still want to make that data available to the Oracle Autonomous Database, you can reach out to these data lakes using Oracle SQL.

00;02;33;10 - 00;02;57;17
Here at OCI, we feel your experience would not be a productive one if you weren't allowed to use your choice of tools and applications, such as Analytics Cloud, Tableau, Looker, Notebook, REST, Python, and more. Can you tell us more about how Oracle Data Lakehouse works?
It combines current data warehouse and data lake components with capabilities to also include external or third-party data.

00;02;57;20 - 00;03;29;04
This effectively eliminates data silos and the need to manually move data between data warehouses and data lakes if you leverage both currently. The five key elements of the Oracle Lakehouse are the data warehouse; the data lake for raw data, normally used for loading and staging data; managed open-source services to support Spark, Hadoop, and Redis; data integration, moving data depending on use case; and the data catalog, which maintains a complete view of the available data for discovery and governance.

00;03;29;07 - 00;03;49;29
With these elements, you can write the data once with any engine and analyze or even build machine learning models from any of your current data. How did the idea for data lakehouse come about? What was the need for it? Using all data to innovate: this is the challenge, to include all of your data and use it to drive better, more profitable business decisions.

00;03;50;02 - 00;04;14;07
Some data is easy to access, but accessing all of your data and then correlating that data in a way that helps make decisions and drive better outcomes isn't easy. So, the opportunity we've identified here is harnessing the power of all that data and creating a competitive advantage from it. But how do we do that? How do we run and maintain what we've got today efficiently, quickly, and securely?

00;04;14;08 - 00;04;42;02
We have functions that move data from sources to outcomes. The process is taking the source, going through integrations, and connecting the different data. Once we've done this, traditionally, we looked at persistence, processing the data and storing it somewhere to pass along for analysis. This has connected and curated the data for outcomes. The Oracle Lakehouse is a solution leveraging multiple tools and products to get the desired outcomes from this process.
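Editor's note: the flow described here (source, integration, persistence, analysis) can be pictured with a small, self-contained Python sketch. Everything in it is a hypothetical stand-in: an in-memory SQLite database plays the role of the warehouse, a CSV string plays the role of a raw file landed in object storage, and the table names and values are invented for illustration. It is not Oracle code, and unlike a real lakehouse, which aims to query raw data where it sits, this toy version copies the raw rows into the database to keep the example short.

```python
import csv
import io
import sqlite3

# Hypothetical raw file, as it might land in object storage (the "lake" side).
RAW_CLICKS_CSV = """customer_id,page,seconds
1,pricing,40
2,docs,15
1,docs,25
"""


def load_warehouse(conn):
    """Persistence: store curated, structured data (the "warehouse" side)."""
    conn.execute(
        "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)"
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")]
    )


def integrate_raw(conn, raw_csv):
    """Integration: parse the raw file and make it queryable next to the table."""
    conn.execute(
        "CREATE TABLE clicks (customer_id INTEGER, page TEXT, seconds INTEGER)"
    )
    rows = [
        (int(r["customer_id"]), r["page"], int(r["seconds"]))
        for r in csv.DictReader(io.StringIO(raw_csv))
    ]
    conn.executemany("INSERT INTO clicks VALUES (?, ?, ?)", rows)


def seconds_per_customer(conn):
    """Analysis: correlate both data sets with a single SQL query."""
    query = """
        SELECT c.name, SUM(k.seconds)
        FROM customers c
        JOIN clicks k ON k.customer_id = c.customer_id
        GROUP BY c.name
        ORDER BY c.name
    """
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
load_warehouse(conn)
integrate_raw(conn, RAW_CLICKS_CSV)
totals = seconds_per_customer(conn)
print(totals)
```

The point of the sketch is only the shape of the process: once both kinds of data are reachable from one SQL engine, correlating them is a single join rather than a manual export and import.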
00;04;42;04 - 00;05;05;17
You can use existing data warehouses to start, and the data warehouse, especially the Converged Autonomous Database, allows for storing all types of data. This is where relational, structured data is stored: in an Oracle Autonomous Database or warehouse. The Autonomous Data Warehouse is self-managed with better performance and efficiencies to help focus on the analysis and the outcomes of the data.

00;05;05;20 - 00;05;23;12
The unstructured or raw data can be persisted in any data type in its current format within object storage. This can be within an existing data lake, for example. Object storage is an efficient way to land data where it's needed.

00;05;23;15 - 00;05;52;12
Are you attending Oracle CloudWorld 2023? Learn from experts, network with peers, and find out about the latest innovations when Oracle CloudWorld returns to Las Vegas from September 18 through 21. CloudWorld is the best place to learn about Oracle solutions from the people who build and use them. In addition to your attendance at CloudWorld, your ticket gives you access to Oracle MyLearn and all of the cloud learning subscription content, as well as three free certification exam credits.

00;05;52;18 - 00;06;28;13
This is valid from the week you register through 60 days after the conference. So, what are you waiting for? Register today. Learn more about Oracle CloudWorld at www.oracle.com/cloudworld. Welcome back. Okay, so Greg, you spoke about the start of data lakehouse. Tell us about data integration and analysis. Lakehouse provides an all-encompassing orchestration of integration and allows your choice of tools to keep your source of truth and compliance for your data.

00;06;28;16 - 00;07;03;00
Whether you decide to deploy Oracle GoldenGate, the premier data integration tool, Oracle Data Integration, helping you move data within the lake, or even an open-source or third-party tool, Lakehouse is by design flexible and meant to fit your specific needs.
Oracle Analytics Cloud is used to perform predictive analytics, and other third-party tools can read into the data from the database APIs or using SQL. Oracle AI Service has machine learning models that will continue to work with the transactional systems and bring in other data types as well.

00;07;03;03 - 00;07;35;14
OCI Data Science can harness all of the data for better business outcomes and fills in the tools for integration and analysis for the Oracle Data Lakehouse. Within the Autonomous Data Warehouse, we have transactional and dimensional query capabilities, but in our Lakehouse story, we're also very lucky to have products like MySQL HeatWave, the blazing fast in-memory query accelerator, which increases MySQL performance by orders of magnitude for analytics and mixed workloads, all without any changes to your existing applications.

00;07;35;16 - 00;08;00;20
Really, no other cloud provider is going to give you that much choice in the data warehouse bucket and managed open-source components. So, from what I understand, Lakehouse has options for all types of data, but what about understanding and managing the metadata of data sources? The OCI Data Catalog captures this, whether you're building a schema, building a query from ADW, or building a table that you want to query from a Spark job.

00;08;00;23 - 00;08;26;13
And all that data definition goes into the OCI Data Catalog. So, wherever this data goes, you'll be able to access it. The data catalog is the source of truth for object store metadata and can regularly harvest the information from the data sources. It also manages the business glossary, providing consistent terms and tags for your data. Discovery of data is a powerful search feature to discover new data sets entirely.

00;08;26;15 - 00;08;50;08
Even with all these capabilities, there are still more being added or enhanced over time. For example, now with OCI Data Flow, you have a serverless Spark service.
You can build a Spark job that makes sense of some unstructured data and include it as a part of the Oracle Lakehouse. Enterprises are moving to Data Flow because you can write, decode, and execute code, and focus on the application, because the challenging part of where this is running is handled through the service components of the Oracle Lakehouse.

00;08;50;11 - 00;09;13;06
I think what we all want, Greg, is faster insights on our data, right? As you put everything together into this architecture, the key thing is that you want to be able to write data once and then combine it with other previously written data, move it around, combine it here and there, and analyze.

00;09;13;09 - 00;09;36;26
So, we have a way to store both structured and unstructured data. You have the object store for unstructured data, and you write your structured data to a relational database, perhaps MySQL or Oracle Database, and you can then leverage the Oracle Data Catalog to have a single way to understand and tag your data. Oracle Data Lakehouse is an open and collaborative approach.

00;09;36;28 - 00;10;04;22
It stores all data in an order that's easy to understand and analyze through a variety of services, as well as AI tools. OCI can accelerate your solution development for your most common Data Lakehouse workloads. You can easily get started from where you are today, and often without writing any new code whatsoever. Within each path, we can work with you at Oracle to highlight the investments we've made that will help accelerate your own Lakehouse transformation.

00;10;04;24 - 00;10;39;11
The Oracle Data Lakehouse is the premier solution for transforming data into better, more profitable business decisions. Remember, it's not just your architecture that's powerful. With Oracle Lakehouse, you can help combine the architecture, data sets, services, and tools across your entire technical landscape into something more valuable than just the sum of its parts.
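Editor's note: the idea of "a single way to understand and tag your data", whether it lives in an object store or a relational database, can be illustrated with a toy catalog in Python. This is a sketch under stated assumptions, not the OCI Data Catalog API: the `CatalogEntry` and `ToyCatalog` names, the locations, and the tags are all hypothetical and exist only to show the concept of one searchable glossary over both kinds of storage.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """Metadata about one data set; the data itself stays where it is."""
    name: str
    location: str   # hypothetical object-store path or table name
    kind: str       # "object_store" or "relational"
    tags: set = field(default_factory=set)


class ToyCatalog:
    """In-memory stand-in for a data catalog with glossary-style tags."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        # Record where a data set lives and what kind of store holds it.
        self._entries[entry.name] = entry

    def tag(self, name, term):
        # Tags are lower-cased so the same business term always matches.
        self._entries[name].tags.add(term.lower())

    def search(self, term):
        # Discovery: return the names of all data sets carrying the term.
        term = term.lower()
        return sorted(
            e.name for e in self._entries.values() if term in e.tags
        )


catalog = ToyCatalog()
catalog.register(
    CatalogEntry("clickstream_raw", "bucket/raw/clicks.csv", "object_store")
)
catalog.register(CatalogEntry("customers", "SALES.CUSTOMERS", "relational"))
catalog.tag("clickstream_raw", "Customer")
catalog.tag("customers", "customer")

print(catalog.search("customer"))
```

Because both entries carry the same glossary term, one search finds the raw file and the relational table together, which is the behavior the catalog discussion in this episode describes at a much larger scale.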
Thank you so much, Greg, for sharing your expertise with us. To learn more about Oracle Data Lakehouse, please visit mylearn.oracle.com and take a look at our Oracle Cloud Data Management Foundations Workshop.

00;10;39;18 - 00;11;04;14
That brings us to the end of this season. Thank you for being with us on this journey. We're very excited about our upcoming season, which will be dedicated to Cloud Applications Business Process training. Until next time, this is Lois Houston and Nikita Abraham signing off. That's all for this episode of the Oracle University Podcast. If you enjoyed listening, please click Subscribe to get all the latest episodes.

00;11;04;16 - 00;13;38;07
We'd also love it if you would take a moment to rate and review us on your podcast app. See you again on the next episode of the Oracle University Podcast.