Search

Materialized View Refreshes





Interviewers seldom miss to discuss about mv or snapshot and the refresh mechanism associated with it. While in OLTP environment, i did not have an opportunity of mv being used widely but these are very powerful when it comes to performance, and are generally used in Data warehouses for aggregation on base tables.


Unlike indexes, materialized views are not automatically updated with every data change. They must explicitly be refreshed, either on every commit, on a periodically time schedule or – typically in data warehouses – at the end of an ETL job. But what happens if the refresh of a materialized view takes a lot of time? In several performance reviews for customers, I have seen materialized views that need hours or even days(!) to refresh. How can we reduce this time? This blog post contains some basic rules that should be known to everybody working with materialized views.


Fast Refresh


An elegant and efficient way to refresh materialized views is a Fast Refresh. With this refresh method, only the changes since the last refresh are applied to the materialized view. The name “Fast Refresh” is a bit misleading, because there may be situations where a Fast Refresh is slower than a Complete Refresh. The name “incremental refresh” would be more appropriate. But in most cases, this method is much faster than a Complete Refresh.

An important precondition for a Fast Refresh is a materialized view log on each of the base tables that are referenced in the materialized view. All columns that are used in the query must be added to the materialized view log. If multiple materialized views are created, only one materialized view log per base table is required, with all columns that are used in at least one of the materialized views.



CREATE MATERIALIZED VIEW LOG ON employees

  WITH SEQUENCE, ROWID ( JOB_ID,SALARY,department_id)

  INCLUDING NEW VALUES;
----------------------------------------------------------------------
CREATE MATERIALIZED VIEW LOG ON Departments
WITH SEQUENCE, ROWID (department_name,department_id)
INCLUDING NEW VALUES;
--------------------------------------------------------------------- 
CREATE MATERIALIZED VIEW LOG ON JOBS
 WITH SEQUENCE, ROWID (JOB_ID)
INCLUDING NEW VALUES
-------------------------------------------------------------------
CREATE MATERIALIZED VIEW mv_emp_dept
ENABLE QUERY REWRITE
AS
SELECT j.JOB_ID,
       d.department_name,
       SUM(e.salary)
  FROM employees e,
       departments d,
       jobs j
 WHERE e.department_id = d.department_id
  AND j.job_id = e.job_id
GROUP BY j.JOB_ID,d.department_name; 


To test the Fast Refresh behavior, let’s do a (pseudo) update on the product dimension and then try to run a Fast Refresh. The error message ORA-32314 tells us that a Fast Refresh is not possible


UPDATE employees SET  SALARY= SALARY WHERE ROWNUM = 1;
COMMIT;
---------------------------------------------------------------------
 BEGIN
 dbms_mview.refresh('mv_emp_dept', method => 'F');
 END;


At times dealing with fast refreshes can be frustrating, and it will error out giving some exception. A better way to study those error is through a table called- mv_capabilities_table


There is a sql file - utlxmv.sql (available in the $ORACLE_HOME/rdbms/admin directory). The following code example shows how the procedure dbms_mview.explain_mview can be used:



DELETE FROM mv_capabilities_table;
 
BEGIN
   dbms_mview.explain_mview(MV_PROD_YEAR_SALES);
END;
/
 
SELECT capability_name, possible, msgtxt, related_text 
  FROM mv_capabilities_table
 WHERE capability_name LIKEREFRESH%;

This can be a great help figuring out the associated error in the refresh of the mv.



Complete Refresh


The simplest form to refresh a materialized view is a Complete Refresh. It loads the contents of a materialized view from scratch. This means, if the SQL query of the materialized view has an execution time of two hours, the Complete Refresh takes at least two hours as well – or ofter even longer. So, the most important part to improve the refresh performance is to improve the SQL statement to load the materialized view. But why is a Complete Refresh running longer than the underlying query, especially for large materialized views?

By default, a Complete Refresh is running within one transaction. At the beginning of a Complete Refresh, the old data of the materialized view is deleted, Then, the new data is inserted by running the underlying SQL query. During this time, users can still use the materialized view and see the old data. At the end of the refresh, the transaction is committed, and the new data is visible for all users.

The advantage of this behavior is that the users can still use the materialized view while it is refreshed. But the price for this is quite high, because all rows of the materialized view must be deleted with a DELETE command. If the materialized view contains let’s say millions of rows, this can take a long time.

To improve performance of a Complete Refresh, the optional parameter atomic_refresh of the procedure dbms_mview.refresh is very useful. The default is TRUE, which means that the refresh is executed within one single transaction, i.e. with a DELETE and an INSERT statement. If the parameter is set to FALSE, the materialized view is deleted with a much faster TRUNCATE command. The drawback of this method is that no data is visible to the users during the refresh. If this is feasible in your environment, you can use the following command for a Complete Refresh:



BEGIN
   dbms_mview.refresh('mv_emp_dept', method => 'C', atomic_refresh => FALSE);
END;
 




11 views
 
  • LinkedIn
  • Facebook